From list at der-ingo.de Tue Apr 3 10:13:36 2007
From: list at der-ingo.de (Ingo Schmidt)
Date: Tue Apr 3 10:13:39 2007
Subject: Automated conversion script
In-Reply-To: <460E72AE.7070503@nogga.de>
References: <644610523.20070330115430@der-ingo.de> <460CF971.7000702@nogga.de> <798961065.20070331085202@der-ingo.de> <460E72AE.7070503@nogga.de>
Message-ID: <223470932.20070403161336@der-ingo.de>

Hi Dirk!

> I don't know which version you used before, but I wouldn't expect this
> behaviour.

You were right. I didn't remember correctly. I checked the behaviour
again and now I know what bugged me so much :-)
I had used the nightly build of Feb 14th.

But have a look at these two examples:

Node-copyfrom-path: orphaned/_TRIAAAAA/jsp/screens/de/xyz.txt
Node-copyfrom-path: orphaned/_ZRIAAAAA/css/bla.css

I used the VSS History function to find out where these 2 files
originally were:

oldprojectname/web/jsp/screens/de/xyz.txt
oldprojectname/web/css/bla.css

In this particular case renaming in the resulting dumpfile is possible,
because the orphaned name and the real project name have the same folder
depth. But look at this example:

Node-copyfrom-path: orphaned/_DSKAAAAA/abc.xml
Node-copyfrom-path: orphaned/_ESKAAAAA/def.xml
Node-copyfrom-path: orphaned/_OIJAAAAA/ghi.tld

The original paths of these files were:

oldprojectname/config/abc.xml
oldprojectname/config/def.xml
oldprojectname/config/taglib/ghi.tld

So I have two different orphaned names for the same original folder.
Renaming in the dumpfile is not so easy anymore, because the orphaned
name of ghi.tld doesn't have the same folder depth as the original file.
This makes renaming in the dumpfile quite a pain.
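The depth mismatch can be made concrete with a short sketch. It is written in Python rather than vss2svn's Perl, it is not part of vss2svn, and it only looks at the Node-copyfrom-path header line, not at the rest of a dumpfile. The names ORPHAN_MAP and rewrite_copyfrom are hypothetical, and the mapping is assumed to have been compiled by hand from the VSS History lookups above.

```python
# Hypothetical mapping from orphaned physical folders to the original
# VSS paths, compiled by hand from the VSS History lookups above.
ORPHAN_MAP = {
    "orphaned/_DSKAAAAA": "oldprojectname/config",
    "orphaned/_ESKAAAAA": "oldprojectname/config",
    "orphaned/_OIJAAAAA": "oldprojectname/config/taglib",
}

PREFIX = "Node-copyfrom-path: "


def rewrite_copyfrom(line):
    """Return (new_line, depth_change) for one dumpfile header line."""
    if not line.startswith(PREFIX):
        return line, 0
    path = line[len(PREFIX):]
    for orphan, original in ORPHAN_MAP.items():
        if path.startswith(orphan + "/"):
            new_path = original + path[len(orphan):]
            # A positive change means the renamed path is deeper than the
            # orphaned one, so extra parent folders have to exist first.
            change = new_path.count("/") - path.count("/")
            return PREFIX + new_path, change
    return line, 0


for header in (
    "Node-copyfrom-path: orphaned/_DSKAAAAA/abc.xml",
    "Node-copyfrom-path: orphaned/_OIJAAAAA/ghi.tld",
):
    print(rewrite_copyfrom(header))
```

The second header comes back one level deeper than it started, which is exactly the case where a plain search-and-replace on the dumpfile stops being enough.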

This situation arose because these steps were done in VSS:

- Create a new project "container"
- Create project "container/newprojectnameA"
- Create project "container/newprojectnameB"
- Share contents of "oldprojectname" into "newprojectnameA"
- Share contents of "oldprojectname" into "newprojectnameB"
- Destroy "oldprojectname"

This is where all my orphans come from. And this is why I had the idea
to actually find out where a file was shared from and tweak
ActionHandler.pm in the way I did.

This did work out very well, and for every project I can now see where
it once came from. Is there really no way of putting something like
this into vss2svn? Maybe with a big warning "use at your own risk?"

But you do know a lot more about VSS than I do. I hope that by
describing the situation, maybe something can be done for just this
situation. I can imagine that other users come across this very same
problem.

> Destroying unneeded projects is ok. This is the best way to go in order
> to convert only singular projects. But you should use the "-d" switch
> also and it is still possible that you will see a few orphans.

You mean -d for analyze.exe? This is how I invoke it:

analyze -C -D -F -I-Y -V1 X:\path\to\db\data

I am still documenting my script. I don't have much time at the
moment :-( But it will come :-)

Cheers, Ingo =;->

From stephen.lee at hexagonmetrology.com Tue Apr 3 13:52:28 2007
From: stephen.lee at hexagonmetrology.com (Stephen Lee)
Date: Tue Apr 3 13:52:40 2007
Subject: Automated conversion script
In-Reply-To: <223470932.20070403161336@der-ingo.de>
References: <644610523.20070330115430@der-ingo.de> <460CF971.7000702@nogga.de><798961065.20070331085202@der-ingo.de> <460E72AE.7070503@nogga.de> <223470932.20070403161336@der-ingo.de>
Message-ID: <4612945C.1030301@wilcoxassoc.com>

Ingo Schmidt wrote:
>
> This did work out very well, and for every project I can now see where
> it once came from. Is there really no way of putting something like
> this into vss2svn?
> Maybe with a big warning "use at your own risk?"
>

Your patch essentially seems to be a much better approach to what I was
originally trying to do as a post-processing task with my
filterorphans.cpp.

There were two basic classes of orphan I encountered:

a) 'false' orphans, where the vss2svn script had got confused, and was
   treating something as an orphan despite its parent still being alive
   and well (mainly due to confusing sets of moves / shares / etc.)
b) 'true' orphans, which includes files that were originally created in
   a now-deleted project, or which were branches made in a now-deleted
   project.

For the false orphans, I put these back into the correct location.

For the true orphans, I found it caused problems to put them back in the
location as it had been at the time... and it was better to simply
classify these within the orphaned folder... i.e. rather than

orphaned/_CAAAAAAA/DongleCheck.cpp
orphaned/_DAAAAAAA/DongleCheck.dsp
orphaned/_EAAAAAAA/DongleCheck.h
... etc.

This became

orphaned/DongleCheck/DongleCheck.cpp
orphaned/DongleCheck/DongleCheck.dsp
orphaned/DongleCheck/DongleCheck.h

In fact more recently I didn't bother with either, as the newer versions
did not create 'false' orphans on my database, and no relevant orphans
seemed to be recent enough to care about. Final conversion is still on
hold, waiting for some source code re-arrangement to remove the reliance
on the SourceSafe file sharing mechanism, so I might yet change my mind
again!

This classification is, as far as I can tell, something that can only be
done manually because after the DongleCheck project had been deleted
there exists no record of where these files were originally created (a
human can use heuristics when perusing the PhysicalAction table of
"these files were all created at about the same time and just after the
DongleCheck project, and oooh, they have the right names too, so they
must go there", but an automated script would have more trouble
classifying them).
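The "created at about the same time" heuristic can be sketched as a simple time-gap grouping. This is only an illustration in Python and not something vss2svn does; the function name batch_by_time, the 5-second gap, the timestamps, and the reduced (timestamp, physical name, item name) row layout are all assumptions made for the example. The result still needs a human to decide what each batch should be called.

```python
def batch_by_time(actions, max_gap=5):
    """Group (timestamp, physname, itemname) ADD rows into batches.

    A new batch starts whenever the gap to the previous row is larger
    than max_gap seconds.
    """
    batches = []
    previous = None
    for row in sorted(actions):
        if previous is None or row[0] - previous > max_gap:
            batches.append([])
        batches[-1].append(row)
        previous = row[0]
    return batches


# Made-up rows standing in for ADD actions from the PhysicalAction table.
rows = [
    (1000, "CAAAAAAA", "DongleCheck.cpp"),
    (1001, "DAAAAAAA", "DongleCheck.dsp"),
    (1002, "EAAAAAAA", "DongleCheck.h"),
    (5000, "FAAAAAAA", "Unrelated.txt"),
]

for batch in batch_by_time(rows):
    print([item for _, _, item in batch])
```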
Other heuristics may also exist later in the file's history (as viewed
in SourceSafe) such as the location of subsequent check-ins, or where
the file is subsequently said to have been shared from, but at present
this information is not extracted from the database, and even then,
coverage would be patchy at best.

I do wonder, however, if there might be some way that orphans could be
better organised by default, particularly when there is a whole batch of
files added at once that "obviously" belong together. Another database I
converted a few days ago (based on a couple of projects that had been
archived off the main database a long time ago due to swamping the
global revision history) generated nearly 4500 subfolders of orphaned
due to some projects with large numbers of files having been added,
deleted, added again, branched, deleted, branched again, etc. Navigating
this orphaned folder or tracking anything that occurred within it is, to
all intents and purposes, impractical (although the head revision of the
trunk does indeed match the sourcesafe version :) )

--
Stephen Lee
Software Engineer, Vision Group - Pro-Measure Leader
Wilcox Associates Inc. (U.K.)

From list at der-ingo.de Tue Apr 3 14:16:20 2007
From: list at der-ingo.de (Ingo Schmidt)
Date: Tue Apr 3 14:16:23 2007
Subject: Automated conversion script
In-Reply-To: <4612945C.1030301@wilcoxassoc.com>
References: <644610523.20070330115430@der-ingo.de> <460CF971.7000702@nogga.de><798961065.20070331085202@der-ingo.de> <460E72AE.7070503@nogga.de> <223470932.20070403161336@der-ingo.de> <4612945C.1030301@wilcoxassoc.com>
Message-ID: <484956583.20070403201620@der-ingo.de>

Hi Vss2Svn!

> Your patch essentially seems to be a much better approach to what I was
> originally trying to do as a post-processing task with my filterorphans.cpp.

I saw your filterorphans.cpp but since I am no C expert at all, I didn't
look too deeply into the source so I can't tell what it really does.
But it seemed to me like something like it could/should be directly in
the vss2svn script.

> There were two basic classes of orphan I encountered:
> a) 'false' orphans, where the vss2svn script had got confused, and was
> treating something as an orphan despite its parent still being alive and
> well (mainly due to confusing sets of moves / shares / etc.)

Luckily something like that didn't exist in our database, because
moves/shares were never done except in the case I described (share &
destroy original). So I didn't encounter any of these.

> classify these within the orphaned folder... i.e. rather than
> orphaned/_CAAAAAAA/DongleCheck.cpp
> orphaned/_DAAAAAAA/DongleCheck.dsp
> orphaned/_EAAAAAAA/DongleCheck.h
> ... etc.
> This became
> orphaned/DongleCheck/DongleCheck.cpp
> orphaned/DongleCheck/DongleCheck.dsp
> orphaned/DongleCheck/DongleCheck.h

Exactly! If you find out where these files once really were, I think
it is safe to do this!

> This classification is, as far as I can tell, something that can only be
> done manually [...]

Yes, this is what I did! I had vss2svn convert the whole database,
then looked in the files it created and checked that in the VSS
history. This is a tedious job and if you want to be on the safe side,
you have to do it manually, I guess.

But I compiled a list of all orphans, classified them and put that
into ActionHandler.pm and ran the script again. This time it gave me
all the pretty names and it also took care of creating all the
folder levels. That's why I liked my idea so much:
No messing with the (HUGE!!) dumpfiles!

> Navigating this orphaned folder or tracking anything that occured
> within it is, to all intents and purposes, impractical (although the
> head revision of the trunk does indeed match the sourcesafe version
> :) )

Hehe, yes, that's true! I got a lot of orphaned stuff. Throwing out
unwanted projects got rid of quite a few of them. And indeed, the head
revision of trunk was a 100% match every time!
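As an illustration of the "pretty names plus folder levels" idea, here is a minimal sketch in Python. It is not the actual change made to ActionHandler.pm, which is not shown in this thread; the function names classify and missing_parents and the mapping contents are hypothetical.

```python
import posixpath


def classify(orphan_path, mapping):
    """Map 'orphaned/_XXXXXXXX/rest' to 'orphaned/<name>/rest' if known."""
    parts = orphan_path.split("/", 2)
    if len(parts) == 3 and parts[0] == "orphaned" and parts[1] in mapping:
        return posixpath.join("orphaned", mapping[parts[1]], parts[2])
    return orphan_path


def missing_parents(path, existing):
    """List parent folders of path that are not in existing, shallowest first."""
    needed = []
    parent = posixpath.dirname(path)
    while parent and parent not in existing:
        needed.append(parent)
        parent = posixpath.dirname(parent)
    return needed[::-1]


# Hand-compiled classification list (hypothetical contents).
mapping = {
    "_CAAAAAAA": "DongleCheck",
    "_OIJAAAAA": "config/taglib",
}

for old in ("orphaned/_CAAAAAAA/DongleCheck.cpp", "orphaned/_OIJAAAAA/ghi.tld"):
    new = classify(old, mapping)
    print(new, missing_parents(new, {"orphaned"}))
```

Doing the mapping before the dumpfile is written means the extra folder levels can be created as ordinary directory adds, instead of being patched into an existing dump afterwards.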

Cheers, Ingo =;->

From stephen.lee at hexagonmetrology.com Tue Apr 3 14:51:21 2007
From: stephen.lee at hexagonmetrology.com (Stephen Lee)
Date: Tue Apr 3 14:51:35 2007
Subject: Automated conversion script
In-Reply-To: <484956583.20070403201620@der-ingo.de>
References: <644610523.20070330115430@der-ingo.de><460CF971.7000702@nogga.de><798961065.20070331085202@der-ingo.de><460E72AE.7070503@nogga.de> <223470932.20070403161336@der-ingo.de><4612945C.1030301@wilcoxassoc.com> <484956583.20070403201620@der-ingo.de>
Message-ID: <4612A229.1060805@wilcoxassoc.com>

Ingo Schmidt wrote:
>
> > orphaned/DongleCheck/DongleCheck.cpp
> > orphaned/DongleCheck/DongleCheck.dsp
> > orphaned/DongleCheck/DongleCheck.h
>
> Exactly! If you find out where these files once really were, I think
> it is safe to do this!
>

The point I was making was that by keeping them in (better classified)
subfolders of orphaned, this avoids conflicting with the proper head
revision, and effectively resurrecting projects that had been deleted.
Any files that are missing from the head revision without doing this
would be "false orphans"...

> Yes, this is what I did! I had vss2svn convert the whole database,
> then looked in the files it created and checked that in the VSS
> history. This is a tedious job and if you want to be on the safe side,
> you have to do it manually, I guess.
>

I also cross-referenced this against the
_vss2svn\datacache.PhysicalAction.tmp.txt file (imported into Excel and
sorted on the timestamp column). This way you can pull out a whole batch
of files that were added together and put the mapping from all these
physical names to a new folder.

> all the pretty names and it also took care of creating all the
> folder levels. That's why I liked my idea so much:
> No messing with the (HUGE!!) dumpfiles!
>

Indeed...
and much of the messing around in the cpp file was to deal with:

a) C being a terrible language for text processing
b) needing to fix up things like creating (or suppressing creation of)
   folders.

If I decide to bother classifying the orphans again, I'll almost
certainly do it similar to your way rather than trying to fix up
filterorphans.cpp.

--
Stephen Lee
Software Engineer, Vision Group - Pro-Measure Leader
Wilcox Associates Inc. (U.K.)

From finnied at aciworldwide.com Tue Apr 17 01:00:52 2007
From: finnied at aciworldwide.com (finnied@aciworldwide.com)
Date: Tue Apr 17 01:01:12 2007
Subject: Fw: ssphys error: failed to read necessary amount of data from input file
In-Reply-To: <4608DB18.3060905@nogga.de>
Message-ID:

Dirk,

What do you think about this problem? I was having another look at it,
and noticed that:

>> 86197 TUADAAAA EDADAAAA 1 ADD
>> 1 0 \N

is a little odd in that there is no name associated with the "ADD"
action. Is that what you meant by vss2svn being "confused"?

Thanks.

Dave

Dirk
Sent by: vss2svn-users-bounces@lists.pumacode.org
27-Mar-2007 06:51 PM
Please respond to Vss2Svn Users

To: Vss2Svn Users
cc:
Subject: Re: Fw: ssphys error: failed to read necessary amount of data from input file

Dirk wrote:
>
>> 86197 TUADAAAA EDADAAAA 1 ADD
>> 1 0 \N
>> 86198 LUADAAAA TUADAAAA 1 ADD
>> /orphaned/_YEDAAAAA/S01/InstallShield/WebGate/Setup
>> Files/Uncompressed Files/Language Independent/ 1
>> 0 \N

OK, the problem is that you managed to add a subproject to a project at
exactly the same timestamp as you added the project to its parent
project. vss2svn is confused by this, and as a result it first adds the
subproject, then the project itself. The above two actions must actually
be reversed.

Here are the corresponding lines from your physical history without any
confidential information:

> 58593 TUADAAAA \N LUADAAAA ADD Language Independent/ 1 1139237015 xxxx 0 \N 1 AAAADAUT 1 \N \N
> 98983 EDADAAAA \N TUADAAAA ADD OS Independent/ 1 1139237015 xxxx 0 \N 1 AAAADADE 1 \N \N

TUADAAAA is added to LUADAAAA at timestamp 1139237015.
EDADAAAA is added to TUADAAAA also at timestamp 1139237015.

Gimme some time to think about a solution.

Dirk

_______________________________________________
vss2svn-users mailing list
Project homepage: http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin: http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives): http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pumacode.org/pipermail/vss2svn-users-lists.pumacode.org/attachments/20070417/bf336b67/attachment.html

From oavtal at bezeqint.net Tue Apr 17 09:57:01 2007
From: oavtal at bezeqint.net (Ori Avtalion)
Date: Tue Apr 17 09:57:06 2007
Subject: BUG: control character in text input --> not well-formed
Message-ID:

>As a workaround you can modify the line 1105 in vss2svn.pl [1] to also
>include the "x0c" character.
>> $gSysOut =~ s/[\x00-\x09\x0c\x11\x12\x14-\x1F\x81\x8D\x8F\x90\x9D]/_/g; # just to be sure
>
>But what is the general solution?

Hi,

Is it possible to get a special build of vss2svn.exe just with this fix?
I get this error too, and setting up a perl environment is not possible.

Thanks

From toby at etjohnson.us Tue Apr 17 10:52:07 2007
From: toby at etjohnson.us (Toby Johnson)
Date: Tue Apr 17 10:50:41 2007
Subject: [PATCH] Fix wrong deleting order
In-Reply-To: <001701c772ff$3a3383b0$8ce2b8d5@meleshko>
References: <001701c772ff$3a3383b0$8ce2b8d5@meleshko>
Message-ID: <4624DF17.3080403@etjohnson.us>

Sergei Meleshko wrote:
> Hi,
>
> I would propose a patch that fixes a wrong order of deleting
> subfolders that were deleted at the same second. For instance we have
> a vss database with projects:
>
> $/
>   Project
>     SubProject
>   SubProject
>
> It is possible to delete projects $/Project and $/Project/SubProject
> at the same second using the ss.exe utility. In that case, after
> converting, the project $/SubProject will also be deleted in the dump
> file, because vss2svn deletes $/Project, then $/SubProject instead of
> $/Project/SubProject.
>
> Patch file as well as test source vss database attached to this message.

Hi Sergei, thanks for the patch; I apologize I haven't had a chance to
look at it yet but I've created ticket #50 to make sure it doesn't get
lost.

toby

From kenneth_lakin at yahoo.com Thu Apr 19 02:33:29 2007
From: kenneth_lakin at yahoo.com (Kenneth Lakin)
Date: Thu Apr 19 02:33:49 2007
Subject: OOM issues: large files AND many commits in SS databases
Message-ID: <383024.46314.qm@web54513.mail.yahoo.com>

All,

I'm running vss2svn and ssphys (from SVN rev 309) on a Windows XP Pro
machine w/ 2GB of RAM. I compiled perl from the ActiveState 5.8.8
sources. I've enabled perl's native memory management.

I have two separate issues.

The first:

*Summary:
SanityChecker needs to start writing out its data to disk; perhaps in an
SQLite DB or something.

*Exposition:
I'm processing a SSafe database that has well over 3000 commits.
Everything's fine until the IMPORTSVN phase. As we work through the
IMPORTSVN phase, perl keeps using more and more memory. 'Round about the
time that SVN revision 2947 is being processed, perl's using 1.5GB of
ram. vss2svn goes to load a ~250MB file, and cannot allocate the memory
to do so. perl halts, on line 719 of Dumpfile.pm
( $node->{text} = do { local( $/ ) ; } ; ) and I'm left with a 9GB SVN
dump file that still has two years of commits to go!

I've looked all through Vss2Svn::DumpFile and associated classes, and
made a short run thru the debugger. Unless I've messed up my perl
compilation, I can't find anything other than SanityChecker that might
be eating so much memory.

I'm going to get to work on making SanityChecker write to an SQLite
database today, unless someone has a better idea.

The second issue:
I have a patch for DumpFile::get_export_contents() that (for me) works
better than the one here:

http://www.pumacode.org/projects/vss2svn/ticket/25

(I'll submit the patch to the list within 24 hrs.)

All in all, this is a really fine tool! Thanks for making it available.

Cheers,
Kenneth Lakin
Simon C. Ion Software

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo!
Mail has the best spam protection around
http://mail.yahoo.com

From toby at etjohnson.us Thu Apr 19 16:31:42 2007
From: toby at etjohnson.us (Toby Johnson)
Date: Thu Apr 19 16:31:55 2007
Subject: OOM issues: large files AND many commits in SS databases
In-Reply-To: <383024.46314.qm@web54513.mail.yahoo.com>
References: <383024.46314.qm@web54513.mail.yahoo.com>
Message-ID: <4627D1AE.2050107@etjohnson.us>

Kenneth Lakin wrote:
> All,
>
> I'm running vss2svn and ssphys (from SVN rev 309) on a Windows XP Pro
> machine w/ 2GB of RAM. I compiled perl from the ActiveState 5.8.8
> sources. I've enabled perl's native memory management.
> I have two separate issues.
> The first:
> *Summary:
> SanityChecker needs to start writing out its data to disk; perhaps in
> an SQLite DB or something.
>
> *Exposition:
> I'm processing a SSafe database that has well over 3000 commits.
> Everything's fine until the IMPORTSVN phase. As we work through the
> IMPORTSVN phase, perl keeps using more and more memory. 'Round about
> the time that SVN revision 2947 is being processed, perl's using 1.5GB
> of ram. vss2svn goes to load a ~250MB file, and cannot allocate the
> memory to do so. perl halts, on line 719 of Dumpfile.pm
> ( $node->{text} = do { local( $/ ) ; } ; ) and I'm left with a 9GB SVN
> dump file that still has two years of commits to go!
> I've looked all through Vss2Svn::DumpFile and associated classes, and
> made a short run thru the debugger. Unless I've messed up my perl
> compilation, I can't find anything other than SanityChecker that might
> be eating so much memory.
>
> I'm going to get to work on making SanityChecker write to an SQLite
> database today, unless someone has a better idea.
>

Hi Kenneth, thanks for writing. You're losing me a bit here though. The
issue in Dumpfile.pm in which it pulls the entire file into memory is
certainly an issue (as documented in ticket 25 which you mention) and
one that I would like to fix.
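The whole-file read described here can be avoided by streaming the file in fixed-size chunks. The sketch below shows the idea in Python; it is not vss2svn code and is not the patch from ticket 25. One assumption to flag: a dumpfile node needs its text length, and usually a checksum, in the headers before the data, so the sketch makes one chunked pass for size and MD5 and a second pass for the copy. The names md5_and_size, copy_stream and CHUNK_SIZE are made up for the example.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def md5_and_size(stream):
    """First pass: checksum and length without holding the whole file."""
    digest = hashlib.md5()
    size = 0
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
        digest.update(chunk)
        size += len(chunk)
    return digest.hexdigest(), size


def copy_stream(src, dst):
    """Second pass: copy src to dst chunk by chunk; return bytes copied."""
    copied = 0
    for chunk in iter(lambda: src.read(CHUNK_SIZE), b""):
        dst.write(chunk)
        copied += len(chunk)
    return copied


# In-memory stand-ins for the exported file and the dumpfile.
source = io.BytesIO(b"x" * (3 * CHUNK_SIZE + 7))
checksum, size = md5_and_size(source)
source.seek(0)
target = io.BytesIO()
copied = copy_stream(source, target)
print(size, copied, checksum)
```

Peak memory for the file contents stays at roughly one chunk, regardless of how large the exported file is.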
Doing a buffered read isn't terribly difficult, but rewriting it to
write the data directly from source to target file will take some more
coding.

However, I don't understand where you're getting to SanityChecker, or
what else you're planning on writing to SQLite. It certainly is odd that
you are seeing steadily increasing memory though; I've never seen it get
to that high of usage although others might have. The number of commits
(~3000) you're dealing with is certainly not huge but if you have
numerous multi-hundred-megabyte files then that could definitely be a
problem. But again, that would be solved by fixing Dumpfile.pm.

Is there a reason you compiled Perl from source, instead of using
ActiveState's binary version?

> The second issue:
> I have a patch for DumpFile::get_export_contents() that (for me) works
> better than the one here:
>
> http://www.pumacode.org/projects/vss2svn/ticket/25
> (I'll submit the patch to the list within 24 hrs.)
>
> All in all, this is a really fine tool! Thanks for making it available.
>

Glad it is working well for you; please feel free to send the patches if
you get it working with less memory. My time to work on this project is
very rare these days but I would definitely be interested in reducing
the overall memory footprint.

toby

From kenneth_lakin at yahoo.com Thu Apr 19 20:43:26 2007
From: kenneth_lakin at yahoo.com (Kenneth Lakin)
Date: Thu Apr 19 20:43:31 2007
Subject: OOM issues: large files AND many commits in SS databases
Message-ID: <207694.18820.qm@web54514.mail.yahoo.com>

----- Original Message ----
From: Toby Johnson
To: Vss2Svn Users
Sent: Thursday, April 19, 2007 3:31:42 PM
Subject: Re: OOM issues: large files AND many commits in SS databases

> Hi Kenneth, thanks for writing. You're losing me a bit here though. The
> issue in Dumpfile.pm in which it pulls the entire file into memory is
> certainly an issue (as documented in ticket 25 which you mention) and
> one that I would like to fix.
> Doing a buffered read isn't terribly
> difficult, but rewriting it to write the data directly from source to
> target file will take some more coding.

Please take everything that I say here with a grain of salt. I've been
looking at perl -and your script- for about a week.

I'm uncertain that the file write-out in Dumpfile.pm is at the core of
my issue. As I read it, the routine that is performing the un-buffered
read is Dumpfile::get_export_contents. This only gets called from
Dumpfile::add_handler and commit_handler. So, if we have an ADD or a
COMMIT, we'll eat up more memory, until we get rid of those nodes.

So, the memory that we consume with the un-buffered read should be
released when each revision gets written out. Right?
Along those lines: Aren't all data structures in Dumpfile flushed after
each revision, except for those in SanityChecker?

> However, I don't understand where you're getting to SanityChecker

> The number of commits (~3000) you're dealing with is certainly not huge
...
> but if you have numerous multi-hundred-megabyte files
> then that could definitely be a
> problem. But again, that would be solved by fixing Dumpfile.pm.

I mis-typed. According to the SvnRevisionVssAction table in vss_data.db,
I have 70,344 VSS actions, and 4,901 SVN revisions for this database.
However, I don't have many multi-hundred-meg files in this database. Out
of 34,593 files in the latest revision, six are greater than 100M and 34
are larger than 10M.

> Is there a reason you compiled Perl from source, instead of using
> ActiveState's binary version?

When I ran the binary version two days ago, it used the OS's native
memory allocation routines. So, when perl fails to allocate memory, I
get a crash that's handled by the Windows error reporting mechanisms;
not perl's.
This means that I have no idea what line of the script caused the
interpreter to fail, as the backtrace that I get from Windows starts me
off deep in the C code for Perl's memory allocation routines, and works
up to the C code for *starting* the interpreter...

When using perl's mem alloc routines in the same situation, I get a
message from perl that says something like "Invalid request for memory
on line 300 in file.pl". This is much more informative!

> The second issue:
> I have a patch for DumpFile::get_export_contents() that (for me) works
> better than the one here:
>
> http://www.pumacode.org/projects/vss2svn/ticket/25
> (I'll submit the patch to the list within 24 hrs.)

The patch is attached. It does two things:
1) It patches output_node to take a reference to the incoming node and
   output_content to take a reference to the data that it's going to
   write out.
2) It uses syswrite instead of print to write out that data.

Both of these changes reduce the memory footprint, and enabled me to
process another database that required 1GB of RAM really early in the
IMPORTSVN phase.

> My time to work on this project is
> very rare these days.

Aye, I read as much from the archives. Thanks for your input!

> but I would definitely be interested in reducing
> the overall memory footprint.

As would I. I can't convert this particular DB until the footprint is
reduced!

-Kenneth

-------------- next part --------------
A non-text attachment was scrubbed...
Name: =?utf-8?q?DumpfileLargeFile.patch?=
Type: application/octet-stream
Size: 1929 bytes
Desc: not available
Url : http://lists.pumacode.org/pipermail/vss2svn-users-lists.pumacode.org/attachments/20070419/30489f0c/utf-8qDumpfileLargeFile.obj

From toby at etjohnson.us Fri Apr 20 16:13:02 2007
From: toby at etjohnson.us (Toby Johnson)
Date: Fri Apr 20 16:13:55 2007
Subject: OOM issues: large files AND many commits in SS databases
In-Reply-To: <207694.18820.qm@web54514.mail.yahoo.com>
References: <207694.18820.qm@web54514.mail.yahoo.com>
Message-ID: <46291ECE.1000109@etjohnson.us>

Kenneth Lakin wrote:
> So, the memory that we consume with the un-buffered read should be
> released when each revision gets written out. Right?
> Along those lines: Aren't all data structures in Dumpfile flushed
> after each revision, except for those in SanityChecker?
>

As far as I know, both counts are correct. So the fact that it keeps
getting bigger over time seems to indicate that either the sanity
checker is indeed at fault, or your perl interpreter itself has a memory
leak. That's why I asked about trying the AS binary; I didn't realize
you already had w/ the same results... so I agree that the sanity
checker seems to be the main culprit.

You may want to try running the old 0.11.0-alpha1 version[1] to at least
test this theory, since the sanity checker was much less ambitious and
therefore kept much less state data at that point.

[1]http://www.pumacode.org/download/vss2svn/vss2svn-0.11.0-alpha1.zip

> The patch is attached. It does two things:
> 1) It patches output_node to take a reference to the incoming node and
> output_content to take a reference to the data that it's going to
> write out.
> 2) It syswrite instead of print to write out that data.
> Both of these changes reduce the memory footprint, and enabled me to
> process another database that required 1GB of RAM really early in the
> IMPORTSVN phase.
>

Thanks, looking at that again I don't know why on earth I was passing
the whole text contents directly, that doesn't make any sense at all w/
Perl's pass-by-value strings!

toby

From amos.shapira at gmail.com Fri Apr 20 20:52:30 2007
From: amos.shapira at gmail.com (Amos Shapira)
Date: Fri Apr 20 20:52:34 2007
Subject: OOM issues: large files AND many commits in SS databases
In-Reply-To: <46291ECE.1000109@etjohnson.us>
References: <207694.18820.qm@web54514.mail.yahoo.com> <46291ECE.1000109@etjohnson.us>
Message-ID: <9c2cca270704201752u7490e5c9t9b2a21c65560f954@mail.gmail.com>

On 21/04/07, Toby Johnson wrote:
>
> You may want to try running the old 0.11.0-alpha1 version[1] to at least
> test this theory, since the sanity checker was much less ambitious and
> therefore kept much less state data at that point.
>
> [1]http://www.pumacode.org/download/vss2svn/vss2svn-0.11.0-alpha1.zip

I've missed the beginning of this thread, and only tried to run vss2svn
once on a Debian machine and for what it's worth my trial just gobbled
up the entire RAM and ran for a long time before I killed it. I'll
install an extra 1Gb RAM on top of the existing 512 and try to run it
again as well as the version you suggest above.

A side note - It's hard to tell the version I have, there is no
"--version" option and no mention of any SVN keyword in the source code.
It would be useful if you could "svn propset svn:keywords 'HeadURL Id'"
and include "$HeadURL$" and "$Id$" in the script.

Cheers,

--Amos

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pumacode.org/pipermail/vss2svn-users-lists.pumacode.org/attachments/20070421/c5f2c289/attachment.html

From kenneth_lakin at yahoo.com Mon Apr 23 19:18:03 2007
From: kenneth_lakin at yahoo.com (Kenneth Lakin)
Date: Mon Apr 23 19:18:24 2007
Subject: OOM issues: large files AND many commits in SS databases
Message-ID: <545632.38927.qm@web54513.mail.yahoo.com>

----- Original Message ----
From: Toby Johnson
To: Vss2Svn Users
Sent: Friday, April 20, 2007 3:13:02 PM
Subject: Re: OOM issues: large files AND many commits in SS databases

> You may want to try running the old 0.11.0-alpha1 version[1] to at least
> test this theory, since the sanity checker was much less ambitious and
> therefore kept much less state data at that point.

I gave this revision a try... it's still using too much RAM. This
revision died just before processing an SVN ADD of a ~220MB file.
(Windows XP SP2). SVN HEAD died some time shortly after processing the
same file. (IIRC)

Something just occurred to me... This rev of vss2svn doesn't have the
patch for ticket #25 applied to it, does it?

I'll grab the source for this revision (I assume that it's tagged as
0.10.0-a1 in SVN), apply my patch to DumpFile.pm and see what happens.

Also, I haven't yet had time to make any progress on pushing out
SanityChecker to a database.

Thanks for your input and assistance,
-Kenneth

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo!
Mail has the best spam protection around
http://mail.yahoo.com

From toby at etjohnson.us Mon Apr 23 23:25:46 2007
From: toby at etjohnson.us (Toby Johnson)
Date: Mon Apr 23 23:25:51 2007
Subject: OOM issues: large files AND many commits in SS databases
In-Reply-To: <9c2cca270704201752u7490e5c9t9b2a21c65560f954@mail.gmail.com>
References: <207694.18820.qm@web54514.mail.yahoo.com> <46291ECE.1000109@etjohnson.us> <9c2cca270704201752u7490e5c9t9b2a21c65560f954@mail.gmail.com>
Message-ID: <462D78BA.50103@etjohnson.us>

Amos Shapira wrote:
> It's hard to tell the version I have, there is no "--version" option
> and no mention of any SVN keyword in the source code.
> It would be useful if you could "svn propset svn:keywords 'HeadURL
> Id'" and include "$HeadURL$" and "$Id$" in the script.

This is already included in the output of the latest version, but there
probably should be a dedicated --version option.

toby

From list at der-ingo.de Wed Apr 25 17:10:30 2007
From: list at der-ingo.de (Ingo Schmidt)
Date: Wed Apr 25 17:10:32 2007
Subject: Automated conversion script
In-Reply-To: <644610523.20070330115430@der-ingo.de>
References: <644610523.20070330115430@der-ingo.de>
Message-ID: <1286330289.20070425231030@der-ingo.de>

Hi!

I have finally had the time to finish documenting my conversion script
that I created and used to convert the VSS database at my company.

You can find the script here:
http://www.der-ingo.de/bin/vss2svn/convert.zip

The included readme.txt is hopefully detailed enough to help everyone
get it going.

I want to point out that I am no expert in writing bash scripts, so the
script may not be perfect. However, it did get the job done very well.

Any feedback is of course welcome.

vss2svn guys: If you think this script is useful, feel free to put it
into the contribution section.
Cheers,
Ingo =;->

From jamesh at ali.com.au Wed Apr 25 18:36:57 2007
From: jamesh at ali.com.au (Hall, James)
Date: Wed Apr 25 18:37:07 2007
Subject: Hello
Message-ID: <6CB8A668476DF44987D826AA4CCCF5F4014E20F2@AURB-EXCH-V1.ali.local>

Hey Guys

I am new to this mailing list and need a little help

Currently we have a VSS database that we are using to store graphical content. Currently we are using the following features of VSS
- Sharing
- Versioning
- Labeling

While I admire what SVN is, we feel that SVN is not the right choice. We are selecting Alienbrain (from Avid) as our SCM for graphical assets. Now this is where the questions start. Alienbrain comes with a vss to AB migration script but it is not very robust.

I would prefer to use vss2svn as the tool to help us with the migration of the data.

My approaches are as follows
1. Hack the current vss2svn script to push content into AB
2. Create a script to take the intermediate data that the vss2svn script creates and use that to create AB actions to push data into it.
3. Push the vss data into svn and then write a script to take it out and push it into AB

Option 2 is the approach I am most comfortable with, but I will need a breakdown of what the intermediate data is (the data found in the "_vss2svn" folder). I am also open to other suggestions.

One critical point is that our VSS database does have some issues, but vss2svn seemed to parse it OK (unlike the Alienbrain conversion script).

Cheers
James

From finnied at aciworldwide.com Wed Apr 25 18:38:53 2007
From: finnied at aciworldwide.com (finnied@aciworldwide.com)
Date: Wed Apr 25 18:39:02 2007
Subject: Dave Finnie is out of the office.
Message-ID:

I will be out of the office starting 25/04/2007 and will not return until 02/05/2007. I will *not* be checking my emails.

From toby at etjohnson.us Thu Apr 26 10:01:52 2007
From: toby at etjohnson.us (Toby Johnson)
Date: Thu Apr 26 10:01:43 2007
Subject: Hello
In-Reply-To: <6CB8A668476DF44987D826AA4CCCF5F4014E20F2@AURB-EXCH-V1.ali.local>
References: <6CB8A668476DF44987D826AA4CCCF5F4014E20F2@AURB-EXCH-V1.ali.local>
Message-ID: <4630B0D0.1010102@etjohnson.us>

Hall, James wrote:
> Hey Guys
>
> I am new to this mailing list and need a little help
>
> Currently we have a VSS database that we are using to store graphical
> content. Currently we are using the following features of VSS
> - Sharing
> - Versioning
> - Labeling
>
> While I admire what SVN is, we feel that SVN is not the right choice.
> We are selecting Alienbrain (from Avid) as our SCM for graphical assets.
> Now this is where the questions start. Alienbrain comes with a vss to
> AB migration script but it is not very robust.
>
> I would prefer to use vss2svn as the tool to help us with the migration
> of the data.
>
> My approaches are as follows
> 1. Hack the current vss2svn script to push content into AB
> 2. Create a script to take the intermediate data that the vss2svn
> script creates and use that to create AB actions to push data into it.
> 3. Push the vss data into svn and then write a script to take it out and
> push it into AB
>
> Option 2 is the approach I am most comfortable with, but I will need a
> breakdown of what the intermediate data is (the data found in the "_vss2svn"
> folder).
>

Hello James,

In my opinion, option 1 is probably your best bet. I originally designed the script with the intent of a modular "output" approach whereby the data gleaned from VSS could be output in any manner. My plan was to allow a "direct to SVN repository" approach (in addition to the SVN dumpfile approach) but there's no reason it wouldn't also work for a completely different target system.

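The modular "output" idea described above can be pictured roughly as follows. This is only a sketch in Python for illustration (vss2svn itself is written in Perl), and every name in it (OutputHandler, DumpfileOutput, AlienbrainOutput, run_import, the shape of an "action") is hypothetical and not taken from the vss2svn source:

```python
from abc import ABC, abstractmethod


class OutputHandler(ABC):
    """Hypothetical interface: the importer hands each action to handle()."""

    @abstractmethod
    def handle(self, action: dict) -> None:
        ...


class DumpfileOutput(OutputHandler):
    """Stand-in for a dumpfile writer; it just collects text records."""

    def __init__(self):
        self.records = []

    def handle(self, action):
        self.records.append(f"{action['type']} {action['path']}")


class AlienbrainOutput(OutputHandler):
    """Placeholder for another target; it records the calls it would make."""

    def __init__(self):
        self.calls = []

    def handle(self, action):
        self.calls.append(("ab-" + action["type"].lower(), action["path"]))


def run_import(actions, output):
    """Feed every action to whichever output backend was chosen."""
    for action in actions:
        output.handle(action)
    return output


# Made-up actions, only to show that the same data can drive both backends.
actions = [
    {"type": "ADD", "path": "/art/logo.png"},
    {"type": "COMMIT", "path": "/art/logo.png"},
]
dump = run_import(actions, DumpfileOutput())
ab = run_import(actions, AlienbrainOutput())
```

The point of the design is that the part that reads VSS does not need to know which backend receives the actions, so a new target only needs a new handler.
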
The mile-high overview of what you would need to do is:

* Add logic to call a function other than ImportToSvn as the last step of the import (which currently just calls CreateSvnDumpfile); and
* Use the "CreateSvnDumpfile" subroutine as a guide to writing an AB-specific output module.

If you are more interested in option 2, there really isn't much documentation on what the intermediate files are. The .txt files are all "cache" files which are in turn imported into the SQLite database, so that database contains all the data you should need. Its schema is relatively straightforward, so using the existing script for guidance you should be able to figure out what is there.

toby
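
For option 2, a first step could be to list the tables and columns of that SQLite database. The following is a minimal sketch using Python's standard sqlite3 module. The thread does not name the database file or its tables, so the demo runs against an in-memory stand-in with a made-up table (ExampleAction); describe_schema is likewise a hypothetical helper, not part of vss2svn:

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [column names]} for every table in the database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        quoted = name.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[name] = [column[1] for column in columns]
    return schema


# Stand-in database; the table and its columns are invented for the demo.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE ExampleAction (id INTEGER, itempath TEXT, actiontype TEXT)"
)
demo_schema = describe_schema(demo)
for table, columns in demo_schema.items():
    print(table, columns)
```

To look at real data, the same function could be called on sqlite3.connect(path), where path points at the database file inside the "_vss2svn" folder; the file name is left as a parameter here because it is not given in this thread.
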