From brucetwilson at gmail.com Mon Jun 2 14:38:31 2008
From: brucetwilson at gmail.com (Bruce Wilson)
Date: Mon Jun 2 14:38:40 2008
Subject: Migration complete
Message-ID: <48443E27.7080905@toomuchblue.com>

Hi, all.

I just wanted to report that I have successfully migrated my repository
using the 2008-04-30 nightly build. My VSS repository was about 500MB,
representing the (off and on) work of about 8 developers over 13 years.
The dump file contained about 2900 revs. The folders other than
"orphaned" and "labels" were byte-for-byte matches. The migration was
loaded into an existing repository with a "vendor" tree without any
reported errors.

I tripped over some of those "orphaned" problems which may have been
addressed by Alexander's 0007 patch, but in my case they all turned out
to be caused by "destroy permanently" and were disposable. Patches 0001
and 0002 sure would have made the migration more enjoyable.

I performed the migration using a batch file script I developed for the
purpose, based on suggestions from here and the wiki, adding in some of
my own preferences about how I wanted to test. I plan to annotate it and
post it to the wiki for future reference by others.

Things I wish I had done differently:

  * Run ANALYZE using VSS 2005 instead of 6.0d. This idea occurred to
    me only the day of the migration. It eliminated many errors that
    slipped through with 6.0d that I was planning to "just deal with".
  * Interleaved the releases by date with the vendor branch. My
    repository now contains rev 362 (2008) followed by rev 363 (1995)
    which could be confusing. The effort may have outweighed the
    benefit, so this was a low priority.
  * Edited the dumpfile to make the developer names consistent. (We
    used first name, then employee number, then
    first-initial-last-name at various times.) Changing the "length"
    entry right before each developer name would have been trivial in
    the same search-and-replace.
  * Not spent so much time experimenting with rev-interval settings.
    It didn't make enough difference to be worth the effort.
  * Migrated about two years ago. We've already reaped significant
    benefits from the move. But also...
  * Since I waited this long, it might have been worth waiting a
    little longer for Subversion 1.5 to take advantage of sharding.

A huge shout-out to Toby for your ongoing work on this product. I know
it's not something of personal value to you anymore, but please don't be
discouraged. Your efforts on the product are of immeasurable value for
those of us who haven't yet crossed that bridge. I plan to stay on the
list to contribute where I can.

From nathan-svn at spicycrypto.ca Mon Jun 2 15:04:05 2008
From: nathan-svn at spicycrypto.ca (Nathan Kidd)
Date: Mon Jun 2 15:04:11 2008
Subject: Migration complete
In-Reply-To: <48443E27.7080905@toomuchblue.com>
References: <48443E27.7080905@toomuchblue.com>
Message-ID: <48444425.5020101@spicycrypto.ca>

Bruce Wilson wrote:
> Things I wish I had done differently:

I think this is a very helpful list. Thanks.

> * Run ANALYZE using VSS 2005 instead of 6.0d. This idea occurred to
>   me only the day of the migration. It eliminated many errors that
>   slipped through with 6.0d that I was planning to "just deal with".
> * Interleaved the releases by date with the vendor branch. My
>   repository now contains rev 362 (2008) followed by rev 363 (1995)
>   which could be confusing. The effort may have outweighed the
>   benefit, so this was a low priority.

Of course this only works if it's okay for the existing repo's numbers
to change. (I.e. any logs, records, etc. that reference revision
numbers will suddenly be wrong if you interleave the changes).

> * Edited the dumpfile to make the developer names consistent.
>   (We used first name, then employee number, then
>   first-initial-last-name at various times.) Changing the "length"
>   entry right before each developer name would have been trivial in
>   the same search-and-replace.

With only 2,900 revisions, an after-migration 'svn propset' would
actually be easier, IMO. Remember author names are revision properties,
and not versioned, so your history won't be full of 2,900 "author
changed" revisions.

> * Not spent so much time experimenting with rev-interval settings.
>   It didn't make enough difference to be worth the effort.
> * Migrated about two years ago. We've already reaped significant
>   benefits from the move. But also...
> * Since I waited this long, it might have been worth waiting a
>   little longer for Subversion 1.5 to take advantage of sharding.

When (if? :) 1.5 comes out you can easily shard by a dump-load cycle,
so there's no real benefit from waiting.

-Nathan

From brucetwilson at gmail.com Mon Jun 2 16:00:56 2008
From: brucetwilson at gmail.com (Bruce Wilson)
Date: Mon Jun 2 16:01:06 2008
Subject: Migration complete
In-Reply-To: <48444425.5020101@spicycrypto.ca>
References: <48443E27.7080905@toomuchblue.com> <48444425.5020101@spicycrypto.ca>
Message-ID: <48445178.5000303@toomuchblue.com>

Nathan Kidd wrote:
> Bruce Wilson wrote:
>> * Interleaved the releases by date with the vendor branch. My
>>   repository now contains rev 362 (2008) followed by rev 363 (1995)
>>   which could be confusing. The effort may have outweighed the
>>   benefit, so this was a low priority.
> Of course this only works if it's okay for the existing repo's numbers
> to change. (I.e. any logs, records, etc. that reference revision
> numbers will suddenly be wrong if you interleave the changes).

This wasn't the case in my migration (no revs named in comments), but I
can see how that would be important to consider.
Best approach I can think of to achieve this would have been to split
the migration dump file every time I wanted to inject a Vendor commit
(about 150 in all). Too much work.

>> * Edited the dumpfile to make the developer names consistent. (We
>>   used first name, then employee number, then
>>   first-initial-last-name at various times.) Changing the "length"
>>   entry right before each developer name would have been trivial in
>>   the same search-and-replace.
>
> With only 2,900 revisions, an after-migration 'svn propset' would
> actually be easier, IMO. Remember author names are revision
> properties, and not versioned, so your history won't be full of 2,900
> "author changed" revisions.

Hmm, that sounds interesting. The tricky thing would seem to be
identifying which revs need to be updated, right? I could grep (a copy
of...) the revprops folder for clues I suppose. Know of any scripts or
recipes for this?

>> * Not spent so much time experimenting with rev-interval
>>   settings. It didn't make enough difference to be worth the effort.
>> * Migrated about two years ago. We've already reaped significant
>>   benefits from the move. But also...
>> * Since I waited this long, it might have been worth waiting a
>>   little longer for Subversion 1.5 to take advantage of sharding.
> When (if? :) 1.5 comes out you can easily shard by a dump-load cycle,
> so there's no real benefit from waiting.

True, and I've seen an in-place reshard script is around, too. If 1.5
isn't far off, I can save myself the reshard as well. My main job is
billable work for clients, so anything involving "reloading the
repository" (which is already apparently working) doesn't win many
points.
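
The question above, how to identify which revisions need their author
fixed, can probably be handled without touching the revprops folder
directly. What follows is only a minimal Python sketch, not a tested
migration tool. It reads svn:author for each revision with
'svn propget --revprop', maps old spellings to one canonical name, and
writes back only the revisions that actually change. The alias table,
the example names in it, and all function names are hypothetical. It
also assumes the repository already has a pre-revprop-change hook that
permits the change; without one, 'svn propset --revprop' is refused.

```python
import subprocess

# Hypothetical alias table: every old spelling -> one canonical name.
ALIASES = {
    "alice": "asmith",
    "1042": "asmith",
}


def plan_changes(authors_by_rev, aliases):
    """Return {rev: new_author} for revisions whose author must change."""
    changes = {}
    for rev, author in sorted(authors_by_rev.items()):
        new_author = aliases.get(author, author)
        if new_author != author:
            changes[rev] = new_author
    return changes


def read_author(repo_url, rev):
    """Read the svn:author revision property of one revision."""
    result = subprocess.run(
        ["svn", "propget", "svn:author", "--revprop", "-r", str(rev), repo_url],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def write_author(repo_url, rev, author):
    """Overwrite svn:author (needs a permissive pre-revprop-change hook)."""
    subprocess.run(
        ["svn", "propset", "svn:author", "--revprop", "-r", str(rev),
         author, repo_url],
        check=True,
    )


def fix_authors(repo_url, last_rev, aliases, dry_run=True):
    """Print the planned author changes; apply them unless dry_run is set."""
    authors = {rev: read_author(repo_url, rev)
               for rev in range(1, last_rev + 1)}
    changes = plan_changes(authors, aliases)
    for rev, new_author in changes.items():
        print(f"r{rev}: {authors[rev]!r} -> {new_author!r}")
        if not dry_run:
            write_author(repo_url, rev, new_author)
    return changes
```

A dry run such as fix_authors("svn://repo/eod", 2900, ALIASES) only
prints the planned changes; passing dry_run=False applies them. Keeping
the planning step separate from the svn calls makes it easy to review
the list of affected revisions before anything is rewritten.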
From nathan-svn at spicycrypto.ca Mon Jun 2 17:12:29 2008
From: nathan-svn at spicycrypto.ca (Nathan Kidd)
Date: Mon Jun 2 17:12:34 2008
Subject: Migration complete
In-Reply-To: <48445178.5000303@toomuchblue.com>
References: <48443E27.7080905@toomuchblue.com> <48444425.5020101@spicycrypto.ca> <48445178.5000303@toomuchblue.com>
Message-ID: <4844623D.3080404@spicycrypto.ca>

Bruce Wilson wrote:
> Nathan Kidd wrote:
>> Bruce Wilson wrote:
>>> * Edited the dumpfile to make the developer names consistent. (We
>>>   used first name, then employee number, then
>>>   first-initial-last-name at various times.) Changing the "length"
>>>   entry right before each developer name would have been trivial in
>>>   the same search-and-replace.
>>
>> With only 2,900 revisions, an after-migration 'svn propset' would
>> actually be easier, IMO. Remember author names are revision
>> properties, and not versioned, so your history won't be full of 2,900
>> "author changed" revisions.
>
> Hmm, that sounds interesting. The tricky thing would seem to be
> identifying which revs need to be updated, right? I could grep (a copy
> of...) the revprops folder for clues I suppose. Know of any scripts or
> recipes for this?

Roughly:

    for rev in 1 .. 2900:
        author = svn propget svn:author --revprop svn://repo/eod -r$rev
        if author == "foo":
            author = "bar"
            svn propset svn:author $author --revprop svn://repo/eod -r$rev

But you've got to have enabled an (at least empty) pre-revprop-change
hook script first (check the svn book).

-Nathan

From agavrilov at tepkom.ru Mon Jun 2 18:31:36 2008
From: agavrilov at tepkom.ru (Alexander N. Gavrilov)
Date: Mon Jun 2 18:31:59 2008
Subject: Migration complete
In-Reply-To: <48443E27.7080905@toomuchblue.com>
References: <48443E27.7080905@toomuchblue.com>
Message-ID: <200806030231.37902.agavrilov@tepkom.ru>

On Monday 02 June 2008 22:38:31 Bruce Wilson wrote:
> I tripped over some of those "orphaned" problems which may have been
> addressed by Alexander's 0007 patch, but in my case they all turned out
> to be caused by "destroy permanently" and were disposable.

Hello,

Just to clarify, patch 0007 does not fix any problems with orphans; it
is in fact meant as an easy way to safely remove all such projects at
once. It works by rewriting all actions that interact both with orphaned
and live folders so that they don't touch orphans. It also runs an
additional pass of action processing to find orphans that have some of
their subfolders moved into the live area -- removing them would require
rewriting one move action into multiple adds, which is way too complex
and error-prone.

In the case of our repository, I found three causes of orphans:

1) Projects that I destroyed before the conversion.
2) Perhaps, some projects that were removed a long time ago (e.g. old
   releases), but which were linked to by newer projects.
3) Projects recovered from an archive.

I didn't care about orphans from cases 1 and 2, and the third case is
detected by the additional pass.

Also, the main problem that I had with them came from the fact that all
orphans are put into one big directory in the repository. It seems that
Subversion cannot deltify directories, so every time a file in one of
the orphans changes, it stores a complete list of the /orphaned folder.
I estimated that if I didn't interrupt svnadmin, the final repository
size would have amounted to more than 20GB. Without the orphans it's a
mere 1.6GB.

Alexander.

From agavrilov at tepkom.ru Mon Jun 2 19:14:10 2008
From: agavrilov at tepkom.ru (Alexander N. Gavrilov)
Date: Mon Jun 2 19:14:29 2008
Subject: More patches for vss2svn
In-Reply-To: <483DE96F.5060804@etjohnson.us>
References: <16823841.post@talk.nabble.com> <200805280123.31903.agavrilov@tepkom.ru> <483DE96F.5060804@etjohnson.us>
Message-ID: <200806030314.11593.agavrilov@tepkom.ru>

On Thursday 29 May 2008 03:23:27 Toby Johnson wrote:
> Thanks for all the hard work you've put into your patches! I wish I
> would have known ahead of time that you would be submitting so many, I
> could have given you commit access on your own branch to make managing
> and reviewing them easier. In fact I will still probably create a branch
> for these just so I can keep track while I commit them. I'll reply back
> once I've got them committed and hopefully others can help with
> reviewing/testing.

Hello,

Well, I didn't know that myself. I just tweaked one thing, then another,
and so on. =)

Actually, this conversion was some sort of a feasibility study. If the
big bosses agree to do an actual switch, I'll be converting everything
afresh, probably with a slightly different set of projects; besides,
there is a completely separate repository of the QA group, undoubtedly
with its own quirks. In that case I might have to do some more changes,
and committing them directly to a separate branch would indeed be more
convenient. However, I don't expect that to be very soon.

In the meantime, after some hacking on git-svnimport I also succeeded in
further converting the repository to Git, and during this weekend wrote
a script for bidirectional exchange of simple changes between Git and
VSS, using VSS's logging facility. It's still full of bugs, though...

Alexander.