Gah. New rule for Subversion repository maintenance – run “svnadmin verify” as often as you run your off-site backup process, and arguably don’t do the backup if it fails. One of our support repositories (as opposed to a development repository which I’m a bit more paranoid about) has had a dodgy revision in for a few months now which would have bitten us had we had to restore from scratch. It looks like it was a failure while checking in a massive binary file – it doesn’t affect the day to day running of the repository, but it means that we can’t dump or load (and correspondingly can’t effectively restore from backups).
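As a rough sketch of that rule, the wrapper below only runs the backup command if the verify passes. The function name verify_then_backup, the SVNADMIN override and the rsync destination in the usage comment are all made up for illustration; swap in whatever your backup process actually does.

```shell
#!/bin/sh
# Hypothetical wrapper: only run the backup when the repository verifies cleanly.
# SVNADMIN can be overridden (e.g. to a full path); it defaults to "svnadmin".
SVNADMIN="${SVNADMIN:-svnadmin}"

verify_then_backup() {
    repo="$1"
    shift
    # --quiet suppresses the per-revision progress lines; errors are still reported
    if "$SVNADMIN" verify --quiet "$repo"; then
        # Everything after the repository path is the backup command to run
        "$@"
    else
        echo "svnadmin verify failed for $repo; skipping backup" >&2
        return 1
    fi
}

# Example (placeholder paths and host):
#   verify_then_backup /path/to/repo rsync -a /path/to/repo/ backuphost:/backups/repo/
```

Whether you really want to skip the backup on failure is debatable, since a backup of a partly broken repository may still be better than none; at the very least make sure the failure gets noticed.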
Since rebuilding the repository revision by revision is always a massive pain, I’ve done some mucking around in the guts of the repository to get around the problem. And since my Googling of the issue has been less than helpful, I thought I’d post here to give a reference for anyone else with a similar issue.
Running svnadmin verify on the repository results in a “Checksum mismatch while reading representation” error. The output is misleading, because the line before the error message will say something like “* Verified revision 23”. That is the last good revision, so it is in fact revision 24 which is bad. You will also find that if you try to dump the repository, it will successfully dump revisions 0 through 23, but then fail on 24. If you try to dump revisions 0:23 and then 25:HEAD like I did, you’ll probably find that the 25:HEAD dump doesn’t work either.
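If you'd rather not eyeball the output, here is a small sketch that reads the verify output and prints the first revision that did not verify. The function name first_bad_revision is my own, and it assumes the progress lines look like the “* Verified revision 23” example above (with or without a trailing full stop); other svnadmin versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: read "svnadmin verify" output on stdin and print the
# number of the revision that follows the last successfully verified one.
first_bad_revision() {
    # Keep only the revision numbers from the "* Verified revision N" lines
    last=$(sed -n 's/^\* Verified revision \([0-9][0-9]*\)\.*$/\1/p' | tail -n 1)
    # No verified revisions seen at all: nothing sensible to report
    [ -n "$last" ] || return 1
    echo $((last + 1))
}

# Example (the progress lines may be written to stderr, hence the 2>&1):
#   svnadmin verify main 2>&1 | first_bad_revision
```

Note that it prints a number even when the whole repository verified, so only trust it when verify actually reported an error.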
One (or more) of the changes to files in the revision that is causing problems has a different checksum than the one that the revision file recorded at the time. So when svnadmin verify looks over the contents of the revision and recalculates the checksum it finds that they don’t match and tells you. This means one of two things: 1) the checksum recorded at the time was wrong, and the data in the revision/file is valid, or 2) the data in the revision/file is corrupt, and the checksum at the time was correct.
If the file generating the bad checksum is a text file, you might be able to look at the contents of the revision file and check if it’s noticeably corrupt. If the file is binary as mine was, that’s probably not an option. Even more so if the file is large (mine was several hundred MB).
Possibility 2) seems more likely to me, so chances are the file in question is corrupt and you need to fix the data. But if possibility 1) is the case, then all you need to do is fix the checksum. Either way you probably can’t tell at this point – so best to assume the data is gone and work from there, or at least treat it as suspicious and verify it against other sources for the data if possible.
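If you do have another copy of the file (an old working copy, an installer archive, whatever), a quick way to check it against the two checksums that svnadmin verify prints is sketched below. This assumes the checksum is a plain MD5 of the file contents as stored in the repository, which should hold for a binary file with no keyword or end-of-line translation, and that md5sum from GNU coreutils is available. The function name compare_copy is made up.

```shell
#!/bin/sh
# Hypothetical helper: compare the MD5 of a candidate copy of the file with
# the "expected" (recorded) and "actual" (recomputed) checksums from verify.
compare_copy() {
    file="$1"
    expected="$2"
    actual="$3"
    # md5sum prints "<digest>  <filename>"; keep only the digest
    sum=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$sum" = "$expected" ]; then
        echo "copy matches the recorded checksum: repository data looks corrupt"
    elif [ "$sum" = "$actual" ]; then
        echo "copy matches the repository data: recorded checksum looks wrong"
    else
        echo "copy matches neither checksum"
    fi
}

# Example (placeholder path and checksums):
#   compare_copy /some/other/copy/of/file.exe <expected> <actual>
```

A match against the recorded checksum is the useful result, because it means you have a known good copy to commit again once the repository is verifiable.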
If you’re happy to assume that the file is corrupt, then you can get your repo back to a verifiable state by changing the checksum saved in the revision file to match the checksum which will be generated from the data as it is now. The data won’t change so you’ll still have to verify it manually or delete it later, but at least you can persuade the repository that you don’t care.
I’m assuming here you’re working directly with the server on Linux. I use Debian, so tools like grep and hexedit are usually available (although I had to install hexedit). The same principles would apply on Windows, but the tools would have to change.
1) Identify the revision which is corrupt. This is straightforward – it’s the revision after the last successfully verified revision (24 in the example above).
2) Identify the file in the revision which has the bad checksum, and find the bad checksum in the revision. This is harder – the revision files (stored in /repository/db/revs) are binary, and in my case, huge. But grep is your friend here. svnadmin verify gives you the checksum that is currently recorded – this is stored in the revision file, right next to a description of the file. Here’s a grep command that searches the particular revision file for the checksum we’ve been given:
grep -e "79a1686d0dfb8618b8ccfc9eb7d74759" -A 3 -B 3 -b -a main/db/revs/24
384989609-id: 5cu.0.r24/384989609
384989633-type: file
384989644-count: 0
384989653:text: 24 75689685 293851064 294285337 79a1686d0dfb8618b8ccfc9eb7d74759
384989724-props: 24 384989543 53 0 113136892f2137aa0116093a524ade0b
384989782-cpath: /path/to/the/bad/file.exe
384989842-copyroot: 0 /
The number at the start of each line is the byte offset of that line within the revision file; we’ll use that soon. (grep marks the matching line with a colon after the offset and the surrounding context lines with a hyphen, which is why the match shows up as “:text:”.) The cpath line is the most interesting – this is the file you can expect to be corrupt. But it’s the text: line that we need to change to get things working. According to the description of the revision file format in Subversion’s FSFS documentation, this line is of the form “<rev> <offset> <length> <size> <digest>”. We don’t want to change the first 4 parameters – they’re most likely just fine. But the 5th parameter is the bad checksum, and we’ll need that in the next step.
3) Change the bad checksum to match the “actual” checksum which the svnadmin verify process is coming up with. Again, this is printed out when you run the verify. To make the change, I used hexedit, which thankfully doesn’t try to load the entire (huge) revision file into memory. You just fire it up, and press Return to enter the offset within the file to jump to. It wants it in hex, so a quick conversion turns 384989653 into 16F279D5. From there you can press Tab to switch to ASCII editing, quickly find the offending checksum and overwrite it with the new, valid checksum; then press Ctrl-X to save out the file and exit.
4) Re-run svnadmin verify. It should now successfully verify the broken revision and move on. If it doesn’t, check to see if the revision and checksum it’s failing on are the same as before – if they’re not, then you have more broken files/revisions, and you should repeat steps 1 to 3 until they’re all gone. Hopefully there won’t be too many of them. And remember – just because your repository is now verifiable, doesn’t mean that your data is valid. All you’ve done is tell the svnadmin tool that the checksum for the data you have is the same as the checksum it expects.
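For anyone who would rather script steps 2 and 3 than drive hexedit by hand, here is a rough sketch. To be clear, this is not what I did above: it swaps hexedit for dd, and the function names checksum_offsets and patch_checksum are made up. It relies on GNU grep, where -b combined with -o reports the byte offset of the match itself rather than of the line. Try it on a copy of the revision file first.

```shell
#!/bin/sh
# Sketch only: work on a copy of the revision file, never the live one.

# Print "decimal 0xHEX" for the byte offset of every occurrence of a checksum.
checksum_offsets() {
    revfile="$1"
    checksum="$2"
    # -a: treat the binary file as text; -F: fixed string, not a regex
    grep -a -b -o -F -e "$checksum" "$revfile" |
    while IFS=: read -r offset rest; do
        printf '%d 0x%X\n' "$offset" "$offset"
    done
}

# Overwrite the recorded checksum in place with the one verify computed.
patch_checksum() {
    revfile="$1"
    old="$2"
    new="$3"
    # Same length is essential, otherwise every later offset in the file shifts
    [ "${#old}" -eq "${#new}" ] || { echo "checksum lengths differ" >&2; return 1; }
    offsets=$(checksum_offsets "$revfile" "$old")
    # Refuse to guess if the checksum is missing or appears more than once
    count=$(printf '%s\n' "$offsets" | grep -c . || true)
    [ "$count" -eq 1 ] || { echo "expected 1 match, found $count" >&2; return 1; }
    offset=${offsets%% *}
    # conv=notrunc keeps the rest of the file; bs=1 makes seek count in bytes
    printf '%s' "$new" |
        dd of="$revfile" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Example (placeholders for the two checksums printed by svnadmin verify):
#   cp main/db/revs/24 /tmp/24.copy
#   checksum_offsets /tmp/24.copy <recorded-checksum>
#   patch_checksum /tmp/24.copy <recorded-checksum> <actual-checksum>
```

The hex value printed by checksum_offsets is the offset of the checksum itself, so you can also feed it straight to hexedit and skip hunting along the text: line.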