Work-around for failing "git svn clone" (requiring full history)

Solution 1

I ran into this problem when I had identically-named subdirectories within branches or tags.

For example, I had tags candidates/1.0.0 and releases/1.0.0, and this caused the documented error because subdirectory 1.0.0 appears within both candidates and releases.

Per git-svn docs:

When using multiple --branches or --tags, git svn does not automatically handle name collisions (for example, if two branches from different paths have the same name, or if a branch and a tag have the same name). In these cases, use init to set up your Git repository then, before your first fetch, edit the $GIT_DIR/config file so that the branches and tags are associated with different name spaces.

So while the following command failed because of the identically named candidates and releases tags:

git svn clone --authors-file=../authors.txt --no-metadata \
    --trunk=/trunk --branches=/branches --tags=/candidates \
    --tags=/releases --tags=/tags -r 100:HEAD \
    --prefix=origin/ \
    svn://example.com:3692/my-repos/path/to/project/

the following sequence of commands did work:

git svn init --no-metadata \
    --trunk=/trunk --branches=/branches --tags=/tags \
    --prefix=origin/ \
    'svn://example.com:3692/my-repos/path/to/project/'

git config --add svn-remote.svn.tags \
    'path/to/project/candidates/*:refs/remotes/origin/tags/Candidates/*'

git config --add svn-remote.svn.tags \
    'path/to/project/releases/*:refs/remotes/origin/tags/Releases/*'

git svn fetch --authors-file=../authors.txt -r100:HEAD

Note that this only worked because there were no other conflicts within branches and tags. If there were, I would have had to resolve them similarly.
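
Before the first fetch, it can help to look for such collisions up front. The sketch below is not part of git svn: find_name_collisions is a hypothetical helper that reads container/name lines (for example candidates/1.0.0) on standard input and prints every name that appears under more than one container. The commented svn ls loop is one assumed way to produce that input, and its directory list is a placeholder.

```shell
# Hypothetical helper (not part of git svn): print each leaf name that
# appears under more than one container directory.
# Input: one "container/name" path per line on stdin.
find_name_collisions() {
    awk '
        {
            sub(/\/$/, "")                       # drop a trailing slash, as printed by "svn ls"
            n = split($0, parts, "/")
            if (n < 2) next                      # need at least container/name
            name = parts[n]
            container = substr($0, 1, length($0) - length(name) - 1)
            if (!((container, name) in seen)) {  # count each container once per name
                seen[container, name] = 1
                count[name]++
            }
        }
        END { for (name in count) if (count[name] > 1) print name }
    ' | sort
}

# Assumed usage against a live repository (URL and directories are placeholders):
#   url=svn://example.com:3692/my-repos/path/to/project
#   for d in branches candidates releases tags; do
#       svn ls "$url/$d" | sed "s|^|$d/|"
#   done | find_name_collisions
```

Every name it prints is a candidate for its own namespace in $GIT_DIR/config, in the same way as Candidates/ and Releases/ above.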

After successfully cloning the SVN repository, I ran the following steps to turn SVN tags into Git tags, turn trunk into master, turn the other references into branches, and relocate the remote paths:

# Make tags into true tags
cp -Rf .git/refs/remotes/origin/tags/* .git/refs/tags/
rm -Rf .git/refs/remotes/origin/tags

# Make other references into branches
cp -Rf .git/refs/remotes/origin/* .git/refs/heads/
rm -Rf .git/refs/remotes/origin
cp -Rf .git/refs/remotes/* .git/refs/heads/ # May be missing; that's okay
rm -Rf .git/refs/remotes

# Change 'trunk' to 'master'
git checkout trunk
git branch -d master
git branch -m trunk master

Solution 2

Not a full answer, but perhaps the piece you are missing (I am interested in migrating as well, which is how I found this part of the puzzle).

When you look at the documentation of git-svn, you will find the following option:

--no-minimize-url 

When tracking multiple directories (using --stdlayout, --branches, or --tags options), git svn will attempt to connect to the root (or highest allowed level) of the Subversion repository. This default allows better tracking of history if entire projects are moved within a repository, but may cause issues on repositories where read access restrictions are in place. Passing --no-minimize-url will allow git svn to accept URLs as-is without attempting to connect to a higher level directory. This option is off by default when only one URL/branch is tracked (it would do little good).

This fits your situation: with this option, git svn does not try to read a higher level of the directory tree (which is blocked for you).

At least you could give it a try ...
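
To see whether this is your situation, you can check how far up the URL your account can read. The sketch below is a diagnostic only, under the assumption that a failing svn ls on a URL means git svn cannot read that level either. highest_readable_url is a hypothetical helper name.

```shell
# Hypothetical diagnostic: walk up from a URL and print the highest ancestor
# that "svn ls" can still read. Stops at the first unreadable level and
# prints an empty line if even the starting URL is unreadable.
highest_readable_url() (
    url=${1%/}
    best=
    while svn ls "$url" >/dev/null 2>&1 </dev/null; do
        best=$url
        parent=${url%/*}
        case $parent in
            *://?*) url=$parent ;;  # still has a host part, keep climbing
            *) break ;;             # reached the scheme, nothing above
        esac
    done
    printf '%s\n' "$best"
)

# Assumed usage with the URL from the question:
#   highest_readable_url http://svnserver/svn/src/branches/x/y/apps/module
```

If the printed URL is below the repository root, read restrictions apply somewhere above it, which is the case --no-minimize-url is described for.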

Solution 3

I recently migrated a long list of SVN repositories into Git and towards the end ran into this problem. Our SVN structure was pretty sloppy, so I had to use --no-minimize-url quite a bit. Typically, I'd run a command like:

$ git svn clone http://[url]/svn/[repo]/[path-to-code] \
            -s --no-minimize-url \
            -A authors.txt

The last few migrations I ran had a space in the URL. I don't know if it was the space or something else, but I was getting the same error you were seeing. I didn't want to get into modifying config files if I didn't have to, and luckily I found a solution: I dropped the -s --no-minimize-url options in favor of declaring the paths explicitly.

$ git svn clone http://[url]/svn/[repo]/ \
            --trunk="/[path-to-code]/trunk" \
            --branches="/[path-to-code]/branches" \
            --tags="/[path-to-code]/tags" \
            -A authors.txt \
            --follow-parent
  • Note that I added --follow-parent from your example, but I'm also not sure that it made any difference.
  • Remember that these repos had spaces in them, hence the "" around the trunk/branches/tags paths.

Solution 4

[I realize this should be a comment on Jeff Fairley's answer but I don't have the reputation to post it as such. Since the original poster did ask for confirmation the approach worked I'm providing it as an answer.]

I can confirm that his solution works for the problem he (and I) ran into caused by spaces in the path. I had the same requirements (clone a single module from an SVN repo with history) except that I had no branches or tags to worry about whatsoever.

I tried several permutations of providing the full path to the module in the URL (e.g. using --no-minimize-url, specifying --trunk or --stdlayout) with no success. For me the result was usually a git repo with a full history log but no files whatsoever. This may or may not be the same problem FooF encountered (no read access in SVN), but it was certainly caused by having a space in the path to my module.

Trying again with only the SVN repo base as the URL and the path to my module in --trunk worked flawlessly. Afterwards my .git/config looks like this:

[core]
        repositoryformatversion = 0
        filemode = false
        bare = false
        logallrefupdates = true
        symlinks = false
        ignorecase = true
        hideDotFiles = dotGitOnly
[svn-remote "svn"]
        url = https://[url]/svn/[repo]
        fetch = trunk/[path-to-code]:refs/remotes/trunk
[svn]
        authorsfile = ~/working/authors-transform.txt

and subsequent git and git svn commands are throwing no errors at all. Thanks Jeff!

Solution 5

[This is the original poster writing. The text below used to be an update to the question, but as it solved the case, albeit unsatisfactorily to my taste, I am posting it as an answer for lack of a better solution.]

I do not like this, but I ended up splitting the clone into init and fetch, with some editing of .git/config in between (repopath=apps/module, gitreponame=module):

$ git svn init --username=mysvnusername \
            --branches=/src/branches/ \
            --trunk=/src/trunk/${repopath} \
            --tags=/src/tags/ \
            http://svnserver/svn/src ${gitreponame}
$ cd ${gitreponame}
$ sed -i.bak "s|*:|*/${repopath}:|" .git/config
$ git svn fetch --authors-file=../authors.txt --follow-parent

I could not find a way to specify the branches for a subdirectory migration with git svn, hence the editing of the .git/config file. The following unified diff illustrates the effect of the sed edit:

 [svn-remote "svn"]
        url = http://svnserver/svn/src
        fetch = trunk/apps/module:refs/remotes/trunk
-       branches = branches/*:refs/remotes/*
-       tags = tags/*:refs/remotes/tags/*
+       branches = branches/*/apps/module:refs/remotes/*
+       tags = tags/*/apps/module:refs/remotes/tags/*
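
Since sed -i rewrites .git/config in place, it can be worth previewing the substitution first. The sketch below applies the same pattern to a file and prints the result to standard output without modifying anything. rewrite_svn_globs is a hypothetical helper name, and the asterisk is escaped only to make explicit that it is matched literally.

```shell
# Hypothetical helper: print <config> with "/<repopath>" inserted after every
# "*" that is directly followed by ":" (the branches and tags globs).
# The file itself is not modified.
# Assumes <repopath> contains no "|", "&" or backslash characters.
rewrite_svn_globs() {
    repopath=$1
    config=$2
    sed "s|\*:|*/${repopath}:|" "$config"
}

# Preview the change as a diff against the current file:
#   rewrite_svn_globs apps/module .git/config | diff -u .git/config -
```

The fetch line is left alone because it contains no asterisk followed by a colon.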

As the actual desired HEAD was at another URL, I ended up just adding another [svn-remote] section to .git/config:

+ [svn-remote "svn-newest"]
+       url = http://svnserver/svn/src
+       fetch = branches/x/y/apps/module:refs/remotes/trunk
+       branches = branches/*/apps/module:refs/remotes/*
+       tags = tags/*/apps/module:refs/remotes/tags/*

(in the real-life experiment I also added here some branches that were not picked up by the first fetch), and fetching again:

$ git svn fetch --authors-file=../authors.txt --follow-parent svn-newest

This way I ended up with the full Subversion history migrated to the newly generated git repository.

Note-1: I probably could have just declared my "trunk" to be branches/x/y/apps/module, as "trunk" for git-svn seems to mean basically the same as git HEAD (the Subversion concepts of trunk, branches, and tags have no deep technical basis; they are a matter of socially agreed convention).

Note-2: --follow-parent is probably not required for git svn fetch, but I have no way of checking or experimenting now.

Note-3: When I read about svn2git earlier (it seems to be a wrapper over git-svn), I failed to see the motivation, but having seen the messy presentation of tags I kind of get it now. I would try svn2git if I had to do this again.

P.S. This is a rather awkward way of doing the operation. The secondary problem here (why external editing of .git/config was required) seems to be that

  1. Subversion branches do not have any essential technical meaning (branches and tags in Subversion are just socially agreed labels for a versioned file system copy, together with a "standard" or otherwise socially agreed convention for where the copies are made; trunk also has no technical meaning), and
  2. the git svn implementation assumes that the social Subversion conventions are followed to a degree (which is not possible if you just want to migrate a subdirectory and not the whole Subversion repository).

TODO: It would be helpful to have the format of the .git/config file explained here as it relates to git svn. For example, now (a year and a half after writing the original answer) I have no idea what the [svn-remote "svn-newest"] above means. The approach could also be automated with a script, but this is beyond my current interest in the problem, and I no longer have access to the original Subversion repository or a way to reproduce the issue.
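
On the TODO: as far as the git-svn documentation describes it, each [svn-remote "<name>"] section defines one named Subversion remote. url is the repository URL, fetch maps a single Subversion path to a Git ref, and branches and tags map glob patterns to refs. The name is what git svn fetch accepts as its optional argument. The sketch below writes the svn-newest section from this answer with git config instead of a text editor. add_svn_remote is a hypothetical helper name, and the values are the ones shown above.

```shell
# Hypothetical helper: create a [svn-remote "<name>"] section with git config.
# Usage: add_svn_remote <name> <url> <fetch-refspec> <branches-refspec> <tags-refspec>
# Run it inside the repository created by "git svn init".
add_svn_remote() {
    git config "svn-remote.$1.url" "$2"
    git config "svn-remote.$1.fetch" "$3"
    git config "svn-remote.$1.branches" "$4"
    git config "svn-remote.$1.tags" "$5"
}

# Values taken from the svn-newest section in this answer:
#   add_svn_remote svn-newest http://svnserver/svn/src \
#       'branches/x/y/apps/module:refs/remotes/trunk' \
#       'branches/*/apps/module:refs/remotes/*' \
#       'tags/*/apps/module:refs/remotes/tags/*'
#   git svn fetch --authors-file=../authors.txt svn-newest
```

Going through git config avoids depending on the exact text layout of the file, which is what made the sed step fragile.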

FooF
Author by

FooF

Updated on March 21, 2020

Comments

  • FooF
    FooF about 4 years

    I want to convert a Subversion repository sub-directory (denoted by module here) into a git repository with full history. There are many svn copy operations (Subversion people call them branches) in the history of my Subversion repository. The release policy has been that after each release, or whenever another branch is created, the old URL is left unused and the new URL replaces it as the place where work continues.

    Optimally, by my reading, it seems like this should do the trick:

    $ git svn clone --username=mysvnusername --authors-file=authors.txt \
        --follow-parent \
        http://svnserver/svn/src/branches/x/y/apps/module module
    

    (where branches/x/y/ depicts the newest branch). But I got an error, which looks something like this:

    W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: '/svn/src/!svn/bc/100/branches/x/y/apps/module' path not found
    W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
    

    (Update: Adding option --no-minimize-url to the above does not remove the error message.)

    The directory module gets created and populated, but the Subversion history before the newest svn copy commit is not imported (the resulting git repository ends up with just two commits when I expected hundreds).

    The question is: how can I export the full Subversion history in this situation?

    Possible Cause

    1. Searching for the error message, I found this: git-svn anonymous checkout fails with -s which linked to this Subversion issue: http://subversion.tigris.org/issues/show_bug.cgi?id=3242

      As I understand it, something changed in Subversion 1.5 about how the client accesses the repository. With newer Subversion, if there is no read access to some parent directory of the URL path (true for me: svn ls http://svnserver/svn fails with 403 Forbidden), then some Subversion operations fail.

    2. Jeff Fairley in his answer points out that spaces in the Subversion URL might also cause this error message (confirmed by user Owen). Have a look at his solution to see how he solved the case if your git svn clone is failing for the same reason.

    3. Dejay Clayton in his answer reveals that if the deepest subdirectory components in branch and tag SVN URLs are identically named (e.g. .../tags/release/1.0.0 and .../branches/release-candidates/1.0.0), then this error could occur.