When an import is restarted, the first new note commit must use
refs/notes/hg^0 as its parent. Because refs/notes/hg is only updated at
the end of a session, we cannot use it as the parent of every note
commit. Nor can we generate new marks for note commits, as that would
require a new mapping scheme from hg version numbers to git marks, and
a new mapping scheme would break existing incremental import setups.
We therefore restructure the code to write the notes at the end of an
import session, so that only the first note commit needs a
refs/notes/hg^0 reference.
Branch and tag names can now be renamed using a mechanism similar to the
-A option for author names.
-B specifies a mapping file for branch names, and -T a mapping file for
tags.
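
As a rough illustration of how such a mapping might be applied, the
sketch below looks up a branch or tag name in a dictionary and falls
back to the original name when there is no entry. The function name
`map_name` and the sample mapping are hypothetical and are not taken
from hg-fast-export.py.

```python
def map_name(name, mapping):
    """Return the renamed branch/tag, or the original name if unmapped."""
    return mapping.get(name, name)


# Hypothetical mapping, as it might be loaded from a -B mapping file.
branch_map = {"default": "master", "stable": "maint"}

print(map_name("default", branch_map))
print(map_name("feature-x", branch_map))
```

Names without an entry pass through unchanged, so a mapping file only
needs to list the names that should actually be renamed.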
Apparently a bug (http://bz.selenic.com/show_bug.cgi?id=3511) in
multiple released versions of Mercurial could produce commits where
files had absolute paths.
As a "healthy" repo should not contain any absolute paths, it should be
safe to always strip a leading '/' from the path and let the conversion
continue.
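
A minimal sketch of the stripping step, assuming paths are handled as
plain strings (the real code may work on bytes). The helper name
`strip_leading_slash` is hypothetical.

```python
def strip_leading_slash(path):
    """Remove a single leading '/' so an absolute path becomes relative."""
    if path.startswith("/"):
        return path[1:]
    return path


print(strip_leading_slash("/src/main.c"))
print(strip_leading_slash("src/main.c"))
```

Paths that are already relative are returned untouched, which is why
applying the step unconditionally should be harmless for a healthy repo.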
When a Mercurial repository does not use utf-8 for encoding author
strings and commit messages, the "-e <encoding>" command line option
can be used to force fast-export to convert incoming metadata from
<encoding> to utf-8.
When "-e <encoding>" is given, we use Python's string
decoding/encoding API to convert metadata on the fly when processing
commits.
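
The conversion can be sketched as a decode followed by an encode. This
is an illustration only: the helper name `to_utf8` is hypothetical, and
it assumes the metadata arrives as a byte string.

```python
def to_utf8(raw, encoding=None):
    """Re-encode raw metadata bytes from `encoding` to UTF-8.

    When no encoding is given, the bytes are returned unchanged.
    """
    if not encoding:
        return raw
    return raw.decode(encoding).encode("utf-8")


# An author name stored as Latin-1 (0xF6 is o-umlaut in that encoding).
latin1_author = b"J\xf6rg"
print(to_utf8(latin1_author, "latin-1"))
```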
If there is a tag with the same name as a branch, "git rev-parse <name>"
can give the hash of the tag instead of the branch. "git rev-parse
refs/heads/<name>" must be used to make sure we only find branches.
If you run the commands listed in usage
```bash
mkdir repo-git # or whatever
cd repo-git
git init
hg-fast-export.sh -r <repo>
```
you are not given a working directory to start working in. I was
caught off-guard by this when I ran `git status` and everything in the
repo was listed as deleted. A quick Google search indicates I'm not
the only one who was surprised.
If the --hg-hash argument is given, the converted commits are
annotated with the original hg hash as a git note in the "hg"
namespace.
The notes can be shown by git log using the "--notes=hg" argument.
According to the POSIX standard, egrep is an obsolescent equivalent
of grep -E. In fact, the patterns actually being used with egrep do
not require use of extended regular expressions at all, so a plain
'grep' can be used rather than 'grep -E'.
Replace egrep with grep to improve compatibility across systems.
In a merge commit, the first parent is always the same parent that
would be recorded if the commit were not a merge and the other
parent(s) record the commit(s) being merged in.
Preserving this order is important so that log --first-parent works
properly and also so that the merge history is not distorted by an
incorrect permutation of the DAG.
Remove the code that sorts the merge parents based on node id so
that the correct DAG order is preserved.
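
To illustrate why the order matters, the sketch below builds the parent
lines of a fast-import commit from a list of parent marks without
reordering them: the first parent becomes the `from` line and any
further parents become `merge` lines. The helper name `parent_lines`
and the integer marks are hypothetical.

```python
def parent_lines(parent_marks):
    """Build fast-import parent lines, keeping the given parent order.

    The first parent is emitted as 'from', the rest as 'merge'.
    """
    lines = []
    for index, mark in enumerate(parent_marks):
        keyword = "from" if index == 0 else "merge"
        lines.append("%s :%d" % (keyword, mark))
    return lines


# First parent is mark 7, the merged-in commit is mark 3.
print(parent_lines([7, 3]))
# Sorting first would swap the roles of the two parents.
print(parent_lines(sorted([7, 3])))
```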
The authors file format accepted by git-svnimport and git-cvsimport
actually allows blank lines and comment lines that start with '#'.
Ignore blank lines and lines whose first non-whitespace character is
'#' to be compatible with the authors file format accepted by the
referenced tools.
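
A minimal sketch of such a parser, assuming a simple `key=value` layout
per line. The function name `parse_authors` and the decision to skip
lines without an '=' are assumptions made for this example, not the
actual implementation.

```python
def parse_authors(lines):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    mapping = {}
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines whose first non-blank char is '#'.
        if not stripped or stripped.startswith("#"):
            continue
        if "=" not in stripped:
            continue  # assumption: ignore malformed lines
        key, value = stripped.split("=", 1)
        mapping[key.strip()] = value.strip()
    return mapping


sample = [
    "# committers",
    "",
    "   # indented comment",
    "jdoe=John Doe <jdoe@example.com>",
]
print(parse_authors(sample))
```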
Intercept -h/--help before git-sh-setup so the proper script name
can be shown instead of "hg fast-export.sh" which is wrong.
Reorder the long option descriptions to be in the same order as
the short usage since, as the help says, "argument order matters."
Add support for a new --hgtags option. When given, any .hgtags
files that may be present are exported.
Normally this is not desirable. However, when attempting to mimic
the actions of other hg exporters that always export any .hgtags
files this option can help produce matching export data.
If the file mode changes (for example from 100644 to 100755), but the
actual text of the file itself does not, then the change could be
missed since the hashes would remain the same.
If the hashes match, also compare the gitmode values before deciding
the file is unchanged.
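
The check can be sketched as follows. The function name
`file_unchanged` and its parameters are hypothetical; the point is only
that both the content hash and the git mode must match.

```python
def file_unchanged(old_hash, new_hash, old_mode, new_mode):
    """A file counts as unchanged only if hash and git mode both match."""
    return old_hash == new_hash and old_mode == new_mode


# Same content, but the executable bit was set: this is a change.
print(file_unchanged("abc123", "abc123", "100644", "100755"))
```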
Since hg runs on and supports older versions of Python,
hg-fast-export.py should too. Replace the dictionary comprehension with
equivalent code that supports versions of Python older than 2.7.
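
One common way to write such a replacement is to pass a generator of
key/value pairs to `dict()`. The data below is made up for
illustration and is not the code that was changed.

```python
pairs = [("a", 1), ("b", 2)]

# Requires Python 2.7 or newer:
#   doubled = {key: value * 2 for key, value in pairs}

# Equivalent form that also works on older Python versions:
doubled = dict((key, value * 2) for key, value in pairs)

print(doubled)
```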
Originally 9643aa5d did this by using a bashism even though the
/bin/sh interpreter is being used.
Then ea55929e attempted to compensate for this by disabling the
bashism when the interpreter was not actually bash, which means the
hg-fast-export.py exit code is still ignored in that case.
Instead, check the exit code without requiring a bashism.
hg-fast-export.sh always passes the --repo flag to hg-fast-export.py.
If, for some reason, we have a state file where the repo-url is an
empty string, the checks in hg-fast-export.py will not work and the
user will be confused. Therefore we check that the url is specified
before calling hg-fast-export.py.
Because on Windows sys.stdout is initially in text mode, any LF
characters written to it will be transformed to CRLF, which causes git
to blow up. This change uses Windows platform-specific code to change
sys.stdout to binary mode.
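
A minimal sketch of this kind of platform-specific switch, using
`msvcrt.setmode` from the standard library. The wrapper function
`set_stdout_binary` is hypothetical and is not necessarily how the
change itself is written.

```python
import os
import sys


def set_stdout_binary():
    """Put sys.stdout into binary mode on Windows; do nothing elsewhere.

    Returns True if the mode was changed, False otherwise.
    """
    if sys.platform != "win32":
        return False
    import msvcrt  # only available on Windows
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    return True


changed = set_stdout_binary()
```

Importing `msvcrt` inside the function keeps the module importable on
other platforms, where the call is simply a no-op.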
After an update to Mercurial 2.3 the module 'repo' was removed and the
program crashed when trying to convert a repository. I checked the
imports with 'pyflakes' and removed all unused ones; repo (among
others) was never used.
http://www.selenic.com/repo/hg/rev/1ac628cd7113#l9.1