10 Commits

Author SHA1 Message Date
Frej Drejhammar
1181a0af47 Allow name sanitizer to be disabled with --no-auto-sanitize
Make it possible to completely disable the name sanitizer with the
--no-auto-sanitize flag. Previously the sanitizer was also run on
user-remapped names. As the sanitizer rewrites perfectly legal git
names (such as __.*), this is probably not what the user wants.

Closes #155.
2019-09-13 14:56:32 +02:00
Frej Drejhammar
7ab47e002f Merge branch 'jpaugh-patch-1'
Closes #164
2019-09-12 20:14:42 +02:00
Jonathan Paugh
96762f5474 README: Fix broken links
Use "footnote" style links to prevent future issues whenever the text is formatted to a specific length.
2019-09-11 16:46:55 -05:00
Frej Drejhammar
fcdc91634a Merge branch 'be-non-pep349-tolerant'
Closes: #143
Closes: #160
2019-09-01 18:31:46 +02:00
Frej Drejhammar
f57fba000b Try to do the right thing on non PEP394 compliant systems
PEP 394 [1] tells us that on systems with both Python 2 and Python 3
installed, the Python 2 interpreter should be installed as python2.

Unfortunately, not all distributions adhere to PEP 394 (I'm looking at
you, Windows), so to handle that we first try to find a 'python2', then
fall back on plain 'python'. In order not to silently pick a Python 3
by mistake, we check sys.version_info using the interpreter we
found.

[1] https://www.python.org/dev/peps/pep-0394/
2019-09-01 18:31:18 +02:00
Frej Drejhammar
b25cbd6753 Merge branch 'pr/157-v3'
Closes #156
2019-08-18 11:57:53 +02:00
MokhamedDakhraui
581b1b3d17 Remove git submodules if .hgsubstate file was removed or emptied
2019-08-18 05:46:46 +03:00
MokhamedDakhraui
7df01ac323 Refactor refresh_gitmodules()
Use the change context substate field instead of manually parsing the `.hgsubstate` file.
2019-08-16 02:42:03 +03:00
MokhamedDakhraui
914f5a0dbe Replaced several lambdas by one loop
2019-08-16 02:41:54 +03:00
MokhamedDakhraui
8779cb5e95 Extract operations with submodules to separated methods
2019-08-16 02:40:44 +03:00
3 changed files with 99 additions and 52 deletions

@@ -4,26 +4,28 @@ hg-fast-export.(sh|py) - mercurial to git converter using git-fast-import
 Legal
 -----
-Most hg-* scripts are licensed under the [MIT license]
-(http://www.opensource.org/licenses/mit-license.php) and were written
+Most hg-* scripts are licensed under the [MIT license] and were written
 by Rocco Rutte <pdmef@gmx.net> with hints and help from the git list and
 \#mercurial on freenode. hg-reset.py is licensed under GPLv2 since it
 copies some code from the mercurial sources.
 The current maintainer is Frej Drejhammar <frej.drejhammar@gmail.com>.
+[MIT license]: http://www.opensource.org/licenses/mit-license.php
 Support
 -------
 If you have problems with hg-fast-export or have found a bug, please
-create an issue at the [github issue tracker]
-(https://github.com/frej/fast-export/issues). Before creating a new
+create an issue at the [github issue tracker]. Before creating a new
 issue, check that your problem has not already been addressed in an
 already closed issue. Do not contact the maintainer directly unless
 you want to report a security bug. That way the next person having the
 same problem can benefit from the time spent solving the problem the
 first time.
+[github issue tracker]: https://github.com/frej/fast-export/issues
 System Requirements
 -------------------
@@ -99,6 +101,12 @@ name the -B and -T options allow a mapping file to be specified to
 rename branches and tags (respectively). The syntax of the mapping
 file is the same as for the author mapping.
+When the -B and -T flags are used, you will probably want to use the
+-n flag to disable the built-in (broken in many cases) sanitizing of
+branch/tag names. In the future -n will become the default, but in
+order to not break existing incremental conversions, the default
+remains with the old behavior.
 Content filtering
 -----------------

@@ -31,6 +31,10 @@ cfg_export_boundary=1000
 subrepo_cache={}
 submodule_mappings=None
+# True if fast export should automatically try to sanitize
+# author/branch/tag names.
+auto_sanitize = None
 def gitmode(flags):
   return 'l' in flags and '120000' or 'x' in flags and '100755' or '100644'
@@ -127,52 +131,55 @@ def get_author(logmessage,committer,authors):
       return r
   return committer
+def remove_gitmodules(ctx):
+  """Removes all submodules of ctx parents"""
+  # Removing all submodules coming from all parents is safe, as the submodules
+  # of the current commit will be re-added below. A possible optimization would
+  # be to only remove the submodules of the first parent.
+  for parent_ctx in ctx.parents():
+    for submodule in parent_ctx.substate.keys():
+      wr('D %s' % submodule)
+  wr('D .gitmodules')
+def refresh_gitmodules(ctx):
+  """Updates list of ctx submodules according to .hgsubstate file"""
+  remove_gitmodules(ctx)
+  gitmodules=""
+  # Create the .gitmodules file and all submodules
+  for name,subrepo_info in ctx.substate.items():
+    gitRepoLocation=submodule_mappings[name] + "/.git"
+    # Populate the cache to map mercurial revision to git revision
+    if not name in subrepo_cache:
+      subrepo_cache[name]=(load_cache(gitRepoLocation+"/hg2git-mapping"),
+                           load_cache(gitRepoLocation+"/hg2git-marks",
+                                      lambda s: int(s)-1))
+    (mapping_cache,marks_cache)=subrepo_cache[name]
+    subrepo_hash=subrepo_info[1]
+    if subrepo_hash in mapping_cache:
+      revnum=mapping_cache[subrepo_hash]
+      gitSha=marks_cache[int(revnum)]
+      wr('M 160000 %s %s' % (gitSha,name))
+      sys.stderr.write("Adding/updating submodule %s, revision %s->%s\n"
+                       % (name,subrepo_hash,gitSha))
+      gitmodules+='[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name,name,
+        submodule_mappings[name])
+    else:
+      sys.stderr.write("Warning: Could not find hg revision %s for %s in git %s\n" %
+        (subrepo_hash,name,gitRepoLocation))
+  if len(gitmodules):
+    wr('M 100644 inline .gitmodules')
+    wr('data %d' % (len(gitmodules)+1))
+    wr(gitmodules)
 def export_file_contents(ctx,manifest,files,hgtags,encoding='',plugins={}):
   count=0
   max=len(files)
   for file in files:
-    if submodule_mappings and ctx.substate and file==".hgsubstate":
-      # Remove all submodules as we don't detect deleted submodules properly
-      # in any other way. We will add the ones not deleted back again below.
-      for module in submodule_mappings.keys():
-        wr('D %s' % module)
-      # Read .hgsubstate file in order to find the revision of each subrepo
-      data=ctx.filectx(file).data()
-      subHashes={}
-      for line in data.split('\n'):
-        if line.strip()=="":
-          continue
-        cols=line.split(' ')
-        subHashes[cols[1]]=cols[0]
-      gitmodules=""
-      # Create the .gitmodules file and all submodules
-      for name in ctx.substate:
-        gitRepoLocation=submodule_mappings[name] + "/.git"
-        # Populate the cache to map mercurial revision to git revision
-        if not name in subrepo_cache:
-          subrepo_cache[name]=(load_cache(gitRepoLocation+"/hg2git-mapping"),
-                               load_cache(gitRepoLocation+"/hg2git-marks",
-                                          lambda s: int(s)-1))
-        (mapping_cache, marks_cache)=subrepo_cache[name]
-        if subHashes[name] in mapping_cache:
-          revnum=mapping_cache[subHashes[name]]
-          gitSha=marks_cache[int(revnum)]
-          wr('M 160000 %s %s' % (gitSha, name))
-          sys.stderr.write("Adding submodule %s, revision %s->%s\n"
-                           % (name,subHashes[name],gitSha))
-          gitmodules+='[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name, name, submodule_mappings[name])
-        else:
-          sys.stderr.write("Warning: Could not find hg revision %s for %s in git %s\n" % (subHashes[name],name,gitRepoLocation))
-      if len(gitmodules):
-        wr('M 100644 inline .gitmodules')
-        wr('data %d' % (len(gitmodules)+1))
-        wr(gitmodules)
+    if submodule_mappings and file==".hgsubstate":
+      refresh_gitmodules(ctx)
     # Skip .hgtags files. They only get us in trouble.
     if not hgtags and file == ".hgtags":
       sys.stderr.write('Skip %s\n' % (file))
@@ -223,6 +230,8 @@ def sanitize_name(name,what="branch", mapping={}):
     if name[0] == '.': return '_'+name[1:]
     return name
+  if not auto_sanitize:
+    return mapping.get(name,name)
   n=mapping.get(name,name)
   p=re.compile('([[ ~^:?\\\\*]|\.\.)')
   n=p.sub('_', n)
@@ -307,12 +316,14 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,
   sys.stderr.write('%s: Exporting %s revision %d/%d with %d/%d/%d added/changed/removed files\n' %
                    (branch,type,revision+1,max,len(added),len(changed),len(removed)))
-  if fn_encoding:
-    removed=[r.decode(fn_encoding).encode('utf8') for r in removed]
+  for filename in removed:
+    if fn_encoding:
+      filename=filename.decode(fn_encoding).encode('utf8')
+    filename=strip_leading_slash(filename)
+    if filename=='.hgsubstate':
+      remove_gitmodules(ctx)
+    wr('D %s' % filename)
-  removed=[strip_leading_slash(x) for x in removed]
-  map(lambda r: wr('D %s' % r),removed)
   export_file_contents(ctx,man,added,hgtags,fn_encoding,plugins)
   export_file_contents(ctx,man,changed,hgtags,fn_encoding,plugins)
   wr()
@@ -527,6 +538,9 @@ if __name__=='__main__':
   parser=OptionParser()
+  parser.add_option("-n", "--no-auto-sanitize",action="store_false",
+      dest="auto_sanitize",default=True,
+      help="Do not perform built-in (broken in many cases) sanitizing of names")
   parser.add_option("-m","--max",type="int",dest="max",
       help="Maximum hg revision to import")
   parser.add_option("--mapping",dest="mappingfile",
@@ -575,6 +589,7 @@ if __name__=='__main__':
   (options,args)=parser.parse_args()
   m=-1
+  auto_sanitize = options.auto_sanitize
   if options.max!=None: m=options.max
   if options.marksfile==None: bail(parser,'--marks')

@@ -26,7 +26,29 @@ SFX_MARKS="marks"
 SFX_HEADS="heads"
 SFX_STATE="state"
 GFI_OPTS=""
-PYTHON=${PYTHON:-python2}
+if [ -z "${PYTHON}" ]; then
+    # $PYTHON is not set, so we try to find a working python 2.7 to
+    # use. PEP 394 tells us to use 'python2', otherwise try plain
+    # 'python'.
+    if command -v python2 > /dev/null; then
+        PYTHON="python2"
+    elif command -v python > /dev/null; then
+        PYTHON="python"
+    else
+        echo "Could not find any python interpreter, please use the 'PYTHON'" \
+            "environment variable to specify the interpreter to use."
+        exit 1
+    fi
+fi
+# Check that the python specified by the user or autodetected above is
+# >= 2.7 and < 3.
+if ! ${PYTHON} -c 'import sys; v=sys.version_info; exit(0 if v.major == 2 and v.minor >= 7 else 1)' > /dev/null 2>&1 ; then
+    echo "${PYTHON} is not a working python 2.7 interpreter, please use the" \
+        "'PYTHON' environment variable to specify the interpreter to use."
+    exit 1
+fi
 USAGE="[--quiet] [-r <repo>] [--force] [-m <max>] [-s] [--hgtags] [-A <file>] [-B <file>] [-T <file>] [-M <name>] [-o <name>] [--hg-hash] [-e <encoding>]"
 LONG_USAGE="Import hg repository <repo> up to either tip or <max>
@@ -48,6 +70,8 @@ Options:
 -B <file> Read branch map from file
 -T <file> Read tags map from file
 -M <name> Set the default branch name (defaults to 'master')
+-n Do not perform built-in (broken in many cases) sanitizing
+   of branch/tag names.
 -o <name> Use <name> as branch namespace to track upstream (eg 'origin')
 --hg-hash Annotate commits with the hg hash as git notes in the
    hg namespace.
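
The interpreter lookup added to the shell script follows a simple order: honor `$PYTHON` if set, otherwise try `python2`, then `python`, and finally verify that whatever was chosen really is a Python 2.7. The same logic can be sketched in Python 3 for illustration. The function names `pick_candidate` and `is_python27` are hypothetical, and the injectable `which` parameter is an assumption added so the selection order can be exercised without depending on what is installed locally.

```python
import shutil
import subprocess

# Exits with status 0 only on a 2.x interpreter with x >= 7, mirroring the
# version check in the shell script.
CHECK = ('import sys; v=sys.version_info; '
         'sys.exit(0 if v[0] == 2 and v[1] >= 7 else 1)')


def pick_candidate(env_python, which=shutil.which):
    """Return the interpreter name to try, or None if nothing is found."""
    if env_python:
        # An explicit $PYTHON always wins; it is still verified afterwards.
        return env_python
    for name in ('python2', 'python'):
        if which(name):
            return name
    return None


def is_python27(interpreter):
    """Return True if `interpreter` runs and reports a 2.7+ (but < 3) version."""
    try:
        status = subprocess.call([interpreter, '-c', CHECK],
                                 stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL)
    except OSError:
        # The interpreter does not exist or cannot be executed.
        return False
    return status == 0


if __name__ == '__main__':
    import os
    candidate = pick_candidate(os.environ.get('PYTHON'))
    if candidate is None or not is_python27(candidate):
        raise SystemExit("no working python 2.7 interpreter found; "
                         "set the 'PYTHON' environment variable")
    print(candidate)
```

Running the version check through the candidate interpreter itself, rather than trusting its name, is what guards against a plain `python` that turns out to be Python 3.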