10 Commits

Author SHA1 Message Date
Frej Drejhammar
1181a0af47 Allow name sanitizer to be disabled with --no-auto-sanitize
Make it possible to completely disable the name sanitizer by the
--no-auto-sanitize flag. Previously the sanitizer was run on user
remapped names. As the sanitizer rewrites perfectly legal git
names (such as __.*) this is probably not what the user wants.

Closes #155.
2019-09-13 14:56:32 +02:00
Frej Drejhammar
7ab47e002f Merge branch 'jpaugh-patch-1'
Closes #164
2019-09-12 20:14:42 +02:00
Jonathan Paugh
96762f5474 README: Fix broken links
Use "footnote" style links to prevent future issues whenever the text is formatted to a specific length.
2019-09-11 16:46:55 -05:00
Frej Drejhammar
fcdc91634a Merge branch 'be-non-pep349-tolerant'
Closes: #143
Closes: #160
2019-09-01 18:31:46 +02:00
Frej Drejhammar
f57fba000b Try to do the right thing on non PEP394 compliant systems
PEP 394 [1] tells us that on systems with both a python 2 and 3
installed, the python 2 interpreter should be installed as python2.

Unfortunately not all distributions adhere to PEP 394 (I'm looking at
you, Windows), so to handle that we first try to find a 'python2', then
fall back on plain 'python'. In order to not silently pick a python 3
by mistake, we check sys.version_info using the interpreter we
found.

[1] https://www.python.org/dev/peps/pep-0394/
2019-09-01 18:31:18 +02:00
Frej Drejhammar
b25cbd6753 Merge branch 'pr/157-v3'
Closes #156
2019-08-18 11:57:53 +02:00
MokhamedDakhraui
581b1b3d17 Remove git submodules if .hgsubstate file was removed or emptied
2019-08-18 05:46:46 +03:00
MokhamedDakhraui
7df01ac323 Refactor refresh_gitmodules()
Use the change context substate field instead of manually parsing the `.hgsubstate` file.
2019-08-16 02:42:03 +03:00
MokhamedDakhraui
914f5a0dbe Replaced several lambdas by one loop
2019-08-16 02:41:54 +03:00
MokhamedDakhraui
8779cb5e95 Extract operations with submodules to separated methods
2019-08-16 02:40:44 +03:00
3 changed files with 99 additions and 52 deletions


@@ -4,26 +4,28 @@ hg-fast-export.(sh|py) - mercurial to git converter using git-fast-import
Legal
-----
Most hg-* scripts are licensed under the [MIT license]
(http://www.opensource.org/licenses/mit-license.php) and were written
Most hg-* scripts are licensed under the [MIT license] and were written
by Rocco Rutte <pdmef@gmx.net> with hints and help from the git list and
\#mercurial on freenode. hg-reset.py is licensed under GPLv2 since it
copies some code from the mercurial sources.
The current maintainer is Frej Drejhammar <frej.drejhammar@gmail.com>.
[MIT license]: http://www.opensource.org/licenses/mit-license.php
Support
-------
If you have problems with hg-fast-export or have found a bug, please
create an issue at the [github issue tracker]
(https://github.com/frej/fast-export/issues). Before creating a new
create an issue at the [github issue tracker]. Before creating a new
issue, check that your problem has not already been addressed in an
already closed issue. Do not contact the maintainer directly unless
you want to report a security bug. That way the next person having the
same problem can benefit from the time spent solving the problem the
first time.
[github issue tracker]: https://github.com/frej/fast-export/issues
System Requirements
-------------------
@@ -99,6 +101,12 @@ name the -B and -T options allow a mapping file to be specified to
rename branches and tags (respectively). The syntax of the mapping
file is the same as for the author mapping.
When the -B and -T flags are used, you will probably want to use the
-n flag to disable the built-in (broken in many cases) sanitizing of
branch/tag names. In the future -n will become the default, but in
order to not break existing incremental conversions, the default
remains with the old behavior.
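A minimal sketch of what -n changes, reduced to the one substitution step visible in this diff. The real `sanitize_name()` has further steps that are not shown here, and passing `auto_sanitize` as a function argument is an assumption made for illustration (the script uses a module-level variable):

```python
import re

def sanitize_name(name, mapping=None, auto_sanitize=True):
    """Simplified stand-in for the name handling touched by this change."""
    mapping = mapping or {}
    n = mapping.get(name, name)
    if not auto_sanitize:
        # With -n the (possibly remapped) name is used verbatim.
        return n
    # Same character set as the pattern in the diff: '[', space, '~', '^',
    # ':', '?', backslash, '*', plus any '..' sequence.
    pattern = re.compile(r'([\[ ~^:?\\*]|\.\.)')
    return pattern.sub('_', n)

branch_map = {"release 1.0": "release-1.0"}   # hypothetical -B mapping
print(sanitize_name("release 1.0"))                                   # sanitized
print(sanitize_name("release 1.0", branch_map, auto_sanitize=False))  # mapped, untouched
```

With -n, a name that has no entry in the mapping file is passed through unchanged, so the mapping file is then responsible for producing valid git names.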
Content filtering
-----------------


@@ -31,6 +31,10 @@ cfg_export_boundary=1000
subrepo_cache={}
submodule_mappings=None
# True if fast export should automatically try to sanitize
# author/branch/tag names.
auto_sanitize = None
def gitmode(flags):
return 'l' in flags and '120000' or 'x' in flags and '100755' or '100644'
@@ -127,52 +131,55 @@ def get_author(logmessage,committer,authors):
return r
return committer
def remove_gitmodules(ctx):
"""Removes all submodules of ctx parents"""
# Removing all submodules coming from all parents is safe, as the submodules
# of the current commit will be re-added below. A possible optimization would
# be to only remove the submodules of the first parent.
for parent_ctx in ctx.parents():
for submodule in parent_ctx.substate.keys():
wr('D %s' % submodule)
wr('D .gitmodules')
def refresh_gitmodules(ctx):
"""Updates list of ctx submodules according to .hgsubstate file"""
remove_gitmodules(ctx)
gitmodules=""
# Create the .gitmodules file and all submodules
for name,subrepo_info in ctx.substate.items():
gitRepoLocation=submodule_mappings[name] + "/.git"
# Populate the cache to map mercurial revision to git revision
if not name in subrepo_cache:
subrepo_cache[name]=(load_cache(gitRepoLocation+"/hg2git-mapping"),
load_cache(gitRepoLocation+"/hg2git-marks",
lambda s: int(s)-1))
(mapping_cache,marks_cache)=subrepo_cache[name]
subrepo_hash=subrepo_info[1]
if subrepo_hash in mapping_cache:
revnum=mapping_cache[subrepo_hash]
gitSha=marks_cache[int(revnum)]
wr('M 160000 %s %s' % (gitSha,name))
sys.stderr.write("Adding/updating submodule %s, revision %s->%s\n"
% (name,subrepo_hash,gitSha))
gitmodules+='[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name,name,
submodule_mappings[name])
else:
sys.stderr.write("Warning: Could not find hg revision %s for %s in git %s\n" %
(subrepo_hash,name,gitRepoLocation))
if len(gitmodules):
wr('M 100644 inline .gitmodules')
wr('data %d' % (len(gitmodules)+1))
wr(gitmodules)
def export_file_contents(ctx,manifest,files,hgtags,encoding='',plugins={}):
count=0
max=len(files)
for file in files:
if submodule_mappings and ctx.substate and file==".hgsubstate":
# Remove all submodules as we don't detect deleted submodules properly
# in any other way. We will add the ones not deleted back again below.
for module in submodule_mappings.keys():
wr('D %s' % module)
# Read .hgsubstate file in order to find the revision of each subrepo
data=ctx.filectx(file).data()
subHashes={}
for line in data.split('\n'):
if line.strip()=="":
continue
cols=line.split(' ')
subHashes[cols[1]]=cols[0]
gitmodules=""
# Create the .gitmodules file and all submodules
for name in ctx.substate:
gitRepoLocation=submodule_mappings[name] + "/.git"
# Populate the cache to map mercurial revision to git revision
if not name in subrepo_cache:
subrepo_cache[name]=(load_cache(gitRepoLocation+"/hg2git-mapping"),
load_cache(gitRepoLocation+"/hg2git-marks",
lambda s: int(s)-1))
(mapping_cache, marks_cache)=subrepo_cache[name]
if subHashes[name] in mapping_cache:
revnum=mapping_cache[subHashes[name]]
gitSha=marks_cache[int(revnum)]
wr('M 160000 %s %s' % (gitSha, name))
sys.stderr.write("Adding submodule %s, revision %s->%s\n"
% (name,subHashes[name],gitSha))
gitmodules+='[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name, name, submodule_mappings[name])
else:
sys.stderr.write("Warning: Could not find hg revision %s for %s in git %s\n" % (subHashes[name],name,gitRepoLocation))
if len(gitmodules):
wr('M 100644 inline .gitmodules')
wr('data %d' % (len(gitmodules)+1))
wr(gitmodules)
if submodule_mappings and file==".hgsubstate":
refresh_gitmodules(ctx)
# Skip .hgtags files. They only get us in trouble.
if not hgtags and file == ".hgtags":
sys.stderr.write('Skip %s\n' % (file))
@@ -223,6 +230,8 @@ def sanitize_name(name,what="branch", mapping={}):
if name[0] == '.': return '_'+name[1:]
return name
if not auto_sanitize:
return mapping.get(name,name)
n=mapping.get(name,name)
p=re.compile('([[ ~^:?\\\\*]|\.\.)')
n=p.sub('_', n)
@@ -307,12 +316,14 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,
sys.stderr.write('%s: Exporting %s revision %d/%d with %d/%d/%d added/changed/removed files\n' %
(branch,type,revision+1,max,len(added),len(changed),len(removed)))
if fn_encoding:
removed=[r.decode(fn_encoding).encode('utf8') for r in removed]
for filename in removed:
if fn_encoding:
filename=filename.decode(fn_encoding).encode('utf8')
filename=strip_leading_slash(filename)
if filename=='.hgsubstate':
remove_gitmodules(ctx)
wr('D %s' % filename)
removed=[strip_leading_slash(x) for x in removed]
map(lambda r: wr('D %s' % r),removed)
export_file_contents(ctx,man,added,hgtags,fn_encoding,plugins)
export_file_contents(ctx,man,changed,hgtags,fn_encoding,plugins)
wr()
@@ -527,6 +538,9 @@ if __name__=='__main__':
parser=OptionParser()
parser.add_option("-n", "--no-auto-sanitize",action="store_false",
dest="auto_sanitize",default=True,
help="Do not perform built-in (broken in many cases) sanitizing of names")
parser.add_option("-m","--max",type="int",dest="max",
help="Maximum hg revision to import")
parser.add_option("--mapping",dest="mappingfile",
@@ -575,6 +589,7 @@ if __name__=='__main__':
(options,args)=parser.parse_args()
m=-1
auto_sanitize = options.auto_sanitize
if options.max!=None: m=options.max
if options.marksfile==None: bail(parser,'--marks')


@@ -26,7 +26,29 @@ SFX_MARKS="marks"
SFX_HEADS="heads"
SFX_STATE="state"
GFI_OPTS=""
PYTHON=${PYTHON:-python2}
if [ -z "${PYTHON}" ]; then
# $PYTHON is not set, so we try to find a working python 2.7 to
# use. PEP 394 tells us to use 'python2', otherwise try plain
# 'python'.
if command -v python2 > /dev/null; then
PYTHON="python2"
elif command -v python > /dev/null; then
PYTHON="python"
else
echo "Could not find any python interpreter, please use the 'PYTHON'" \
"environment variable to specify the interpreter to use."
exit 1
fi
fi
# Check that the python specified by the user or autodetected above is
# >= 2.7 and < 3.
if ! ${PYTHON} -c 'import sys; v=sys.version_info; exit(0 if v.major == 2 and v.minor >= 7 else 1)' > /dev/null 2>&1 ; then
echo "${PYTHON} is not a working python 2.7 interpreter, please use the" \
"'PYTHON' environment variable to specify the interpreter to use."
exit 1
fi
USAGE="[--quiet] [-r <repo>] [--force] [-m <max>] [-s] [--hgtags] [-A <file>] [-B <file>] [-T <file>] [-M <name>] [-o <name>] [--hg-hash] [-e <encoding>]"
LONG_USAGE="Import hg repository <repo> up to either tip or <max>
@@ -48,6 +70,8 @@ Options:
-B <file> Read branch map from file
-T <file> Read tags map from file
-M <name> Set the default branch name (defaults to 'master')
-n Do not perform built-in (broken in many cases) sanitizing
of branch/tag names.
-o <name> Use <name> as branch namespace to track upstream (eg 'origin')
--hg-hash Annotate commits with the hg hash as git notes in the
hg namespace.
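
The interpreter selection added to the shell script above can be restated in Python for clarity. This is a sketch under assumptions, not part of the change: the function names are hypothetical, `shutil.which` stands in for `command -v`, and the check indexes `sys.version_info` by position rather than by `.major`/`.minor`:

```python
import shutil
import subprocess

# Equivalent of the one-liner the shell script passes to the interpreter.
CHECK = ('import sys; v=sys.version_info; '
         'sys.exit(0 if v[0] == 2 and v[1] >= 7 else 1)')

def pick_interpreter(env_python=None, which=shutil.which):
    """Honour $PYTHON if set, else prefer 'python2', else plain 'python'."""
    if env_python:
        return env_python
    for candidate in ('python2', 'python'):
        if which(candidate):
            return candidate
    return None  # the script prints an error and exits in this case

def is_python27(version_info):
    """True for 2.7 and later 2.x, False for anything 3.x or older than 2.7."""
    return version_info[0] == 2 and version_info[1] >= 7

def interpreter_ok(interpreter):
    """Run the version check inside the chosen interpreter."""
    try:
        rc = subprocess.call([interpreter, '-c', CHECK],
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
    except OSError:
        return False
    return rc == 0
```

As in the script, the version check is applied to whichever interpreter was selected first; a 'python2' that fails the check does not cause a fallback to 'python'.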