29 Commits

Author SHA1 Message Date
Frej Drejhammar
8762fee403 Merge branch 'gh/337' 2025-03-30 13:47:22 +02:00
Frank Zingsheim
bd707b5d6e Fix: Largefiles ignored #141
Import mercurial large files as ordinary files into git

The basic idea of this fix is based on
https://github.com/planestraveler/fast-export/tree/add-lfs-support-v2
from PR #65

Closes #141
2025-03-29 18:39:27 +01:00
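A minimal sketch of the standin-name handling this commit describes, assuming (as in the diff further down) that largefile standins live under the `.hglf/` prefix. The helper names mirror the diff; the constant name and the sample paths are made up for illustration.

```python
# Sketch only: mirrors the helpers this change adds to hg-fast-export.py.
LARGEFILE_PREFIX = b'.hglf/'  # assumed standin directory of the largefiles extension


def is_largefile(filename: bytes) -> bool:
    """Return True if the path is a largefile standin."""
    return filename.startswith(LARGEFILE_PREFIX)


def largefile_orig_name(filename: bytes) -> bytes:
    """Strip the standin prefix to recover the path the file gets in git."""
    return filename[len(LARGEFILE_PREFIX):]


# Hypothetical paths, for illustration only.
for path in (b'.hglf/assets/video.bin', b'src/main.py'):
    target = largefile_orig_name(path) if is_largefile(path) else path
    print(path.decode(), '->', target.decode())
```

The standin itself only holds a hash; the actual content still has to be available in the local largefiles cache, which is why the README asks for `--all-largefiles` or `hg lfpull`.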
Frej Drejhammar
0afd336d6f Merge branch 'gh/333' 2024-07-13 19:37:00 +02:00
Thalia Archibald
dd1c8f219b Disable core.ignoreCase in tests
When core.ignoreCase is set in the global config, hg-fast-export.sh
warns the user and exits. Override this for tests.
2024-07-06 02:46:07 -07:00
Thalia Archibald
f947189dcc Consistently terminate commit messages with LF
When the length logic for fast-import 'data' commands was updated in
4c10270 (Fix data handling, 2023-03-02), one branch was missed, so
commit messages now do not have a final LF appended in most cases. This
changed the longtime behavior, which had been consistent since the first
commit of hg2git, 9832035 (Initial import, 2007-03-06), and is expected
by some applications which compare against old conversions from
Mercurial.
2024-07-05 05:20:35 -07:00
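To make the LF behavior concrete, here is a small sketch of how a commit message could be framed as a fast-import `data` command. It is not the project's code: `data_command` and `commit_message_command` are hypothetical names, and the only assumption is the documented `data <count> LF <raw> LF?` syntax of git fast-import.

```python
def data_command(payload: bytes) -> bytes:
    """Frame a payload as an exact-byte-count fast-import 'data' command."""
    # The count covers only the payload; the LF after it is the optional
    # trailing LF that fast-import accepts after the raw data.
    return b'data %d\n' % len(payload) + payload + b'\n'


def commit_message_command(desc: bytes) -> bytes:
    """Always terminate the commit message itself with LF before framing it."""
    return data_command(desc + b'\n')


print(commit_message_command(b'r0'))
```

With the message `r0`, the count becomes 3 rather than 2, which matches the `data 2` to `data 3` changes in the expected test output later in this diff.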
Frej Drejhammar
2a3806576c Merge branch 'gh/328' 2024-04-07 15:30:23 +02:00
Frej Drejhammar
08e2297853 CI: Add tests to avoid a repeat of #328
Extend tests to cover the file content filter example plugins in order
to avoid a repeat of #328.
2024-04-07 15:25:04 +02:00
Frej Drejhammar
893d6302b7 Fix errors resulting from #318
When commit ddfc3a8300 ("Run file_data_filter on deleted files")
started calling the file_data_filter plugin method, in order to make
deletion of plugin-renamed files work, the example plugins were not
updated. This commit updates the example plugins to not crash when the
file context is None.

Thanks to @hetas for discovering this.

Closes #328
2024-04-07 15:23:08 +02:00
Frej Drejhammar
3de7bcfc18 CI: Remove run-tests script
The script should have been removed in 90c6ad5f87 ("test: use make
to run the tests").
2024-03-02 20:25:29 +01:00
Frej Drejhammar
d72e96b202 Drop manual CodeQL actions
Use the default configuration set up in the web interface instead
of the hand-configured CI actions, which give warnings.
2024-02-23 18:11:10 +01:00
Frej Drejhammar
fb225c4700 Merge branch 'gh/321' 2024-02-23 17:07:02 +01:00
Frej Drejhammar
997e8e1a8c Merge branch 'gh/320'
Fixes warnings appearing with Python 3.12.

hg-fast-export.py:231: SyntaxWarning: invalid escape sequence '\.'
2024-02-23 17:04:28 +01:00
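The warning comes from `\.` inside a non-raw bytes literal. As a sketch, the pattern can be written either with doubled backslashes (as the fix in this diff does) or as a raw bytes literal; the raw form is an alternative shown here for comparison, not what the commit uses, and the sample branch name is made up.

```python
import re

# Form used by the fix: every backslash doubled in a normal bytes literal.
escaped = re.compile(b'([\\[ ~^:?\\\\*]|\\.\\.)')

# Equivalent raw bytes literal (alternative spelling, same pattern bytes).
raw = re.compile(rb'([\[ ~^:?\\*]|\.\.)')

assert escaped.pattern == raw.pattern

# Hypothetical branch name: '..' and the space are each replaced by '_'.
print(raw.sub(b'_', b'release..1 beta'))
```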
Stephan Hohe
ddb574004f Add tests for plugins setting file content to None 2024-02-23 13:43:28 +01:00
Stephan Hohe
e63feee1b9 Don't add file if plugin sets content to None 2024-02-20 17:07:23 +01:00
Stephan Hohe
7b4bb7ff1d Fix escape in regular expression 2024-02-19 23:40:05 +01:00
Frej Drejhammar
53bbe05278 Merge branch 'frej/gh318'
Closes #318
2024-02-16 17:56:17 +01:00
Frej Drejhammar
ddfc3a8300 Run file_data_filter on deleted files
The `file_data_filter` method should be called when files are deleted.
In this case the `data` and `file_ctx` keys map to None. This is so
that a filter which modifies file names can apply the same name
transformations before files are deleted.
2024-02-16 17:12:49 +01:00
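A sketch of a plugin that copes with both cases, assuming only the `file_data` keys named in this commit message (`filename`, `file_ctx`, `data`). The `docs/` to `documentation/` rename is a made-up example; real plugins such as dos2unix also check `file_ctx.isbinary()`, which is omitted here.

```python
class Filter:
    def __init__(self, args):
        self.args = args

    def file_data_filter(self, file_data):
        # Name transformations run for added, modified and deleted files,
        # so that a deletion targets the same renamed path.
        name = file_data['filename']
        if name.startswith(b'docs/'):
            file_data['filename'] = b'documentation/' + name[len(b'docs/'):]

        # For deleted files there is no content to transform.
        if file_data['file_ctx'] is None:
            return

        file_data['data'] = file_data['data'].replace(b'\r\n', b'\n')


def build_filter(args):
    return Filter(args)
```

Keeping the rename above the `None` guard is the point of the design: returning early before the rename would leave deleted files under their old names.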
Frej Drejhammar
21ab3f347b Make plugin loader look in directories relative to cwd
Make the plugin loader also look for plugins using a path relative to
the current working directory.
2024-02-16 17:06:51 +01:00
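The resulting search order can be sketched as follows. `plugin_search_dirs` and `find_plugin_dir` are hypothetical helper names; the order itself (explicit plugin path first, then the bundled plugin folder, then the current working directory) follows the `pluginloader` diff further down.

```python
import os


def plugin_search_dirs(plugin_folder, plugin_path=None):
    """Return directories to search, highest priority first."""
    search_dirs = [plugin_folder, '.']  # '.' is the current working directory
    if plugin_path:
        search_dirs = [plugin_path] + search_dirs
    return search_dirs


def find_plugin_dir(name, search_dirs):
    """Return the first directory containing <name>/__init__.py, or None."""
    for directory in search_dirs:
        location = os.path.join(directory, name)
        if os.path.isfile(os.path.join(location, '__init__.py')):
            return location
    return None
```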
Frej Drejhammar
878ba44f48 Merge branch 'frej/run-tests-with-different-python-versions' 2023-12-28 13:48:02 +01:00
Frej Drejhammar
2476d08517 Run tests with multiple Python versions
Run the CI tests with both the earliest supported Python version and
the latest stable release.

The intent is to quickly notice when new features require adjusting
the oldest supported Python version and also detect when the latest
stable version breaks old code (as when 3.12 removed `imp` and we
switched to `importlib` in #311).
2023-12-28 13:40:48 +01:00
Frej Drejhammar
d4298a0906 Check for a supported Python version on startup
Check that hg-fast-export is running on a supported version of Python
on startup. This is an attempt to avoid problems like #314 in the
future.
2023-12-28 13:40:48 +01:00
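A sketch of such a startup check written in Python, assuming the 3.7 minimum stated in this series. `MIN_PYTHON` and `is_supported` are illustrative names; the actual check in this diff is a one-line `python -c` call in `hg-fast-export.sh`, which encodes the result the other way round (exit status 0 there means "unsupported").

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the README change


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == '__main__':
    if not is_supported():
        sys.stderr.write(
            'Could not find an interpreter for a supported Python version '
            '(>= %d.%d)\n' % MIN_PYTHON
        )
        sys.exit(1)
```

Comparing tuples avoids the pitfall of testing `major == 3 and minor >= 7` separately, which would reject a hypothetical future major version.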
Frej Drejhammar
efe934e16b Update required version of Python to 3.7
Due to problems with handling of Unicode input in Python < 3.7, bump
the required version of Python to 3.7.
2023-12-28 13:40:48 +01:00
Frej Drejhammar
59675eca22 Add command line flag to dump found versions
Add `--debug` command line flag which dumps the detected versions of
Mercurial and Python. This will probably help future debugging when
unexpected versions are used.
2023-12-28 13:40:48 +01:00
Frej Drejhammar
3c694243c4 Merge branch 'frej/fix-314' 2023-12-28 13:39:42 +01:00
Frej Drejhammar
1bbf7028b4 Don't look for a Python 2 interpreter
Don't look for a Python 2 interpreter as Python 2 is no longer
supported. If there is a Python 2 available and it had the Mercurial
modules available, hg-fast-export would use it and fail to import
`importlib.machinery`. This is probably the cause of #314.

Closes #314.
2023-12-27 13:18:56 +01:00
Frej Drejhammar
c8fa290adf Merge branch 'PR/312' 2023-11-18 20:39:44 +01:00
Ekin Dursun
c49dd0cf60 Remove Python 2 compatibility code
Python 2 support was removed recently, so we don't need the
compatibility code anymore.
2023-11-18 20:22:18 +03:00
Frej Drejhammar
4f94d61d84 Merge branch 'PR/311'
Closes #311
2023-11-18 14:54:53 +01:00
Ekin Dursun
a3d0562737 Make pluginloader use importlib instead of imp
Python 3.12 has removed imp, and importlib is the recommended
replacement. Python 2.7 doesn't have importlib, so Python 2.7 support
is dropped as part of this change (not a big deal, since it has been
EOL for more than 3 years).
2023-11-12 20:41:43 +03:00
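For readers unfamiliar with the `imp` to `importlib` migration, a minimal sketch of loading a plugin's `__init__.py` from a directory. It uses `importlib.util.spec_from_file_location` rather than the `PathFinder.find_spec` call in the diff below, so treat it as a variant of the same idea; the `load_plugin` signature here is an assumption, not the project's.

```python
import importlib.util
import os


def load_plugin(location, main_module='__init__'):
    """Load <location>/<main_module>.py and return the module object."""
    path = os.path.join(location, main_module + '.py')
    spec = importlib.util.spec_from_file_location(main_module, path)
    if spec is None or spec.loader is None:
        raise ImportError('Could not load plugin from ' + path)
    module = importlib.util.module_from_spec(spec)
    # Executes the module body; nothing is registered in sys.modules.
    spec.loader.exec_module(module)
    return module
```

Because the module is not added to `sys.modules`, several plugins that all use `__init__` as their module name can be loaded side by side.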
22 changed files with 566 additions and 175 deletions

.github/requirements-earliest.txt vendored Normal file

@@ -0,0 +1 @@
mercurial==5.2

.github/requirements-latest.txt vendored Normal file

@@ -0,0 +1,2 @@
mercurial


@@ -8,24 +8,64 @@ on:
branches: [master]
jobs:
test:
name: Run test suite
test-earliest:
name: Run test suite on the earliest supported Python version
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
name: Checkout repository
with:
fetch-depth: 1
submodules: 'recursive'
- uses: actions/setup-python@v5
id: earliest
with:
python-version: '3.7.x'
check-latest: true
cache: 'pip'
cache-dependency-path: '**/requirements-earliest.txt'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r .github/requirements-earliest.txt
- name: Report selected versions
run: |
echo Selected '${{ steps.earliest.outputs.python-version }}'
./hg-fast-export.sh --debug
- name: Run tests on earliest supported Python version
run: make -C t
test-latest:
name: Run test suite on the latest supported python version
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: 'recursive'
- uses: actions/checkout@v4
name: Checkout repository
with:
fetch-depth: 1
submodules: 'recursive'
- uses: actions/setup-python@v5
id: latest
with:
python-version: '3.x'
check-latest: true
cache: 'pip'
cache-dependency-path: '**/requirements-latest.txt'
- name: Run tests
run: make -C t
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r .github/requirements-latest.txt
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
with:
languages: python
- name: Report selected version
run: |
echo Selected '${{ steps.latest.outputs.python-version }}'
./hg-fast-export.sh --debug
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
- name: Run tests on 3.x
run: make -C t


@@ -29,10 +29,9 @@ first time.
System Requirements
-------------------
This project depends on Python 2.7 or 3.5+, and the Mercurial >= 4.6
package (>= 5.2, if Python 3.5+). If Python is not installed, install
it before proceeding. The Mercurial package can be installed with `pip
install mercurial`.
This project depends on Python (>=3.7) and the Mercurial package (>=
5.2). If Python is not installed, install it before proceeding. The
Mercurial package can be installed with `pip install mercurial`.
On windows the bash that comes with "Git for Windows" is known to work
well.
@@ -110,8 +109,8 @@ branch/tag names. In the future -n will become the default, but in
order to not break existing incremental conversions, the default
remains with the old behavior.
By default, the `default` mercurial branch is renamed to the `master`
branch on git. If your mercurial repo contains both `default` and
By default, the `default` mercurial branch is renamed to the `master`
branch on git. If your mercurial repo contains both `default` and
`master` branches, you'll need to override this behavior. Use
`-M <newName>` to specify what name to give the `default` branch.
@@ -139,6 +138,15 @@ if [ "$3" == "1" ]; then cat; else dos2unix -q; fi
-- End of crlf-filter.sh --
```
Mercurial Largefiles Extension
------------------------------
Mercurial largefiles are exported as ordinary files into git, i.e. not
as git lfs files. In order to make the export work, make sure that
you have all largefiles of all mercurial commits available locally.
This can be ensured by either cloning the mercurial repository with
the option --all-largefiles or by executing the command
'hg lfpull --rev "all()"' inside the mercurial repository.
Plugins
-----------------
@@ -180,7 +188,7 @@ values in the dictionary after filters have been run are used to create the git
commit.
```
file_data = {'filename':filename,'file_ctx':file_ctx,'d':d}
file_data = {'filename':filename,'file_ctx':file_ctx,'data':file_contents}
def file_data_filter(self,file_data):
```
@@ -190,6 +198,11 @@ can be modified by any filter. `file_ctx` is the filecontext from the
mercurial python library. After all filters have been run, the values
are used to add the file to the git commit.
The `file_data_filter` method is also called when files are deleted,
but in this case the `data` and `file_ctx` keys map to None. This is
so that a filter which modifies file names can apply the same name
transformations when files are deleted.
Submodules
----------
See README-SUBMODULES.md for how to convert subrepositories into git


@@ -1,6 +1,7 @@
#!/usr/bin/env python2
#!/usr/bin/env python3
# Copyright (c) 2007, 2008 Rocco Rutte <pdmef@gmx.net> and others.
# Copyright (c) 2025 Siemens
# License: MIT <http://www.opensource.org/licenses/mit-license.php>
from hg2git import setup_repo,fixup_user,get_branch,get_changeset
@@ -11,17 +12,7 @@ import sys
import os
from binascii import hexlify
import pluginloader
PY2 = sys.version_info.major == 2
if PY2:
str = unicode
if PY2 and sys.platform == "win32":
# On Windows, sys.stdout is initially opened in text mode, which means that
# when a LF (\n) character is written to sys.stdout, it will be converted
# into CRLF (\r\n). That makes git blow up, so use this platform-specific
# code to change the mode of sys.stdout to binary.
import msvcrt
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
from hgext.largefiles import lfutil
# silly regex to catch Signed-off-by lines in log message
sob_re=re.compile(b'^Signed-[Oo]ff-[Bb]y: (.+)$')
@@ -37,16 +28,13 @@ submodule_mappings=None
# author/branch/tag names.
auto_sanitize = None
stdout_buffer = sys.stdout if PY2 else sys.stdout.buffer
stderr_buffer = sys.stderr if PY2 else sys.stderr.buffer
def gitmode(flags):
return b'l' in flags and b'120000' or b'x' in flags and b'100755' or b'100644'
def wr_no_nl(msg=b''):
assert isinstance(msg, bytes)
if msg:
stdout_buffer.write(msg)
sys.stdout.buffer.write(msg)
def wr(msg=b''):
wr_no_nl(msg + b'\n')
@@ -59,7 +47,7 @@ def wr_data(data):
def checkpoint(count):
count=count+1
if cfg_checkpoint_count>0 and count%cfg_checkpoint_count==0:
stderr_buffer.write(b"Checkpoint after %d commits\n" % count)
sys.stderr.buffer.write(b"Checkpoint after %d commits\n" % count)
wr(b'checkpoint')
wr()
return count
@@ -128,7 +116,7 @@ def remove_gitmodules(ctx):
def refresh_git_submodule(name,subrepo_info):
wr(b'M 160000 %s %s' % (subrepo_info[1],name))
stderr_buffer.write(
sys.stderr.buffer.write(
b"Adding/updating submodule %s, revision %s\n" % (name, subrepo_info[1])
)
return b'[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name, name, subrepo_info[0])
@@ -148,14 +136,14 @@ def refresh_hg_submodule(name,subrepo_info):
revnum=mapping_cache[subrepo_hash]
gitSha=marks_cache[int(revnum)]
wr(b'M 160000 %s %s' % (gitSha,name))
stderr_buffer.write(
sys.stderr.buffer.write(
b"Adding/updating submodule %s, revision %s->%s\n"
% (name, subrepo_hash, gitSha)
)
return b'[submodule "%s"]\n\tpath = %s\n\turl = %s\n' % (name,name,
submodule_mappings[name])
else:
stderr_buffer.write(
sys.stderr.buffer.write(
b"Warning: Could not find hg revision %s for %s in git %s\n"
% (subrepo_hash, name, gitRepoLocation,)
)
@@ -176,6 +164,32 @@ def refresh_gitmodules(ctx):
wr(b'M 100644 inline .gitmodules')
wr_data(gitmodules)
def is_largefile(filename):
return filename[:6] == b'.hglf/'
def largefile_orig_name(filename):
return filename[6:]
def largefile_data(ctx, file, filename):
lf_file_ctx=ctx.filectx(file)
lf_hash=lf_file_ctx.data().strip(b'\n')
sys.stderr.write("Detected large file hash %s\n" % lf_hash.decode())
#should detect where the large files are located
file_with_data = lfutil.findfile(ctx.repo(), lf_hash)
if file_with_data is None:
# Autodownloading from the mercurial repository would be an issue as there
# is a good chance that we may need to input some username and password.
# This will surely break fast-export as there will be some unexpected
# output.
sys.stderr.write("Large file wasn't found in local cache.\n")
sys.stderr.write("Please clone with --all-largefiles\n")
sys.stderr.write("or pull all large files with 'hg lfpull --rev "
"\"all()\"'\n")
# closing in the middle of import will revert everything to the last checkpoint
sys.exit(3)
with open(os.path.normpath(file_with_data), 'rb') as file_with_data_handle:
return file_with_data_handle.read()
def export_file_contents(ctx,manifest,files,hgtags,encoding='',plugins={}):
count=0
max=len(files)
@@ -186,19 +200,23 @@ def export_file_contents(ctx,manifest,files,hgtags,encoding='',plugins={}):
refresh_gitmodules(ctx)
# Skip .hgtags files. They only get us in trouble.
if not hgtags and file == b".hgtags":
stderr_buffer.write(b'Skip %s\n' % file)
sys.stderr.buffer.write(b'Skip %s\n' % file)
continue
if encoding:
filename=file.decode(encoding).encode('utf8')
else:
filename=file
if b'.git' in filename.split(b'/'): # Even on Windows, the path separator is / here.
stderr_buffer.write(
sys.stderr.buffer.write(
b'Ignoring file %s which cannot be tracked by git\n' % filename
)
continue
file_ctx=ctx.filectx(file)
d=file_ctx.data()
if is_largefile(filename):
filename = largefile_orig_name(filename)
d = largefile_data(ctx, file, filename)
else:
file_ctx=ctx.filectx(file)
d=file_ctx.data()
if plugins and plugins['file_data_filters']:
file_data = {'filename':filename,'file_ctx':file_ctx,'data':d}
@@ -208,15 +226,16 @@ def export_file_contents(ctx,manifest,files,hgtags,encoding='',plugins={}):
filename=file_data['filename']
file_ctx=file_data['file_ctx']
wr(b'M %s inline %s' % (gitmode(manifest.flags(file)),
strip_leading_slash(filename)))
wr(b'data %d' % len(d)) # had some trouble with size()
wr(d)
count+=1
if count%cfg_export_boundary==0:
stderr_buffer.write(b'Exported %d/%d files\n' % (count,max))
if d is not None:
wr(b'M %s inline %s' % (gitmode(manifest.flags(file)),
strip_leading_slash(filename)))
wr(b'data %d' % len(d)) # had some trouble with size()
wr(d)
count+=1
if count%cfg_export_boundary==0:
sys.stderr.buffer.write(b'Exported %d/%d files\n' % (count,max))
if max>cfg_export_boundary:
stderr_buffer.write(b'Exported %d/%d files\n' % (count,max))
sys.stderr.buffer.write(b'Exported %d/%d files\n' % (count,max))
def sanitize_name(name,what="branch", mapping={}):
"""Sanitize input roughly according to git-check-ref-format(1)"""
@@ -242,7 +261,7 @@ def sanitize_name(name,what="branch", mapping={}):
if not auto_sanitize:
return mapping.get(name,name)
n=mapping.get(name,name)
p=re.compile(b'([\\[ ~^:?\\\\*]|\.\.)')
p=re.compile(b'([\\[ ~^:?\\\\*]|\\.\\.)')
n=p.sub(b'_', n)
if n[-1:] in (b'/', b'.'): n=n[:-1]+b'_'
n=b'/'.join([dot(s) for s in n.split(b'/')])
@@ -250,7 +269,7 @@ def sanitize_name(name,what="branch", mapping={}):
n=p.sub(b'_', n)
if n!=name:
stderr_buffer.write(
sys.stderr.buffer.write(
b'Warning: sanitized %s [%s] to [%s]\n' % (what.encode(), name, n)
)
return n
@@ -294,7 +313,7 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,
parents = commit_data['parents']
author = commit_data['author']
user = commit_data['committer']
desc = commit_data['desc'] + b'\n'
desc = commit_data['desc']
if len(parents)==0 and revision != 0:
wr(b'reset refs/heads/%s' % branch)
@@ -304,7 +323,7 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,
if sob:
wr(b'author %s %d %s' % (author,time,timezone))
wr(b'committer %s %d %s' % (user,time,timezone))
wr_data(desc)
wr_data(desc + b'\n')
man=ctx.manifest()
@@ -320,17 +339,28 @@ def export_commit(ui,repo,revision,old_marks,max,count,authors,
modified,removed=get_filechanges(repo,revision,parents,files)
stderr_buffer.write(
sys.stderr.buffer.write(
b'%s: Exporting %s revision %d/%d with %d/%d modified/removed files\n'
% (branch, type.encode(), revision + 1, max, len(modified), len(removed))
)
for filename in removed:
for file in removed:
if fn_encoding:
filename=filename.decode(fn_encoding).encode('utf8')
filename=file.decode(fn_encoding).encode('utf8')
else:
filename=file
if plugins and plugins['file_data_filters']:
file_data = {'filename':filename, 'file_ctx':None, 'data':None}
for filter in plugins['file_data_filters']:
filter(file_data)
filename=file_data['filename']
filename=strip_leading_slash(filename)
if filename==b'.hgsub':
remove_gitmodules(ctx)
if is_largefile(filename):
filename=largefile_orig_name(filename)
wr(b'D %s' % filename)
export_file_contents(ctx,man,modified,hgtags,fn_encoding,plugins)
@@ -366,18 +396,18 @@ def export_tags(ui,repo,old_marks,mapping_cache,count,authors,tagsmap):
if tag==b'tip': continue
# ignore tags to nodes that are missing (ie, 'in the future')
if hexlify(node) not in mapping_cache:
stderr_buffer.write(b'Tag %s refers to unseen node %s\n' % (tag, hexlify(node)))
sys.stderr.buffer.write(b'Tag %s refers to unseen node %s\n' % (tag, hexlify(node)))
continue
rev=int(mapping_cache[hexlify(node)])
ref=revnum_to_revref(rev, old_marks)
if ref==None:
stderr_buffer.write(
sys.stderr.buffer.write(
b'Failed to find reference for creating tag %s at r%d\n' % (tag, rev)
)
continue
stderr_buffer.write(b'Exporting tag [%s] at [hg r%d] [git %s]\n' % (tag, rev, ref))
sys.stderr.buffer.write(b'Exporting tag [%s] at [hg r%d] [git %s]\n' % (tag, rev, ref))
wr(b'reset refs/tags/%s' % tag)
wr(b'from %s' % ref)
wr()
@@ -411,8 +441,8 @@ def load_mapping(name, filename, mapping_is_raw):
def parse_quoted_line(line):
m=quoted_regexp.match(line)
if m==None:
return
return
return (process_unicode_escape_sequences(m.group(1)),
process_unicode_escape_sequences(m.group(5)))
@@ -464,12 +494,12 @@ def verify_heads(ui,repo,cache,force,ignore_unnamed_heads,branchesmap):
sha1=get_git_sha1(sanitized_name)
c=cache.get(sanitized_name)
if not c and sha1:
stderr_buffer.write(
sys.stderr.buffer.write(
b'Error: Branch [%s] already exists and was not created by hg-fast-export, '
b'export would overwrite unrelated branch\n' % b)
if not force: return False
elif sha1!=c:
stderr_buffer.write(
sys.stderr.buffer.write(
b'Error: Branch [%s] modified outside hg-fast-export:'
b'\n%s (repo) != %s (cache)\n' % (b, b'<None>' if sha1 is None else sha1, c)
)
@@ -481,7 +511,7 @@ def verify_heads(ui,repo,cache,force,ignore_unnamed_heads,branchesmap):
for h in repo.filtered(b'visible').heads():
branch=get_branch(repo[h].branch())
if t.get(branch,False):
stderr_buffer.write(
sys.stderr.buffer.write(
b'Error: repository has an unnamed head: hg r%d\n'
% repo.changelog.rev(h)
)


@@ -29,7 +29,7 @@ GFI_OPTS=""
if [ -z "${PYTHON}" ]; then
# $PYTHON is not set, so we try to find a working python with mercurial:
for python_cmd in python2 python python3; do
for python_cmd in python3 python; do
if command -v $python_cmd > /dev/null; then
$python_cmd -c 'from mercurial.scmutil import revsymbol' 2> /dev/null
if [ $? -eq 0 ]; then
@@ -45,6 +45,14 @@ if [ -z "${PYTHON}" ]; then
exit 1
fi
"${PYTHON}" -c 'import sys; exit(sys.version_info.major==3 and sys.version_info.minor >= 7)'
if [ $? -eq 0 ]; then
echo "Could not find an interpreter for a supported Python version (>= 3.7)" \
"Please use the 'PYTHON' environment variable to specify the interpreter to use."
exit 1
fi
USAGE="[--quiet] [-r <repo>] [--force] [--ignore-unnamed-heads] [-m <max>] [-s] [--hgtags] [-A <file>] [-B <file>] [-T <file>] [-M <name>] [-o <name>] [--hg-hash] [-e <encoding>]"
LONG_USAGE="Import hg repository <repo> up to either tip or <max>
If <repo> is omitted, use last hg repository as obtained from state file,
@@ -86,6 +94,14 @@ case "$1" in
echo ""
echo "$LONG_USAGE"
exit 0
;;
--debug)
echo -n "Using Python: "
"${PYTHON}" --version
echo -n "Using Mercurial: "
hg --version
exit 0
esac
IS_BARE=$(git rev-parse --is-bare-repository) \


@@ -1,4 +1,4 @@
#!/usr/bin/env python
#!/usr/bin/env python3
# Copyright (c) 2007, 2008 Rocco Rutte <pdmef@gmx.net> and others.
# License: GPLv2


@@ -1,4 +1,4 @@
#!/usr/bin/env python2
#!/usr/bin/env python3
# Copyright (c) 2007, 2008 Rocco Rutte <pdmef@gmx.net> and others.
# License: MIT <http://www.opensource.org/licenses/mit-license.php>
@@ -12,13 +12,6 @@ import os
import sys
import subprocess
PY2 = sys.version_info.major < 3
if PY2:
str = unicode
fsencode = lambda s: s.encode(sys.getfilesystemencoding())
else:
from os import fsencode
# default git branch name
cfg_master=b'master'
# default origin name
@@ -44,7 +37,7 @@ def setup_repo(url):
myui.setconfig(b'ui', b'interactive', b'off')
# Avoids a warning when the repository has obsolete markers
myui.setconfig(b'experimental', b'evolution.createmarkers', True)
return myui,hg.repository(myui, fsencode(url)).unfiltered()
return myui,hg.repository(myui, os.fsencode(url)).unfiltered()
def fixup_user(user,authors):
user=user.strip(b"\"")


@@ -1,19 +1,23 @@
import os
import imp
import importlib.machinery
import importlib.util
PluginFolder = os.path.join(os.path.dirname(os.path.realpath(__file__)),"..","plugins")
MainModule = "__init__"
def get_plugin(name, plugin_path):
search_dirs = [PluginFolder]
search_dirs = [PluginFolder, '.']
if plugin_path:
search_dirs = [plugin_path] + search_dirs
for dir in search_dirs:
location = os.path.join(dir, name)
if not os.path.isdir(location) or not MainModule + ".py" in os.listdir(location):
continue
info = imp.find_module(MainModule, [location])
return {"name": name, "info": info, "path": location}
spec = importlib.machinery.PathFinder.find_spec(MainModule, [location])
return {"name": name, "spec": spec, "path": location}
raise Exception("Could not find plugin with name " + name)
def load_plugin(plugin):
return imp.load_module(MainModule, *plugin["info"])
spec = plugin["spec"]
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module


@@ -6,6 +6,8 @@ class Filter():
pass
def file_data_filter(self,file_data):
if file_data['file_ctx'] == None:
return
file_ctx = file_data['file_ctx']
if not file_ctx.isbinary():
file_data['data'] = file_data['data'].replace(b'\r\n', b'\n')


@@ -15,6 +15,8 @@ class Filter:
d = file_data['data']
file_ctx = file_data['file_ctx']
filename = file_data['filename']
if file_ctx == None:
return
filter_cmd = self.filter_contents + [filename, node.hex(file_ctx.filenode()), '1' if file_ctx.isbinary() else '0']
try:
filter_proc = subprocess.Popen(filter_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)


@@ -1,49 +0,0 @@
#!/bin/bash
READLINK="readlink"
if command -v greadlink > /dev/null; then
READLINK="greadlink" # Prefer greadlink over readlink
fi
if ! $READLINK -f "$(which "$0")" > /dev/null 2>&1 ; then
ROOT="$(dirname "$(which "$0")")"
if [ ! -f "$ROOT/hg-fast-export.py" ] ; then
echo "test runner requires a readlink implementation which knows" \
" how to canonicalize paths in order to be called via a symlink."
exit 1
fi
else
ROOT="$(dirname "$($READLINK -f "$(which "$0")")")"
fi
export SHARNESS_TEST_SRCDIR="${SHARNESS_TEST_SRCDIR:-$ROOT/t/sharness}"
TESTS=$(find $ROOT/t -maxdepth 1 -name \*.t -executable -type f)
failed=0
type parallel >& /dev/null
if [ $? -eq 0 ]; then
echo "Using parallel to run tests"
function F() {
echo "Running test $1"
$1
}
export -f F
parallel F ::: $TESTS || failed=1
else
for i in $TESTS ; do
echo "Running test $i"
$i || failed=1
done
fi
if [ "$failed" -eq "0" ]; then
echo "All tests passed";
else
echo "There were failed tests";
fi
exit $failed


@@ -0,0 +1,30 @@
blob
mark :1
data 7
good_a
reset refs/heads/master
commit refs/heads/master
mark :2
author Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
data 3
r0
M 100644 :1 good_a.txt
commit refs/heads/master
mark :3
author Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
data 3
r1
from :2
commit refs/heads/master
mark :4
author Grevious Bodily Harmsworth <gbh@example.com> 1679022000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679022000 +0000
data 3
r2
from :3


@@ -0,0 +1,91 @@
#!/bin/bash
#
# Copyright (c) 2023 Felipe Contreras
# Copyright (c) 2023 Frej Drejhammar
# Copyright (c) 2024 Stephan Hohe
#
# Check that files that file_data_filter sets to None are removed from repository
#
test_description='Remove files from file_data_filter plugin test'
. "${SHARNESS_TEST_SRCDIR-$(dirname "$0")/sharness}"/sharness.sh || exit 1
check() {
echo "$3" > expected &&
git -C "$1" show -q --format='%s' "$2" > actual &&
test_cmp expected actual
}
git_create() {
git init -q "$1" &&
git -C "$1" config core.ignoreCase false
}
git_convert() {
(
cd "$2" &&
hg-fast-export.sh --repo "../$1" \
-s --hgtags -n \
--plugin ../../plugins/removefiles_test_plugin
)
}
setup() {
cat > "$HOME"/.hgrc <<-EOF
[ui]
username = Grevious Bodily Harmsworth <gbh@example.com>
EOF
}
commit0() {
(
# Test initial revision with suppressed file
cd hgrepo &&
echo "good_a" > good_a.txt &&
echo "bad_a" > bad_a.txt &&
hg add good_a.txt bad_a.txt &&
hg commit -d "2023-03-17 01:00Z" -m "r0"
)
}
commit1() {
(
# Test modifying suppressed file
# Test adding suppressed file
cd hgrepo &&
echo "bad_a_modif" > bad_a.txt &&
echo "bad_b" > bad_b.txt &&
hg add bad_b.txt &&
hg commit -d "2023-03-17 02:00Z" -m "r1"
)
}
commit2() {
(
# Test removing suppressed file
cd hgrepo &&
hg rm bad_a.txt &&
hg commit -d "2023-03-17 03:00Z" -m "r2"
)
}
setup
test_expect_success 'all in one' '
test_when_finished "rm -rf hgrepo gitrepo" &&
(
hg init hgrepo &&
commit0 &&
commit1 &&
commit2
) &&
git_create gitrepo &&
git_convert hgrepo gitrepo &&
git -C gitrepo fast-export --all > actual &&
test_cmp "$SHARNESS_TEST_DIRECTORY"/file_data_filter-removefiles.expected actual
'
test_done


@@ -0,0 +1,29 @@
blob
mark :1
data 7
a_file
blob
mark :2
data 17
a_file_to_rename
reset refs/heads/master
commit refs/heads/master
mark :3
author Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
data 3
r0
M 100644 :1 a.txt
M 100644 :2 c.txt
commit refs/heads/master
mark :4
author Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
data 3
r1
from :3
D c.txt

t/file_data_filter.t Executable file

@@ -0,0 +1,84 @@
#!/bin/bash
#
# Copyright (c) 2023 Felipe Contreras
# Copyright (c) 2023 Frej Drejhammar
#
# Check that the file_data_filter is called for removed files.
#
test_description='Smoke test'
. "${SHARNESS_TEST_SRCDIR-$(dirname "$0")/sharness}"/sharness.sh || exit 1
check() {
echo "$3" > expected &&
git -C "$1" show -q --format='%s' "$2" > actual &&
test_cmp expected actual
}
git_create() {
git init -q "$1" &&
git -C "$1" config core.ignoreCase false
}
git_convert() {
(
cd "$2" &&
hg-fast-export.sh --repo "../$1" \
-s --hgtags -n \
--plugin ../../plugins/rename_file_test_plugin \
--plugin dos2unix \
--plugin shell_filter_file_contents=../../plugins/id
)
}
setup() {
cat > "$HOME"/.hgrc <<-EOF
[ui]
username = Grevious Bodily Harmsworth <gbh@example.com>
EOF
}
commit0() {
(
cd hgrepo &&
echo "a_file" > a.txt &&
echo "a_file_to_rename" > b.txt &&
hg add a.txt b.txt &&
hg commit -d "2023-03-17 01:00Z" -m "r0"
)
}
commit1() {
(
cd hgrepo &&
hg remove b.txt &&
hg commit -d "2023-03-17 02:00Z" -m "r1"
)
}
make-branch() {
hg branch "$1"
FILE=$(echo "$1" | sha1sum | cut -d " " -f 1)
echo "$1" > $FILE
hg add $FILE
hg commit -d "2023-03-17 $2:00Z" -m "Added file in branch $1"
}
setup
test_expect_success 'all in one' '
test_when_finished "rm -rf hgrepo gitrepo" &&
(
hg init hgrepo &&
commit0 &&
commit1
) &&
git_create gitrepo &&
git_convert hgrepo gitrepo &&
git -C gitrepo fast-export --all > actual &&
test_cmp "$SHARNESS_TEST_DIRECTORY"/file_data_filter.expected actual
'
test_done


@@ -17,6 +17,7 @@ git_clone() {
(
git init -q "$2" &&
cd "$2" &&
git config core.ignoreCase false &&
hg-fast-export.sh --repo "../$1"
)
}
@@ -91,4 +92,53 @@ test_expect_success 'merge' '
test_cmp expected actual
'
test_expect_success 'hg large file' '
test_when_finished "rm -rf hgrepo gitrepo" &&
(
hg init hgrepo &&
cd hgrepo &&
echo "[extensions]" >> .hg/hgrc
echo "largefiles =" >> .hg/hgrc
echo a > content &&
echo a > file1 &&
hg add content &&
hg add --large file1 &&
hg commit -m "origin" &&
echo b > content &&
echo b > file2 &&
hg add --large file2 &&
hg rm file1 &&
hg commit -m "right" &&
hg update -r0 &&
echo c > content &&
hg commit -m "left" &&
HGMERGE=true hg merge -r1 &&
hg commit -m "merge"
) &&
git_clone hgrepo gitrepo &&
cat > expected <<-EOF &&
left
c
tree @:
content
file2
EOF
(
cd gitrepo
git show -q --format='%s' @^ &&
git show @:content &&
git show @:
) > actual &&
test_cmp expected actual
'
test_done

t/plugins/id Executable file

@@ -0,0 +1,2 @@
#!/bin/bash
cat


@@ -0,0 +1,15 @@
import subprocess
import shlex
import sys
from mercurial import node
def build_filter(args):
return Filter(args)
class Filter:
def __init__(self, args):
self.filter_contents = shlex.split(args)
def file_data_filter(self,file_data):
if file_data['filename'].startswith(b'bad'):
file_data['data'] = None


@@ -0,0 +1,15 @@
import subprocess
import shlex
import sys
from mercurial import node
def build_filter(args):
return Filter(args)
class Filter:
def __init__(self, args):
self.filter_contents = shlex.split(args)
def file_data_filter(self,file_data):
if file_data['filename'] == b'b.txt':
file_data['filename'] = b'c.txt'


@@ -13,8 +13,9 @@ commit refs/heads/master
mark :3
author Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679014800 +0000
data 2
r0M 100644 :1 a.txt
data 3
r0
M 100644 :1 a.txt
M 100644 :2 b.txt
blob
@@ -31,8 +32,9 @@ commit refs/tags/2019_Spring_R2
mark :6
author Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679018400 +0000
data 2
r1from :3
data 3
r1
from :3
M 100644 :4 c.txt
M 100644 :5 d.txt
@@ -45,8 +47,9 @@ commit refs/heads/mainline
mark :8
author Grevious Bodily Harmsworth <gbh@example.com> 1679019000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679019000 +0000
-data 51
-Added tag 2019 Spring R2 for changeset e92e41dde44ffrom :6
+data 52
+Added tag 2019 Spring R2 for changeset e92e41dde44f
+from :6
M 100644 :7 .hgtags
blob
@@ -63,8 +66,9 @@ commit refs/heads/mainline
mark :11
author Grevious Bodily Harmsworth <gbh@example.com> 1679022000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679022000 +0000
-data 2
-r2from :8
+data 3
+r2
+from :8
M 100644 :9 e.txt
M 100644 :10 f.txt
@@ -72,8 +76,9 @@ commit refs/heads/mainline
mark :12
author badly-formed-user <devnull@localhost> 1679025600 +0000
committer badly-formed-user <devnull@localhost> 1679025600 +0000
-data 2
-r3from :11
+data 3
+r3
+from :11
M 100644 :9 g.txt
M 100644 :10 h.txt
@@ -91,8 +96,9 @@ commit refs/heads/renamed-feature
mark :15
author Grevious Bodily Harmsworth <gbh@example.com> 1679029200 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679029200 +0000
-data 7
-featurefrom :12
+data 8
+feature
+from :12
M 100644 :13 feature-a.txt
M 100644 :14 feature-b.txt
@@ -105,8 +111,9 @@ commit refs/heads/valid-0
mark :17
author Grevious Bodily Harmsworth <gbh@example.com> 1679032800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679032800 +0000
-data 23
-Added file in branch a?from :15
+data 24
+Added file in branch a?
+from :15
M 100644 :16 c1086ce03e4f52aadd1c93b1d097da510138522a
blob
@@ -118,8 +125,9 @@ commit refs/heads/valid-1
mark :19
author Grevious Bodily Harmsworth <gbh@example.com> 1679036400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679036400 +0000
-data 23
-Added file in branch a/from :17
+data 24
+Added file in branch a/
+from :17
M 100644 :18 85ed6fbb96d655df9f194bc9107f2d86210b9263
blob
@@ -131,8 +139,9 @@ commit refs/heads/valid-2
mark :21
author Grevious Bodily Harmsworth <gbh@example.com> 1679040000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679040000 +0000
-data 24
-Added file in branch a/bfrom :19
+data 25
+Added file in branch a/b
+from :19
M 100644 :20 aae42d317509399fdda80c4d8e46774d152dbd04
blob
@@ -144,8 +153,9 @@ commit refs/heads/valid-3
mark :23
author Grevious Bodily Harmsworth <gbh@example.com> 1679043600 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679043600 +0000
-data 24
-Added file in branch a/?from :21
+data 25
+Added file in branch a/?
+from :21
M 100644 :22 ba54a8de7fe91c5e6e0a2dd1b9b37de0976ff5a7
blob
@@ -157,8 +167,9 @@ commit refs/heads/valid-4
mark :25
author Grevious Bodily Harmsworth <gbh@example.com> 1679047200 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679047200 +0000
-data 23
-Added file in branch ?afrom :23
+data 24
+Added file in branch ?a
+from :23
M 100644 :24 d4cde16119b586025976741e87775762a2598984
blob
@@ -170,8 +181,9 @@ commit refs/heads/valid-5
mark :27
author Grevious Bodily Harmsworth <gbh@example.com> 1679050800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679050800 +0000
-data 23
-Added file in branch a.from :25
+data 24
+Added file in branch a.
+from :25
M 100644 :26 b4ce96ddcee0706a8c51130917f910b2b29faf77
blob
@@ -183,8 +195,9 @@ commit refs/heads/valid-6
mark :29
author Grevious Bodily Harmsworth <gbh@example.com> 1679054400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679054400 +0000
-data 24
-Added file in branch a.bfrom :27
+data 25
+Added file in branch a.b
+from :27
M 100644 :28 97051191e1a92daa11165ef10770bf964268c58b
blob
@@ -196,8 +209,9 @@ commit refs/heads/valid-7
mark :31
author Grevious Bodily Harmsworth <gbh@example.com> 1679058000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679058000 +0000
-data 23
-Added file in branch .afrom :29
+data 24
+Added file in branch .a
+from :29
M 100644 :30 a667f8feec02fdfa6649772f844a24cf1ad5ebec
blob
@@ -209,8 +223,9 @@ commit refs/heads/valid-8
mark :33
author Grevious Bodily Harmsworth <gbh@example.com> 1679061600 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679061600 +0000
-data 22
-Added file in branch /from :31
+data 23
+Added file in branch /
+from :31
M 100644 :32 8f27084b6294ddbe28dbcbf98f798730e8a79289
blob
@@ -222,8 +237,9 @@ commit refs/heads/___a
mark :35
author Grevious Bodily Harmsworth <gbh@example.com> 1679065200 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679065200 +0000
-data 25
-Added file in branch ___3from :33
+data 26
+Added file in branch ___3
+from :33
M 100644 :34 9b171494eb6e5ce325934b1656e286ca0510a697
blob
@@ -235,8 +251,9 @@ commit refs/heads/__b
mark :37
author Grevious Bodily Harmsworth <gbh@example.com> 1679068800 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679068800 +0000
-data 24
-Added file in branch __2from :35
+data 25
+Added file in branch __2
+from :35
M 100644 :36 5dca703b71d2613c6bb3262b9b1741d6165e4a2f
blob
@@ -248,8 +265,9 @@ commit refs/heads/_c
mark :39
author Grevious Bodily Harmsworth <gbh@example.com> 1679072400 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679072400 +0000
-data 23
-Added file in branch _1from :37
+data 24
+Added file in branch _1
+from :37
M 100644 :38 2fee90e148a2afbd911b67ced9b6240151f904ec
blob
@@ -261,8 +279,9 @@ commit refs/heads/venom
mark :41
author Grevious Bodily Harmsworth <gbh@example.com> 1679076000 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679076000 +0000
-data 45
-Added file in branch Feature- 12V Vac "Venom"from :39
+data 46
+Added file in branch Feature- 12V Vac "Venom"
+from :39
M 100644 :40 b01def8779aed4be2f4b7325a89992a9aa566fec
blob
@@ -274,7 +293,8 @@ commit refs/heads/abc
mark :43
author Grevious Bodily Harmsworth <gbh@example.com> 1679079600 +0000
committer Grevious Bodily Harmsworth <gbh@example.com> 1679079600 +0000
-data 27
-Added file in branch åäöfrom :41
+data 28
+Added file in branch åäö
+from :41
M 100644 :42 a0d01fcbff5d86327d542687dcfd8b299d054147
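
Every hunk in this expected-output file changes the same way: the `data` count grows by one and the commit message gains a final LF, so the next command (`from` or `M`) starts on its own line. The sketch below shows that framing under stated assumptions: `data_command` is a hypothetical helper, not a function from hg-fast-export, and it assumes the LF is appended unconditionally. The count is a byte count, which is why the non-ASCII message `åäö` goes from 27 to 28 and not from 24 to 25.

```python
def data_command(message: bytes) -> bytes:
    """Frame a commit message as a fast-import 'data' command.

    Hypothetical helper: appends one LF to the message and counts it
    in the declared length, matching the updated expected output.
    """
    payload = message + b'\n'
    # The declared length is the number of bytes, not characters.
    return b'data %d\n%s' % (len(payload), payload)


short = data_command(b'r0')
non_ascii = data_command('Added file in branch åäö'.encode('utf-8'))
```

With this framing, `short` is `data 3`, a newline, then `r0` and its terminating LF, which is what the updated expected output shows for the first commit.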

@@ -17,7 +17,8 @@ check() {
}
git_create() {
-	git init -q "$1"
+	git init -q "$1" &&
+	git -C "$1" config core.ignoreCase false
}
git_convert() {