44 Commits

Author SHA1 Message Date
Günther Nußmüller
d77765a23e Fix UnboundLocalError with plugins and largefiles
When Plugins are used in a repository that contains largefiles,
the following exception is thrown as soon as the first largefile
is converted:

```
Traceback (most recent call last):
  File "fast-export/hg-fast-export.py", line 728, in <module>
    sys.exit(hg2git(options.repourl,m,options.marksfile,options.mappingfile,
  File "fast-export/hg-fast-export.py", line 581, in hg2git
    c=export_commit(ui,repo,rev,old_marks,max,c,authors,branchesmap,
  File "fast-export/hg-fast-export.py", line 366, in export_commit
    export_file_contents(ctx,man,modified,hgtags,fn_encoding,plugins)
  File "fast-export/hg-fast-export.py", line 222, in export_file_contents
    file_data = {'filename':filename,'file_ctx':file_ctx,'data':d}
UnboundLocalError: local variable 'file_ctx' referenced before assignment
```

This commit fixes the error by:

 * Initializing the file_ctx before the largefile handling takes place
 * Providing a new `is_largefile` value for plugins so they can detect
    if largefile handling was applied (and therefore the file_ctx
    object may no longer be in sync with the git version of the file)
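
The two points above can be illustrated with a minimal sketch. It is not the code from `hg-fast-export.py`: the function `build_file_data` and the placeholder values are hypothetical, and only the dictionary keys `filename`, `file_ctx`, `data` and `is_largefile` come from the message and traceback.

```python
def build_file_data(filename, is_largefile):
    """Hypothetical stand-in for the per-file step of the export loop."""
    # Bind file_ctx before any largefile handling, so the name exists on
    # every code path and can no longer raise UnboundLocalError.
    file_ctx = {'path': filename}   # placeholder for the real file context
    data = b'ordinary contents'     # placeholder for the exported bytes

    if is_largefile:
        # Largefile handling swaps in different bytes, so file_ctx may no
        # longer describe what ends up in git.
        data = b'largefile contents'

    return {
        'filename': filename,
        'file_ctx': file_ctx,
        'data': data,
        'is_largefile': is_largefile,
    }


file_data = build_file_data(b'assets/big.bin', is_largefile=True)
```

A plugin receiving such a dictionary could check `file_data['is_largefile']` before trusting `file_data['file_ctx']`.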
2025-08-11 08:30:17 +02:00
Frank Zingsheim
bd707b5d6e Fix: Largefiles ignored #141
Import mercurial large files as ordinary files into git

The basic idea to this fix is based on
https://github.com/planestraveler/fast-export/tree/add-lfs-support-v2
from PR #65

Closes #141
2025-03-29 18:39:27 +01:00
Frej Drejhammar
ddfc3a8300 Run file_data_filter on deleted files
The `file_data_filter` method should be called when files are deleted.
In this case the `data` and `file_ctx` keys map to None. This is so
that a filter which modifies file names can apply the same name
transformations before files are deleted.
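
A renaming filter that copes with deletions might look like the sketch below. The class name `Filter`, the `prefix` argument and the line-ending rewrite are illustrative assumptions; only the `file_data_filter` method name and the `filename`, `data` and `file_ctx` keys are taken from the commit messages in this log.

```python
class Filter:
    """Hypothetical plugin that moves every file under a prefix directory."""

    def __init__(self, prefix=b'imported/'):
        self.prefix = prefix

    def file_data_filter(self, file_data):
        # Apply the rename unconditionally, so a later deletion targets the
        # same path the file was originally exported under.
        file_data['filename'] = self.prefix + file_data['filename']

        # For deleted files 'data' and 'file_ctx' are None: nothing to edit.
        if file_data['data'] is None:
            return

        # Content changes only make sense for files that still exist.
        file_data['data'] = file_data['data'].replace(b'\r\n', b'\n')


deleted = {'filename': b'old.txt', 'data': None, 'file_ctx': None}
Filter().file_data_filter(deleted)
```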
2024-02-16 17:12:49 +01:00
Frej Drejhammar
efe934e16b Update required version of Python to 3.7
Due to problems with handling of Unicode input in Python < 3.7, bump
the required version of Python to 3.7.
2023-12-28 13:40:48 +01:00
Ekin Dursun
a3d0562737 Make pluginloader use importlib instead of imp
Python 3.12 has removed imp and it's recommended to use importlib
instead. Python 2.7 doesn't have importlib, so Python 2.7 support is
ceased (not a big deal since it's been more than 3 years since it was
EOLed) as a part of this change.
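
For readers unfamiliar with the replacement API, this is one common way to load a module from a file path with `importlib`. It is a sketch, not the pluginloader's actual code: the helper `load_module_from_path` and the throwaway `demo_plugin.py` file are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Load the Python source file at `path` as a module called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code
    return module


# Demonstration with a temporary file standing in for a plugin.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, 'demo_plugin.py')
    with open(path, 'w') as handle:
        handle.write('def build_filter(args):\n    return args\n')
    plugin = load_module_from_path('demo_plugin', path)
    result = plugin.build_filter('some-args')
```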
2023-11-12 20:41:43 +03:00
Felipe Contreras
88defe7fd1 README: cleanup initial instructions
The `git init` command can create the directory, and HEAD doesn't need
to be specified in `git checkout` (it's the default).

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-04 09:53:25 -06:00
df
268299a358 Fix typo in README
Added dash to match the actual usage of the 'ignore-unnamed-heads' option
2022-11-19 18:15:04 +01:00
Nicolas Vanhoren
38e236962d Update README.md to change recommendation for crlf filtering 2022-09-21 01:37:39 +02:00
Frej Drejhammar
4227621eed Update contribution guidelines and make github display them
Try to make it clear that sloppy, throw-it-over-the-fence patches
won't be accepted without revision, and try to make sure a potential
contributor sees the warning while creating a pull request.
2021-07-29 15:28:01 +02:00
SirIntellegence
20c22a3110 Add plugin support for the 'extra' field
Permits plugins to import other information such as svn conversion revisions
2021-02-22 13:09:48 -07:00
Ray Luo
056756f193 Remove some ".py" wording
Avoid confusion about which file is the main entry point to fast-export,
in order to avoid the issue mentioned here

https://github.com/frej/fast-export/issues/158#issuecomment-754482516

Also fix a typo
2021-01-09 02:06:52 -08:00
Jason Winnebeck
89da4ad8af Document --ignore-unnamed-heads option 2020-11-14 21:24:54 -05:00
Frej Drejhammar
787e8559b9 Fix typo in README 2020-10-29 19:00:30 +01:00
Frej Drejhammar
7057ce2c2b Allow plugins to modify the committer
Since their introduction, plugins have been able to modify the author
of a commit, but not the committer. This patch adds the necessary
support for allowing them to also modify the committer.
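
As a rough illustration, a plugin that overrides the committer could look like this. The method name `commit_message_filter` and the `author` and `committer` dictionary keys are assumptions about the plugin interface, as are the class name and the example identity.

```python
class Filter:
    """Hypothetical plugin that stamps a fixed committer on every commit."""

    def __init__(self, committer=b'Conversion Bot <bot@example.com>'):
        self.committer = committer

    def commit_message_filter(self, commit_data):
        # The author is left untouched; only the committer is replaced.
        commit_data['committer'] = self.committer


commit = {
    'author': b'Jane Doe <jane@example.com>',
    'committer': b'Jane Doe <jane@example.com>',
    'desc': b'Example commit',
}
Filter().commit_message_filter(commit)
```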
2020-09-30 17:47:33 +02:00
Frej Drejhammar
2b6f735b8c Update section about submitting patches in README
Try to cover the most common reasons for requesting changes in PRs.
2020-09-09 14:08:00 +02:00
Ondrej Stanek
9c6dea9fd4 Pass original hg commit hash to plugins 2020-07-31 10:50:51 +02:00
Ethan Furman
21827a53f7 Add head2branch plugin
Support converting unnamed heads to named branches during mercurial
conversions.

Co-Authored-By: ostan89@gmail.com
2020-07-31 10:49:08 +02:00
Ethan Furman
5c1cbf82b0 Add revision to commit_data for commit plugins
Co-Authored-By: ostan89@gmail.com
2020-07-31 10:48:33 +02:00
chrisjbillington
3b3f86b71e Allow utf8 in mappings
We were previously processing entries in mapping files (when
`--mappings-are-raw` is not given) with
`.decode('unicode_escape').encode('utf8')` to replace backslash escape
sequences in bytestrings with the utf-8 encoded characters they
represent. However, it turns out that `.decode('unicode_escape')`
assumes latin-1 encoding if it encounters non-ascii
bytes: https://bugs.python.org/issue21331. So this gave incorrect
results if non-ascii utf8 data was present in the mapping.

To fix this, we now add an extra layer of
`.decode('utf8').encode('unicode-escape')` in order to convert any non-ascii characters into
their backslash escape sequences. Then the subsequent
`.decode('unicode_escape')` only encounters ascii characters and gives
correct results.
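
The problem and the idea behind the fix can be reproduced in a few lines. The function names `naive` and `fixed` are hypothetical. The sketch uses `.encode('ascii', 'backslashreplace')` for the extra layer, which is a substitution made here for illustration: `str.encode('unicode-escape')` would also double any backslashes already present, so the escape sequences in the mapping would survive as literal text.

```python
def naive(raw):
    # Non-ascii bytes are interpreted as latin-1, corrupting UTF-8 input.
    return raw.decode('unicode_escape').encode('utf8')


def fixed(raw):
    # First turn non-ascii characters into backslash escapes, so the
    # unicode_escape decoder only ever sees ascii bytes.
    return (
        raw.decode('utf8')
        .encode('ascii', 'backslashreplace')
        .decode('unicode_escape')
        .encode('utf8')
    )


# A mapping entry containing 'é' followed by a literal backslash and 'n'.
entry = 'é\\n'.encode('utf8')

broken = naive(entry)    # b'\xc3\x83\xc2\xa9\n' (mojibake plus newline)
correct = fixed(entry)   # b'\xc3\xa9\n' ('é' plus newline)
```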
2020-03-25 12:33:42 -04:00
Pi Delport
b37420f404 Fix link markup for hg-export-tool 2020-03-09 16:41:26 +02:00
Frej Drejhammar
160aa3c9ef Add a reference to hg-export-tool in the documentation
Add pointers to hg-export-tool as a way to batch convert multiple
Mercurial repos, and deal with duplicate heads.
2020-02-14 17:16:18 +01:00
chrisjbillington
b961f146df Support Python 3
Port hg-fast-export to Python 2/3 polyglot code.

Since mercurial accepts and returns bytestrings for all repository data,
the approach I've taken here is to use bytestrings throughout the
hg-fast-export code. All strings pertaining to repository data are
bytestrings. This means the code is using the same string datatype for
this data on Python 3 as it did (and still does) on Python 2.

Repository data coming from subprocess calls to git, or read from files,
is also left as the bytestrings either returned from
subprocess.check_output or as read from the file in 'rb' mode.

Regexes and string literals that are used with repository data have
all had a b'' prefix added.

When repository data is used in error/warning messages, it is decoded
with the UTF8 codec for printing.

With this patch, hg-fast-export.py writes binary output to
sys.stdout.buffer on Python 3 - on Python 2 this doesn't exist and it
still uses sys.stdout.
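
One way to pick the right stream on both Python versions is sketched below. The helper name `binary_stdout` is hypothetical and the approach (`getattr` with a fallback) is an assumption, not necessarily how the script does it.

```python
import io
import sys


def binary_stdout(stream=None):
    """Return a stream that accepts bytes.

    Text streams on Python 3 expose the underlying binary stream as
    `.buffer`; streams without that attribute are returned unchanged.
    """
    stream = sys.stdout if stream is None else stream
    return getattr(stream, 'buffer', stream)


# Demonstration with an in-memory text stream instead of the real stdout.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding='utf8')
out = binary_stdout(text)
out.write(b'commit refs/heads/master\n')
```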

The only strings that are left as "native" strings and not coerced to
bytestrings are filepaths passed in on the command line, and dictionary
keys for internal data structures used by hg-fast-export.py, that do
not originate in repository data.

Mapping files are read in 'rb' mode, and thus bytestrings are read from
them. When an encoding is given, their contents are decoded with that
encoding, but then immediately encoded again with UTF8 and they are
returned as the resulting bytestrings.

Other necessary changes were:

 - indexing bytestrings with a single index returns an integer on Python 3.
   These indexing operations have been replaced with a one-element
   slice: x[0] -> x[0:1] or x[-1] -> x[-1:] so as to return a bytestring.

 - raw_hash.encode('hex_codec') replaced with binascii.hexlify(raw_hash)

 - str(integer) -> b'%d' % integer

 - 'string_escape' codec replaced with 'unicode_escape' (which was
    backported to python 2.7). Strings decoded with this codec were then
    immediately re-encoded with UTF8.

 - Calls to map() intended to execute their contents immediately were
   unwrapped or converted to list comprehensions, since map() is an
   iterator and does not execute until iterated over.
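
The idioms in the list above behave as follows on Python 3. The variable names are illustrative only.

```python
import binascii

data = b'abc'

# Single-index access yields an integer; a one-element slice yields bytes.
first_as_int = data[0]        # 97
first_as_bytes = data[0:1]    # b'a'
last_as_bytes = data[-1:]     # b'c'

# Hex-encoding a raw hash without the removed 'hex_codec' str codec.
hex_hash = binascii.hexlify(b'\x12\xab')   # b'12ab'

# Formatting an integer directly into a bytestring.
revision = b'%d' % 42         # b'42'

# map() is lazy: nothing runs until the iterator is consumed.
seen = []
lazy = map(seen.append, [1, 2, 3])
seen_before = list(seen)      # []
for item in [1, 2, 3]:        # explicit loop executes immediately
    seen.append(item)
```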

hg-fast-export.sh has been modified to not require Python 2. Instead, if
PYTHON has not been defined, it checks python2, python, then python3,
and uses the first one that exists and can import the mercurial module.
2020-02-13 14:35:19 -05:00
Frej Drejhammar
29a457eccf Merge branch 'PR/198'
Closes #198
2020-02-08 16:08:56 +01:00
Frej Drejhammar
fa8ebd994d Add link to what's expected for commit messages to the README 2020-02-08 15:50:17 +01:00
Frej Drejhammar
e83501d30d Make README issue tracker link a Markdown link 2020-02-08 15:43:10 +01:00
chrisjbillington
8d135fe700 Ignore files and directories called .git
Git cannot track these files. Print a warning if encountering one.

Fixes #166
2020-02-07 17:52:57 -05:00
Justin Murray
e8a681121b Document default branch behavior
Document the default behavior of renaming the `default` hg branch to `master`
on git, and how to override from the command line when this causes problems.

See also: #182
2019-12-21 15:34:30 -05:00
Frej Drejhammar
3af916d664 Clarify requirements
Make it clear that python 2.7.x is a hard requirement and that
Mercurial >= 4.6 is required. Also clean up an old editing artefact.
2019-11-12 17:46:08 +01:00
Frej Drejhammar
243100eea4 Add a section on frequent problems to the README
This tries to preemptively avoid recurrence of issues #148, #152,
 #155, #165 and #168.
2019-09-19 16:41:04 +02:00
Frej Drejhammar
1181a0af47 Allow name sanitizer to be disabled with --no-auto-sanitize
Make it possible to completely disable the name sanitizer by the
--no-auto-sanitize flag. Previously the sanitizer was run on user
remapped names. As the sanitizer rewrites perfectly legal git
names (such as __.*) this is probably not what the user wants.

Closes #155.
2019-09-13 14:56:32 +02:00
Jonathan Paugh
96762f5474 README: Fix broken links
Use "footnote" style links to prevent future issues whenever the text is formatted to a specific length.
2019-09-11 16:46:55 -05:00
Johannes Carlsson
47d330de83 Add support for mercurial subrepos
This adds a new command line option (--subrepo-map) that will
map mercurial subrepos to git submodules.

The --subrepo-map takes a mapping file as an argument that will
be used to map a subrepo folder to a git submodule.

For more information see the README-SUBMODULES.md.

This commit is inspired by the changes made by daolis in PR#38
that was never merged.

Closes: #51
Closes: #147
2019-01-07 18:41:19 +01:00
Johan Henkens
5e7895ca6b Add branch_name_in_commit plugin 2018-12-05 13:25:48 -08:00
Johan Henkens
679103795b Add dos2unix plugin 2018-12-05 13:25:48 -08:00
Johan Henkens
e895ce087f Add plugin system 2018-12-05 13:25:47 -08:00
Anton Tykhyy
89db1d93cf Add --filter-contents 2018-06-17 21:09:59 +03:00
Frej Drejhammar
e200cec39f Adapt to changes in Mercurial 4.6
Starting with Mercurial 4.6 repo.lookup() no longer accepts raw hashes
for lookups.
2018-06-10 15:51:09 +02:00
Gabriel
51d5f893db Add a section about system requirements to the README
Add @rinu's suggestion on how to run fast-export on Windows to the
README, this fixes #121.
2018-06-10 15:44:46 +02:00
ceqi
19aa906308 Update usage section example commands
Change <repo> to <local-repo> so that it's clear that we invoke from a local repository;
Add 'git checkout HEAD' command as we need to run it as the final step.

Thanks
2018-02-13 13:37:58 +00:00
Frej Drejhammar
cc8fefe008 Change syntax of mapping files
This is done to allow escape sequences in the key and value strings.
2017-10-02 13:05:14 +02:00
Frej Drejhammar
c252e6748e documentation: Point users to the issue tracker for support questions 2017-06-02 16:18:43 +02:00
Mark Raymond
042f0728cc Use backquotes 2015-12-12 10:34:02 +00:00
Mark Raymond
e7ea819a1f Use GitHub markdown 2015-12-12 10:25:31 +00:00
Mark Raymond
7d26b1a212 Rename README to README.md 2015-12-12 10:13:52 +00:00