Fast-Export

mirror of https://github.com/frej/fast-export.git synced 2026-06-20 23:21:09 +02:00

Author	SHA1	Message	Date
Frej Drejhammar	b0d5e56c8d	Merge branch 'PR/247' v201029	2020-10-29 19:01:04 +01:00
Frej Drejhammar	787e8559b9	Fix typo in README	2020-10-29 19:00:30 +01:00
Henrik Tunedal	ab500a24a7	Add plugin for dropping commits from output	2020-10-29 12:04:27 +01:00
Frej Drejhammar	ead75895b0	Enable code analysis Merge github generated workflow into master	2020-10-10 16:26:53 +02:00
Frej Drejhammar	bf5f14ddab	Create codeql-analysis.yml	2020-10-10 13:15:54 +00:00
Frej Drejhammar	7057ce2c2b	Allow plugins to modify the committer Plugins have since they were introduced been able to modify the author of a commit, but not the committer. This patch adds the necessary support for allowing them to also modify the committer.	2020-09-30 17:47:33 +02:00
Frej Drejhammar	2b6f735b8c	Update section about submitting patches in README Try to cover the most common reasons for requesting changes in PRs.	2020-09-09 14:08:00 +02:00
Frej Drejhammar	71acb42a09	Merge branch 'PR/236-v2' into master Implement a plugin converting unnamed heads to branches	2020-07-31 17:08:04 +02:00
Ondrej Stanek	a7955bc49b	Update head2branch plugin to accept hg commit hash The revision number isn't a unique identifier of commits across repository clones and forks, while the hg hash is guaranteed to be stable.	2020-07-31 10:50:57 +02:00
Ondrej Stanek	9c6dea9fd4	Pass original hg commit hash to plugins	2020-07-31 10:50:51 +02:00
Ethan Furman	21827a53f7	Add head2branch plugin Support converting unnamed heads to named branches during mercurial conversions. Co-Authored-By: ostan89@gmail.com	2020-07-31 10:49:08 +02:00
Ethan Furman	5c1cbf82b0	Add revision to commit_data for commit plugins Co-Authored-By: ostan89@gmail.com	2020-07-31 10:48:33 +02:00
Ondrej Stanek	50631c4b34	Add option --ignore-unnamed-heads This option allows the user to ignore only unnamed heads (compared to --force which ignores all non-fatal issues). The intended use is for a future plugin converting unnamed heads to named branches.	2020-07-31 10:30:53 +02:00
Ethan Furman	2a9dd53d14	Show all unnamed heads at once Co-Authored-By: ostan89@gmail.com	2020-07-31 10:27:07 +02:00
Frej Drejhammar	597093eaf1	Merge branch 'fix-233' Closes #233	2020-07-10 16:52:17 +02:00
Frej Drejhammar	3910044a97	Avoid crash during rev-parse when the default encoding is ascii In some locales the default encoding is ascii in which case subprocess.check_output() will fail if it is given a non-ascii ref as one of the arguments. By forcing the ref to be utf8 we will avoid a crash while still behaving correctly when the default encoding is utf8. The credits for this fix go to Nikita Bazhinov for discovering the fix and Chris J Billington for explaining it. Co-Authored-By: Nikita Bazhinov <nbazhinov@syntellect.ru> Co-Authored-By: Chris J Billington <chrisjbillington@gmail.com>	2020-07-10 16:41:38 +02:00
Frej Drejhammar	44c50d0fae	Merge branch 'PR/226'	2020-05-07 20:10:24 +02:00
chrisjbillington	d29d30363b	Fix backward incompatible change for hg < 5.1 The port to Python 3 in `b961f146` changed `repo.branchmap().iteritems()` to use `.items()` instead. However, the object returned by mercurial isn't a dictionary and its `.items()` method was only introduced (as an alias for `iteritems`) in hg 5.1. `iteritems()` still exists, so let's keep using it for now to retain compatibility with hg < 5.1.	2020-05-06 11:59:49 -04:00
Frej Drejhammar	f102d2a69f	Merge branch 'PR/223' Closes #223	2020-05-06 16:31:13 +02:00
Ondrej Stanek	cf0e5837b6	Allow converting a repository with git and hg subrepos In the verification phase, fast-export wrongly expects both hg and git subrepositories to have a matching line in the subrepo-map file. In fact, only hg subrepos need a line in subrepo-map that references a converted subrepo, while git subrepositories do not.	2020-05-06 16:30:05 +02:00
Frej Drejhammar	61d22307af	Merge branch 'PR/217' Closes: #215	2020-03-26 20:17:20 +01:00
chrisjbillington	3b3f86b71e	Allow utf8 in mappings We were previously processing entries in mapping files (when `--mappings-are-raw` is not given) with `.decode('unicode_escape').encode('utf8')` to replace backslash escape sequences in bytestrings with the utf-8 encoded characters they represent. However, it turns out that `.decode('unicode_escape')` assumes latin-1 encoding if it encounters non-ascii bytes: https://bugs.python.org/issue21331. So this gave incorrect results if non-ascii utf8 data was present in the mapping. To fix this, we now add an extra layer of `.decode('utf8').encode('unicode-escape')` in order to convert any non-ascii characters into their backslash escape sequences. Then the subsequent `.decode('unicode_escape')` only encounters ascii characters and gives correct results.	2020-03-25 12:33:42 -04:00
Frej Drejhammar	e51844cd65	Merge branch 'PR/214' Closes: #213	2020-03-25 16:09:01 +01:00
Toni Sissala	90eeef2ff4	Fix TypeError when using -M command line argument hg-fast-export.sanitize_name expects branch name to be a bytes object. Command line parser gives out str objects. Convert possible str object to bytes in hg2git.set_default_branch().	2020-03-25 11:19:25 +02:00
Frej Drejhammar	7f4d9c3ad4	Merge branch 'PR/211'	2020-03-10 17:51:47 +01:00
Pi Delport	b37420f404	Fix link markup for hg-export-tool	2020-03-09 16:41:26 +02:00
Frej Drejhammar	f2aa47fdf7	Merge branch 'PR/210' Closes #210.	2020-03-08 19:43:23 +01:00
chrisjbillington	6361b44c33	Fix bug in ignoring .git files/folders on Windows Mercurial internally stores (most) filepaths using forward slashes, and returns them as such from its Python API, even on Windows. So the splitting up of filepaths with `os.path.sep` was incorrect, resulting in `.git` files (those within a subdirectory, anyway) not being ignored on Windows as intended. Splitting on `b'/'` regardless of OS fixes this.	2020-03-08 19:40:50 +01:00
Frej Drejhammar	afeb58ae95	Merge branch 'PR/209'	2020-03-06 17:30:52 +01:00
chrisjbillington	48508ee299	Fix failure to print error message in verify_heads On Python 3, `b'%s' % None` fails with a TypeError. In verify_heads, an error message prints the sha1 of a git commit, but that sha1 can be None. This commit instead prints `b'<None>'` if sha1 is None.	2020-03-06 11:02:38 -05:00
Frej Drejhammar	56da62847a	Merge branch 'PR/208' Closes #207.	2020-03-01 14:34:38 +01:00
Max Fuqua	750fe6d3e1	Resolve type error resulting from passing an int to b'%s' in python3	2020-02-29 14:55:15 -05:00
Frej Drejhammar	e4d6d433ec	Merge branch 'PR/206'	2020-02-29 14:48:46 +01:00
Steven Peters	058c791b75	Check python's mercurial version for compatibility When checking that python has the mercurial package in hg-fast-export.sh, use the same import statement that is used in hg-fast-export.py. hg-fast-export.py imports revsymbol from mercurial.scmutil, which was introduced in mercurial 4.6, but Ubuntu 18.04 only has mercurial 4.5.3 using python2, so an incompatible python version may be chosen without this change.	2020-02-28 15:41:24 -08:00
Frej Drejhammar	13010f7a25	Merge branch 'PR/204' Closes #203.	2020-02-21 16:34:03 +01:00
chrisjbillington	4071f720b0	Fix issue #203: Resolve stderr encoding issues In Python 3, `sys.stderr.write()` requires unicode strings, and all output on standard streams is UTF8 encoded. Therefore in the port to Python 3, we `.decode()`d all strings that are used in `%` formatting of strings to be printed to stderr. However, in Python 2, `sys.stderr` accepts either bytestrings or unicode strings, and: - `%s` formatting of a bytestring with a unicode string, i.e. `"%s" % u"foo"` results in a unicode string. - Writing a unicode string to stderr/stdout uses that stream's encoding - When the output of the process is being piped somewhere other than a terminal (as it is when called with pipes and shell redirection from hg-fast-export.sh), that encoding is None, which implies ASCII. - This raises UnicodeEncodeError if the unicode strings passed to `stderr.write()` have non-ascii characters. We cannot fix this problem simply by encoding UTF8 again before writing to stderr on Python 2. This is because the decoding of filenames with the UTF8 codec may fail - filenames may not even be valid UTF8 despite this being the declared filesystem encoding. We could `fsdecode()` filenames on Python 3, which would use the `surrogateescape` error handler, but stderr does not use this error handler for output, meaning we would just have to encode again (with the same error handler) anyway. And Python 2 lacks the `surrogateescape` error handler in any case - we would need to reimplement it just to do a round-trip decode and encode for no reason. This commit leaves filenames and other repository data as bytestrings, and simply writes them to `sys.stderr.buffer` on Python 3 or `sys.stderr` on Python 2 as-is, after `%` formatting with bytestring literals. This avoids encoding issues of filenames altogether. Other writing to stderr that does not involve repository data has been left with "native" strings, i.e. `sys.stderr.write("a string literal %s" % a_command_line_arg)`. These will still fail on Python 3 if the user passes a non-UTF8 filename as a command line argument or similar. This is acceptable IMHO - although `hg-fast-export` may encounter invalid UTF8 in mercurial repositories, it is not too much to impose that the user name their branch mapping files etc with valid UTF8!	2020-02-19 12:18:00 -05:00
Frej Drejhammar	160aa3c9ef	Add a reference to hg-export-tool in the documentation Add pointers to hg-export-tool as a way to batch convert multiple Mercurial repos, and deal with duplicate heads.	2020-02-14 17:16:18 +01:00
Frej Drejhammar	883474184d	Merge branch 'PR/201' Closes 201	2020-02-14 17:01:35 +01:00
chrisjbillington	b961f146df	Support Python 3 Port hg-fast-export to Python 2/3 polyglot code. Since mercurial accepts and returns bytestrings for all repository data, the approach I've taken here is to use bytestrings throughout the hg-fast-export code. All strings pertaining to repository data are bytestrings. This means the code is using the same string datatype for this data on Python 3 as it did (and still does) on Python 2. Repository data coming from subprocess calls to git, or read from files, is also left as the bytestrings either returned from subprocess.check_output or as read from the file in 'rb' mode. Regexes and string literals that are used with repository data have all had a b'' prefix added. When repository data is used in error/warning messages, it is decoded with the UTF8 codec for printing. With this patch, hg-fast-export.py writes binary output to sys.stdout.buffer on Python 3 - on Python 2 this doesn't exist and it still uses sys.stdout. The only strings that are left as "native" strings and not coerced to bytestrings are filepaths passed in on the command line, and dictionary keys for internal data structures used by hg-fast-export.py, that do not originate in repository data. Mapping files are read in 'rb' mode, and thus bytestrings are read from them. When an encoding is given, their contents are decoded with that encoding, but then immediately encoded again with UTF8 and they are returned as the resulting bytestrings. Other necessary changes were: - indexing bytestrings with a single index returns an integer on Python 3. These indexing operations have been replaced with a one-element slice: x[0] -> x[0:1] or x[-1] -> x[-1:] so as to return a bytestring. - raw_hash.encode('hex_codec') replaced with binascii.hexlify(raw_hash) - str(integer) -> b'%d' % integer - 'string_escape' codec replaced with 'unicode_escape' (which was backported to python 2.7). Strings decoded with this codec were then immediately re-encoded with UTF8. - Calls to map() intended to execute their contents immediately were unwrapped or converted to list comprehensions, since map() returns an iterator on Python 3 and does not execute until iterated over. hg-fast-export.sh has been modified to not require Python 2. Instead, if PYTHON has not been defined, it checks python2, python, then python3, and uses the first one that exists and can import the mercurial module.	2020-02-13 14:35:19 -05:00
Frej Drejhammar	595587b245	Merge branch 'PR/197' Closes #197, #185, #196 v200213	2020-02-09 19:39:21 +01:00
Matthijs van der Burgh	0b6b83c3de	Adapt to status becoming an object in Mercurial 5.3 Status has always been a tuple, but since 5.3, commit: https://www.mercurial-scm.org/repo/hg/rev/c5548b0b6847, it is an object. Therefore the __getitem__ of the tuple isn't available anymore. This fix is compatible with mercurial>=4.6, as the old status tuple still has the same properties.	2020-02-08 17:23:30 +01:00
Frej Drejhammar	29a457eccf	Merge branch 'PR/198' Closes 198	2020-02-08 16:08:56 +01:00
Frej Drejhammar	4bc6dec5eb	Merge branch 'PR/199' Closes #199	2020-02-08 16:05:01 +01:00
Frej Drejhammar	fa8ebd994d	Add link to what's expected for commit messages to the README	2020-02-08 15:50:17 +01:00
Frej Drejhammar	e83501d30d	Make README issue tracker link a Markdown link	2020-02-08 15:43:10 +01:00
chrisjbillington	8efbb57822	Add additional options to branch_name_in_commit plugin - Allow skipping writing the branch name if the branch is 'master'. - Allow writing the branch name on the same line as the first line of the commit message separated by a colon, instead of it having its own line.	2020-02-07 20:48:49 -05:00
chrisjbillington	8d135fe700	Ignore files and directories called .git Git cannot track these files. Print a warning if encountering one. Fixes #166	2020-02-07 17:52:57 -05:00
Frej Drejhammar	ed36227c62	Merge branch 'PR/192' Closes #192	2020-01-31 17:12:30 +01:00
Frej Drejhammar	507c17cc1b	Revert "Handle `--force` option correctly in any position" This reverts commit `0c5617bf8d`. The changes turned out to require bash. Traditionally we have tried to stay compatible with plain old sh, so this is a revert. Closes #195.	2020-01-31 17:01:04 +01:00
James Douglass	1841ba4be9	Add a plugin to prefix an issue number with a user-defined string.	2020-01-29 14:18:17 -08:00
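
The latin-1 pitfall described in commit 3b3f86b71e above can be reproduced in a few lines. This is only a sketch, not the code from the commit: the function names are made up for illustration. It also swaps in the `backslashreplace` error handler to turn non-ASCII characters into escapes, rather than the `.encode('unicode-escape')` layer the commit message mentions. The reason is that `str.encode('unicode_escape')` escapes existing backslashes too, and this sketch wants escape sequences already present in the input to keep working.

```python
def naive_unescape(raw: bytes) -> bytes:
    # Hypothetical helper: the pre-fix approach. Non-ASCII bytes are
    # interpreted as latin-1 by the 'unicode_escape' codec.
    return raw.decode("unicode_escape").encode("utf8")


def utf8_safe_unescape(raw: bytes) -> bytes:
    # Hypothetical helper: decode as UTF-8 first, then rewrite only the
    # non-ASCII characters as backslash escapes, so that 'unicode_escape'
    # sees pure ASCII input.
    ascii_only = raw.decode("utf8").encode("ascii", "backslashreplace")
    return ascii_only.decode("unicode_escape").encode("utf8")


entry = "é".encode("utf8")
mangled = naive_unescape(entry)      # UTF-8 encoding of the two characters "Ã©"
intact = utf8_safe_unescape(entry)   # unchanged UTF-8 encoding of "é"
tabbed = utf8_safe_unescape(b"a\\tb")  # backslash-t becomes a real tab
```

Whichever variant is used, the goal is the same: by the time `unicode_escape` decoding runs, the input should contain ASCII only.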


532 Commits
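
Several of the Python 3 porting commits listed above (b961f146df, 750fe6d3e1, 48508ee299) turn on how bytestrings behave under indexing and `%` formatting. The following sketch illustrates those behaviours on Python 3.5 or later. The variable names and the `format_sha1` helper are hypothetical and are not taken from hg-fast-export itself.

```python
import binascii

ref = b"branch"

first_as_int = ref[0]      # indexing yields an int on Python 3
first_as_bytes = ref[0:1]  # a one-element slice yields bytes
last_as_bytes = ref[-1:]   # same idea for the last element

revision = b"%d" % 42      # integers need %d; b"%s" % 42 raises TypeError

raw_hash = b"\x00\xff"
hex_hash = binascii.hexlify(raw_hash)  # replaces raw_hash.encode('hex_codec')


def format_sha1(sha1):
    # Hypothetical helper: b"%s" % None raises TypeError on Python 3,
    # so substitute a placeholder when no sha1 is available.
    return b"%s" % (sha1 if sha1 is not None else b"<None>")
```

The slice form works identically on Python 2, which is presumably why it suits polyglot code: `x[0:1]` is a one-character bytestring on both versions.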