Programmatically edit text files with Python. Useful for source to source transformations.

Overview

massedit

formerly known as Python Mass Editor

Implements a Python mass editor that processes text files using Python code. Modifications are shown on stdout as a diff. The target file(s) can then be modified in place with the -w/--write option. This is very similar to the 2to3 tool that ships with Python 3.
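
To make the "show a diff first" workflow concrete, here is a minimal sketch of the idea using only the standard library. It is not massedit's actual implementation; preview_edit and the sample lines are hypothetical names chosen for illustration.

```python
import difflib
import re


def preview_edit(lines, transform, file_name="example.py"):
    """Apply transform to every line and return a unified diff as a list."""
    new_lines = [transform(line) for line in lines]
    return list(
        difflib.unified_diff(
            lines, new_lines, fromfile=file_name, tofile=file_name + " (edited)"
        )
    )


original = ["self.failIf(x)\n", "self.assertTrue(y)\n"]
diff = preview_edit(original, lambda line: re.sub("failIf", "assertFalse", line))

# Nothing is written to disk; the diff is only displayed.
print("".join(diff), end="")
```

Writing the transformed lines back to the file would be the separate, explicit step that -w/--write stands for.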

WARNING: A word of caution about the usage of eval()

This tool is useful as far as it goes, but it relies on the Python eval() function and does not check the code being executed. This is a major security risk, and the tool should not be used in a production environment.

See Ned Batchelder's article for a thorough discussion of the dangers linked to eval() and ways to circumvent them. Note that none of the countermeasures suggested in the article are implemented at this time.
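
The risk can be illustrated with a short sketch. It assumes, for illustration only, that an expression is evaluated with the re module and a line variable in scope; evaluate_expression is a hypothetical helper, not a massedit function.

```python
import os
import re


def evaluate_expression(expression, line):
    """Hypothetical helper: evaluate a user-supplied expression against a line."""
    # Assumption: only `re` and `line` are exposed, yet builtins stay reachable.
    return eval(expression, {"re": re}, {"line": line})


# The intended use: a text substitution on the current line.
intended = evaluate_expression(
    "re.sub('failIf', 'assertFalse', line)", "self.failIf(x)"
)

# Nothing restricts the expression to text processing: it can import
# any module and call into it. A harmless call is used here.
unintended = evaluate_expression("__import__('os').getcwd()", "self.failIf(x)")
```

The second call returns the current working directory instead of an edited line, which shows that expressions should only ever come from a trusted source.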

Usage

You will probably need to know the basics of the Python re module (regular expressions).

usage: massedit.py [-h] [-V] [-w] [-v] [-e EXPRESSIONS] [-f FUNCTIONS]
                   [-x EXECUTABLES] [-s START_DIRS] [-m MAX_DEPTH] [-o FILE]
                   [-g FILE] [--encoding ENCODING] [--newline NEWLINE]
                   [file pattern [file pattern ...]]

Python mass editor

positional arguments:
  file pattern          shell-like file name patterns to process or - to read
                        from stdin.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -w, --write           modify target file(s) in place. Shows diff otherwise.
  -v, --verbose         increases log verbosity (can be specified multiple
                        times)
  -e EXPRESSIONS, --expression EXPRESSIONS
                        Python expressions applied to target files. Use the
                        line variable to reference the current line.
  -f FUNCTIONS, --function FUNCTIONS
                        Python function to apply to target file. Takes file
                        content as input and yield lines. Specify function as
                        [module]:?<function name>.
  -x EXECUTABLES, --executable EXECUTABLES
                        Python executable to apply to target file.
  -s START_DIRS, --start START_DIRS
                        Directory(ies) from which to look for targets.
  -m MAX_DEPTH, --max-depth-level MAX_DEPTH
                        Maximum depth when walking subdirectories.
  -o FILE, --output FILE
                        redirect output to a file
  -g FILE, --generate FILE
                        generate stub file suitable for -f option
  --encoding ENCODING   Encoding of input and output files
  --newline NEWLINE     Newline character for output files

Examples:
# Simple string substitution (-e). Will show a diff. No changes applied.
massedit.py -e "re.sub('failIf', 'assertFalse', line)" *.py

# File level modifications (-f). Overwrites the files in place (-w).
massedit.py -w -f fixer:fixit *.py

# Will change all test*.py in subdirectories of tests.
massedit.py -e "re.sub('failIf', 'assertFalse', line)" -s tests test*.py

# Will transform virtual methods (almost) to MOCK_METHOD suitable for gmock (see https://github.com/google/googletest).
massedit.py -e "re.sub(r'\s*virtual\s+([\w:<>,\s&*]+)\s+(\w+)(\([^\)]*\))\s*((\w+)*)(=\s*0)?;', 'MOCK_METHOD(\g<1>, \g<2>, \g<3>, (\g<4>, override));', line)" gmock_test.cpp
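
The gmock expression above can be tried on its own in plain Python before running it over a code base. The sketch below reuses the same pattern and replacement; to_mock_method and the two sample declarations are hypothetical and only serve as illustration.

```python
import re

# Same pattern and replacement as in the gmock example above.
VIRTUAL_METHOD = re.compile(
    r"\s*virtual\s+([\w:<>,\s&*]+)\s+(\w+)(\([^\)]*\))\s*((\w+)*)(=\s*0)?;"
)
MOCK_TEMPLATE = r"MOCK_METHOD(\g<1>, \g<2>, \g<3>, (\g<4>, override));"


def to_mock_method(line):
    """Rewrite one C++ virtual method declaration as a MOCK_METHOD line."""
    return VIRTUAL_METHOD.sub(MOCK_TEMPLATE, line)


const_method = to_mock_method("virtual int foo(int x) const;")
pure_virtual = to_mock_method("virtual void bar() = 0;")

print(const_method)  # MOCK_METHOD(int, foo, (int x), (const, override));
print(pure_virtual)  # MOCK_METHOD(void, bar, (), (, override));
```

The second result keeps a stray comma when the method has no qualifier, which is one reason the transformation is only "almost" complete and the diff is worth reviewing before using -w.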

If massedit is installed as a package (from PyPI for instance), one can interact with it as a command line tool:

python -m massedit -e "re.sub('assertEquals', 'assertEqual', line)" test.py

Or as a library (the command line options above are passed as keyword arguments):

>>> import massedit
>>> filenames = ['massedit.py']
>>> massedit.edit_files(filenames, ["re.sub('Jerome', 'J.', line)"])

Lastly, there is a convenient massedit.bat wrapper for Windows included in the distribution.

Installation

Download massedit.py from http://github.com/elmotec/massedit or:

pip install massedit

Poor man's source-to-source manipulation

I find myself using massedit mostly for source-to-source modification of large code bases, like this:

First create a fixer.py Python module with the function that will process your source code. For instance, to add a header:

def add_header(lines, file_name):
    yield '// This is my header'  # will be the first line of the file.
    for line in lines:
        yield line

Add the location of fixer.py to your $PYTHONPATH, then simply call massedit.py like this:

massedit.py -f fixer:add_header *.h

You can add the -s . option to process all the .h files recursively.
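
Because a fixer is an ordinary generator function, it can be exercised directly on a list of strings before massedit is pointed at real files. The sketch below is a second, hypothetical fixer (strip_trailing_whitespace is not part of massedit); it assumes that lines arrive with their line endings intact.

```python
def strip_trailing_whitespace(lines, file_name):
    """Yield each line without trailing spaces or tabs, keeping its line ending."""
    for line in lines:
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # "\n", "\r\n" or "" for a last line without EOL
        yield body.rstrip(" \t") + ending


# Quick check on in-memory data; file_name is unused by this fixer.
sample = ["int a;   \n", "int b;\t\r\n", "int c;"]
result = list(strip_trailing_whitespace(sample, "example.h"))
print(result)
```

Saved in fixer.py, it would be selected with -f fixer:strip_trailing_whitespace, following the same pattern as add_header.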

Plans

  • Add support for third-party tools (e.g. autopep8) to process the files.
  • Add support for a file of expressions as an argument to allow multiple modifications at once.
  • Find a satisfactory (i.e. easy to use) way to handle multiline regexes, as the current version works on a line-by-line basis.
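
Until multiline expressions are supported, one possible workaround is a file-level function used with -f, since it receives the whole content rather than one line at a time. The following is a sketch under that assumption; drop_block_comments and the sample input are hypothetical.

```python
import re

# Matches a C-style block comment, even when it spans several lines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def drop_block_comments(lines, file_name):
    """Join the lines, remove block comments, and yield the lines again."""
    content = "".join(lines)
    content = BLOCK_COMMENT.sub("", content)
    for line in content.splitlines(True):  # True keeps the line endings
        yield line


sample = ["int a; /* first\n", "   second */\n", "int b;\n"]
result = list(drop_block_comments(sample, "example.c"))
print(result)
```

Joining the lines loads the whole file in memory, which is usually acceptable for source files but is a trade-off compared with the line-by-line -e expressions.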

Rationale

  • I have a hard time practicing more than a few dialects of regular expressions.
  • I need something portable to Windows without being bothered by eol.
  • I believe Python is the ideal tool to build something more powerful than simple regex based substitutions.

Background

I have been using runsed and checksed (from Unix Power Tools) for years and did not find a good substitute under Windows until I came across Graham Fawcett's Python recipe 437932 on ActiveState. It inspired me to write massedit.

The core was fleshed out a little, and here we are. If you find it useful and enhance it, please do not forget to submit patches. Thanks!

If you are more interested in an awk-like tool, you will probably find pyp a better alternative.

License

Licensed under the terms of the MIT License. See the attached file LICENSE.txt.

Changes

0.69.0 (2020-12-22)
Updated infrastructure files to setup.cfg/pyproject.toml instead of setup.py. Also moved CI from Travis to GitHub workflows and added regression tests for Python 2.7.
0.68.6 (2019-12-02)
Added support for Python 3.8, stdin input via - argument. Documented regex to turn base classes into googlemock MOCK_METHOD.
0.68.5 (2019-04-13)
Added --newline option to force newline output. Thanks @ALFNeT!
0.68.4 (2017-10-24)
Fixed bug that would cause changes to be missed when the -w option is omitted. Thanks @tgoodlet!
0.68.3 (2017-09-20)
Added --generate option to quickly generate a fixer.py template file to be modified to be used with -f fixer.fixit option. Added official support for Python 3.6.
0.68.1 (2016-06-04)
Fixed encoding issues when processing non-ascii files. Added --encoding option to force the value of the encoding if need be. Listed support for Python 3.5.
0.67.1 (2015-06-28)
Documentation fixes.
0.67 (2015-06-23)
Added file_name argument to processing functions. Fixed incorrect closing of sys.stdout/stderr. Improved diagnostic when the processing function does not take 2 arguments. Swapped -v and -V option to be consistent with Python. Pylint fixes. Added support for Python 3.4. Dropped support for Python 3.2.
0.66 (2013-07-14)
Fixed lost executable bit with -f option (thanks myint).
0.65 (2013-07-12)
Added -f option to execute code in a separate file/module. Added Travis continuous integration (thanks myint). Fixed python 2.7 support (thanks myint).
0.64 (2013-06-01)
Fixed setup.py so that massedit installs as a script. Fixed eol issues (thanks myint).
0.63 (2013-05-27)
Renamed to massedit. Previous versions are still known as Python-Mass-Editor.
0.62 (2013-04-11)
Fixed bug that caused an EditorError to be raised when the result of the expression is an empty string.
0.61 (2012-07-06)
Added massedit.edit_files function to ease usage as library instead of as a command line tool (suggested by Maxim Veksler).
0.60 (2012-07-04)
Treats arguments as patterns rather than files to ease processing of multiple files in multiple subdirectories. Added -s (start directory) and -m (max depth) options.
0.52 (2012-06-05)
Upgraded for python 3. Still compatible with python 2.7.
0.51 (2012-05)
Initial release (Beta).

Contributor acknowledgement

https://github.com/myint, https://github.com/tgoodlet, https://github.com/ALFNeT

Comments
  • Ability to pass newline character for output files

    I've run into a situation where I need to make sure the output files are written with a specific EOL. In this particular case I need to read files with CRLF EOLs and write them in LF regardless of OS.

    Ideally I think the EOL from the source files should be respected, but I don't think that's a simple problem to solve at this point, so adding a new flag to specify the EOL for the output files feels like a good compromise.

    I'm unsure whether this is needed by many people or whether this is the best way of going about it, but I'm happy to make changes as required.

    opened by ALFNeT 6
  • Write error when file contents not utf8 encoded

    I get the following error if my input file corpus is not utf8 encoded:

    Traceback (most recent call last):                                      
      File "replace.py", line 50, in <module>                               
        dry_run=False,                                                      
      File "/home/tyler/repos/massedit/massedit.py", line 440, in edit_files
        diffs = list(editor.edit_file(path))                                
      File "/home/tyler/repos/massedit/massedit.py", line 202, in edit_file 
        new_file.writelines(to_lines)                                       
    TypeError: must be unicode, not str                                     
    

    A change to https://github.com/elmotec/massedit/blob/master/massedit.py#L202 resolves this:

    new_file.writelines(line.decode('utf8') for line in to_lines)

    I can provide a PR if necessary.

    opened by goodboy 4
  • Decorator interface for registering processing funcs?

    What's your opinion on using decorators to register processing functions?

    It would eliminate the need for the explicit -f fixer:add_header syntax. Instead, you just point to a module and the decorated funcs are auto-loaded.

    Also, the requirement that the module is on the Python path could be avoided by a simple import of the file directly.

    Whatcha think?

    opened by goodboy 4
  • Support Python 2

    These changes support Python 2 while maintaining Python 3 support. See the Travis CI test results.

    I also set the UTF-8 encoding in a more portable way.

    opened by myint 2
  • Preserve file permission

    Currently, if I use massedit to modify an executable file, the executable bit gets lost.

    $ ls -l foo.sh
    -rwxr-xr-x 1 myint wheel 2 Jul 13 09:16 foo.sh
    $ massedit -we 'line.replace("a", "b")' '*.sh'
    $ ls -l foo.sh
    -rw-r--r-- 1 myint wheel 2 Jul 13 09:17 foo.sh
    
    opened by myint 1
  • pip installation doesn't work

    $ pip install --user massedit
    Downloading/unpacking massedit
      Downloading massedit-0.63.zip
      Running setup.py egg_info for package massedit
    
    Installing collected packages: massedit
      Running setup.py install for massedit
    
    Successfully installed massedit
    Cleaning up...
    $ python -m massedit
    /opt/local/bin/python: No module named massedit
    
    opened by myint 0
  • Five failures on openSUSE Python 2

    Packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-massedit

    [   64s] ======================================================================
    [   64s] ERROR: test_exec_option (tests.TestCommandLine)
    [   64s] Check trivial call using executable.
    [   64s] ----------------------------------------------------------------------
    [   64s] Traceback (most recent call last):
    [   64s]   File "/home/abuild/rpmbuild/BUILD/massedit-0.68.5/tests.py", line 714, in test_exec_option
    [   64s]     self.assertEqual(actual[3], '-#!/usr/bin/env python')
    [   64s] IndexError: list index out of range
    [   64s] 
    [   64s] ======================================================================
    [   64s] ERROR: test_file_option (tests.TestCommandLine)
    [   64s] Test processing of a file.
    [   64s] ----------------------------------------------------------------------
    [   64s] Traceback (most recent call last):
    [   64s]   File "/home/abuild/rpmbuild/BUILD/massedit-0.68.5/tests.py", line 644, in test_file_option
    [   64s]     actual = output.getvalue().split("\n")[3]
    [   64s] IndexError: list index out of range
    [   64s] 
    [   64s] ======================================================================
    [   64s] FAIL: test_error_in_function (tests.TestCommandLine)
    [   64s] Check error when the function triggers an exception.
    [   64s] ----------------------------------------------------------------------
    [   64s] Traceback (most recent call last):
    [   64s]   File "/home/abuild/rpmbuild/BUILD/massedit-0.68.5/tests.py", line 701, in test_error_in_function
    [   64s]     output=output)
    [   64s] AssertionError: ZeroDivisionError not raised
    [   64s] 
    [   64s] ======================================================================
    [   64s] FAIL: test_missing_function_name (tests.TestCommandLine)
    [   64s] Check error when the function is empty but not the module.
    [   64s] ----------------------------------------------------------------------
    [   64s] Traceback (most recent call last):
    [   64s]   File "/home/abuild/rpmbuild/BUILD/massedit-0.68.5/tests.py", line 680, in test_missing_function_name
    [   64s]     self.assertEqual(log_sink.log, expected)
    [   64s] AssertionError: u'' != u"'massedit:' is not a callable function: 'dict' object has no attribute 'massed [truncated]...
    [   64s] + 'massedit:' is not a callable function: 'dict' object has no attribute 'massedit'
    [   64s] 
    [   64s] 
    [   64s] ======================================================================
    [   64s] FAIL: test_wrong_number_of_argument (tests.TestCommandLine)
    [   64s] Test passing function that has the wrong number of arguments.
    [   64s] ----------------------------------------------------------------------
    [   64s] Traceback (most recent call last):
    [   64s]   File "/home/abuild/rpmbuild/BUILD/massedit-0.68.5/tests.py", line 690, in test_wrong_number_of_argument
    [   64s]     self.assertEqual(log_sink.log, expected)
    [   64s] AssertionError: u'' != u"'massedit:get_function' is not a callable function: function should take 2 arg [truncated]...
    [   64s] + 'massedit:get_function' is not a callable function: function should take 2 arguments: lines, file_name
    
    opened by jayvdb 3
  • Can't read iso-8859-5 encoded files

    Not that you should support it; just thought I'd report it.

    >>> massedit -e "line" -m 99 nonascii.py                                                   
    Traceback (most recent call last):
      File "/usr/bin/massedit", line 9, in <module>
        load_entry_point('massedit', 'console_scripts', 'massedit')()
      File "/home/tyler/repos/massedit/massedit.py", line 475, in main
        command_line(sys.argv)
      File "/home/tyler/repos/massedit/massedit.py", line 462, in command_line
        output=arguments.output)
      File "/home/tyler/repos/massedit/massedit.py", line 440, in edit_files
        diffs = list(editor.edit_file(path))
      File "/home/tyler/repos/massedit/massedit.py", line 168, in edit_file
        from_lines = from_file.readlines()
      File "/usr/lib/python2.7/codecs.py", line 314, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xb1 in position 81: invalid start byte
    

    Where the offending file is nonascii.py from the IPython project.

    opened by goodboy 7
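
The decoding failure in the last report can be reproduced without massedit. The sketch below decodes the single byte named in the traceback (0xb1) with both codecs; the variable names are illustrative only.

```python
raw = b"\xb1"  # the byte reported in the traceback above

# Decoding as UTF-8 fails because 0xb1 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# The same byte is a valid Cyrillic letter in ISO-8859-5.
decoded = raw.decode("iso-8859-5")

print(type(utf8_error).__name__)
print(decoded == "\u0411")
```

This is the kind of situation the --encoding option listed in the usage section is meant for; whether passing --encoding iso-8859-5 fully resolves the report above has not been verified here.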
Releases(v0.69.1)
  • v0.69.1(Dec 23, 2020)

  • v0.69.0(Dec 23, 2020)

    Updated infrastructure files to setup.cfg/pyproject.toml instead of setup.py. Also moved CI to github workflows from travis and added regression tests for Python 2.7.
  • v0.68.6(Dec 3, 2019)

  • v0.68.4(Oct 25, 2017)

  • v0.68.1(Jun 5, 2016)

  • v0.67.1(Jun 29, 2015)

  • v0.67(Jun 24, 2015)

    • Added file_name argument to processing functions.
    • Fixed incorrect closing of sys.stdout/stderr.
    • Improved diagnostic when the processing function does not take 2 arguments.
    • Swapped -v and -V option to be consistent with Python.
    • Pylint fixes. Added support for Python 3.4. Dropped support for Python 3.2.

    Get it from:

    • https://github.com/elmotec/massedit/archive/v0.67.zip
    • https://pypi.python.org/packages/source/m/massedit/massedit-0.67.zip#md5=8d3bd3a177e0733f6d4f415101321615

    or:

    pip install massedit
  • v0.66(Jul 14, 2013)

  • v0.65(Jul 13, 2013)

    Added -f option to execute code in a separate file/module. Added Travis continuous integration (thanks myint). Fixed python 2.7 support (thanks myint).
A library that modifies python source code to conform to pep8.

Pep8ify: Clean your code with ease Pep8ify is a library that modifies python source code to conform to pep8. Installation This library currently works

Steve Pulec 117 Jan 3, 2023
Simple, hassle-free, dependency-free, AST based source code refactoring toolkit.

refactor is an end-to-end refactoring framework that is built on top of the 'simple but effective refactorings' assumption. It is much easier to write a simple script with it rather than trying to figure out what sort of a regex you need in order to replace a pattern (if it is even matchable with regexes).

Batuhan Taskaya 385 Jan 6, 2023
A simple Python bytecode framework in pure Python

A simple Python bytecode framework in pure Python

null 3 Jan 23, 2022
Awesome autocompletion, static analysis and refactoring library for python

Jedi - an awesome autocompletion, static analysis and refactoring library for Python Jedi is a static analysis tool for Python that is typically used

Dave Halter 5.3k Dec 29, 2022
a python refactoring library

rope, a python refactoring library ... Overview Rope is a python refactoring library. Notes Nick Smith takes over maintaining rope

null 1.5k Dec 30, 2022
Find dead Python code

Vulture - Find dead code Vulture finds unused code in Python programs. This is useful for cleaning up and finding errors in large code bases. If you r

Jendrik Seipp 2.4k Dec 27, 2022
Safe code refactoring for modern Python.

Safe code refactoring for modern Python projects. Overview Bowler is a refactoring tool for manipulating Python at the syntax tree level. It enables s

Facebook Incubator 1.4k Jan 4, 2023
A system for Python that generates static type annotations by collecting runtime types

MonkeyType MonkeyType collects runtime types of function arguments and return values, and can automatically generate stub files or even add draft type

Instagram 4.1k Dec 28, 2022
Tool for translation type comments to type annotations in Python

com2ann Tool for translation of type comments to type annotations in Python. The tool requires Python 3.8 to run. But the supported target code versio

Ivan Levkivskyi 123 Nov 12, 2022
Bottom-up approach to refactoring in python

Introduction RedBaron is a python library and tool powerful enough to be used into IPython solely that intent to make the process of writing code that

Python Code Quality Authority 653 Dec 30, 2022
Code generation and code search for Python and Javascript.

Codeon Code generation and code search for Python and Javascript. Similar to GitHub Copilot with one major difference: Code search is leveraged to mak

null 51 Dec 8, 2022
AST based refactoring tool for Python.

breakfast AST based refactoring tool. (Very early days, not usable yet.) Why 'breakfast'? I don't know about the most important, but it's a good meal.

eric casteleijn 0 Feb 22, 2022
Refactoring Python Applications for Simplicity

Python Refactoring Refactoring Python Applications for Simplicity. You can open and read project files or use this summary Concatenate String my_st

Mohammad Dori 3 Jul 15, 2022
Leap is an experimental package written to enable the utilization of C-like goto statements in Python functions

Leap is an experimental package written to enable the utilization of C-like goto statements in Python functions

null 6 Dec 26, 2022
Turn your C++/Java code into a Python-like format for extra style points and to make everyone hates you

Turn your C++/Java code into a Python-like format for extra style points and to make everyone hates you

Tô Đức (Watson) 4 Feb 7, 2022
Fully Automated YouTube Channel ▶️with Added Extra Features.

Fully Automated Youtube Channel

sam-sepiol 249 Jan 2, 2023
Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.

Talk-to-Edit (ICCV2021) This repository contains the implementation of the following paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog Yumin

Yuming Jiang 221 Jan 7, 2023
Composable transformations of Python+NumPy programs

Chex Chex is a library of utilities for helping to write reliable JAX code. This includes utils to help: Instrument your code (e.g. assertions) Debug

DeepMind 506 Jan 8, 2023
Scripts to download files and folders programmatically from Google Drive

Google Drive Downloader Scripts Every time I need to download a lot of files from Google Drive (e.g. a dataset), it's always incredibly frustrating an

Ivan Evtimov 6 Jul 22, 2021
Edit SRT files to delay subtitle time-stamps.

subtitle-delay A program written in Python that directly edits SRT file to delay the subtitles. Features: Will throw an error if delaying with negativ

null 8 Jul 17, 2022