Textpipe: clean and extract metadata from text

Overview

textpipe is a Python package for converting raw text into clean, readable text and extracting metadata from it. It removes HTML tags and other unreadable constructs, and extracts metadata such as the number of words and the named entities in the text.
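
To make the idea of "clean, then extract metadata" concrete, here is a minimal sketch that uses only the Python standard library. It is not textpipe's implementation: the names `clean_text` and `count_words` are hypothetical, the word count is a naive whitespace split, and textpipe's own results (which rely on spaCy and other packages) may differ.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; tags and declarations are skipped.
        self.parts.append(data)


def clean_text(raw):
    """Strip markup and collapse runs of whitespace (illustrative only)."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join("".join(parser.parts).split())


def count_words(text):
    """Naive word count: number of whitespace-separated tokens."""
    return len(text.split())


cleaned = clean_text("<p>Hello <b>world</b>!</p>")
print(cleaned, count_words(cleaned))
```

This sketch keeps the text inside every element, including script and style blocks, so a production cleaner would need more care.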

Vision: the zen of textpipe

  • Designed for use in production pipelines without adult supervision.
  • Rechargeable batteries included: provide sane defaults and clear examples to adapt.
  • A uniform interface with thin wrappers around state-of-the-art NLP packages.
  • As language-agnostic as possible.
  • Bring your own models.

Features

  • Clean raw text by removing HTML and other unreadable constructs
  • Identify the language of text
  • Extract the number of words, number of sentences, and named entities from a text
  • Calculate the complexity of a text
  • Obtain text metadata by specifying a pipeline containing all desired elements
  • Obtain sentiment (polarity and a subjectivity score)
  • Generate word counts
  • Compute MinHash for cheap similarity estimation of documents
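
The last feature, MinHash, may be unfamiliar. The sketch below shows the general technique in plain Python so the "cheap similarity estimation" claim is easier to picture. It is an assumption-laden illustration, not textpipe's code: the function names, the word 3-gram shingling, and the use of salted BLAKE2b as the hash family are all choices made here for the example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of word k-grams (shingles) in a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def minhash_signature(items, num_perm=64):
    """For each of num_perm salted hash functions, keep the minimum hash over all items."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox jumps over the sleepy dog"
sig_a = minhash_signature(shingles(doc_a))
sig_b = minhash_signature(shingles(doc_b))
print(estimate_jaccard(sig_a, sig_b))
```

Comparing two fixed-length signatures is much cheaper than comparing the full shingle sets, which is why the estimate is useful for large document collections. More permutations give a tighter estimate at the cost of more hashing.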

Installation

It is recommended that you install textpipe in a virtual environment. Create one with any of the following tools:

  • Using venv.
python3 -m venv .venv
  • Using virtualenv.
virtualenv venv -p python3.6
  • Using virtualenvwrapper.
mkvirtualenv textpipe -p python3.6

Then install the package and its requirements:

  • Install textpipe using pip.
pip install textpipe
  • Install the required packages using requirements.txt.
pip install -r requirements.txt

A note on spaCy download model requirement

The requirements.txt file that comes with the package calls for spaCy's en_core_web_sm model, but you can change this to whichever model and language your intended use requires. See spaCy.io's page on their different models for more information.

Usage example

>>> from textpipe import doc, pipeline
>>> sample_text = 'Sample text! <!DOCTYPE>'
>>> document = doc.Doc(sample_text)
>>> print(document.clean)
Sample text!
>>> print(document.language)
en
>>> print(document.nwords)
2

>>> pipe = pipeline.Pipeline(['CleanText', 'NWords'])
>>> print(pipe(sample_text))
{'CleanText': 'Sample text!', 'NWords': 3}

To extend the existing textpipe operations with your own custom operations:

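from textpipe import pipeline

test_pipe = pipeline.Pipeline(['CleanText', 'NWords'])

# A custom operation receives the document plus optional context and settings.
def custom_op(doc, context=None, settings=None, **kwargs):
    return 1

# Register the operation under a name, then append it as a step with its arguments.
custom_argument = {'argument': 1}
test_pipe.register_operation('CUSTOM_STEP', custom_op)
test_pipe.steps.append(('CUSTOM_STEP', custom_argument))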
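
For readers who want a mental model of what registering a step involves, the following toy pipeline mirrors the ideas listed in the 0.7.0 change notes: a dict registry of operations, a shared context passed via the context keyword argument, and steps that can carry arguments. It is a simplified sketch, not textpipe's implementation. `MiniPipeline` and `n_chars` are hypothetical names, and the operations receive a plain string where textpipe would pass its own document object.

```python
class MiniPipeline:
    """Toy pipeline: a dict registry of named operations plus an ordered list of steps."""

    def __init__(self, steps):
        self.registry = {}        # operation name -> callable
        self.steps = list(steps)  # each step is a name or a (name, settings) tuple
        self.context = {}         # data shared across all operations

    def register_operation(self, name, func):
        self.registry[name] = func

    def __call__(self, text):
        results = {}
        for step in self.steps:
            # Steps without arguments get an empty settings dict.
            name, settings = step if isinstance(step, tuple) else (step, {})
            operation = self.registry[name]
            results[name] = operation(text, context=self.context, settings=settings)
        return results


def n_chars(doc, context=None, settings=None, **kwargs):
    """Built-in style operation for the toy pipeline: character count."""
    return len(doc)


def custom_op(doc, context=None, settings=None, **kwargs):
    """Custom operation that simply echoes one of its settings."""
    return settings["argument"]


pipe = MiniPipeline(["NChars"])
pipe.register_operation("NChars", n_chars)
pipe.register_operation("CUSTOM_STEP", custom_op)
pipe.steps.append(("CUSTOM_STEP", {"argument": 1}))
print(pipe("Sample text!"))  # {'NChars': 12, 'CUSTOM_STEP': 1}
```

Keeping the registry as a dict makes lookup by step name direct and lets a later registration replace an earlier one under the same name.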

Contributing

See CONTRIBUTING for guidelines for contributors.

Changes

0.12.1

  • Bumps redis, tqdm, pylint

0.12.0

  • Bumps versions of many dependencies including textacy. Results for keyterm extraction changed.

0.11.9

  • Exposes arbitrary spaCy ents properties

0.11.8

  • Exposes spaCy's cats attribute

0.11.7

  • Bumps spaCy and redis versions

0.11.6

  • Fixes bug where gensim model is not cached in pipeline

0.11.5

  • Raise TextpipeMissingModelException instead of KeyError

0.11.4

  • Bumps spaCy and datasketch dependencies

0.11.1

  • Replaces codacy with pylint on CI
  • Fixes pylint issues

0.11.0

  • Adds wrapper around Gensim keyed vectors to construct document embeddings from Redis cache

0.9.0

  • Adds functionality to compute document embeddings using a Gensim word2vec model

0.8.6

  • Removes non-standard UTF characters before detecting language

0.8.5

  • Bump spaCy to 2.1.3

0.8.4

  • Fix broken install command

0.8.3

  • Fix broken install command

0.8.2

  • Fix copy-paste error in word vector aggregation (#118)

0.8.1

  • Fixes bugs in several operations that didn't accept kwargs

0.8.0

  • Bumps spaCy to 2.1

0.7.2

  • Pins spaCy and Pattern versions (with pinned lxml)

0.7.0

  • change operation's registry from list to dict
  • global pipeline data is available across operations via the context kwarg
  • load custom operations using register_operation in pipeline
  • custom steps (operations) with arguments

Comments

  • Add unsupervised keyphrase extraction

    Follow up on https://github.com/textpipe/textpipe/pull/73

    It adds an extract_keyphrases method rather than a property, because there's three different algorithms that we can use. This way, it's up to the user what to pick. The sgrank algorithm provides some additional arguments (selecting which ngrams to consider, for example), so I've added a **kwargs for users to pass these arguments on.

    opened by bartdegoede 10
  • TypeError while installing through pip

    pip install textpipe
    
    Collecting textpipe
      Downloading https://files.pythonhosted.org/packages/44/0c/e7fafbda3caa0c7f8d6ffbe144d081a596c80d834e17943bbe0f5b31b8e9/textpipe-0.8.1.tar.gz
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/tmp/pip-install-amdf0b/textpipe/setup.py", line 13, in <module>
            with open(Path(__file__).resolve().parent.joinpath('README.md'), 'r') as fh:
        TypeError: coercing to Unicode: need string or buffer, PosixPath found
        
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-amdf0b/textpipe/
    
    opened by sachinchavan9 9
  • distutils.errors.CompileError when installing textpipe

    I get the following error when I try to install textpipe using Python 3.6.8 and Ubuntu 18.04.

    Any thoughts? I can't seem to find any helpful solutions on the interwebs.

    Traceback (most recent call last):
      File "/usr/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
        extra_postargs)
      File "/usr/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
        spawn(cmd, dry_run=self.dry_run)
      File "/usr/lib/python3.6/distutils/spawn.py", line 36, in spawn
        _spawn_posix(cmd, search_path, dry_run=dry_run)
      File "/usr/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
        % (cmd, exit_status))
    distutils.errors.DistutilsExecError: command 'x86_64-linux-gnu-gcc' failed with exit status 1

    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/ffiplatform.py", line 51, in _build
        dist.run_command('build_ext')
      File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 78, in run
        _build_ext.run(self)
      File "/usr/lib/python3.6/distutils/command/build_ext.py", line 339, in run
        self.build_extensions()
      File "/usr/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
        self._build_extensions_serial()
      File "/usr/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
        self.build_extension(ext)
      File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 199, in build_extension
        _build_ext.build_extension(self, ext)
      File "/usr/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
        depends=ext.depends)
      File "/usr/lib/python3.6/distutils/ccompiler.py", line 574, in compile
        self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
      File "/usr/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
        raise CompileError(msg)
    distutils.errors.CompileError: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/setup.py", line 191, in <module>
        'Topic :: Text Processing :: Linguistic'
      File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 129, in setup
        return distutils.core.setup(**attrs)
      File "/usr/lib/python3.6/distutils/core.py", line 148, in setup
        dist.run_commands()
      File "/usr/lib/python3.6/distutils/dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib/python3/dist-packages/setuptools/command/egg_info.py", line 278, in run
        self.find_sources()
      File "/usr/lib/python3/dist-packages/setuptools/command/egg_info.py", line 293, in find_sources
        mm.run()
      File "/usr/lib/python3/dist-packages/setuptools/command/egg_info.py", line 524, in run
        self.add_defaults()
      File "/usr/lib/python3/dist-packages/setuptools/command/egg_info.py", line 560, in add_defaults
        sdist.add_defaults(self)
      File "/usr/lib/python3/dist-packages/setuptools/command/py36compat.py", line 34, in add_defaults
        self._add_defaults_python()
      File "/usr/lib/python3/dist-packages/setuptools/command/sdist.py", line 127, in _add_defaults_python
        build_py = self.get_finalized_command('build_py')
      File "/usr/lib/python3.6/distutils/cmd.py", line 299, in get_finalized_command
        cmd_obj.ensure_finalized()
      File "/usr/lib/python3.6/distutils/cmd.py", line 107, in ensure_finalized
        self.finalize_options()
      File "/usr/lib/python3/dist-packages/setuptools/command/build_py.py", line 34, in finalize_options
        orig.build_py.finalize_options(self)
      File "/usr/lib/python3.6/distutils/command/build_py.py", line 45, in finalize_options
        ('force', 'force'))
      File "/usr/lib/python3.6/distutils/cmd.py", line 287, in set_undefined_options
        src_cmd_obj.ensure_finalized()
      File "/usr/lib/python3.6/distutils/cmd.py", line 107, in ensure_finalized
        self.finalize_options()
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/setup.py", line 143, in finalize_options
        self.distribution.ext_modules = get_ext_modules()
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/setup.py", line 128, in get_ext_modules
        import cld2
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/cld2/__init__.py", line 190, in <module>
        extra_compile_args=_COMPILER_ARGS)
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/api.py", line 464, in verify
        lib = self.verifier.load_library()
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/verifier.py", line 104, in load_library
        self._compile_module()
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/verifier.py", line 201, in _compile_module
        outputfilename = ffiplatform.compile(tmpdir, self.get_extension())
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/ffiplatform.py", line 22, in compile
        outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
      File "/tmp/pip-build-8bvewwgk/cld2-cffi/.eggs/cffi-1.12.3-py3.6-linux-x86_64.egg/cffi/ffiplatform.py", line 58, in _build
        raise VerificationError('%s: %s' % (e.__class__.__name__, e))
    cffi.VerificationError: CompileError: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
    ----------------------------------------
    

    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-8bvewwgk/cld2-cffi/

    opened by rajbala 5
  • Update pytest requirement from ~=3.6.4 to ~=3.7.1

    Updates the requirements on pytest to permit the latest version.

    Changelog

    Sourced from pytest's changelog.

    pytest 3.7.1 (2018-08-02)

    Bug Fixes

    • #3473: Raise immediately if approx() is given an expected value of a type it doesn't understand (e.g. strings, nested dicts, etc.).

    • #3712: Correctly represent the dimensions of an numpy array when calling repr() on approx().

    • #3742: Fix incompatibility with third party plugins during collection, which produced the error object has no attribute '_collectfile'.

    • #3745: Display the absolute path if cache_dir is not relative to the rootdir instead of failing.

    • #3747: Fix compatibility problem with plugins and the warning code issued by fixture functions when they are called directly.

    • #3748: Fix infinite recursion in pytest.approx with arrays in numpy<1.13.

    • #3757: Pin pathlib2 to >=2.2.0 as we require __fspath__ support.

    • #3763: Fix TypeError when the assertion message is bytes in python 3.

    pytest 3.7.0 (2018-07-30)

    Deprecations and Removals

    • #2639: pytest_namespace has been deprecated.

      See the documentation for pytest_namespace hook for suggestions on how to deal with this in plugins which use this functionality.

    • #3661: Calling a fixture function directly, as opposed to request them in a test function, now issues a RemovedInPytest4Warning. It will be changed into an error in pytest 4.0.

      This is a great source of confusion to new users, which will often call the fixture functions and request them from test functions interchangeably, which breaks the fixture resolution model.

    Features

    • #2283: New package fixture scope: fixtures are finalized when the last test of a package finishes. This feature is considered experimental, so use it sparingly.
    ... (truncated)
    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot ignore this [patch|minor|major] version will close this PR and stop Dependabot creating any more for this minor/major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

    Additionally, you can set the following in your Dependabot dashboard:

    • Update frequency (including time of day and day of week)
    • Automerge options (never/patch/minor, and dev/runtime dependencies)
    • Pull request limits (per update run and/or open at any time)
    • Out-of-range updates (receive only lockfile updates, if desired)
    • Security updates (receive only security updates, if desired)

    Finally, you can contact us by mentioning @dependabot.

    dependencies 
    opened by dependabot-preview[bot] 5
  • Update pytest requirement from ~=4.3 to ~=5.0

    Updates the requirements on pytest to permit the latest version.

    Changelog

    Sourced from pytest's changelog.

    pytest 5.0.0 (2019-06-28)

    Important

    This release is a Python3.5+ only release.

    For more details, see our Python 2.7 and 3.4 support plan.

    Removals

    • #1149: Pytest no longer accepts prefixes of command-line arguments, for example typing pytest --doctest-mod in place of --doctest-modules. This was previously allowed where the ArgumentParser thought it was unambiguous, but this could be incorrect due to delayed parsing of options for plugins. See for example issues #1149, #3413, and #4009.

    • #5402: PytestDeprecationWarning are now errors by default.

      Following our plan to remove deprecated features with as little disruption as possible, all warnings of type PytestDeprecationWarning now generate errors instead of warning messages.

      The affected features will be effectively removed in pytest 5.1, so please consult the Deprecations and Removals section in the docs for directions on how to update existing code.

      In the pytest 5.0.X series, it is possible to change the errors back into warnings as a stop gap measure by adding this to your pytest.ini file:

      [pytest]
      filterwarnings =
          ignore::pytest.PytestDeprecationWarning
      

      But this will stop working when pytest 5.1 is released.

      If you have concerns about the removal of a specific feature, please add a comment to #5402.

    • #5412: ExceptionInfo objects (returned by pytest.raises) now have the same str representation as repr, which avoids some confusion when users use print(e) to inspect the object.

    Deprecations

    • #4488: The removal of the --result-log option and module has been postponed to (tentatively) pytest 6.0 as the team has not yet got around to implement a good alternative for it.
    • #466: The funcargnames attribute has been an alias for fixturenames since pytest 2.3, and is now deprecated in code too.

    Features

    • #3457: New pytest_assertion_pass hook, called with context information when an assertion passes.

      This hook is still experimental so use it with caution.

    • #5440: The faulthandler standard library module is now enabled by default to help users diagnose crashes in C modules.

      This functionality was provided by integrating the external pytest-faulthandler plugin into the core, so users should remove that plugin from their requirements if used.

    ... (truncated)
    Commits
    • 58bfc77 Use shutil.which to avoid distutils+imp warning
    • 97f0a20 Add notice about py35+ and move ExitCode changelog entry
    • 55d2fe0 Use importlib instead of imp in demo
    • 5e39eb9 Correct Zac-HD's name in changelogs
    • fd2f320 Preparing release version 5.0.0
    • 73d918d Remove astor and reproduce the original assertion expression (#5512)
    • 7ee2444 Remove astor and reproduce the original assertion expression
    • 3c9b46f Remove stray comment from tox.ini (#5507)
    • f7bfbb5 Merge pull request #5506 from asottile/fix_no_terminal
    • 45af361 Remove stray comment from tox.ini
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 4
  • extractive summarization using textrank

    Textrank was implemented using the gensim framework in order to get a list of extracted sentences from a document that could serve as a summary of the document.

    opened by msappelli 4
  • Update pylint requirement from ~=2.1.1 to ~=2.2.2

    Updates the requirements on pylint to permit the latest version.

    Changelog

    Sourced from pylint's changelog.

    What's New in Pylint 2.2.2?

    Release date: 2018-11-28

    • Change the logging-format-style to use name identifier instead of their corresponding Python identifiers

      This is to prevent users having to think about escaping the default value for logging-format-style in the generated config file. Also our config parsing utilities don't quite support escaped values when it comes to choices detection, so this would have needed various hacks around that.

      Closes #2614

    What's New in Pylint 2.2.1?

    Release date: 2018-11-27

    • Fix a crash caused by implicit-str-concat-in-sequence and multi-bytes characters.

      Closes #2610

    What's New in Pylint 2.2?

    Release date: 2018-11-25

    • Consider range() objects for undefined-loop-variable leaking from iteration.

      Close #2533

    • deprecated-method can use the attribute name for identifying a deprecated method

      Previously we were using the fully qualified name, which we still do, but the fully qualified name for some unittest deprecated aliases leads to a generic deprecation function. Instead of relying on that, we now also rely on the attribute name, which should solve some false positives.

      Close #1653 Close #1946

    • Fix compatibility with changes to stdlib tokenizer.

    • pylint is less eager to consume the whole line for pragmas

      Close #2485

    ... (truncated)

    dependencies 
    opened by dependabot-preview[bot] 4
  • Adds is_reliable_language property to language detection

    Follow up from the discussion in https://github.com/textpipe/textpipe/pull/53#discussion_r210245818, I made the language detection a bit more robust and expose the is_reliable_language property from cld2-cffi.

    needs-review 
    opened by dodijk 4
  • Allow operation-specific Spacy models

    See #51

    This changes the way spacy_nlp and spacy_doc are called to allow different spacy language modules and spacy docs per operation in the pipeline.

    opened by lmdehaas 4
  • Update spacy requirement from ~=2.1.6 to ~=2.2.1

    Updates the requirements on spacy to permit the latest version.

    dependencies 
    opened by dependabot-preview[bot] 3
  • Update pytest requirement from ~=3.8.0 to ~=3.8.1

    Updates the requirements on pytest to permit the latest version.

    Changelog

    Sourced from pytest's changelog.

    pytest 3.8.1 (2018-09-22)

    Bug Fixes

    • #3286: .pytest_cache directory is now automatically ignored by Git. Users who would like to contribute a solution for other SCMs please consult/comment on this issue.

    • #3749: Fix the following error during collection of tests inside packages:

      TypeError: object of type 'Package' has no len()

    • #3941: Fix bug where indirect parametrization would consider the scope of all fixtures used by the test function to determine the parametrization scope, and not only the scope of the fixtures being parametrized.

    • #3973: Fix crash of the assertion rewriter if a test changed the current working directory without restoring it afterwards.

    • #3998: Fix issue that prevented some caplog properties (for example record_tuples) from being available when entering the debugger with --pdb.

    • #3999: Fix UnicodeDecodeError in python2.x when a class returns a non-ascii binary __repr__ in an assertion which also contains non-ascii text.

    Improved Documentation

    • #3996: New Deprecations and Removals page (https://docs.pytest.org/en/latest/deprecations.html) shows all currently deprecated features, the rationale to do so, and alternatives to update your code. It also list features removed from pytest in past major releases to help those with ancient pytest versions to upgrade.

    Trivial/Internal Changes

    • #3955: Improve pre-commit detection for changelog filenames

    • #3975: Remove legacy code around im_func as that was python2 only

    pytest 3.8.0 (2018-09-05)

    Deprecations and Removals

    ... (truncated)

    dependencies 
    opened by dependabot-preview[bot] 3
  • Bump fakeredis from 0.16.0 to 1.5.0

    Bump fakeredis from 0.16.0 to 1.5.0

    Bumps fakeredis from 0.16.0 to 1.5.0.

    Commits
    • d1dbaf2 Prepare for 1.5.0 release.
    • 919d951 Prevent running hypothesis tests on 32-bit server
    • ee1599d Merge pull request #293 from wandering-tales/master
    • 3bcb8fc Remove unused _description_args connection attribute
    • f8b2170 Align FakeConnection constructor signature to base class.
    • f8e3df1 Ensure that failed exec always aborts in-progress transaction
    • f2a4018 Update to match redis 6.0.6+ handling exec failures
    • 541c91a Test against Python 3.9
    • c1f68d2 Stop testing on Python 3.5
    • 5861f4a Update requirements.txt to latest versions
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 0
  • Update numpy requirement from ~=1.19.5 to ~=1.20.2

    Update numpy requirement from ~=1.19.5 to ~=1.20.2

    Updates the requirements on numpy to permit the latest version.
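A note on the `~=` operator used in these requirement updates: under PEP 440 it is the "compatible release" clause, so `~=1.20.2` is equivalent to `>=1.20.2, ==1.20.*`. The sketch below illustrates that rule for plain numeric versions only (no pre-releases, post-releases, or epochs). `parse` and `is_compatible` are hypothetical helper names chosen for this illustration; they are not part of pip or any other library.

```python
def parse(version):
    """Split a plain numeric version such as '1.20.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(spec, candidate):
    """Return True if `candidate` satisfies `~=spec` (simplified PEP 440 rule).

    Assumption: both arguments are dotted numeric versions only.
    """
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("~= requires at least two release segments")
    cand = parse(candidate)
    # Every segment except the last one in the spec must match exactly.
    prefix = base[:-1]
    return cand >= base and cand[:len(prefix)] == prefix


# ~=1.20.2 accepts later 1.20.x patch releases but not 1.21.0.
print(is_compatible("1.20.2", "1.20.3"))
print(is_compatible("1.20.2", "1.21.0"))
# ~=2.6 accepts any later 2.x release but not 3.0.
print(is_compatible("2.6", "2.7"))
print(is_compatible("2.6", "3.0"))
```

For real dependency resolution, a maintained parser is the safer choice, since this sketch ignores everything beyond dotted integers.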

    Release notes

    Sourced from numpy's releases.

    v1.20.2

    NumPy 1.20.2 Release Notes

    NumPy 1.20.2 is a bugfix release containing several fixes merged to the main branch after the NumPy 1.20.1 release.

    Contributors

    A total of 7 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

    • Allan Haldane
    • Bas van Beek
    • Charles Harris
    • Christoph Gohlke
    • Mateusz Sokół +
    • Michael Lamparski
    • Sebastian Berg

    Pull requests merged

    A total of 20 pull requests were merged for this release.

    • #18382: MAINT: Update f2py from master.
    • #18459: BUG: diagflat could overflow on windows or 32-bit platforms
    • #18460: BUG: Fix refcount leak in f2py complex_double_from_pyobj.
    • #18461: BUG: Fix tiny memory leaks when like= overrides are used
    • #18462: BUG: Remove temporary change of descr/flags in VOID functions
    • #18469: BUG: Segfault in nditer buffer dealloc for Object arrays
    • #18485: BUG: Remove suspicious type casting
    • #18486: BUG: remove nonsensical comparison of pointer < 0
    • #18487: BUG: verify pointer against NULL before using it
    • #18488: BUG: check if PyArray_malloc succeeded
    • #18546: BUG: incorrect error fallthrough in nditer
    • #18559: CI: Backport CI fixes from main.
    • #18599: MAINT: Add annotations for __getitem__, __mul__ and...
    • #18611: BUG: NameError in numpy.distutils.fcompiler.compaq
    • #18612: BUG: Fixed where keyword for np.mean & np.var methods
    • #18617: CI: Update apt package list before Python install
    • #18636: MAINT: Ensure that re-exported sub-modules are properly annotated
    • #18638: BUG: Fix ma coercion list-of-ma-arrays if they do not cast to...
    • #18661: BUG: Fix small valgrind-found issues
    • #18671: BUG: Fix small issues found with pytest-leaks

    Checksums

    ... (truncated)

    Commits
    • b19ad5b REL: NumPy 1.20.2 release.
    • 7025ddc Merge pull request #18681 from charris/prepare-1.20.2-release
    • 5dc057c REL: Prepare for the NumPy 1.20.2 release.
    • e687165 Merge pull request #18671 from seberg/backport-small-pytest-leaks-fixes
    • afc861e BUG: Fix small issues found with pytest-leaks
    • c4fd82f Merge pull request #18661 from charris/backport-18651
    • b9edc25 BUG: Fix small valgrind-found issues (#18651)
    • 59c99e4 Merge pull request #18638 from charris/backport-18605
    • fb44ee2 Update numpy/ma/tests/test_core.py
    • 9719ac5 Apply suggestions from code review
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 0
  • Update tqdm requirement from ~=4.56.0 to ~=4.59.0

    Update tqdm requirement from ~=4.56.0 to ~=4.59.0

    Updates the requirements on tqdm to permit the latest version.

    Release notes

    Sourced from tqdm's releases.

    tqdm v4.59.0 stable

    • add tqdm.dask.TqdmCallback (#1079, #279 <- #278)
    • add asyncio.gather() (#1136)
    • add basic support for length_hint (#1068)
    • add & update tests
    • misc documentation updates (#1132)
      • update contributing guide
      • update URLs
      • bash completion: add missing --delay
    • misc code tidy
      • add [notebook] extra (#1135)
    dependencies 
    opened by dependabot-preview[bot] 0
  • Update pylint requirement from ~=2.6 to ~=2.7

    Update pylint requirement from ~=2.6 to ~=2.7

    Updates the requirements on pylint to permit the latest version.

    Changelog

    Sourced from pylint's changelog.

    What's New in Pylint 2.7.0?

    Release date: 2021-02-21

    • Introduce DeprecationMixin for reusable deprecation checks.

      Closes #4049

    • Fix false positive for builtin-not-iterating when map receives iterable

      Closes #4078

    • Python 3.6+ is now required.

    • Fix false positive for builtin-not-iterating when zip receives iterable

    • Add nan-comparison check for NaN comparisons

    • Bug fix for empty-comment message line number.

      Closes #4009

    • Only emit bad-reversed-sequence on dictionaries if below py3.8

      Closes #3940

    • Handle class decorators applied to function.

      Closes #3882

    • Add check for empty comments

    • Fix minor documentation issue in contribute.rst

    • Enums are now required to be named in UPPER_CASE by invalid-name.

      Close #3834

    • Add missing checks for deprecated functions.

    • Postponed evaluation of annotations is now recognized by default if the Python version is above 3.10

      Closes #3992

    • Fix column metadata for anomalous backslash lints

    • Drop support for Python 3.5

    • Add support for pep585 with postponed evaluation

    ... (truncated)
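The nan-comparison check mentioned in the changelog above exists because NaN never compares equal to anything, including itself. A minimal sketch of the pitfall, using only the standard library (the variable names are illustrative):

```python
import math

value = float("nan")

# An equality test against NaN is always False, even for a NaN value.
# This is the kind of comparison the check is meant to flag.
direct_comparison = (value == float("nan"))

# math.isnan is the reliable way to test for NaN.
reliable_check = math.isnan(value)

print(direct_comparison)
print(reliable_check)
```

Replacing the equality test with `math.isnan` (or the equivalent function in the numeric library in use) avoids a condition that can never be true.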

    Commits
    • 5e04ce7 Better documentation for the change in version during release
    • 41d7022 Upgrade the documentation about release
    • ac85223 Apply copyrite --contribution-threshold
    • 6e2de8e Upgrade version to 2.7.0 and fix astroid to 2.5.0
    • b9d9ca2 Edit highlights for 2.7.0 in whatsnew
    • ca11047 Fix link to isort documentation in doc/faq.rst
    • ee91075 Migrate from % syntax or bad format() syntax to fstring
    • 5bed07e Move from % string formatting syntax to f-string or .format()
    • a1e553d Add pyupgrade to the pre-commit configuration
    • 154718c Remove the # coding, since PEP3120 the default is UTF8
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 0
  • Update pytest requirement from ~=6.1 to ~=6.2

    Update pytest requirement from ~=6.1 to ~=6.2

    Updates the requirements on pytest to permit the latest version.

    Release notes

    Sourced from pytest's releases.

    6.2.0

    pytest 6.2.0 (2020-12-12)

    Breaking Changes

    • #7808: pytest now supports python3.6+ only.

    Deprecations

    • #7469: Directly constructing/calling the following classes/functions is now deprecated:

      • _pytest.cacheprovider.Cache
      • _pytest.cacheprovider.Cache.for_config()
      • _pytest.cacheprovider.Cache.clear_cache()
      • _pytest.cacheprovider.Cache.cache_dir_from_config()
      • _pytest.capture.CaptureFixture
      • _pytest.fixtures.FixtureRequest
      • _pytest.fixtures.SubRequest
      • _pytest.logging.LogCaptureFixture
      • _pytest.pytester.Pytester
      • _pytest.pytester.Testdir
      • _pytest.recwarn.WarningsRecorder
      • _pytest.recwarn.WarningsChecker
      • _pytest.tmpdir.TempPathFactory
      • _pytest.tmpdir.TempdirFactory

      These have always been considered private, but now issue a deprecation warning, which may become a hard error in pytest 7.0.0.

    • #7530: The --strict command-line option has been deprecated, use --strict-markers instead.

      We may reintroduce --strict in the future as an encompassing flag for all strictness-related options (--strict-markers and --strict-config at the moment; more might be introduced later).

    • #7988: The @pytest.yield_fixture decorator/function is now deprecated. Use pytest.fixture instead.

      yield_fixture has been an alias for fixture for a very long time, so can be search/replaced safely.

    Features

    • #5299: pytest now warns about unraisable exceptions and unhandled thread exceptions that occur in tests on Python>=3.8. See unraisable for more information.

    • #7425: New pytester fixture, which is identical to testdir but its methods return pathlib.Path when appropriate instead of py.path.local.

      This is part of the movement to use pathlib.Path objects internally, in order to remove the dependency to py in the future.

      Internally, the old Testdir <_pytest.pytester.Testdir> is now a thin wrapper around Pytester <_pytest.pytester.Pytester>, preserving the old interface.

    Changelog

    Sourced from pytest's changelog.

    Commits
    • e7073af Prepare release version 6.2.0
    • 683f29f Merge pull request #8129 from bluetech/docs-pygments-workaround
    • 0feeddf doc: temporary workaround for pytest-pygments lexing error
    • b478275 Merge pull request #8128 from bluetech/skip-reason-empty
    • 3302ff9 terminal: when the skip/xfail is empty, don't show it as "()"
    • 59bd0f6 Merge pull request #8126 from bluetech/tox-regen-pretend-scm2
    • 6298ff1 tox: use pip legacy resolver for regen job
    • d51ecbd Merge pull request #8125 from bluetech/tox-rm-pip-req
    • f237b07 tox: remove requires: pip>=20.3.1
    • 95e0e19 Merge pull request #8124 from bluetech/s0undt3ch-feature/skip-context-hook
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 0
Owner: Textpipe
Search for documents in a domain through Google. The objective is to extract metadata

MetaFinder - Metadata search through Google

Josué Encinar 85 Dec 16, 2022
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.

Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da

Jeff Johannsen 3 Nov 27, 2022
Pytorch-NLU, a Chinese text classification and sequence annotation toolkit. It supports multi-class and multi-label classification of long and short Chinese text, as well as sequence annotation tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.

null 186 Dec 24, 2022
A simple project to separate a mixed voice recording (2 clean voices) into 2 separate voices.

Speech Separation A simple project to separate a mixed voice recording (2 clean voices) into 2 separate voices. Result Example (Click to hear the voices): mix ||

vuthede 31 Oct 30, 2022
This project lets you automatically extract glossary words and their definitions from a given piece of text using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

Prakhar Mishra 28 May 25, 2021
Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.

flashgeotext ⚡ Extract and count countries and cities (+their synonyms) from text, like GeoText on steroids using FlashText, an Aho-Corasick impleme

Ben 57 Dec 16, 2022
Snips Python library to extract meaning from text

Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows you to extract structured information from sentences written in natur

Snips 3.7k Dec 30, 2022
An easy-to-use Python module that helps you to extract the BERT embeddings for a large text dataset (Bengali/English) efficiently.

An easy-to-use Python module that helps you to extract the BERT embeddings for a large text dataset (Bengali/English) efficiently.

Khalid Saifullah 37 Sep 5, 2022
This is an NLP-based project to extract the effective date of a contract from its text file.

Date-Extraction-from-Contracts This is an NLP-based project to extract the effective date of a contract from its text file. Problem statement This is

Sambhav Garg 1 Jan 26, 2022
Extract room types, doors, neighbouring rooms, room corners and bounding boxes, and generate a graph from the rplan dataset

Housegan-data-reader House-GAN++ (data-reader) Code and instructions for converting rplan dataset (raster images) to housegan++ data format. House-GAN

Sepid Hosseini 13 Nov 24, 2022
Extract Keywords from sentence or Replace keywords in sentences.

FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install

Vikash Singh 5.3k Jan 1, 2023
NLP tool to extract emotional phrase from tweets 🤩

Emotional phrase extractor Extract phrase in the given text that is used to express the sentiment. Capturing sentiment in language is important in the

Shahul ES 38 Oct 17, 2022
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.

Kashgari Overview | Performance | Installation | Documentation | Contributing We released the 2.0.0 version with TF2 Support. If you

Eliyar Eziz 2.3k Dec 29, 2022