Repository for Project Insight: NLP as a Service

Overview


Contents

  1. Introduction
  2. Installation
  3. Project Details
  4. License

Introduction

Project Insight provides NLP as a service, with a code base covering both the front-end GUI (Streamlit) and the back-end server (FastAPI), and uses transformer models for various downstream NLP tasks.

The downstream NLP tasks covered:

  • News Classification

  • Entity Recognition

  • Sentiment Analysis

  • Summarization

  • Information Extraction (To Do)

Users can select different models from a drop-down to run inference.

Users can also call the FastAPI back-end server directly for command-line inference, as sketched below.
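
A minimal sketch of such a call, assuming the Nginx gateway listens on localhost:8080 and each service exposes /api/v1/<service>/predict with the payload shape shown in the Swagger example in the comments further down (the model name and input text here are illustrative):

    # Hedged example: command-line inference against the classification service.
    # Host, port, route, and payload fields follow the issue logs further down;
    # the chosen model and text are illustrative assumptions.
    import requests

    payload = {
        "model": "distilbert",  # folder name of a model under the service
        "text": "Stocks rallied after the quarterly earnings report.",
        "query": "string",      # only used by services that take a query
    }
    response = requests.post(
        "http://localhost:8080/api/v1/classification/predict", json=payload
    )
    print(response.json())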

Features of the solution

  • Python Code Base: Built with FastAPI and Streamlit, so the complete code base is in Python.
  • Expandable: The backend is designed so that more transformer-based models can be added, and they become available in the front-end app automatically.
  • Micro-Services: The backend follows a microservices architecture, with a Dockerfile for each service and Nginx acting as a reverse proxy to each independently running service.
    • This makes it easy to update, maintain, start, and stop individual NLP services.

Installation

  • Clone the repo.
  • Run Docker Compose to spin up the FastAPI-based backend services.
  • Run the Streamlit app with the streamlit run command.

Setup and Documentation

  1. Download the models

    • Download the models from here
    • Save them in the specific model folders inside the src_fastapi folder.
  2. Running the backend service.

    • Go to the src_fastapi folder
    • Run the Docker Compose command
    $ cd src_fastapi
    src_fastapi:~$ sudo docker-compose up -d
  3. Running the frontend app.

    • Go to the src_streamlit folder
    • Run the app with the streamlit run command
    $ cd src_streamlit
    src_streamlit:~$ streamlit run NLPfiy.py
  4. Access the FastAPI documentation: since this is a microservice-based design, every NLP task has its own separate documentation, served at /api/v1/<service>/docs behind the Nginx proxy. A quick way to inspect a running service is sketched below.
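
For example, a hedged sketch of listing a service's models through its /info endpoint (the route and port follow the NER logs in the comments below; the choice of the ner service is illustrative):

    # Query a service's /info endpoint to see its models and descriptions.
    # Assumes the docker-compose + Nginx setup from the steps above.
    import requests

    info = requests.get("http://localhost:8080/api/v1/ner/info")
    print(info.json())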

Project Details

Demonstration

Project Insight Demo

Directory Details

  • Front End: The front-end code is in the src_streamlit folder, along with its Dockerfile and requirements.txt.

  • Back End: Back End code is in the src_fastapi folder.

    • This folder contains a directory for each task: classification, ner, summary, etc.
    • Each NLP task is implemented as a microservice, with its own FastAPI server, requirements, and Dockerfile, so the services can be independently maintained and managed.
    • Each NLP task has its own folder, and within it each trained model has a folder of its own. For example:
    - sentiment
        > app
            > api
                > distilbert
                    - model.bin
                    - network.py
                    - tokenizer files
                > roberta
                    - model.bin
                    - network.py
                    - tokenizer files
    
    • For each new model under a service, a new folder has to be added.

    • Each model folder needs the following files:

      • Model bin file
      • Tokenizer files
      • network.py defining the model class, if a customised model is used (see the sketch after this list)
    • config.json: This file contains the details of the models in the backend and the datasets they were trained on.
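
As a rough sketch of what such a network.py might contain, assuming a DistilBERT-based classifier (the class name, base checkpoint, and head dimensions are illustrative, not the repository's actual code):

    # Hypothetical network.py for a customised classification model.
    # All names and sizes here are assumptions for illustration.
    import torch
    from transformers import DistilBertModel

    class DistilBERTClass(torch.nn.Module):
        def __init__(self, num_classes: int = 4):
            super().__init__()
            self.encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")
            self.dropout = torch.nn.Dropout(0.3)
            self.classifier = torch.nn.Linear(768, num_classes)  # 768 = DistilBERT hidden size

        def forward(self, input_ids, attention_mask):
            # Classify from the representation at the [CLS] position.
            hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask)[0]
            return self.classifier(self.dropout(hidden[:, 0]))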

How to Add a new Model

  1. Fine-tune a transformer model for the specific task. You can leverage the transformers-tutorials repository.

  2. Save the model files and tokenizer files, and also create a network.py script if a customised training network was used.

  3. Create a directory within the NLP task folder, with the model name as the directory name, and save all the files in this directory.

  4. Update the config.json with the model details and dataset details.

  5. Update <service>pro.py with the correct imports and the condition under which the model is loaded. For example, for a new BERT model in the classification task, do the following:

    • Create a new directory named bert in classification/app/api/.

    • Update config.json with the following:

      "classification": {
          "model-1": {
              "name": "DistilBERT",
              "info": "This model is trained on the News Aggregator Dataset from the UC Irvine Machine Learning Repository. The news headlines are classified into 4 categories: **Business**, **Science and Technology**, **Entertainment**, **Health**. [News Dataset](https://archive.ics.uci.edu/ml/datasets/News+Aggregator)"
          },
          "model-2": {
              "name": "BERT",
              "info": "Model Info"
          }
      }
    • Update classificationpro.py with the following snippets:

      # Only needed if a customised class is used:
      from classification.bert import BertClass

      # In the section where the model is selected:
      if model == "bert":
          self.model = BertClass()
          # BertTokenizerFast comes from the transformers library
          self.tokenizer = BertTokenizerFast.from_pretrained(self.path)
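
After these changes, rebuilding and restarting the classification service with docker-compose should be enough for the new model to appear in the front-end drop-down, since the front end reads the available models from the backend's config.json.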

License

This project is licensed under the GPL-3.0 License. See the LICENSE.md file for details.

Comments
  • Failed to establish a new connection: [Errno 111] Connection refused


    Hello! Nice work, it looks very good in the video. Unfortunately, I was not able to make it run. I get this error:

        JSONDecodeError: Expecting value: line 1 column 1 (char 0)
        Traceback:
        File "c:\users\mbene\miniconda3\lib\site-packages\streamlit\script_runner.py", line 324, in _run_script
            exec(code, module.__dict__)
        File "C:\Users\mbene\Documents\GitHub\insight-master\insight-master\src_streamlit\NLPfiy.py", line 132, in <module>
            main()
        File "C:\Users\mbene\Documents\GitHub\insight-master\insight-master\src_streamlit\NLPfiy.py", line 118, in main
            model_details = apicall.model_list(service=service)
        File "C:\Users\mbene\Documents\GitHub\insight-master\insight-master\src_streamlit\NLPfiy.py", line 24, in model_list
            return json.loads(models.text)
        File "c:\users\mbene\miniconda3\lib\json\__init__.py", line 348, in loads
            return _default_decoder.decode(s)
        File "c:\users\mbene\miniconda3\lib\json\decoder.py", line 337, in decode
            obj, end = self.raw_decode(s, idx=_w(s, 0).end())
        File "c:\users\mbene\miniconda3\lib\json\decoder.py", line 355, in raw_decode
            raise JSONDecodeError("Expecting value", s, err.value) from None

    I get the same message for every service (only the URL endpoint changes).

    Things I have already tried:

    • Added streamlit to the Docker network
    • Took the firewall down and brought it back up
    • Created an exception in the firewall
    • Updated docker and docker-compose
    • The error is the same on Ubuntu Linux and Windows 10
    • Installed the requirements.txt
    • I can't access localhost:8000/docs
    • Ran docker run on the Dockerfile inside streamlit and added it to the network

    No success so far.

    I would appreciate your help. Thank you!

    Mauro

    bug 
    opened by mbenetti 17
  • JSONDecodeError: Expecting value: line 1 column 1 (char 0)


    Hello Abhishek, I tried today with the latest changes but it is still not working for me. Now I get "Internal Server Error" when using the FastAPI Swagger UI (see picture).

    [screenshot]

    If I request the GET info endpoint, I get the name and the description of the model (see next picture).

    [screenshot]

    The front end shows the same JSON decoder error as yesterday.

    [screenshot]

    I tried on Windows and Linux again.

    bug 
    opened by mbenetti 9
  • No model.bin in the repo


    Hi Abhi Mishra

    Amazing work. I really like the idea of building NLP services. I'm highly interested in augmenting the models with custom features for sentiment analysis. Would it be possible for you to upload the model files (pytorch_model.bin) so that different NLP pipelines for sentiment analysis can be tried?

    question 
    opened by AnilB87 2
  • summarypro.py missing trailing slash in path


    There's a missing / in the path on line 14:

            self.path = f"./app/api/{model}" 
    

    Should be:

            self.path = f"./app/api/{model}/"
    
    bug 
    opened by asehmi 1
  • Bump fastapi from 0.45.0 to 0.65.2 in /src_fastapi/classification


    Bumps fastapi from 0.45.0 to 0.65.2.

    Release notes

    Sourced from fastapi's releases.

    0.65.2

    Security fixes

    This change fixes a CSRF security vulnerability when using cookies for authentication in path operations with JSON payloads sent by browsers.

    In versions lower than 0.65.2, FastAPI would try to read the request payload as JSON even if the content-type header sent was not set to application/json or a compatible JSON media type (e.g. application/geo+json).

    So, a request with a content type of text/plain containing JSON data would be accepted and the JSON data would be extracted.

    But requests with content type text/plain are exempt from CORS preflights, for being considered Simple requests. So, the browser would execute them right away including cookies, and the text content could be a JSON string that would be parsed and accepted by the FastAPI application.

    See CVE-2021-32677 for more details.

    Thanks to Dima Boger for the security report! 🙇🔒


    0.65.1

    Security fixes

    0.65.0

    Breaking Changes - Upgrade

    • ⬆️ Upgrade Starlette to 0.14.2, including internal UJSONResponse migrated from Starlette. This includes several bug fixes and features from Starlette. PR #2335 by @​hanneskuettner.


    0.64.0

    Features

    ... (truncated)


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump fastapi from 0.45.0 to 0.65.2 in /src_fastapi/ner


    Bumps fastapi from 0.45.0 to 0.65.2.

    The release notes and Dependabot commands are identical to the previous fastapi PR above (security fix for CVE-2021-32677).

    dependencies 
    opened by dependabot[bot] 0
  • Bump pydantic from 1.5.1 to 1.6.2 in /src_fastapi/ner


    Bumps pydantic from 1.5.1 to 1.6.2.

    Release notes

    Sourced from pydantic's releases.

    v1.6.2 (2021-05-11)

    Security fix: Fix date and datetime parsing so passing either 'infinity' or float('inf') (or their negative values) does not cause an infinite loop, see security advisory CVE-2021-29510.

    v1.6.1 (2020-07-15)

    See Changelog.

    Thank you to pydantic's sponsors: @​matin, @​tiangolo, @​chdsbd, @​jorgecarleitao, and 1 anonymous sponsor for their kind support.

    changes:

    v1.6 (2020-07-11)

    See Changelog.

    Thank you to pydantic's sponsors: @​matin, @​tiangolo, @​chdsbd, @​jorgecarleitao, and 1 anonymous sponsor for their kind support.

    changes:

    • Modify validators for conlist and conset to not have always=True, #1682 by @​samuelcolvin
    • add port check to AnyUrl (can't exceed 65536) ports are 16 insigned bits: 0 <= port <= 2**16-1 src: rfc793 header format, #1654 by @​flapili
    • Document default regex anchoring semantics, #1648 by @​yurikhan
    • Use chain.from_iterable in class_validators.py. This is a faster and more idiomatic way of using itertools.chain. Instead of computing all the items in the iterable and storing them in memory, they are computed one-by-one and never stored as a huge list. This can save on both runtime and memory space, #1642 by @​cool-RR
    • Add conset(), analogous to conlist(), #1623 by @​patrickkwang
    • make pydantic errors (un)pickable, #1616 by @​PrettyWood
    • Allow custom encoding for dotenv files, #1615 by @​PrettyWood
    • Ensure SchemaExtraCallable is always defined to get type hints on BaseConfig, #1614 by @​PrettyWood
    • Update datetime parser to support negative timestamps, #1600 by @​mlbiche
    • Update mypy, remove AnyType alias for Type[Any], #1598 by @​samuelcolvin
    • Adjust handling of root validators so that errors are aggregated from all failing root validators, instead of reporting on only the first root validator to fail, #1586 by @​beezee
    • Make __modify_schema__ on Enums apply to the enum schema rather than fields that use the enum, #1581 by @​therefromhere
    • Fix behavior of __all__ key when used in conjunction with index keys in advanced include/exclude of fields that are sequences, #1579 by @​xspirus
    • Subclass validators do not run when referencing a List field defined in a parent class when each_item=True. Added an example to the docs illustrating this, #1566 by @​samueldeklund
    • change schema.field_class_to_schema to support frozenset in schema, #1557 by @​wangpeibao
    • Call __modify_schema__ only for the field schema, #1552 by @​PrettyWood
    • Move the assignment of field.validate_always in fields.py so the always parameter of validators work on inheritance, #1545 by @​dcHHH
    • Added support for UUID instantiation through 16 byte strings such as b'\x12\x34\x56\x78' * 4. This was done to support BINARY(16) columns in sqlalchemy, #1541 by @​shawnwall
    • Add a test assertion that default_factory can return a singleton, #1523 by @​therefromhere
    • Add NameEmail.__eq__ so duplicate NameEmail instances are evaluated as equal, #1514 by @​stephen-bunn
    • Add datamodel-code-generator link in pydantic document site, #1500 by @​koxudaxi
    • Added a "Discussion of Pydantic" section to the documentation, with a link to "Pydantic Introduction" video by Alexander Hultnér, #1499 by @​hultner
    • Avoid some side effects of default_factory by calling it only once if possible and by not setting a default value in the schema, #1491 by @​PrettyWood
    • Added docs about dumping dataclasses to JSON, #1487 by @​mikegrima
    • Make BaseModel.__signature__ class-only, so getting __signature__ from model instance will raise AttributeError, #1466 by @​MrMrRobat
    • include 'format': 'password' in the schema for secret types, #1424 by @​atheuz
    • Modify schema constraints on ConstrainedFloat so that exclusiveMinimum and minimum are not included in the schema if they are equal to -math.inf and exclusiveMaximum and maximum are not included if they are equal to math.inf, #1417 by @​vdwees
    • Squash internal __root__ dicts in .dict() (and, by extension, in .json()), #1414 by @​patrickkwang

    ... (truncated)

    Changelog

    Sourced from pydantic's changelog; the entries match the release notes above, with these additions:

    • Move const validator to post-validators so it validates the parsed value, #1410 by @​selimb
    • Fix model validation to handle nested literals, e.g. Literal['foo', Literal['bar']], #1364 by @​DBCerigo
    • Remove user_required = True from RedisDsn, neither user nor password are required, #1275 by @​samuelcolvin

    ... (truncated)


    dependencies 
    opened by dependabot[bot] 0
  • Bump pydantic from 1.5.1 to 1.6.2 in /src_fastapi/classification


    Bumps pydantic from 1.5.1 to 1.6.2.

    The release notes and Dependabot commands are identical to the previous pydantic PR above (security fix for CVE-2021-29510).

    dependencies 
    opened by dependabot[bot] 0
  • JSONDecodeError


    After I run "streamlit run NLPfiy.py" and get the webpage, a JSONDecodeError is raised when I choose a service. I don't know how to solve it; the error is below. Could you please help me?

        File "/home/test/ENTER/lib/python3.8/site-packages/streamlit/script_runner.py", line 332, in _run_script
            exec(code, module.__dict__)
        File "/home/test/insight-master/src_streamlit/NLPfiy.py", line 130, in <module>
            main()
        File "/home/test/insight-master/src_streamlit/NLPfiy.py", line 116, in main
            model_details = apicall.model_list(service=service)
        File "/home/test/insight-master/src_streamlit/NLPfiy.py", line 24, in model_list
            return json.loads(models.text)
        File "/home/test/ENTER/lib/python3.8/json/__init__.py", line 357, in loads
            return _default_decoder.decode(s)
        File "/home/test/ENTER/lib/python3.8/json/decoder.py", line 337, in decode
            obj, end = self.raw_decode(s, idx=_w(s, 0).end())
        File "/home/test/ENTER/lib/python3.8/json/decoder.py", line 355, in raw_decode
            raise JSONDecodeError("Expecting value", s, err.value) from None

    opened by James0128 0
  • Backend server giving errors on running


    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
        chunked=chunked)
      File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/usr/lib/python3.6/http/client.py", line 1264, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1310, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1259, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1038, in _send_output
        self.send(msg)
      File "/usr/lib/python3.6/http/client.py", line 976, in send
        self.connect()
      File "/usr/local/lib/python3.6/dist-packages/docker/transport/unixconn.py", line 43, in connect
        sock.connect(self.unix_socket)
    FileNotFoundError: [Errno 2] No such file or directory
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 449, in send
        timeout=timeout
      File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 638, in urlopen
        _stacktrace=sys.exc_info()[2])
      File "/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py", line 368, in increment
        raise six.reraise(type(error), error, _stacktrace)
      File "/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py", line 685, in reraise
        raise value.with_traceback(tb)
      File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
        chunked=chunked)
      File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/usr/lib/python3.6/http/client.py", line 1264, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1310, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1259, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.6/http/client.py", line 1038, in _send_output
        self.send(msg)
      File "/usr/lib/python3.6/http/client.py", line 976, in send
        self.connect()
      File "/usr/local/lib/python3.6/dist-packages/docker/transport/unixconn.py", line 43, in connect
        sock.connect(self.unix_socket)
    urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 205, in _retrieve_server_version
        return self.version(api_version=False)["ApiVersion"]
      File "/usr/local/lib/python3.6/dist-packages/docker/api/daemon.py", line 181, in version
        return self._result(self._get(url), json=True)
      File "/usr/local/lib/python3.6/dist-packages/docker/utils/decorators.py", line 46, in inner
        return f(self, *args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 228, in _get
        return self.get(url, **self._set_request_timeout(kwargs))
      File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 543, in get
        return self.request('GET', url, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 530, in request
        resp = self.send(prep, **send_kwargs)
      File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 643, in send
        r = adapter.send(request, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 498, in send
        raise ConnectionError(err, request=request)
    requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/bin/docker-compose", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/main.py", line 67, in main
        command()
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/main.py", line 123, in perform_command
        project = project_from_options('.', options)
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/command.py", line 69, in project_from_options
        environment_file=environment_file
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/command.py", line 132, in get_project
        verbose=verbose, version=api_version, context=context, environment=environment
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/docker_client.py", line 43, in get_client
        environment=environment, tls_version=get_tls_version(environment)
      File "/usr/local/lib/python3.6/dist-packages/compose/cli/docker_client.py", line 170, in docker_client
        client = APIClient(**kwargs)
      File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 188, in __init__
        self._version = self._retrieve_server_version()
      File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 213, in _retrieve_server_version
        'Error while fetching server API version: {0}'.format(e)
    docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
    
    opened by garain 0
  • Internal server error invoking NER predict end point


    Hi,

    I have all other models working except for NER, which uses spaCy. lexemes.bin seems to be missing. I've used spaCy before, but not with an unpackaged model like this appears to be. Any pointers welcomed.

    This is my trace:

    nginx_1           | 172.20.0.1 - - [26/Aug/2020:12:29:56 +0000] "GET /api/v1/ner/docs HTTP/1.1" 200 910 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36" "-"
    ner_1             | INFO:     172.20.0.6:56800 - "GET /api/v1/ner/openapi.json HTTP/1.0" 200 OK
    nginx_1           | 172.20.0.1 - - [26/Aug/2020:12:29:57 +0000] "GET /api/v1/ner/openapi.json HTTP/1.1" 200 2724 "http://localhost:8080/api/v1/ner/docs" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36" "-"
    nginx_1           | 172.20.0.1 - - [26/Aug/2020:12:30:15 +0000] "GET /api/v1/ner/info HTTP/1.1" 200 163 "http://localhost:8080/api/v1/ner/docs" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36" "-"
    ner_1             | INFO:     172.20.0.6:56802 - "GET /api/v1/ner/info HTTP/1.0" 200 OK
    ner_1             | INFO:     172.20.0.6:56810 - "POST /api/v1/ner/predict HTTP/1.0" 500 Internal Server Error
    ner_1             | ERROR:    Exception in ASGI application
    ner_1             | Traceback (most recent call last):
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
    ner_1             |     result = await app(self.scope, self.receive, self.send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    ner_1             |     return await self.app(scope, receive, send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/fastapi/applications.py", line 140, in __call__
    ner_1             |     await super().__call__(scope, receive, send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/applications.py", line 134, in __call__
    ner_1             |     await self.error_middleware(scope, receive, send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/middleware/errors.py", line 178, in __call__
    ner_1             |     raise exc from None
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/middleware/errors.py", line 156, in __call__
    ner_1             |     await self.app(scope, receive, _send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/exceptions.py", line 73, in __call__
    ner_1             |     raise exc from None
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/exceptions.py", line 62, in __call__
    ner_1             |     await self.app(scope, receive, sender)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 590, in __call__
    ner_1             |     await route(scope, receive, send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 208, in __call__
    ner_1             |     await self.app(scope, receive, send)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 41, in app
    ner_1             |     response = await func(request)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/fastapi/routing.py", line 127, in app
    ner_1             |     raw_response = await dependant.call(**values)
    ner_1             |   File "./app/api/ner.py", line 46, in named_entity_recognition
    ner_1             |     ner_process = NerProcessor(model=item.model.lower())
    ner_1             |   File "./app/api/nerpro.py", line 21, in __init__
    ner_1             |     self.model = spacy.load("./app/api/spacy/", disable=["tagger", "parser"])
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/__init__.py", line 21, in load
    ner_1             |     return util.load_model(name, **overrides)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 116, in load_model
    ner_1             |     return load_model_from_path(Path(name), **overrides)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 156, in load_model_from_path
    ner_1             |     return nlp.from_disk(model_path)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/language.py", line 647, in from_disk
    ner_1             |     util.from_disk(path, deserializers, exclude)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 511, in from_disk
    ner_1             |     reader(path / key)
    ner_1             |   File "/usr/local/lib/python3.7/site-packages/spacy/language.py", line 635, in <lambda>
    ner_1             |     self.vocab.from_disk(p) and _fix_pretrained_vectors_name(self))),
    ner_1             |   File "vocab.pyx", line 377, in spacy.vocab.Vocab.from_disk
    ner_1             |   File "/usr/local/lib/python3.7/pathlib.py", line 1203, in open
    ner_1             |     opener=self._opener)
    ner_1             |   File "/usr/local/lib/python3.7/pathlib.py", line 1058, in _opener
    ner_1             |     return self._accessor.open(self, flags, mode)
    ner_1             | FileNotFoundError: [Errno 2] No such file or directory: 'app/api/spacy/vocab/lexemes.bin'
    nginx_1           | 172.20.0.1 - - [26/Aug/2020:12:32:00 +0000] "POST /api/v1/ner/predict HTTP/1.1" 500 21 "http://localhost:8080/api/v1/ner/docs" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36" "-"
    

    This is the request body I used from Swagger:

    {
      "model": "spaCy",
      "text": "Dense, real valued vectors representing distributional similarity information are now a cornerstone of practical NLP. The most common way to train these vectors is the Word2vec family of algorithms. If you need to train a word2vec model, we recommend the implementation in the Python library Gensim.",
      "query": "string"
    }
    
    bug 
    opened by asehmi 2
  • Text extraction with tesseract


    Nice project, similar to one I started, but you got much further. One feature in mine you might like to steal is text extraction (via OCR) using tesseract: https://github.com/robmarkcole/text-insights-app

    enhancement 
    opened by robmarkcole 1
Owner
Abhishek Kumar Mishra
Eat, Sleep, Pray, and Code
  • An Operations Innovation Lead at IHS Markit during working hours.
  • Loves to read manga and cook new cuisines.
This repository has implementations of data augmentation for NLP for Japanese.

daaja This repository has implementations of data augmentation for NLP for Japanese: EDA: Easy Data Augmentation Techniques for Boosting Performance

Koga Kobayashi 60 Nov 11, 2022
NLPretext packages in a unique library all the text preprocessing functions you need to ease your NLP project.

NLPretext packages in a unique library all the text preprocessing functions you need to ease your NLP project.

Artefact 114 Dec 15, 2022
Learning the meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a BERT transformer model and surpass the state of the art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

Dirk Neuhäuser 4 Apr 6, 2022
Simple NLP based project without any use of AI

Simple NLP based project without any use of AI

Shripad Rao 1 Apr 26, 2022
Course project of NLP@UCAS

NaiveMT Prepare Clone this repository git clone git@github.com:Poeroz/NaiveMT.git Install Please first install PyTorch >= 1.5.0, then type the followi

Poeroz 2 Apr 24, 2022
This is a data-parallel project running on NLP tasks.

This is a data-parallel project running on NLP tasks.

null 2 Dec 12, 2021
Code for the project carried out fulfilling the course requirements for Fall 2021 NLP at NYU

Introduction Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization,

Sai Himal Allu 1 Apr 25, 2022
NLP project that works with news (NER, context generation, news trend analytics)

СоАвтор СоАвтор – платформа и открытый набор инструментов для редакций и журналистов-фрилансеров, который призван сделать процесс создания контента ма

null 38 Jan 4, 2023
T‘rex Park is a Youzan sponsored project. Offering Chinese NLP and image models pretrained from E-commerce datasets

T‘rex Park is a Youzan sponsored project. Offering Chinese NLP and image models pretrained from E-commerce datasets (product titles, images, comments, etc.).

null 55 Nov 22, 2022
This is an NLP-based project to extract the effective date of a contract from its text files.

Date-Extraction-from-Contracts This is an NLP-based project to extract the effective date of a contract from its text files. Problem statement This is

Sambhav Garg 1 Jan 26, 2022
A modular Karton Framework service that unpacks common packers like UPX and others using the Qiling Framework.

Unpacker Karton Service A modular Karton Framework service that unpacks common packers like UPX and others using the Qiling Framework. This project is

c3rb3ru5 45 Jan 5, 2023
Minimal GUI for accessing the Watson Text to Speech service.

Description Minimal graphical application for accessing the Watson Text to Speech service. Requirements Python 3 plus all dependencies listed in requi

Moritz Maxeiner 1 Oct 22, 2021
Azure Text-to-speech service for Home Assistant

Azure Text-to-speech service for Home Assistant The Azure text-to-speech platform uses online Azure Text-to-Speech cognitive service to read a text wi

Yassine Selmi 2 Aug 6, 2022
Label data using HuggingFace's transformers and automatically get a prediction service

Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu

Heartex 135 Dec 29, 2022
Python bindings to the dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser)

Frog for Python This is a Python binding to the Natural Language Processing suite Frog. Frog is intended for Dutch and performs part-of-speech tagging

Maarten van Gompel 46 Dec 14, 2022
💫 Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc

Explosion 24.9k Jan 2, 2023
NLP, before and after spaCy

textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig

Chartbeat Labs Projects 2k Jan 4, 2023
Multilingual text (NLP) processing toolkit

polyglot Polyglot is a natural language pipeline that supports massive multilingual applications. Free software: GPLv3 license Documentation: http://p

RAMI ALRFOU 2.1k Jan 7, 2023