1820 Repositories
Python nlp-framework Libraries
Multilingual text (NLP) processing toolkit
polyglot Polyglot is a natural language pipeline that supports massive multilingual applications. Free software: GPLv3 license Documentation: http://p
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Text vectorization tool to outperform TFIDF for classification tasks
WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth
NLP library designed for reproducible experimentation management
Welcome to the Transfer NLP library, a framework built on top of PyTorch to promote reproducible experimentation and Transfer Learning in NLP You can
🏖 Easy training and deployment of seq2seq models.
Headliner Headliner is a sequence modeling library that eases the training and in particular, the deployment of custom sequence models for both resear
Textpipe: clean and extract metadata from text
textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text into clean, readable text and extracting metadata
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification, including Word2Vec, BERT, and GPT2 language embeddings.
Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you
Scikit-learn style model finetuning for NLP
Scikit-learn style model finetuning for NLP Finetune is a library that allows users to leverage state-of-the-art pretrained NLP models for a wide vari
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Sockeye This package contains the Sockeye project, an open-source sequence-to-sequence framework for Neural Machine Translation based on Apache MXNet
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u
DELTA is a deep learning based natural language and speech processing platform.
DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p
Text preprocessing, representation and visualization from zero to hero.
Text preprocessing, representation and visualization from zero to hero. From zero to hero • Installation • Getting Started • Examples • API • FAQ • Co
NeMo: a toolkit for conversational AI
NVIDIA NeMo Introduction NeMo is a toolkit for creating Conversational AI applications. NeMo product page. Introductory video. The toolkit comes with
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
A Deep Learning NLP/NLU library by Intel® AI Lab Overview | Models | Installation | Examples | Documentation | Tutorials | Contributing NLP Architect
Super easy library for BERT based NLP models
Fast-Bert New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder) Suppor
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
Snips Python library to extract meaning from text
Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows you to extract structured information from sentences written in natur
A full spaCy pipeline and models for scientific/biomedical documents.
This repository contains custom pipes and models related to using spaCy for scientific documents. In particular, there is a custom tokenizer that adds
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy This package provides spaCy components and architectures to use tr
🦆 Contextually-keyed word vectors
sense2vec: Contextually-keyed word vectors sense2vec (Trask et al., 2015) is a nice twist on word2vec that lets you learn more interesting and detaile
✨Fast Coreference Resolution in spaCy with Neural Networks
✨ NeuralCoref 4.0: Coreference Resolution in spaCy with Neural Networks. NeuralCoref is a pipeline extension for spaCy 2.1+ which annotates and resolv
Extract keywords from sentences or replace keywords in sentences.
FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install
The Classical Language Toolkit
Notice: This Git branch (dev) contains the CLTK's upcoming major release (v. 1.0.0). See https://github.com/cltk/cltk/tree/master and https://docs.clt
Python implementation of TextRank for phrase extraction and summarization of text documents
PyTextRank PyTextRank is a Python implementation of TextRank as a spaCy pipeline extension, used to: extract the top-ranked phrases from text document
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
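fastNLP fastNLP is a lightweight natural language processing (NLP) toolkit whose goal is to implement NLP tasks quickly and to build complex models. fastNLP has the following features: a unified tabular data container that simplifies data preprocessing; built-in Loaders and Pipes for many datasets, which eliminate preprocessing code; a variety of convenient NLP tools, such as Embedd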
Module for automatic summarization of text documents and HTML pages.
Automatic text summarizer Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains sim
Official Stanford NLP Python Library for Many Human Languages
Stanza: A Python NLP Library for Many Human Languages The Stanford NLP Group's official Python NLP library. It contains support for running various ac
A natural language modeling framework based on PyTorch
Overview PyText is a deep-learning based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapi
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
ParlAI (pronounced “par-lay”) is a python framework for sharing, training and testing dialogue models, from open-domain chitchat, to task-oriented dia
🔍 Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...
Haystack is an end-to-end framework for Question Answering & Neural search that enables you to ... ... ask questions in natural language and find gran
State of the Art Natural Language Processing
Spark NLP: State of the Art Natural Language Processing Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provide
An open source library for deep learning end-to-end dialog systems and chatbots.
DeepPavlov is an open-source conversational AI library built on TensorFlow, Keras and PyTorch. DeepPavlov is designed for development of production re
NLP, before and after spaCy
textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
NLP made easy
GluonNLP: Your Choice of Deep Learning for NLP GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that help you l
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It
An open-source NLP research library, built on PyTorch.
An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.data: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vecto
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. IMPORTANT: (30.08.2020) We moved our models
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Rasa Open Source Rasa is an open source machine learning framework to automate text- and voice-based conversations. With Rasa, you can build contextual
NLTK Source
Natural Language Toolkit (NLTK) NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0 🤗 Transformers provides thousands of pretrained models to perform tasks o
💫 Industrial-strength Natural Language Processing (NLP) in Python
spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc
Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Dash Dash is the most downloaded, trusted Python framework for building ML & data science web apps. Built on top of Plotly.js, React and Flask, Dash t
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.
Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this
Machine learning framework for both deep learning and traditional algorithms
NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for
MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.
Documentation | FAQ | Release Notes | Roadmap | MACE Model Zoo | Demo | Join Us | Chinese Mobile AI Compute Engine (or MACE for short) is a deep learning i
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
Overview | Tutorials | Examples | Installation | FAQ | How to Cite Welcome to ktrain News and Announcements 2020-11-08: ktrain v0.25.x is released and
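PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)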
English | Simplified Chinese Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open
A flexible framework of neural networks for deep learning
Chainer: A deep learning framework Website | Docs | Install Guide | Tutorials (ja) | Examples (Official, External) | Concepts | ChainerX Forum (en, ja
🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
Thinc: A refreshing functional take on deep learning, compatible with your favorite libraries From the makers of spaCy, Prodigy and FastAPI Thinc is a
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a
An Open Source Machine Learning Framework for Everyone
Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a
Serverless Python
Zappa - Serverless Python About Installation and Configuration Running the Initial Setup / Settings Basic Usage Initial Deployments Updates Rollback S
Flask-Potion is a RESTful API framework for Flask and SQLAlchemy, Peewee or MongoEngine
Flask-Potion Description Flask-Potion is a powerful Flask extension for building RESTful JSON APIs. Potion features include validation, model resource
Fully featured framework for fast, easy and documented API development with Flask
Flask RestPlus IMPORTANT NOTICE: This project has been forked to Flask-RESTX and will be maintained by the python-restx organization. Flask-RESTPlu
Swagger/OpenAPI First framework for Python on top of Flask with automatic endpoint validation & OAuth2 support
Connexion Connexion is a framework that automagically handles HTTP requests based on OpenAPI Specification (formerly known as Swagger Spec) of your AP
A dynamic FastAPI router that automatically creates CRUD routes for your models
⚡ Create CRUD routes with lightning speed ⚡ A dynamic FastAPI router that automatically creates CRUD routes for your models Documentation: https://fast
Python-based continuous integration testing framework; your pull requests are more than welcome!
Buildbot The Continuous Integration Framework Buildbot is based on original work from Brian Warner, and currently maintained by the Botherders. Visit
Sixpack is a language-agnostic a/b-testing framework
Sixpack Sixpack is a framework to enable A/B testing across multiple programming languages. It does this by exposing a simple API for client libraries
The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
The pytest framework makes it easy to write small tests, yet scales to support complex functional testing for applications and libraries. An example o
Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
Kivy Innovative user interfaces made easy. Kivy is an open source, cross-platform Python framework for the development of applications that make use o
🏆 A ranked list of awesome python libraries for web development. Updated weekly.
Best-of Web Development with Python 🏆 A ranked list of awesome python libraries for web development. Updated weekly. This curated list contains 540 a
💨 Fast, async-ready, OpenAPI, type-hints-based framework for building APIs
Fast to learn, fast to code, fast to run Django Ninja - Fast Django REST Framework Django Ninja is a web framework for building APIs with Django and P
PEP-484 stubs for django-rest-framework
pep484 stubs for Django REST framework Mypy stubs for DRF 3.12.x. Supports Python 3.6, 3.7, 3.8 and 3.9. Installation pip install djangorestframework-
Cookiecutter Django is a framework for jumpstarting production-ready Django projects quickly.
Cookiecutter Django Powered by Cookiecutter, Cookiecutter Django is a framework for jumpstarting production-ready Django projects quickly. Documentati
django-reversion is an extension to the Django web framework that provides version control for model instances.
django-reversion django-reversion is an extension to the Django web framework that provides version control for model instances. Requirements Python 3
This is a repository for collecting global custom management extensions for the Django Framework.
Django Extensions Django Extensions is a collection of custom extensions for the Django Framework. Getting Started The easiest way to figure out what
FastAPI framework plugins
Plugins for FastAPI framework, high performance, easy to learn, fast to code, ready for production fastapi-plugins FastAPI framework plugins Cache Mem
Django Smuggler is a pluggable application for Django Web Framework that helps you to import/export fixtures via the automatically-generated administration interface.
Django Smuggler Django Smuggler is a pluggable application for Django Web Framework to easily dump/load fixtures via the automatically-generated admin
Simple and extensible administrative interface framework for Flask
Flask-Admin The project was recently moved into its own organization. Please update your references to git@github.com:flask-admin/flask-admin.git. Int
Library to scrape and clean web pages to create massive datasets.
lazynlp A straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets. Using this libr
Web crawling framework based on asyncio.
Web crawling framework for everyone. Written with asyncio, uvloop and aiohttp. Requirements Python3.5+ Installation pip install gain pip install uvloo
Transistor, a Python web scraping framework for intelligent use cases.
Web data collection and storage for intelligent use cases. transistor About The web is full of data. Transistor is a web scraping framework for collec
A high-level distributed crawling framework.
Cola: high-level distributed crawling framework Overview Cola is a high-level distributed crawling framework, used to crawl pages and extract structur
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)
trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow
Async Python 3.6+ web scraping micro-framework based on asyncio
Ruia 🕸️ Async Python 3.6+ web scraping micro-framework based on asyncio. ⚡ Write less, run faster. Overview Ruia is an async web scraping micro-frame
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Gerapy Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Documentation Documentation
Scrapy, a fast high-level web crawling & scraping framework for Python.
Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag
Facebook Open Graph API implementation using the Django web framework in Python
Django Facebook by Thierry Schellenbach (mellowmorning.com) Status Django and Facebook are both rapidly changing at the moment. Meanwhile, I'm caught
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
Best-of Machine Learning with Python 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly. This curated list contains 840 awe
Automated Security Testing For REST APIs
Astra REST API penetration testing is complex due to continuous changes in existing APIs and newly added APIs. Astra can be used by security engineers
Load and performance benchmark tool
Yandex Tank Yandextank has been moved to Python 3. Latest stable release for Python 2 here. Yandex.Tank is an extensible open source load testing tool
Parameterized testing with any Python test framework
Parameterized testing with any Python test framework Parameterized testing in Python sucks. parameterized fixes that. For everything. Parameterized te
A complete test automation tool
Golem - Test Automation Golem is a test framework and a complete tool for browser automation. Tests can be written with code in Python, codeless using
A Modular Penetration Testing Framework
fsociety A Modular Penetration Testing Framework Install pip install fsociety Update pip install --upgrade fsociety Usage usage: fsociety [-h] [-i] [-
A framework-agnostic library for testing ASGI web applications
async-asgi-testclient Async ASGI TestClient is a library for testing web applications that implements ASGI specification (version 2 and 3). The motiva
a socket mock framework - for all kinds of socket animals, web-clients included
mocket /mɔˈkɛt/ A socket mock framework for all kinds of socket animals, web-clients included - with gevent/asyncio/SSL support ...and then MicroPytho
Web testing library for Robot Framework
SeleniumLibrary Contents Introduction Keyword Documentation Installation Browser drivers Usage Extending SeleniumLibrary Community Versions History In
Language-agnostic HTTP API Testing Tool
Dredd — HTTP API Testing Framework Dredd is a language-agnostic command-line tool for validating API description document against backend implementati
splinter - Python test framework for web applications
splinter - Python tool for testing web applications splinter is an open source tool for testing web applications using Python. It lets you automate br
A browser automation framework and ecosystem.
Selenium Selenium is an umbrella project encapsulating a variety of tools and libraries enabling web browser automation. Selenium specifically provide
GraphQL framework for Python
Graphene 💬 Join the community on Slack We are looking for contributors! Please check the ROADMAP to see how you can help ❤️ The below readme is the d
NLP Core Library and Model Zoo based on PaddlePaddle 2.0
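PaddleNLP 2.0 offers a rich model library, concise and easy-to-use APIs, and high-performance distributed training. It aims to improve text modeling efficiency for PaddlePaddle developers and to provide best practices for NLP based on PaddlePaddle 2.0.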
🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.
Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr