1237 Repositories
Python Flask-CRUD-NLP Libraries
PyTorch implementation of the end-to-end coreference resolution model with different higher-order inference methods.
End-to-End Coreference Resolution with Different Higher-Order Inference Methods This repository contains the implementation of the paper: Revealing th
SimCSE: Simple Contrastive Learning of Sentence Embeddings
SimCSE: Simple Contrastive Learning of Sentence Embeddings This repository contains the code and pre-trained models for our paper SimCSE: Simple Contr
Find and notify users in your Active Directory with weak passwords
Crack-O-Matic Find and notify users in your Active Directory with weak passwords. Features: Linux-based Flask-based web app Hashcat or John cracker Au
SpikeX - SpaCy Pipes for Knowledge Extraction
SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`
Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.
GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021
skweak: A software toolkit for weak supervision applied to NLP tasks
Labelled data remains a scarce resource in many practical NLP scenarios. This is especially the case when working with resource-poor languages (or text domains), or when using task-specific labels without pre-existing datasets. The only available option is often to collect and annotate texts by hand, which is expensive and time-consuming.
Code for producing Japanese GPT-2 provided by rinna Co., Ltd.
japanese-gpt2 This repository provides the code for training Japanese GPT-2 models. This code has been used for producing japanese-gpt2-medium release
FedNLP: A Benchmarking Framework for Federated Learning in Natural Language Processing
FedNLP is a research-oriented benchmarking framework for advancing federated learning (FL) in natural language processing (NLP). It uses FedML repository as the git submodule. In other words, FedNLP only focuses on adavanced models and dataset, while FedML supports various federated optimizers (e.g., FedAvg) and platforms (Distributed Computing, IoT/Mobile, Standalone).
nlp基础任务
NLP算法 说明 此算法仓库包括文本分类、序列标注、关系抽取、文本匹配、文本相似度匹配这五个主流NLP任务,涉及到22个相关的模型算法。 框架结构 文件结构 all_models ├── Base_line │ ├── __init__.py │ ├── base_data_process.
Tool which allow you to detect and translate text.
Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr
Awesome-NLP-Research (ANLP)
Awesome-NLP-Research (ANLP)
A fast and easy implementation of Transformer with PyTorch.
FasySeq FasySeq is a shorthand as a Fast and easy sequential modeling toolkit. It aims to provide a seq2seq model to researchers and developers, which
中文空间语义理解评测
中文空间语义理解评测 最新消息 2021-04-10 🚩 排行榜发布: Leaderboard 2021-04-05 基线系统发布: SpaCE2021-Baseline 2021-04-05 开放数据提交: 提交结果 2021-04-01 开放报名: 我要报名 2021-04-01 数据集 pa
A light-weight, versatile XYZ tile server, built with Flask and Rasterio :earth_africa:
Terracotta is a pure Python tile server that runs as a WSGI app on a dedicated webserver or as a serverless app on AWS Lambda. It is built on a modern
Python bindings to libpostal for fast international address parsing/normalization
pypostal These are the official Python bindings to https://github.com/openvenues/libpostal, a fast statistical parser/normalizer for street addresses
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Apache Superset is a Data Visualization and Data Exploration Platform
Apache Superset is a Data Visualization and Data Exploration Platform
Glauth management ui created with python/flask
glauth-ui Glauth-UI is a small flask web app i created to manage the minimal glauth ldap server. I created this as i wanted to use glauth for authenti
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".
Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in
Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"
Storium GPT-2 Models This is the official repository for the GPT-2 models described in the EMNLP 2020 paper [STORIUM: A Dataset and Evaluation Platfor
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language This repository contains UA-GEC data and an accompanying Python lib
FlaskBB is a Forum Software written in Python using the micro framework Flask.
FlaskBB is a Forum Software written in Python using the micro framework Flask.
Bittorrent software for cats
NyaaV2 Setting up for development This project uses Python 3.7. There are features used that do not exist in 3.6, so make sure to use Python 3.7. This
Learning Dense Representations of Phrases at Scale (Lee et al., 2020)
DensePhrases DensePhrases provides answers to your natural language questions from the entire Wikipedia in real-time. While it efficiently searches th
An OpenSource crowd-sourced cooking recipes website
An OpenSource crowd-sourced cooking recipes website
A web-app helping to create strong passwords that are easy to remember.
This is a simple Web-App that demonstrates a method of creating strong passwords that are still easy to remember. It also provides time estimates how long it would take an attacker to crack a password using the zxcvbn library developed by Dropbox.
WebSocket support for Flask
flask-sock WebSocket support for Flask Installation pip install flask-sock Example from flask import Flask, render_template from flask_sock import Soc
Tracking Progress in Natural Language Processing
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
News SRU++, a new SRU variant, is released. [tech report] [blog] The experimental code and SRU++ implementation are available on the dev branch which
Library for faster pinned CPU - GPU transfer in Pytorch
SpeedTorch Faster pinned CPU tensor - GPU Pytorch variabe transfer and GPU tensor - GPU Pytorch variable transfer, in certain cases. Update 9-29-1
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Release of SPLASH: Dataset for semantic parse correction with natural language feedback in the context of text-to-SQL parsing
SPLASH: Semantic Parsing with Language Assistance from Humans SPLASH is dataset for the task of semantic parse correction with natural language feedba
🏆 A ranked list of awesome python libraries for web development. Updated weekly.
Best-of Web Development with Python 🏆 A ranked list of awesome python libraries for web development. Updated weekly. This curated list contains 540 a
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x using fastT5.
Reduce T5 model size by 3X and increase the inference speed up to 5X. Install Usage Details Functionalities Benchmarks Onnx model Quantized onnx model
Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021.
capbot-siic Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021. Problem Inspiration A plethora
Package for controllable summarization
summarizers summarizers is package for controllable summarization based CTRLsum. currently, we only supports English. It doesn't work in other languag
a Deep Learning Framework for Text
DeLFT DeLFT (Deep Learning Framework for Text) is a Keras and TensorFlow framework for text processing, focusing on sequence labelling (e.g. named ent
Natural language detection
Detect the language of text. What’s so cool about franc? franc can support more languages(†) than any other library franc is packaged with support for
A set of workflows for corpus building through OCR, post-correction and normalisation
PICCL: Philosophical Integrator of Computational and Corpus Libraries PICCL offers a workflow for corpus building and builds on a variety of tools. Th
Tool which allow you to detect and translate text.
Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr
Lightning Fast Language Prediction 🚀
whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and
Deep Learning Chinese Word Segment
引用 本项目模型BiLSTM+CRF参考论文:http://www.aclweb.org/anthology/N16-1030 ,IDCNN+CRF参考论文:https://arxiv.org/abs/1702.02098 构建 安装好bazel代码构建工具,安装好tensorflow(目前本项目需
CUAD
Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra
Contract Understanding Atticus Dataset
Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra
本项目是作者们根据个人面试和经验总结出的自然语言处理(NLP)面试准备的学习笔记与资料,该资料目前包含 自然语言处理各领域的 面试题积累。
【关于 NLP】那些你不知道的事 作者:杨夕、芙蕖、李玲、陈海顺、twilight、LeoLRH、JimmyDU、艾春辉、张永泰、金金金 介绍 本项目是作者们根据个人面试和经验总结出的自然语言处理(NLP)面试准备的学习笔记与资料,该资料目前包含 自然语言处理各领域的 面试题积累。 目录架构 一、【
Django project starter on steroids: quickly create a Django app AND generate source code for data models + REST/GraphQL APIs (the generated code is auto-linted and has 100% test coverage).
Create Django App 💛 We're a Django project starter on steroids! One-line command to create a Django app with all the dependencies auto-installed AND
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. Flair is: A powerful NLP library. Flair allo
A library for debugging/inspecting machine learning classifiers and explaining their predictions
ELI5 ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions. It provides support for the following m
NLP made easy
GluonNLP: Your Choice of Deep Learning for NLP GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that helps you l
📝 Wrapper library for text generation / language models at char and word level with RNN in TensorFlow
tensorlm Generate Shakespeare poems with 4 lines of code. Installation tensorlm is written in / for Python 3.4+ and TensorFlow 1.1+ pip3 install tenso
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.datasets: The raw text iterators for common NLP datasets torchtext.data: Some basic NLP building bloc
Qwerkey is a social media platform for connecting and learning more about mechanical keyboards built on React and Redux in the frontend and Flask in the backend on top of a PostgreSQL database.
Flask React Project This is the backend for the Flask React project. Getting started Clone this repository (only this branch) git clone https://github
DaCy: The State of the Art Danish NLP pipeline using SpaCy
DaCy: A SpaCy NLP Pipeline for Danish DaCy is a Danish preprocessing pipeline trained in SpaCy. At the time of writing it has achieved State-of-the-Ar
GitHub repository for the SecureDrop whistleblower platform. Do not submit tips here!
SecureDrop is an open-source whistleblower submission system that media organizations can use to securely accept documents from, and communicate with
Indico - A feature-rich event management system, made @ CERN, the place where the Web was born.
Indico Indico is: 🗓 a general-purpose event management tool; 🌍 fully web-based; 🧩 feature-rich but also extensible through the use of plugins; ⚖️ O
Easy HTML form without PHP or JavaScript
This repository is no longer active. If you're looking for a simple and powerful hosted form API, please check out https://formspree.io. If you are in
Invenio digital library framework
Invenio Framework v3 Open Source framework for large-scale digital repositories. Invenio Framework is like a Swiss Army knife of battle-tested, safe a
Indico - A feature-rich event management system, made @ CERN, the place where the Web was born.
Indico Indico is: 🗓 a general-purpose event management tool; 🌍 fully web-based; 🧩 feature-rich but also extensible through the use of plugins; ⚖️ O
A simple shared budget manager web application
I hate money I hate money is a web application made to ease shared budget management. It keeps track of who bought what, when, and for whom; and helps
ASGI middleware for authentication, rate limiting, and building CRUD endpoints.
Piccolo API Utilities for easily exposing Piccolo models as REST endpoints in ASGI apps, such as Starlette and FastAPI. Includes a bunch of useful ASG
📦 Autowiring dependency injection container for python 3
Lagom - Dependency injection container What Lagom is a dependency injection container designed to give you "just enough" help with building your depen
fastapi-crud-sync
Developing and Testing an API with FastAPI and Pytest Syncronous Example Want to use this project? Build the images and run the containers: $ docker-c
API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.
This open source project serves two purposes. Collection and evaluation of a Question Answering dataset to improve existing QA/search methods - COVID-
A set of pytest fixtures to test Flask applications
pytest-flask An extension of pytest test runner which provides a set of useful tools to simplify testing and development of the Flask extensions and a
Whoosh indexing capabilities for Flask-SQLAlchemy, Python 3 compatibility fork.
Flask-WhooshAlchemy3 Whoosh indexing capabilities for Flask-SQLAlchemy, Python 3 compatibility fork. Performance improvements and suggestions are read
Full text search for flask.
flask-msearch Installation To install flask-msearch: pip install flask-msearch # when MSEARCH_BACKEND = "whoosh" pip install whoosh blinker # when MSE
HTTP security headers for Flask
Talisman: HTTP security headers for Flask Talisman is a small Flask extension that handles setting HTTP headers that can help protect against a few co
SeaSurf is a Flask extension for preventing cross-site request forgery (CSRF).
Flask-SeaSurf SeaSurf is a Flask extension for preventing cross-site request forgery (CSRF). CSRF vulnerabilities have been found in large and popular
flask extension for integration with the awesome pydantic package
Flask-Pydantic Flask extension for integration of the awesome pydantic package with Flask. Installation python3 -m pip install Flask-Pydantic Basics v
Doing the OAuth dance with style using Flask, requests, and oauthlib.
Flask-Dance Doing the OAuth dance with style using Flask, requests, and oauthlib. Currently, only OAuth consumers are supported, but this project coul
Flask JWT Router is a Python library that adds authorised routes to a Flask app.
Read the docs: Flask-JWT-Router Flask JWT Router Flask JWT Router is a Python library that adds authorised routes to a Flask app. Both basic & Google'
Simple Login - Login Extension for Flask - maintainer @cuducos
Login Extension for Flask The simplest way to add login to flask! Top Contributors Add yourself, send a PR! How it works First install it from PyPI. p
SqlAlchemy Flask-Restful Swagger Json:API OpenAPI
SAFRS: Python OpenAPI & JSON:API Framework Overview Installation JSON:API Interface Resource Objects Relationships Methods Custom Methods Class Method
Flask-Rebar combines flask, marshmallow, and swagger for robust REST services.
Flask-Rebar Flask-Rebar combines flask, marshmallow, and swagger for robust REST services. Features Request and Response Validation - Flask-Rebar reli
Restful API framework wrapped around MongoEngine
Flask-MongoRest A Restful API framework wrapped around MongoEngine. Setup from flask import Flask from flask_mongoengine import MongoEngine from flask
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag
Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
NeuroNER NeuroNER is a program that performs named-entity recognition (NER). Website: neuroner.com. This page gives step-by-step instructions to insta
Yet another Python binding for fastText
pyfasttext Warning! pyfasttext is no longer maintained: use the official Python binding from the fastText repository: https://github.com/facebookresea
Multilingual text (NLP) processing toolkit
polyglot Polyglot is a natural language pipeline that supports massive multilingual applications. Free software: GPLv3 license Documentation: http://p
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Text vectorization tool to outperform TFIDF for classification tasks
WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth
NLP library designed for reproducible experimentation management
Welcome to the Transfer NLP library, a framework built on top of PyTorch to promote reproducible experimentation and Transfer Learning in NLP You can
🏖 Easy training and deployment of seq2seq models.
Headliner Headliner is a sequence modeling library that eases the training and in particular, the deployment of custom sequence models for both resear
Textpipe: clean and extract metadata from text
textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you
Scikit-learn style model finetuning for NLP
Scikit-learn style model finetuning for NLP Finetune is a library that allows users to leverage state-of-the-art pretrained NLP models for a wide vari
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u
DELTA is a deep learning based natural language and speech processing platform.
DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p
Text preprocessing, representation and visualization from zero to hero.
Text preprocessing, representation and visualization from zero to hero. From zero to hero • Installation • Getting Started • Examples • API • FAQ • Co
NeMo: a toolkit for conversational AI
NVIDIA NeMo Introduction NeMo is a toolkit for creating Conversational AI applications. NeMo product page. Introductory video. The toolkit comes with
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
A Deep Learning NLP/NLU library by Intel® AI Lab Overview | Models | Installation | Examples | Documentation | Tutorials | Contributing NLP Architect
Super easy library for BERT based NLP models
Fast-Bert New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder) Suppor
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
Snips Python library to extract meaning from text
Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows to extract structured information from sentences written in natur
A full spaCy pipeline and models for scientific/biomedical documents.
This repository contains custom pipes and models related to using spaCy for scientific documents. In particular, there is a custom tokenizer that adds