1140 Python Text-expander Libraries

A tool for extracting plain text from Wikipedia dumps

WikiExtractor WikiExtractor.py is a Python script that extracts and cleans text from a Wikipedia database dump. The tool is written in Python and requ

3.2k Dec 31, 2022

Convert HTML to Markdown-formatted text.

html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to

1.3k Dec 31, 2022

NLP Core Library and Model Zoo based on PaddlePaddle 2.0

PaddleNLP 2.0拥有丰富的模型库、简洁易用的API与高性能的分布式训练的能力，旨在为飞桨开发者提升文本建模效率，并提供基于PaddlePaddle 2.0的NLP领域最佳实践。

6.9k Jan 1, 2023

Transformer-based Text Auto-encoder (T-TA) using TensorFlow 2.

T-TA (Transformer-based Text Auto-encoder) This repository contains codes for Transformer-based Text Auto-encoder (T-TA, paper: Fast and Accurate Deep

13 Dec 13, 2022

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

DALL-E in Pytorch Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch. It will also contain CLIP for ranking the ge

5k Jan 4, 2023

Open-AI's DALL-E for large scale training in mesh-tensorflow.

DALL-E in Mesh-Tensorflow [WIP] Open-AI's DALL-E in Mesh-Tensorflow. If this is similarly efficient to GPT-Neo, this repo should be able to train mode

432 Dec 16, 2022

Chinese NewsTitle Generation Project by GPT2.带有超级详细注释的中文GPT2新闻标题生成项目。

GPT2-NewsTitle 带有超详细注释的GPT2新闻标题生成项目 UpDate 01.02.2021 从网上收集数据，将清华新闻数据、搜狗新闻数据等新闻数据集，以及开源的一些摘要数据进行整理清洗，构建一个较完善的中文摘要数据集。数据集清洗时，仅进行了简单地规则清洗。

785 Dec 29, 2022

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

Deep Daze mist over green hills shattered plates on the grass cosmic love and attention a time traveler in the crowd life during the plague meditative

4.4k Jan 3, 2023

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN

artificial intelligence cosmic love and attention fire in the sky a pyramid made of ice a lonely house in the woods marriage in the mountains lantern

2.3k Jan 1, 2023

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS)

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS) Yoonhyung Lee, Joongbo Shin, Kyomin Jung Abstract: Although early

147 Dec 5, 2022

On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

76 Sep 5, 2022

Build Text Rerankers with Deep Language Models

Reranker is a lightweight, effective and efficient package for training and deploying deep languge model reranker in information retrieval (IR), question answering (QA) and many other natural language processing (NLP) pipelines. The training procedure follows our ECIR paper Rethink Training of BERT Rerankers in Multi-Stage Retrieval Pipeline using a localized constrastive esimation (LCE) loss.

140 Dec 6, 2022

Converts text into a PDF of handwritten notes

Text To Handwritten Notes Converts text into a PDF of handwritten notes Explore the docs » · Report Bug · Request Feature · Steps: $ git clone https:/

63 Oct 9, 2022

Automatically erase objects in the video, such as logo, text, etc.

Video-Auto-Wipe Read English Introduction：Here 本人不定期的基于生成技术制作一些好玩有趣的算法模型，这次带来的作品是“视频擦除”方向的应用模型，它实现的功能是自动感知到视频中我们不想看见的部分（譬如广告、水印、字幕、图标等等）然后进行擦除。由于图标擦

141 Dec 26, 2022

Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search This is an implementation for our paper Contextual Non-Loca

50 Dec 3, 2022

Spectrum is an AI that uses machine learning to generate Rap song lyrics

Spectrum Spectrum is an AI that uses deep learning to generate rap song lyrics. View Demo Report Bug Request Feature Open In Colab About The Project S

39 Dec 16, 2022

Official implementation of AAAI-21 paper "Label Confusion Learning to Enhance Text Classification Models"

Description: This is the official implementation of our AAAI-21 accepted paper Label Confusion Learning to Enhance Text Classification Models. The str

101 Nov 25, 2022

Discord Bot for server hosts, devs, and admins. Analyzes timings reports & uploads text files to hastebin. Developed by https://birdflop.com.

"Botflop" Click here to invite Botflop to your server. Current abilities Analyze timings reports Paste a timings report to review an in-depth descript

76 Dec 31, 2022

CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

CCF BDCI 2020 房产行业聊天问答匹配 A榜47/2985 赛题描述详见：https://www.datafountain.cn/competitions/474 文件说明 data: 存放训练数据和测试数据以及预处理代码 model_bert.py: 网络模型结构定义 adv_train

40 Sep 28, 2022

A simple Python module for parsing human names into their individual components

Name Parser A simple Python (3.2+ & 2.6+) module for parsing human names into their individual components. hn.title hn.first hn.middle hn.last hn.suff

574 Dec 20, 2022

Python library for creating PEG parsers

PyParsing -- A Python Parsing Module Introduction The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the t

1.7k Dec 27, 2022

Paranoid text spacing in Python

pangu.py Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width charact

194 Nov 19, 2022

Fixes mojibake and other glitches in Unicode text, after the fact.

ftfy: fixes text for you print(fix_encoding("(à¸‡'âŒ£')à¸‡")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li

3.4k Jan 8, 2023

Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.

MonkeyLearn API for Python Official Python client for the MonkeyLearn API. Build and run machine learning models for language processing from your Pyt

157 Nov 22, 2022

Module for automatic summarization of text documents and HTML pages.

Automatic text summarizer Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains sim

3k Jan 3, 2023

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li

12.3k Jan 1, 2023

Convert HTML to Markdown-formatted text.

html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to

1.3k Dec 31, 2022

Fast topic modeling platform

The state-of-the-art platform for topic modeling. Full Documentation User Mailing List Download Releases User survey What is BigARTM? BigARTM is a pow

633 Dec 21, 2022

Snips Python library to extract meaning from text

Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows to extract structured information from sentences written in natur

3.7k Dec 30, 2022

Multilingual text (NLP) processing toolkit

polyglot Polyglot is a natural language pipeline that supports massive multilingual applications. Free software: GPLv3 license Documentation: http://p

2.1k Jan 7, 2023

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

Rasa Open Source Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual

15.3k Dec 30, 2022

💫 Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc

24.9k Jan 2, 2023

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` whi ch allows you to build, view, manipulate and query pattern models.

Colibri Core by Maarten van Gompel, [email protected], Radboud University Nijmegen Licensed under GPLv3 (See http://www.gnu.org/licenses/gpl-3.0.html

122 Nov 17, 2022

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first step in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is regular-expression based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

Ucto for Python This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first step in almost any Natural Language Processing task,

27 Dec 14, 2022

Anaconda turns your Sublime Text 3 in a full featured Python development IDE including autocompletion, code linting, IDE features, autopep8 formating, McCabe complexity checker Vagrant and Docker support for Sublime Text 3 using Jedi, PyFlakes, pep8, MyPy, PyLint, pep257 and McCabe that will never freeze your Sublime Text 3

| _` | __ \ _` | __| _ \ __ \ _` | _` | ( | | | ( | ( ( | | | ( | ( | \__,_| _| _|

2.2k Dec 22, 2022

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

16.7k Jan 3, 2023

Python Text-expander Resources

Python text-expander Libraries

A tool for extracting plain text from Wikipedia dumps

Convert HTML to Markdown-formatted text.

NLP Core Library and Model Zoo based on PaddlePaddle 2.0

Transformer-based Text Auto-encoder (T-TA) using TensorFlow 2.

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Open-AI's DALL-E for large scale training in mesh-tensorflow.

Chinese NewsTitle Generation Project by GPT2.带有超级详细注释的中文GPT2新闻标题生成项目。

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS)

On Generating Extended Summaries of Long Documents

Build Text Rerankers with Deep Language Models

Converts text into a PDF of handwritten notes

Automatically erase objects in the video, such as logo, text, etc.

Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"

Spectrum is an AI that uses machine learning to generate Rap song lyrics

Official implementation of AAAI-21 paper "Label Confusion Learning to Enhance Text Classification Models"

Discord Bot for server hosts, devs, and admins. Analyzes timings reports & uploads text files to hastebin. Developed by https://birdflop.com.

CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

A simple Python module for parsing human names into their individual components

Python library for creating PEG parsers

Paranoid text spacing in Python

Fixes mojibake and other glitches in Unicode text, after the fact.

Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.

Module for automatic summarization of text documents and HTML pages.

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Convert HTML to Markdown-formatted text.

Fast topic modeling platform

Snips Python library to extract meaning from text

Multilingual text (NLP) processing toolkit

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

💫 Industrial-strength Natural Language Processing (NLP) in Python

A Python package implementing a new model for text classification with visualization tools for Explainable AI :octocat:

Tools, wrappers, etc... for data science with a concentration on text processing

Python library for processing Chinese text

Accelerated deep learning R&D

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

Python Text-expander Resources

Python text-expander Libraries

A tool for extracting plain text from Wikipedia dumps

Convert HTML to Markdown-formatted text.

NLP Core Library and Model Zoo based on PaddlePaddle 2.0

Transformer-based Text Auto-encoder (T-TA) using TensorFlow 2.

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Open-AI's DALL-E for large scale training in mesh-tensorflow.

Chinese NewsTitle Generation Project by GPT2.带有超级详细注释的中文GPT2新闻标题生成项目。

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS)

On Generating Extended Summaries of Long Documents

Build Text Rerankers with Deep Language Models

Converts text into a PDF of handwritten notes

Automatically erase objects in the video, such as logo, text, etc.

Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"

Spectrum is an AI that uses machine learning to generate Rap song lyrics

Official implementation of AAAI-21 paper "Label Confusion Learning to Enhance Text Classification Models"

Discord Bot for server hosts, devs, and admins. Analyzes timings reports & uploads text files to hastebin. Developed by https://birdflop.com.

CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

A simple Python module for parsing human names into their individual components

Python library for creating PEG parsers

Paranoid text spacing in Python

Fixes mojibake and other glitches in Unicode text, after the fact.

Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.

Module for automatic summarization of text documents and HTML pages.

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Convert HTML to Markdown-formatted text.

Fast topic modeling platform

Snips Python library to extract meaning from text

Multilingual text (NLP) processing toolkit

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

💫 Industrial-strength Natural Language Processing (NLP) in Python

A Python package implementing a new model for text classification with visualization tools for Explainable AI :octocat:

Tools, wrappers, etc... for data science with a concentration on text processing

Python library for processing Chinese text

Accelerated deep learning R&D

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

Related tags