1537 Repositories
Python text-recommendation-engine Libraries
TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER
TweebankNLP This repo contains the new Tweebank-NER dataset and Twitter-Stanza p
Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset
Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset
An ongoing process to make a physics engine using python.
Simple_Physics_Engine An ongoing process to make a physics engine using python. I am using this goal as a way to learn python in and out. I am trying
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is official implement of "
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization 📥 Download Datasets 📥 Download Trained Models INTRODUCTION TH2ZH (
Elkeid HUB - A rule/event processing engine maintained by the Elkeid Team that supports streaming/offline data processing
Elkeid HUB - A rule/event processing engine maintained by the Elkeid Team that supports streaming/offline data processing
Custom function scheduler TUI (text-based user interface) in the console
Custom function scheduler TUI (text-based user interface) in the console
A project for the Qvault Hackathon, 2022-01-17
musical-octo-engine Steps to run brew install python-tk brew install portaudio
Free & simple way to encipher text
VenSipher VenSipher is a free medium through which text can be enciphered. It can convert any text into an unrecognizable secret text that can only be
The Scary Story - A Text Adventure
This is a text adventure which I made in python 3. This is one of my first big projects so any feedback would be greatly appreciated.
📝An easy-to-use package to restore punctuation of the text.
✏️ rpunct - Restore Punctuation This repo contains code for Punctuation restoration. This package is intended for direct use as a punctuation restorat
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU
Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2
Pretty-doc - Composable text objects with python
pretty-doc from __future__ import annotations from dataclasses import dataclass
Deasciify-highlighted - A Python script for deasciifying text to Turkish and copying clipboard
deasciify-highlighted is a Python script for deasciifying text to Turkish and copying clipboard.
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementray code for paper accepted in ICDAR 2021 We h
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Desktop utility to download images/videos/music/text from various websites, and more
Desktop utility to download images/videos/music/text from various websites, and more
A Python library for generating new text from existing samples.
ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt
Text editor on python tkinter to convert english text to other languages with the help of ployglot.
Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan
A python notification tool used for sending you text messages when certain conditions are met in the game, Neptune's Pride.
A python notification tool used for sending you text messages when certain conditions are met in the game, Neptune's Pride.
The text based version of my App Blocker that I planning on converting to GUI soon.
App-Blocker The text based version of my App Blocker that I planning on converting to GUI soon. Currently I am just uploading the appblocker.py file,
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call.
Discord_Meme_Bot 🤣 Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call. Add the bot
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper
The PyTorch implementation of paper REST: Debiased Social Recommendation via Reconstructing Exposure Strategies
REST The PyTorch implementation of paper REST: Debiased Social Recommendation via Reconstructing Exposure Strategies. Usage Download dataset Download
Sentiment Based Product Recommendation System
Sentiment Based Product Recommendation System The e-commerce business is quite p
Long text token classification using LongFormer
Long text token classification using LongFormer
A collection of useful functions for writers to analyze text/stories.
AuthorTools AuthorTools provides a multitude of functions for easily analyzing (your?) writing. AuthorTools is made especially for creative writers wi
Book Recommender System Using Sci-kit learn N-neighbours
Model-Based-Recommender-Engine I created a book Recommender System using Sci-kit learn's N-neighbours algorithm for my model and the streamlit library
Frappe tinymce - Frappe app to replace default text editor with tinymce
Frappe tinyMCE tinyMCE Text Editor for frappe apps Replace frappe's Quill Text E
Arabic-Phonetic-Output - You can input the phonetic version of any Arabic text here. This software will show you output in Arabic (with vowels)
Arabic-Phonetic-Output You can input the phonetic version of any Arabic text her
Dagon - An Asynchronous Task Graph Execution Engine
Dagon - An Asynchronous Task Graph Execution Engine Dagon is a job execution sys
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
2021 AI CUP Competition on Traditional Chinese Scene Text Recognition - Intermediate Contest
繁體中文場景文字辨識 程式碼說明 組別:這就是我 成員:蔣明憲 唐碩謙 黃玥菱 林冠霆 蕭靖騰 目錄 環境套件 安裝方式 資料夾布局 前處理-製作偵測訓練註解檔 前處理-製作分類訓練樣本 part.py : 從 json 裁切出分類訓練樣本 Class.py : 將切出來的樣本按照文字分類到各資料夾
A minimal platform for Markdown-based blogs
madblog This project provides a minimal blogging platform based on Markdown files. Demos This project powers the following blogs: Platypush My persona
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.
DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2
Image-generation-baseline - MUGE Text To Image Generation Baseline
MUGE Text To Image Generation Baseline Requirements and Installation More detail
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed
A menu for pygame. Simple, and easy to use
pygame-menu Source repo on GitHub, and run it on Repl.it Introduction Pygame-menu is a python-pygame library for creating menus and GUIs. It supports
👄 The most accurate natural language detection library for Python, suitable for long and short text alike
1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr
A Telegram bot written in python.
telegram_bot This bot is currently a beta project. Features A telegram bot which can: Send current COVID-19 cases/stats of Germany Send current worth
Multi-Stage Episodic Control for Strategic Exploration in Text Games
XTX: eXploit - Then - eXplore Requirements First clone this repo using git clone https://github.com/princeton-nlp/XTX.git Please create two conda envi
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".
Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If
Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation
SUCP Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation () Direct Friends (i.e., users who follow each o
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition
Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.
This is an AI that is supposed to say you if your text is formal or not
This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the text which the AI. The TXT-file isn't important but necessary.
Send SMS text messages via email with as many accounts as you want :)
SMS-Spammer Send SMS text messages via email with as many accounts as you want :) Example Set Up Guide! To start log into the gmail account you would
Text classification on IMDB dataset using Keras and Bi-LSTM network
Text classification on IMDB dataset using Keras and Bi-LSTM Text classification on IMDB dataset using Keras and Bi-LSTM network. Usage python3 main.py
🎁 3,000,000+ Unsplash images made available for research and machine learning
The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of
A text file containing 479k English words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion
List Of English Words A text file containing over 466k English words. While searching for a list of english words (for an auto-complete tutorial) I fo
Natural Language Processing Best Practices & Examples
NLP Best Practices In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive bus
Easy-to-use,Modular and Extendible package of deep-learning based CTR models .
DeepCTR DeepCTR is a Easy-to-use,Modular and Extendible package of deep-learning based CTR models along with lots of core components layers which can
macOS development environment setup: Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process.
dev-setup Motivation Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process. dev-setup aims to simplify the process w
Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0
NLP-Models-Tensorflow, Gathers machine learning and tensorflow deep learning models for NLP problems, code simplify inside Jupyter Notebooks 100%. Tab
FURY - A software library for scientific visualization in Python
Free Unified Rendering in Python A software library for scientific visualization in Python. General Information • Key Features • Installation • How to
A library for demo trading | backtest and forward test simulation
Trade Engine a library for demo trading | backtest and forward test simulation Features Limit/Market orders: you can place a Limit or Market order in
Moji sends text and fun facts from different APIs wit da use of a notification deamon
Moji sends text and fun facts from different APIs wit da use of a notification deamon. Can be runned via dmenu or rofi.
Text editor on python to convert english text to malayalam(Romanization/Transiteration).
Manglish Text Editor This is a simple transiteration (romanization ) program which is used to convert manglish to malayalam (converts njaan to ഞാൻ ).
This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation).
FlatGCN This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation, submitted to ICASSP2022). Req
Trained T5 and T5-large model for creating keywords from text
text to keywords Trained T5-base and T5-large model for creating keywords from text. Supported languages: ru Pretraining Large version | Pretraining B
py2dis - A disassembly engine & library for Python
py2dis - A disassembly engine & library for Python. py2dis is a disassembly library for Python that does not use any modules/libraries other than colo
A Neural Network based chess engine and GUI made with Python and Tensorflow/Keras.
Haxaw-Chess Haxaw: Haxaw is the Neural Network based chess engine made with Python and Tensorflow/Keras. Also uses the python-chess library. (WIP: Imp
Algorithms for outlier, adversarial and drift detection
Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline d
Speech to text streamlit app
Speech to text Streamlit-app! 👄 This speech to text recognition is powered by t
Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.
Spchcat Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi. Description spchcat is a command-line tool that read
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your personal computer!
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your machine! Motivation Would
AXI Combat is a networked multiplayer game built on the AXI Visualizer 3D engine.
AXI_Combat AXI Combat is a networked multiplayer game built on the AXI Visualizer 3D engine. https://axi.x10.mx/Combat AXI Combat is released under th
A ninja python package that unifies the Google Earth Engine ecosystem.
A Python package that unifies the Google Earth Engine ecosystem. EarthEngine.jl | rgee | rgee+ | eemont GitHub: https://github.com/r-earthengine/ee_ex
A Practitioner's Guide to Natural Language Processing
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, Text Analytics with Python published by Apress/Springer.
A list of NLP(Natural Language Processing) tutorials
NLP Tutorial A list of NLP(Natural Language Processing) tutorials built on PyTorch. Table of Contents A step-by-step tutorial on how to implement and
Open source stenotype engine
Plover Bringing stenography to everyone. Homepage Releases Wiki Blog Google Group Discord Chat About Installation Getting help Contributing Donations
Beancount: Double-Entry Accounting from Text Files.
beancount: Double-Entry Accounting from Text Files Contents Description Documentation Download & Installation Versions Filing Bugs Copyright and Licen
Free and open source qualitative research tool
Taguette A spin on the phrase "tag it!", Taguette is a free and open source qualitative research tool that allows users to: Import PDFs, Word Docs (.d
Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites.
Jekyll Jekyll is a simple, blog-aware, static site generator perfect for personal, project, or organization sites. Think of it like a file-based CMS,
Graphical Password Authentication System.
Graphical Password Authentication System. This is used to increase the protection/security of a website. Our system is divided into further 4 layers of protection. Each layer is totally different and diverse than the others. This not only increases protection, but also makes sure that no non-human can log in to your account using different activities such as Brute Force Algorithm and so on.
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization
Python client for using Prefect Cloud with Saturn Cloud
prefect-saturn prefect-saturn is a Python package that makes it easy to run Prefect Cloud flows on a Dask cluster with Saturn Cloud. For a detailed tu
Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.
Think Bayes 2 by Allen B. Downey The HTML version of this book is here. Think Bayes is an introduction to Bayesian statistics using computational meth
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text. Diff: Compare two blocks o
Chinese version of GPT2 training code, using BERT tokenizer.
GPT2-Chinese Description Chinese version of GPT2 training code, using BERT tokenizer or BPE tokenizer. It is based on the extremely awesome repository
NLP techniques such as named entity recognition, sentiment analysis, topic modeling, text classification with Python to predict sentiment and rating of drug from user reviews.
This file contains the following documents sumbited for Baruch CIS9665 group 9 fall 2021. 1. Dataset: drug_reviews.csv 2. python codes for text classi
Textual: a TUI (Text User Interface) framework for Python inspired by modern web development
Textual Textual is a TUI (Text User Interface) framework for Python inspired by
End-to-End text sumarization, QAs generation using flask.
Help-Me-Read A web application created with Flask + BootStrap + HuggingFace 🤗 to generate summary and question-answer from given input text. It uses
👑 spaCy building blocks and visualizers for Streamlit apps
spacy-streamlit: spaCy building blocks for Streamlit apps This package contains utilities for visualizing spaCy models and building interactive spaCy-
A simple component to display annotated text in Streamlit apps.
Annotated Text Component for Streamlit A simple component to display annotated text in Streamlit apps. For example: Installation First install Streaml
A simple USI Shogi Engine written in python using python-shogi.
Revengeshogi My attempt at creating a USI Shogi Engine in python using python-shogi. Current State of Engine Currently only generating random moves us
Understand Text Summarization and create your own summarizer in python
Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document. Technologies that can make a coherent summary take into account variables such as length, writing style and syntax.
This repository contains datasets and baselines for benchmarking Chinese text recognition.
Benchmarking-Chinese-Text-Recognition This repository contains datasets and baselines for benchmarking Chinese text recognition. Please see the corres
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss This repository implements the SAFL in pytorch. Installation conda env create -f environm
RodoSol-ALPR Dataset
RodoSol-ALPR Dataset This dataset, called RodoSol-ALPR dataset, contains 20,000 images captured by static cameras located at pay tolls owned by the Ro
Two-stage text summarization with BERT and BART
Two-Stage Text Summarization Description We experiment with a 2-stage summarization model on CNN/DailyMail dataset that combines the ability to filter
STEFANN: Scene Text Editor using Font Adaptive Neural Network
STEFANN: Scene Text Editor using Font Adaptive Neural Network @ The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
A unified framework to jointly model images, text, and human attention traces.
connect-caption-and-trace This repository contains the reference code for our paper Connecting What to Say With Where to Look by Modeling Human Attent