DocuMiner
A production-ready pipeline for text mining and subject indexing
Want to Contribute?
More code and documentation coming soon.
Authors
Open Source Club
Several functions take in various lengths of text and return the subjectivity/objectivity score as a decimal. Example strings are already implemented to display functionality.
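As a rough illustration of the idea, a subjectivity score can be sketched as the fraction of words that appear in a list of opinion-bearing words. The word list and the function name `subjectivity_score` below are hypothetical, chosen only for this sketch; the project's own functions may rely on a different method or library.

```python
import re

# Hypothetical, deliberately tiny lexicon of opinion-bearing words (an assumption
# made for this sketch; a real scorer would use a much larger resource).
OPINION_WORDS = {"amazing", "awful", "bad", "beautiful", "best", "good",
                 "great", "hate", "love", "terrible", "wonderful", "worst"}


def subjectivity_score(text: str) -> float:
    """Return a score in [0.0, 1.0]: the share of words found in OPINION_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0  # empty input is treated as fully objective
    hits = sum(1 for word in words if word in OPINION_WORDS)
    return hits / len(words)


if __name__ == "__main__":
    for sample in ("The report has twelve pages.", "I love this wonderful report."):
        print(f"{subjectivity_score(sample):.2f}  {sample}")
```

A score near 0 suggests mostly factual wording, while a higher score suggests more opinionated wording. The same function works on a sentence, a paragraph, or a whole document.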
Notify users via text or email of changes to their uploaded stash of documents. This is meant for group collaboration, where more than one person can later add documents to, or delete documents from, an initial stash.
Perform OCR on images of text to recognize and transform the text into digital format.
Create basic frontend UI with widgets for file upload.
pip install streamlit
Accepted upload file types: .txt, .doc, .docx, .pdf
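One way to enforce the listed file types before processing an upload is a small extension check. This is a minimal sketch under assumptions: the helper name `is_supported` and the constant `SUPPORTED_EXTENSIONS` are hypothetical and not taken from the project. In Streamlit, `st.file_uploader` also accepts a `type` argument that restricts uploads by extension, so a check like this is mainly useful for files arriving by other routes.

```python
from pathlib import Path

# File types named in the text above, normalized to lowercase with a leading dot.
SUPPORTED_EXTENSIONS = {".txt", ".doc", ".docx", ".pdf"}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in one of the supported extensions."""
    # Path.suffix yields only the final extension, e.g. ".gz" for "a.tar.gz".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("notes.TXT", "report.pdf", "scan.png"):
        print(name, is_supported(name))
```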
Extract keywords from a document.
Use a stable library such as KeyBERT or an established (and sometimes simpler) algorithm like TF-IDF.
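To make the TF-IDF option concrete, here is a minimal sketch in plain Python with no external libraries. The function names `tokenize` and `tfidf_keywords` are hypothetical, and the weighting (term frequency times the natural log of inverse document frequency, with no smoothing) is one common variant assumed for this example, not necessarily what the project will use.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents: list[str], doc_index: int, top_n: int = 3) -> list[str]:
    """Return the top_n highest-scoring TF-IDF terms for one document in a corpus."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    target = tokenized[doc_index]
    if not target:
        return []

    scores = {}
    for term, count in Counter(target).items():
        tf = count / len(target)                 # relative frequency in this document
        idf = math.log(n_docs / doc_freq[term])  # 0 for terms present in every document
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "the cat chased the dog",
    ]
    print(tfidf_keywords(corpus, doc_index=0, top_n=1))
```

Terms that occur in every document, such as "the" in the sample corpus, get a score of zero, so words distinctive to a single document rise to the top.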
Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character
Text to HandWritten Text ✍️ Converter Watch Tutorial for this project Usage:- Clone my repository. Open CMD in working directory. Run following comman
strbind strbind - a lapidary text converter that translates a text file into a C-style string. My motivation is quickly adding large text chunks to the C co
Unicode Converter A python tool to convert Bangla Bijoy text to Unicode text. Installation Unicode Converter can be installed via PyPi. Make sure pip
TextStatistics This program gets a text file which contains English text. The program analyses the text and prints some information. For this program I
Redlines Redlines produces a Markdown text showing the differences between two strings/text. The changes are represented with strike-throughs and unde
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.
Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is!
ftfy: fixes text for you >>> print(fix_encoding("(ง'⌣')ง")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li
colormate Python script text formatting package What is colormate? colormate is a python library that lets you add text formatting to your scripts, it
Text Based Adventure Jam Author: Devin McIntyre Our goal is two-fold: Create a text based adventure game engine that can parse a standard file format
Indonesian Text Summarization Using FastAPI This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorit
TextToSymbolConverter A program that looks through entered text and replaces certain commands with mathematical symbols Example: Syntax: Enter text in
Memorize-New-Words In this very, very, very little project, I've written code to memorize new English words. Therefore you can add the words and their m
Convert text (English) to Morse code and play the Morse sound!
ATFTyper A neat little program to read the text from the "All Ten Fingers" program, and write them back. How does it work? This program uses the Pillo
deasciify-highlighted is a Python script for deasciifying text to Turkish and copying it to the clipboard.
A python Tk GUI that creates, writes text and attaches images into a custom spreadsheet file
pangu.py Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width charact