248 Repositories
Python document-store Libraries
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
LinkBERT: A Knowledgeable Language Model Pretrained with Document Links This repo provides the model, code & data of our paper: LinkBERT: Pretraining
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:
Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for
StorSeismic: An approach to pre-train a neural network to store seismic data features
StorSeismic: An approach to pre-train a neural network to store seismic data features This repository contains codes and resources to reproduce experi
A workshop with several modules to help learn Feast, an open-source feature store
Workshop: Learning Feast This workshop aims to teach users about Feast, an open-source feature store. We explain concepts & best practices by example,
Paaster is a secure by default end-to-end encrypted pastebin built with the objective of simplicity.
Follow the development of our desktop client here Paaster Paaster is a secure by default end-to-end encrypted pastebin built with the objective of sim
CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, written in Python.
CaskDB - Disk based Log Structured Hash Table Store CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, w
🍷 Gracefully claim weekly free games and monthly content from Epic Store.
EPIC 免费人 🚀 优雅地领取 Epic 免费游戏 Introduction 👋 Epic AwesomeGamer 帮助玩家优雅地领取 Epic 免费游戏。 使用 「Epic免费人」可以实现如下需求: get:搬空游戏商店,获取所有常驻免费游戏与免费附加内容; claim:领取周免游戏及其免
Source code for "A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction" @ NAACL 2022
TSAR Source code for NAACL 2022 paper: A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction. 🔥 Introduction We focus on extra
Text classification is one of the popular tasks in NLP that allows a program to classify free-text documents based on pre-defined classes.
Deep-Learning-for-Text-Document-Classification Text classification is one of the popular tasks in NLP that allows a program to classify free-text docu
A way to store images in YAML.
YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installa
K2HASH Python library - NoSQL Key Value Store(KVS) library
k2hash_python Overview k2hash_python is an official python driver for k2hash. Install Firstly you must install the k2hash shared library: curl -o- htt
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval
BiDR Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval. Requirements torch==
Pgn2tex - Scripts to convert pgn files to latex document. Useful to build books or pdf from pgn studies
Pgn2Latex (WIP) A simple script to make pdf from pgn files and studies. It's sti
Viperdb - A tiny log-structured key-value database written in pure Python
ViperDB 🐍 ViperDB is a lightweight embedded key-value store written in pure Pyt
Searches a document for hash tags. Support multiple natural languages. Works in various contexts.
ht-getter Searches a document for hash tags. Supports multiple natural languages. Works in various contexts. This package uses a non-regex approach an
An interactive document scanner built in Python using OpenCV
The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the image.
File-based TF-IDF: Calculates keywords in a document, using a word corpus.
File-based TF-IDF Calculates keywords in a document, using a word corpus. Why? Because I found myself with hundreds of plain text files, with no way t
Split given PDF document into 4 page groups and convert them to booklet format
PUTO: PDF to Booklet converter Split given PDF document into 4 page groups and convert them to booklet format. It creates a PDF like shown below: Fir
Kartothek - a Python library to manage large amounts of tabular data in a blob store
Kartothek - a Python library to manage (create, read, update, delete) large amounts of tabular data in a blob store
Password-Manager - A Password Manager application made using Python. You can use this python application to store and to see the stored passwords
Password Manager 🔑 This is a Password Manager Application which is made using P
A scrapy pipeline that provides an easy way to store files and images using various folder structures.
scrapy-folder-tree This is a scrapy pipeline that provides an easy way to store files and images using various folder structures. Supported folder str
MySQL Connectivity based project. Contains various functions of a Store-Management-System
An Intermediate Level Python - MySQL Connectivity based project. Contains various functions of a Store-Management-System.
Word document generator with python
In this study, real world data is anonymized. The content is completely different, but the structure is the same. It was a script I prepared for the backend of a work using UiPath.
DocEnTr: An end-to-end document image enhancement transformer
DocEnTR Description Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer. This model is implemented on to
Valorant store offer discord-bot
Valorant store checker - Discord Bot Discord bot that shows your daily store offer without open the VALORANT by using the Ingame API. written using Py
This repository serves as a place to document a toy attempt on how to create a generative text model in Catalan, based on GPT-2
GPT-2 Catalan playground and scripts to train a GPT-2 model either from scrath or from another pretrained model.
A deep learning framework for historical document image analysis
DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https
Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer.
DocEnTR Description Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer. This model is implemented on to
Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data
Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma
An easy-to-use feature store
A feature store is a data storage system for data science and machine-learning. It can store raw data and also transformed features, which can be fed straight into an ML model or training script.
A simple document management REST based API for collaboratively interacting with documents
documan_api A simple document management REST based API for collaboratively interacting with documents.
Cereal box identification in store shelves using computer vision and a single train image per model.
Product Recognition on Store Shelves Description You can read the task description here. Report You can read and download our report here. Step A - Mu
Discord E-Store Bot
A delivery bot for Discord, works like Amazon where real users can pack & deliver orders in different servers!
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.
The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.
Good news! Our new work exhibits state-of-the-art performances on DocUNet benchmark dataset: DocScanner: Robust Document Image Rectification with Prog
RedisJSON - a JSON data type for Redis
RedisJSON is a Redis module that implements ECMA-404 The JSON Data Interchange Standard as a native data type. It allows storing, updating and fetching JSON values from Redis keys (documents).
Repository to store sample python programs for python learning
py Repository to store sample Python programs. This repository is meant for beginners to assist them in their learning of Python. The repository cover
AirDrive lets you store unlimited files to cloud for free. Upload & download files from your personal drive at any time using its super-fast API.
AirDrive lets you store unlimited files to cloud for free. Upload & download files from your personal drive at any time using its super-fast API.
Document manipulation detection with python
image manipulation detection task: -- tianchi function image segmentation salie
A very simple document database
DockieDb A simple in-memory document database. Installation Build the Wheel Fork or clone this repository and run python setup.py bdist_wheel in the r
this repo store a Awoesome telegram bot for protect from your large group from bot attack.
this repo store a Awoesome telegram bot for protect from your large group from bot attack.
A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documents, Google Colab tutorials, and online demo for beginners.
Calibre Libgen Non-fiction / Sci-tech store plugin
CalibreLibgenSci A Libgen Non-Fiction/Sci-tech store plugin for Calibre Installation Download the latest zip file release from here Open Calibre Navig
PyTorch code for JEREX: Joint Entity-Level Relation Extractor
JEREX: "Joint Entity-Level Relation Extractor" PyTorch code for JEREX: "Joint Entity-Level Relation Extractor". For a description of the model and exp
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed
Shelf DB is a tiny document database for Python to stores documents or JSON-like data
Shelf DB Introduction Shelf DB is a tiny document database for Python to stores documents or JSON-like data. Get it $ pip install shelfdb shelfquery S
KivyPassword - A password generator using both Kivy framework and SQL in order to create a local database for users to generate strong passwords and store them
KivyPassword A password generator using both Kivy framework and SQL in order t
A project to work with databases in 4 worksheets, insert, update, select, delete using Python and MySqI
A project to work with databases in 4 worksheets, insert, update, select, delete using Python and MySqI As a small project for school or college hope it is useful
This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' published at ECIR'22.
Paragraph Aggregation Retrieval Model (PARM) for Dense Document-to-Document Retrieval This repository contains the code for the paper PARM: A Paragrap
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".
Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If
official ( API ) for the zAmericanEnglish app in [ Google play ] and [ App store ]
official ( API ) for the zAmericanEnglish app in [ Google play ] and [ App store ]
Discord bot that shows valorant your daily store by using the Ingame API
Valorant store checker - Discord Bot Discord bot that shows valorant your daily store by using the Ingame API. written using Python and the Pycord lib
A Simple Key-Value Data-store written in Python
mercury-db This is a File Based Key-Value Datastore that supports basic CRUD (Create, Read, Update, Delete) operations developed using Python. The dat
Possible solutions to Wordscapes, a mobile game for the android operating system, downloadable from the play store
Possible solutions to Wordscapes, a mobile game for the android operating system, downloadable from the play store
Import entity definition document into SQLie3. Manage the entity. Also, create a "Create Table SQL file".
EntityDocumentMaker Version 1.00 After importing the entity definition (Excel file), store the data in sqlite3. エンティティ定義(Excelファイル)をインポートした後、データをsqlit
Storefront - A store App developed using Django, RESTFul API, JWT
Storefront A store App developed using Django, RESTFul API, JWT. SQLite has been
Python utility library for compositing PDF documents with reportlab.
pdfdoc-py Python utility library for compositing PDF documents with reportlab. Installation The pdfdoc-py package can be installed directly from the s
JSONManipulator is a Python package to retrieve, add, delete, change and store objects in JSON files.
JSONManipulator JSONManipulator is a Python package to retrieve, add, delete, change and store objects in JSON files. Installation Use the package man
A supercharged version of paperless: scan, index and archive all your physical documents
Paperless-ng Paperless (click me) is an application by Daniel Quinn and contributors that indexes your scanned documents and allows you to easily sear
Python function to construct an ODS spreadsheet on the fly - without having to store the entire file in memory or disk
stream-write-ods Python function to construct an ODS (OpenDocument Spreadsheet) on the fly - without having to store the entire file in memory or disk
Python function to construct a ZIP archive with on the fly - without having to store the entire ZIP in memory or disk
Python function to construct a ZIP archive with on the fly - without having to store the entire ZIP in memory or disk
An Instagram bot that can mass text users, receive and read a text, and store it somewhere with user details.
Instagram Bot 🤖 July 14, 2021 Overview 👍 A multifunctionality automated instagram bot that can mass text users, receive and read a message and store
Simple API written in Python using FastAPI to store and retrieve Books and Authors.
Simple API made with Python FastAPI WIP: Deploy in AWS with Terraform Simple API written in Python using FastAPI to store and retrieve Books and Autho
Oh-My-PickleDB is an open source key-value store using Python's json module.
OH-MY-PICKLEDB oh-my-pickleDB is a lightweight, fast, and intuitive data manager written in python 📝 Table of Contents About Getting Started Deployme
PREFS is a Python library to store and manage preferences and settings.
PREFS PREFS is a Python library to store and manage preferences and settings. PREFS stores a Python dictionary in a total human-readable file, the PRE
Neptune client library - integrate your Python scripts with Neptune
Lightweight experiment tracking tool for AI/ML individuals and teams. Fits any workflow. Neptune is a lightweight experiment logging/tracking tool tha
ElasticSearch ODM (Object Document Mapper) for Python - pip install esengine
esengine - The Elasticsearch Object Document Mapper esengine is an ODM (Object Document Mapper) it maps Python classes in to Elasticsearch index/doc_t
Python wrapper around Apple App Store Api
App Store Connect Api This is a Python wrapper around the Apple App Store Api : https://developer.apple.com/documentation/appstoreconnectapi So far, i
Essential Document Generator
Essential Document Generator Dead Simple Document Generation Whether it's testing database performance or a new web interface, we've all needed a dead
A document format conversion service based on Pandoc.
reformed Document format conversion service based on Pandoc. Usage The API specification for the Reformed server is as follows: GET /api/v1/formats: L
Telegram Bot to store Posts and Documents and it can Access by Special Links.
Telegram Bot to store Posts and Documents and it can Access by Special Links. I Guess This Will Be Usefull For Many People..... 😇 . Features Fully cu
Python document object mapper (load python object from JSON and vice-versa)
lupin is a Python JSON object mapper lupin is meant to help in serializing python objects to JSON and unserializing JSON data to python objects. Insta
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)
Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of
A wrapper around SageMaker ML Lineage Tracking extending ML Lineage to end-to-end ML lifecycles, including additional capabilities around Feature Store groups, queries, and other relevant artifacts.
ML Lineage Helper This library is a wrapper around the SageMaker SDK to support ease of lineage tracking across the ML lifecycle. Lineage artifacts in
Datamining of 15 Days of (free) Games at the Epic Games Store (EGS).
EGS: 15 Days of Games This repository contains Python code to data-mine the 15 Days of (free) Games at the Epic Games Store (EGS). Requirements Instal
Highly decentralized and censorship-resistant way to store key data
Beacon coin Beacon coin is a Chia singelton coin that can store data that needs to be: always available censorship resistant versioned potentially imm
Feature Store for Machine Learning
Overview Feast is an open source feature store for machine learning. Feast is the fastest path to productionizing analytic data for model training and
Build a grocery store management application.
python_projects_grocery_webapp In this python project, we will build a grocery store management application. It will be 3 tier application, Front end:
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
Longformer: The Long-Document Transformer
Longformer Longformer and LongformerEncoderDecoder (LED) are pretrained transformer models for long documents. ***** New December 1st, 2020: Longforme
A toolkit for document-level event extraction, containing some SOTA model implementations
❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi
[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification
Sparse Structure Learning via Graph Neural Networks for inductive document classification Make graph dataset create co-occurrence graph for datasets.
A toolkit for document-level event extraction, containing some SOTA model implementations
Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker Source code for ACL-IJCNLP 2021 Long paper: Document-le
Generate modern Python clients from OpenAPI
openapi-python-client Generate modern Python clients from OpenAPI 3.x documents. This generator does not support OpenAPI 2.x FKA Swagger. If you need
🦎 A NeoVim plugin for highlighting visual selections like in a normal document editor!
🦎 HighStr.nvim A NeoVim plugin for highlighting visual selections like in a normal document editor! Demo TL;DR HighStr.nvim is a NeoVim plugin writte
A modular, high performance, headless e-commerce platform built with Python, GraphQL, Django, and React.
Saleor Commerce Customer-centric e-commerce on a modern stack A headless, GraphQL commerce platform delivering ultra-fast, dynamic, personalized shopp
Persistent dict, backed by sqlite3 and pickle, multithread-safe.
sqlitedict -- persistent dict, backed-up by SQLite and pickle A lightweight wrapper around Python's sqlite3 database with a simple, Pythonic dict-like
Redis Python Client - The Python interface to the Redis key-value store.
redis-py The Python interface to the Redis key-value store. Installation | Contributing | Getting Started | Connecting To Redis Installation redis-py
Dungeon Dice Rolls is an aplication that the user can roll dices (d4, d6, d8, d10, d12, d20 and d100) and store the results in one of the 6 arrays.
Dungeon Dice Rolls is an aplication that the user can roll dices (d4, d6, d8, d10, d12, d20 and d100) and store the results in one of the 6 arrays.
A(Sync) Interface for internal Audible API written in pure Python.
Audible Audible is a Python low-level interface to communicate with the non-publicly Audible API. It enables Python developers to create there own Aud
Digdata presented 'BrandX' as a clothing brand that wants to know the best places to set up a 'pop up' store.
Digdata presented 'BrandX' as a clothing brand that wants to know the best places to set up a 'pop up' store. I used the dataset given to write a program that ranks these places.
Bnagla hand written document digiiztion
Bnagla hand written document digiiztion This repo addresses the problem of digiizing hand written documents in Bangla. Documents have definite fields
The repo for reproducing Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study
ECIR Reproducibility Paper: Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study This code corresponds to the reproducibility
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"
Hierarchical Attention Networks for Document Classification This is an implementation of the paper Hierarchical Attention Networks for Document Classi
DUE: End-to-End Document Understanding Benchmark
This is the repository that provide tools to download data, reproduce the baseline results and evaluation. What can you achieve with this guide Based
Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"
Graph Neural Topic Model (GNTM) This is the pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Persp
Repo to store back end infrastructure for Message in a Bottle
Message in a Bottle Backend API RESTful API for Message in a Bottle frontend application consumption. About The Project • Tools Used • Local Set Up •
A voice assistant which can handle your everyday task and allows you to book items from your favourite store!
Voicely Table of Contents About The Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing License Contact Acknowled
Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API
Dominate Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API. It allows you to write HTML pages in pure
This code can help you with auto update for-TV-advertisements in the store.
Auto-update-files-for-TV-advertisements-in-the-store This code can help you with auto update for-TV-advertisements in the store. It was write for Rasp