385 Repositories
Python html-extraction Libraries
Entity Disambiguation as text extraction (ACL 2022)
ExtEnD: Extractive Entity Disambiguation This repository contains the code of ExtEnD: Extractive Entity Disambiguation, a novel approach to Entity Dis
This is a beginner-friendly repo to make a collection of some unique and awesome projects. Everyone in the community can benefit & get inspired by the amazing projects present over here.
Awesome-Projects-Collection Quality over Quantity :) What to do? Add some unique and amazing projects as per your favourite tech stack for the communi
Simple, Fast, Powerful and Easily extensible python package for extracting patterns from text, with over than 60 predefined Regular Expressions.
patterns-finder Simple, Fast, Powerful and Easily extensible python package for extracting patterns from text, with over than 60 predefined Regular Ex
SentimentArcs: a large ensemble of dozens of sentiment analysis models to analyze emotion in text over time
SentimentArcs - Emotion in Text An end-to-end pipeline based on Jupyter notebooks to detect, extract, process and anlayze emotion over time in text. E
Ego4d dataset repository. Download the dataset, visualize, extract features & example usage of the dataset
Ego4D EGO4D is the world's largest egocentric (first person) video ML dataset and benchmark suite, with 3,600 hrs (and counting) of densely narrated v
Code Implementation of "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".
Span-ASTE: Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction ***** New March 31th, 2022: Scikit-Style API for Easy Usage *****
The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".
LEAR The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction". See below for an overview of
Make your first PR. A beginner friendly repository made specifically for open source beginners. Add any program under any language (it can be anything from a simple program to a complex data structure algorithm). Happy coding...
Hacktober Fest 2021 Upload Different Types of Programs in any Language Use this project to make your first contribution to an open source project on G
⚓ Eurybia monitor model drift over time and securize model deployment with data validation
View Demo · Documentation · Medium article 🔍 Overview Eurybia is a Python library which aims to help in : Detecting data drift and model drift Valida
The PyTorch implementation for paper "Neural Texture Extraction and Distribution for Controllable Person Image Synthesis" (CVPR2022 Oral)
ArXiv | Get Start Neural-Texture-Extraction-Distribution The PyTorch implementation for our paper "Neural Texture Extraction and Distribution for Cont
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"
MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu
Source code for "A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction" @ NAACL 2022
TSAR Source code for NAACL 2022 paper: A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction. 🔥 Introduction We focus on extra
ProtFeat is protein feature extraction tool that utilizes POSSUM and iFeature.
Description: ProtFeat is designed to extract the protein features by employing POSSUM and iFeature python-based tools. ProtFeat includes a total of 39
Browse JSON API in a HTML interface.
Falcon API Browse This project provides a middleware for Falcon Web Framework that will render the response in an HTML form for documentation purpose.
Django-Text-to-HTML-converter - The simple Text to HTML Converter using Django framework
Django-Text-to-HTML-converter This is the simple Text to HTML Converter using Dj
SafePicking: Learning Safe Object Extraction via Object-Level Mapping, ICRA 2022
SafePicking Learning Safe Object Extraction via Object-Level Mapping Kentaro Wad
FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE)
FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction
The bot creates hashtags for user's texts in Russian and English.
telegram_bot_hashtags The bot creates hashtags for user's texts in Russian and English. It is a simple bot for creating hashtags. NOTE file config.py
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
BROS (BERT Relying On Spatiality) is a pre-trained language model focusing on text and layout for better key information extraction from documents. Given the OCR results of the document image, which are text and bounding box pairs, it can perform various key information extraction tasks, such as extracting an ordered item list from receipts
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.
Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction
RE results graph visualization and company clustering Installation pip install -r requirements.txt python -m nltk.downloader stopwords python3.7 main.
A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.
A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.
TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music
TONet Introduction The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music", in ICASSP 2022 We
Text Analysis & Topic Extraction on Android App user reviews
AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http
According to the received excel file (.xlsx,.xlsm,.xltx,.xltm), it converts to word format with a given table structure and formatting
According to the received excel file (.xlsx,.xlsm,.xltx,.xltm), it converts to word format with a given table structure and formatting
Quot-a-lecture - Lecture transcript question extraction
Setup virtualenv venv source venv/bin/activate pip install -r requirements.txt
A Microsoft Azure Web App project named Covid 19 Predictor using Machine learning Model
A Microsoft Azure Web App project named Covid 19 Predictor using Machine learning Model (Random Forest Classifier Model ) that helps the user to identify whether someone is showing positive Covid symptoms or not by simply inputting certain values like oxygen level , breath rate , age, Vaccination done or not etc. with the help of kaggle database.
An app that allows you to add recipes from the dashboard made using DJango, JQuery, JScript and HTMl.
An app that allows you to add recipes from the dashboard. Then visitors filter based on different categories also each ingredient has a unique page with their related recipes.
[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)
Computer Vision - NTUA (2020-2021) This repository hosts the lab projects and theoretical problem sets of the Computer Vision course held by ECE NTUA
FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction
FaceExtraction FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction Occlusions often occur in face images in the wild, tr
Lektor-html-pretify - Lektor plugin to pretify the HTML DOM using Beautiful Soup
html-pretify Lektor plugin to pretify the HTML DOM using Beautiful Soup. How doe
This is a NLP based project to extract effective date of the contract from their text files.
Date-Extraction-from-Contracts This is a NLP based project to extract effective date of the contract from their text files. Problem statement This is
Multiple Object Extraction from Aerial Imagery with Convolutional Neural Networks
This is an implementation of Volodymyr Mnih's dissertation methods on his Massachusetts road & building dataset and my original methods that are publi
Generic ecosystem for feature extraction from aerial and satellite imagery
Note: Robosat is neither maintained not actively developed any longer by Mapbox. See this issue. The main developers (@daniel-j-h, @bkowshik) are no l
CS50 pset9: Using flask API to create a web application to exchange stocks' shares.
C$50 Finance In this guide we want to implement a website via which users can “register”, “login” “buy” and “sell” stocks, like below: Background If y
WebScraper - A script that prints out a list of all EXTERNAL references in the HTML response to an HTTP/S request
Project A: WebScraper A script that prints out a list of all EXTERNAL references
Basic-html-scraper - A complete how to of web scraping with Python for beginners
basic-html-scraper Code from YT Video This video includes a complete how to of w
Bootstraparse is a personal project started with a specific goal in mind: creating static html pages for direct display from a markdown-like file
Bootstraparse is a personal project started with a specific goal in mind: creating static html pages for direct display from a markdown-like file
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [Project Page] [Paper] [Video] Wenlong Huang1, Pieter Abbee
This repo is dedicated to the data extraction and manipulation of the World Bank's database called STEP.
Overview Welcome to the Step-X repository. This repo is dedicated to the data extraction and manipulation of the World Bank's database called STEP. Be
A sample pytorch Implementation of ACL 2021 research paper "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".
Span-ASTE-Pytorch This repository is a pytorch version that implements Ali's ACL 2021 research paper Learning Span-Level Interactions for Aspect Senti
Python Computer Vision from Scratch
This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos.
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction
The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attribute extraction study.
A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documents, Google Colab tutorials, and online demo for beginners.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Application that converts markdown to html.
Markdown-Engine An application that converts markdown to html. Installation Using the package manager [pip] pip install -r requirements.txt Usage Run
PyTorch code for JEREX: Joint Entity-Level Relation Extractor
JEREX: "Joint Entity-Level Relation Extractor" PyTorch code for JEREX: "Joint Entity-Level Relation Extractor". For a description of the model and exp
Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)
TDEER 🦌 🦒 Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021) Overview TDEE
Large scale PTM - PPI relation extraction
Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard
HuSpaCy: industrial-strength Hungarian natural language processing
HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti
Source code for the plant extraction workflow introduced in the paper “Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision”
Plant extraction workflow Source code for the plant extraction workflow introduced in the paper "Agricultural Plant Cataloging and Establishment of a
That project takes as input special TXT File, divides its content into lsit of HTML objects and then creates HTML file from them.
That project takes as input special TXT File, divides its content into lsit of HTML objects and then creates HTML file from them.
DeltaPy - Tabular Data Augmentation (by @firmai)
DeltaPy — Tabular Data Augmentation & Feature Engineering Finance Quant Machine Learning ML-Quant.com - Automated Research Repository Introduction T
An intuitive library to extract features from time series
Time Series Feature Extraction Library Intuitive time series feature extraction This repository hosts the TSFEL - Time Series Feature Extraction Libra
Highly comparative time-series analysis
〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu
Getdp-project - A Django-built web app that generates a personalized banner of events to come
getdp-project https://get-my-dp.herokuapp.com/ A Django-built web app that gener
NS-Defacer: a auto html injecter, In other words It's a auto defacer to deface a lot of websites in less time
Overview NS-Defacer is a auto html injecter, In other words It's a auto defacer
Boamp-extractor - Script d'extraction des AOs publiés au BOAMP
BOAMP Extractor BOAMP-Extractor permet d'extraire les offres de marchés publics publiées au bulletin officiel des annonces des marchés publics (BOAMP)
NFT-Price-Prediction-CNN - Using visual feature extraction, prices of NFTs are predicted via CNN (Alexnet and Resnet) architectures.
NFT-Price-Prediction-CNN - Using visual feature extraction, prices of NFTs are predicted via CNN (Alexnet and Resnet) architectures.
Htmdf - html to pdf with support for variables using fastApi.
htmdf Converts html to pdf with support for variables using fastApi. Installation Clone this repository. git clone https://github.com/ShreehariVaasish
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.
JupyterNotebook - C/C++, Javascript, HTML, LaTex, Shell scripts in Jupyter Notebook Also run them on remote computer
JupyterNotebook Read, write and execute C, C++, Javascript, Shell scripts, HTML, LaTex in jupyter notebook, And also execute them on remote computer R
WyPyPlus is a minimal wiki in 42 lines of Python code.
🍦 WyPyPlus: A personal wiki in 42 lines of code 🍦 WyPyPlus (pronounced "whippy plus") is a minimalist wiki server in 42 lines of code based on wypy
A markdown extension for converting Leiden+ epigraphic text to TEI XML/HTML
LeidenMark $ pip install leidenmark A Python Markdown extension for converting Leiden+ epigraphic text to TEI XML/HTML. Inspired by the Brill plain te
An html wrapper for python
MessySoup What is it? MessySoup is a python wrapper for html elements. While still a ways away, the main goal is to be able to build a wesbite straigh
Use minify-html, the extremely fast HTML + JS + CSS minifier, with Django.
django-minify-html Use minify-html, the extremely fast HTML + JS + CSS minifier, with Django. Requirements Python 3.8 to 3.10 supported. Django 2.2 to
Extract city and country mentions from Text like GeoText without regex, but FlashText, a Aho-Corasick implementation.
flashgeotext ⚡ 🌍 Extract and count countries and cities (+their synonyms) from text, like GeoText on steroids using FlashText, a Aho-Corasick impleme
An Instagram bot that can mass text users, receive and read a text, and store it somewhere with user details.
Instagram Bot 🤖 July 14, 2021 Overview 👍 A multifunctionality automated instagram bot that can mass text users, receive and read a message and store
Download clips from youtube videos with a few clicks and a GUI!
YouClip v2.0.0 Table Of Contents: What Is YouClip Installation Usage Stuff To Fix Changelog What Is YouClip? ! IMPORTANT: The source files are a total
A python-based static site generator for setting up a CV/Resume site
ezcv A python-based static site generator for setting up a CV/Resume site Table of Contents What does ezcv do? Features & Roadmap Why should I use ezc
Detection And Breaking With Python
Detection And Breaking IIIIIIIIIIIIIIIIIIII PPPPPPPPPPPPPPPPP VVVVVVVV VVVVVVVV I::::::::II::::::::I P:::::::
A fully-featured e-commerce application powered by Django
kobbyshop - Django Ecommerce App A fully featured e-commerce application powered by Django. Sections Project Description Features Technology Setup Scr
Fast HTML/XML template engine for Python
Overview Chameleon is an HTML/XML template engine for Python. It uses the page templates language. You can use it in any Python web application with j
Find thumbnails and original images from URL or HTML file.
Haul Find thumbnails and original images from URL or HTML file. Demo Hauler on Heroku Installation on Ubuntu $ sudo apt-get install build-essential py
inscriptis -- HTML to text conversion library, command line client and Web service
inscriptis -- HTML to text conversion library, command line client and Web service A python based HTML to text conversion library, command line client
A Python library that simplifies the extraction of datasets from XML content.
xmldataset: simple xml parsing 🗃️ XML Dataset: simple xml parsing Documentation: https://xmldataset.readthedocs.io A Python library that simplifies t
Zen-Knit is a formal (PDF), informal (HTML) report generator for data analyst and data scientist who wants to use python.
About Zen-Knit: Zen-Knit is a formal (PDF), informal (HTML) report generator for data analyst and data scientist who wants to use python. Inspired fro
AAAI 2022 paper - Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction
AT-BMC Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction (AAAI 2022) Paper Prerequisites Install pac
Fast and robust date extraction from web pages, with Python or on the command-line
Find original and updated publication dates of any web page. From the command-line or within Python, all the steps needed from web page download to HTML parsing, scraping, and text analysis are included.
Grimoire is a Python library for creating interactive fiction as hyperlinked html.
Grimoire Grimoire is a Python library for creating interactive fiction as hyperlinked html. Installation pip install grimoire-if Usage Check out the
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
Combine XPath, CSS Selectors and JSONPath for Web data extracting.
Data Extractor Combine XPath, CSS Selectors and JSONPath for Web data extracting. Quickstarts Installation Install the stable version from PYPI. pip i
A query expression for extracting data from JSON.
JSONPATH A selector expression for extracting data from JSON. Quickstarts Installation Install the stable version from PYPI. pip install jsonpath-extr
Flask app + (html+css+ajax) contain ability add employee and place where employee work - plant or salon
#Manage your employees! With all employee information stored in one place, you no longer have to sift through hoards of spreadsheets to manually searc
CoANet: Connectivity Attention Network for Road Extraction From Satellite Imagery
CoANet: Connectivity Attention Network for Road Extraction From Satellite Imagery This paper (CoANet) has been published in IEEE TIP 2021. This code i
A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery
A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery This repository is the official implementati
PyTorch implementation for our AAAI 2022 Paper "Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning"
deepGCFX PyTorch implementation for our AAAI 2022 Paper "Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning" Pr
Resources related to our paper "CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain"
CLIN-X (CLIN-X-ES) & (CLIN-X-EN) This repository holds the companion code for the system reported in the paper: "CLIN-X: pre-trained language models a
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction The dataset contains 3 million attribute-value annotations across 1257 unique ca
Convert text with ANSI color codes to HTML or to LaTeX.
Convert text with ANSI color codes to HTML or to LaTeX.
An HTML interface for finetuning the sync map output from aeneas
finetuneas 3.0 finetuneas is a simple HTML interface for fine tuning sync maps output by aeneas Version 3.0 Easier adjusting time: following cells wil
Free casino website. Madden just for learning / fun
Website Casino Free casino website. Madden just for learning / fun. Uses Jinja2 (HTML), Flask, JavaScript, etc. Dice game Preview
BERT, LDA, and TFIDF based keyword extraction in Python
BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl
Wake: Context-Sensitive Automatic Keyword Extraction Using Word2vec
Wake Wake: Context-Sensitive Automatic Keyword Extraction Using Word2vec Abstract استخراج خودکار کلمات کلیدی متون کوتاه فارسی با استفاده از word2vec ب
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python
Synchrosqueezing is a powerful reassignment method that focuses time-frequency representations, and allows extraction of instantaneous amplitudes and frequencies
Repository for the paper "Optimal Subarchitecture Extraction for BERT"
Bort Companion code for the paper "Optimal Subarchitecture Extraction for BERT." Bort is an optimal subset of architectural parameters for the BERT ar
An extremely configurable markdown reverser for Python3.
🔄 Unmarkd A markdown reverser. Unmarkd is a BeautifulSoup-powered Markdown reverser written in Python and for Python. Why This is created as a StackS
We'll be using HTML, CSS and JavaScript for the frontend
We'll be using HTML, CSS and JavaScript for the frontend. Nothing to install in specific. Open your text-editor and start coding a beautiful front-end.
This "I P L Team Project" is developed by Prasanta Kumar Mohanty using Python with Django web framework, HTML & CSS.
I-P-L-Team-Project This "I P L Team Project" is developed by Prasanta Kumar Mohanty using Python with Django web framework, HTML & CSS. Screenshots HO
A toolkit for document-level event extraction, containing some SOTA model implementations
❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi