1759 Repositories
Python language-files Libraries
Must-read papers on improving efficiency for pre-trained language models.
Download YOUR files and documents from VK.
vk-documents-downloader In short, this simple tool downloads all YOUR documents from VK. Maybe I'm just clueless, but I couldn't find anything like it on GitHub, let alone Google. py main.py Login:
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
ROSITA News & Updates (24/08/2021) Release the demo to perform fine-grained semantic alignments using the pretrained ROSITA model. (15/08/2021) Releas
Tool for installing and updating MiSTer cores and other files
MiSTer Downloader This tool installs and updates all the cores and other extra files for your MiSTer. It also updates the menu core, the MiSTer firmwa
Learn how to responsibly deliver value with ML.
Made With ML Applied ML · MLOps · Production Join 30K+ developers in learning how to responsibly deliver value with ML. 🔥 Among the top MLOps reposit
Augmenty is an augmentation library based on spaCy for augmenting texts.
Augmenty: The cherry on top of your NLP pipeline Augmenty is an augmentation library based on spaCy for augmenting texts. Besides a wide array of high
This code renames subtitle files to match your video file names, so you don't need to rename them manually.
Rename Subtitle This code renames your subtitle file names to your video file names so you don't need to do it manually Note: It only works for series
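The renaming idea described in this entry can be sketched roughly as follows. This is a minimal illustration, not the repository's actual logic: the function names `plan_subtitle_renames` and `apply_renames`, the extension sets, and the rule of pairing videos and subtitles in sorted order are all assumptions.

```python
from pathlib import Path

VIDEO_EXTS = {".mkv", ".mp4", ".avi"}   # assumed set of video extensions
SUB_EXTS = {".srt", ".sub", ".ass"}     # assumed set of subtitle extensions


def plan_subtitle_renames(filenames):
    """Return (old_name, new_name) pairs giving each subtitle its video's stem.

    Assumption: videos and subtitles pair up one-to-one in sorted order,
    so the first subtitle belongs to the first episode, and so on.
    """
    videos = sorted(f for f in filenames if Path(f).suffix.lower() in VIDEO_EXTS)
    subs = sorted(f for f in filenames if Path(f).suffix.lower() in SUB_EXTS)
    if len(videos) != len(subs):
        raise ValueError("need exactly one subtitle per video")
    plan = []
    for video, sub in zip(videos, subs):
        new_name = Path(video).stem + Path(sub).suffix
        if new_name != sub:  # skip subtitles that already match
            plan.append((sub, new_name))
    return plan


def apply_renames(directory, plan):
    """Rename the files on disk according to a plan from plan_subtitle_renames."""
    for old, new in plan:
        (Path(directory) / old).rename(Path(directory) / new)
```

Separating the planning step from the on-disk rename makes it possible to print and review the plan before any file is touched.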
PyDownloader - Downloads files and folders at high speed (depending on your internet speed).
An HTTP server to easily download and upload files.
httpsweet An HTTP server to easily download and upload files. It was created with flexibility in mind, allowing it to be used in many different situations,
DirBruter is a Python-based CLI tool. It looks for hidden or existing directories and files using a brute-force method: it launches a dictionary-based attack against a web server and analyses the responses.
DirBruter DirBruter is a Python-based CLI tool. It looks for hidden or existing directories/files using brute force method. It basically works by laun
This repository contains the PyTorch implementation of the paper STaCK: Sentence Ordering with Temporal Commonsense Knowledge appearing at EMNLP 2021.
STaCK: Sentence Ordering with Temporal Commonsense Knowledge This repository contains the pytorch implementation of the paper STaCK: Sentence Ordering
Extract and visualize information from Gurobi log files
GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp
Natural language computational chemistry command line interface.
nlcc Install pip install nlcc Must have Open-AI Codex key: export OPENAI_API_KEY=your key here then nlcc key bindings ctrl-w copy to clipboard (Note
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
RoSTER The source code used for Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training, p
Hasher Hash, Compare and Verify your files Translations
Hasher Hash, Compare and Verify your files Translations In order to translate Hasher to a language you must add a folder with the language abbreviatio
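The three operations named in this entry (hash, compare, verify) can be illustrated with the standard library alone. The sketch below is not taken from Hasher itself: the function names and the SHA-256 default are assumptions chosen for illustration.

```python
import hashlib
import hmac


def hash_file(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b, algorithm="sha256"):
    """Compare two files by digest."""
    return hash_file(path_a, algorithm) == hash_file(path_b, algorithm)


def verify_file(path, expected_hex, algorithm="sha256"):
    """Check a file against a published checksum (case-insensitive hex)."""
    actual = hash_file(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Reading in fixed-size chunks keeps memory use constant even for very large files.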
The only purpose of a byte-sized application is to help you create .desktop entry files for downloaded applications.
Turtle 🐢 The only purpose of a byte-sized application is to help you create .desktop entry files for downloaded applications. As of usual with elemen
Demonstrates how to divide a DL model into multiple IR model files (division) and introduces a simple way to implement a custom layer that works with OpenVINO IR models.
Demonstration of OpenVINO techniques - Model-division and a simplest-way to support custom layers Description: Model Optimizer in Intel(r) OpenVINO(tm
Finds snippets in iambic pentameter in English-language text and tries to combine them to a rhyming sonnet.
Sonnet finder Finds snippets in iambic pentameter in English-language text and tries to combine them to a rhyming sonnet. Usage This is a Python scrip
Unofficial asynchronous and synchronous QvaPay client for Python and asyncio.
Unofficial asynchronous and synchronous QvaPay client for Python and asyncio. This library is still under development; the interface may change.
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities
Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)
In this GitHub repository I will share my Freqtrade files with you. I want to help people who are not yet very familiar with Freqtrade.
My Freqtrade stuff In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqt
Mysterium is the first tool that lets you retrieve most of a Python program's code, even when the .py or .pyc was extracted from an executable file and even when it is encrypted with any existing encryption scheme. Mysterium makes no distinction between encrypted and unencrypted files; it can retrieve code from Pyarmor-protected or .pyc files.
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Combating Embedding Barrier in Multilingual Models for Low-Resource Language Understanding".
BanglaBERT This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced i
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"
PIZZA - a task-oriented semantic parsing dataset
The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents.
This Mirror Bot is a multipurpose Telegram bot written in Python for mirroring files on the Internet to our beloved Google Drive.
MIRROR HUNTER This Mirror Bot is a multipurpose Telegram bot written in Python for mirroring files on the Internet to our beloved Google Drive. Repo la
✨Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.
✨A Python framework to explore, label, and monitor data for NLP projects
Bridging Vision and Language Model
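BriVL BriVL (Bridging Vision and Language Model) is the first general-purpose Chinese image-text multimodal large-scale pre-trained model. BriVL performs very well on image-text retrieval, surpassing other common multimodal pre-trained models from the same period (e.g., UNITER, CLIP). BriVL paper: WenLan: Bridgi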
debinstaller - A tool to install .deb files in any distro.
debinstaller A tool to install .deb files in any distro. Installation for debinstaller
tagls is a language server based on gtags.
tagls tagls is a language server based on gtags. Why did I write it? Almost all modern editors have great support for LSP, but language servers based on se
borb is a library for reading, creating and manipulating PDF files in Python.
🎴 LearnQuick is a flashcard application for studying with decks and cards.
🎴 LearnQuick is a flashcard application for studying with decks and cards. Its main function is to show the user the front of each card and ask them to guess the back. Based on the user's self-assessment of how well they guessed, cards the user is weak on are shown more often and cards the user is strong on are shown less often.
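The scheduling behaviour described in this entry (weak cards appear more often, strong cards less often) could be approximated with weighted random sampling. The sketch below is an assumption-laden illustration, not LearnQuick's implementation: the `Deck` class, the halve/double update rule, and the weight bounds are all hypothetical.

```python
import random


class Deck:
    """Flashcard deck that favours cards the user keeps getting wrong."""

    def __init__(self, cards, rng=None):
        # Every card starts with weight 1.0; a higher weight means it is shown more often.
        self.weights = {card: 1.0 for card in cards}
        self.rng = rng or random.Random()

    def next_card(self):
        """Pick a card at random, with probability proportional to its weight."""
        cards = list(self.weights)
        weights = [self.weights[card] for card in cards]
        return self.rng.choices(cards, weights=weights, k=1)[0]

    def record(self, card, guessed_correctly):
        """Update a card's weight from the user's self-assessment.

        Assumed rule: halve the weight on success, double it on failure,
        and clamp to [0.125, 16.0] so no card vanishes or dominates.
        """
        factor = 0.5 if guessed_correctly else 2.0
        self.weights[card] = min(16.0, max(0.125, self.weights[card] * factor))
```

Clamping the weights is a design choice: it keeps well-known cards in occasional rotation instead of letting their probability decay toward zero.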
Application for shadowing Chinese.
chinese-shadowing Simple app for shadowing Chinese. With this application, it is very easy to record yourself, play back the recording and listen to s
💻 Open recent VS Code folders and files using Ulauncher
ulauncher-vscode-recent 💻 Open recent VS Code folders and files using Ulauncher. Quickly open recently-opened VS Code project directories and files.
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving.
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving. It is a comprehensive framework for research purpose that integrates popular MWP benchmark datasets and typical deep learning-based MWP algorithms.
Collection of NLP model explanations and accompanying analysis tools
Thermostat is a large collection of NLP model explanations and accompanying analysis tools. Combines explainability methods from the captum library wi
Goblyn is a Python tool focused on enumerating and capturing metadata from website files.
Goblyn Metadata Enumeration What's Goblyn? Goblyn is a tool focused on enumerating and capturing metadata from website files. How does it work? Goblyn will se
The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization [Paper] accepted at the EMNLP 2021: Vision Guided Genera
Tool to create 3D-printable terrain with integrated path/road part files (single-material 3D printer)
BACKGROUND This has been an ongoing project of mine for a few months now. I run trails a lot and originally the goal was to create a function to combine
Python script that splits PDF files.
Automatic PDF Splitter This script can create new single-page PDF files from multi-page PDFs. Requirements Python 3.0+ # Debian distros sudo apt-get
commandpack - A package of modules for working with commands, command packages, files with command packages.
commandpack Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal:
shred - A cross-platform library for securely deleting files beyond recovery.
shred Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal: https:
AIL LeakFeeder: A module for the AIL Framework that automates feeding leaked files to AIL
AIL LeakFeeder: A module for the AIL Framework that automates feeding leaked files to AIL. In short, this feeder helps you ingest your leaked files into AIL automatically.
A Telegram bot written in Python for searching files in Drive. Based on SearchX-bot
Drive Search Bot This is a Telegram bot written in Python for searching files in Drive. Based on SearchX-bot How to deploy? Clone this repo: git clone
Convert ACSM files to DRM-free EPUB files with one command on Linux
Knock Convert ACSM files to DRM-free EPUB files using one command. This software does not utilize Adobe Digital Editions nor Wine. It is completely fr
The system to host your files on the Discord application
Distorage The system to host your files on the Discord application Documentation Documentation Distorage How to use the package You can install it wit
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
Frustratingly Simple Pretraining Alternatives to Masked Language Modeling This is the official implementation for "Frustratingly Simple Pretraining Al
theHasher is a tool for generating strong passwords using hash functions. Generate hashes and store them in txt files, then use the txt files as wordlists for brute-force attacks.
$theHasher theHasher is a tool for generating hashes using some of the best-known hash functions. You can save your hashes to correspo
An ML library for profanity detection in Turkish sentences
"A bad word belongs to its speaker." -Anonymous What is it? sinkaf is a Python library for finding inappropriate comments. What makes it different? Other algorithms
Dataloader tools for language modelling
Installation: pip install lm_dataloader Design Philosophy A library to unify lm dataloading at large scale Simple interface, any tokenizer can be inte
🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"
SGLKT-VisDial Pytorch Implementation for the paper: Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer Gi-Cheon Kang, Junseok P
Merge multiple PDF files into one.
PDF Merger Merge multiple PDF files into one. Usage % python pdf_merger.py -h usage: pdf_merger.py [-h] [-o OUTPUT] [-f [FILES ...]] optional argumen
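A command-line interface with the shape shown in the usage line above could be declared with `argparse` roughly as follows. This is a sketch under assumptions: the long option names `--output` and `--files`, the defaults, and the help strings are hypothetical, and the actual merging step (which needs a PDF library) is left out.

```python
import argparse


def build_parser():
    """Build a parser accepting -o OUTPUT and -f FILES..., as in the usage line."""
    parser = argparse.ArgumentParser(
        prog="pdf_merger.py",
        description="Merge multiple PDF files into one.",
    )
    # Long option names and defaults below are assumptions for illustration.
    parser.add_argument("-o", "--output", default="merged.pdf",
                        help="path of the merged PDF to write")
    parser.add_argument("-f", "--files", nargs="*", default=[],
                        help="input PDF files, merged in the order given")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would merge {len(args.files)} file(s) into {args.output}")
```

Using `nargs="*"` lets `-f` take any number of file paths, which matches the `[FILES ...]` form in the usage text.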
EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein EMN
An original implementation of "Noisy Channel Language Model Prompting for Few-Shot Text Classification"
Channel LM Prompting (and beyond) This includes an original implementation of Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. "Noisy Cha
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Megatron (1 and 2) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA.
A super simple script that uses the GitHub API to convert your Markdown files into a GitHub-styled HTML site.
Code for our ALiBi method for transformer language models.
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation This repository contains the code and models for our paper Tra
CrossNorm and SelfNorm for Generalization under Distribution Shifts (ICCV 2021)
CrossNorm (CN) and SelfNorm (SN) (Accepted at ICCV 2021) This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Attention is all you need: A Pytorch Implementation This is a PyTorch implementation of the Transformer model in "Attention is All You Need" (Ashish V
Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator
Malicious PDF Generator ☠️ Generate ten different malicious pdf files with phone-home functionality. Can be used with Burp Collaborator. Used for pene
Python library for the DeepL language translation API.
The DeepL API is a language translation API that allows other computer programs to send texts and documents to DeepL's servers and receive high-quality translations. This opens a whole universe of opportunities for developers: any translation product you can imagine can now be built on top of DeepL's best-in-class translation technology.
Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics.
Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics. Jury offers a smooth and easy-to-use interface. It uses datasets for underlying metric computation, so adding a custom metric is as easy as adopting datasets.Metric.
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample
Alias-Free-Torch Simple torch module implementation of Alias-Free GAN. This repository includes an Alias-Free GAN style lowpass sinc filter @filter.py A
A Bot to Upload files to Many Cloud services. Powered by Telethon.
oVo MultiUpload V1.0 👀 A Bot to Upload files to Many Cloud services. Powered by Telethon _ 🎯 Follow me and star this repo for more telegram bots. @H
Pipeline for fast building text classification TF-IDF + LogReg baselines.
Text Classification Baseline Pipeline for fast building text classification TF-IDF + LogReg baselines. Usage Instead of writing custom code for specif
A Survey of Natural Language Generation in Task-Oriented Dialogue System (TOD): Recent Advances and New Frontiers
X-modaler is a versatile and high-performance codebase for cross-modal analytics.
X-modaler X-modaler is a versatile and high-performance codebase for cross-modal analytics. This codebase unifies comprehensive high-quality modules i
Empower Sequence Labeling with Task-Aware Language Model
LM-LSTM-CRF Check Our New NER Toolkit 🚀 🚀 🚀 Inference: LightNER: inference w. models pre-trained / trained w. any following tools, efficiently. Tra
An almost fully customizable language made in Python!
Whython is a project language: the idea is that anyone can download and edit the language to make it suit their needs.
DEMix Layers for Modular Language Modeling
DEMix This repository contains modeling utilities for "DEMix Layers: Disentangling Domains for Modular Language Modeling" (Gururangan et al., 2021). T
Evaluation suite for large-scale language models.
This repo contains code for running the evaluations and reproducing the results from the Jurassic-1 Technical Paper (see blog post), with current support for running the tasks through both the AI21 Studio API and OpenAI's GPT3 API.
Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset
KoSimCSE Korean Simple Contrastive Learning of Sentence Embeddings implementation using pytorch SimCSE Installation git clone https://github.com/BM-K/
Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT)
CIRPLANT This repository contains the code and pre-trained models for Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) For d
Doing set operations on files considered as sets of lines
CLI tool that can be used to do set operations like union on files considering them as a set of lines. Notes It ignores all empty lines with whitespac
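Treating files as sets of lines maps directly onto Python's built-in `set` type. The sketch below illustrates the idea only: the function names are hypothetical, and skipping blank or whitespace-only lines is an assumption based on the truncated note above rather than the tool's documented behaviour.

```python
def read_line_set(path):
    """Read a file as a set of lines, skipping empty and whitespace-only lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.rstrip("\n") for line in handle if line.strip()}


def combine(path_a, path_b, operation="union"):
    """Apply a set operation to the lines of two files; return a sorted list."""
    a, b = read_line_set(path_a), read_line_set(path_b)
    operations = {
        "union": a | b,          # lines in either file
        "intersection": a & b,   # lines in both files
        "difference": a - b,     # lines only in the first file
    }
    return sorted(operations[operation])
```

Sorting the result gives deterministic output, since sets themselves have no defined order.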
Guesslang detects the programming language of a given source code
Detect the programming language of a source code
The official code for paper "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling".
R2D2 This is the official code for paper titled "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Mode
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision. ICCV 2021.
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision Download links and PyTorch implementation of "Towers of Ba
Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs.
Enhanced version of Blender's bvh add-on with more settings supported. The bvh's rest pose should have the same handedness as the armature, but it may use a different up/forward definition.
Enhanced bvh add-on (importer/exporter) for Blender Enhanced bvh importer Enhanced bvh exporter Ho
Python function to query SQLite files stored on S3
sqlite-s3-query Python function to query a SQLite file stored on S3. It uses multiple HTTP range requests per query to avoid downloading the entire fi
Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)
Vision-Language Transformer and Query Generation for Referring Segmentation Please consider citing our paper in your publications if the project helps
Code for EmBERT, a transformer model for embodied, language-guided visual task completion.
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).
CredData is a set of files including credentials in open source projects
CredData is a set of files including credentials in open source projects. CredData includes suspicious lines with manual review results and more information such as credential types for each suspicious line. CredData can be used to develop new tools or improve existing tools. Furthermore, using the benchmark result of the CredData, users can choose a proper tool among open source credential scanning tools according to their use case.
Learning the meanings behind words is a key element of NLP. This project concentrates on the disambiguation of preposition senses. To that end, we train a BERT transformer model and surpass the previous state of the art.
New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa
😇 A PyTorch implementation of the DeepMoji model: a state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm, etc.
------ Update September 2018 ------ It's been a year since TorchMoji and DeepMoji were released. We're trying to understand how it's being used such t
Turn NY Times crosswords into Across Lite files
NYT Crossword to Puz A Windows program to convert NY Times crosswords from the web to Across Lite compatible files. To run this, first download and de
Ongoing research training transformer language models at scale, including: BERT & GPT-2
What is this fork of Megatron-LM and Megatron-DeepSpeed This is a detached fork of https://github.com/microsoft/Megatron-DeepSpeed, which in itself is
Blender 2.93 addon for loading Quake II MD2 files
io_mesh_md2 is a Blender 2.93 addon for importing Quake II MD2 files.
Natural Language Processing library built with AllenNLP 🌲🌱
Custom Natural Language Processing with big and small models 🌲🌱
PyTorch implementation for Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition.
Stochastic CSLR This is the PyTorch implementation for the ECCV 2020 paper: Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuou
Source code for "Progressive Transformers for End-to-End Sign Language Production" (ECCV 2020)
Progressive Transformers for End-to-End Sign Language Production Source code for "Progressive Transformers for End-to-End Sign Language Production" (B
Sign Language Translation with Transformers (COLING'2020, ECCV'20 SLRTP Workshop)
transformer-slt This repository gathers data and code supporting the experiments in the paper Better Sign Language Translation with STMC-Transformer.
Sign Language Transformers (CVPR'20)
Sign Language Transformers (CVPR'20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Sign Language Trans
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
This repository contains code for the following two papers: VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv) with a short
Multi Task Vision and Language
12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan#, Linbo Jin, Ben Chen, Haoming Zhou, Minghui
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
VisualGPT Our Paper VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Main Architecture of Our VisualGPT Downloa