845 Repositories
Python large-files Libraries
C++ Implementation of PyTorch Tutorials for Everyone
C++ Implementation of PyTorch Tutorials for Everyone OS (Compiler)\LibTorch 1.9.0 macOS (clang 10.0, 11.0, 12.0) Linux (gcc 8, 9, 10, 11) Windows (msv
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle
Knover Knover is a toolkit for knowledge grounded dialogue generation based on PaddlePaddle. Knover allows researchers and developers to carry out eff
Repository containing the project files for CEN4020's Team Utah.
inCollege-Team-Utah Repository containing the project files for CEN4020's Team Utah. Contributors: Deepak Putta Jose Ramirez Fuentes Jaason Raudales C
Automating the process of sorting files in my downloads folder by file type.
downloads-folder-automation Automating the process of sorting files in a user's downloads folder on Windows by file type. This script iterates through
Parser manager for parsing DOC, DOCX, PDF or HTML files
Parser manager Description Parser gets PDF, DOC, DOCX or HTML file via API and saves parsed data to the database. Implemented in Ruby 3.0.1 using Acti
BMInf (Big Model Inference) is a low-resource inference package for large-scale pretrained language models (PLMs).
BMInf (Big Model Inference) is a low-resource inference package for large-scale pretrained language models (PLMs).
📢 Video Chat Stream Telegram Bot. Can ⏳ Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Video Chat Of Channels & Groups !
Telegram Video Chat Bot (Beta) 📢 Video Chat Stream Telegram Bot 🤖 Can Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Vide
An adaptable Snakemake workflow which uses GATKs best practice recommendations to perform germline mutation calling starting with BAM files
Germline Mutation Calling This Snakemake workflow follows the GATK best-practice recommandations to call small germline variants. The pipeline require
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
wireguard-config-benchmark is a python script that benchmarks the download speeds for the connections defined in one or more wireguard config files
wireguard-config-benchmark is a python script that benchmarks the download speeds for the connections defined in one or more wireguard config files. If multiple configs are benchmarked it will output a file ranking them from fastest to slowest.
Easy to use Python module to extract Exif metadata from digital image files.
Easy to use Python module to extract Exif metadata from digital image files.
This implements one of result networks from Large-scale evolution of image classifiers
Exotic structured image classifier This implements one of result networks from Large-scale evolution of image classifiers by Esteban Real, et. al. Req
Strict separation of config from code.
Python Decouple: Strict separation of settings from code Decouple helps you to organize your settings so that you can change parameters without having
Django helper application to easily and non-destructively crop arbitrarily large images in admin and frontend.
django-image-cropping django-image-cropping is an app for cropping uploaded images via Django's admin backend using Jcrop. Screenshot: django-image-cr
Python reader for Linked Data in HDF5 files
Linked Data are becoming more popular for user-created metadata in HDF5 files.
Serve files with Django.
django-downloadview django-downloadview makes it easy to serve files with Django: you manage files with Django (permissions, filters, generation, ...)
A Django application that provides country choices for use with forms, flag icons static files, and a country field for models.
Django Countries A Django application that provides country choices for use with forms, flag icons static files, and a country field for models. Insta
Django-Audiofield is a simple app that allows Audio files upload, management and conversion to different audio format (mp3, wav & ogg), which also makes it easy to play audio files into your Django application.
Django-Audiofield Description: Django Audio Management Tools Maintainer: Areski Contributors: list of contributors Django-Audiofield is a simple app t
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.
Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc.
Simple DDL Parser Build with ply (lex & yacc in python). A lot of samples in 'tests/. Is it Stable? Yes, library already has about 5000+ usage per day
IndoBERTweet is the first large-scale pretrained model for Indonesian Twitter. Published at EMNLP 2021 (main conference)
IndoBERTweet 🐦 🇮🇩 1. Paper Fajri Koto, Jey Han Lau, and Timothy Baldwin. IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effe
A django compressor tool that bundles css, js files to a single css, js file with webpack and updates your html files with respective css, js file path.
django-webpacker's documentation: Introduction: django-webpacker is a django compressor tool which bundles css, js files to a single css, js file with
Download YOUR files, documents from vk.
vk-documents-downloader Кароч эта симпл херня качает все ВАШИ документы с вк. Или я еблан, но в гх и тмб гугле я подобного не нашел. py main.py Login:
Tool for installing and updating MiSTer cores and other files
MiSTer Downloader This tool installs and updates all the cores and other extra files for your MiSTer. It also updates the menu core, the MiSTer firmwa
This code renames subtitle file names to your video files names, so you don't need to rename them manually.
Rename Subtitle This code renames your subtitle file names to your video file names so you don't need to do it manually Note: It only works for series
PyDownloader - Downloads files and folders at high speed (based on your interent speed).
PyDownloader - Downloads files and folders at high speed (based on your interent speed).
An HTTP server to easily download and upload files.
httpsweet An HTTP server to easily download and upload files. It was created with flexibility in mind, allowing be used in many different situations,
DirBruter is a Python based CLI tool. It looks for hidden or existing directories/files using brute force method. It basically works by launching a dictionary based attack against a webserver and analyse its response.
DirBruter DirBruter is a Python based CLI tool. It looks for hidden or existing directories/files using brute force method. It basically works by laun
Extract and visualize information from Gurobi log files
GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp
Hasher Hash, Compare and Verify your files Translations
Hasher Hash, Compare and Verify your files Translations In order to translate Hasher to a language you must add a folder with the language abbreviatio
The only purpose of a byte-sized application is to help you create .desktop entry files for downloaded applications.
Turtle 🐢 The only purpose of a byte-sized application is to help you create .desktop entry files for downloaded applications. As of usual with elemen
Demonstrates how to divide a DL model into multiple IR model files (division) and introduce a simplest way to implement a custom layer works with OpenVINO IR models.
Demonstration of OpenVINO techniques - Model-division and a simplest-way to support custom layers Description: Model Optimizer in Intel(r) OpenVINO(tm
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities
Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)
Galileo library for large scale graph training by JD
近年来,图计算在搜索、推荐和风控等场景中获得显著的效果,但也面临超大规模异构图训练,与现有的深度学习框架Tensorflow和PyTorch结合等难题。 Galileo(伽利略)是一个图深度学习框架,具备超大规模、易使用、易扩展、高性能、双后端等优点,旨在解决超大规模图算法在工业级场景的落地难题,提
In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqtrade so much yet.
My Freqtrade stuff In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqt
Mysterium the first tool which permits you to retrieve the most part of a Python code even the .py or .pyc was extracted from an executable file, even it is encrypted with every existing encryptage. Mysterium don't make any difference between encrypted and non encrypted files, it can retrieve code from Pyarmor or .pyc files.
Mysterium the first tool which permits you to retrieve the most part of a Python code even the .py or .pyc was extracted from an executable file, even it is encrypted with every existing encryptage. Mysterium don't make any difference between encrypted and non encrypted files, it can retrieve code from Pyarmor or .pyc files.
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
img2dataset Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. Also supports
This Mirror Bot is a multipurpose Telegram Bot writen in Python for mirroring files on the Internet to our beloved Google Drive.
MIRROR HUNTER This Mirror Bot is a multipurpose Telegram Bot writen in Python for mirroring files on the Internet to our beloved Google Drive. Repo la
debinstaller - A tool to install .deb files in any distro.
debinstaller A tool to install .deb files in any distro. Installation for debinstaller
borb is a library for reading, creating and manipulating PDF files in python.
borb is a library for reading, creating and manipulating PDF files in python.
💻 Open recent VS Code folders and files using Ulauncher
ulauncher-vscode-recent 💻 Open recent VS Code folders and files using Ulauncher. Quickly open recently-opened VS Code project directories and files.
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm
Goblyn is a Python tool focused to enumeration and capture of website files metadata.
Goblyn Metadata Enumeration What's Goblyn? Goblyn is a tool focused to enumeration and capture of website files metadata. How it works? Goblyn will se
Tool to create 3D printable terrain with integrated path/road part files (Single material 3d printer)
BACKGROUND This has been an ongoing project of mine for a few months now. I run trails a lot and original the goal was to create a function to combine
Python script that split PDF files.
Automatic PDF Splitter This script can create new single-page PDFs files from multipaged PDFs. Requirements Python 3.0+ # Debian distros sudo apt-get
commandpack - A package of modules for working with commands, command packages, files with command packages.
commandpack Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal:
shred - A cross-platform library for securely deleting files beyond recovery.
shred Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal: https:
AIL LeakFeeder: A Module for AIL Framework that automate the process to feed leaked files automatically to AIL
AIL LeakFeeder: A Module for AIL Framework that automates the process to feed leaked files automatically to AIL, So basically this feeder will help you ingest AIL with your leaked files automatically.
a Telegram bot writen in Python for searching files in Drive. Based on SearchX-bot
Drive Search Bot This is a Telegram bot writen in Python for searching files in Drive. Based on SearchX-bot How to deploy? Clone this repo: git clone
Convert ACSM files to DRM-free EPUB files with one command on Linux
Knock Convert ACSM files to DRM-free EPUB files using one command. This software does not utilize Adobe Digital Editions nor Wine. It is completely fr
The system to host your files on the Discord application
Distorage The system to host your files on the Discord application Documentation Documentation Distorage How to use the package You can install it wit
theHasher Tool created for generate strong and unbreakable passwords by using Hash Functions.Generate Hashes and store them in txt files.Use the txt files as lists to execute Brute Force Attacks!
$theHasher theHasher is a Tool for generating hashes using some of the most Famous Hashes Functions ever created. You can save your hashes to correspo
This software's intent is to automate all activities related to manage Axie Infinity Scholars. It is specially aimed to mangers with large scholar roasters.
Axie Scholars Utilities This software's intent is to automate all activities related to manage Scholars. It is specially aimed to mangers with large s
Merge multiple PDF files into one.
PDF Merger Merge multiple PDF files into one. Usage % python pdf_merger.py -h usage: pdf_merger.py [-h] [-o OUTPUT] [-f [FILES ...]] optional argumen
A super simple script which uses the GitHub API to convert your markdown files to GitHub styled HTML site.
A super simple script which uses the GitHub API to convert your markdown files to GitHub styled HTML site.
TextDescriptives - A Python library for calculating a large variety of statistics from text
A Python library for calculating a large variety of statistics from text(s) using spaCy v.3 pipeline components and extensions. TextDescriptives can be used to calculate several descriptive statistics, readability metrics, and metrics related to dependency distance.
Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"
Train longer, generalize better - Big batch training This is a code repository used to generate the results appearing in "Train longer, generalize bet
Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator
Malicious PDF Generator ☠️ Generate ten different malicious pdf files with phone-home functionality. Can be used with Burp Collaborator. Used for pene
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample
Alias-Free-Torch Simple torch module implementation of Alias-Free GAN. This repository including Alias-Free GAN style lowpass sinc filter @filter.py A
A Bot to Upload files to Many Cloud services. Powered by Telethon.
oVo MultiUpload V1.0 👀 A Bot to Upload files to Many Cloud services. Powered by Telethon _ 🎯 Follow me and star this repo for more telegram bots. @H
Evaluation suite for large-scale language models.
This repo contains code for running the evaluations and reproducing the results from the Jurassic-1 Technical Paper (see blog post), with current support for running the tasks through both the AI21 Studio API and OpenAI's GPT3 API.
Doing set operations on files considered as sets of lines
CLI tool that can be used to do set operations like union on files considering them as a set of lines. Notes It ignores all empty lines with whitespac
A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"
VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video
Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs.
Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs.
Enhanced version of blender's bvh add-on with more settings supported. The bvh's rest pose should have the same handedness as the armature while could use a different up/forward definiton.
Enhanced bvh add-on (importer/exporter) for blender Enhanced bvh add-on (importer/exporter) for blender Enhanced bvh importer Enhanced bvh exporter Ho
Python function to query SQLite files stored on S3
sqlite-s3-query Python function to query a SQLite file stored on S3. It uses multiple HTTP range requests per query to avoid downloading the entire fi
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data arXiv This is the code base for weakly supervised NER. We provide a
CredData is a set of files including credentials in open source projects
CredData is a set of files including credentials in open source projects. CredData includes suspicious lines with manual review results and more information such as credential types for each suspicious line. CredData can be used to develop new tools or improve existing tools. Furthermore, using the benchmark result of the CredData, users can choose a proper tool among open source credential scanning tools according to their use case.
Turn NY Times crosswords into Across Lite files
NYT Crossword to Puz A windows program to convert NY Times crosswords from the web to Across Lite compatible files. To run this, first download and de
Large scale embeddings on a single machine.
Marius Marius is a system under active development for training embeddings for large-scale graphs on a single machine. Training on large scale graphs
Blender 2.93 addon for loading Quake II MD2 files
io_mesh_md2 is a Blender 2.93 addon for importing Quake II MD2 files.
Edit SRT files to delay subtitle time-stamps.
subtitle-delay A program written in Python that directly edits SRT file to delay the subtitles. Features: Will throw an error if delaying with negativ
Plugin to generate BOM + CPL files for JLCPCB
KiCAD JLCPCB tools Plugin to generate all files necessary for JLCPCB board fabrication and assembly Gerber files Excellon files BOM file CPL file Furt
A small utility that sorts your files.
FileSorter A small utility that sorts your files. TODO: Scan directory to find files(thanks @corruptmemry for this!) Split extensions to determine fil
Uncomplete archive of files from the European Nopsled Team
European Nopsled CTF Archive This is an archive of collected material from various Capture the Flag competitions that the European Nopsled team played
A automated python script that creates mark-down files to read for the aes keys and other useful information.
Archive A automated python script that creates mark-down files to read for the aes keys and other useful information. Table of Contents Benbot Automat
Python retagging utility for mkv files using mkvmerge.
pyretag A python script to retag mkv files. Setting Up pip install pyfiglet pip install rich Move the mkv files to input folder.
pydicom - Read, modify and write DICOM files with python code
pydicom is a pure Python package for working with DICOM files. It lets you read, modify and write DICOM data in an easy "pythonic" way.
Google Project: Search and auto-complete sentences within given input text files, manipulating data with complex data-structures.
Auto-Complete Google Project In this project there is an implementation for one feature of Google's search engines - AutoComplete. Autocomplete, or wo
ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge (ManiSkill Challenge), a large-scale learning-from-demonstrations benchmark for object manipulation.
ManiSkill-Learn ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge, a large-scale learning-from-dem
Run Effective Large Batch Contrastive Learning on Limited Memory GPU
Gradient Cache Gradient Cache is a simple technique for unlimitedly scaling contrastive learning batch far beyond GPU memory constraint. This means tr
A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself
Scriptfab - What is it? A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code
A central task in drug discovery is searching, screening, and organizing large chemical databases
A central task in drug discovery is searching, screening, and organizing large chemical databases. Here, we implement clustering on molecular similarity. We support multiple methods to provide a interactive exploration of chemical space.
A Simple Telegram Bot To Download And Upload Files
AquaDLBot ➠ I Can Download And Upload files To Telegram DEMO Copyright (C) 2020-2026 by AsiaXDev@Github DONATE: BUY US A COFFEE CONTACT DEV : Asia DEP
Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files
Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files. Mdformat is a Unix-style command-line tool as well as a Python library.
A Simple Telegram Bot By @AsmSafone to Download Files From Mega.nz and Upload It to Telegram
MegaDL-Bot A Simple Telegram Bot By @AsmSafone to Download Files From Mega.nz and Upload It to Telegram Features No Login Required All Mega.nz File Li
Official Implementation of LARGE: Latent-Based Regression through GAN Semantics
LARGE: Latent-Based Regression through GAN Semantics [Project Website] [Google Colab] [Paper] LARGE: Latent-Based Regression through GAN Semantics Yot
A BurpSuite extension to parse 5GC NF OpenAPI 3.0 files to assess 5G core networks
5GC_API_parse Description 5GC API parse is a BurpSuite extension allowing to assess 5G core network functions, by parsing the OpenAPI 3.0 not supporte
A Python library to utilize AWS API Gateway's large IP pool as a proxy to generate pseudo-infinite IPs for web scraping and brute forcing.
A Python library to utilize AWS API Gateway's large IP pool as a proxy to generate pseudo-infinite IPs for web scraping and brute forcing.
FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)
FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery by Ailong Ma, Junjue Wang*, Yanfei Zhon
evtx-hunter helps to quickly spot interesting security-related activity in Windows Event Viewer (EVTX) files.
Introduction evtx-hunter helps to quickly spot interesting security-related activity in Windows Event Viewer (EVTX) files. It can process a high numbe
A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019).
ClusterGCN ⠀⠀ A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019). A
Simple Telegram Bot to Download and Upload Files From Mega.nz
Mega.nz-Bot Simple Telegram Bot to Download Files From Mega.nz and Upload It to Telegram Features All Mega.nz File Links supported No login required A
MHS2 Save file editing tools. Transfers save files between players, switch and pc version, encrypts and decrypts.
SaveTools MHS2 Save file editing tools. Transfers save files between players, switch and pc version, encrypts and decrypts. Credits Written by Asteris
Maltego transforms to pivot between PE files based on their VirusTotal codeblocks
VirusTotal Codeblocks Maltego Transforms Introduction These Maltego transforms allow you to pivot between different PE files based on codeblocks they
Official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.
MidiBERT-Piano Authors: Yi-Hui (Sophia) Chou, I-Chun (Bronwin) Chen Introduction This is the official repository for the paper, MidiBERT-Piano: Large-
A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography
A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography
An algorithm that handles large-scale aerial photo co-registration, based on SURF, RANSAC and PyTorch autograd.
An algorithm that handles large-scale aerial photo co-registration, based on SURF, RANSAC and PyTorch autograd.
Telegram Radio - A User-bot who continuously play random audio files (from the famous telegram music channel @mveargasm) in the intended voice chat.
MvEargasmDJ: This is my submission for the Telegram Radio Project of Baivaru. Which required a userbot to continiously play random audio files from th