75 Repositories
Python metadata Libraries
Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets.
Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps find essential insights from training results and improves AI performance.
Google-drive-to-sqlite - Create a SQLite database containing metadata from Google Drive
google-drive-to-sqlite Create a SQLite database containing metadata from Google
Metadata-Extractor - Metadata Extractor Script can be used to read in exif metadata
Metadata Extractor The exifextract script can be used to read in exif metadata f
This repository contains the Unix Game challenges and metadata
This repository contains the Unix Game challenges and metadata
DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata
DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata
Python command line tool and python engine to label table fields and fields in data files.
Python command line tool and python engine to label table fields and fields in data files. It could help to find meaningful data in your tables and data files or to find Personal identifable information (PII).
A data structure that extends pyspark.sql.DataFrame with metadata information.
MetaFrame A data structure that extends pyspark.sql.DataFrame with metadata info
The game company we work for has two events that we want to track: buy an item and join a guild. Each of them has metadata characteristic of such events.
The game company we work for has two events that we want to track: buy an item and join a guild. Each of them has metadata characteristic of such events.
Music, Album and Playlist downloader for JioSaavn
jiosaavn-dl Music, Album and Playlist downloader for JioSaavn Features Downloads tracks, albums and playlists in maximum available quality (320kbps AA
Fetch papers and metadata.
Fetch PubMed Central for open-access papers as well as Sci-Hub
SickNerd aims to slowly enumerate Google Dorks via the googlesearch API then requests found pages for metadata
CLI tool for making Google Dorking a passive recon experience. With the ability to fetch and filter dorks from GHDB.
WebApp served by OAK PoE device to visualize various streams, metadata and AI results
DepthAI PoE WebApp | Bootstrap 4 & Vue.js SPA Dashboard Based on dashmin (https:
Bib-parser - Convenient script to parse .bib files with the ACM Digital Library like metadata
Bib Parser Convenient script to parse .bib files with the ACM Digital Library li
Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Google Cloud Storage
Keepsake Version control for machine learning. Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Goo
Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art!
osu-Extract Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art! Requirements python3 mutagen pillow Us
A tool for fixing inconsistent timestamp metadata (atime, ctime, and mtime).
Mtime Fixer Mtime Fixer is a tool for fixing inconsistent timestamp metadata (atime, ctime, and mtime). Sometimes timestamp metadata of folders are in
ASF Sentinel-1 Metadata Download tool
ASF Sentinel-1 Metadata Download tool Copyright: 2021-2022 Antonio Valentino Small Python tool (asfsmd) that allows to download XML files containing S
A Python reference implementation of the CF data model
cfdm A Python reference implementation of the CF data model. References Compliance with FAIR principles Documentation https://ncas-cms.github.io/cfdm
YAML metadata extension for Python-Markdown
YAML metadata extension for Python-Markdown This extension adds YAML meta data handling to markdown with all YAML features. As in the original, metada
Neptune client library - integrate your Python scripts with Neptune
Lightweight experiment tracking tool for AI/ML individuals and teams. Fits any workflow. Neptune is a lightweight experiment logging/tracking tool tha
Media file renamer and organizion tool
mnamer mnamer (media renamer) is an intelligent and highly configurable media organization utility. It parses media filenames for metadata, searches t
Fast and robust date extraction from web pages, with Python or on the command-line
Find original and updated publication dates of any web page. From the command-line or within Python, all the steps needed from web page download to HTML parsing, scraping, and text analysis are included.
Download your Spotify playlists and songs along with album art and metadata
spotDL Download your Spotify playlists and songs along with album art and metadata The fastest, easiest, and most accurate command-line music download
A CLI tool to reduce the friction between data scientists by reducing git conflicts removing notebook metadata and gracefully resolving git conflicts.
databooks is a package for reducing the friction data scientists while using Jupyter notebooks, by reducing the number of git conflicts between different notebooks and assisting in the resolution of the conflicts.
Manipulate EXIF and IFD metadata.
Tyf Copyright Distribution Support this project Buy Ѧ and: Send Ѧ to AUahWfkfr5J4tYakugRbfow7RWVTK35GPW Vote arky on Ark blockchain and earn Ѧ weekly
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
Urial (URI Addition tooL) intelligently updates URIs stored in Finder comments of macOS files
Urial Urial (URI addition tool) is a simple but intelligent command-line tool to add or replace URIs found inside macOS Finder comments. Table of cont
A python script for extracting/removing exif data from images by @AbirHasan2005
Image-Exif A Python script for extracting exif metadata from images. How to use? Using this script you can extract exif data from image and save in .c
Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries.
Mini-SQL-Engine Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries. The query engine supports Project, A
Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists with a Plex server.
PlexMusicSync Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists (m3u, m3u8) with a Plex server. The song file
「📖」Tool created to extract metadata from a domain
Metafind is an OSINT tool created with the aim of automating the search for metadata of a particular domain from the search engine known as Google.
Add Ranges and page numbers to IIIF Manifest from a CSV.
Add Ranges and page numbers to IIIF Manifest from CSV specific to a workflow of the Bibliotheca Hertziana.
Generate your own NFTs and their metadata based on your desired probabilities.
Generate your own NFTs and their metadata based on your desired probabilities. Use your own art assets too! Perfect for use with Candy Machine.
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models. Hyperactive: is very easy to lear
Centralized whale instance using github actions, sourcing metadata from bigquery-public-data.
Whale Demo Instance: Bigquery Public Data This is a fully-functioning demo instance of the whale data catalog, actively scraping data from Bigquery's
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.
Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da
Glyph Metadata Palette
This plugin for Glyphs3 allows you to associate arbitrary structured metadata to each glyph in your font.
RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation (CIKM'17)
RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation This is the implementation of RATE: Overcoming Noise and Spar
Build custom OSINT tools and APIs (Ping, Traceroute, Scans, Archives, DNS, Scrape, Whois, Metadata & built-in database for more info) with this python package
Build custom OSINT tools and APIs with this python package - It includes different OSINT modules (Ping, Traceroute, Scans, Archives, DNS, Scrape, Whoi
ICS-Visualizer is an interactive Industrial Control Systems (ICS) network graph that contains up-to-date ICS metadata
ICS-Visualizer is an interactive Industrial Control Systems (ICS) network graph that contains up-to-date ICS metadata (Name, company, port, user manua
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
A tool for batch processing large fasta files and accompanying metadata table to upload to repositories via API
Fasta Uploader A tool for batch processing large fasta files and accompanying metadata table to repositories via API The python fasta_uploader.py scri
Analyze metadata of your Python project.
Analyze metadata of your Python projects Setup: Clone repo py-m venv venv (venv) pip install -r requirements.txt specify the folders which you want to
Rover is a command line interface application that allows through browse through mission data, images, metadata from the NASA Official Website
🤖 rover Rover is a command line interface application that allows through browse through mission data, images, metadata from the NASA Official Websit
Automatically move or copy files based on metadata associated with the files. For example, file your photos based on EXIF metadata or use MP3 tags to file your music files.
Automatically move or copy files based on metadata associated with the files. For example, file your photos based on EXIF metadata or use MP3 tags to file your music files.
Enrich IP addresses with metadata and security IoC
Stratosphere IP enrich Get an IP address and enrich it with metadata and IoC You need API keys for VirusTotal and PassiveTotal (RiskIQ) How to use fro
Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python
AICSImageIO Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Pure Python Features Supports reading metadata and imaging
MetaStalk is a tool that can be used to generate graphs from the metadata of JPEG, TIFF, and HEIC images
MetaStalk About MetaStalk is a tool that can be used to generate graphs from the metadata of JPEG, TIFF, and HEIC images, which are tested. More forma
Photini - A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS.
A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS. "Metadata" is said to mea
Python CMR is an easy to use wrapper to the NASA EOSDIS Common Metadata Repository API.
This repository is a copy of jddeal/python_cmr which is no longer maintained. It has been copied here with the permission of the original author for t
[ACM MM2021] MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification
Introduction This project is developed based on FastReID, which is an ongoing ReID project. Projects BUC In projects/BUC, we implement AAAI 2019 paper
Tooling for converting STAC metadata to ODC data model
Tooling for converting STAC metadata to ODC data model.
Easy to use Python module to extract Exif metadata from digital image files.
Easy to use Python module to extract Exif metadata from digital image files.
Unique image & metadata generation using weighted layer collections.
nft-generator-py nft-generator-py is a python based NFT generator which programatically generates unique images using weighted layer files. The progra
A daily updated JSON dataset of all the Open House London venues, events, and metadata
Open House London listings data All of it. Automatically scraped hourly with updates committed to git, autogenerated per-day CSV's, and autogenerated
Youtube playlist downloader with full metadata support
ytrake GUI tool to embed metadata for albums on Youtube with youtube-dl. Requires youtube-dl v2021.06.06. Post-processing Album metadata: Usage ytrake
Goblyn is a Python tool focused to enumeration and capture of website files metadata.
Goblyn Metadata Enumeration What's Goblyn? Goblyn is a tool focused to enumeration and capture of website files metadata. How it works? Goblyn will se
Generate YARA rules for OOXML documents using ZIP local header metadata.
apooxml Generate YARA rules for OOXML documents using ZIP local header metadata. To learn more about this tool and the methodology behind it, check ou
Hierarchical Metadata-Aware Document Categorization under Weak Supervision (WSDM'21)
Hierarchical Metadata-Aware Document Categorization under Weak Supervision This project provides a weakly supervised framework for hierarchical metada
Code I use to automatically update my videos' metadata on YouTube
mCodingYouTube This repository contains the code I use to automatically update my videos' metadata on YouTube, including: titles, descriptions, tags,
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a
A machine learning software for extracting information from scholarly documents
GROBID GROBID documentation Visit the GROBID documentation for more detailed information. Summary GROBID (or Grobid, but not GroBid nor GroBiD) means
Enumerate Microsoft 365 Groups in a tenant with their metadata
Enumerate Microsoft 365 Groups in a tenant with their metadata Description The all_groups.py script allows to enumerate all Microsoft 365 Groups in a
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic. Exclusiv
Textpipe: clean and extract metadata from text
textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata
Textpipe: clean and extract metadata from text
textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)
trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow
Extract embedded metadata from HTML markup
extruct extruct is a library for extracting embedded metadata from HTML markup. Currently, extruct supports: W3C's HTML Microdata embedded JSON-LD Mic
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li
scrapes medias, likes, followers, tags and all metadata. Inspired by instagram-php-scraper,bot
instagram_scraper This is a minimalistic Instagram scraper written in Python. It can fetch media, accounts, videos, comments etc. `Comment` and `Like`
Search for documents in a domain through Google. The objective is to extract metadata
MetaFinder - Metadata search through Google _____ __ ___________ .__ .___ / \
Download song lyrics and metadata from Genius.com 🎶🎤
LyricsGenius: a Python client for the Genius.com API lyricsgenius provides a simple interface to the song, artist, and lyrics data stored on Genius.co
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li
Python module for handling audio metadata
Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The