26 Repositories
Python glue-catalog Libraries
AI Summer's complete catalog of articles
Learn Deep Learning with AI Summer A collection of all articles (almost 100) written for the AI Summer blog organized by topic. Deep Learning Theory M
List of Land Cover datasets in the GEE Catalog
List of Land Cover datasets in the GEE Catalog A list of all the Land Cover (or discrete) datasets in Google Earth Engine. Values, Colors and Descript
A middle-to-high level algorithm book designed with coding interview at heart!
Hands-on Algorithmic Problem Solving A one-stop coding interview prep book! About this book In short, this is a middle-to-high level algorithm book de
Streaming Finance Data with AWS Lambda
A data pipeline consisting of an AWS lambda function reading data from yfinance API, an AWS Kinesis stream to receive & store data in S3 buckets and AWS Glue crawler & Athena to run SQL queries.
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines.
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines. It includes tools for downloading pipelines and their dependencies and tools for measuring their performace.
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
BERT-of-Theseus Code for paper "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing". BERT-of-Theseus is a new compressed BERT by progre
AWS Glue ETL Code Samples
AWS Glue ETL Code Samples This repository has samples that demonstrate various aspects of the new AWS Glue service, as well as various AWS Glue utilit
Automatic earthquake catalog building workflow: EQTransformer + Siamese EQTransformer + PickNet + REAL + HypoInverse
Automatic regional-scale earthquake catalog building workflow: EQTransformer + Siamese EQTransforme
Pytorch implementation for Patient Knowledge Distillation for BERT Model Compression
Patient Knowledge Distillation for BERT Model Compression Knowledge distillation for BERT model Installation Run command below to install the environm
Intake is a lightweight package for finding, investigating, loading and disseminating data.
Intake: A general interface for loading data Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake helps
AWS Glue PySpark - Apache Hudi Quick Start Guide
AWS Glue PySpark - Apache Hudi Quick Start Guide Disclaimer: This is a quick start guide for the Apache Hudi Python Spark connector, running on AWS Gl
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy
Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a
Small flask based opds catalog designed to serve a directory via OPDS
teenyopds Small flask based opds catalog designed to serve a directory via OPDS, it has currently only been verified to work with KyBook 3 on iOS but
Utils to quickly evaluate many 🤗 models on the GLUE tasks
Utils to quickly evaluate many 🤗 models on the GLUE tasks
EODAG is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a unified API for data access regardless of the data provider
EODAG (Earth Observation Data Access Gateway) is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a unified API for data access regardless of the data provider
Glue is a python project to link visualizations of scientific datasets across many files.
Glue Glue is a python project to link visualizations of scientific datasets across many files. Click on the image for a quick demo: Features Interacti
Tutor plugin for integration of Open edX with a Richie course catalog
Richie plugin for Tutor This is a plugin to integrate Richie, the learning portal CMS, with Open edX. The integration takes the form of a Tutor plugin
A Pythonic Data Catalog powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
DeltaCAT DeltaCAT is a Pythonic Data Catalog powered by Ray. Its data storage model allows you to define and manage fast, scalable, ACID-compliant dat
Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques
PURPOSE OF PROJECT That this project will be created by the SOC/Incident Response Community Develop a Catalog of Incident Response Playbook for every
AI Dungeon Catalog Archive Toolkit
AI Dungeon Content Archive Toolkit (AID CAT) AID CAT is a command-line utility that will allow you to download JSON backups of: Your private and publi
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Service request portal on top of Ansible Tower
Squest - A service request portal based on Ansible Tower Squest is a Web portal that allow to expose Tower based automation as a service. If you want
Turn a STAC catalog into a dask-based xarray
StackSTAC Turn a list of STAC items into a 4D xarray DataArray (dims: time, band, y, x), including reprojection to a common grid. The array is a lazy
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
CKAN: The Open Source Data Portal Software CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana