1736 Repositories
Python Twitter-NLP-Analysis Libraries
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Code for our paper "Interactive Analysis of CNN Robustness"
Perturber Code for our paper "Interactive Analysis of CNN Robustness" Datasets Feature visualizations: Google Drive Fine-tuning checkpoints as saved m
This library contains a Tensorflow implementation of the paper Stability Analysis of Unfolded WMMSE for Power Allocation
UWMMSE-stability Tensorflow implementation of Stability Analysis of UWMMSE Overview This library contains a Tensorflow implementation of the paper Sta
ViSD4SA, a Vietnamese Span Detection for Aspect-based sentiment analysis dataset
UIT-ViSD4SA PACLIC 35 General Introduction This repository contains the data of the paper: Span Detection for Vietnamese Aspect-Based Sentiment Analys
SciBERT is a BERT model trained on scientific text.
Documentation and issues for Pylance - Fast, feature-rich language support for Python
Tindicators is a Python library to calculate the values of various technical indicators
Terraform module to ship CloudTrail logs stored in a S3 bucket into a Kinesis stream for further processing and real-time analysis.
AWS infrastructure to ship CloudTrail logs from S3 to Kinesis This repository contains a Terraform module to ship CloudTrail logs stored in a S3 bucke
The first online catalogue for Arabic NLP datasets.
Masader The first online catalogue for Arabic NLP datasets. This catalogue contains 200 datasets with more than 25 metadata annotations for each datas
novel deep learning research works with PaddlePaddle
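Research Publishes cutting-edge research work based on PaddlePaddle, including top-conference papers and competition-winning models in CV, NLP, KG, STDM, and other fields. Contents: Computer Vision, Natural Language Processing, Knowledge Graph, Spatio-Temporal Data Mining (Spa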
💛 Code and Dataset for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes"
Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes Official PyTorch implementation and EmoCause evaluatio
Propose a principled and practically effective framework for unsupervised accuracy estimation and error detection tasks with theoretical analysis and state-of-the-art performance.
Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles This project is for the paper: Detecting Errors and Estimating
Deep Markov Factor Analysis (NeurIPS2021)
Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn
NLP, Machine learning
Netflix-recommendation-system NLP, Machine learning About Recommendation algorithms are at the core of the Netflix product. It provides their members
A simple demonstration of integrating a sentiment analysis tool in a django project
sentiment-analysis A simple demonstration of integrating a sentiment analysis tool in a django project (watch the video .mp4) To run this project : pi
Script for scraping user data such as "id, username, fullname, followers, tweets, etc." through Twitter's search engine.
TwitterScraper Script for scraping user data such as "id, username, fullname, followers, tweets, etc." through Twitter's search engine. Screenshot Data Users Only
Hapi is a Python library for building Conceptual Distributed Model using HBV96 lumped model & Muskingum routing method
Hapi - Hydrological library for Python Hapi is an open-sourc
Medical image analysis framework merging ANTsPy and deep learning
ANTsPyNet A collection of deep learning architectures and applications ported to the python language and tools for basic medical image processing. Bas
The Multi-Mission Maximum Likelihood framework (3ML)
The Multi-Mission Maximum Likelihood framework (3ML) A framework for multi-wavelength/multi-messenger analysis for astronomy/astrophysics.
Python-based Space Physics Environment Data Analysis Software
pySPEDAS pySPEDAS is an implementation of the SPEDAS framework for Python. The Space Physics Environment Data Analysis Software (SPEDAS) framework is
BioMASS - A Python Framework for Modeling and Analysis of Signaling Systems
Mathematical modeling is a powerful method for the analysis of complex biological systems. Although many studies are devoted to produ
MaD GUI is a basis for graphical annotation and computational analysis of time series data.
MaD GUI Machine Learning and Data Analytics Graphical User Interface MaD GUI is a basis for graphical annotation and computational analysis of time se
A collection of papers about Transformer in the field of medical image analysis.
Analysis of rationale selection in neural rationale models
Neural Rationale Interpretability Analysis We analyze the neural rationale models proposed by Lei et al. (2016) and Bastings et al. (2019), as impleme
Python and OpenCV-based scene cut/transition detection program & library.
Video Scene Cut Detection and Analysis Tool Latest Release: v0.5.6.1 (October 11, 2021) Main Webpage: py.scenedetect.com Documentation: manual.scenede
DataPrep — The easiest way to prepare data in Python
Twitter Media Downloader (Telegram Bot)
Python Library for Signal/Image Data Analysis with Transport Methods
PyTransKit Python Transport Based Signal Processing Toolkit Website and documentation: https://pytranskit.readthedocs.io/ Installation The library cou
A cross-lingual COVID-19 fake news dataset
CrossFake An English-Chinese COVID-19 fake&real news dataset from the ICDMW 2021 paper below: Cross-lingual COVID-19 Fake News Detection. Jiangshu Du,
This is an official pytorch implementation of Fast Fourier Convolution.
Fast Fourier Convolution (FFC) for Image Classification This is the official code of Fast Fourier Convolution for image classification on ImageNet. Ma
Smart discord chatbot integrated with Dialogflow
academic-NLP-chatbot Smart discord chatbot integrated with Dialogflow to interact with students naturally and manage different classes in a school. De
NLP topic model (LDA) on data gathered from the New York Times website
Implementation of ICCV21 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers
Implementation of ICCV 2021 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers arxiv This repository is based on detr Recently, DETR
Using Selenium with Python to web-scrape popular YouTube tech channels.
Web Scraping Popular YouTube Tech Channels with Selenium Data Mining, Data Wrangling, and Exploratory Data Analysis About the Data Web scrapi
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
African language Speech Recognition - Speech-to-Text
Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l
DaDRA (day-druh) is a Python library for Data-Driven Reachability Analysis.
DaDRA (day-druh) is a Python library for Data-Driven Reachability Analysis. The main goal of the package is to accelerate the process of computing estimates of forward reachable sets for nonlinear dynamical systems.
Statistical Analysis 📈 A repository focused on statistical analysis and exploration of various data sets for personal and professional projects.
Statistical Analysis 📈 This repository focuses on statistical analysis and the exploration used on various data sets for personal and professional pr
A collection of learning outcomes data analysis using Python and SQL, from DQLab.
Data Analyst with PYTHON A Data Analyst is responsible for producing data analysis and presenting insights to support the decision-making process
PyNHD is a part of HyRiver software stack that is designed to aid in watershed analysis through web services.
A part of HyRiver software stack that provides access to NHD+ V2 data through NLDI and WaterData web services
Python Package for DataHerb: create, search, and load datasets.
The Python Package for DataHerb A DataHerb Core Service to Create and Load Datasets.
PyPSA: Python for Power System Analysis
1 Python for Power System Analysis Contents 1 Python for Power System Analysis 1.1 About 1.2 Documentation 1.3 Functionality 1.4 Example scripts as Ju
A powerful data analysis package based on mathematical step functions. Strongly aligned with pandas.
The leading use-case for the staircase package is for the creation and analysis of step functions. Pretty exciting huh. But don't hit the close button
HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets
HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets that can be described as multidimensional arrays o
Additional tools for particle accelerator data analysis and machine information
PyLHC Tools This package is a collection of useful scripts and tools for the Optics Measurements and Corrections group (OMC) at CERN. Documentation Au
Toolchest provides APIs for scientific and bioinformatic data analysis.
Toolchest Python Client Toolchest provides APIs for scientific and bioinformatic data analysis. It allows you to abstract away the costliness of runni
A data analysis using python and pandas to showcase trends in school performance.
A data analysis using python and pandas to showcase trends in school performance. A data analysis to showcase trends in school performance using Panda
An end-to-end regression problem of predicting the price of properties in Bangalore.
Bangalore-House-Price-Prediction An end-to-end regression problem of predicting the price of properties in Bangalore. Deployed in Heroku using Flask.
Repository for storing the materials of the STMKGxHMGI Long Course on Geophysical Python for Seismic Data Analysis
Long Course "Geophysical Python for Seismic Data Analysis" Instructor: Dr.rer.nat. Wiwit Suryanto, M.Si Prepared by: Anang Sahroni Time: Session 1
The OHSDI OMOP Common Data Model allows for the systematic analysis of healthcare observational databases.
The OHSDI OMOP Common Data Model allows for the systematic analysis of healthcare observational databases.
Universal data analysis tools for atmospheric sciences
U_analysis Universal data analysis tools for atmospheric sciences Script written in python 3. This file defines multiple functions that can be used fo
TCube generates rich and fluent narratives that describe the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.
TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat
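A Twitter push plugin based on NoneBot2
HanayoriBot (Twitter plugin) ✨ A Twitter push plugin based on NoneBot2, with a built-in Baidu Translate interface ✨ Introduction This plugin is based on NoneBot2 and go-cqhttp. It promptly pushes the latest tweets from Twitter users to group chats and includes a tweet translation interface based on Baidu Translate, so you can keep up with the overseas activity of the VTubers you follow.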
A utility library to scrape data from TikTok, Instagram, Twitch, YouTube, Twitter or Reddit in one line!
Social Media Scraper A utility library to scrape data from TikTok, Instagram, Twitch, YouTube, Twitter or Reddit in one line! Go to the website » Vie
BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia.
BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia. Its intended use is as input for neural models in natural language processing.
Tribuo - A Java machine learning library
Tribuo - A Java prediction library (v4.1) Tribuo is a machine learning library in Java that provides multi-class classification, regression, clusterin
A Python script to download Twitter Spaces; it only works on Spaces that are currently running (for now).
SurvTRACE: Transformers for Survival Analysis with Competing Events
⭐ SurvTRACE: Transformers for Survival Analysis with Competing Events This repo provides the implementation of SurvTRACE for survival analysis. It is
A Python API wrapper for the Twitter API!
PyTweet PyTweet is an API wrapper for Twitter built on version 2 of Twitter's API! Installation Windows py3 -m pip install PyTweet Linux python -m pip
Exploratory Data Analysis for Employee Retention Dataset
Exploratory Data Analysis for Employee Retention Dataset Employee turn-over is a very costly problem for companies. The cost of replacing an employee
Twitter Bootstrap for Django Form - A simple Django template tag to work with Bootstrap
Nanosensor Image Processor (NanoImgPro), a python-based image analysis tool for dopamine nanosensors
NanoImgPro Nanosensor Image Processor (NanoImgPro), a python-based image analysis tool for dopamine nanosensors NanoImgPro.py contains the main class
Train 🤗-transformers model with Poutyne.
poutyne-transformers Train 🤗 -transformers models with Poutyne. Installation pip install poutyne-transformers Example import torch from transformers
Google's Meena transformer chatbot implementation
Here's my attempt at recreating Meena, a state of the art chatbot developed by Google Research and described in the paper Towards a Human-like Open-Domain Chatbot.
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ⚖️ 🏆 🧑🎓 👩⚖️ Dataset Summary Inspired by the recent widespread use of th
Code for CodeT5: a new code-aware pre-trained encoder-decoder model.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation This is the official PyTorch implementation
Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API.
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Angora is a mutation-based fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution.
Angora Angora is a mutation-based coverage guided fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without s
HyDiff: Hybrid Differential Software Analysis
HyDiff: Hybrid Differential Software Analysis This repository provides the tool and the evaluation subjects for the paper HyDiff: Hybrid Differential
Twitter Scraper
Twitter's API is annoying to work with and has lots of limitations. Luckily, their frontend (JavaScript) has its own API, which I reverse-engineered. No API rate limits. No restrictions. Extremely fast.
A Python library for choreographing your machine learning research.
Tweet stream in OBS browser source
Tweetron Tweetron is a tool that displays tweets on screen using an OBS browser source. Windows only (tested on Windows 10). Download: get the latest version here (a beta test version is currently being distributed) Download ver0.0.
Tablexplore is an application for data analysis and plotting built in Python using the PySide2/Qt toolkit.
Signature Remover is an NLP-based solution that removes email signatures from the rest of the text.
Signature Remover Signature Remover is an NLP-based solution that removes email signatures from the rest of the text. It helps to enhance data conten
Customer churn prediction and prevention in the telecom industry using machine learning and survival analysis
Telco Customer Churn Prediction - Plotly Dash Application Description This dash application allows you to predict telco customer churn using machine l
Tickergram is a Telegram bot to look up quotes, charts, general market sentiment and more.
Analyse a forensic target (such as a directory) and report which files are found and not found in the CIRCL hashlookup public service
Analyse a forensic target (such as a directory) and report which files are found and not found in the CIRCL hashlookup public service. This tool can help a digital forensic investigator understand the context and origin of specific files during a digital forensic investigation.
Training RNNs as Fast as CNNs
News SRU++, a new SRU variant, is released. [tech report] [blog] The experimental code and SRU++ implementation are available on the dev branch which
First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we want to understand column level lineage and automate impact analysis.
dbt-osmosis First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we wan
Beyond Paragraphs: NLP for Long Sequences
Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks
Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks, which modifies the input text with a textual template and directly uses PLMs to conduct pre-trained tasks. This library provides a standard, flexible and extensible framework to deploy the prompt-learning pipeline. OpenPrompt supports loading PLMs directly from huggingface transformers. In the future, we will also support PLMs implemented by other libraries.
Google and Stanford University released a new pre-trained model called ELECTRA
Google and Stanford University released a new pre-trained model called ELECTRA, which has a much more compact model size and relatively competitive performance compared to BERT and its variants. To further accelerate research on Chinese pre-trained models, the Joint Laboratory of HIT and iFLYTEK Research (HFL) has released the Chinese ELECTRA models based on the official code of ELECTRA. ELECTRA-small can reach similar or even higher scores on several NLP tasks with only 1/10 of the parameters of BERT and its variants.
pcnaDeep integrates cutting-edge detection techniques with tracking and cell cycle resolving models.
pcnaDeep: a deep-learning based single-cell cycle profiler with PCNA signal Welcome! pcnaDeep integrates cutting-edge detection techniques with tracki
Tools for collecting social media data around focal events
Social Media Focal Events The focalevents codebase provides tools for organizing data collected around focal events on social media. It is often diffi
A Chinese to English Neural Model Translation Project
ZH-EN NMT Chinese to English Neural Machine Translation This project is inspired by Stanford's CS224N NMT Project Dataset used in this project: News C
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.
Paradigm Shift in NLP - "Paradigm Shift in Natural Language Processing".
Paradigm Shift in NLP Welcome to the webpage for "Paradigm Shift in Natural Language Processing". Some resources of the paper are constantly maintaine
Elementary is an open-source data reliability framework for modern data teams. The first module of the framework is data lineage.
Data lineage made simple, reliable, and automated. Effortlessly track the flow of data, understand dependencies and analyze impact. Features Visualiza
ThePhish: an automated phishing email analysis tool
ThePhish ThePhish is an automated phishing email analysis tool based on TheHive, Cortex and MISP. It is a web application written in Python 3 and base
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"
Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor
Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning
Manifold-SCA Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning The repo is org
Pytorch implementations of various Deep NLP models in cs-224n(Stanford Univ)
DeepNLP-models-Pytorch Pytorch implementations of various Deep NLP models in cs-224n(Stanford Univ: NLP with Deep Learning) This is not for Pytorch be
An IPython Notebook tutorial on deep learning for natural language processing, including structure prediction.
Table of Contents: Introduction to Torch's Tensor Library Computation Graphs and Automatic Differentiation Deep Learning Building Blocks: Affine maps,
WRENCH: Weak supeRvision bENCHmark
🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development
A Twitter Bot that retweets and likes tweets with the hashtags #girlscriptwoc and #girlscript, and also follows the user.
GirlScript Winter of Contributing Twitter Bot A Twitter Bot that retweets and likes tweets with the hashtags #girlscriptwoc and #girlscript, and also f
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.
📊📈 Serves up Pandas dataframes via the Django REST Framework for use in client-side (i.e. d3.js) visualizations and offline analysis (e.g. Excel)