1678 Repositories
Python Drug-Rating-Prediction-Portal-from-Drug-Reviews-Dataset-NLP-ML Libraries
Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction".
TGIN Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction". Files in the folder dataset/ electr
Citation Intent Classification in scientific papers using the Scicite dataset an Pytorch
Citation Intent Classification Table of Contents About the Project Built With Installation Usage Acknowledgments About The Project Citation Intent Cla
This repository has a implementations of data augmentation for NLP for Japanese.
daaja This repository has a implementations of data augmentation for NLP for Japanese: EDA: Easy Data Augmentation Techniques for Boosting Performance
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein
Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation
A repo for materials relating to the tutorial of CS-332 NLP
CS-332-NLP A repo for materials relating to the tutorial of CS-332 NLP Contents Tutorial 1: Introduction Corpus Regular expression Tokenization Tutori
What are the best Systems? New Perspectives on NLP Benchmarking
What are the best Systems? New Perspectives on NLP Benchmarking In Machine Learning, a benchmark refers to an ensemble of datasets associated with one
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.
Benchmarking Pipeline for Prediction of Protein-Protein Interactions
B4PPI Benchmarking Pipeline for the Prediction of Protein-Protein Interactions How this benchmarking pipeline has been built, and how to use it, is de
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.
Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated
Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated. This engine can later be used for downstream tasks in NLP such as Q&A, summarization, generation, and natural language understanding (NLU).
HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records
HiPAL Code for KDD'22 Applied Data Science Track submission -- HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electro
Computational inteligence project on faces in the wild dataset
Table of Contents The general idea How these scripts work? Loading data Needed modules and global variables Parsing the arrays in dataset Extracting a
Course materials for a 3-day seminar "Machine Learning and NLP: Advances and Applications" at New College of Florida
Machine Learning and NLP: Advances and Applications This repository hosts the course materials used for a 3-day seminar "Machine Learning and NLP: Adv
A python script that will automate the boring task of login to the captive portal again and again
A python script that will automate the boring task of login to the captive portal again and again
Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2.
Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2. It is trained (finetuned) on a curated list of approximately 45K Python (~470MB) files gathered from the Github. Currently, it just works properly on Python but not bad at other languages (thanks to GPT-2's power).
Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset
SW-CV-ModelZoo Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset Framework: TF/Keras 2.7 Training SQLite D
HAIS_2GNN: 3D Visual Grounding with Graph and Attention
HAIS_2GNN: 3D Visual Grounding with Graph and Attention This repository is for the HAIS_2GNN research project. Tao Gu, Yue Chen Introduction The motiv
Complete* list of autonomous driving related datasets
AD Datasets Complete* and curated list of autonomous driving related datasets Contributing Contributions are very welcome! To add or update a dataset:
Code for ML, domain generation, graph generation of ABC dataset
This is the repository for codes for ML, domain generation, graph generation of Asymmetric Buckling Columns (ABC) dataset in the paper "Learning Mechanically Driven Emergent Behavior with Message Passing Neural Networks".
L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources.
L3Cube-MahaCorpus L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources. We expand the existing Marathi monolingual
CodeContests is a competitive programming dataset for machine-learning
CodeContests CodeContests is a competitive programming dataset for machine-learning. This dataset was used when training AlphaCode. It consists of pro
A model which classifies reviews as positive or negative.
SentiMent Analysis In this project I built a model to classify movie reviews fromn the IMDB dataset of 50K reviews. WordtoVec : Neural networks only w
JaQuAD: Japanese Question Answering Dataset
JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension (2022, Skelter Labs)
Trafffic prediction analysis using hybrid models - Machine Learning
Hybrid Machine learning Model Clone the Repository Create a new Directory as assests and download the model from the below link Model Link To Start th
Text Analysis & Topic Extraction on Android App user reviews
AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http
A python package for deep multilingual punctuation prediction.
This python library predicts the punctuation of English, Italian, French and German texts. We developed it to restore the punctuation of transcribed spoken language.
A transformer which can randomly augment VOC format dataset (both image and bbox) online.
VocAug It is difficult to find a script which can augment VOC-format dataset, especially the bbox. Or find a script needs complex requirements so it i
To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.
To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.
Estimation of the CEFR complexity score of a given word, sentence or text.
NLP-Swedish … allows to estimate CEFR (Common European Framework of References) complexity score of a given word, sentence or text. CEFR scores come f
Natural language processing summarizer using 3 state of the art Transformer models: BERT, GPT2, and T5
NLP-Summarizer Natural language processing summarizer using 3 state of the art Transformer models: BERT, GPT2, and T5 This project aimed to provide in
CCP dataset from Clothing Co-Parsing by Joint Image Segmentation and Labeling
Clothing Co-Parsing (CCP) Dataset Clothing Co-Parsing (CCP) dataset is a new clothing database including elaborately annotated clothing items. 2, 098
Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.
COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data
A notebook to analyze Amazon Recommendation Review Dataset.
Amazon Recommendation Review Dataset Analyzer A notebook to analyze Amazon Recommendation Review Dataset. Features Calculates distinct user count, dis
This machine learning model was developed for House Prices
This machine learning model was developed for House Prices - Advanced Regression Techniques competition in Kaggle by using several machine learning models such as Random Forest, XGBoost and LightGBM.
1900-2016 Olympic Data Analysis in Python by plotting different graphs
🔥 Olympics Data Analysis 🔥 In Data Science field, there is a big topic before creating a model for future prediction is Data Analysis. We can find o
Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc)
Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc). Structured a custom ensemble model and a neural network. Found a outperformed model for heart failure prediction accuracy of 88 percent.
Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.
T-IAI-901-MSC2022 - GROUP 18 Gestion de projet Notre travail a été organisé et réparti dans un Trello. https://trello.com/b/X3s2fpPJ/ia-projet Install
Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification
Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification Introduction. This package includes the pyth
Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included.
pixel_character_generator Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included. Dataset TinyHero D
Semantic Segmentation Suite in TensorFlow
Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.
Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.
Breast Cancer Classification Model is applied on a different dataset
Breast Cancer Classification Model is applied on a different dataset
ANN model for prediction a spatio-temporal distribution of supercooled liquid in mixed-phase clouds using Doppler cloud radar spectra.
VOODOO Revealing supercooled liquid beyond lidar attenuation Explore the docs » Report Bug · Request Feature Table of Contents About The Project Built
Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle
Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle
Sematic-Segmantation - Semantic Segmentation on MIT ADE20K dataset in PyTorch
Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch impleme
Soccerdata - Efficiently scrape soccer data from various sources
SoccerData is a collection of wrappers over soccer data from Club Elo, ESPN, FBr
A Python command-line utility for validating that the outputs of a given Declarative Form Azure Portal UI JSON template map to the input parameters of a given ARM Deployment Template JSON template
A Python command-line utility for validating that the outputs of a given Declarative Form Azure Portal UI JSON template map to the input parameters of a given ARM Deployment Template JSON template
nlabel is a library for generating, storing and retrieving tagging information and embedding vectors from various nlp libraries through a unified interface.
nlabel is a library for generating, storing and retrieving tagging information and embedding vectors from various nlp libraries through a unified interface.
Training a deep learning model on the noisy CIFAR dataset
Training-a-deep-learning-model-on-the-noisy-CIFAR-dataset This repository contai
What games should I design next quarter?
Project1_Next-Quarter-Game-Design-Prediction What games should I design next quarter? 상황 : 게임회사 데이터팀 합류 문제 : '다음 분기에 어떤 게임을 설계해야할까' Data Description N
A machine learning model for Covid case prediction
CovidcasePrediction A machine learning model for Covid case prediction Problem Statement Using regression algorithms we can able to track the active c
GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates
GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates Vibhor Agarwal, Sagar Joglekar, Anthony P. Young an
This git repo contains the implementation of my ML project on Heart Disease Prediction
Introduction This git repo contains the implementation of my ML project on Heart Disease Prediction. This is a real-world machine learning model/proje
The code uses SegFormer for Semantic Segmentation on Drone Dataset.
SegFormer_Segmentation The code uses SegFormer for Semantic Segmentation on Drone Dataset. The details for the SegFormer can be obtained from the foll
The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model.
This repository contains the raw dataset used in NHNet [1] for the task of News Story Headline Generation. The code of data processing and training is available under Tensorflow Models - NHNet.
This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project
Common Voice Utils This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project. It aims t
Final Project for the Intel AI Readiness Boot Camp NLP (Jan)
NLP Boot Camp (Jan) Synopsis Full Name: Prameya Mohanty Name of your School: Delhi Public School, Rourkela Class: VIII Title of the Project: iTransect
Espial is an engine for automated organization and discovery of personal knowledge
Live Demo (currently not running, on it) Espial is an engine for automated organization and discovery in knowledge bases. It can be adapted to run wit
Mednlp - Medical natural language parsing and utility library
Medical natural language parsing and utility library A natural language medical
Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms
Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms
Using Logistic Regression and classifiers of the dataset to produce an accurate recall, f-1 and precision score
Using Logistic Regression and classifiers of the dataset to produce an accurate recall, f-1 and precision score
This GitHub repo consists of Code and Some results of project- Diabetes Treatment using Gold nanoparticles. These Consist of ML Models used for prediction Diabetes and further the basic theory and working of Gold nanoparticles.
GoldNanoparticles This GitHub repo consists of Code and Some results of project- Diabetes Treatment using Gold nanoparticles. These Consist of ML Mode
Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease
Heart_Disease_Classification Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease Dataset
A smart Chat bot that can help to know about corona virus and Make prediction of corona using X-ray.
TRINIT_Hum_kuchh_nahi_karenge_ML01 Document Link https://github.com/Jatin-Goyal-552/TRINIT_Hum_kuchh_nahi_karenge_ML01/blob/main/hum_kuchh_nahi_kareng
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction. The challenge aims to adress the problems of medical imbalanced data classification.
LSTM model - IMDB review sentiment analysis
NLP - Movie review sentiment analysis The colab notebook contains the code for building a LSTM Recurrent Neural Network that gives 87-88% accuracy on
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines spaCy-wrap is minimal library intended for wrapping fine-tuned transformers from t
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊
Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".
GNN_PPI Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction". Lear
In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.
Transformers are all you need In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a
Breast-Cancer-Prediction
Breast-Cancer-Prediction Trying to predict whether the cancer is benign or malignant using REGRESSION MODELS in Python. Team Members NAME ROLL-NUMBER
RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.
ruCLIP-SB RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and re
Notebooks, slides and dataset of the CorrelAid Machine Learning Winter School
CorrelAid Machine Learning Winter School Welcome to the CorrelAid ML Winter School! Task The problem we want to solve is to classify trees in Roosevel
Computer Vision Paper Reviews with Key Summary of paper, End to End Code Practice and Jupyter Notebook converted papers
Computer-Vision-Paper-Reviews Computer Vision Paper Reviews with Key Summary along Papers & Codes. Jonathan Choi 2021 The repository provides 100+ Pap
Deep Inertial Prediction (DIPr)
Deep Inertial Prediction For more information and context related to this repo, please refer to our website. Getting Started (non Docker) Note: you wi
Most Simple & Powefull web3 Trade Bot (WINDOWS LINUX) Suport BSC ETH
Most Simple & Powefull Trade Bot (WINDOWS LINUX) What Are Some Pros And Cons Of Owning A Sniper Bot? While having a sniper bot is typically an advanta
This repository provides the official code for GeNER (an automated dataset Generation framework for NER).
GeNER This repository provides the official code for GeNER (an automated dataset Generation framework for NER). Overview of GeNER GeNER allows you to
Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill
Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill This is a port of the amazing openskill.js package
TalkingHead-1KH is a talking-head dataset consisting of YouTube videos
TalkingHead-1KH Dataset TalkingHead-1KH is a talking-head dataset consisting of YouTube videos, originally created as a benchmark for face-vid2vid: On
NLP applications using deep learning.
NLP-Natural-Language-Processing NLP applications using deep learning like text generation etc. 1- Poetry Generation: Using a collection of Irish Poem
A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.
A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.
WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary
WikiPron WikiPron is a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary, as well as a database of pronuncia
Drug prediction
I have collected data about a set of patients, all of whom suffered from the same illness. During their course of treatment, each patient responded to one of 5 medications, Drug A, Drug B, Drug c, Drug x and y. Part of our job is to build a model to find out which drug might be appropriate for a future patient with the same illness. The feature sets of this dataset are Age, Sex, Blood Pressure, and Cholesterol of patients, and the target is the drug that each patient responded to.
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters
CNN-Filter-DB An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters Paul Gavrikov, Janis Keuper Paper: htt
FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction
FaceExtraction FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction Occlusions often occur in face images in the wild, tr
Tracking Progress in Question Answering over Knowledge Graphs
Tracking Progress in Question Answering over Knowledge Graphs Table of contents Question Answering Systems with Descriptions The QA Systems Table cont
Leaf: Multiple-Choice Question Generation
Leaf: Multiple-Choice Question Generation Easy to use and understand multiple-choice question generation algorithm using T5 Transformers. The applicat
OntoProtein: Protein Pretraining With Ontology Embedding
OntoProtein This is the implement of the paper "OntoProtein: Protein Pretraining With Ontology Embedding". OntoProtein is an effective method that mak
CSPML (crystal structure prediction with machine learning-based element substitution)
CSPML (crystal structure prediction with machine learning-based element substitution) CSPML is a unique methodology for the crystal structure predicti
A beginner’s guide to train and deploy machine learning pipelines in Python using PyCaret
This model involves Insurance bill prediction, which was subsequently deployed on Heroku PaaS
Housing Price Prediction
This project aim was to predict the price of houses in the Boston area during the great financial crisis through regression, as well as classify houses into different quality categories according to their features. For this we used a kaggle data set.
Predict the spans of toxic posts that were responsible for the toxic label of the posts
toxic-spans-detection An attempt at the SemEval 2021 Task 5: Toxic Spans Detection. The Toxic Spans Detection task of SemEval2021 required participant
Gems & Holiday Package Prediction
Predictive_Modelling Gems & Holiday Package Prediction This project is based on 2 cases studies : Gems Price Prediction and Holiday Package prediction
Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning
Machine_Learning Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning This project is based on 2 case-studies:
Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch
Semantic Segmentation Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch Features Applicable to followin
This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.
This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.
Detic ros - A simple ROS wrapper for Detic instance segmentation using pre-trained dataset
Detic ros - A simple ROS wrapper for Detic instance segmentation using pre-trained dataset
Twitter bot that uses NLP models to summarize news articles referenced in a user's twitter timeline
Twitter-News-Summarizer Twitter bot that uses NLP models to summarize news articles referenced in a user's twitter timeline 1.) Extracts all tweets fr
This is a NLP based project to extract effective date of the contract from their text files.
Date-Extraction-from-Contracts This is a NLP based project to extract effective date of the contract from their text files. Problem statement This is