103 Repositories
Python topic-modelling Libraries
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
Phone Level Mixture Density Network for TTS This repo contains pytorch implementation of paper Rich Prosody Diversity Modelling with Phone-level Mixtu
Repositório para o #alurachallengedatascience1
1° Challenge de Dados - Alura A Alura Voz é uma empresa de telecomunicação que nos contratou para atuar como cientistas de dados na equipe de vendas.
KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정한 코드입니다.
KoBERTopic 모델 소개 KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정했습니다. 기존 BERTopic : https://github.com/MaartenGr/BERTopic/tree/05a6790b21009d
An official repository for tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a University of Edinburgh master's course.
PMR computer tutorials on HMMs (2021-2022) This is a repository for computer tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a Univer
OMLT: Optimization and Machine Learning Toolkit
OMLT is a Python package for representing machine learning models (neural networks and gradient-boosted trees) within the Pyomo optimization environment.
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
TopClus The source code used for Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations, published in WWW 2022. Requ
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Text Analysis & Topic Extraction on Android App user reviews
AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http
CLNTM - Contrastive Learning for Neural Topic Model
Contrastive Learning for Neural Topic Model This repository contains the impleme
BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions
BERTopic BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principle Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or anonymize real-data datasets (Patki et al. 2016).
Continual Learning of Long Topic Sequences in Neural Information Retrieval
ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Active Transport Analytics Model: A new strategic transport modelling and data visualization framework
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew
Python package for downloading ECMWF reanalysis data and converting it into a time series format.
ecmwf_models Readers and converters for data from the ECMWF reanalysis models. Written in Python. Works great in combination with pytesmo. Citation If
Modelling the 30 salamander problem from `Pure Mathematics` by Martin Liebeck
Salamanders on an island The Problem From A Concise Introduction to Pure Mathematics By Martin Liebeck Critic Ivor Smallbrain is watching the horror m
Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab
PySDM PySDM is a package for simulating the dynamics of population of particles. It is intended to serve as a building block for simulation systems mo
NLP techniques such as named entity recognition, sentiment analysis, topic modeling, text classification with Python to predict sentiment and rating of drug from user reviews.
This file contains the following documents sumbited for Baruch CIS9665 group 9 fall 2021. 1. Dataset: drug_reviews.csv 2. python codes for text classi
Modified prey-predator system - Modified prey–predator model describes the rate of change for each species by adding coupling terms.
Modified prey-predator system We aim to study the behaviors of the modified prey–predator model and establish the effects of several parameters that p
News-Articles-and-Essays - NLP (Topic Modeling and Clustering)
NLP T5 Project proposal Topic Modeling and Clustering of News-Articles-and-Essays Students: Nasser Alshehri Abdullah Bushnag Abdulrhman Alqurashi OVER
A modular dynamical-systems model of Ethereum's validator economics.
CADLabs Ethereum Economic Model A modular dynamical-systems model of Ethereum's validator economics, based on the open-source Python library radCAD, a
3D online shooter written on Panda3D 1.10.10 and Python 3.10.1
на русском itch.io page Droid Game 3D This is a fresh game that was developed using the Panda3D game engine and Python language in the PyCharm IDE (I
Historical battle simulation package for Python
Jomini v0.1.4 Jomini creates military simulations by using mathematical combat models. Designed to be helpful for game developers, students, history e
MQTT Explorer - MQTT Subscriber client to explore topic hierarchies
mqtt-explorer MQTT Explorer - MQTT Subscriber client to explore topic hierarchies Overview The MQTT Explorer subscriber client is designed to explore
Some bravo or inspiring research works on the topic of curriculum learning.
Awesome Curriculum Learning Some bravo or inspiring research works on the topic of curriculum learning. A curated list of awesome Curriculum Learning
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing. It can be mainly used for non-English language to get accurate and relevant scraped text.
Computational modelling of ray propagation through optical elements using the principles of geometric optics (Ray Tracer)
Computational modelling of ray propagation through optical elements using the principles of geometric optics (Ray Tracer) Introduction By applying the
Reaction SMILES-AA mapping via language modelling
rxn-aa-mapper Reactions SMILES-AA sequence mapping setup conda env create -f conda.yml conda activate rxn_aa_mapper In the following we consider on ex
Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"
BP-Transformer This repo contains the code for our paper BP-Transformer: Modeling Long-Range Context via Binary Partition Zihao Ye, Qipeng Guo, Quan G
PG-19 Language Modelling Benchmark
PG-19 Language Modelling Benchmark This repository contains the PG-19 language modeling benchmark. It includes a set of books extracted from the Proje
A collection and example code of every topic you need to know about in the basics of Python.
The Python Beginners Guide: Master The Python Basics Tonight This guide is a collection of every topic you need to know about in the basics of Python.
A Python description of the Kinematic Bicycle Model with an animated example.
Kinematic Bicycle Model Abstract A python library for the Kinematic Bicycle model. The Kinematic Bicycle is a compromise between the non-linear and li
Simulation and Parameter Estimation in Geophysics
Simulation and Parameter Estimation in Geophysics - A python package for simulation and gradient based parameter estimation in the context of geophysical applications.
This repository is an individual project made at BME with the topic of self-driving car simulator and control algorithm.
BME individual project - NEAT based self-driving car This repository is an individual project made at BME with the topic of self-driving car simulator
A Unified Framework for Hydrology
Unified Framework for Hydrology The Python package unifhy (Unified Framework for Hydrology) is a hydrological modelling framework which combines inter
Pipeline for training LSA models using Scikit-Learn.
Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration This is the official repository for the EMNLP 2021 long pa
Linear programming solver for paper-reviewer matching and mind-matching
Paper-Reviewer Matcher A python package for paper-reviewer matching algorithm based on topic modeling and linear programming. The algorithm is impleme
Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"
Graph Neural Topic Model (GNTM) This is the pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Persp
Various Algorithms for Short Text Mining
Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te
Numba-accelerated Pythonic implementation of MPDATA with examples in Python, Julia and Matlab
PyMPDATA PyMPDATA is a high-performance Numba-accelerated Pythonic implementation of the MPDATA algorithm of Smolarkiewicz et al. used in geophysical
This is an example of a reproducible modelling project
An example of a reproducible modelling project What are we doing? This example was created for the 2021 fall lecture series of Stanford's Center for O
Conversational text Analysis using various NLP techniques
PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations Code repo for paper Trans-Encoder: Unsupervised sentence-pa
Python library for interactive topic model visualization. Port of the R LDAvis package.
pyLDAvis Python library for interactive topic model visualization. This is a port of the fabulous R package by Carson Sievert and Kenny Shirley. pyLDA
Code for Discovering Topics in Long-tailed Corpora with Causal Intervention.
Code for Discovering Topics in Long-tailed Corpora with Causal Intervention ACL2021 Findings Usage 0. Prepare environment Requirements: python==3.6 te
Topic Inference with Zeroshot models
zeroshot_topics Table of Contents Installation Usage License Installation zeroshot_topics is distributed on PyPI as a universal wheel and is available
Biterm Topic Model (BTM): modeling topics in short texts
Biterm Topic Model Bitermplus implements Biterm topic model for short texts introduced by Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng. Actua
topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API
NLP Space News Topic Modeling Photos by nasa.gov (1, 2, 3, 4, 5) and extremetech.com Table of Contents Project Idea Data acquisition Primary data sour
neo Tool is great one in binary exploitation topic
neo Tool is great one in binary exploitation topic. instead of doing several missions by many tools and windows, you can now automate this in one tool in one session.. Enjoy it
StackNet is a computational, scalable and analytical Meta modelling framework
StackNet This repository contains StackNet Meta modelling methodology (and software) which is part of my work as a PhD Student in the computer science
Python package to Create, Read, Write, Edit, and Visualize GSFLOW models
pygsflow pyGSFLOW is a python package to Create, Read, Write, Edit, and Visualize GSFLOW models API Documentation pyGSFLOW API documentation can be fo
Explainable Zero-Shot Topic Extraction
Zero-Shot Topic Extraction with Common-Sense Knowledge Graph This repository contains the code for reproducing the results reported in the paper "Expl
QuizGame is a quiz with different topics. You can choose a topic and take the quiz
QuizGame is a quiz with different topics. You can choose a topic and take the quiz. In the end you will get your result. The program is under active development, so there may be errors or flaws in it.
A project that automatically sends you a Medium article on a topic of your choosing to your email address daily.
Daily Article from Medium ✏️ About A project that automatically sends you a Medium article on a topic of your choosing to your email address daily. No
Summer: compartmental disease modelling in Python
Summer: compartmental disease modelling in Python Summer is a Python-based framework for the creation and execution of compartmental (or "state-based"
Deep-learning X-Ray Micro-CT image enhancement, pore-network modelling and continuum modelling
EDSR modelling A Github repository for deep-learning image enhancement, pore-network and continuum modelling from X-Ray Micro-CT images. The repositor
This project aims to be a handler for input creation and running of multiple RICEWQ simulations.
What is autoRICEWQ? This project aims to be a handler for input creation and running of multiple RICEWQ simulations. What is RICEWQ? From the descript
Analysis code and Latex source of the manuscript describing the conditional permutation test of confounding bias in predictive modelling.
Git repositoty of the manuscript entitled Statistical quantification of confounding bias in predictive modelling by Tamas Spisak The manuscript descri
This repo stores the codes for topic modeling on palliative care journals.
This repo stores the codes for topic modeling on palliative care journals. Data Preparation You first need to download the journal papers. bash 1_down
How will electric vehicles affect traffic congestion and energy consumption: an integrated modelling approach
EV-charging-impact This repository contains the code that has been used for the Queue modelling for the paper "How will electric vehicles affect traff
Concept Modeling: Topic Modeling on Images and Text
Concept is a technique that leverages CLIP and BERTopic-based techniques to perform Concept Modeling on images.
Generative Modelling of BRDF Textures from Flash Images [SIGGRAPH Asia, 2021]
Neural Material Official code repository for the paper: Generative Modelling of BRDF Textures from Flash Images [SIGGRAPH Asia, 2021] Henzler, Deschai
Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!
Auto-Research A no-code utility to generate a detailed well-cited survey with topic clustered sections (draft paper format) and other interesting arti
CMPE 204 Modelling Project
CISC/CMPE 204 Modelling Project Welcome to the major project for CISC/CMPE 204 (Fall 2021)! Change this README.md file to summarize your project (few
Customisable pharmacokinetic model accessible via bash CLI allowing for variable dose calculations as well as intravenous and subcutaneous administration calculations
Pharmacokinetic Modelling Group Project A PharmacoKinetic (PK) modelling function for analysis of injected solute dynamics over time, developed by Gro
Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA
n-stage Latent Dirichlet Allocation (n-LDA) Proposed n-LDA & A Novel Approach for classical LDA Latent Dirichlet Allocation (LDA) is a generative prob
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch
KaziText is a tool for modelling common human errors.
KaziText KaziText is a tool for modelling common human errors. It estimates probabilities of individual error types (so called aspects) from grammatic
OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere.
opendrift OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere. Do
NLP topic mdel LDA - Gathered from New York Times website
NLP topic mdel LDA - Gathered from New York Times website
This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields Project Page | Paper | Supplementary | Video | Slides | Blog | Talk If
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"
Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor
A Tensorflow based library for Time Series Modelling with Gaussian Processes
Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Anchored CorEx: Hierarchical Topic Modeling with Minimal Domain Knowledge Correlation Explanation (CorEx) is a topic model that yields rich topics tha
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm
Dataloader tools for language modelling
Installation: pip install lm_dataloader Design Philosophy A library to unify lm dataloading at large scale Simple interface, any tokenizer can be inte
ETM - R package for Topic Modelling in Embedding Spaces
ETM - R package for Topic Modelling in Embedding Spaces This repository contains an R package called topicmodels.etm which is an implementation of ETM
Modelling and Implementation of Cable Driven Parallel Manipulator System with Tension Control
Cable Driven Parallel Robots (CDPR) is also known as Cable-Suspended Robots are the emerging and flexible end effector manipulation system. Cable-driven parallel robots (CDPRs) are categorized as a type of parallel manipulators
Civsim is a basic civilisation simulation and modelling system built in Python 3.8.
Civsim Introduction Civsim is a basic civilisation simulation and modelling system built in Python 3.8. It requires the following packages: perlin_noi
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
OCTIS : Optimizing and Comparing Topic Models is Simple! OCTIS (Optimizing and Comparing Topic models Is Simple) aims at training, analyzing and compa
A Python library created to assist programmers with complex mathematical functions
libmaths libmaths was created not only as a learning experience for me, but as a way to make mathematical models in seconds for Python users using mat
:boar: :bear: Deep Learning based Python Library for Stock Market Prediction and Modelling
bulbea "Deep Learning based Python Library for Stock Market Prediction and Modelling." Table of Contents Installation Usage Documentation Dependencies
Top2Vec is an algorithm for topic modeling and semantic search.
Top2Vec is an algorithm for topic modeling and semantic search. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors.
Fast, flexible and easy to use probabilistic modelling in Python.
Please consider citing the JMLR-MLOSS Manuscript if you've used pomegranate in your academic work! pomegranate is a package for building probabilistic
Supervised domain-agnostic prediction framework for probabilistic modelling
A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data
Module for statistical learning, with a particular emphasis on time-dependent modelling
Operating system Build Status Linux/Mac Windows tick tick is a Python 3 module for statistical learning, with a particular emphasis on time-dependent
A Python library created to assist programmers with complex mathematical functions
libmaths was created not only as a learning experience for me, but as a way to make mathematical models in seconds for Python users using mat
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
A Python toolbox for gaining geometric insights into high-dimensional data
"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview
Audio library for modelling loudness
Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc
A standard framework for modelling Deep Learning Models for tabular data
PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike.
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
A Python toolbox for gaining geometric insights into high-dimensional data
"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Fast topic modeling platform
The state-of-the-art platform for topic modeling. Full Documentation User Mailing List Download Releases User survey What is BigARTM? BigARTM is a pow
Machine learning, in numpy
numpy-ml Ever wish you had an inefficient but somewhat legible collection of machine learning algorithms implemented exclusively in NumPy? No? Install