2082 Repositories
Python text-analysis Libraries
Yesitsme - Simple OSINT script to find Instagram profiles by name and e-mail/phone
Simple OSINT script to find Instagram profiles by name and e-mail/phone
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2
Pretty-doc - Composable text objects with python
pretty-doc from __future__ import annotations from dataclasses import dataclass
Deasciify-highlighted - A Python script for deasciifying text to Turkish and copying clipboard
deasciify-highlighted is a Python script for deasciifying text to Turkish and copying clipboard.
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementray code for paper accepted in ICDAR 2021 We h
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Desktop utility to download images/videos/music/text from various websites, and more
Desktop utility to download images/videos/music/text from various websites, and more
A Python library for generating new text from existing samples.
ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt
Python data processing, analysis, visualization, and data operations
Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio
Text editor on python tkinter to convert english text to other languages with the help of ployglot.
Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan
A python notification tool used for sending you text messages when certain conditions are met in the game, Neptune's Pride.
A python notification tool used for sending you text messages when certain conditions are met in the game, Neptune's Pride.
The text based version of my App Blocker that I planning on converting to GUI soon.
App-Blocker The text based version of my App Blocker that I planning on converting to GUI soon. Currently I am just uploading the appblocker.py file,
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call.
Discord_Meme_Bot 🤣 Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call. Add the bot
Sentiment analysis translations of the Bhagavad Gita
Sentiment and Semantic Analysis of Bhagavad Gita Translations It is well known that translations of songs and poems not only breaks rhythm and rhyming
Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.
Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach Thanh Luan Nguyen, Tri Nhu Do, Georges Kaddoum
Long text token classification using LongFormer
Long text token classification using LongFormer
This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.
IPL-data-analysis This project consists of data analysis and data visualization of all IPL seasons from 2008 to 2019 and answering the most asked ques
Project for the discipline of Visual Data Analysis at EMAp FGV.
Analysis of the dissemination of fake news about COVID-19 on Twitter This project was the final work for the discipline of Visual Data Analysis of the
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.
Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind
Use graph-based analysis to re-classify stocks and to improve Markowitz portfolio optimization
Dynamic Stock Industrial Classification Use graph-based analysis to re-classify stocks and experiment different re-classification methodologies to imp
A collection of useful functions for writers to analyze text/stories.
AuthorTools AuthorTools provides a multitude of functions for easily analyzing (your?) writing. AuthorTools is made especially for creative writers wi
Self-supervised spatio-spectro-temporal represenation learning for EEG analysis
EEG-Oriented Self-Supervised Learning and Cluster-Aware Adaptation This repository provides a tensorflow implementation of a submitted paper: EEG-Orie
Madanalysis5 - A package for event file analysis and recasting of LHC results
Welcome to MadAnalysis 5 Outline What is MadAnalysis 5? Requirements Downloading
Aircraft design optimization made fast through modern automatic differentiation
Aircraft design optimization made fast through modern automatic differentiation. Plug-and-play analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
Frappe tinymce - Frappe app to replace default text editor with tinymce
Frappe tinyMCE tinyMCE Text Editor for frappe apps Replace frappe's Quill Text E
Arabic-Phonetic-Output - You can input the phonetic version of any Arabic text here. This software will show you output in Arabic (with vowels)
Arabic-Phonetic-Output You can input the phonetic version of any Arabic text her
Stock-Prediction - prediction of stock market movements using sentiment analysis and deep learning.
Stock-Prediction- In this project, we aim to enhance the prediction of stock market movements using sentiment analysis and deep learning. We divide th
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.
ALSPAC data analysis studying links between screen-usage and mental health issues in children. Provided data has been synthesised.
ADSMH - Mental Health and Screen Time Group coursework for Applied Data Science at the University of Bristol. Overview The data set that you have was
El Niño - Southern Oscillation analysis compared to minimum flow rates of rivers in northeast Brazil
ENSO (El Niño - Southern Oscillation) analysis in northeast Brazil É comprovada a influência dos fenômenos El Niño e La Niña nas secas no nordesde bra
2021 AI CUP Competition on Traditional Chinese Scene Text Recognition - Intermediate Contest
繁體中文場景文字辨識 程式碼說明 組別:這就是我 成員:蔣明憲 唐碩謙 黃玥菱 林冠霆 蕭靖騰 目錄 環境套件 安裝方式 資料夾布局 前處理-製作偵測訓練註解檔 前處理-製作分類訓練樣本 part.py : 從 json 裁切出分類訓練樣本 Class.py : 將切出來的樣本按照文字分類到各資料夾
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.
DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2
EchoDNS - Analyze your DNS traffic super easy, shows all requested DNS traffic
EchoDNS - Analyze your DNS traffic super easy, shows all requested DNS traffic
Image-generation-baseline - MUGE Text To Image Generation Baseline
MUGE Text To Image Generation Baseline Requirements and Installation More detail
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed
A menu for pygame. Simple, and easy to use
pygame-menu Source repo on GitHub, and run it on Repl.it Introduction Pygame-menu is a python-pygame library for creating menus and GUIs. It supports
This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures using receptive field analysis (RFA) and create graph visualizations of your architecture.
ReceptiveFieldAnalysisToolbox This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures usin
👄 The most accurate natural language detection library for Python, suitable for long and short text alike
1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr
Sentiment-Analysis and EDA on the IMDB Movie Review Dataset
Sentiment-Analysis and EDA on the IMDB Movie Review Dataset The main part of the work focuses on the exploration and study of different approaches whi
ioztat is a storage load analysis tool for OpenZFS
ioztat is a storage load analysis tool for OpenZFS. It provides iostat-like statistics at an individual dataset/zvol level.
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries by disassembling and analyzing the compiled python byte-code(.pyc) files across all python versions (including Python 3.10.*)
Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
PyPortfolioOpt has recently been published in the Journal of Open Source Software 🎉 PyPortfolioOpt is a library that implements portfolio optimizatio
Practical Machine Learning with Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Semi-Automated Data Processing
Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meaningful decision to achieve a low-bias and low-variance model.
Security audit Python project dependencies against security advisory databases.
Security audit Python project dependencies against security advisory databases.
Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.
2019-indian-election-eda Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle. This project is a part of the Cou
API Server for VoIP analysis (CDR + Audio CODECs)
Swagger generated server Overview This server was generated by the swagger-codegen project. By using the OpenAPI-Spec from a remote server, you can ea
A Telegram bot written in python.
telegram_bot This bot is currently a beta project. Features A telegram bot which can: Send current COVID-19 cases/stats of Germany Send current worth
AB-test-analyzer - Python class to perform AB test analysis
AB-test-analyzer Python class to perform AB test analysis Overview This repo con
Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products
Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products.
Repo for The Crown: Exploratory Analysis of Nim Malware DEF CON 615 talk
Repo for "The Crown: Exploratory Analysis of Nim Malware" DEF CON 615 talk
Multi-Stage Episodic Control for Strategic Exploration in Text Games
XTX: eXploit - Then - eXplore Requirements First clone this repo using git clone https://github.com/princeton-nlp/XTX.git Please create two conda envi
Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis
Introduction This is an implementation of our paper Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis.
HuSpaCy: industrial-strength Hungarian natural language processing
HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Repository for the AugmentedPCA Python package.
Overview This Python package provides implementations of Augmented Principal Component Analysis (AugmentedPCA) - a family of linear factor models that
Specification language for generating Generalized Linear Models (with or without mixed effects) from conceptual models
tisane Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships TL;DR: Analysts can use Tisane to author gener
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".
Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition
Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende
Active Transport Analytics Model: A new strategic transport modelling and data visualization framework
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew
Sentinel-1 SAR time series analysis for OSINT use
SARveillance Sentinel-1 SAR time series analysis for OSINT use. Description Generates a time lapse GIF of the Sentinel-1 satellite images for the loca
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.
This is an AI that is supposed to say you if your text is formal or not
This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the text which the AI. The TXT-file isn't important but necessary.
Send SMS text messages via email with as many accounts as you want :)
SMS-Spammer Send SMS text messages via email with as many accounts as you want :) Example Set Up Guide! To start log into the gmail account you would
Text classification on IMDB dataset using Keras and Bi-LSTM network
Text classification on IMDB dataset using Keras and Bi-LSTM Text classification on IMDB dataset using Keras and Bi-LSTM network. Usage python3 main.py
People tracker on the Internet: OSINT analysis and research tool by Jose Pino
trape (stable) v2.0 People tracker on the Internet: Learn to track the world, to avoid being traced. Trape is an OSINT analysis and research tool, whi
A text file containing 479k English words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion
List Of English Words A text file containing over 466k English words. While searching for a list of english words (for an auto-complete tutorial) I fo
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
Natural Language Processing Best Practices & Examples
NLP Best Practices In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive bus
macOS development environment setup: Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process.
dev-setup Motivation Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process. dev-setup aims to simplify the process w
Jupyter notebook and datasets from the pandas Q&A video series
Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note
Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0
NLP-Models-Tensorflow, Gathers machine learning and tensorflow deep learning models for NLP problems, code simplify inside Jupyter Notebooks 100%. Tab
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
100 pandas puzzles Puzzles notebook Solutions notebook Inspired by 100 Numpy exerises, here are 100* short puzzles for testing your knowledge of panda
FMA: A Dataset For Music Analysis
FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information
Python library for the analysis of dynamic measurements
Python library for the analysis of dynamic measurements The goal of this library is to provide a starting point for users in metrology and related are
Streamlit App For Product Analysis - Streamlit App For Product Analysis
Streamlit_App_For_Product_Analysis Здравствуйте! Перед вами дашборд, позволяющий
Delta Conformity Sociopatterns Analysis - Delta Conformity Sociopatterns Analysis
Delta_Conformity_Sociopatterns_Analysis ∆-Conformity is a local homophily measur
Moji sends text and fun facts from different APIs wit da use of a notification deamon
Moji sends text and fun facts from different APIs wit da use of a notification deamon. Can be runned via dmenu or rofi.
Text editor on python to convert english text to malayalam(Romanization/Transiteration).
Manglish Text Editor This is a simple transiteration (romanization ) program which is used to convert manglish to malayalam (converts njaan to ഞാൻ ).
Trained T5 and T5-large model for creating keywords from text
text to keywords Trained T5-base and T5-large model for creating keywords from text. Supported languages: ru Pretraining Large version | Pretraining B
Practical Time-Series Analysis, published by Packt
Practical Time-Series Analysis This is the code repository for Practical Time-Series Analysis, published by Packt. It contains all the supporting proj
U-Time: A Fully Convolutional Network for Time Series Segmentation
U-Time & U-Sleep Official implementation of The U-Time [1] model for general-purpose time-series segmentation. The U-Sleep [2] model for resilient hig
A simple flask application to collect annotations for the Turing Change Point Dataset, a benchmark dataset for change point detection algorithms
AnnotateChange Welcome to the repository of the "AnnotateChange" application. This application was created to collect annotations of time series data
Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
taganomaly Anomaly detection labeling tool, specifically for multiple time series (one time series per category). Taganomaly is a tool for creating la
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection
Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat
Algorithms for outlier, adversarial and drift detection
Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline d
Survival analysis in Python
What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu
:spaghetti: Pastas is an open-source Python framework for the analysis of hydrological time series.
Pastas: Analysis of Groundwater Time Series Pastas: what is it? Pastas is an open source python package for processing, simulating and analyzing groun
(JMLR' 19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Python Outlier Detection (PyOD) Deployment & Documentation & Stats & License PyOD is a comprehensive and scalable Python toolkit for detecting outlyin
Hierarchical Time Series Forecasting with a familiar API
scikit-hts Hierarchical Time Series with a familiar API. This is the result from not having found any good implementations of HTS on-line, and my work
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis
bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P
Highly comparative time-series analysis
〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu