Machine Learning Research
1. Project Topic
1.1. Existing Research
- Benchmarks
- ACL Anthology for NLP papers: https://aclanthology.org/
- Online proceedings of major ML conferences:
  - NeurIPS: https://papers.neurips.cc/
  - ICML, ICLR, CVPR, EMNLP, NAACL
- Online preprint servers, e.g. arXiv: https://arxiv.org/
- Top papers mentioned on Twitter
- Others
1.2. Datasets and Tasks
- Hugging Face Datasets: https://huggingface.co/datasets (see the loading sketch after this list)
- Kaggle has many datasets, though some of them are too small for deep learning: https://www.kaggle.com/datasets
- State-of-the-art results for NLP and many other ML tasks, tracked by Papers with Code: https://paperswithcode.com/sota
- A small list of well-known standard datasets for common NLP tasks: https://machinelearningmastery.com/datasets-natural-language-processing/
- An alphabetical list of free or public-domain text datasets
- Wikipedia has a list of machine learning text datasets, tabulated with useful information such as dataset size: https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research#Text_data
- Datahub has lots of datasets, though not all of them are machine learning focused: https://datahub.io/
- Microsoft Research has a collection of datasets (look under the ‘Dataset directory’ tab): https://www.microsoft.com/en-us/research/academic-program/data-science-microsoft-research/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fprojects%2Fdata-science-initiative%2F%20datasets.aspx#!dataset-directory
- A script to search arXiv papers for a keyword and extract important information such as performance metrics on a task (a sketch of such a script appears after this list)
- Datasets for machine translation
- Syntactic corpora for many languages
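To make the Hugging Face Datasets entry above concrete, here is a minimal loading sketch. It assumes the `datasets` package is installed (`pip install datasets`), and the dataset name "imdb" is only an illustrative choice, not one prescribed by this list.

```python
# Minimal sketch: load a dataset from the Hugging Face Hub with the `datasets` library.
from datasets import load_dataset

# "imdb" is just an example; browse https://huggingface.co/datasets for others.
dataset = load_dataset("imdb")

print(dataset)                             # available splits (train/test/...)
print(dataset["train"].features)           # column names and types
print(dataset["train"][0]["text"][:200])   # peek at the first training example
```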
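The arXiv-search script itself is not linked above, so the sketch below only shows what such a script might look like, using the public arXiv Atom API and the third-party `feedparser` package (`pip install feedparser`). The metric "extraction" here is a deliberately crude regex over abstracts; a real script would need more careful parsing.

```python
# Hedged sketch: keyword search over arXiv via its public Atom API.
import re
import urllib.parse

import feedparser  # third-party: pip install feedparser

def search_arxiv(keyword, max_results=10):
    """Query the arXiv API for a keyword and return parsed feed entries."""
    query = urllib.parse.quote(f"all:{keyword}")
    url = ("http://export.arxiv.org/api/query"
           f"?search_query={query}&start=0&max_results={max_results}")
    return feedparser.parse(url).entries

if __name__ == "__main__":
    for entry in search_arxiv("machine translation"):
        print(entry.title)
        # Crude heuristic: flag abstract sentences that mention common metrics.
        for sentence in entry.summary.split(". "):
            if re.search(r"\b(BLEU|F1|accuracy|ROUGE)\b", sentence, re.IGNORECASE):
                print("  metric mention:", sentence.strip())
```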
2. Project Advice
2.1. Processing Data
- StanfordNLP, a Python library providing tokenization, tagging, parsing, and other capabilities: https://stanfordnlp.github.io/stanfordnlp/
- Other software from the Stanford NLP group: http://nlp.stanford.edu/software/index.shtml
- NLTK, a lightweight Natural Language Toolkit package in Python: http://nltk.org/
- spaCy, another Python package that can do preprocessing but also includes neural models (e.g. language models): https://spacy.io/ (see the usage sketch after this list)
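As a quick illustration of the tools above, here is a minimal preprocessing sketch combining NLTK tokenization with a spaCy pipeline. The spaCy model name `en_core_web_sm` (its small English model, installed via `python -m spacy download en_core_web_sm`) and the example sentence are assumptions for the demo.

```python
# Minimal sketch: tokenize with NLTK, then tag and parse with spaCy.
import nltk
import spacy

nltk.download("punkt")  # NLTK tokenizer data; newer NLTK versions may also need "punkt_tab"

text = "Stanford NLP tools make preprocessing easy. They also parse sentences."

# NLTK: lightweight word tokenization.
print(nltk.word_tokenize(text))

# spaCy: tokenization plus part-of-speech tagging and dependency parsing.
nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed
for token in nlp(text):
    print(token.text, token.pos_, token.dep_)
```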
3. Top-Tier ML & AI Conferences
- NeurIPS: Neural Information Processing Systems (formerly abbreviated NIPS). NeurIPS has gotten huge over the past few years as AI has become so important. It has a focus on neural networks, but not exclusively.
- ICML: International Conference on Machine Learning. Has a general machine learning focus.
- ICLR: International Conference on Learning Representations. ICLR was really the first conference focused on deep learning. It’s called “learning representations” because the motivation behind deep learning is to automatically learn higher-level features, or representations, that summarize data in useful ways. Deep learning describes the structure of our current best solution to the problem of learning these representations.
- AAAI: Association for the Advancement of Artificial Intelligence. AAAI is a little more applications-focused, and a little less theoretical, than some of the other AI conferences.
- CVPR: Computer Vision and Pattern Recognition.
- ICCV: International Conference on Computer Vision.