Dive into Machine Learning

Overview


Hi there! You might find this guide helpful if:

For some great alternatives, jump to the end or check out Nam Vu's guide, Machine Learning for Software Engineers.

Of course, there is no easy path to expertise. Also, I'm not an expert! I just want to connect you with some great resources from experts. Applications of ML are all around us, so I think it's in the public interest for more people to learn more about ML, especially hands-on. Fortunately, there are many different ways to learn.

Whatever motivates you to dive into machine learning, if you know a bit of Python, these days you can get hands-on with a machine learning "Hello World!" in minutes.

Let's get started

Tools you'll need

If you prefer local installation

  • Python. Python 3 is the best option.
  • Jupyter Notebook. (Formerly known as IPython Notebook.)
  • Some scientific computing packages:
    • numpy
    • pandas
    • scikit-learn
    • matplotlib

You can install Python 3 and all of these packages in a few clicks with the Anaconda Python distribution. Anaconda is popular in Data Science and Machine Learning communities. (Use whichever tool you want.)
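Once everything is installed (via Anaconda or otherwise), here's a quick sanity check you can run in Python or in a notebook cell. This is just a convenience snippet of my own, not part of any particular tutorial:

```python
# Sanity check: confirm the core scientific packages import, and print their versions.
import matplotlib
import numpy
import pandas
import sklearn

for pkg in (numpy, pandas, sklearn, matplotlib):
    print(pkg.__name__, pkg.__version__)
```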

Cloud-based options

Some options you can use from your browser:

For other options, see:

Let's go!

Learn how to use Jupyter Notebook (5-10 minutes). (You can learn by screencast instead.)

Now, follow along with this brief exercise: An introduction to machine learning with scikit-learn. Do it in IPython or a Jupyter Notebook, coding along and executing the code as you go.

I'll wait.

What just happened?

You just classified some hand-written digits using scikit-learn. Neat huh?
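If you'd like a self-contained snippet to tinker with afterwards, here's a condensed sketch in the same spirit as that tutorial (my own abbreviation of it, so details differ): load the bundled digits dataset, fit a classifier, and score it on held-out data.

```python
# A condensed "Hello World" for scikit-learn: classify hand-written digits.
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()          # 8x8 digit images, flattened into 64 features each

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

clf = svm.SVC(gamma=0.001)               # a support vector classifier, as in the tutorial
clf.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```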

Dive in

A Visual Introduction to Machine Learning

Let's learn a bit more about Machine Learning, and a couple of common ideas and concerns. Read "A Visual Introduction to Machine Learning, Part 1" by Stephanie Yee and Tony Chu.


It won't take long. It's a beautiful introduction ... Try not to drool too much!

A Few Useful Things to Know about Machine Learning

OK. Let's dive deeper.

Read "A Few Useful Things to Know about Machine Learning" by Prof. Pedro Domingos. It's densely packed with valuable information, but not opaque.

Take a little time with this one. Take notes. Don't worry if you don't understand it all yet.

The whole paper is packed with value, but I want to call out two points:

  • Data alone is not enough. This is where science meets art in machine-learning. Quoting Domingos: "... the need for knowledge in learning should not be surprising. Machine learning is not magic; it can’t get something from nothing. What it does is get more from less. Programming, like all engineering, is a lot of work: we have to build everything from scratch. Learning is more like farming, which lets nature do most of the work. Farmers combine seeds with nutrients to grow crops. Learners combine knowledge with data to grow programs."
  • More data can beat a cleverer algorithm. Listen up, programmers. We like cool tools. Resist the temptation to reinvent the wheel, or to over-engineer solutions. Your starting point is to Do the Simplest Thing that Could Possibly Work. Quoting Domingos: "Suppose you’ve constructed the best set of features you can, but the classifiers you’re getting are still not accurate enough. What can you do now? There are two main choices: design a better learning algorithm, or gather more data. [...] As a rule of thumb, a dumb algorithm with lots and lots of data beats a clever one with modest amounts of it. (After all, machine learning is all about letting data do the heavy lifting.)"

When you work on a real Machine Learning problem, you should focus your efforts on your domain knowledge and data before optimizing your choice of algorithms. Prefer to do simple things until you have to increase complexity. You should not rush into neural networks because you think they're cool. To improve your model, get more data. Then use your knowledge of the problem to explore and process the data. You should only optimize the choice of algorithms after you have gathered enough data, and you've processed it well.
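To make "do the simplest thing" concrete, here's a small sketch (my own illustration, not from Domingos): establish a trivial baseline first, then check whether a simple, well-understood model beats it, before reaching for anything fancier.

```python
# Baseline first: only add complexity if it clearly pays for itself.
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

baseline = DummyClassifier(strategy="most_frequent")                 # always predicts the majority class
simple_model = make_pipeline(StandardScaler(), LogisticRegression())

print("baseline accuracy:", cross_val_score(baseline, X, y, cv=5).mean())
print("logistic regression accuracy:", cross_val_score(simple_model, X, y, cv=5).mean())
```

If the simple model isn't good enough, Domingos's advice applies: look harder at your data and features before swapping in a cleverer algorithm.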

Jargon note

Just about time for a break...

Totally optional: some podcast episodes of note

First, download an interview with Prof. Domingos on the Data Skeptic podcast (2018). Prof. Domingos wrote the paper we read earlier. You might also start reading his book, The Master Algorithm, a clear and accessible overview of machine learning. (It's available as an audiobook too.)

Next, subscribe to more machine learning and data science podcasts! These are great, low-effort resources that you can casually learn more from. To learn effectively, listen over time, with plenty of headspace. By the way, don't speed up technical podcasts; that can hinder your comprehension.

Subscribe to Talking Machines.

I suggest this listening order:

  • Download the "Starting Simple" episode, and listen to that soon. It supports what we read from Domingos. Ryan Adams talks about starting simple, as we discussed above. Adams also stresses the importance of feature engineering. Feature engineering is an exercise of the "knowledge" Domingos writes about. In a later episode, they share many concrete tips for feature engineering. (There's a small feature-engineering sketch just after this list.)
  • Then, over time, you can listen to the entire podcast series (start from the beginning).
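Here's that small feature-engineering sketch mentioned above. It's a toy illustration of my own, with made-up data and hypothetical column names; the point is that domain knowledge gets encoded as new input columns.

```python
# Feature engineering: turn what you know about the problem into new columns.
import pandas as pd

trips = pd.DataFrame({
    "pickup_time": pd.to_datetime(["2021-06-01 08:15", "2021-06-05 23:40", "2021-06-06 14:05"]),
    "distance_km": [2.5, 12.0, 7.3],
    "duration_min": [11, 35, 20],
})

# Domain knowledge: rush hour and weekends matter; so does average speed.
trips["hour"] = trips["pickup_time"].dt.hour
trips["is_weekend"] = trips["pickup_time"].dt.dayofweek >= 5
trips["avg_speed_kmh"] = trips["distance_km"] / (trips["duration_min"] / 60)

print(trips)
```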

Want to subscribe to more podcasts? Here's a good listicle of suggestions, and another.

OK! Take a break, come back refreshed.


Play to learn

Next, play along with one or more of these notebooks.

  • Dr. Randal Olson's Example Machine Learning notebook: "let's pretend we're working for a startup that just got funded to create a smartphone app that automatically identifies species of flowers from pictures taken on the smartphone. We've been tasked by our head of data science to create a demo machine learning model that takes four measurements from the flowers (sepal length, sepal width, petal length, and petal width) and identifies the species based on those measurements alone."
  • Various notebooks by topic:
  • Notebooks in a series:
    • ageron/handson-ml2 - "Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python." Scikit-Learn, Keras, TensorFlow 2.

Find more great Jupyter Notebooks when you're ready:


Immerse yourself

Pick one of the courses below and start on your way.

Recommended course: Prof. Andrew Ng's Machine Learning on Coursera

Prof. Andrew Ng's Machine Learning is a popular and esteemed free online course. I've seen it recommended often. And emphatically.

You might like to have a pet project to play with, on the side. When you are ready for that, you could explore one of these Awesome Public Datasets, paperswithcode.com/datasets, or datasetlist.com.

Also, it's recommended to grab a textbook to use as an in-depth reference. The two I saw recommended most often were Understanding Machine Learning and Elements of Statistical Learning. You only need one of the two as your main reference; here's some context/comparison to help you pick the one that's right for you. Each book is available as a free PDF at those links, so grab them!

Tips for this course

Tips for studying on a busy schedule

It's hard to make time available every week. So, you can try to study more effectively within the time you have available. Here are some ways to do that:

Take my tips with a grain of salt

I am not a machine learning expert. I'm just a software developer and these resources/tips were useful to me as I learned some ML on the side.

Other courses

More free online courses I've seen recommended. (Machine Learning, Data Science, and related topics.)

Getting Help: Questions, Answers, Chats

Start with the support forums and chats related to the course(s) you're taking.

Check out datascience.stackexchange.com and stats.stackexchange.com – for example, the machine-learning tag. There are some subreddits, like /r/LearningMachineLearning and /r/MachineLearning.

Don't forget about meetups. Also, nowadays there are many active and helpful online communities around the ML ecosystem. Look for chat invitations on project pages and so on.

Supplement: Learning Pandas well

You'll want to get more familiar with Pandas.
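If pandas is new to you, here's a taste of the basics. This minimal sketch uses the Iris measurements bundled with scikit-learn purely for convenience (load_iris(as_frame=True) needs scikit-learn 0.23 or newer):

```python
# A first taste of pandas: put a small dataset in a DataFrame and poke at it.
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)       # returns the data as a pandas DataFrame
df = iris.frame                       # 150 rows: four measurements plus a "target" column

print(df.head())                                                  # first few rows
print(df.describe())                                              # summary statistics per column
print(df.groupby("target")["petal length (cm)"].mean())           # mean petal length per species
print(pd.crosstab(df["target"], df["petal width (cm)"] > 1.0))    # species vs. wide petals
```

The resources linked below go much deeper (indexing, joins, reshaping, time series), but this is the everyday shape of pandas work: load, inspect, group, summarize.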

Supplement: Cheat Sheets

Some good cheat sheets I've come across. (Please submit a Pull Request to add other useful cheat sheets.)

Assorted Tips and Resources

Risks

"Machine learning systems automatically learn programs from data." Pedro Domingos, in "A Few Useful Things to Know about Machine Learning." The programs you generate will require maintenance. Like any way of creating programs faster, you can rack up technical debt.

Here is the abstract of Machine Learning: The High-Interest Credit Card of Technical Debt:

Machine learning offers a fantastically powerful toolkit for building complex systems quickly. This paper argues that it is dangerous to think of these quick wins as coming for free. Using the framework of technical debt, we note that it is remarkably easy to incur massive ongoing maintenance costs at the system level when applying machine learning. The goal of this paper is [to] highlight several machine learning specific risk factors and design patterns to be avoided or refactored where possible. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, and a variety of system-level anti-patterns.

If you're following this guide, you should read that paper. You can also listen to a podcast episode interviewing one of the authors of this paper.

That's not a comprehensive list, only a collection of starting-points to learn more.

Skilling up

What are some ways to practice?

One way: competitions and challenges

You need practice. On Hacker News, user olympus commented to say you could use competitions to practice and evaluate yourself. Kaggle and ChaLearn are hubs for Machine Learning competitions. (You can find more competitions here or here.)

You also need understanding. You should review what Kaggle competition winners say about their solutions, for example on the "No Free Hunch" blog. These write-ups might be over your head at first, but once you start to understand and appreciate them, you'll know you're getting somewhere.

Competitions and challenges are just one way to practice! Machine Learning isn't just about Kaggle competitions.

Another way: try doing some practice studies

Here's a complementary way to practice: do practice studies.

  1. Ask a question. Start exploring some data. The "most important thing in data science is the question" (Dr. Jeff T. Leek). So start with a question. Then, find real data. Analyze it. Then ...
  2. Communicate results. When you think you have a novel finding, ask for review. When you're still learning, ask in informal communities (some are linked below).
  3. Learn from feedback. Consider learning in public, it works great for some folks. (Don't pressure yourself though! Do what works for you.)

How can you come up with interesting questions? Here's one way. Pick a day each week to look for public datasets and write down some questions that come to mind. Also, sign up for Data is Plural, a newsletter of interesting datasets. When a question inspires you, try exploring it with the skills you're learning.

This advice, to do practice studies and learn from review, is based on a conversation with Dr. Randal S. Olson. Here's more advice from Olson, quoted with permission:

I think the best advice is to tell people to always present their methods clearly and to avoid over-interpreting their results. Part of being an expert is knowing that there's rarely a clear answer, especially when you're working with real data.

As you repeat this process, your practice studies will become more scientific, interesting, and focused. Also, here's a video about the scientific method in data science.

More machine learning career-related links

Some communities to know about

Peer review

OpenReview.net "aims to promote openness in scientific communication, particularly the peer review process."

  • Open Peer Review: We provide a configurable platform for peer review that generalizes over many subtle gradations of openness, allowing conference organizers, journals, and other "reviewing entities" to configure the specific policy of their choice. We intend to act as a testbed for different policies, to help scientific communities experiment with open scholarship while addressing legitimate concerns regarding confidentiality, attribution, and bias.
  • Open Publishing: Track submissions, coordinate the efforts of editors, reviewers and authors, and host… Sharded and distributed for speed and reliability.
  • Open Access: Free access to papers for all, free paper submissions. No fees.

More about OpenReview.net:

  • Open Discussion: Hosting of accepted papers, with their reviews, comments. Continued discussion forum associated with the paper post acceptance. Publication venue chairs/editors can control structure of review/comment forms, read/write access, and its timing.
  • Open Directory: Collection of people, with conflict-of-interest information, including institutions and relations, such as co-authors, co-PIs, co-workers, advisors/advisees, and family connections.
  • Open Recommendations: Models of scientific topics and expertise. Directory of people includes scientific expertise. Reviewer-paper matching for conferences with thousands of submissions, incorporating expertise, bidding, constraints, and reviewer balancing of various sorts. Paper recommendation to users.
  • Open API: We provide a simple REST API [...]
  • Open Source: We are committed to open source. Many parts of OpenReview are already in the OpenReview organization on GitHub. Some further releases are pending a professional security review of the codebase.
  • OpenReview.net is created by Andrew McCallum’s Information Extraction and Synthesis Laboratory in the College of Information and Computer Sciences at University of Massachusetts Amherst

  • OpenReview.net is built over an earlier version described in the paper Open Scholarship and Peer Review: a Time for Experimentation published in the ICML 2013 Peer Review Workshop.

  • OpenReview is a long-term project to advance science through improved peer review, with legal nonprofit status through Code for Science & Society. We gratefully acknowledge the support of the great diversity of OpenReview Sponsors. Scientific peer review is sacrosanct, and should not be owned by any one sponsor.

Production, Deployment, MLOps

If you are learning about MLOps but find it overwhelming, these resources might help you get your bearings:

Recommended awesome lists to save/star/watch:


Deep Learning

Take note: some experts warn us not to get too far ahead of ourselves, and encourage learning ML fundamentals before moving on to deep learning. That's paraphrasing some of the coursework linked in this guide; for example, Prof. Andrew Ng encourages building foundations in ML before studying DL. Perhaps you're ready for that now, or perhaps you'd like to get started soon and learn some DL in parallel with the rest of your ML studies.

When you're ready to dive into Deep Learning, here are some helpful resources.

Easier sharing of deep learning models and demos

  • Replicate "makes it easy to share a running machine learning model"
    • Easily try out deep learning models from your browser
    • The demos link to papers/code on GitHub, if you want to dig in and see how something works
    • The models run in containers built by cog, "containers for machine learning." It's an open-source tool for putting models into reproducible Docker containers.
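If you're curious what that looks like in practice, here is a rough sketch of a trivial Cog predictor, based on my reading of Cog's documented Python interface; treat the details as assumptions and check the cog repository for the current, authoritative API. (Cog pairs a file like this with a cog.yaml that declares dependencies and points at the predictor class.)

```python
# predict.py -- a toy Cog predictor (a sketch; the real interface may differ by version).
from cog import BasePredictor, Input

class Predictor(BasePredictor):
    def setup(self):
        # Load model weights here, once, when the container starts. Nothing to load in this toy.
        pass

    def predict(self, text: str = Input(description="Text to shout back")) -> str:
        # A real predictor would run a model here; this just shows the shape of the API.
        return text.upper()
```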

Collaborate with Domain Experts

Machine Learning can be powerful, but it is not magic.

Whenever you apply Machine Learning to solve a problem, you are going to be working in some specific problem domain. To get good results, you or your team will need "substantive expertise" (to re-use a phrase from earlier), which is related to "domain knowledge." Learn what you can yourself, but you should also collaborate: you'll get better results working with subject-matter experts and domain experts.

Machine Learning and User Experience (UX)

I couldn't say it better:

Machine learning won’t figure out what problems to solve. If you aren’t aligned with a human need, you’re just going to build a very powerful system to address a very small—or perhaps nonexistent—problem.

That quote is from "The UX of AI" by Josh Lovejoy. In other words, You Are Not The User. Suggested reading: Martin Zinkevich's "Rules of ML Engineering", Rule #23: "You are not a typical end user"


Big data

Here are some useful links regarding Big Data and ML.

See also: the MLOps section!

If you are working with data-intensive applications at all, I recommend this book:

  • Designing Data-Intensive Applications by Martin Kleppman. (You can start reading it online, free, via Safari Books.) It's not specific to Machine Learning, but you can bridge that gap yourself.

More Data Science materials

Here are some additional Data Science resources:

Aside: Bayesian Statistics and Machine Learning

From the "Bayesian Machine Learning" overview on Metacademy:

... Bayesian ideas have had a big impact in machine learning in the past 20 years or so because of the flexibility they provide in building structured models of real world phenomena. Algorithmic advances and increasing computational resources have made it possible to fit rich, highly structured models which were previously considered intractable.

Here are some awesome resources for learning Bayesian methods.
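Before diving into those, here is the tiniest possible hands-on taste (my own illustration): a Beta-Binomial model, where a prior belief about a coin's bias is updated into a posterior after observing some flips.

```python
# Bayesian updating in a nutshell: prior belief + data -> posterior belief.
# Beta(a, b) prior over the coin's heads-probability; Binomial likelihood for the flips.
from scipy import stats

prior_a, prior_b = 2, 2        # a mild prior belief that the coin is roughly fair
heads, tails = 7, 3            # observed data: 7 heads in 10 flips

# Conjugacy makes the posterior another Beta distribution:
post_a, post_b = prior_a + heads, prior_b + tails
posterior = stats.beta(post_a, post_b)

print("posterior mean:", posterior.mean())                 # about 0.64
print("95% credible interval:", posterior.interval(0.95))
```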

(↑ Back to top)


Finding Open-Source Libraries

Natural Language Processing (NLP)

This is just a small selection.

Non-sequitur

These next two links are not related to ML. But since you're here, I have a hunch you might find them interesting too:


More ways to "Dive into Machine Learning"

Here are some other guides to learning Machine Learning. They can be alternatives or supplements to this guide.

(↑ Back to top)
