Last Updated on September 29, 2016
This was a really hard post to write because I want it to be really valuable.
I sat down with a blank page and asked a really hard question: what are the very best libraries, courses, papers and books I would recommend to an absolute beginner in the field of Machine Learning?
I really agonized over what to include and what to exclude. I had to work hard to put myself in the shoes of a programmer and beginner at machine learning and think about what resources would best benefit them.
I picked the best for each type of resource. If you are a true beginner and excited to get started in the field of machine learning, I hope you find something useful. My suggestion would be to pick one thing, one book or one library and read it cover to cover or work through all of the tutorials. Pick one and stick to it, then once you master it, pick another and repeat. Let’s get into it.
What You Will Learn
I am an advocate of “learn just enough to be dangerous and start trying things”.
This is how I learned to program and I’m sure many other people learned that way too. Know your limitations and exploit your strengths. If you know how to program, leverage that to get deep into machine learning fast. Then have the discipline to go and learn the math for the technique before you implement it in a production system.
Find a library and read the documentation, follow the tutorials and start trying things out. The following are the best open source machine learning programming libraries out there. I don’t think they are all suitable for use in your production system, but they are ideal for learning, exploring and prototyping.
Start with a library in a language you know well then move on to other more powerful libraries. If you’re a good programmer, you know you can move from language to language reasonably easily. It’s all the same logic, just differing syntax and APIs.
- R Project for Statistical Computing: This is an environment and a lisp-like scripting language. All the stats stuff you could ever want to do is provided in R, including amazing plotting. The Machine Learning category on CRAN (think: third-party Machine Learning packages) has code written by leaders in the field with state of the art methods, as well as anything else you can think of. Learning R is a must if you want to prototype and explore quickly. It just might not be the first place you start.
- WEKA: This is a Data Mining workbench providing an API and a number of command line and graphical user interfaces for the whole data mining lifecycle. You can prepare data, visualize, explore, and build classification, regression and clustering models. Many algorithms are built in and more are available as third party plugins. Not related to WEKA, Mahout is a good Java framework for Machine Learning on Hadoop infrastructure if that is more your thing. If you’re new to big data and machine learning, stick with WEKA and learn one thing at a time.
- scikit-learn: Machine Learning in Python built on top of NumPy and SciPy. If you are a Python or a Ruby programmer, this is the library for you. It’s friendly, powerful and comes with excellent documentation. Orange would be a good alternative if you’d like to try something else.
- Octave: If you are familiar with MATLAB or you’re a NumPy programmer looking for something different, consider Octave. It is an environment for numerical computing just like MATLAB and makes it easy to write programs to solve linear and non-linear problems, such as those that underlie most machine learning algorithms. If you have an engineering background, this might be a good place for you to start.
- BigML: Maybe you don’t want to do any programming. You can drive tools like WEKA completely without programming. You can go one step further and use services like BigML that offer machine learning interfaces on the web where you can explore building models all in the browser.
Pick a platform and use it to do your practical machine learning education. Don’t just read, do.
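To make “don’t just read, do” concrete, here is a minimal sketch of what a first experiment might look like in Python with scikit-learn. The dataset (iris), the 70/30 split, the choice of k-nearest neighbors and the function name first_experiment are all my own assumptions for illustration, not something prescribed by any of the libraries above. Treat it as a starting point to tinker with rather than a recommended workflow.

```python
# A first experiment with scikit-learn: train a classifier and score it.
# Assumptions (illustrative only): the iris dataset, a 70/30 split,
# and k-nearest neighbors with k=5.
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def first_experiment(k=5, seed=0):
    """Fit k-NN on iris and return accuracy on held-out data."""
    X, y = load_iris(return_X_y=True)

    # Hold back 30% of the rows so the score reflects unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )

    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)

    predictions = model.predict(X_test)
    return accuracy_score(y_test, predictions)


if __name__ == "__main__":
    print("Held-out accuracy: %.3f" % first_experiment())
```

Once this runs, try changing k or swapping in a different classifier and watch how the held-out accuracy moves. That kind of poking around is exactly the point.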
Video is a very popular way to get started in machine learning.
I watch a lot of machine learning videos on YouTube and VideoLectures.Net. The risk is that all you will do is consume and fail to take action. I recommend you always take notes when watching a video, even if you discard the notes later. I also recommend trying out whatever it is you’re learning in the lecture.
Frankly, none of the video courses I have seen are really suitable for a true beginner. They all presuppose a working knowledge of at least linear algebra and probability theory, and more.
Andrew Ng’s Stanford lectures are probably the best place to start for a course, otherwise there are one-off videos I recommend.
- Stanford Machine Learning: Available via Coursera and taught by Andrew Ng. In addition to enrolling, you can watch all the lectures anytime and get the handouts and lecture notes from the actual Stanford CS229 course. The course includes homework and quizzes and focuses on linear algebra and using Octave.
- Caltech Learning from Data: Available via edX and taught by Yaser Abu-Mostafa. All the lectures and materials are available on the Caltech site. Again, like the Stanford class, you can take it at your own pace and complete the homework and assignments. It covers similar subjects, goes into a little more detail and is more mathematical. The homework is probably too challenging for a beginner.
- Machine Learning Category on VideoLectures.Net: This is an easy place to drown in the overload of content. Look for videos that seem interesting and try them out. Bail if it’s at the wrong level or take notes if you’re enjoying it. I find I keep coming back to refresh myself on topics and to pick up entirely new topics. Also, it’s great to see what the masters of the field actually look like.
- “Getting In Shape For The Sport Of Data Science” – Talk by Jeremy Howard: A talk to a local R users group on the practical process for doing well in competitive machine learning. This is very valuable because so few people talk about what it’s actually like to work on a problem and how to do it. I not-so-secretly fantasise about funding a web reality TV show that follows participants in machine learning competitions. That’s how into it I am!
If you are not used to reading research papers, you will find the language very stiff. A paper is like a snippet of a textbook, but describes an experiment or some other frontier of the field. Nevertheless, there are some papers that you might find interesting if you are looking to get started in machine learning.
- The Discipline of Machine Learning: A white paper defining the discipline of Machine Learning by Tom Mitchell. This was a piece of the argument Mitchell used to convince the President of CMU to create a standalone Machine Learning department for a subject that will still be around in 100 years (also see this short interview with Tom Mitchell).
- A Few Useful Things to Know about Machine Learning: This is a great paper because it pulls back from specific algorithms and motivates a number of important issues such as feature selection, generalizability and model simplicity. This is all good stuff to get right and think clearly about from the beginning.
I’ve only listed two important papers, because reading papers can really bog you down.
Beginner Machine Learning Books
There are a lot of machine learning books and very few are written for beginners.
What is a beginner really?
Most likely you’re coming to machine learning from another field, most likely computer science, programming or statistics. Even then, most books expect you to have a grounding in at least linear algebra and probability theory.
Nevertheless, there are a few books out there that encourage eager programmers to get started by teaching the minimum intuition for an algorithm and pointing to tools and libraries so that you can run off and try things out.
The most notable are Programming Collective Intelligence, Machine Learning for Hackers and Data Mining: Practical Machine Learning Tools and Techniques, for Python, R and Java respectively. If in doubt, grab one of these three books!
- Programming Collective Intelligence: Building Smart Web 2.0 Applications (Affiliate Link): This book was written for you, dear programmer. It’s light on theory, heavy on code examples and practical web problems and solutions. Buy it, read it, do the exercises.
- Machine Learning for Hackers (Affiliate Link): I’d recommend this book after reading Programming Collective Intelligence (above). It again provides worked examples that are practical, but it has more of a data analysis flavor and uses R. I really like this book!
- Machine Learning: An Algorithmic Perspective (Affiliate Link): This book is like a more advanced version of Programming Collective Intelligence (above). It has similar aims (get programmers started in Machine Learning), but it includes maths and references as well as examples and snippets in Python. I’d recommend reading this after reading Programming Collective Intelligence if you’re still interested.
- Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (Affiliate Link): I actually started with this book, although it was the first edition, around the year 2000. I was a Java programmer, and this book and its companion library WEKA provided a perfect environment for me to try things out, implement my own algorithms as plug-ins and generally practice Machine Learning and the broader process of Data Mining. I highly recommend this book and this path.
- Machine Learning (Affiliate Link): This is an old book and does include formulas and lots of references. It’s a textbook but is also very accessible with grounded motivations for each algorithm.
A lot of people bang on about some great machine learning textbooks. I do too, and they are great. They are just not a great place for a beginner to start, I think.
I thought deeply about this post and I also went off and looked at other people’s lists of resources to make sure I didn’t miss anything important.
For completeness, here are some other great lists of resources around the web for getting started in machine learning.
Have you read or used any of the resources here?
What did you think?
Did I leave out a critically useful resource for a programmer interested in getting started in machine learning?
Please leave a comment and let me know about it!