Last Updated on June 7, 2016
You can rise up and take on your desire to become a machine learning practitioner and data scientist.
You have to work hard, learn the skills and demonstrate that you can deliver results, but you don’t need a fancy degree or a fancy background.
In this post I want to demonstrate that this is possible and even common.
You will discover that top managers and CEOs are looking at results and not backgrounds and that programmers and engineers like you are rising up to win competitions and take jobs in machine learning.
Results Trump Background
It does not matter what school you went to, what degrees you may or may not have or what companies you have worked at.
Machine learning is a meritocracy. The results you can deliver define your value.
In a post on Fastcolabs titled “The Rise Of The DIY Data Scientist”, the then president of Kaggle, Jeremy Howard, makes it clear that results matter, not your background. He said:
The people who win competitions are generally not Stanford-educated or Ivy League American Mathematicians. The world’s best data scientists based on their actual performance haven’t gone to famous schools.
If you are employing a data scientist, you should evaluate them based on their ability to deliver results. Howard continued:
If you want to hire a juggler for your circus, you would have him juggle for you and see how many things he can juggle. If you are going to hire someone to create predictive models, look at how well their models predict.
If you are results focused when entering the field of machine learning, you can make astonishingly rapid progress.
In a 2012 Gigaom post titled “Why becoming a data scientist might be easier than you think”, Andrew Ng, then Stanford professor and Coursera co-founder, is quoted as saying:
Machine learning has matured to the point where if you take one class you can actually become pretty good at applying it.
In that same post, the authors point out how many top Kaggle competitors at the time had little training other than a free online course.
In a recent example, Henk van Veen (known as Triskelion) showed how he went from humble programmer to Kaggle Master within a year by consistently participating in machine learning competitions and focusing on methods, tools and techniques that deliver results.
In his post titled “Reflecting Back on One Year of Kaggle Contests” he commented:
I became Kaggle Master mostly through ensemble learning, team work, sharing, powerful ML tools and the law of large numbers.
Amateurs Beat Experts
You do not need to be an expert in a field in order to create useful and accurate predictive models in that field.
In fact, if the goal is to create useful and accurate predictive models then expert knowledge may be a hindrance rather than a help.
In an interview for New Scientist (republished on Slate) titled “Specialist Knowledge Is Useless and Unhelpful”, Jeremy Howard commented that:
Your decades of specialist knowledge are not only useless, they’re actually unhelpful; your sophisticated techniques are worse than generic methods.
Competitions are held on Kaggle in specific scientific and business domains, and the observation repeatedly made is that amateurs are beating the experts.
Experts versed in a specific domain come in and use their traditional methods. More often than not, the experts do not win the competitions. The classical methods from the specialized domain do not perform. It is the creative and inquisitive data scientists who beat the experts.
We’ve discovered that creative-data scientists can solve problems in every field better than experts in those fields can. … People who can just see what the data is actually telling them without being distracted by industry assumptions or specialist knowledge.
Results Not Degrees
Getting results matters more than the degrees you have.
This has been true in programming for a long time and is true for applied machine learning. You are useful and valuable if you can effectively analyze a problem and design and deliver a solution.
In a previous post I talked about how degrees are a shortcut that other people can use to evaluate your capability, and how you can build those shortcut credentials in other ways as well, such as by building a portfolio of machine learning projects.
This portfolio approach is exactly the approach used by artists and is the approach that programmers use to get interesting and high paying jobs without formal training.
Managers who are hiring data scientists and machine learning practitioners look at a candidate's portfolio of work more than at their degrees.
In a recently released book, “The Data Analytics Handbook: CEOs & Managers”, CEOs from companies such as Cloudera, Y-Hat, HG Data, Stylitics (and many more) were interviewed. They were asked what they look for in candidates when hiring, and a common theme in their answers was that they look at a candidate's completed projects.
This theme was also recognized by the authors of the book and highlighted as one of the top five takeaways from all of the interviews:
Top Takeaway 3: Do your own projects to break into the industry
There is a learning gap between academia and industry that is best filled by doing projects. Find some sports statistics and do your own analysis. Learn R so that you can complete this analysis, not just to learn R itself. Also try Kaggle.
Derek Steer, the CEO and co-founder at Mode Analytics, comments that building models and working on problems in an applied setting is the best way to learn.
I think that the best way to learn skills so that you can apply them practically in the future is to start with a project, then learn all the skills necessary to complete it as you go.
Dean Abbott, the co-founder at Smarter Remarketer, agrees.
… start building models. Work on projects. It helps to work with someone who has done it before. Data preparation is harder to teach because there are so many ways for you to do it incorrectly. It is hard to teach in a way where you cover all “incorrect” approaches
Rohan Deuskar, the CEO and co-founder at Stylitics, uses this approach to evaluate job candidates, who must complete a project to be considered for a job.
We will also give them a raw data set to take home and have them share five interesting things they see in the data. They would also be asked to present their findings in a couple of PowerPoint slides because part of the data analyst role according to me is being able to convey your findings to people who haven’t spent the time you have on the data.
Finally, Tom Wheeler, the senior curriculum developer at Cloudera, drives the point home: your degrees, or lack of them, do not matter, and it is creativity and the ability to learn that define amazing data scientists:
Just like there are lots of amazing programmers who don’t have a PhD in computer science, so too are there amazing data scientists who started working after getting a Masters or Bachelors degree in one of those areas. If they have an inquisitive personality and a lot of self-motivation, they tend to quickly gain any other skills they need through real-world experience.
In this post you discovered that results can trump background, and that you can learn machine learning fast and even become a Kaggle Master if you focus on the tools and methods that get results.
You discovered that amateurs are beating out experts at their own game by focusing on results and developing general skills for building predictive models.
Finally, you learned that managers and CEOs evaluate an analyst or data scientist by their ability to complete projects, and that projects are the way to both learn and demonstrate skills.
The lesson you can take away from this post is to focus on and develop your tenacity, your speed of execution and your creativity.
Let go of needing to be a domain expert and focus on delivering results.
Let go of your need for a fancy degree and develop a portfolio of projects to demonstrate your skills.