Last Updated on June 7, 2016
Programmers should get involved in the field of machine learning because they are uniquely skilled to make huge contributions.
In this post you will learn that as a programmer it can be easy to overlook the skills you have and overvalue those things you don’t know. You will learn about four opportunities for programmers to start making an impact in the field of machine learning almost immediately.
What You Will Learn
Professional Development Practices
The discipline of professional software development (or software engineering, if you prefer that term) is all about the design, implementation and maintenance of reliable software systems that solve problems. Your skills as a developer are valuable and you can apply them to the field of machine learning.
Here are some examples:
- Structure: When developing software, you structure a project. For example, there is a directory for source code, one for assets, one for documentation, and if you’re using a compiled language you have a directory for binaries. Using a well-defined structure for a software development project is a best practice that introduces separation and consistency that supports collaboration. Anyone on the project will know where to make contributions, and when the same convention is adopted across projects, anyone in the organization can quickly navigate any project.
- Automation: In a software project, you use build systems to automate common tasks for the project. Whether you are using Make, Ant, Rake or any similar build system, it is natural to take common development tasks and define them as targets that can be repeated at a whim and organized into hierarchies of increasing leverage.
- Repeatability: The convention-based structure you apply to projects and the automation you achieve with build systems allow tasks in a given project to be 100% repeatable. Anyone can check out the project and build it. Anyone can follow the release process and build a binary or deploy an update to the website. Repeatability is a default when developing software systems.
- Testability: A class has one responsibility, a function does one thing. Simplification of systems creates small modular code that can be tested. You write automated tests as a measure of quality control to demonstrate unambiguously that the code does what it was designed to do and to detect any regressions you introduce when changes are made.
- Maintainability: The behaviours above lead to one of the most important factors of professional software development, which is maintainability. A successfully completed software project will spend most of its life in maintenance. Development is only a small fraction of the overall life of a piece of software; maintenance is the norm. We make software maintainable by making it structured, automated, repeatable, and testable.
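As a small illustration of the testability point above, here is a minimal Python sketch of a function that does one thing, together with an automated test for it. The function `min_max_scale` and its test are hypothetical names chosen for this example, not code from any particular project.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if lo == hi:
        # Constant input: avoid dividing by zero and map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_min_max_scale():
    # The smallest value maps to 0.0 and the largest to 1.0.
    assert min_max_scale([2, 4, 6]) == [0.0, 0.5, 1.0]
    # A constant list is handled without raising an error.
    assert min_max_scale([3, 3]) == [0.0, 0.0]


if __name__ == "__main__":
    test_min_max_scale()
    print("all tests passed")
```

Because the function is small and has a single responsibility, the test can state exactly what it should return, and any regression introduced later is caught the next time the test runs.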
These practices of professional software development can be brought over to the field of machine learning. They can have the most effect in the early phases of a machine learning project. Three examples include:
- When data is being derived from the original source into a form suitable for a given learning method. This process can be made automated and repeatable and the derived data stored in a directory structure separate from the original source.
- When different machine learning methods are being tested to see which is the most appropriate for the problem. The testing of methods can be automated so that the results are repeatable and the tests can be rerun if (when) bugs are found in the testing protocol.
- When a method is selected and implemented to address a complex problem. It can be tailored to the problem and implemented to be testable and well documented, ensuring it meets the broader requirements of the project, including nonfunctional requirements such as acceptance criteria on the performance and accuracy of the algorithm.
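The second example above, an automated and repeatable comparison of methods, could be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the toy dataset, the two methods and every function name are assumptions made for the example. Fixing the random seed is what makes a rerun produce the same numbers.

```python
import random


def make_dataset(n_rows, seed):
    """Build a toy two-class dataset: the label is 1 when x + y > 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x, y = rng.random(), rng.random()
        rows.append(((x, y), 1 if x + y > 1.0 else 0))
    return rows


def majority_class(train, features):
    """Baseline: always predict the most common training label."""
    labels = [label for _, label in train]
    return 1 if sum(labels) * 2 >= len(labels) else 0


def nearest_neighbour(train, features):
    """Predict the label of the closest training row (squared Euclidean distance)."""
    def distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return min(train, key=distance)[1]


def cross_validate(method, rows, n_folds, seed):
    """Return the mean accuracy of `method` over n_folds folds of shuffled rows."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct = sum(1 for features, label in test if method(train, features) == label)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def run_experiment(seed=1):
    """Evaluate every method on the same data with the same folds."""
    rows = make_dataset(200, seed)
    methods = {"majority_class": majority_class, "nearest_neighbour": nearest_neighbour}
    return {name: cross_validate(m, rows, n_folds=5, seed=seed) for name, m in methods.items()}


if __name__ == "__main__":
    for name, accuracy in run_experiment().items():
        print(f"{name}: {accuracy:.3f}")
```

If a bug turns up in `cross_validate`, it can be fixed in one place and the whole comparison rerun with a single command, which is the kind of target you would normally hand to a build system.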
Production Level Implementations
A novel machine learning method is typically proposed by a machine learning researcher or team of researchers. It is common for a novel method to be presented with a prototype or demonstration implementation of the algorithm.
A problem is that the code is written by researchers who may or may not be trained in the discipline of software development. Nevertheless, the goal of the implementation is to present a working prototype of the method.
If a business or other organization is looking to harness one of these power tools, their options are limited. They may decide to adapt and run the prototype code in their production system. It is common for research code to be released under no obvious license or sometimes under a permissive open source license. The code will be written to address toy problems for demonstration purposes and its programming quality may be variable; in some cases it is only good enough to demonstrate the proof of concept.
The only real option is to reimplement the method using good software engineering practices. There is an opportunity for developers to build production-level implementations of powerful machine learning methods that are in demand. In addition to getting a job doing this, you can also develop production-quality software tools, libraries and APIs that organizations could use to address their problems.
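To give a feel for what separates a reimplementation from a prototype, here is a rough Python sketch. The algorithm (a nearest-centroid classifier) is deliberately simple and was picked only for illustration; the class name and its methods are hypothetical. The point is the input validation, the documented contract and the explicit failure modes that research code often omits.

```python
import math


class NearestCentroidClassifier:
    """Classify a row by the closest class mean.

    Hypothetical example class: the algorithm is a stand-in, the
    validation and error handling are what the sketch is about.
    """

    def __init__(self):
        self._centroids = {}  # label -> list of per-feature means

    def fit(self, rows, labels):
        """Learn one centroid per label from equal-length numeric rows."""
        if len(rows) == 0:
            raise ValueError("rows must not be empty")
        if len(rows) != len(labels):
            raise ValueError("rows and labels must have the same length")
        width = len(rows[0])
        sums, counts = {}, {}
        for row, label in zip(rows, labels):
            if len(row) != width:
                raise ValueError("all rows must have the same number of features")
            if any(not math.isfinite(value) for value in row):
                raise ValueError("features must be finite numbers")
            totals = sums.setdefault(label, [0.0] * width)
            for i, value in enumerate(row):
                totals[i] += value
            counts[label] = counts.get(label, 0) + 1
        self._centroids = {
            label: [total / counts[label] for total in totals]
            for label, totals in sums.items()
        }
        return self

    def predict(self, row):
        """Return the label whose centroid is closest to `row`."""
        if not self._centroids:
            raise RuntimeError("call fit() before predict()")
        width = len(next(iter(self._centroids.values())))
        if len(row) != width:
            raise ValueError(f"expected {width} features, got {len(row)}")

        def distance(label):
            return sum((a - b) ** 2 for a, b in zip(self._centroids[label], row))

        return min(self._centroids, key=distance)


# Usage: two well-separated classes, "a" near the origin and "b" near (1, 1).
model = NearestCentroidClassifier().fit(
    [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]],
    ["a", "a", "b", "b"],
)
print(model.predict([0.1, 0.0]))  # prints "a"
```

A prototype would typically assume clean input and fail somewhere deep inside the arithmetic; a production implementation rejects bad input at the boundary with a message the caller can act on.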
Get the Word Out
Machine learning methods are presented in the languages of research, such as dry research papers, academic presentations, monographs, lectures, and textbooks. There are power tools that are effectively hidden away from mainstream software development, even mainstream applied machine learning. This is a fact. The migration of useful methods from research to operations can take decades.
There is an opportunity for programmers who know some machine learning to find out which methods are working and help to get the word out. You will have to learn just enough to be able to recognize these gems, have the imagination to think about where the methods could be applied in business or online, and have the ability to communicate or even implement those ideas. You don’t even need to be a developer to take this on.
Put Machine Learning in Applications
As a programmer, you already know how to make applications for users. They may be applications on the web, mobile or on the desktop, or even something else more exotic. Perhaps the biggest opportunity for programmers like you is to put machine learning methods in the applications you are developing.
This is not as big and scary as you may initially believe. Remember that machine learning methods address a specific decision problem. Incorporating machine learning means identifying a complex problem in your application that can appropriately be solved by machine learning or, more likely, building an application around a suitable problem. It also means that you need to learn enough machine learning to make this happen, but you have already started that journey.
In this post you learned that programmers should get into machine learning because programmers are uniquely skilled to make huge contributions. Four contributions that programmers can make to the field of machine learning are:
- Bring professional software development practices to machine learning projects.
- Build production-level implementations of machine learning methods.
- Get the word out for novel machine learning methods.
- Put machine learning methods in applications.
What are some software development practices that you think could make a big difference when experimenting and testing machine learning algorithms? Leave a comment.
About Jason Brownlee
Jason Brownlee, PhD is a machine learning specialist who teaches developers how to get results with modern machine learning methods via hands-on tutorials.