Last Updated on June 18, 2019
This is a project spotlight with Shashank Singh, a programmer and machine learning enthusiast.
Could you please introduce yourself?
I have a Bachelor of Technology in Computer Science. I co-founded a startup at 23 and spectacularly crashed it by my 26th birthday. After that I felt particularly low and pretty dry of inspiration for quite some time.
I moved to Mumbai, India to join Idyllic Software, where I came in contact with amazing people with diverse points of view on life, and created a small informal meet-up of problem-solvers called “Coffee Break”.
A life-altering moment for me was just around the corner: I saw two kids begging for food outside a pub I used to frequent. I knew I wanted to help these kids in any way I could. This kicked off a thought process that resulted in my project, Helping Faceless.
What is your project called and what does it do?
The Helping Faceless project (and Android application) is trying to combat child trafficking with the use of state-of-the-art face recognition and data analytics.
How did you get started?
We started with a simple Ruby on Rails API server to accept information from apps and other sources. Slowly but steadily we have been adding complexity around this simple server to create more functionality.
To keep the growing complexity in check, we use a Service-Oriented Architecture: the whole system is broken into smaller modular applications that connect with each other over the wire. So at the end of the day we use whatever language or framework is best suited for the task at hand.
Our current technology stack is as follows:
- Server Side: Ruby on Rails
- Client Side: Java for Android, Objective-C for iOS, web frontend for NGOs
- Analytics: Python (SciPy/pandas/NumPy/scipy.stats FTW!!). We are in the process of integrating Apache Storm and Apache Mahout for analytics and subsequent report generation.
We use Heroku, and Linode as our VPS. The Airbrake guys were amazing and helped us with a beefier free account to catch bugs and errors. We also use Heap Analytics to figure out service usage in terms of traffic.
For our face recognition needs we use a library from the University of Michigan called OpenBR (Open Biometrics). Its modular design makes it much easier to drop into our pipeline (see the 2013 paper Open Source Biometric Recognition). This modular design gives it a distinct advantage over OpenCV and also makes experimentation quite simple.
If you want to help us out, our code is available on GitHub. Just fork it and start coding 🙂
What are some interesting discoveries you made?
Face recognition almost sounds magical on TV shows, but in reality it pretty much sucks unless, well, you are a tech giant like Facebook.
We worked around this high error rate by setting up a process akin to a well-oiled manufacturing process. Every piece of intelligence reaching our system is validated, then it's transformed into understandable chunks or groups.
Photographs go into a separate pipeline where they are matched against each other to create a giant similarity matrix. We then take the images with the top 20% of similarity scores and run them through our crowdsourcing portion for people to verify our assumptions. This weeds out the false positives and gives us much more pristine data points, which are then further sieved through better third-party face recognition algorithms.
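To make the matching step a little more concrete, here is a minimal Python sketch of scoring every pair of photographs and keeping the top 20% of pairs for human review. It is an illustration only, not the project's actual code: the feature vectors, the photo names, the use of cosine similarity, and the helper names `cosine_similarity` and `top_pairs_for_review` are all assumptions made for this example. In the real pipeline the scores would come from the face recognition library, and "top 20%" is interpreted here as the top fifth of photo pairs.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_pairs_for_review(features, fraction=0.2):
    """Score every photo pair and return the highest-scoring fraction.

    `features` maps a photo name to its (hypothetical) feature vector.
    The returned list holds (score, photo_1, photo_2) tuples, best first.
    """
    scored = [
        (cosine_similarity(features[p], features[q]), p, q)
        for p, q in combinations(sorted(features), 2)
    ]
    scored.sort(reverse=True)
    keep = max(1, math.ceil(len(scored) * fraction))
    return scored[:keep]


# Made-up feature vectors standing in for real face descriptors.
photos = {
    "photo_a": [0.90, 0.10, 0.30],
    "photo_b": [0.88, 0.12, 0.31],
    "photo_c": [0.10, 0.90, 0.20],
    "photo_d": [0.20, 0.10, 0.95],
    "photo_e": [0.15, 0.85, 0.25],
}

# Five photos give ten pairs, so the top 20% is two pairs.
review_queue = top_pairs_for_review(photos)
for score, first, second in review_queue:
    print(f"{first} vs {second}: {score:.3f}")
```

Only the pairs in `review_queue` would be shown to crowd reviewers, which keeps the amount of human verification small compared with the full set of pairwise comparisons.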
Moreover, we are in the process of setting up an advanced reporting and intelligence system on this data using Apache Mahout.
What do you want to do next on the project?
With the ideal wish-list being so big, we had to prune it to fit into realistic timelines, but these are a few things I would love to have:
- Gamification of Pledge, based on frequency of contributions.
- App-side face recognition.
- Real-time alerts in case a child goes missing.
- Take it nationwide and even to Southeast Asian countries like the Philippines.
- Human trafficking: Currently the model we are using for face recognition is only trained on faces of people aged 10 to 20. We want to extend it by increasing the training data.
- Establish a platform for NGOs and governmental organizations to safely share data.
Our slides give a much better bird's-eye view of our vision and goals: Helping Faceless Slidedeck
Do you have a machine learning side project?
If you have an interesting machine learning side project and are interested in being profiled like Shashank, please contact me.
About Jason Brownlee
Jason Brownlee, PhD is a machine learning specialist who teaches developers how to get results with modern machine learning methods via hands-on tutorials.