It would have been easy for a show like HBO's Silicon Valley to show off a machine learning app that classifies hotdogs with some clever post-processing, drop in some fake static screenshots, and call it a day. If the team wanted to create a real app for branding purposes, they could have strung together a few APIs hackathon-style and moved on to the next silly gag. But to their credit, Tim Anglade, the engineer behind the viral spoof app Not Hotdog, probably put more thought into his AI than at least one AI startup pitching on Sand Hill this week.
Rather than use something like Google's Cloud Vision API, Anglade actually got into the weeds, experimenting with TensorFlow and Keras. Because Not Hotdog had to run locally on mobile devices, Anglade faced a slew of challenges that any machine learning developer deploying models on mobile can relate to.
In a Medium post, Anglade discusses how he initially got to work retraining the Inception architecture with transfer learning on a few thousand images of hotdogs, using an eGPU attached to his laptop. But even so, the resulting model was too bloated to run reliably on mobile devices.
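For readers unfamiliar with the approach, retraining Inception via transfer learning in Keras generally looks something like the sketch below. This is a minimal illustration of the technique, not Anglade's actual code; the head layers, directory name, and training settings are assumptions for the example.

```python
# Minimal transfer-learning sketch in Keras (illustrative, not Anglade's exact code).
# Assumes labeled images live under "train_dir" in "hotdog" / "not_hotdog" subfolders.
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# Load Inception pretrained on ImageNet, dropping its 1000-class head.
base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pretrained convolutional base

# Attach a small binary classification head: hotdog vs. not hotdog.
x = GlobalAveragePooling2D()(base.output)
x = Dense(128, activation="relu")(x)
out = Dense(1, activation="sigmoid")(x)
model = Model(inputs=base.input, outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stream the training images from disk.
train_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    "train_dir", target_size=(299, 299), batch_size=32, class_mode="binary")
model.fit_generator(train_gen, epochs=10)
```

The catch, as Anglade found, is that the pretrained Inception base alone weighs in at tens of millions of parameters, which is what made the model too heavy for a phone.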
So he tried SqueezeNet, a leaner network that would require far less memory to run. Unfortunately, despite its compact size, its performance was hampered by both overfitting and underfitting.
Even when given a large dataset of hotdog and not-hotdog training images, the model wasn't quite able to grasp the abstract concept of what generally constitutes a hotdog and instead seemed to lean on bad heuristics as a crutch (red sauce = hotdog).
Fortunately, Google had just published its MobileNets paper, putting forth a novel approach to running neural networks on mobile devices. The architecture offered a middle ground between the bloated Inception and the frail SqueezeNet. More importantly, it allowed Anglade to easily tune the network to balance accuracy against available compute.
Anglade used an open source Keras implementation from GitHub as a jumping-off point. He then made a number of changes to streamline the model and optimize it for his single, specialized use case, as in the sketch below.
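The tuning knobs MobileNets exposes are a width multiplier (usually called alpha) and the input resolution, which together trade accuracy for compute. The sketch below uses the MobileNet that later shipped in keras.applications rather than the GitHub implementation Anglade started from, and the alpha and image-size values are example settings, not his final configuration.

```python
# Illustrative MobileNet configuration in Keras; alpha and image size are example values,
# not Anglade's final settings (he worked from a third-party GitHub implementation).
from keras.applications.mobilenet import MobileNet
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

IMG_SIZE = 128   # smaller input resolution -> less compute on the phone
ALPHA = 0.5      # width multiplier: shrinks every layer to 50% of its channels

base = MobileNet(input_shape=(IMG_SIZE, IMG_SIZE, 3),
                 alpha=ALPHA,
                 weights=None,        # train from scratch; use "imagenet" for transfer learning
                 include_top=False)

# Single-purpose head: one sigmoid unit for hotdog / not hotdog.
x = GlobalAveragePooling2D()(base.output)
out = Dense(1, activation="sigmoid")(x)
model = Model(inputs=base.input, outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()  # check the parameter count against on-device memory constraints
```

Dropping alpha and the input resolution shrinks both the weight file and the per-inference cost, which is exactly the lever a mobile-only app needs.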
The final model was trained on a dataset of 150,000 images. The vast majority, 147,000 images, were not hotdogs, while 3,000 were hotdogs. The ratio was intentional, reflecting the fact that most objects in the world are not hotdogs.
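A 147,000-to-3,000 split works out to roughly 49 negatives for every positive, so a naive training run would be dominated by the not-hotdog class. One common Keras-side remedy is to weight the rare class more heavily in the loss; the snippet below is a generic sketch of that idea, not necessarily how Anglade handled the imbalance.

```python
# Generic class-weighting sketch for a 147,000 : 3,000 imbalance (about 49:1).
# This shows one standard remedy; the article doesn't specify Anglade's exact scheme.
n_not_hotdog = 147000
n_hotdog = 3000
total = n_not_hotdog + n_hotdog

# Weight each class inversely to its frequency so both contribute equally to the loss.
class_weight = {
    0: total / (2.0 * n_not_hotdog),  # not hotdog ~ 0.51
    1: total / (2.0 * n_hotdog),      # hotdog     ~ 25.0
}

# Passed to training, e.g.:
# model.fit_generator(train_gen, epochs=10, class_weight=class_weight)
```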
You can check out the rest of the story here, where Anglade discusses his approach in detail. He goes on to explain a fun technique for using CodePush to live-inject updates to his neural network after submitting the app to the App Store. And while the app was created as a complete gag, Anglade saves time at the end for an insightful discussion of the importance of UX/UI and the biases he had to account for during the training process.