SF Machine Learning to Look at Google Research Tool in Production

Japanese news app SmartNews, which has risen furiously to become the country’s most downloaded news app, is starting to see its popularity cross the Pacific and land on the smartphones of Americans.

SmartNews delivers an extremely clean and responsive user experience, with curated news items from around the web delivered at a time of day chosen by the reader. Big deal? Not really. But the app has a super awesome user interface with not only main news item tabs but also the option to add custom tabs to your liking. Wow, unique! Not at all.

The reasons behind the explosion of people using SmartNews are twofold. Firstly, it removes the barriers to use: there is no need to log in via third-party applications to enjoy preferences, so you just download it to your phone and you are ready to go. Secondly, SmartNews makes heavy use of machine learning tools and algorithms cooked up by a hardcore team of data scientists.

Kohei Nakaji is one of the SmartNews engineers who use machine learning and natural language processing to make sure the app extracts, ranks and serves up the most relevant news content to users, based on a set of predetermined parameters including location.

The SF Bay Area Machine Learning Meetup will be graced with his presence on Monday 27th April, where he will be talking about how SmartNews makes use of the open source project word2vec in ‘Globally Scalable Web Document Classification Using word2vec’.

Word2vec is a research project that is not part of Google’s product offering, but it is gathering popularity as a way of computing vector representations of words for natural language processing, particularly in apps that need to unravel the unstructured nature of large language data sets.

At a very high level, apps like SmartNews will use word vectors to assess the relationship between words and categorise them by how likely it is that they are related.
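To make "how likely it is that they are related" a little more concrete, here is a minimal sketch of comparing word vectors with cosine similarity, a common way to score relatedness. The three-dimensional vectors and the helper names (`TOY_VECTORS`, `cosine_similarity`, `most_related`) are invented for illustration; real word2vec vectors are learned from text and usually have hundreds of dimensions, and nothing here is taken from SmartNews’ own code.

```python
import math

# Toy vectors invented for this example (an assumption, not real word2vec output).
TOY_VECTORS = {
    "football": [0.9, 0.1, 0.0],
    "soccer":   [0.8, 0.2, 0.1],
    "election": [0.0, 0.9, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def most_related(word, vectors):
    """Return the other word whose vector points most nearly the same way."""
    candidates = [w for w in vectors if w != word]
    return max(candidates, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


if __name__ == "__main__":
    print(most_related("football", TOY_VECTORS))
```

With these made-up numbers, "football" sits much closer to "soccer" than to "election", which is the kind of relationship the learned vectors are meant to capture.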

Once tools like word2vec have deciphered patterns over billions of words, the algorithms built on them should eventually be able to serve up content with astonishing accuracy, which is ultimately what keeps users engaged and coming back for more.

Word2vec turns natural language into numerical form that can be fed to deep learning neural networks, which lets them compare words as numerical data and use pattern recognition to label content.
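As a rough illustration of turning text into numbers and then labelling it, the sketch below averages the word vectors in a document and picks the category whose example text is closest. This is a deliberately simple nearest-prototype classifier swapped in for a neural network, and it is not SmartNews’ method; the two-dimensional vectors, the category names and every function name are assumptions made up for the example.

```python
import math

# Hypothetical word vectors, invented for illustration only.
WORD_VECTORS = {
    "goal":    [0.90, 0.10],
    "match":   [0.80, 0.20],
    "striker": [0.95, 0.05],
    "vote":    [0.10, 0.90],
    "senate":  [0.05, 0.95],
    "policy":  [0.20, 0.80],
}

# Hypothetical labelled example text, one per category.
CATEGORY_EXAMPLES = {
    "sports": "goal match striker",
    "politics": "vote senate policy",
}


def document_vector(text, vectors):
    """Average the vectors of the known words; return None if no word is known."""
    # Naive whitespace tokenisation: punctuation is not stripped.
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, vectors=WORD_VECTORS, examples=CATEGORY_EXAMPLES):
    """Label a document with the category whose example text is most similar."""
    doc = document_vector(text, vectors)
    if doc is None:
        return None  # nothing recognisable to compare
    prototypes = {label: document_vector(ex, vectors) for label, ex in examples.items()}
    return max(prototypes, key=lambda label: cosine(doc, prototypes[label]))


if __name__ == "__main__":
    print(classify("the striker scored a late goal"))
    print(classify("senate vote delayed"))
```

A production system would learn its vectors from a large corpus and would likely train a proper classifier on top of them, but the shape of the idea is the same: words become vectors, documents become vectors, and labels follow from comparing them.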

Monday’s session will give an unusual and unique insight into how word2vec is being used to allow one of the world’s hottest startups to leverage machine learning to take on the might of already established news providers.



About Gary Donovan

Machine Learning and Data Science blogger, hacker, consultant living in Melbourne, Australia. Passionate about the people and communities that drive forward the evolution of technology.