
Right, Google is losing in ML, or hasn't caught up to speed? I don't know when you left, but even outside Google it's widely known that ML innovation is one of Google's strengths.

Two big teams use Rust at Google in production. And I guess Google didn't make the TPU or TensorFlow either. Take your pitchforks out, but once you're done, have a look at some facts.



Not losing, but commoditized: in those fields, there are now perfectly good open-source alternatives.

I was at Google from 2009 to 2014. When I joined, Google was literally the only place you could work if you wanted to do data science on web-scale data sets. Nobody else had the infrastructure or the data.

Now if you want to do search, Elasticsearch has basically the same algorithms & data structures as Google's search server, with Dremel + some extra features thrown in. (The default ranking algorithm continues to suck, though.) If you want to do deep learning, you reach for Keras, and it'll use TensorFlow behind the scenes but with a much more fluent API.

Hadoop was a major PITA to use when I joined Google; now in many ways it's easier & more robust than MapReduce, and the ecosystem has many more tools. Spark compares well with Flume. Zookeeper over Chubby. There are a number of NoSQL databases that operate on the same principle as BigTable, though I'd pick BigTable over them for robustness. Take your pick of HTML5 parsers (I even wrote and open-sourced one while I was at Google). Google was struggling mightily with headless WebKit for JS execution when I left; now you can stand up a Splash proxy in minutes or use one of the many SaaS versions. Protobufs, Bazel, gRPC, and LevelDB have all been open-sourced, as have many other projects.
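To make the "much more fluent API" point concrete, here is a minimal Keras sketch (the layer sizes and toy data are made up for illustration; this is not any particular production model):

```python
# A tiny binary classifier in Keras, which runs on TensorFlow underneath.
# All shapes and data here are invented for the sketch.
import numpy as np
from tensorflow import keras

# Toy data: 100 samples, 20 features, binary labels.
rng = np.random.default_rng(0)
X = rng.random((100, 20)).astype("float32")
y = rng.integers(0, 2, size=100)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=16, verbose=0)

# One prediction per sample, one sigmoid output each.
print(model.predict(X[:3], verbose=0).shape)  # (3, 1)
```

The whole model definition, training loop, and inference fit in a dozen lines, which is the kind of ergonomics the comment is describing.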


The big advantage of big companies like Google is that they have lots (and I mean lots) of data and for them that data is comparatively cheap to store and manipulate.

I mean, I first wrote a text-categorization algorithm using k-NN about 12-13 years ago, and to make it run with acceptable results I only needed to manually categorize about 200 articles for each category's training set. That was very doable, both in terms of time spent constructing the training set and in terms of storage costs.

Now, I have been thinking for some time about writing an ML algorithm that would automatically identify forests in present-day satellite images or in some 1950s Soviet maps (which are very good on the details). I'm pretty sure there is already some OSS code that does that, but the training-set requirements would, I think, "kill" the project for me. I read a couple of days ago (the article was shared here on HN) about some people at Stanford implementing an ML algorithm for identifying ships in satellite images, and I remember reading that they used 1 million high-res images as a training set. For me as a hobbyist, or even for a small-ish company, there's no cheap way to store that training set, never mind the cost of labeling those 1 million training images.

Otherwise I totally agree with you: we live in a golden age of AI/ML code being made available to the general public, but unfortunately it is the data that makes all the difference.
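The small-training-set setup described above can be sketched in a few lines with scikit-learn (the example documents and categories are invented; any modern k-NN on TF-IDF vectors works the same way):

```python
# k-NN text categorization with a tiny hand-labeled training set,
# roughly the approach described in the comment. Documents are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

train_docs = [
    "the striker scored a goal in the final match",
    "the team won the championship game",
    "the central bank raised interest rates",
    "stocks fell after the earnings report",
]
train_labels = ["sports", "sports", "finance", "finance"]

# TF-IDF vectors are L2-normalized, so Euclidean k-NN behaves like
# cosine-similarity nearest neighbors.
vec = TfidfVectorizer()
X = vec.fit_transform(train_docs)

clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X, train_labels)

print(clf.predict(vec.transform(["the bank cut rates again"])))
# ['finance']
```

With a couple hundred labeled examples per category instead of four, this kind of classifier gets surprisingly usable, which is why the labeling burden, not the algorithm, is the limiting factor.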


They kicked off the deep learning trend when they bought DeepMind, I guess. Otherwise, what innovation are you talking about?

Switching from KNN to DL in machine translation is impressive as a technical achievement ... but not really an innovation, and I doubt all this "innovation" impacts their bottom line in any way.


> Otherwise what innovation are you talking about?

Quantity: Google has the highest number of deep learning papers accepted at top conferences of any institution, even when DeepMind's papers are not counted toward Google's.

Quality: Transformer and the recent BERT have, pun intended, transformed the entire NLP field. Batch normalization is now a staple of all neural networks, as are its descendants instance normalization, group normalization, etc.

These are just off the top of my head. Google may have done many things wrong these days, but it definitely has not lost any edge in machine learning.


While I don't know the real numbers, back-of-the-envelope estimates for hardware costs alone (based on GCP TPU/GPU pricing) give on the order of hundreds of thousands of dollars for BERT, and tens of millions for AlphaGo and friends. Notice how very few organizations in the world are in a position to commit that kind of resources to AI problems, and that only Google and China are choosing to do so.
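The back-of-the-envelope arithmetic goes something like this. All the numbers below (device counts, training time, hourly price, number of experimental runs) are illustrative assumptions, not published figures:

```python
# Illustrative cost estimate for a large training run, priced at
# assumed on-demand cloud accelerator rates. Every number here is a
# placeholder for the sketch, not a real published figure.
def training_cost(num_devices, hours, price_per_device_hour):
    """Total hardware cost of one training run, in dollars."""
    return num_devices * hours * price_per_device_hour

# e.g. 64 accelerator devices for 4 days at an assumed $8/device-hour:
one_run = training_cost(num_devices=64, hours=4 * 24,
                        price_per_device_hour=8.0)
print(f"one run: ${one_run:,.0f}")        # one run: $49,152

# Research requires many experimental runs, not one; assume ~10:
print(f"total:   ${one_run * 10:,.0f}")   # total:   $491,520
```

Even with these made-up inputs, the multiplication lands in the hundreds of thousands, which is the point: repeated large-scale experiments price out almost everyone.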


And? Most scientific breakthroughs after World War II required expensive equipment and materials. That fact doesn't make the achievements from Google, from Bell Labs, from CERN, or from Fermilab any less innovative.


Brain existed prior to DeepMind, and is a leader in its own right.


Hardware innovation in the Pixel camera also uses ML. "Night Sight" is the result of work across teams.


Would love to know more if you have some references for this.



Deep learning at Google was due to DistBelief, not DeepMind. DeepMind brought reinforcement learning.



