Not losing, but commoditized - in those fields, there are now perfectly good open-source alternatives.
I was at Google from 2009-2014. When I joined, Google was literally the only place you could work if you wanted to do data science on web-scale data sets; nobody else had the infrastructure or the data. Now if you want to do search, Elasticsearch has basically the same algorithms & data structures as Google's search server, with Dremel + some extra features thrown in. (The default ranking algorithm continues to suck, though.) If you want to do deep learning, you reach for Keras, and it'll use TensorFlow behind the scenes but with a much more fluent API. Hadoop was a major PITA to use when I joined Google; now in many ways it's easier & more robust than MapReduce, and the ecosystem has many more tools. Spark compares well with Flume. ZooKeeper stands in for Chubby. There are a number of NoSQL databases that operate on the same principles as BigTable, though I'd pick BigTable over them for robustness. Take your pick of HTML5 parsers (I even wrote and open-sourced one while I was at Google). Google was struggling mightily with headless WebKit for JS execution when I left; now you can stand up a Splash proxy in minutes or use one of the many SaaS versions. Protobufs, Bazel, gRPC, and LevelDB have all been open-sourced, along with many other projects.
The big advantage of big companies like Google is that they have lots (and I mean lots) of data, and for them that data is comparatively cheap to store and manipulate.
I mean, I first wrote a text-categorization algorithm using k-NN about 12-13 years ago, and to get acceptable results I only needed to manually categorize about 200 articles per category for the training set. That was very doable, both in the time spent constructing the training set and in storage costs.

Now, I've been thinking for some time about writing an ML algorithm that would automatically identify forests in present-day satellite images or in some 1950s Soviet maps (which are very detailed). I'm pretty sure there's already some open-source code that does that, but I think the training-set requirements would "kill" the project for me. I read a couple of days ago (the article was shared here on HN) about some people at Stanford implementing an ML algorithm for identifying ships in satellite images, and I remember they used 1 million high-res images as a training set. For me as a hobbyist, or even for a small-ish company, there's no cheap way to store that training set - never mind the cost of labeling 1 million images.

Otherwise I totally agree with you: we live in a golden age of AI/ML code being made available to the general public, but unfortunately it's the data that makes all the difference.
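To make the "200 articles per category" point concrete, here's a minimal sketch of that kind of k-NN text categorizer: bag-of-words vectors, cosine similarity, majority vote over the k nearest labeled examples. The tiny training set and category names here are made up for illustration - a real setup would use hundreds of labeled articles per category, as described above.

```python
# Minimal k-NN text categorizer: bag-of-words + cosine similarity.
# Training data below is illustrative, not real.
import math
from collections import Counter

def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(query, training, k=3):
    """Label `query` by majority vote among its k most similar examples."""
    qv = vectorize(query)
    ranked = sorted(training,
                    key=lambda ex: cosine(qv, vectorize(ex[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

training = [
    ("stock market shares rally",   "finance"),
    ("bank interest rates rise",    "finance"),
    ("central bank cuts rates",     "finance"),
    ("team wins championship game", "sports"),
    ("player scores winning goal",  "sports"),
    ("coach praises team defense",  "sports"),
]

print(knn_classify("interest rates and the stock market", training))  # finance
```

With only a few hundred labeled examples per category this runs comfortably on a laptop, which is exactly why the storage and labeling costs were a non-issue back then - the 1-million-image satellite case is a different regime entirely.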