Honesty: how many times was the exact same data processed? Was the result cherry-picked and the best one published? For the sake of integrity, how is it possible to scientifically improve on this result? (For example, what if your AI outputs some life-altering decision?)
Repeatability: In science, if a result can be independently verified, it gives validity to the "conclusion" or result. Most AI results cannot be independently verified. Not being independently verifiable really ought to give the "science" the same status as an 1800's "Dr. Bloobar's miracle AI cure".
Numerical Analysis: performing billions or trillions of computations on bit-restricted numerical values will introduce a lot of forward-propagated error (noise). What does that do? Commentary: Video cards don't care if a few of your 15 million display pixels are off by a few LSBs; they do that 60 or 120 frames a second and you don't notice. It is an integral part of their design. The issue is, how does this impact AI models? This affects repeatability -> honesty.
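To make the numerical point concrete, here is a minimal sketch (my own illustration, not taken from any particular model): accumulate the same data in float32 against a float64 reference, then again in a different order, and compare the low bits.

```python
import numpy as np

# Illustrative only: the per-operation error is tiny, but it forward-propagates.
rng = np.random.default_rng(0)
x = rng.random(10_000_000)

ref64 = x.sum()                                    # double-precision reference
run32 = np.cumsum(x.astype(np.float32))[-1]        # naive single-precision running sum

shuffled = x.copy()
rng.shuffle(shuffled)                              # same data, different summation order
run32_shuffled = np.cumsum(shuffled.astype(np.float32))[-1]

print(ref64, float(run32), float(run32_shuffled))  # same data, same math, three answers
```

That drift is exactly the kind of forward-propagated noise that makes bit-exact repeatability hard to promise.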
If quantization error is a necessary property of achieving "AI learning that converges", there is still an opportunity for canonicalization -- a way to map "different" converged models onto each other and explain why they are effectively the "same". This does not seem to be a "thing"; why not?
In my opinion, in 2020, the AI emperor still has no clothes.
For most of the engineering applications I work on, AI is useless.
When we talk about controlling machines, our control algorithms have mathematically proven strict error bounds, such that if we provide an input with a particular maximal error (e.g. from a sensor that has some error tolerance), we can calculate what's the maximum error possible in the response that our model would produce, and then use that to evaluate whether this is even an input that should be handled by the current algorithm or not.
These control algorithms all take some inputs, and use them to "predict" what will happen, and using that prediction, compute some response to correct it. These predictions need to happen much faster than real time, since you often need to perform an optimization step to compute an "optimal" response.
These predictions are usually computed using a reduced-order model, e.g., if you had to solve a PDE over 10^9 unknowns to compute the actual prediction, you can instead reduce that to a system with 10 unknowns, by doing some pre-computation a priori. Most tools to do these kinds of reductions developed in the last 60 years come with tight error bounds that tell you, depending on your inputs, the training data, etc., what's the largest error that the prediction can have, so you can just plug these into your control pipeline.
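As a toy illustration of what such an a-priori bound looks like (assuming, for simplicity, a linear reduced-order predictor; real reductions and their bounds are far more involved):

```python
import numpy as np

# Reduced-order predictor y = A @ r + b over a handful of unknowns.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
b = np.zeros(2)

def predict(r):
    return A @ r + b

def output_error_bound(input_error_bound):
    # For a linear map, ||A(r + dr) - A r|| <= ||A||_2 * ||dr||,
    # so a guaranteed sensor tolerance gives a guaranteed prediction tolerance.
    return np.linalg.norm(A, 2) * input_error_bound

sensor_tolerance = 0.05                      # guaranteed by the sensor datasheet
print(output_error_bound(sensor_tolerance))  # worst-case prediction error, known before runtime
```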
People have been plugging neural networks into these pipelines to control robots, cars, and pretty much anything you can imagine for 10 years, yet nobody knows the upper bound on the errors that these neural networks give for a particular input, training set, etc.
Until that changes, machine learning just makes your whole pipeline unreliable, and e.g. a car manufacturer must tell you that in "autonomous driving" mode you are liable for everything your car does, and not them, so you have to keep your hands on the steering wheel and pay attention at all times, which... kind of defeats the point of autonomous driving.
---
Prediction: we won't have any tight error bounds for real-world neural networks in the 2020-2030 time frame. These are all non-linear by design (that's why they are good), error bounds for simple non-linear interpolants are pretty much non-existent, people have tried for 20-30 years, and real-world NNs are anything but simple.
Control algorithms are only a part of the problem. What about input data? There's nothing that comes close to NNs in answering a question like, say, "Is there a pedestrian ahead, and what will he/she probably do?"
A control system doesn't need to be end-to-end neural, by the way.
What about it? You get input data from data sources, which in a car would be, e.g., a sensor. The manufacturer of the sensor provides you with the guaranteed sensor accuracy for some inputs, which gives you the upper bound on the input error from that source.
That is, in a reliable control pipeline, the upper bounds on the errors of data sources are known a priori.
Sure, sensors can malfunction, but that's a different problem that's solved differently (e.g. via resiliency using multiple sensors).
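A sketch of the kind of resiliency meant here (purely illustrative, the function name is made up): vote across redundant sensors so that a single faulty reading cannot dominate.

```python
def fused_reading(readings):
    """Median vote across redundant sensors; one outlier cannot move the result."""
    ordered = sorted(readings)
    return ordered[len(ordered) // 2]

print(fused_reading([10.1, 10.2, 55.0]))  # -> 10.2, the malfunctioning sensor is ignored
```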
> A control system doesn't need to be end-to-end neural, by the way.
Who's talking about end-to-end neural nets for control? If a single part of your control pipeline has unknown error bounds, your whole control system has unknown error bounds. That is, it suffices for your control pipeline to use a NN somewhere for it to become unreliable.
This doesn't mean that you can't use control systems with unknown error bounds somewhere in your product, but it does mean that you can't trust those control systems. This is why drivers still need to keep their hands on the steering wheel on a Tesla: the parts of the pipeline doing the autonomous driving use NNs for image recognition, and the errors on that are unknown.
This is also why all "self driving" cars have simpler data sources like ultrasonic sensors, radar, lidar, etc. which can be processed without NNs to avoid collisions reliably. You might still use NNs to improve the experience but those NNs are going to be overridden by reliable control pipelines when required.
> That is, in a reliable control pipeline, the upper bounds on the errors of data sources are known a priori.
Now we have natural neural networks in the control loop for some reason despite their unknown error bounds. To give another example: is there a sensor for vehicle placement relative to road edges with a known upper error bound, which is less than the width of the road? No, we have GPS, radar, lidar, camera data that we need to interpret somehow.
A car that reliably avoids collisions (Can it, though? It needs to predict road situation to do it reliably), but can occasionally veer off the road, doesn't strike me as particularly safe.
> to be overridden by reliable control pipelines when required.
Those reliable pipelines need to be mostly reactive, and there's a limit to what they can do. You can't avoid a collision when a car emerges from around a corner with 0.1 seconds to react. You need complex processing of those "simple" data sources to detect zones that can't be observed right now and to assess the probability of such a situation.
All in all, we already have an unreliable human part in the control loop of a vehicle. A control system that is provably robust in all real-world situations will be the ultimate achievement, not a prerequisite for wide use of self-driving cars.
> Now we have natural neural networks in the control loop for some reason despite their unknown error bounds.
Not in any automatic control loops. All control loops that do this have a human as the final piece of the pipeline, and that human is legally responsible for the outcome of the control loop. That defeats the point of automatic control.
> To give another example: is there a sensor for vehicle placement relative to road edges with a known upper error bound, which is less than the width of the road?
No, which is why these control pipelines have a human in control. For the experimental pipelines that do not have a human at the end of the control loop, they do have other control pipelines to avoid collisions, and the only thing their control algorithms guarantee is a lack of collisions, not the ability of the car to stay in its lane. That is, the car might leave the lane under some conditions, but if it does, it will detect other objects and avoid crashing into them (although those objects might crash into it).
> All in all, we already have an unreliable human part in the control loop of a vehicle. A control system that is provably robust in all real-world situations will be the ultimate achievement, not a prerequisite for wide use of self-driving cars.
Right now, control loops without proven error bounds are not allowed by certification bodies on any control-loop in charge of preserving human lives in the aerospace, automotive, medical, and industrial robotics industries.
Allowing control loops without known error bounds to be in charge of human lives would lower the current standards of these industries. It could be done (the government would need to create a new kind of regulation for this), but at this point it is unclear what that would look like.
> Well actually, they are allowed in the automotive world, re: Tesla.
This is false: Tesla's "autopilot" isn't a control-loop, much less a certified control-loop. What Tesla calls "autopilot" is actually a "driving assistant". It is not in charge of controlling the car, but it is allowed to assist the driver to control the car.
This is a subtle but very important difference, since this is the reason that Tesla tells its owners to always keep their hands on the steering wheel, and Tesla cars are required to disable their "driving assistant" in countries like Germany if the driver's hands leave the steering wheel. It is also the reason that Tesla is not liable if a car with "autopilot" kills somebody, because the "autopilot" isn't technically driving the car, the driver is.
The word "autopilot" comes from the aerospace industry, where pilots are not required to keep their hand on the controls or pay attention when the "autopilot" is on, and the manufacturer is liable if the "autopilot" screws up (e.g. see Boing). Tesla's usage of the word "autopilot" to refer to driving assistants is misleading and dangerous.
A true car "autopilot" is what people call "Level 5" in the autonomous driving community. Elon Musk promised to ship 10.000 self-driving Level 5 Tesla taxis in 2019, and now mid 2020. We'll see about that.Elon Musk has been saying that Level 5 will happen next year for the last 10 years, so at least when it comes to autonomous driving, their predictions have been consistently wrong.
Most CEOs who venture to predict when Level 5 will happen say something like "not before 2030". And Waymo's goal is to achieve """Level 5""" in a very small restricted area of Phoenix downtown, for some restricted weather conditions, some time between 2020 and 2030, but it is unclear what the road to certification would be after achieving that, and the road from there to actual, real Level 5 is still unclear at this point.
I would argue most machine learning papers that use public datasets have code available and are often also reproduced independently (sometimes just because of somebody's need to port between PyTorch/TensorFlow). Reproducibility is still a big problem in reinforcement learning, however.
People are definitely thinking carefully about issues of noise and quantization error. Low-precision or quantized neural networks are increasingly popular at both train and test time. And people deliberately introduce noise into neural networks for various reasons (dropout, robustness certificates) and then have to think about the effect on performance. Typically things are quite reproducible in these situations btw, for a given noise distribution.
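For example (a minimal PyTorch sketch, not tied to any particular paper): once the noise distribution and the seed are fixed, deliberately injected noise such as dropout is bit-for-bit reproducible.

```python
import torch

x = torch.ones(2, 8)
drop = torch.nn.Dropout(p=0.5)   # deliberately injected noise

torch.manual_seed(0)
a = drop(x)
torch.manual_seed(0)
b = drop(x)

print(torch.equal(a, b))         # True: same seed, same noise, same result
```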
Re: canonicalization, the theoretical work on "neural tangent kernels" might be relevant.
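One simple notion of canonicalization (my own sketch, not established practice): a two-layer net computes the same function if you permute its hidden units, so you can sort the units into a fixed order before comparing two "different" converged models.

```python
import numpy as np

def canonicalize(W1, W2):
    # Sort hidden units by the norm of their incoming weights.
    order = np.argsort(np.linalg.norm(W1, axis=1))
    return W1[order], W2[:, order]

W1 = np.random.rand(4, 3)        # input -> hidden
W2 = np.random.rand(2, 4)        # hidden -> output
perm = np.random.permutation(4)  # a permuted but functionally identical model

A1, A2 = canonicalize(W1, W2)
B1, B2 = canonicalize(W1[perm], W2[:, perm])
print(np.allclose(A1, B1) and np.allclose(A2, B2))   # True
```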
>I would argue most machine learning papers that use public datasets have code available and are often also reproduced independently
lol have you ever tried? i have several github issues open on published models i couldn't recreate, with responses like "i don't remember the parameters i used and we've moved on".
i’ve only run into stuff like that trying to get RL agents to train. Maybe GANs and other “hard to train” models are bad too idk. But generally things do actually seem to be reproducible.
>Human babies don’t get tagged data sets, yet they manage just fine, and it’s important for us to understand how that happens
I do not really understand this. Human babies get a constant stream of labeled information from their parents. Contextualized speech is being fed to them for years. Toddlers repeat everything you say. Is this referring to something else that babies can do?
I'm curious to know what you mean by "labelled information".
I'm guessing that what you are calling "labelled information" is various forms of encouragement or discouragement that could be considered positively and negatively "labelled" examples.
If that is the case, linguistics research back in the '70s found that infants get almost no negative examples of, in particular, language. For example, a parent will not correct a child by saying, "no you can't say 'eated' because then you could also say 'sitted'". Instead they will correct by saying "no, you should say 'ate'" etc. That is important because there was a famous proof in inductive inference (the precursor to computational learning theory) that languages higher in the Chomsky hierarchy than regular languages cannot be learned from positive examples alone. And yet, babies eventually learn to speak human languages, which are assumed to be at least context-free. Chomsky used these findings to support his claim of a "universal grammar" or innate language endowment [1].
If you are talking about multi-class labelling, that's even harder to imagine. In machine learning, a multi-class classifier will map inputs to some set of categorical labels (i.e. a set of integers) but the mapping from those labels to concepts that a human would recognise, such as 1:cat, 2:bat, 3:hat, etc, must be performed manually, because the classifier and humans do not have a shared understanding of what e.g. "cat", "bat" and "hat" mean. The classifier only knows 1,2,3... etc, the human knows that "1 means cat". How would this lack of shared context be resolved between an adult and a baby, so that the adult could provide "multi-class labels"?
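To illustrate the point in code (a toy example): the classifier only ever sees the integers; the names live in a table a human wrote.

```python
id_to_name = {1: "cat", 2: "bat", 3: "hat"}   # supplied by a human, not learned

prediction = 2                    # what the classifier actually outputs
print(id_to_name[prediction])     # "bat" -- meaningful only because we wrote the table
```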
___________
[1] Sorry that I don't have any references for all this handy- I can try to dig some up if you're interested, but you could start by reading the wikipedia page on Language Identification in the Limit, which is about the famous result from inductive inference I mention (also known as Gold's result, after the man who derived it).
By now, the Chomskian approach to linguistics is not unchallenged anymore and there is some doubt on whether the "poverty of stimulus" argument holds any water (see e.g. [1]).
IMHO, modern cognitive science based approaches (such as by Tomasello and others) have a better chance of explaining how language is acquired than the hypothesising of the 70s.
I don't have time now to go into more references, but the question is far from settled.
I agree that the matter is not settled [edit: in the sense that there is criticism of Chomskian linguistics, from linguists] and that there is debate on the poverty of the stimulus and universal grammar etc, but the post you link to is not a very good summary of it. I recommend Alexander Clark's "Linguistic Nativism and the Poverty of the Stimulus" for a good look at the subject from the non-Chomskian point of view.
Note however that, as far as I understand it, there is no controversy about the lack of negative examples of language given to children by their parents.
Fair, I just looked for the first reference I could find. I haven't done any real linguistics in years, although I vividly remember the arguments. Especially that Evans & Levinson article 10 years or so back ("The Myth of language universals") which generated quite some heat. If I have time, I will check out your reference.
Not sure about the negative examples; but language acquisition was never my focus area anyway.
I would just generally be cautious about applying formal language theory too readily to linguistics, that's all I wanted to say.
> How would this lack of shared context be resolved between an adult and a baby, so that the adult could provide "multi class labels"?
The baby would do multi-modal learning - learning to associate a sound (a name) with an image (an object).
I don't think the parent and baby lack a shared context. They are both agents in the same environment, who often interact and cooperate to achieve goals and maximise rewards. The baby understands the world much earlier than it can speak; the context is there.
I don't know if it's a good idea to mix terminology from machine learning (or game theory?) into the subject of human learning, like you do. At some point the analogies become a bit too far-fetched. Parents and babies "are agents who interact to maximise rewards"? That just sounds like taking an analogy and running with it, and then putting it in a rocket and sending it to Mars. We have no idea why and how babies think or decide to behave how they behave.
This is one reason why I'm confused about the OP's use of "labelled information". Clearly that is a term borrowed from machine learning to describe something that happens in the real world- but, what?
There may be some kind of labeling encoded in genes. One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.
However, past that, the actual sounds themselves, and any association to meaning, are pretty far from tagged data sets. Stuff like the specifics of language (e.g. that a dog is called 'dog') are definitely learned, and children learn them with typically only a handful of stimuli, often a single one.
For contrast, imagine training a model with raw sound data tagged only with "speech" vs "not speech" (and probably only a few thousand data points at that) and I will be amazed if it can recognize a single word. And babies don't just learn words, they learn their association to things they see and hear, and grammar, and abstract thought.
Do note that it is very likely that human brains can learn all that because they have some good heuristics built in. We definitely know some stuff is "hardware" - object recognition, basic mechanics, recognizing human faces and expression, and others. We are pretty sure higher level stuff is also built in - universal grammar, basic logic, some ability to simulate behavior seen/heard in other humans. This specialized hardware was also most likely learned, but over much, much greater periods of time, through evolution over hundreds of millions of years (since even extremely old animals are capable of picking out objects in the environment, approximating their speed etc).
There seems to be a spectacular underestimation of the amount of training data humans experience.
Not only does socialised human intelligence require at least a decade of formal education, but it also spends a lot of time in a complex 3D environment which is literally hands-on.
It's true some of the meta-structures predispose certain kinds of learning - starting with 3D object constancy, mapping, simple environmental prediction, and basic language abstraction.
But that level gets you to advanced animal sentience. The rest needs a lot of training.
For example - we can recognise objects in photographs, but I strongly suspect we learn 3D object recognition first - most likely with a combination of shape/texture/physics memory and modelling - and then add 2D object recognition later, almost as a form of abstraction.
Human intelligence is tactile, physical, and 3D first, and abstracted later. So it seems strange to me to be trying to make AI start with abstractions and work backwards.
Well, babies start picking out objects within weeks or months after birth. And many birds and mammals are much faster than that. That's not a huge amount of data to learn something so abstract from scratch, especially given the limited bandwidth of our data acquisition.
Furthermore, for other kinds of human knowledge, the learning process is very rarely based on data. After the acquisition of language, we generally seem to learn much more by analogy and deduction than by purely analyzing data. The difference is evident, since we can often pick up facts from a single datapoint - even small children in kindergarten can.
Also, getting back to your point on how we start AI - if you try to take a neural network and throw 3D sensor data at it, and immediately start using its outputs to modify the environment those sensors are sensing, I suspect you will not get any meaningful amount of learning. You probably need a very complex model and set of initial weights to have any chance of learning something like 3D objects and their basic physics (weight, speed, and how those affect their predicted position). I would at least bet that you wouldn't get anywhere near, say, kitten accuracy in one month of training.
>> Not only does socialised human intelligence require at least a decade of formal education, but it also spends a lot of time in a complex 3D environment which is literally hands-on.
Note that for most of our history, the majority of humans did not get anything like "formal" education as we mean it today (i.e. going to school). Although adults in hunter-gatherer societies do teach children many things (e.g. which mushrooms are edible etc.), this must be done after a child has learned language - and those kids don't go to school to learn their language, they pick it up as they grow up.
> One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.
I don't see how that's safe to assume at all. What one could assume is that the level of familiarity and comfort (sight, smell, touch) might be somewhat genetic and gives such inputs precedence. Or it might just be that those sources of information are engaging and animated.
> Do note that it is very likely that human brains can learn all that because they have some good heuristics built in.
Nor do I see this assumption having any weight; many of the heuristics we take for granted were hard fought, it's just so long ago that we've forgotten the fight. Let's not forget how "little" our species gets over the first few YEARS of child development.
If your child can move their body, just about walk and talk a little at TWO WHOLE YEARS in, they're an achiever.
The encoding I was talking about may well be something more abstract than 'imitate humans'. Still, babies don't generally try to imitate the sound of rattles or household sounds nearly as much as speech, so I still conclude that it is a safe assumption that there is something about sounds made by humans that is inherently interesting to them for some reason (instead of being a learned behavior).
Related to the second, the rate at which we learn, and the very specific order we learn things in, points very strongly in the direction that there is some built-in model that we train inside of. For example, essentially all babies first learn intonation before learning words. Also, most words are learned with an extremely small set of examples - at some ages, often hearing a word a single time is enough for the child to learn it (known as the 'poverty of the stimulus' problem). This has been mainstream understanding ever since behaviorism fell out of favor due to similar arguments by Chomsky.
> try to imitate the sound of rattles or household sounds nearly as much as speech
Well surely that's a case of the range of the vocal cords? Parrots are another intelligent creature that has better range, and they imitate all sorts of sounds.
> Related to the second, the rate at which we learn, and the very specific order we learn things in, points very strongly in the direction that there is some built-in model that we train inside of.
Or that an action like walking requires one to put one foot ahead of the other, all other strategies in attempting to walk end in failure, which is why we don't see them.
I'd like to point out that all humans perceive intonation and it's perceivable outside of language; that's why it's easy to pick up - you don't need language to realise that someone is cross, or happy, or sad. However, considering that autistic children cannot, maybe there are some genetic markers at play there at least.
>> Well surely that's a case of the range of the vocal cords? Parrots are another intelligent creature that has better range, and they imitate all sorts of sounds.
Parrots (and birds like mynas etc) imitate human sounds and all sorts of sounds, but they don't discriminate between, e.g., the sound made by a train whistle and the sound made by a human carer. I mean that a parrot will not learn to speak a human language by imitating its sounds, any more than it'll learn to speak train by imitating a train whistle.
Human babies don't just imitate their parents' sounds, they figure out what those sounds do and how they come together to form language and express meaning. That is a small miracle that we don't understand at all well, and Chomsky is 100% right to speak of scientific wonderment in its context. It is really mind-blowing that kids can eventually learn to speak without, for the vast majority of children, anyone around them having any idea how to teach a kid to speak in any systematic way. Not to mention the trouble that adults have in learning another language even given formal training in it (which perhaps is further evidence that we really don't know how to teach language, because we don't understand how it works - so again, how can we teach small children to speak a language, but not adults?).
Chomsky's universal grammar is really the simplest answer: children don't learn how to speak a human language, they already know how, and they only have to learn the vocabulary and syntax of the language of their parents. This only presupposes that humans have human biology, and that our biology is responsible for our language ability. We can't learn to fly because we don't have wings, and parrots can't learn to speak because they don't have human brains.
[Edit: that it's the simplest answer doesn't mean it's the right answer, only that it's got a damn good chance to be it.]
> One thing that it is safe to assume is genetically encoded somehow is that sounds made by your parents/humans around you are worth repeating while other sounds are not.
Well, these sounds come with a face attached and we know babies are hardwired to pay attention to faces.
I show my 19 month old daughter like three cartoon drawings of owls and she recognises a live one at the bird park instantly, unprompted. We have a way to go.
I believe cartoons are our equivalent of adversarial images. They typically look nothing like (photos of) their namesake and yet we recognise them usually without prompting.
It is my understanding (although I sure don't have any evidence on me) that cartoons and such (at least, the ones where we haven't simply learned that this cartoon means this animal) work by being a picture of what we remember about an animal. Akin to a caricature; the cartoon contains the most salient features. It doesn't work by looking like the actual animal; it works by reacting with how we remember the animal.
Isn't that kind of the same thing? Adversarial examples work by matching what the neural net 'remembers' about the target classification, rather than being a picture of a thing in that class. Neural nets just find different features salient.
I've wondered in the past if we could use black box adversarial methods with Mechanical Turk to generate adversarial examples that work on humans. Maybe they'd end up looking like cartoons?
(Also agreed, some cartoon animals are just informed likeness - for instance Goofy doesn't look anything like a dog, at least to me.)
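For reference, the usual construction behind adversarial examples is a small step in the gradient direction that makes the net more confident in a chosen class, i.e. toward what the net 'remembers' about that class. A minimal PyTorch sketch (the model here is a stand-in, not a real classifier):

```python
import torch
import torch.nn.functional as F

# Stand-in model: a single linear layer over a 28x28 "image" with 10 classes.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))

x = torch.rand(1, 1, 28, 28, requires_grad=True)
target = torch.tensor([3])                       # the class we want the image pushed toward

loss = F.cross_entropy(model(x), target)
loss.backward()

epsilon = 0.1
x_adv = (x - epsilon * x.grad.sign()).clamp(0, 1).detach()   # small nudge toward the target class
```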
>> Akin to a caricature; the cartoon contains the most salient features
The question is - how do we know what are the salient features? How do we figure out that if we make _this_ drawing, it will "remind of" an owl, and if we make _that_ drawing it will "remind of" a dog (or not, as the case may be)? I mean, if we knew that, how humans extract salient or relevant features from their environment, we'd be way ahead on the path to AI.
Humans don't _just_ learn to recognise the things they see, though. They have complex mental models of the objects, and things about them they can access by choice, as well as the ability to make hypotheses about new things that they can immediately test. Humans don't get labelled photos of cats; they see cats in 3D, can interact with them, use spatial reasoning, and walk around them to completely separate that cat from the background behind it.
I love PyTorch, but I'm not confident the claim that it is the most popular is close to true. The cited link, which points out that a lot of new research is in PyTorch, simply doesn't account for the amount of TensorFlow in production.
Sure, a lot of academics may be embracing PyTorch, but almost all production models have been in TensorFlow. Tesla, though, is a huge, notable example of PyTorch being used at scale.
I do suspect that the split between TensorFlow 1 and 2 comes at perhaps one of the worst times for TF 2; many teams will likely try out PyTorch instead.
I think both are amazing frameworks; however, TF was designed for Google scale... which leads to a lot of difficulties, since 99.9% of users are not at Google scale.
Depends on how you measure it, of course. However, stackoverflow survey, google trends, and github octoverse all show PyTorch is on a steep upward trajectory that recently reached effective parity with TensorFlow and has not yet started slowing down.
I believe we will (1) find some basic data structures and algorithms to do real AI. (2) At first it will be able to do I/O only via text or simple voice. (3) Due to (1) it will learn very quickly from humans or other sources. (4) Soon it will be genuinely smart, enough, say, to discover and prove new theorems in math, to understand physics and propose new research directions, to understand drama and write good screen plays, to understand various styles and cases of music and compose for those, etc.
Broadly from (1) with the data structures it will be able to represent and store data and, then, from the algorithms, manipulate that data generating more data to be stored, etc.
In particular it will be able to do well with thought experiments and generation and evaluation of scenarios.
Good image understanding will come later but only a little later; the ideas in (1) may have to be revised to do well on image understanding.
*I believe we will (1) find some basic data structures and algorithms to do real AI.*
There's no new kind of data structure to discover; humanity has already enumerated all the possible ones.
The choice of data structure for semantic parsing is trivial: it's a hypergraph.
The debate isn't the data structure but how to fill it correctly while keeping the same expressivity as the original input (natural language).
There's no reason to think we will make progress on this task beyond wishful thinking.
Only a handful of humans are working on semantic parsing, which is the real AI task.
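For what it's worth, the data structure itself really is simple; here is a minimal sketch of a hypergraph (illustrative only; as said above, the hard part is filling it from natural language, not representing it):

```python
from collections import defaultdict

class Hypergraph:
    def __init__(self):
        self.edges = []                     # each hyperedge links any number of nodes
        self.incidence = defaultdict(set)   # node -> indices of hyperedges containing it

    def add_edge(self, label, *nodes):
        idx = len(self.edges)
        self.edges.append((label, nodes))
        for node in nodes:
            self.incidence[node].add(idx)

g = Hypergraph()
g.add_edge("gives", "mary", "book", "john")   # one relation over three entities
print(g.incidence["book"])                    # {0}
```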
There are some arguments about the term data structure. For that argument, what I have in mind is more general. So, with generality beyond data structures, we can have relational database schemas (apparently Microsoft's SQL Server documentation has a different meaning of schema), which can be new.
Or the intelligence needs to store and manipulate data. To store is to use a data structure of some kind, and to manipulate is to use some algorithms.
So, the challenge is to find the data structure and corresponding algorithms.
My prediction is that within the next 10 years we will do that.
My reasoning:
(1) Mice, rats, kittens, puppies, ..., and much more including humans do a lot of it, that is, what humans do with intelligence.
(2) The babies of these species learn starting with relatively very little or nothing.
(3) The learning is fairly simple, e.g., does not require a lot of computing or data.
(4) Once the learning gets going, especially in humans, the amount of learning grows quickly from building on past learning, new data, and new thinking.
(5) We can guess that especially early on the learning is relatively simple -- that keeps the intelligence relatively stable.
Well I'm predicting that from essentially just (1)-(5) we can get the human level intelligence I mentioned, i.e., we will find the appropriate data structures and algorithms.
If there were a lot of value in my thinking about (1)-(5), NSF and/or DARPA would fund me to pursue what I outlined; since I'm sure they would not fund me, we, including me, can conclude that there is not a lot of value in my thinking. Maybe that does not make my thinking wrong; it might be right but just so far incomplete!
So, why not the short summary of GOFAI "someday"? Because I'm guessing that data structures, algorithms, (1)-(5), and 10 years will be at least a significant part of what will be sufficient for GOFAI!
In particular, what I explained seems to be more than just "someday" because:
(A) I believe that current work in deep learning neural networks, which obviously DO have some applications, might have some role in some relatively autonomous and non-conscious parts of GOFAI, maybe similar to some of what is crucial in some insects. So, I don't see the current work in neural networks as very relevant to the GOFAI I am predicting. So, I'm pruning off that stream of work for progress toward what I'm predicting. Again the neural network DOES have some applications.
(B) I worked for some years in AI in IBM's Watson lab. My view is that none of that work is at all relevant to what I'm predicting. So, I'm pruning off that stream of work for progress toward what I'm predicting.
So, with (A) and (B), I'm saying two paths to avoid. If I'm right, then avoiding (A) and (B) would have some evidence of being good contributions.
But no doubt like nearly everyone else, I want practical results faster than my prediction promises so don't want to work on that prediction. And I am not working on that path and, instead, am pursuing my startup which DOES have some pure and applied math at its core but is MUCH more likely to work and work much faster than my AI prediction!
But here HN asked for some predictions, so I made some!
Unless walled gardens bring in mechanisms that bring a real cost to the table for creating content, such as limitations on posts and account creation, they would be in the same boat.
In either system users can go to a strict whitelisting approach, but that would come at a cost of discoverability and serendipity. This would strengthen the position of anointed 'influencers' and curators, and diminish the value of algorithmic feeds, thus eating into the revenue model of those that rely on them.
Since those forces counteract each other, it is hard to make predictions, but I will anyway; take them with a strong dose of uncertainty.
My dystopian take is that the ubiquitous deployment of 'fake' actors will further undermine the general inter-human level of trust. Evolutionarily less fragile altruistic strategies rely on unfakable, or at least costly, signals to stand up to 'free-rider' intrusion. Sadly, undermining trust will accelerate the further descent into identity tribalism we are already witnessing today: a segregation into near-immutable, trait-based groups where cross-clan transgressions are punished with extreme measures. As a caricature, think of the television portrayal of tribal gang cultures in maximum security prisons.
My utopian take is that the onslaught of fake noise will restore the reliance on offline contacts and connections. Due to physical proximity, these offline groups have to share more of the negative externalities caused by their actions, which could lead to more altruistic consideration in consumption and production decision-making. Direct, verifiable contribution and impact might trump systemic distrust and lead us out of the current innate identity-tribal descent.
This problem is coming from the way providers choose to deliver content, the way they design their data products and the metrics they optimise for. NLP is simply the group of algorithms that power this process, not the underlying cause.
More likely we'll hit that annoying point where you cannot be quite sure whether an article is machine-generated lorem ipsum that sounds convincing but does not have any real information behind it. Something like http://news.mit.edu/2015/how-three-mit-students-fooled-scien... but at scale.
- individual GPUs will hit a plateau at around 25 TFLOPS in FP32 due to the slowing of Moore's law and thermal dissipation limits; however, it will be easier than ever to interconnect multiple GPUs into large virtual ones due to interconnect tech improvements and modularization of GPU processing units
- only large companies will be able to train and use SOTA models with training costs in $10M-$100M per training run and those models will hit law of diminishing returns quickly
- 50% of all white collar jobs will be automated away, including a significant chunk of CRUD software work. Increased productivity won't be shared back with society, instead two distinct wealth strata of society will be formed worldwide due to scale effects, like in Latin America (<1% owners, >99% fighting for their lives).
- AI will make marketing, ads and behavioral programming much more intrusive and practically unavoidable
The page forwards itself to a spam Google survey for me? The page history fills with the spam survey and I can't navigate back to the article. iOS Safari with reader view enabled.
Re: PyTorch TPU support, has anyone checked it out beyond "it works"?
There are many aspects of TPUs that I'm not convinced are easy to port: Colocating gradient ops, scoping operation to specific TPU cores, choosing to run operations in a mode that can use all available TPU memory (which is up to 300GB in some cases), and so on.
These aren't small features. If you don't have them, you don't get TPU speed. The reason TPUs are fast is because of those features.
I only glanced at PyTorch TPU support, but it seemed like there wasn't a straightforward way to do most of these. If you happen to know how, it would be immensely helpful!
As far as predictions go, AI will probably take the form of "infinite remixing." AI voice will become very important, and will begin proliferating through several facets of daily life. One obvious application is to apply the "abridged" formula to old sitcoms. (An "abridged" show is when you rewrite it using editing and new dialog, e.g. https://www.youtube.com/watch?v=2nYozPLpJRE. Someone should do Abridged Seinfeld.) AI audio has already made inroads on Twitch, where streamers like Forsen allow donation messages to be read off in the voice of various political figures (and even his own voice). The Pony Preservation Project was recently solved with AI voice (https://twitter.com/gwern/status/1203876674531667969), meaning it's possible to do realistic voice simulations of all the MLP characters with precise control over intonation and aesthetics.
Natural language AI will continue to ramp up, and people will learn how to apply it to increasingly complex situations. For example, AI dungeon is probably just the beginning. I recently tried to do GPT-2 chess (https://twitter.com/theshawwn/status/1212272510470959105) and found that it can in fact play a decent game up to move 12 or so. AI dungeon multiplayer is coming soon, and it seems like applying natural language AI to videogames in general is going to be rather big.
Customer support will also take the form of AI, more so than it does already. It turns out that GPT-2 1.5B was pretty knowledgeable about NordVPN. (Warning: NSFW ending, illustrating some of the problems we still need to iron out before we can deploy this at scale.) https://gist.github.com/shawwn/8a3a088c7546c7a2948e369aee876...
AI will infiltrate the gamedev industry slowly but surely. Facial animation will become increasingly GAN-based, because the results are so clearly superior that there's almost no way traditional toolsets will be able to compete. You'll probably be able to create your own persona in videogames sooner than later. With a snippet of your voice and a few selfies, you'll be able to create a fairly realistic representation of yourself as the main hero of e.g. a Final Fantasy 7 type game.
We've seen multiple BERT-related PyTorch models training successfully on Cloud TPUs, including training at scale on large, distributed Cloud TPU Pod slices.
Would you consider filing a GitHub issue at https://github.com/pytorch/xla or emailing pytorch-tpu@googlegroups.com to provide a bit more context about the specific issue you encountered?
Google wrote BERT and they provide technical support to the FB Pytorch TPU port so it's not entirely surprising. RoBERTa, (Fb's variant) would be a good candidate to test it with.
We only see code when customers open-source it or otherwise explicitly share it with us. We are directly in touch with several customers who are using the PyTorch / TPU integration, so we hear feedback from them, and we also run a variety of open-source PyTorch models on Cloud TPUs ourselves as we continue to improve the integration.
"With a snippet of your voice and a few selfies, you'll be able to create a fairly realistic representation of yourself as the main hero of e.g. a Final Fantasy 7 type game. "
Extrapolating from FB/Instagram I would wager a 'fairly realistic' representation while technically an option will be shunned by most in favor of a 'fairly idealized' representation or alternatively some sort of myopically polarized extreme trait representation.
I agree with so many points here. Generative AI is one of those things that are immediately applicable. Unlike problems that require 100% accuracy, generative models don't need it. As long as the avatar looks close enough, it works. And no one really needs explainability there.
I'm generally pessimistic about predictions of the future. In this case I can't help but smile. They're trying to predict how a field (AI), which deals with complex adaptation, will intelligently adapt its adaptive techniques in the coming year, within an environment (we humans) that are themselves changing behavior while adapting to AI. That's approximately three meta levels. Good luck, guys!
I predict the broader ML/DL community will keep pumping out iterative papers that push the ball just a little bit forward while maintaining job security: gatekeeping, no one thinking outside of the box, benchmark chasing, just enough for the appearance of progress, and nothing broadly innovative or disruptive. The applications of ML/DL will continue to be gimmicky consumer products that have questionable value, questionable profit potential, add even more to disinformation/misinformation, produce more informational noise, only serve to rebuff a big corp's cloud offerings, and waste people's time. I predict tons more 'bought' articles that hype up AI technology for the typical 'household' names. I predict the same ol' echo chamber of thought and reinforcement of 'gatekept' ideology. I expect a number of more prominent articles critiquing the shortfalls of the technology. I expect a number of young minds steeped in DL/ML coming to the realization that it's not what they expected... that it's a big profit/revenue story for universities and established corporate platforms. I expect a number of them to realize ML/DL is truly not "AI" or anything close to it, that they aren't doing cutting-edge research, and that they are not allowed to think outside of the echo chamber of 'approved' approaches.
I predict more useless chatbots that utter unpredictable word salads. I expect more gimmicky, entertainment-focused uses of it. I expect more assistants being adopted for data collection. I expect more people who aren't busy or doing anything important using assistants and text-to-speech to speed up their tasks so they can waste more of their time on social media/YouTube/entertainment. Samsung Neon is coming out in some days, making use of that 'Viv' acquisition.
I expect more feverish attempts at attacking low hanging fruit jobs with overly complex solutions. I predict failures in a number of startups targeting this. I predict no pronounced progress in self-driving cars nor any particular grand use for them. I predict several hollow attempts to overlay symbolic systems over ML/DL or integration attempts of it with ML/DL from prominent AI figures. I predict pronounced failures in this effort cementing a partial end to the hype of ML/DL.
I predict we will get a pronounced development outside of run-of-the-mill corporate/academic gatekept/walled garden ML/DL that will forge a new and higher path for AI. Hinton's words from prior years will have been heeded and the results of a new approach to AI presented. A change of guard, a break from the necessity of a PhD, a break from the echo-chamber of names, and a broader and more deeply thought out vision. Disruption not of low-hanging-fruit but disruption directed at the heart of the AI/Technology industry... So that we may finally progress from this stalled out disinformation/misinformation/hype/gatekeeping/cloud/all-your data-belongs-to-us cycle.
Apparently, a lot of the positives are broadly overhyped because such hype and misrepresentation keep money in people's pockets, ventures overvalued, universities with a steady pipeline of warm bodies paying 40-50k a year, and a movement sustained. Apparently, you can't do this forever.
The linked page threw up a suspicious looking overlay. I left the site, too bad since I wrote a blog with my predictions last night and wanted to compare my AI predictions.
If you use Firefox, just enable "Reader View" for the site (just hit F9). Removes all the crud around the page and just shows the text and required images.
Some axioms that I'm not seeing talked about much:
* Artificial general intelligence (AGI) is the last problem in computer science, so it should be at least somewhat alarming that it's being funded by internet companies, wall street and the military instead of, say, universities/nonprofits/nonmilitary branches of the government.
* Machine learning is conceptually simple enough that most software developers could work on it (my feeling is that the final formula for consciousness will fit on a napkin), but they never will, because of endlessly having to reinvent the wheel to make rent - eventually missing the boat and getting automated out of a job.
* AI and robot labor will create unemployment and underemployment chaos if we don't implement universal basic income (UBI) or at the very least, reform the tax system so that automation provides for the public good instead of the lion's share of the profit going to a handful of wealthy financiers.
* Children aren't usually exposed to financial responsibility until around the age of 15 or so, so training machine learning for financial use is likely to result in at least some degree of sociopathy, wealth inequality and further entrenchment of the status quo (what we would consider misaligned ethics).
* Humans may not react well when it's discovered that self-awareness is emotion, and that as computers approach sentience they begin to act more like humans trapped in boxes, and that all of this is happening before the world can even provide justice and equality for the "other" (women, minorities, immigrants, oppressed creeds, intersexed people, the impoverished, etc etc etc).
My prediction for 2020: nothing. But for 2025: an optimal game-winning strategy is taught in universities. By 2030: the optimal game-winning strategy is combined with experience from quantum computing to create an optimal search space strategy using exponentially fewer resources than anything today (forming the first limited AGI). By 2035: AGI is found to require some number of execution cycles to evolve, perhaps costing $1 trillion. By 2040: cost to evolve AGI drops to $10 billion and most governments and wealthy financiers own what we would consider a sentient agent. By 2045: AGI is everywhere and humanity is addicted to having any question answered by the AGI oracle so progress in human-machine merging, immortality and all other problems are predicted to be solved within 5 years. By 2050: all human problems have either been enumerated or solved and attention turns to nonhuman motives that can't be predicted (the singularity).