> It feels like either finding that 2% that's off (or dealing with 2% error) will be the time consuming part in a lot of cases.
Closing that last 2% (20% on some benchmarks) so the output is consistently error-free could cost $100B or more.
That bar doesn't apply to generating art. But for agentic tasks, an error rate of 2% at best and 20% at worst may be unacceptable.
As you said, if the agent makes an error in any step of an agentic flow or task, the entire result is incorrect, and you have to check over all of the work again to spot it.
Most people will just throw it away and start over, wasting more tokens, money and time.
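
To put rough numbers on how a per-step error compounds, here is a back-of-the-envelope sketch. It assumes each step fails independently with the same error rate and that any single error spoils the whole run. Real agents won't be that tidy, and the function names and step counts are made up for illustration.

```python
import math


def clean_run_probability(per_step_error: float, steps: int) -> float:
    """Chance that every step succeeds.

    Assumption (mine): steps fail independently at the same rate,
    and one bad step makes the whole result wrong.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - per_step_error) ** steps


def expected_attempts(per_step_error: float, steps: int) -> float:
    """Average number of full runs until one is clean, if you always
    throw the result away and start over (geometric distribution)."""
    p = clean_run_probability(per_step_error, steps)
    return math.inf if p == 0.0 else 1.0 / p


if __name__ == "__main__":
    for error in (0.02, 0.20):      # the 2% and 20% figures from above
        for steps in (1, 10, 50):   # hypothetical flow lengths
            p = clean_run_probability(error, steps)
            n = expected_attempts(error, steps)
            print(f"error={error:.0%} steps={steps:>2} "
                  f"clean={p:.4f} expected_attempts={n:.1f}")
```

Under those assumptions, a 2% per-step error leaves roughly an 82% chance of a clean 10-step run, and a 20% per-step error leaves roughly 11%, which is why restarting from scratch gets expensive fast.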
And no, it is not "AGI" either.