GPT-3 is better than most humans at most of the work done by humans
And no one realizes it
Do you know how bad humans are at email routing? Pretty bad. Like, incredibly bad. Humans are not good at tasks that need low-to-medium intelligence and sustained focus over long periods. The problem is that no one notices, because no one bothers to review the *human-labeled data* they use as the rubric for the LLM. When you do review it, it turns out the LLM was actually better.
5-10%, btw. That's the human failure rate for email routing, a classification problem with 5 groups. Corroborated by https://arxiv.org/abs/1808.02636, but take it with a grain of salt because it is IBM. That said, IME, human failure rates are more like 15% when there are n=10 teams to route to. LLMs do much better given a good enough prompt.
(Don't believe me? Try it yourself. Surely your company has an email routing problem. Just pull the emails and have an LLM route them. In fact, look at your actual human success rates while you're at it.)
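If you want to run that check, here is a minimal Python sketch. It assumes you already have three lists of labels per ticket: what the human routed, what the LLM routed, and an adjudicated "gold" label that someone carefully re-reviewed. The team names, the toy data, and the helper names `error_rate` and `disagreements` are all made up for illustration; the LLM call itself is left out on purpose.

```python
def error_rate(predicted, gold):
    """Fraction of tickets whose predicted team differs from the adjudicated one."""
    if len(predicted) != len(gold):
        raise ValueError("label lists must be the same length")
    if not gold:
        raise ValueError("need at least one ticket")
    wrong = sum(p != g for p, g in zip(predicted, gold))
    return wrong / len(gold)


def disagreements(human, llm):
    """Indices where the human and the LLM routed differently.

    These are the tickets worth re-reviewing by hand, instead of
    assuming the human label is automatically the right one.
    """
    return [i for i, (h, m) in enumerate(zip(human, llm)) if h != m]


# Hypothetical toy data: 10 tickets, 5 teams. Not real measurements.
gold = ["billing", "sales", "tech", "tech", "legal",
        "tech", "hr", "hr", "tech", "billing"]
human = ["billing", "sales", "billing", "tech", "legal",
         "tech", "sales", "hr", "tech", "billing"]
llm = ["billing", "sales", "tech", "tech", "legal",
       "tech", "hr", "hr", "sales", "billing"]

human_err = error_rate(human, gold)
llm_err = error_rate(llm, gold)
to_review = disagreements(human, llm)

print(f"human error rate: {human_err:.0%}")
print(f"LLM error rate:   {llm_err:.0%}")
print(f"tickets to re-review: {to_review}")
```

The important part is the `gold` list. If you score the LLM against the raw human labels, every human mistake gets counted as an LLM mistake, which is exactly the trap described above.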
And yet, it's been 4 years. Support agents still exist. What's the bottleneck?
Is the bottleneck software engineers? Faithfully plugging away, putting the bot in between the emails and the teams? Is the bottleneck money? No, and no. We have a ton of both.
The bottleneck is *support agents* not knowing how to code. You want a support agent who is a software engineer, and who is motivated (meaning, gets paid for results).
Imagine, if you will, a support agent. A single support agent, for your entire company. And imagine if this support agent were a very good programmer. And that they had infinite tokens to the world's best LLM.
And imagine if they got paid per support ticket routed appropriately, and per support ticket resolved. How many tickets do you think they would get through? How high do you think their accuracy would be?
That single person could outperform a 1,000-person farm of support agents.
The bottleneck is motivated smart people