I'm an "AI engineer" as I've seen others loftily describe it: meaning a traditional full-stack engineer who works on an AI-first code base. I am a heavy user of the OpenAI and Gemini APIs... But I just don't understand what purpose these "AI frameworks" serve. It's almost to the point where I feel stupid for just not getting it.
You write words. Merge your context in. What possible need is there for any library?
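To make the "just write words" claim concrete, here is a minimal sketch of direct prompting with plain string formatting. The `call_llm` function is a hypothetical stand-in for any real completion API (OpenAI, Gemini, etc.), stubbed out so the example is self-contained:

```python
def build_prompt(question: str, context: str) -> str:
    # "You write words. Merge your context in." -- plain string formatting.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real completion API call (OpenAI, Gemini, ...)."""
    return "stub response"

def answer(question: str, context: str) -> str:
    return call_llm(build_prompt(question, context))
```

That's the whole "framework": a function that formats a string and a function that sends it.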
I saw a relatively new AI company already valued at over $1 bn publish a "new technique" recently, complete with whitepaper and all. I looked at the implementation and... it was just querying four different models, concatenating the results and asking a fifth model to merge the responses together.
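For what it's worth, the pattern described there is a few lines of code to sketch directly. The worker and aggregator callables below are hypothetical stubs standing in for real model calls; only the control flow reflects the described technique:

```python
from typing import Callable

def aggregate(
    prompt: str,
    workers: list[Callable[[str], str]],
    aggregator: Callable[[str], str],
) -> str:
    """Query several models, concatenate the answers, ask one more model to merge."""
    drafts = [w(prompt) for w in workers]
    numbered = "\n\n".join(f"Response {i + 1}:\n{d}" for i, d in enumerate(drafts))
    merge_prompt = (
        f"Original request:\n{prompt}\n\n"
        f"Candidate responses:\n{numbered}\n\n"
        "Synthesize a single best response from the candidates above."
    )
    return aggregator(merge_prompt)

# Stub models for demonstration; in practice each would be a real API call.
workers = [lambda p, i=i: f"draft {i}" for i in range(4)]
aggregator = lambda p: f"merged ({p.count('Response ')} drafts seen)"
print(aggregate("explain X", workers, aggregator))
```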
Makes me wish I had spent some time in college learning how to sell rather than going so hard on the engineering side.
I took a startup-building course at university and I was kinda shocked when I learnt that VCs don't actually have all the knowledge you have, so you have to know how to sell your ideas. Yep, knowing how to sell something to the right person (at the right moment) is just as important as having a good idea.
The novel part of that paper was not the merging itself. It's that the last model can, from those inputs, synthesize a higher-quality overall response than any of the individual models produced alone.
It's a bit surprising that this works, and your take on it is overly reductive, largely because you've misunderstood what it was doing.
I didn't just look at the implementation, I tried it as well. I was hoping it would work, but the aggregating model mostly either failed to properly synthesize a new response (merely dumping out the previous responses as separate functions) or erratically took bits from each without properly gluing them together. In every case, simply picking the best out of the four responses myself yielded better results.
Interesting, I've seen live demos working fairly well. I've also implemented something adjacent to the work and it works quite well too. I'm not sure why you had a hard time with it.
I am, however, working in a domain where verification isn't subjective, so I can tell a good response from a bad one fairly easily. In my experience, things like this also depend quite heavily on the model being used.
1. It’s a curated set of problem/solutions more than prompts
2. It abstracts away tons of the work of dealing with the various APIs
3. It’s crowdsourced and constantly improving
1-3 are useful because most people aren’t AI engineers like you, and they shouldn’t have to be to get the benefits of AI.
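Point 2 (abstracting the various APIs) can be sketched as a thin shim of your own. The provider classes here are hypothetical stubs, not any real SDK; in practice each `complete` would wrap one vendor's actual client call:

```python
from abc import ABC, abstractmethod

class ChatModel(ABC):
    """Common interface so calling code never touches provider-specific APIs."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class StubOpenAI(ChatModel):
    # Hypothetical: would wrap the OpenAI SDK's chat-completion call.
    def complete(self, prompt: str) -> str:
        return "openai-style answer"

class StubGemini(ChatModel):
    # Hypothetical: would wrap the Gemini SDK's content-generation call.
    def complete(self, prompt: str) -> str:
        return "gemini-style answer"

def ask(model: ChatModel, prompt: str) -> str:
    # Calling code is provider-agnostic; swapping vendors is one constructor change.
    return model.complete(prompt)
```

Whether that shim needs to be a third-party framework rather than twenty lines in your own repo is, of course, the whole debate in this thread.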
Basically the answer comes down to the technical term “ensembles of agents”. All these frameworks dance around it in different ways, but ultimately everyone’s trying to recreate this basic AI structure because it’s known to work well. I mean, they even implemented it on a sub-human level with GPT’s mixture of experts!
If you haven't had the chance yet, I highly recommend skimming Russell and Norvig's textbook on symbolic AI. Really good stuff that is only made more useful by the advent of intuitive algorithms.
Haha. Well put. I think the idea is that you keep yourself busy learning new APIs, reading ever-changing docs, and hanging out in Discord channels, so the framework creators can point to the size of their flock of developers and make a big exit. They'll also make sure to get the product managers all starry-eyed about the framework, conditioning them into believing that a certain kind of AI application must mean that framework. Now the product manager can just use the name of the framework (which they have seen in a 10s video on X subtitled with a row of exploding-head emojis) to describe what they want. And if you, as a developer, can prove that you belong to the chosen tribe, you will be in demand. (Thinking of it, they should sell certifications.)
So how dare you build with Python and LLMs directly. How dare you say it’s just string manipulation.
You write words. Merge your context in. What possible need is there for any library?