This paper perfectly articulates the problem I spent the last year solving. The shift from "hallucination" to "fidelity decay" is the correct mental model for agent stability.
I built an open source framework called SAFi that implements the "Fidelity Meter" concept mentioned in section 4. It treats the LLM as a stochastic component in a control loop. It calculates a rolling "Alignment State" (using an Exponential Moving Average) and measures "Drift" as the vector distance from that state.
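To make the EMA and drift idea concrete, here is a minimal sketch. It is not SAFi's actual code. It assumes each response is scored as a vector of per-value scores, that the "vector distance" is cosine distance, and that the smoothing factor is called `beta`; the function names `ema_update` and `cosine_drift` are hypothetical.

```python
import math

def ema_update(state, observation, beta=0.9):
    """Blend a new observation into the rolling state (exponential moving average).

    beta is an assumed smoothing factor: higher values make the state change more slowly.
    """
    return [beta * s + (1.0 - beta) * o for s, o in zip(state, observation)]

def cosine_drift(state, observation):
    """Drift as cosine distance: 0.0 means same direction, 1.0 means orthogonal."""
    dot = sum(s * o for s, o in zip(state, observation))
    norm = math.sqrt(sum(s * s for s in state)) * math.sqrt(sum(o * o for o in observation))
    if norm == 0.0:
        return 0.0  # cosine distance is undefined for zero vectors; treated as no drift (assumption)
    return 1.0 - dot / norm

# Hypothetical per-turn score vectors; measure drift first, then fold the turn into the state.
state = [1.0, 0.0]
for scores in ([0.9, 0.1], [0.2, 0.8]):
    drift = cosine_drift(state, scores)
    state = ema_update(state, scores)
    print(f"drift={drift:.3f} state={state}")
```

Measuring drift before updating the state keeps the current turn from diluting its own drift score.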
The paper discusses "Ground Erosion" where the model loses its hierarchy of values. In my system, the "Spirit" module detects this erosion and injects negative feedback to steer the agent back to the baseline. I recently red-teamed this against 845 adversarial attacks and it maintained fidelity 99.6% of the time.
It is cool to see the theoretical framework catching up to what is necessary in engineering practice.
Two weeks ago on Saturday morning I was driving my son to his soccer game through Medford, MA, when I saw Richard Stallman waiting for the bus at one of the bus stops.
I told my wife, "That's Richard!", stopped right in the middle of the street, and jumped out of the car. I ran toward him, hugged him, and told him to take care of himself and be strong. He didn't say a word.
I feel bad for the man. He looked a bit frail, and his mind seemed elsewhere.
I can't believe history is being rewritten right in front of my eyes: the man who started it all is being forgotten. "Open Source" is just a catchy marketing phrase for "free software".
"Another misunderstanding of “open source” is the idea that it means “not using the GNU GPL.” This tends to accompany another misunderstanding that “free software” means “GPL-covered software.” These are both mistaken, since the GNU GPL qualifies as an open source license and most of the open source licenses qualify as free software licenses. There are many free software licenses aside from the GNU GPL."
The Open Source Initiative sees the GPL as Open Source, and the Free Software Foundation sees the ISC license (the OpenBSD license) as Free Software. Neither group's definition depends on whether a license is copyleft or permissive.
The main difference between Free Software and Open Source is which aspect of the software each one emphasizes. Free Software focuses more on freedom, while Open Source focuses more on the development method. Each still values the other aspect, just not as strongly.
Repo link: https://github.com/jnamaya/SAFi