This is from like 20 years ago, but I remember Debian Testing as the one where updates broke the system most frequently, or maybe where breakage went the longest without fixes: Stable was stable, Sid / unstable was what most Debian developers were using... and Testing was the weird thing in between, neither a release nor something developers tested and fixed "live".
But who actually tests Testing? If it's not the Debian developers themselves, fixes could take a while. I seem to recall Testing breaking because of package version combinations that never existed in Sid, and so were never tested there.
What changed?