This is not the most developed form of this idea that I've seen, but it does contain a good justification section. I don't think I've ever seen someone trying to justify this before. (I'd challenge the idea that touchscreen users liked Windows 8: afaik, things were most confusing for them, unless they were already used to the much-more-sensible Windows Phone interface. Lots of important stuff in Windows 8 was hidden behind right-click, and Metro did not make it obvious that press-and-hold was doing anything until the context menu popped up. Don't get me started on Charms: https://devblogs.microsoft.com/oldnewthing/20180828-00/?p=99... tells you all you need to know about the extent to which they were actually grounded in UI research.)
For a system like this, you can't just have objects: you need some kind of interface abstraction. One example from current OSs is the webview: you want to be able to choose which of LibreWolf, Vivaldi and Servo provides the webview component. But you also don't want to be tied to one interface design (e.g. this is what is meant by "rich text", now and forevermore), since that constrains the art of the possible. If you want to preserve backwards-compatibility, this means you need to allow interface transformers / adapters provided externally ("third-party") to the components they allow to communicate.
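To make the adapter idea concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the interface names ("RichText/v1" and so on), the `register_adapter` / `adapt` helpers, and the data shapes. The only point is that adapters live in a registry outside the components, and that a chain of them can be found by searching that registry.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: adapters are supplied by third parties,
# not by the components on either side of the conversion.
Adapter = Callable[[object], object]
_adapters: Dict[Tuple[str, str], Adapter] = {}


def register_adapter(src: str, dst: str, fn: Adapter) -> None:
    """Record that `fn` converts a value of interface `src` to interface `dst`."""
    _adapters[(src, dst)] = fn


def find_chain(src: str, dst: str) -> List[Adapter]:
    """Breadth-first search for the shortest adapter chain from src to dst."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        iface, chain = queue.popleft()
        if iface == dst:
            return chain
        for (a, b), fn in _adapters.items():
            if a == iface and b not in seen:
                seen.add(b)
                queue.append((b, chain + [fn]))
    raise LookupError(f"no adapter path from {src} to {dst}")


def adapt(value: object, src: str, dst: str) -> object:
    """Naively apply each adapter in the chain, one after another."""
    for fn in find_chain(src, dst):
        value = fn(value)
    return value


# Made-up interface versions: v1 has one bold flag, v2 has styled runs.
register_adapter(
    "RichText/v1", "RichText/v2",
    lambda d: {"runs": [(d["text"], {"bold"} if d["bold"] else set())]},
)
register_adapter(
    "RichText/v2", "PlainText",
    lambda d: "".join(text for text, _style in d["runs"]),
)

plain = adapt({"text": "hello", "bold": True}, "RichText/v1", "PlainText")
```

Neither "component" here knows about the other's representation; a new RichText/v3 would only need one more registered adapter to stay reachable.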
Treating applications as monoliths isn't ideal, either: most applications are actually toolsuites. A word processor has multiple operations which can be performed on a document: some of these are tightly-linked to the document representation (e.g. formatting), but others are loosely-coupled (e.g. spellcheck). We can break these operations out as separate objects by constructing an interface for the document representation they expect: this would provide a kind of mutable view (called a "lens", in academic literature; known as "getters and setters" to most programmers), allowing GIMP plug-ins to see a GIMPDrawable while exposing a Krita Document to a Krita plug-in. (Or ideally something more specific than "Krita Document", but Krita's documentation is awful.) (These would, of course, be very complicated translation layers to write, so it might make more sense to do things the other way around to begin with: produce a simpler interface, and expose the resulting tools in both Krita and GIMP.)
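For anyone who hasn't met lenses, a minimal sketch in Python follows. The document shape, the `key` / `index` helpers and the `invert` "plug-in" are all hypothetical; nothing here corresponds to the real GIMP or Krita APIs. I've also written the functional variant (the setter returns a new document rather than mutating in place), which is an assumption on my part, not a requirement of the idea.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    get: Callable[[Any], Any]       # whole -> part
    put: Callable[[Any, Any], Any]  # (whole, new_part) -> new whole

    def then(self, inner: "Lens") -> "Lens":
        """Compose: focus with `self` first, then with `inner` inside that."""
        return Lens(
            get=lambda whole: inner.get(self.get(whole)),
            put=lambda whole, part: self.put(
                whole, inner.put(self.get(whole), part)
            ),
        )


def key(name: str) -> Lens:
    """Lens onto one entry of a dict."""
    return Lens(get=lambda d: d[name], put=lambda d, v: {**d, name: v})


def index(i: int) -> Lens:
    """Lens onto one element of a list."""
    return Lens(
        get=lambda xs: xs[i],
        put=lambda xs, v: xs[:i] + [v] + xs[i + 1:],
    )


# A made-up host document, and the view a pixel-only tool expects of it.
doc = {"title": "sketch", "layers": [{"name": "bg", "pixels": [0, 10, 255]}]}
first_layer_pixels = key("layers").then(index(0)).then(key("pixels"))


def invert(pixels):
    """A 'plug-in' that knows about pixel lists and nothing else."""
    return [255 - p for p in pixels]


new_doc = first_layer_pixels.put(doc, invert(first_layer_pixels.get(doc)))
```

The tool never sees layers or titles; the lens is the whole translation layer. Swapping the host document format means writing a different lens, not a different tool.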
In principle, documents can get arbitrarily complex. Microsoft's OLE architecture was a good start, but it was still "composition of monoliths": you couldn't run spell-check on an OLE document and all its child documents. Perhaps a solution for this lies in ontology logs, though for pragmatic reasons you'd want a way to select the best translation from a given set of almost-commuting paths. (The current-day analogue for this would be the Paste Special interface: I'm sure everyone has a story about all of the options being lossy in different ways, and having to manually combine them to get the result you want. This is an inevitable failure mode of this kind of ad-hoc interoperability, and one we'd need to plan around.)
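One crude way to "select the best translation" is to record what each conversion step throws away, then pick the path that loses the least of what this particular document actually uses. A sketch in Python, with the caveat that the format names and loss sets below are invented, and that real lossiness is rarely this easy to enumerate:

```python
from typing import Dict, FrozenSet, Iterator, List, Optional, Set, Tuple

# Hypothetical conversion graph: (src, dst) -> features dropped by that step.
EDGES: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("Rich", "Markup"): frozenset({"comments"}),
    ("Rich", "Plain"): frozenset({"bold", "tables", "comments"}),
    ("Rich", "Light"): frozenset({"bold"}),
    ("Markup", "Light"): frozenset({"tables"}),
    ("Plain", "Light"): frozenset(),
}


def simple_paths(
    src: str, dst: str, path: Optional[List[str]] = None
) -> Iterator[List[str]]:
    """Yield every cycle-free path from src to dst (fine for small graphs)."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for (a, b) in EDGES:
        if a == src and b not in path:
            yield from simple_paths(b, dst, path)


def lost_along(path: List[str]) -> Set[str]:
    """Union of the features dropped by every step on the path."""
    lost: Set[str] = set()
    for a, b in zip(path, path[1:]):
        lost |= EDGES[(a, b)]
    return lost


def best_path(src: str, dst: str, used: Set[str]) -> List[str]:
    """Prefer the path losing the fewest *used* features, then the shortest."""
    candidates = list(simple_paths(src, dst))
    if not candidates:
        raise LookupError(f"no conversion path from {src} to {dst}")
    return min(candidates, key=lambda p: (len(lost_along(p) & used), len(p)))


bold_only = best_path("Rich", "Light", {"bold"})
tables_and_comments = best_path("Rich", "Light", {"tables", "comments"})
```

Here a document that only uses bold goes via "Markup", while one with tables and comments takes the direct route: the same pair of formats, but a different best path per document, which is exactly what Paste Special makes you work out by hand.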
For describing interfaces, we want to further decouple what it is from what it looks like. If I update Dillo, I want all right-click context menu entries from the new version to appear, but I still want the overall style to remain the same. There are multiple approaches, including CSS and monkeypatching (and I've written about others: https://news.ycombinator.com/item?id=28172874), but I think we at least need a declarative interface language / software interface renderer distinction. Our interface language should describe the semantics of the interface, mapping to simple calls into the (stateful) object providing our user interface (sitting on top of the underlying API, to provide the necessary decoupling between the conceptual API and the UI-specific implementation details). The semantics should at least support a mapping from WAI-ARIA, but ideally should support all the common UI paradigms in some way – in such a fashion that it is not too hard to convert a tabbed pane into a single region with section headings (by slapping another translation layer on top, or otherwise).
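As a sketch of the language / renderer split, here's the tabbed-pane example in Python. The role names are borrowed loosely from WAI-ARIA, but the tree shape is simplified and hypothetical (real ARIA puts tabs and tabpanels side by side), and the `action` strings stand in for calls into whatever stateful object backs the UI.

```python
# A declarative description: semantics only, no layout or styling.
ui = {
    "role": "tablist",
    "children": [
        {"role": "tabpanel", "label": "General", "children": [
            {"role": "checkbox", "label": "Check spelling",
             "action": "toggle_spellcheck"},
        ]},
        {"role": "tabpanel", "label": "Advanced", "children": [
            {"role": "button", "label": "Reset", "action": "reset_settings"},
        ]},
    ],
}


def tabs_to_sections(node):
    """Translation layer: rewrite each tablist as a region with headings."""
    node = dict(node)  # shallow copy, so the input tree is left untouched
    children = [tabs_to_sections(c) for c in node.get("children", [])]
    if node["role"] == "tablist":
        node["role"] = "region"
        flat = []
        for panel in children:
            flat.append({"role": "heading", "label": panel["label"]})
            flat.extend(panel.get("children", []))
        children = flat
    if children:
        node["children"] = children
    return node


def render_text(node, out=None):
    """One possible renderer: plain text lines. Others could target a GUI."""
    out = [] if out is None else out
    role, label = node["role"], node.get("label", "")
    if role == "heading":
        out.append(f"== {label} ==")
    elif role == "checkbox":
        out.append(f"[ ] {label}")
    elif role == "button":
        out.append(f"<{label}>")
    for child in node.get("children", []):
        render_text(child, out)
    return out


lines = render_text(tabs_to_sections(ui))
```

The transformation never touches the renderer, and the renderer never needs to know tabs existed; either half can be replaced without the other noticing.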
Then, there should be interface-editing interfaces, which will be relatively simple to produce once all the underlying work has been done. The interface-editing interface will, naturally, let you draw on backgrounds, spell-check your labels, change fonts… using the same tools and toolbars as you use in any other program – or a toolbar you've cobbled together yourself, by grabbing bits from existing applications.
---
Since translation can get quite involved in this scheme (e.g. if you're trying to use an Image Editor v1 pencil on an Image Editor v43 canvas, there might be 18 different changes to pixel buffer representation in the pile of compatibility layers), this system would benefit from being able to recompile components as-needed, to keep the system fast. We'd want a compiler with excellent support for the as-if rule, and languages high-level enough to make that easy. We'd also want to make sensible decisions about what to compile: it might make sense to specialise the Image Editor v1 pencil to use the Image Editor v43 interface, or it might make sense to compile the Image Editor v1 – Image Editor v43 compatibility chain into a single translation layer, or it might make sense to use a more generic Raster Canvas interface instead. This decision-making could take into account how the software is actually used, or we could make it the responsibility of distro maintainers – or even both, akin to Debian's popularity contest.
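A toy illustration of collapsing a compatibility chain into a single layer, in Python. It assumes every layer is a per-pixel affine map (scale, then offset), which real translation layers mostly won't be; the three layers below are invented. The point is only that the fused layer is observably identical to the chain, which is what the as-if rule would let a compiler exploit.

```python
from dataclasses import dataclass
from fractions import Fraction
from functools import reduce


@dataclass(frozen=True)
class Affine:
    """A per-pixel translation layer of the form x -> scale * x + offset."""
    scale: Fraction
    offset: Fraction

    def __call__(self, x):
        return self.scale * x + self.offset

    def then(self, nxt: "Affine") -> "Affine":
        # nxt(self(x)) = nxt.scale * (self.scale * x + self.offset) + nxt.offset
        return Affine(
            nxt.scale * self.scale,
            nxt.scale * self.offset + nxt.offset,
        )


# Hypothetical chain of representation changes between two editor versions.
chain = [
    Affine(Fraction(257), Fraction(0)),       # 8-bit -> 16-bit
    Affine(Fraction(1, 65535), Fraction(0)),  # 16-bit -> unit interval
    Affine(Fraction(-1), Fraction(1)),        # flip: 0 becomes 1, 1 becomes 0
]


def naive(x):
    """Apply every layer in turn: one pass over the data per layer."""
    for layer in chain:
        x = layer(x)
    return x


# "Recompile" the whole chain into one layer: a single pass over the data.
fused = reduce(Affine.then, chain)
```

Fractions are used so the equivalence is exact; with floats you'd have to decide how much rounding drift counts as "as if".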
Recompilation tasks should be off-loaded to a queue, to give the user as much control as they want (e.g. they might not want to run a 30-minute max-out-the-processor compilation job while on battery, or an organisation might want to handle it centrally on their build servers). Since modular systems with sensible interfaces tend to be more secure (there are fewer places for vulnerabilities to hide, since modules are only as tightly-integrated as their interfaces support), we wouldn't expect to need as many (or as large) security updates, but the same queueing principles would apply to them.
This would only become a problem after a few years, though, so the MVP need not include any recompilation functionality: naïvely chaining interfaces is Good Enough™.
https://mmcthrow-musings.blogspot.com/2020/04/a-proposal-for...