
> And finally, arbitrary JavaScript execution will be questioned, as it should've been a long time ago.

I run uMatrix and manually whitelist 3rd-party scripts, I've blocked web features like WebGL and Canvas behind prompts, and I regularly disable JS on websites that I think don't need it (news sites in particular). At the moment, if you're really worried about tracking, it is a very good idea to think about blocking most Javascript and culling down the language features your browser supports.

However, I want to stress the phrase "at the moment." The problem is that pervasive tracking and device fingerprinting will go wherever the app functionality is. Currently, even with all of the many, many problems on the web, it's still better to use apps like Facebook via a website than it is to download them as native apps to your phone. The web has better fingerprinting resistance than most native platforms.

Again, that doesn't mean the web is great. It just means we need to think about where Google Maps is going to end up if it's not on the web. What I'm getting at is that blocking scripting isn't a long-term solution to fingerprinting in general. We could get rid of Javascript on the web; it might even be a good idea. But we would still need to solve the same problem with native apps.

Forget the web for a second: what it comes down to is that we need a way to run a Turing-complete language that is incapable of fingerprinting the host environment. That language doesn't have to be Javascript, and it doesn't have to be on the web, but it does have to be something, somewhere.

It's a really hard problem -- vulnerabilities like Spectre and Meltdown have made it worse. Now we're asking questions like, "is it actually reasonable for applications to have access to high-resolution timers?" People look at the web and say, "oh, this is a web problem." It's an everything problem. If we really want to get rid of pervasive tracking, we now have to think about how high-resolution timers are going to work on desktops.
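
A minimal sketch (not any browser's actual implementation) of the kind of timer mitigation this implies: clamp the clock that untrusted code can read to a coarse grid, so it can no longer resolve the tiny timing differences that Spectre-style cache attacks depend on.

    // Hypothetical sketch: a coarsened clock for untrusted code.
    const RESOLUTION_MS = 1; // illustrative bucket size; real browsers choose their own

    function coarseNow(): number {
      // Round the real high-resolution timestamp down to the nearest bucket.
      // (Real mitigations also add jitter so repeated sampling can't recover
      // the lost precision; that detail is omitted here.)
      return Math.floor(performance.now() / RESOLUTION_MS) * RESOLUTION_MS;
    }

    // An attacker timing a memory access now sees 0 ms or 1 ms, never the
    // nanosecond-scale difference between a cache hit and a cache miss.
    const t0 = coarseNow();
    // ... operation the attacker wants to time ...
    console.log(`observed duration: ${coarseNow() - t0} ms`);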

What the web promised was a VM where anybody (technical or not) could run almost anything, without validating that the code was safe, and the VM would just protect them. We're finding holes in this particular implementation, but I still eventually want that VM as promised.



Thank you for your well-written comment.

I don't want to deprive you of the VM as promised, but that seems very hard to implement, and I don't have time to help.

I just want to solve the much easier problem of giving people some way to use the internet to read documents without getting tracked [1].

By "document" I mean a page of text, images, links to other documents and maybe some other easy-to-implement non-privacy-compromising things.

The web is how almost all documents are made available on the public internet. Most document authors don't even consider or imagine any other way to do it. And the web is a privacy nightmare. That is the problem I'd like to solve.

I felt the need to write this because past discussions on this site of the problem I want to solve have gotten derailed into a discussion of how finally to achieve the vision of "a VM where anybody (technical or not) could run almost anything", which, like I said, strikes me as a much harder nut to crack.

[1]: and without the need for anything as demanding of the user's time or the user's technical skills as you described when you wrote, "I run uMatrix and manually whitelist 3rd-party scripts, I've blocked web features like WebGL and Canvas behind prompts".


We already have all of the technology we would need to build a document-only web now. The biggest unresolved problems are asset caching[0] and IP addresses[1]. But for the most part, nobody would even have to build a new browser; they could just distribute a custom build of Firefox that turned off Javascript and a few other features.

It would be very fast, reasonably private, probably a lot nicer to use (at least where documents are concerned), and nobody would use it. I don't necessarily disagree with your goal -- it seems very reasonable to want a document distribution platform that isn't encumbered with JS. But given that news sites can already speed up their pages dramatically by removing JS, and they don't, why would they support this new browser or platform? And without them supporting it, why would users move to it?

We've seen this play out with AMP. AMP is fundamentally flawed, but it did get one thing right: that news sites only care about search engine placement, and they do not care about user experience past that point.

I would (very cautiously) suggest that it might actually be easier to implement a VM that safely runs arbitrary code, than it would be to convince publishers to move to a user-friendly platform that only distributed documents. The only success I've seen in getting publishers to abandon scripting is via platforms like Facebook and Medium -- maybe that's replicable with a new browser or distribution layer on top of the web[2]? I dunno; I think that also might just be a really, really hard problem.

I'd be happy to be proven wrong; I would gladly start using a ubiquitous, script-free platform for document distribution.

---

[0]: We may just need to bite the bullet and get rid of asset caching, or at least give cached assets a very short expiration (<1 day) -- see the sketch after these notes. I guess making caches domain-specific could help too.

[1]: IP addresses are a huge issue, and I don't think the tech community talks about them enough. Tor is not really scaling. The best solution for ordinary consumers is a VPN, and VPNs have a lot of pretty obvious problems.

[2]: My guess would be, someone would need to make publishing much easier than building a website (i.e., Medium), maybe by adopting the DAT protocol and just offering free hosting for everyone. That runs into the same IP address problems (DAT and IPFS are not privacy friendly), but there's progress being made in that area. Or someone would need to find a way to get users to abandon the web en masse and move to the new platform.
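
As a hedged illustration of footnote [0]'s "very short expiration" idea, a server could simply cap the cache lifetime on every asset it serves. The Express setup, directory name, port, and one-hour figure below are illustrative assumptions, not something the comment specifies.

    import express from "express";

    const app = express();

    // serve-static's maxAge option controls the Cache-Control header it emits,
    // so cached copies of these assets expire well within the <1 day ceiling.
    app.use(express.static("public", { maxAge: "1h" }));

    app.listen(8080);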


>it might actually be easier to implement a VM that safely runs arbitrary code, than it would be to convince publishers to move to a user-friendly platform that only distributed documents.

Would the VM that safely runs arbitrary code render existing web pages or would it be necessary to persuade publishers to adopt it?


The current strategy being pursued by Apple and Mozilla is -- yes. They're hoping to make Javascript into that VM without breaking the majority of web pages.

It is yet to be seen whether that's feasible. Apple and Mozilla are certainly making a lot of progress, but Javascript is very old, and was designed in an era where the attacks were much less sophisticated.

The most promising progress (I think) is in building tracking protection that is undetectable. For example, you can put a permission prompt in front of someone's location and block off the API if the user clicked "no", but you could also just lie about their location, which means you'd still be compatible with most existing pages, and publishers wouldn't be able to strong-arm users into turning off the setting. This is how Firefox currently handles high-resolution timers. You can request them, they'll just lie to you sometimes.
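
A rough sketch of that "lie, don't block" idea for something like geolocation (the wrapper and the fuzzing radius are invented for illustration, not how any particular browser implements it):

    // Hypothetical sketch: instead of rejecting a geolocation request (which the
    // page can detect), hand back a plausible but degraded position.
    type Position = { latitude: number; longitude: number; accuracy: number };

    function fuzzPosition(real: Position, userOptedOut: boolean): Position {
      if (!userOptedOut) return real;
      // Snap to a coarse grid (~10 km) so the page still gets "a location",
      // just not one precise enough to track an individual.
      const grid = 0.1; // degrees, roughly 10 km
      return {
        latitude: Math.round(real.latitude / grid) * grid,
        longitude: Math.round(real.longitude / grid) * grid,
        accuracy: 10_000, // report low confidence rather than an error
      };
    }
    // The page's code path looks identical either way, so publishers can't
    // detect the setting and strong-arm users into turning it off.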

Again, it's yet to be seen whether that kind of stuff will work.

The other hope is that WASM may be good enough on its own to encourage wide adoption -- being able to use (almost) any language to compile to the web is very, very attractive, so WASM might overcome Javascript's network effects (no pun intended) and replace it as the dominant language on the web.

If that happens, WASM might be an opportunity to rethink web permissions. Might.
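
Part of why WASM invites that rethink is its import model: a module can only call the host functions you explicitly hand it, which is effectively a capability-based permission system. A minimal sketch (the export name "main" and the import shape are assumptions for illustration):

    // A guest module gets exactly the capabilities passed in -- no ambient
    // access to the DOM, network, clocks, or storage unless added here.
    async function runSandboxed(moduleBytes: BufferSource): Promise<void> {
      const imports = {
        env: {
          // The single host capability this guest receives.
          log: (value: number) => console.log("guest says:", value),
        },
      };
      const { instance } = await WebAssembly.instantiate(moduleBytes, imports);
      (instance.exports.main as () => void)();
    }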

If neither of those approaches work, then anything is fair game. At that point, we might as well try to make a document-only web, or migrate everyone to a new platform. I think that will be very difficult though.


Thanks for your reply to my other question.

I don't understand what problem(s) asset caching is currently causing. (Although I am a programmer, I'm not a web developer.)

Is the asset caching done by the browser a major part of the problem you refer to?

Is the caching done by content delivery networks a major part of the problem?

Are there other major parts of the problem with asset caching?


The asset-caching problems I worry about come from two sources: the browser itself, and 3rd-party content-delivery networks.

On the browser side, by serving a unique set of resources to each user, I can identify them on future visits. Roughly speaking, the attack works by inserting a combination of unique asset URLs and repeated asset URLs, and then logging on the server which of those URLs the browser tries to fetch. By responding to fetch requests with 404s or bad assets, the server can avoid having your browser overwrite the unique combination of URLs it has cached, which allows for more persistent tracking.
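
A hedged sketch of that attack as described (the endpoints and 16-bit ID scheme are invented for illustration): on a first visit the server embeds only the asset URLs whose bit is set in a random ID; on a later visit it references all of them and reads the ID back from which URLs the browser does not re-request, because they are already cached.

    import express from "express";
    import crypto from "crypto";

    const app = express();

    // First visit: reference only the "pixel" URLs whose bit is 1, so the
    // browser fetches and caches exactly that pattern.
    app.get("/", (_req, res) => {
      const id = crypto.randomBytes(2).readUInt16BE(0); // 16-bit identifier
      const pixels = Array.from({ length: 16 }, (_, bit) =>
        (id >> bit) & 1 ? `<img src="/px/${bit}">` : ""
      ).join("");
      res.send(`<html><body>${pixels}</body></html>`);
    });

    // Return visit: reference all 16 URLs. Only the uncached ones produce
    // requests, so the pattern of hits that reaches the server reveals the ID
    // with no cookies involved. (A real attack would answer these read-back
    // requests with 404s, as described above, to keep the cached bits intact.)
    app.get("/readback", (_req, res) => {
      const pixels = Array.from({ length: 16 }, (_, bit) => `<img src="/px/${bit}">`).join("");
      res.send(`<html><body>${pixels}</body></html>`);
    });

    app.get("/px/:bit", (req, res) => {
      console.log(`uncached bit requested: ${req.params.bit}`);
      res.setHeader("Cache-Control", "public, max-age=31536000, immutable");
      res.send("x");
    });

    app.listen(8080);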

These caches also work across domains -- that means that if "foo" requests asset "googlepixel/my_id", and "bar" also requests asset "googlepixel/my_id", both will get served from the same cache. This means that caches can be used to track browsers across multiple websites.

On the content-delivery side, the cached images are likely unique to the website you're visiting, and by downloading them, you're letting the 3rd party know that your IP address visited the page they're associated with. In a centralized web this is a smaller concern, except to the extent that it allows Cloudflare/Google to learn a lot more about which pages your IP is visiting.

In a distributed web like IPFS this is a bigger problem, because now you're connecting to completely random people online to get your assets downloaded. Even worse, you're asking for those images/documents not by going to one specific server and saying, "hey, server X said you had asset Y", but by announcing to the entire network, "hey, I want asset Y, does anyone have it?"


> What the web promised was a VM where anybody (technical or not) could run almost anything

I have to agree with hollerith here and say this is not at all what was promised. Instead, what was promised, IMHO, was linked, passive pages with an expectation of UI transparency about when network accesses occur (in response to clicking on links or submitting forms), and this notion is even honored in the HTML 5 spec to some degree. A universal VM/runtime is desired by developers who want to sell services rather than software, or don't want to bother with deployment procedures in app stores, or want portable code across platforms, or for other plausible reasons, but it is a non-priority next to the web's original purpose. OSs are far superior platforms for general-purpose apps, and turning browsers into platforms only helps the (few) browser vendors left while trampling by design on security, privacy, simplicity, and power efficiency.


Sounds good -- but, if we move all of the web apps to be native apps, then I want a native environment that can safely run untrusted code.

Currently, none exist that I'm aware of. Phones aren't doing well on that front, and we haven't finished moving to Wayland yet, so that mess still exists. X11 is heaven for anyone who wants to fingerprint a device. And we still have to figure out whether or not we're going to allow high-resolution timers or raw access to the GPU, which is itself a pretty big fingerprinting target.

On mobile phones, the closest thing I have to a good adblocker is AFWall+, which doesn't work on iOS, and only blocks via the kernel's iptables firewall, which isn't good enough to make me feel safe running apps like Facebook or Twitter. And most mainstream Linux distros (with a few exceptions like Qubes OS) are not shipping with the kind of process isolation that's necessary to guard against malware.

I guess macOS is making some progress in this area at least? But for the most part, none of our computing environments were designed to run untrusted code -- Linux in particular was primarily designed to protect you against other users. The prevailing advice was, "just don't download malware", which doesn't reflect how people use computers today.

I want to stress -- there could be a solution to this. We could make a user-friendly native platform that replaced the web. But I don't think anyone has made one yet.

I want to advocate that it's a good idea for us to solve that problem (or at least think about it) before we get rid of Javascript. I don't care what happens on the web, except that the web is currently the most user-friendly, widely-used VM that we have. I see a lot of people suggesting that we burn that down, but I'm not sure they've really thought about what's going to happen afterwards.


>the web is currently the most user-friendly, widely-used VM [where anybody (technical or not) could run almost anything] that we have. I see a lot of people suggesting that we burn that down, but I'm not sure they've really thought about what's going to happen afterwards.

Could you say more about this? Previously, I asked you for technical information, but here I'm after your aspirations and maybe your values. What is so great about a state of affairs in which the average consumer can decide where on the internet to go today and at each stop (e.g., web page) along the way, code written by the owner of the web page is sent to the consumer's computer and is transparently run without the user's having to install anything?

My guess is that you dream of using the internet to create compelling experiences that move many (millions?) of people, and you consider documents consisting of text, images and links to other documents woefully inadequate for that purpose, but let's hear from you.

(BTW, I don't care about using the internet to consume compelling or moving experiences -- or, more precisely, ordinary text documents, images, audio and video are the only types of compelling/moving experiences that I use the internet to consume, and I have no need or desire for more than that.)


> What is so great about a state of affairs in which the average consumer can decide where on the internet to go today and at each stop... [code] is transparently run

Someone might as well ask what's so great about general purpose computers, or Open Source. The web is a way to share documents, but even from its origin it was also a way to distribute software packages. There are a couple of things that also make it a reasonably decent software runtime, but more on that later.

My goal is that I want to make it easier for ordinary people to share software and to share software modifications -- that means fewer gatekeepers (ie, app stores), less complicated publishing (software should be as portable as possible), and less complicated installation. The removal of those barriers means that software is inherently less trustworthy -- I want ordinary people to be able to share code, but I also don't trust ordinary people that much.

On Linux, our thought process around software has been that distro packagers will read source code and hand-pick which packages are safe. Users can bypass their package managers, but for the most part shouldn't, unless they feel OK reading the source code and evaluating whether the author is trustworthy. This doesn't really scale (see Android), it requires a ton of volunteer work, it makes developing and distributing software much harder, and it puts burdens on end users that are unrealistic.

If we want a world where anyone can write software and anyone can run it, we have to make arbitrary code safer. It's never going to be 100% safe, but a user should feel comfortable downloading and installing an arbitrary app. When I say that currently the web is the best VM, this is what I'm referring to.

Across almost every axis, it is currently safer to visit a random website than it is to download a random app to your phone or desktop computer. And when I talk to people about hardening phone security, they're all caught up on moderation and approval processes, which are actively the wrong direction to go if you think of computers as general-purpose, democratizing devices.

From this point of view, it's less that the web should be a software runtime, and more that making software accessible requires us to have a good software runtime, and currently the web is better than the alternatives. It's pragmatic -- all of the other software runtimes are either less secure (Android/Windows), or less accessible (Qubes OS, actual VMs).

----

> My guess is that you dream of using the internet to create compelling experiences... and you consider documents consisting of text, images and links to other documents woefully inadequate for that purpose

I do want to be able to create compelling experiences and weird stuff, and I think there's an inherent value to having even flawed platforms that enable that. But, let's ignore weird canvas experiments and games, since not everyone cares about them. When we talk about traditional, normal software, my position is the opposite -- that document layout tools are adequate for most software.

Let's ignore the web and just talk about what a good general application framework would look like. Maybe about 60-70% of the software I run today could be using a terminal interface. Pure text is good enough for a large portion of application interfaces, and terminals are usually nicer to use than GUIs.

Most other applications I run natively are just documents, and they'd be better if their interfaces were HTML/CSS. Chat apps, text/database editors, git clients, file navigators, calendars, music players: these are not fundamentally complicated interfaces. The only applications I have installed natively that aren't just interactive documents are fringe-cases: games, image editors, Blender. There's a subset of programmers that get wrapped up in having pixel-level control over how their applications look, and I couldn't care less about how they want their applications to look -- all of their interfaces are just text arranged into tables with maybe a few SVGs on the side. They're documents that I can click on.

HTML and CSS have real problems, and we might want to fix a few of them. But they're already pretty good at laying out documents -- arguably better than most other interface tools that we have. And once you start thinking of applications as interactive documents, a lot of design decisions in HTML/CSS make a lot more sense. For example, if HTML is a language that you use to build a display layer, then it's dumb that there aren't more significant 2-way data-binding tools. But if HTML is a display layer, then it's obvious why we wouldn't want to have a lot of 2-way data-bindings -- they're hard for users to consume.

Where scripting is concerned, we have two options for this theoretical platform: we can run logic locally, or we can run it on a server. A lot of FOSS developers advocate for serverside logic, and I don't understand that, because I think that SaaS is (often) just another form of DRM that takes control away from users. I'd like to move more logic off of servers -- some of the biggest weaknesses of the web come from the fact that everything is so impermanent; you can't pin libraries, you can't run an older version of a website, you can't easily move data around. SaaS makes the majority of those problems worse. If a calculation can be done locally it is often better for the user to avoid the server entirely and bundle everything clientside.

None of this touches on the network layer or user extensions, which could also be long conversations in and of themselves. And again, I want to stress this theoretical application runtime could be anywhere; we could have a document-only web and do applications someplace else. But I don't (usually) see people proposing anything like that when they talk about getting rid of Javascript -- usually their vision ends up being either, "fewer people should write software, and we'll just use the existing native model" or "everything should be SaaS."

I don't like either of those visions. I think most native platforms are just as bad as the web today (worse if you're thinking about security), and I think widespread SaaS is bad for users. Again, this is pragmatic -- it's not that the web is great, or that it doesn't have fundamental problems, it's that the web currently exists and is available to most people, and I don't think any of the native alternatives are comparable. If someone showed me something better, I'd abandon the web in a heartbeat.


> What the web promised was a VM where anybody (technical or not) could run almost anything

The web promised a place where we could read structured text documents with the occasional embedded piece of media.

If it was known that it would be an app platform like today, decisions would have been made differently.


Such a VM is way too complicated, and has other problems. (Also, it is difficult to sufficiently control the features available to JavaScript programs (not only HTML-related stuff, but also Date and endianness), or to replace specific scripts.)

But if you just want a Turing-complete language that is incapable of fingerprinting the host environment, there is TeX, which might qualify if a bug it has were fixed and its file I/O were restricted further.

Also, for internet stuff there are many other programs and protocols that could be used, e.g. SMTP, NNTP, Gopher, Telnet/SSH, etc. (In many ways they will even work better: no need to deal with complex user interfaces that do not work properly and do not even reflect the user's intentions.)

There are also Glulx and other VMs, and I have also thought of designing a VM for this purpose.


>We could get rid of Javascript on the web, it might even be a good idea.

We would first need to get 95% of the useful Javascript functionality into the browser itself. I would be happy if I could have Rails's Stimulus and Turbolinks functionality without Javascript.


>I've blocked web features like WebGL and Canvas behind prompts

What extension does this? There are plenty that disable it outright, but is there one that shows a prompt?


Firefox settings. Fingerprint resistance (privacy.resistFingerprinting) will auto-reject canvas reads and show a relatively non-intrusive icon in the address bar that lets you re-enable them for that website[0].

Since WebGL requires you to use canvas, this should also block that attack, although I'm currently going a step further and disabling WebGL entirely (webgl.disabled), since I've seen a few sites (Panopticlick, for example) get around the prompt specifically with WebGL, and I don't know how they're doing it.
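
For reference, both preferences mentioned here can be persisted in a user.js file in the Firefox profile directory instead of toggling them in about:config (the pref names are the real ones under discussion; the file is just the standard mechanism for setting them):

    // user.js in the Firefox profile directory
    user_pref("privacy.resistFingerprinting", true); // fingerprint resistance, incl. the canvas prompt
    user_pref("webgl.disabled", true);               // disable WebGL outright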

Firefox's fingerprint resistance efforts are showing a lot of promise, although you will have to put up with some quirks (like learning to read UTC time).

[0]: https://support.mozilla.org/en-US/kb/firefox-protection-agai...


I know there's a prompt for canvas, but it doesn't seem to work for webgl. On a fresh install with fingerprint resistance enabled, webgl fingerprinting appears to work fine[1].

[1] https://browserleaks.com/webgl



