There are many ways to sandbox JavaScript, both serverside and browser-side. Are...

spankalee · on July 7, 2024

Salesforce does this with a combination of web components, with a patched-up ShadowRoot so that code with a reference to the shadow root can't walk into the rest of the document, and a secure evaluator function related to SES (Secure EcmaScript) to limit the globals the untrusted script has access to.

The secure evaluator is wild. I think this is the heart of it: https://github.com/Agoric/realms-shim/blob/v1.1.0/src/evalua...

There's also an idea for isolated web components to solve this in the platform: https://github.com/WICG/webcomponents/issues/1002

m1el · on July 7, 2024

Salesforce sandboxing is too easy to escape. Last time I needed to implement some feature for Salesforce, I encountered 4 different escapes. It was also a horrible dev experience.

spankalee · on July 8, 2024

I would love to hear more about that. I'm looking into their approach for a plug-in system myself.

cxr · on July 7, 2024

You can also check out the discussion for Figma's earlier work on their plugin system, which is what inspired jitl (above) to create quickjs-emscripten. Previously:

How to build a plugin system on the web and also sleep well at night. <https://news.ycombinator.com/item?id=20770105> 2019 August 22. 89 comments.

cxr · on July 7, 2024

The closest thing I know of is Allen Wirfs-Brock's jsmirrors prototype, but he never got to speccing out anything for DOM (and never really intended to as far as I know). Just capabilities for JS-the-programming-system.

You could look at jsmirrors for inspiration and take a crack at some sort of "dommirrors" yourself, but it's a big undertaking. (There's a roundabout way to go about using jsmirrors as-is to kind of achieve what you want, but it's not ergonomic.)

That being said, giving access to the DOM, even mediated/simulated, is almost certainly not what you really want. Figure out what you _actually_ want to allow the other side to do, and then just give them a capability that lets them do it. (For example, to let them add a button somewhere, you might think you need to give them an anchor point (parent element) where they can insert it and let them use `document.createElement` to make the DOM node that they're going to put there. But you don't actually want that—for them to have access to `document.createElement`, etc. What you want is for them to have an add-button capability. So give them that—go implement `addButton`.)
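To make the `addButton` idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `createPluginApi`, the `toolbar` array (standing in for whatever first-party UI the host really renders), and the shape of the stored entries. It only shows the shape of the API; the isolation itself still has to come from whatever sandbox runs the plugin.

```javascript
// Hypothetical host-side factory. `toolbar` stands in for the host's own
// UI layer; a real host would render a first-party button component here.
function createPluginApi(pluginId, toolbar) {
  let nextId = 1;

  return Object.freeze({
    // The only thing the plugin can do: ask the host to add a button.
    addButton(label, onClick) {
      if (typeof label !== "string" || label.length === 0) {
        throw new TypeError("label must be a non-empty string");
      }
      if (typeof onClick !== "function") {
        throw new TypeError("onClick must be a function");
      }
      const id = nextId++;
      // The host owns the button. The plugin gets back an opaque id,
      // never a DOM node or a reference to `document`.
      toolbar.push({ id, pluginId, label, onClick });
      return id;
    },
  });
}

const toolbar = [];
const api = createPluginApi("demo-plugin", toolbar);
const buttonId = api.addButton("Say hi", () => "clicked");
```

The plugin side never sees `document.createElement` or an anchor element, so there is nothing to walk outward from.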

Moar: <https://news.ycombinator.com/item?id=30703531#30706060>

PS: don't listen to anyone who comes along and says that this is what CSP is for. It's not. (If we're being accurate, even for what CSP really is for, it's poorly designed, user-hostile junk and should never have been implemented or extended as far as it has been.) It's dangerous to depend on it.

jitl · on July 8, 2024

Big plus-one to this:

> That being said, giving access to the DOM, even mediated/simulated, is almost certainly not what you really want. Figure out what you _actually_ want to allow the other side to do, and then just give them a capability that lets them do it. (For example, to let them add a button somewhere, you might think you need to give them an anchor point (parent element) where they can insert it and let them use `document.createElement` to make the DOM node that they're going to put there. But you don't actually want that—for them to have access to `document.createElement`, etc. What you want is for them to have an add-button capability. So give them that—go implement `addButton`.)

For a plugin model, I’d suggest providing a high-level UI library to add panels & actions rendered by first-party UI components in specific areas which communicate with plugin JS running in quickjs. Many plugins that integrate with the 3rd-party’s own service will also want an iframe for embedding 3rd-party content, so you can provide that as well since iframe is sandboxed and the use-case makes sense. But scripting/plugin code shouldn’t be reading or writing to the DOM, it should be making requests and responding to requests from the host application APIs synchronously in-process.

That’s the way I think about it anyways.

jitl · on July 8, 2024

The only really safe way to approach this would be to give the 3rd party code an off-domain iframe with the sandbox attributes configured. You can still measure the DOM content size from the parent page to resize the iframe to certain limits to integrate it more seamlessly into your app UI.

Depending on the level of exposure and trust between your users, you’ll need to watch out for impersonation/phishing and clickjacking attempts in the iframe. Ideally you can lock down the frame so it can’t make any web requests at all (which implies no image loading), which means there’s no way to exfiltrate data from the frame if, for example, they convinced the user to enter their password into a fake password form.

The main way to restrict what kinds of resources an iframe can request is via content-security-policy, which you can use to turn off all 3rd party images, scripts, etc.

https://developer.mozilla.org/en-US/docs/Web/API/HTMLIFrameE...

https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

You should also enable these other sandbox attributes and disable access to privacy sensitive DOM APIs like the webcam etc:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/if...

https://developer.mozilla.org/en-US/docs/Web/Security/IFrame...

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Pe...
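A rough sketch of what that lockdown could look like, assuming the plugin document is served from a separate origin you control. The helper names and the example URL are made up for illustration, and the exact policy is an assumption to adapt. It is not a complete defense on its own: a frame can, for example, still try to navigate itself.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Markup the host page embeds. sandbox="allow-scripts" lets the frame run
// scripts but withholds same-origin access, forms, popups and top navigation.
function pluginFrameMarkup(src) {
  return (
    `<iframe src="${escapeAttr(src)}" sandbox="allow-scripts" ` +
    `allow="camera 'none'; microphone 'none'; geolocation 'none'" ` +
    `referrerpolicy="no-referrer"></iframe>`
  );
}

// Response headers the separate plugin origin would send with the framed
// document. default-src 'none' blocks images, fetches, fonts and so on;
// inline script and style are allowed so the plugin itself can run.
function pluginFrameHeaders() {
  return {
    "Content-Security-Policy":
      "default-src 'none'; script-src 'unsafe-inline'; " +
      "style-src 'unsafe-inline'; form-action 'none'",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
  };
}

// Hypothetical plugin origin, for illustration only.
const markup = pluginFrameMarkup("https://plugins.example.net/frame.html?id=42");
const headers = pluginFrameHeaders();
```

`form-action` is listed explicitly because it does not fall back to `default-src`.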

mattigames · on July 8, 2024

The only feasible way would be to add an API where they send you the HTML they want to render, as a string, and you parse it using one of the many libraries for that. You then recreate the DOM from the parsed data, which lets you whitelist the HTML elements and attributes you want to allow. Listening to native DOM events complicates things but is not impossible: you would need something like an API that accepts the name of the event (a string) and the id of the element that should receive it. You would then listen for that event in the real DOM and replicate it inside whatever JS sandbox you are using (where they must have access to the aforementioned API).
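A minimal sketch of the whitelisting step, assuming some HTML parser has already turned the string into a plain tree of `{ tag, attrs, children }` objects with strings for text. That node shape, the `filterTree` name and both allowlists are assumptions for illustration, not any particular library's format.

```javascript
// Assumed allowlists; adjust to whatever the host is willing to render.
const ALLOWED_TAGS = new Set(["div", "span", "p", "b", "i", "ul", "li"]);
const ALLOWED_ATTRS = new Set(["title", "lang", "dir"]);

// Returns a new tree containing only allowed elements and attributes.
// Disallowed elements are dropped together with their whole subtree.
function filterTree(node) {
  if (typeof node === "string") return node; // text node, kept as-is
  if (!node || typeof node.tag !== "string") return null;

  const tag = node.tag.toLowerCase();
  if (!ALLOWED_TAGS.has(tag)) return null;

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const key = name.toLowerCase();
    if (ALLOWED_ATTRS.has(key)) attrs[key] = String(value);
  }

  const children = (node.children || [])
    .map(filterTree)
    .filter((child) => child !== null);

  return { tag, attrs, children };
}

const untrusted = {
  tag: "div",
  attrs: { title: "greeting", onclick: "steal()" },
  children: [
    "Hello ",
    { tag: "script", attrs: {}, children: ["steal()"] },
    { tag: "b", attrs: {}, children: ["world"] },
  ],
};
const safe = filterTree(untrusted);
```

The host would then build real elements from `safe`, using `textContent` for the strings so that nothing from the plugin is ever parsed as markup a second time.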

austin-cheney · on July 8, 2024

Off the top of my head the way I would do this:

* On the back end request the third party code and then associate that code with a hash sequence.

* On the back end dynamically modify the html such that there is a div tag with an id whose value is the hash sequence. Also modify the html such that there is a script tag that requests the third party code from your domain. For tracking purposes you add the hash value to a data attribute on that script tag.

* On the back end modify that third party code such that all instances of document. and window. are replaced by document.getElementById(hash_value). and all query selectors begin with #hash_value.

* You would need to replace .parentNode in the Element prototype with a custom property that checks for and drops any escape from the provided container.

Then send the html document to the browser. If the third party code breaks that is ok. The constraints should be communicated to the third party and it’s up to them to test their own code before sending it to your server. All you care about is that their code does not escape the dynamically provided container. Test this regularly on your side to look for security violations.

Also, this may not work, but it would be fun to experiment with.

chmod775 · on July 8, 2024

In the context of this conversation, which is about running untrusted code, this has about a million holes.

The only way DOM access can become secure is if either browsers add support for sandboxing in such a way, or you have your own sandbox, like OP's, and provide DOM modification APIs within it that go through rigorous validation before you pass anything on to the browser.

Trying to sandbox with find/replace will never work (unless you replace the entire script with an empty string).

austin-cheney · on July 8, 2024

> this has about a million holes.

Kind of.

First of all, the DOM is a global artifact. Browsers do not provide any convention to isolate a section of the DOM tree except for iframes and document fragments and there are limitations to both. iframes are slow and not entirely isolated either. Document fragments are better for security as they are isolated from the document object, but they are designed to be worthless until appended to the document. Document fragments were only created to build multiple DOM trees in parallel without waiting to access the document object, because there is only one document object. These performance concerns are largely irrelevant now because the DOM, even with the extreme slowness of query selector strings, is insanely fast.

Secondly, this is about running untrusted code. In applications outside the browser this is super scary. However, in the browser this happens just about everywhere all the time. Any JavaScript code that comes into a page not from a domain you own is untrusted. Go to any page and look at the network tab and it's common just about everywhere. The only security safeguard to this, besides the browser's same-origin policy, is that all security risks are directly transferred to the user in the browser because it's not requested or touching the web server.

With this context in mind talk of security holes is kind of ridiculous to the point of ignorance about how the browser works. As hacky as my suggestion is, it's still far more secure than how every commercial webpage normally operates.

chmod775 · on July 8, 2024

> Kind of.

This is JS code that will write "Hello!" to the body tag, bypassing protections that rely on find/replace: https://gist.githubusercontent.com/laino/8d2676f8fd6fe0de19d...

Another one that uses eval, which may be disabled by CSP on some pages: https://gist.githubusercontent.com/laino/316843234f5da5073bd...

The point is that find/replace will never work with a dynamic language like JS.
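A small self-contained illustration of why, separate from the gists above. `naiveRewrite` and the `"abc123"` container id are made up for this sketch; it is only one of many possible bypasses.

```javascript
// A naive rewrite in the spirit of the find/replace proposal:
// scope every literal `document.` or `window.` to one container.
function naiveRewrite(source, containerId) {
  const scoped = `document.getElementById(${JSON.stringify(containerId)}).`;
  return source.replace(/\b(?:document|window)\./g, () => scoped);
}

// This payload never contains the literal text "document." or "window.",
// so the rewrite has nothing to match and leaves it untouched.
const payload = [
  'const name = ["doc", "ument"].join("");',
  "const doc = globalThis[name];",
  'doc["body"]["textContent"] = "Hello!";',
].join("\n");

const rewritten = naiveRewrite(payload, "abc123");
const untouched = rewritten === payload;
```

Because property names can be computed at runtime, no amount of pattern matching on the source text can enumerate every way to reach the global object.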

> First of all, the DOM is a global artifact. Browsers do not provide any convention to isolate a section of the DOM tree except for iframes and document fragments and there are limitations to both.

There's also shadow DOM which allows you to encapsulate things on a page.

> However, in the browser this happens just about everywhere all the time.

And this whole exercise is about making that secure, which is what OP's sandbox can do for you. Code in such sandboxes can only interact with whatever you explicitly give them access to, which is called whitelisting, whereas your approach is trivially circumventable blacklisting (the code can do anything you don't explicitly prevent).

If all you expose to code in a sandbox is your own DOM modification utilities that perform rigorous validation, then absent any security holes, that's all the code running in the sandbox will be able to do. If you decide all it gets to do is create up to 10 div tags with a custom text and color, then that's it.
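As a sketch of that "10 div tags" example: the function name, the 200-character text limit and the `#rrggbb` color rule are all assumptions, and `sink` stands in for the host code that would actually create the elements.

```javascript
// Hypothetical validated utility exposed to sandboxed code.
function createDivApi(sink, maxDivs = 10) {
  const COLOR = /^#[0-9a-f]{6}$/i; // only plain hex colors are accepted
  let created = 0;

  return Object.freeze({
    addDiv(text, color) {
      if (created >= maxDivs) {
        throw new RangeError(`limit of ${maxDivs} divs reached`);
      }
      if (typeof text !== "string" || text.length > 200) {
        throw new TypeError("text must be a string of at most 200 characters");
      }
      if (typeof color !== "string" || !COLOR.test(color)) {
        throw new TypeError("color must look like #rrggbb");
      }
      created += 1;
      // A real host would set textContent (never innerHTML) and style.color.
      sink.push({ text, color: color.toLowerCase() });
    },
  });
}

const rendered = [];
const divApi = createDivApi(rendered);
divApi.addDiv("Hello from the plugin", "#FF8800");
```

Anything that fails validation is rejected before it gets near the browser, which is the difference from trying to clean up arbitrary code after the fact.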

dawnerd · on July 8, 2024

That seems like it’s pretty fragile though. I’d be really worried about all the weird edge cases

jitl · on July 8, 2024

You can easily escape this by traversing a DOM node’s parent pointer.

austin-cheney · on July 8, 2024

Not at all. So, yes, depending upon the implementation there is a parent pointer tree which binds node hierarchy. This is not really how the DOM works though and not what's exposed via API. Typically the nodes are objects in memory and point to each other via static relational reference. These references are exposed to the API, like: parentNode, nextSibling, childNodes.

Under the hood deep in the binary might the parentNode make use of a parent pointer tree to map between the node instance in memory? Again, that depends upon the implementation and is entirely irrelevant to the executing JavaScript which is isolated from that layer.

jitl · on July 8, 2024

I mean "parent pointer" in a generic sense. There’s more parent direction pointers available in addition to parentNode: parentElement, ownerDocument, offsetParent. There’s just loads and loads of ways to navigate around the DOM. You can fire a custom event and watch as it bubbles out of the “sandbox” with `myEvent.currentTarget`. You could build a new script to eval bit-by-bit and then attach it with a <script> tag (can disable that with CSP though).
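To illustrate with a toy check: the list of links below is deliberately incomplete, and `container` is a plain-object stand-in for an element whose `parentNode` has been blocked, so this runs outside a browser. In a page you would pass a real element instead.

```javascript
// A few of the properties that lead from a node toward the document.
const UPWARD_LINKS = ["parentNode", "parentElement", "offsetParent", "ownerDocument"];

// Returns the names of the upward links on `node` that are still usable.
function reachableLinks(node) {
  return UPWARD_LINKS.filter((key) => node[key] != null);
}

// Stand-in for a container where only parentNode was neutralized.
const fakeDocument = { nodeName: "#document" };
const fakeBody = { nodeName: "BODY", ownerDocument: fakeDocument };
const container = {
  parentNode: null,
  parentElement: fakeBody,
  offsetParent: fakeBody,
  ownerDocument: fakeDocument,
};

const leaks = reachableLinks(container);
```

Blocking one property leaves the other three open, and that is before counting methods like `getRootNode()` or event propagation.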

I wouldn't trust security based on a deny-list approach, especially when it comes to an API surface area as complex as the DOM, where the platform can roll out new APIs before you can update your deny-list.