Hacker News | deian's comments

It is doable, but it's hard to make it fast on all platforms. See the SegmentZero32 description in <https://cseweb.ucsd.edu/~dstefan/pubs/kolosick:2022:isolatio...> for an example prototype.


"Unfortunately, since the entire point of it is to protect against malicious leaks..."

That's actually not the entire point. At least in this paper, we do not claim to address attacks that leverage covert channels. But the attacker-model assumption is weaker (i.e., the attacker is assumed to be more powerful) than the one originally made in the Chrome design (e.g., that only pages are malicious and will try to exploit extensions). And this is important, particularly because the design we end up with will be more secure than the current one. So, at worst, the new system addresses the limitations of the existing system under its attacker model. Then, depending on how far you are willing to hack up the browser, underlying OS, or hardware, you can also try to address the covert-channel leaks.


Rather than reply individually to the messages from this thread, I'm going to try to clarify some things here. I think a lot of good points were brought up both with regards to the "taint" mechanism and timing covert channels.

IFC/tainting mechanism:

The style of IFC we are proposing is not the tainting style used by more typical language-level systems. In particular, we're proposing a mostly coarse-grained system:

- A single label is associated with a context. A context can be the main extension context or, more likely, a lightweight worker the main extension context created. This single label is conceptually the label on everything in scope. As such, when performing some I/O, this label is used by the IFC mechanism to check if the code is allowed to perform the read/write. This coarse-grained approach alleviates the need to keep track of tricky leaks due to control flow: the branch condition has the label of the context, so any control flow within the context is safe. If you want to perform a public computation after a secret computation, you need to structure your program into multiple contexts (e.g., create a new worker wherein you intend to perform the secret computation).

- Labels can also be associated with values. A new constructor Labeled is provided for this task. You can think of it as boxing the value and, by default, only allowing you to inspect the label of the value. You can pass such labeled values around as you see fit (and this is pretty useful in real applications). Importantly, however, when you read the underlying (protected) value the IFC system taints your context---i.e., it raises the context label to let you read the value, but also restricts your code from writing to arbitrary endpoints since doing so may leak something about the newly-read value.

An important requirement of this IFC approach is that there be no unlabeled (implicit) shared state between the different contexts. For example, rand() in context A and B must not share an underlying generator. This kind of isolation can be achieved in practice and is an important detail when considering covert channels.
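To make the coarse-grained discipline above concrete, here is a small TypeScript sketch. The names (`Labeled`, `Context`, `canFlowTo`) and the representation of labels as plain sets of origins are illustrative simplifications, not the actual proposed API:

```typescript
// Labels as sets of origins that "own" the data: the more origins in a
// label, the more restricted the data. (A simplification of COWL-style labels.)
type Label = Set<string>;

// l1 can flow to l2 if l2 is at least as restrictive (a superset of l1).
function canFlowTo(l1: Label, l2: Label): boolean {
  return [...l1].every((o) => l2.has(o));
}

// One label for the whole context (e.g., a worker): checked on every I/O.
class Context {
  label: Label = new Set(); // starts public (empty label)

  taint(l: Label): void {
    // Join: the context label becomes the union of the two labels.
    for (const o of l) this.label.add(o);
  }

  canWriteTo(endpoint: Label): boolean {
    // Allowed only if everything this context has read may flow there.
    return canFlowTo(this.label, endpoint);
  }
}

// A value boxed with its label; reading it taints the reading context.
class Labeled<T> {
  constructor(readonly label: Label, private value: T) {}

  unlabel(ctx: Context): T {
    ctx.taint(this.label); // raise the context label before releasing the value
    return this.value;
  }
}
```

So a context that unlabels gmail.com data can still write back to gmail.com, but loses the ability to write to unlabeled (public) endpoints.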

Timing channels:

It turns out that if you do IFC in this coarse-grained fashion you can prevent leaks due to timing covert channels, if you are willing/able to schedule contexts using a "secure scheduler." In [1] we showed how to eliminate timing attacks of the style described in this thread (internal timing channels) that are inherent to the more typical IFC systems. In [2] we extended the system to deal with timing attacks due to the CPU cache by using an instruction-based scheduler (IBS). IBS and other techniques to deal with timing channels are nicely described in [3]. (I'd be happy to expand here, but this is already a long reply.)

[1] http://www.scs.stanford.edu/~deian/pubs/stefan:2012:addressi...

[2] http://www.scs.stanford.edu/~deian/pubs/stefan:2013:eliminat...

[3] https://ssrg.nicta.com.au/projects/TS/timingchannels.pml
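As a toy illustration of the internal timing channel (my own simplified example, not taken from the papers): two contexts each perform an individually-allowed write to a shared public channel, but a secret influences how long one of them computes first, so under a naive round-robin scheduler the *order* of writes encodes the secret bit. Here generators stand in for threads and the loop stands in for the scheduler:

```typescript
// Toy internal timing channel: each write is fine in isolation, but the
// interleaving on the shared public channel leaks the secret bit.
function internalTimingLeak(secretBit: 0 | 1): string[] {
  const channel: string[] = []; // shared, public

  const threadA = function* () {
    // Secret-dependent amount of work before the public write.
    for (let i = 0; i < (secretBit ? 3 : 0); i++) yield;
    channel.push("A");
  };
  const threadB = function* () {
    yield; // a small, fixed amount of work
    channel.push("B");
  };

  // Naive round-robin scheduler: one step per live thread per round.
  let alive = [threadA(), threadB()];
  while (alive.length > 0) {
    alive = alive.filter((t) => !t.next().done);
  }
  return channel; // ["A","B"] if bit is 0, ["B","A"] if bit is 1
}
```

A "secure scheduler" in the sense of [1] closes exactly this kind of leak by making the observable interleaving independent of secret-dependent timing.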


Right, minimizing attack surface is pretty important. Though the described attack scenario (a form of self-exfiltration attacks [1]) is something we did think about. (The details of the core IFC mechanism are described in the COWL paper [2].) For example, if the extension only needs to read data from gmail.com, it is tainted with a unique origin. (In general, IFC can be used to deal with both confidentiality and integrity.)

[1] http://www.ieee-security.org/TC/W2SP/2012/papers/w2sp12-fina...

[2] http://www.scs.stanford.edu/~deian/pubs/stefan:2014:protecti...


Even worse, they only update if they don't need any additional permissions, which incentivizes developers to ask for more permissions up front.


"I believe that concerns like this are why Apple will introduce the "content blocking" extensions in iOS 9 and OS X 10.11. They enable the most popular types of extension (ad blocking and privacy protection) without letting extension code run in your browser."

Fully agree. We actually described exactly that mechanism in an early version of our paper (declarative APIs), but didn't have enough space to do it for the final version.

"While the tainted data approach sounds interesting, I don't think there's an easy way to guarantee the safeness of arbitrary code executed on your machine. It's possible to sandbox code, but as soon as you allow any communication at all, there's no automated way to prevent data theft."

It turns out, it is possible with information flow control (IFC). The simple idea behind IFC is to protect data by labeling/tagging it and restricting code according to the kinds of labeled data it reads. Once code in an execution context (e.g., iframe or process) reads some labeled data, IFC restricts where it can further communicate. In the simplest form: once you read data that is SECRET, you can't write to any PUBLIC communication channel. (You can, of course, write to a SECRET channel.)
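A minimal sketch of that check in TypeScript (hypothetical names; real systems like COWL use richer labels than a two-point lattice):

```typescript
// Two-point lattice: PUBLIC data may flow to SECRET channels, not vice versa.
type Level = "PUBLIC" | "SECRET";

function canFlowTo(from: Level, to: Level): boolean {
  return from === "PUBLIC" || to === "SECRET";
}

// An execution context (think: iframe or process) tracks what it has read.
class Context {
  private label: Level = "PUBLIC";

  read(dataLabel: Level): void {
    // Taint on read: the context label rises to cover the data.
    if (dataLabel === "SECRET") this.label = "SECRET";
  }

  canWrite(channelLabel: Level): boolean {
    // Once you've read SECRET, PUBLIC channels are off limits.
    return canFlowTo(this.label, channelLabel);
  }
}
```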


That's a tricky question. Some extensions (e.g., HTTPS Everywhere [1]) can improve your privacy on the Web and are arguably written by developers that are as trustworthy as your browser's developers. But, in general, I would be cautious.

[1] https://www.eff.org/https-everywhere


Yep, you are right. If the crypto/label API didn't force a fixed-length blob (which may be hard to do), it would certainly be leaking some information.


I was thinking more like timing side channels (if you can force the encryption at will and it isn't fixed time).

The possible security models where you can send data but it's encrypted are not very appealing. For a single application it may be fine (LastPass, or Chrome syncing with a passphrase), but it's really hard to see how that can be a standard API and remain secure.


Very good points. We proposed a way to deal with DOM manipulation in the paper [1], but Stefan omitted this in the blog post. Specifically, Section 4 of the paper (the "Page access" paragraph) briefly describes this. (Sorry for referring you to the paper, but our wording in the paper is probably better than my attempt to paraphrase here.)

Of course, there are other ways malicious extensions can be used to leak data---pick your favorite covert channel. But the idea was to propose APIs (and mechanisms) that are not overtly leaky. (We are early in the process of actually building this, though.)

[1] https://www.usenix.org/conference/hotos15/workshop-program/p...


"To ensure that extensions cannot leak through the page’s DOM, we argue that extensions should instead write to a shadow-copy of the page DOM—any content loading as a result of modifying the shadow runs with the privilege of the extension and not the page. This ensures that the extension’s changes to the page are isolated from that of the page, while giving the appearance of a single layout"

Could you elaborate more on this? Do you mean that you'll compare the network requests made from the main and the shadow pages? What if the main script is sensitive to the request receive time? Then the shadow DOM may act differently. From a more practical standpoint, having two DOMs for every page will eat even more of my laptop's RAM.


"Do you mean that you'll compare the network requests made from the main and the shadow pages?"

Essentially, yes. Requests from the extension should be treated as if they are of an origin different from the page. (We could potentially piggy-back on existing notions of security principals (e.g., that Firefox has) to avoid huge performance hits.) And if the extension is tainted the kinds of requests will be restricted according to the taint (as in COWL [1], likely using CSP for the underlying enforcement).
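For a rough sense of how a taint could be enforced via CSP (a hypothetical mapping, not the actual COWL implementation), a context's taint, viewed as a set of origins, might translate to a connect-src directive restricting outgoing requests:

```typescript
// Hypothetical: compile a taint (set of origins) down to a CSP
// connect-src directive that limits where tainted code may send data.
function labelToCSP(taint: Set<string>): string {
  // Untainted code is unrestricted.
  if (taint.size === 0) return "connect-src *";
  // Tainted code may only talk to the origins it is tainted with.
  const sources = [...taint].map((o) => `https://${o}`).sort().join(" ");
  return `connect-src ${sources}`;
}
```

So an extension tainted with gmail.com would get `connect-src https://gmail.com`, preventing exfiltration elsewhere (modulo covert channels, as discussed above).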

"What if the main script is sensitive to the request receive time? Then the shadow DOM may act differently."

If by main script you mean a script on the page, then there should be no real difference.

"From more practical standpoint, having two DOMs for every page will eat even more of my laptop's RAM."

I hope this won't be so bad down the line (assuming we'd be able to leverage some underlying shadow DOM infrastructure that performs relatively well).

[1] http://cowl.ws


Sorry for the slow reply.

I think part of what you are proposing Chrome already implements and calls "isolated worlds". Chrome extensions don't operate directly on the page's DOM, they have an isolated version of it (https://developer.chrome.com/extensions/content_scripts).

So in principle, we already have the context from which we can decide which network requests to allow or block (this is already used today to allow cross-origin XHR from content scripts).

However, it is super gnarly to implement your idea in practice because:

1. There has to be a connection traced from every network request back to the JS context which ultimately caused the request (e.g., by mutating the DOM). This is doable, it's just a lot of work.

2. There can't be any way to execute JavaScript in the page - even with the page's principal. Such mechanisms exist today by design because developers desire them.

3. Even if you do 1 and 2, there are still channels such as hyperlinks. The extension could add a hyperlink and get the user to click on it. I suppose you could try and tie that back to the script context that created or modified the hyperlink.

4. Even if you do 1-3, if you can induce the page script (or any other extension, or the browser, or the user) to request a URL of your design, you win.

Sigh. Still seems fun as a research project to see how close you could get.


Yep, isolated worlds is definitely what we want, and part of the inspiration for the particular (DOM modification) feature we proposed.

I think CSP helps with 1 & 2, unless I'm missing something? (Our labels map down to CSP policies pretty naturally.)

Points 3-4, and phishing in general, are definitely a concern. Unfortunately, I'm not sure that a great solution that does not get in the way exists, but we'll see how close we can get :)


CSP does help, but it is document-specific, not js-context specific. At least in Chrome, tracing the js context that was responsible for some network request would be significantly more difficult to implement.

