I have a Pixel 8 running GrapheneOS and it does most of the above:
- build quality is slightly below my iPhone 15 Pro's, but comparable
- doesn't funnel data to Google unless you explicitly let it; you can sidestep a lot of the Google Play stack by restricting permissions and using the Aurora Store
- is backed by, well, Google, plus the Graphene team porting over security updates. They have a fairly good track record by now.
- can use any of the N backup services that already exist, unlike the iPhone's silly restriction on background apps
- performs indistinguishably from my iPhone 15 Pro in day-to-day use. If anything, biometrics are way faster on the Pixel, since iOS 26 made Face ID slow as molasses for me
- the default camera is pretty good (it uses all the Pixel's fancy processing hardware), but if it's not good enough you can just install the stock Google Camera, which works fine; you can revoke its network access if you don't want it snooping. Photos are neck and neck with the iPhone, but the iPhone is way better at video; no Android phone has cracked that yet IMO.
- Yeah, but no other brand can give you an AppleCare experience. The best option is phone insurance and just getting a new Pixel if that happens. Graphene can do full device backups via Seedvault, and restoring isn't that much more of a pain than restoring an iPhone backup; granted, it's jankier, but it's not impossible. The other issue is that Pixels aren't sold in a lot of countries. Apple really has the edge here, but I'd take that risk over the UX shenanigans they pull nowadays with their latest updates (god, is it awful)
LTT did a video on GrapheneOS recently[0]. The conclusion was basically that it’s a trade-off between privacy and convenience: it requires more tinkering, and some things don’t “just work”. While I haven’t used GrapheneOS, it doesn’t seem like something a non-technical user would have the patience for, unless they were into the idea of picking up a new hobby of managing their phone’s OS.
The way it works is that the user registers / imports the MCP (Model Context Protocol) servers they would like to use. All the tools of those servers are imported, and then the firewall uses structured LLM calls to decide which types of action each tool performs, among:
- read private data (e.g. read a local file or read your emails)
- perform an activity on your behalf (e.g. send an email or update a calendar invite)
- read public data (e.g. search the web)
The idea is that if all 3 types of tool call are performed in a single context session, the LLM is vulnerable to jailbreak attacks (e.g. it reads personal data -> reads poisoned public data carrying malicious instructions -> gets tricked into posting the personal data).
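To make the model concrete, here's a minimal sketch of the action types and the trifecta check in Python; the names (`ActionType`, `is_trifecta`) are illustrative, not our actual internals:

```python
from enum import Flag, auto

class ActionType(Flag):
    """The three capabilities the firewall classifies tools into."""
    NONE = 0
    READS_PRIVATE = auto()  # e.g. read a local file or your emails
    WRITES = auto()         # e.g. send an email or update a calendar invite
    READS_PUBLIC = auto()   # e.g. search the web

# The full trifecta: all three capabilities raised in one session.
TRIFECTA = ActionType.READS_PRIVATE | ActionType.WRITES | ActionType.READS_PUBLIC

def is_trifecta(raised: ActionType) -> bool:
    """True once a session has raised all three action types."""
    return raised & TRIFECTA == TRIFECTA
```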
Once all the tools are classified, the user can go in and make any adjustments; they are then given the option to set up the gateway as an MCP server in their LLM client of choice. For each LLM session, the gateway keeps track of all tool calls and, in particular, which action types have been raised in the session. If a tool call is attempted that would raise all action types for a session, it gets blocked and the user gets a notification, which sends them to the firewall UI where they can see the offending tool calls and decide to either block the most recent one or add the triggering "set" to an allowlist.
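Roughly, the per-session bookkeeping looks like this (again an illustrative sketch building on the `ActionType` flags above, not our real code):

```python
class GatewaySession:
    """Tracks which action types a session has raised and blocks
    the call that would complete the trifecta."""

    def __init__(self, allowlist: set[frozenset[str]] | None = None):
        self.raised = ActionType.NONE
        self.calls: list[str] = []
        self.allowlist = allowlist or set()  # user-approved triggering sets

    def on_tool_call(self, tool_name: str, action: ActionType) -> bool:
        """Return True if the call may proceed, False if it is blocked."""
        would_raise = self.raised | action
        triggering_set = frozenset(self.calls + [tool_name])
        if is_trifecta(would_raise) and triggering_set not in self.allowlist:
            return False  # block and notify; the user may allowlist this set
        self.raised = would_raise
        self.calls.append(tool_name)
        return True
```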
Next steps are transitioning from the product's web UI to a desktop app with a much cleaner and more streamlined UI. We're still working on improving the UX, but the backend is solid and we would really like to get some more feedback on it.
That is a fair concern, and while that would happen often in some workflows, there are others that rarely export data or rarely read public data, where you can manually approve each use case. Still, we are very interested in seeing how people use MCPs so we can improve the UX, which is why we're publishing this release. If users report that they get too many false positives, we can always increase the granularity of the trifecta categories (say, "exports data" could split into exporting publicly vs. privately, or "reads public data" could have different tiers, etc.)
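For instance, a finer split could look something like this (purely illustrative; nothing like this is implemented yet):

```python
from enum import Flag, auto

class FineActionType(Flag):
    """One possible finer-grained split of the trifecta categories."""
    READS_PRIVATE = auto()
    WRITES_PRIVATE = auto()          # e.g. updates your own calendar
    WRITES_PUBLIC = auto()           # e.g. posts to a public site
    READS_PUBLIC_TRUSTED = auto()    # e.g. a vetted internal API
    READS_PUBLIC_UNTRUSTED = auto()  # e.g. arbitrary web search results
```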
It is based on what the MCP server reports to us. As with most current LLM clients we assume that the user has checked the MCP servers they're using for authenticity.
Really good questions, let's look at them one by one:
1. We are assuming that the user has done their due diligence verifying the authenticity of the MCP server, in the same way they need to verify it when adding an MCP server to Claude Code or VS Code. The gateway protects against an attacker exploiting already-installed standard MCP servers, not against malicious servers.
2. That's a very good question - while it is indeed non-deterministic, we have not seen a single case of it not showing the message. Sometimes the message gets mangled but it seems like most current LLMs take the MCP output quite seriously since that is their source of truth about the real world. Also, while the message could in theory not be shown, the offending tool call will still be blocked so the worst case is that the user is simply confused.
3. Currently we follow the trifecta very literally, as in every tool is classified into a subset of {reads private data, writes on behalf of user, reads publicly modifiable data}. We have an LLM classify each tool at MCP server load time, and we cache these results based on whatever data the MCP server sends us. If there are any issues with the classification, you can go into the gateway dashboard and modify it however you like. We are planning on making improvements to the classification down the line, but we think it is currently solid enough, and we would like to get it into users' hands for UX feedback before we add extra functionality.
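The caching is keyed on the server's own tool listing, so a change in what the server reports triggers re-classification. A hedged sketch (the function and payload shape are assumptions, not our actual API):

```python
import hashlib
import json

def classification_cache_key(server_name: str, tools: list[dict]) -> str:
    """Key cached classifications on what the MCP server reports
    (tool names, descriptions, schemas); if any of it changes,
    the tools get re-classified."""
    payload = json.dumps({"server": server_name, "tools": tools}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()
```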
Those are really good points, and we do have some plans for them, mainly on the first topic. What we're envisioning in terms of UX for our gateway is that when you set it up it's very defensive, but whenever it detects a trifecta you can mark it as a false positive. Over time the gateway will be trained to be exactly as permissive as the user wishes, with only the rare false positive. You can already do that with the gateway today (you get a web notification when the gateway detects a trifecta, and if you click into it you get taken to a menu to approve/deny it if it occurs in the future). Granted, this can make the gateway overly permissive, but we do have plans for improving the granularity of these rules.
Regarding the second point, that is a very interesting topic that we hadn't thought about. It would seem that our approach would work for this use case too, though. Currently we're defending against the LLM being gullible, but gullible and actively malicious are not too different from the gateway's perspective. It's definitely on our radar now, thanks for bringing it up!
That's a good question! We do use an LLM to categorise the MCP tools, but that happens at "add" or "configure" time, not when they are called. As such, we don't actively run an LLM while the gateway is up; all the rules are already set, and requests are blocked based on those hard-set rules. Plus, at this point we don't actually look at the data being passed around, so even if we change the trifecta rules, there's no way for any LLM to be poisoned by a malicious actor feeding it bad data.
It is possible that a malicious MCP could poison the LLM's ability to classify its tools, but then your threat model includes adding malicious MCPs, which would be a problem for any MCP client. We are considering adding a repository of vetted MCPs (or possibly using one of the existing ones) but, as it is, we rely on the user to make sure that their MCPs are legitimate.
Malicious servers are a separate threat, I think. If the server is lying about what its tools do, an LLM can't catch that without seeing the server's source code, which would defeat the purpose of MCP.
At least in its current state, we just use an LLM to categorise each individual tool. We don't look at the data itself, although we have some ideas for improving things, as it is currently very "over-defensive". For example, if you have the filesystem MCP and a web search MCP, open-edison will block if you perform a filesystem read, a web search, and then a filesystem write. Still, if you rarely perform writes, open-edison would still be useful for tracking things. The UX is such that after an initial block you can make an exception so the same flow is allowed the next time it occurs.
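In terms of the session sketches above, that example plays out like this (tool names hypothetical):

```python
session = GatewaySession()
session.on_tool_call("fs_read", ActionType.READS_PRIVATE)    # allowed
session.on_tool_call("web_search", ActionType.READS_PUBLIC)  # allowed
session.on_tool_call("fs_write", ActionType.WRITES)          # blocked: this
# write would complete the trifecta. Once the user approves the flow in the
# dashboard (i.e. the set is allowlisted), the same sequence passes next run.
```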
Thanks for the follow-up. I can see the value in looking at chained read -> search -> write (or similar) patterns to alert the user. Awareness of tool activity is definitely helpful.
It is possible to configure it like that: when a trifecta is detected, the gateway can wait for confirmation before allowing the last MCP call to proceed. The issue is that MCP clients are still in their early stages, and some of them don't like waiting a long time for a response, acting in weird or inconvenient ways if something times out (some of them sensibly disable the entire server if a single tool times out, which in our case disables the entire gateway and therefore all MCP tools). As it is, it's much better to default to returning a block message and emitting a web notification from the gateway dashboard, so the user can approve the use case and then rerun their previous prompt.
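Concretely, the block comes back as an ordinary tool result rather than a hung call; something shaped like this (a sketch; the exact wire format the gateway emits may differ):

```python
def blocked_result(tool_name: str) -> dict:
    """An MCP-style error result returned instead of letting the call hang."""
    return {
        "isError": True,
        "content": [{
            "type": "text",
            "text": (
                f"Call to '{tool_name}' was blocked: it would complete the "
                "read-private/read-public/write trifecta for this session. "
                "Approve this flow in the gateway dashboard and retry."
            ),
        }],
    }
```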
There's a lot of trash-talking of x86 here, but I feel like it's not x86 or Intel/AMD that is the problem, for the simple reason that Chromebooks exist. If you've ever used a Chromebook with the Linux VM turned on, you know they can run basically everything you can run on Linux, don't get hot unless you actually run something demanding, have very good idle power usage, and actually sleep properly. All this while running on the same i5 that would overheat and fail to sleep under Windows or a default Linux distro. This means it is very much possible for an x86 machine to get similar runtimes and heat output to an M-series Mac; you just need two things:
- A properly written firmware. All Chromebooks are required to use Coreboot, and Google sets very strict requirements on the quality of the implementation. Windows laptops don't have that and very often ship with annoying firmware problems, even in the best cases like ThinkPads and Frameworks. Even on samples from those good brands, just running the s0ix self-test has personally shown me glaring failures in basic firmware capabilities.
- A properly tuned kernel and OS. ChromeOS is Gentoo under the hood, and every core service is, afaik, recompiled for the CPU architecture with as many optimisations enabled as possible. I'm pretty sure the kernel is also tweaked for battery life and desktop usage. Default installations of popular distros can't easily match this because they ship pre-compiled and need to support devices other than ultrabooks.
Unfortunately, it seems like Google is abandoning the project altogether, seeing as they're dropping Steam support and merging ChromeOS into Android. I wish they'd instead make another Pixelbook, work with Adobe and other professional software companies to make their software run under Proton + Wine, and we'd have a real competitor to the M1 MacBook Air, which nothing outside of Apple has matched yet.