More

topsycatt · 2025-12-15T01:54:37 1765763677

I work on that team! We're growing extremely rapidly (I joined 4 months ago from GDM and I'm at the 50th percentile of tenure), if you're interested in E2E model training reach out at https://microsoft.ai/careers/

topsycatt · 2025-04-01T14:30:11 1743517811

That's a very good point. Let me speak with some folks and see what I can do.

topsycatt · 2025-03-28T19:31:17 1743190277

Thanks!

topsycatt · 2025-03-28T18:34:16 1743186856

That's the system I work on! Please feel free to ask any questions. All opinions are my own and do not represent those of my employer.

ryao · 2025-03-28T22:25:11 1743200711

I imagine you need to make and destroy sandboxed environments quite often. How fast does your code create a sandboxed environment?

Do you make the environments on demand or do you make them preemptively so that one is ready to go the moment that it is needed?

If you make them on demand, have you tested ZFS snapshots to see if it can be done even faster using zfs clone?

topsycatt · 2025-04-01T14:26:49 1743517609

Sorry for the delay in replying!

We actually use gVisor (as stated in the article) and it has a very nifty feature called checkpoint_restore (https://gvisor.dev/docs/user_guide/checkpoint_restore/) which lets us start up sandboxes extremely efficiently. Then the filesystem is just a CoW overlay.

ryao · 2025-04-02T05:11:47 1743570707

Thanks for the response. I had misread the article’s description of gVisor and mistook it as something meant to protect the rest of the system rather than something that handled the filesystem part of the sandbox. It is an interesting tool.

dullcrisp · 2025-03-29T14:43:24 1743259404

What’s ZFS? That doesn’t sound like a Google internal tool I’ve ever heard of.

x-complexity · 2025-03-29T15:08:09 1743260889

https://en.wikipedia.org/wiki/ZFS

It's a filesystem, to put it simply.

2OEH8eoCRo0 · 2025-03-29T14:46:52 1743259612

Oh boy. Get ready for the zealots

blixt · 2025-03-29T10:27:57 1743244077

Seconding this. Also curious if this is done with microkernels (I put Unikraft high on the list of tech I'd use for this kind of problem, or possibly the still-in-beta CodeSandbox SDK – and maybe E2B or Fly but didn't have as good experiences with those).

luke-stanley · 2025-03-29T09:02:34 1743238954

I use ZFS, but isn't the situation the sandbox is in totally different? Why would it be optimal?

ryao · 2025-03-29T17:25:37 1743269137

If you are making sandboxes, you need to put the files in place each time. With ZFS clones, you can keep referencing the same files repeatedly, so the amount of changes to memory needed to create an environment are minimized. Let’s say the sandbox is 1GB and each clone operation does less than 1MB of memory writes. Then you have a >1000x reduction in writing needed to make the environment.

Furthermore, ZFS ARC should treat each read operation of the same files as reading the same thing, while a sandbox made the traditional way would treat the files as unique, since they would be full copies of each other rather than references. ZFS on the other hand should only need to keep a single copy of the files cached for all environments. This reduces memory requirements dramatically. Unfortunately, the driver has double caching on mmap()’ed reads, but the duplication will only be on the actual files accessed and the copies will be from memory rather than disk. A modified driver (e.g. OSv style) would be able to eliminate the double caching for mmap’ed reads, but that is a future enhancement.

In any case, ZFS clones should have clear advantages over the more obvious way of extracting a tarball every time you need to make a new sandbox for a Python execution environment.

o11c · 2025-03-29T17:39:36 1743269976

It's worth noting that if you go down a layer, LVM snapshots are filesystem-independent.

ryao · 2025-03-29T18:06:57 1743271617

You need to preallocate space on LVM2 for storing changes and if it fills, bad things happen. You have write amplification of 4MB per write by default on LVM2, while ZFS just writes what is needed, since LVM2 isn't aware of the filesystem structures. All of the advantages WRT cache are gone if you use LVM2 too. Correct me if I am wrong.

That said, if you really want to use block devices, you could use zvols to get something similar to LVM2 out of ZFS, but it is not as good as using snapshots on ZFS' filesystems. The write amplification would be lower by default (8KB versus 4MB). The page cache would still duplicate data, but the buffer cache duplication should be bypassed if I recall correctly.

RunningDroid · 2025-03-29T16:13:44 1743264824

I believe they were referring to the use of ZFS snapshots for a Copy-on-Write type setup

hnuser123456 · 2025-03-28T18:50:30 1743187830

Is the interactive python sandbox incompatible with thinking models? It seems like I can only get the interactive sandbox by using 2.0 flash, not 2.0 flash thinking or 2.5 pro.

topsycatt · 2025-03-28T18:53:11 1743187991

That's a good question! It's not incompatible, it's just a matter of getting the flow right. I can't comment too much on that process but I'm excited for the possibilities there.

hnuser123456 · 2025-03-28T19:00:32 1743188432

Oh, I see Gemini can run code as part of the thinking process. I suppose the sandbox that happens in was the target of this research, while code editing in Gemini Canvas just has a button to export to Colab for running. The screenshots in the research show a "run" button for generated code in the chat, but I'm not seeing that exact interface.

In any case, I share your excitement.

topsycatt · 2025-03-28T19:14:40 1743189280

Canvas actually has a mix of this sandbox (with a different container) and fully client-side.

The "run" option for generated code was removed due to underutilization, but the sandbox is still used for things like the data analysis workflow and running extensions amongst other things. It's really just a general purpose sandbox for running untrusted code server-side.

hnuser123456 · 2025-03-29T04:24:05 1743222245

Is there a way for you to campaign to return the run button for common queries for code examples? It's probably the most powerful educational tool ever invented, to be able to see how the human language description turns into strange computer code which turns into resulting output. If you guys can get it secure enough, it's a killer feature.

sans_souse · 2025-03-29T08:22:42 1743236562

+1 vote here

sans_souse · 2025-03-29T08:22:12 1743236532

Talk about indirect gas-lighting, I can never find info on deprecated functions like this one, to the point I convinced myself I imagined it. I guess now I know who to ask

TechDebtDevin · 2025-03-29T05:23:13 1743225793

Have you by chance read this paper: https://agent-gen.github.io/

topsycatt · 2025-04-01T14:27:55 1743517675

I have not. I'll take a look, thanks!

wunderwuzzi23 · 2025-03-28T22:29:22 1743200962

That's cool. I did something similar in the early days with Google Bard when data visualization was added, which I believe was when the ability to run code got introduced.

One question I always had was what the user "grte" stands for...

Btw. here the tricks I used back then to scrape the file system:

https://embracethered.com/blog/posts/2024/exploring-google-b...

waych · 2025-03-29T00:59:18 1743209958

The "runtime" is a google internal distribution of libc + binutils that is used for linking binaries within the monolithic repo, "google3".

This decoupling of system libraries from the OS itself is necessary because it otherwise becomes unmanageable to ensure "google3 binaries" remain runnable on both workstations and production servers. Workstations and servers each have their own Linux distributions, and each also needs to change over time.

saagarjha · 2025-03-29T07:56:30 1743234990

Of course, this meant that some tools got stuck on some old glibc from like 2007.

waych · 2025-04-01T00:36:30 1743467790

IIRC Google has a policy whereby all google3 binaries must be rebuilt within a 6-month window. This allows teams to age-out support for old versions of things, including glibc. grte supports having multiple multiple versions of itself installed side-by-side to allow for transition periods ("v5" in the article).

saagarjha · 2025-04-01T09:15:08 1743498908

Sure, I'm talking about things linked against grtev4

flawn · 2025-03-28T23:09:08 1743203348

It says in the article - Google Runtime Environment

jemfinch · 2025-03-28T22:57:32 1743202652

grte is probably "google runtime environment", I would imagine.

fragmede · 2025-03-28T18:53:09 1743187989

Do you think "hacked Gemini and leaked its source code" is an accurate representation of what happened here?

topsycatt · 2025-03-28T18:54:33 1743188073

I'm on the Google side of the equation. I think the title is a bit sensationalized, but that's the author's prerogative.

devdudect · 2025-03-28T19:07:08 1743188828

When are we going to be able to run sandboxed php code?

simonw · 2025-03-28T20:02:45 1743192165

You can run PHP in ChatGPT Code Interpreter today if you upload the right binary (also Deno and Lua and more): https://til.simonwillison.net/llms/code-interpreter-expansio...

topsycatt · 2025-03-28T19:20:55 1743189655

We could, it's just not high up on the priority list. Any particular reason you want php?

alienbaby · 2025-03-28T19:38:19 1743190699

Possibly they are mildly insane

0xbadcafebee · 2025-03-28T22:44:42 1743201882

>75% of the web's server-side code is php. most of that is WordPress, but lots of people customize it, and being able to write your own themes, plugins, etc is a big deal

egeozcan · 2025-03-28T20:16:55 1743193015

Next step is gemini hosting Personal Home Pages.

ipaddr · 2025-03-29T02:36:07 1743215767

Why would you want to run anything else?

koakuma-chan · 2025-03-28T19:16:33 1743189393

> but that's the author's prerogative

You submitted this.

topsycatt · 2025-03-28T19:19:06 1743189546

I submitted this HN link with a title that exactly matches the one on the article, but I didn't write the title on the article. AFAIK HN posts should match the title of the article they link to.

dang · 2025-03-28T19:29:48 1743190188

Actually the rule is designed to let you correct misleading titles:

"Please use the original title, unless it is misleading or linkbait; don't editorialize." - https://news.ycombinator.com/newsguidelines.html

I've done that now (https://news.ycombinator.com/item?id=43509103).

I appreciate your scruples though! Because even though you would have been on the right side of HN's rules to correct a misleading (and/or linkbait) title, the fact that you work for Google would have opened you to the usual gotcha attacks about conflict of interest. This way we avoided all of that, and it's still a good submission and thread!

topsycatt · 2025-03-28T19:38:00 1743190680

Thank you very much dang!

gorlilla · 2025-03-29T12:09:38 1743250178

Can you run the country too?

bitexploder · 2025-03-29T04:29:48 1743222588

Dang, you are cool. :)

koakuma-chan · 2025-03-28T19:29:09 1743190149

> AFAIK HN posts should match the title of the article they link to.

I am not aware of such rule's existence.

Also "should" not "must."

To be clear: I don't have a problem with you submitting this, but the title appears to be completely false.

marcellus23 · 2025-03-28T19:29:08 1743190148

From the HN guidelines:

> Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize.

Arguably this is misleading or clickbait, but safer to err on the side of using the original title.

wil421 · 2025-03-28T19:23:35 1743189815

Even better, OP shared something OP didn’t write but thought it was interesting.

enoughalready · 2025-03-28T19:46:54 1743191214

Have you contemplated running the python code in a virtual environment in the browser?

seydor · 2025-03-28T18:52:54 1743187974

you re the hacker or the google?

topsycatt · 2025-03-28T18:53:26 1743188006

The google

larodi · 2025-03-28T22:57:30 1743202650

"im the google" is definitely a top 3 chart synthpop song by ladytron .)

sans_souse · 2025-03-29T08:24:13 1743236653

Can a Mod please change thread title to I'm The Google. AMA.

onemoresoop · 2025-03-28T21:08:19 1743196099

Question: how does it feel inside google in terms of losing their lunch to OpenAi? Losing here is very loose, I don’t think OpenAI won yet but seems to have made a leap ahead of google in terms of marker share and we know google was sitting on tons of breakthroughs and research. Any panicking or internal discontent at google’s product policies? No need to answer if you’re uncomforable that your employer may hold you responsible for what you write here.

mediaman · 2025-03-28T22:15:35 1743200135

This is an unusual opinion in industry, although common with consumers.

Currently, Google has the most cost effective model (Flash 2) for tons of corporate work (OCR, classifiers, etc).

They just announced likely the most capable model currently in the market with Gemini 2.5.

Their small open source models (Gemma 3) are very good.

It is true that they've struggled to execute on product, but the actual technology is very good and getting substantial adoption in industry. Personally I've moved quite a few workloads to Google from OpenAI and Anthropic.

My main complaint is that they often release impressive models, but gimp them in experimental mode for too long, without fully releasing them (2.5 is currently in this category).

snoman · 2025-03-28T22:31:34 1743201094

How does Flash compare to Nova Lite? The latter looks less expensive. I haven’t really used either (used Nova Pro and it was good)

MyelinatedT · 2025-03-28T21:51:15 1743198675

From my perspective (talking very generally about the mood and environment here), it’s important to remember that Google is a very, very big company with many products and activities outside of AI.

As far as I can see, there is a mix of frustration at the slowness of launching, optimism/excitement that there are some really awesome things cooking, and indifference from a lot of people who think AI/LLMs as a product category are quite overhyped.

fennecbutt · 2025-03-29T03:27:25 1743218845

Idk, I used to want to work for Google but I'm not so sure anymore. They built an awesome landscaper next to my office in London.

But the UX and general functionality of their apps and services has been in steep decline for a long time now, imo. There are thousands of examples of the most basic and obvious mistakes and completely uninspired, sloppy software and service design.

MoonGhost · 2025-03-29T07:31:01 1743233461

> obvious mistakes and completely uninspired, sloppy software and service design.

That's something you can work on to improve.

A few years back I wanted to work for FAANG big company. Now I don't after working for smaller but with 'big' management. There are rats races, dirty tricks. And engineers don't have much control on what and how they are doing. Many things decided by incompetent managers. Architect position is actually a manager's title, no brain or skills required.

Today I rather go to a small company or startup where the results are visible and appreciated.

fennecbutt · 2025-04-01T20:14:22 1743538462

Well exactly. Sure I could try hard to pass some Google interview with silly exercises and be lucky and get selected most likely by some interviewer who isn't one of the devs but works in HR.

But why? When they have so much management now and have just gotten so big that it'd probably be impossible to get anything done.

dieortin · 2025-04-07T14:36:06 1744036566

That’s now how the hiring process works at Google. You seem to be making decisions based off assumptions

fennecbutt · 2025-04-08T18:13:05 1744135985

Well, it seems like they use an intense scoring system that reeks of management involvement and inconsistency (per interviewer).

I mean I'm for sure making some presumptions and plenty of assumptions; we literally evolved to do this. Otherwise we'd shake the cold paw of every shadow in the dark.

dataflow · 2025-03-28T23:35:08 1743204908

> Google is a very, very big company with many products and activities outside of AI.

Profit is what matters though, not number of products. The consumer perception is that Search rakes in the largest profits, so if they lose that, it doesn't matter what else is there. Thoughts?

nikcub · 2025-03-28T21:54:58 1743198898

Nobody serious believes this. OpenAI may be eating up consumer mindshare - but Google are providing some of the most capable, best, cheapest and fastest models for dev integration.

bitexploder · 2025-03-29T04:28:27 1743222507

As the hype dies down, Goliath shakes off the competition. AI models are now a game of inches and those inches cost billions every inch, but it matters in the long run.

lanyard-textile · 2025-03-29T21:41:29 1743284489

I’m honestly shocked to hear anyone defend gemini, respectfully :)

What casts it as most capable?

luke-stanley · 2025-03-29T09:29:06 1743240546

They just released a SOTA model (Gemini 2.5 Pro) that beats all models on most benchmarks, it's a great comeback from the model side but IMO they are less strong on the product side, they pioneered the sticky ecosystem of web app products model, though kinda like the Microsoft Office suite that (originally) had to be downloaded, ironically building on XML HTTP request support the IE5 introduced for Outlook.

Mindwipe · 2025-03-28T18:49:21 1743187761

Does anyone at Google care that you're trying to replace Assistant with this in the next few months and it can't set a timer yet?

(I mean it will tell you it's set a timer but it doesn't talk to the native clock app so nothing ever goes off if you navigate away from the window.)

hnuser123456 · 2025-03-28T18:53:29 1743188009

I doubt the guy working on the code sandbox can do anything about the overall resource allocation towards ensuring all legacy assistant features still work as well as they used to. That being said, I was trying to navigate out of an unexpected construction zone and asked google to navigate me home, and it repeatedly tried to open the map on my watch and lock my phone screen. I had to pull over and use my thumbs to start navigation the old fashioned way.

iury-sza · 2025-03-28T19:06:51 1743188811

I keep reading people complaining about this but I can't understand why. Gemini can 100% set timers and with much more subtle hints than assistant ever could. It just works. I don't get why people say it can't.

It can also play music or turn on my smart lamps, change their colors etc. I can't remember doing any special configuration for it to do that either.

Pixel 9 pro

jdiff · 2025-03-29T00:19:20 1743207560

I certainly can't get it to reliably play music on my Pixel 8. Mostly it summons YT Music, only occasionally do I get my music player, and sometimes I merely get "I'm an LLM, I can't help you with that."

And you used to be able to say "Find my phone" and it would chime and max screen brightness until found. Tried that with Gemini once, and it went on with very detailed instructions on using Google or Apple's Find My Device website (depending on what type of phone I owned), maybe calling it from another device if it's not silenced, or perhaps accepting that my device was lost or stolen if none of the above worked. Did find it during that lengthy attempt at being helpful though.

Another fun example, weather. When Gemini's in control, "What's the weather like tonight?" gets a short ramble about how weather depends on climate, with some examples of what the weather might be like broadly in Canada, Japan, or the United States at night.

Unlike Assistant where you could learn to adapt to its unique phrasing preferences, you just flat out can never reliably predict what Gemini's going to do. In exchange for higher peak performance, the floor dropped out the bottom.

dgunay · 2025-03-28T21:50:04 1743198604

I dislike Google's (mis)management of Assistant as much as the next guy, but this just has not been my experience. I can tell Gemini on my phone to set timers and it works just fine.

ChadNauseam · 2025-03-28T22:00:42 1743199242

I have a rooted pixel with a flashed custom android ROM, which should be a nightmare scenario for gemini, and it can set timers just fine (and the timers show up in the native clock app)

arebop · 2025-03-28T18:57:45 1743188265

The Assistant can't reliably set timers either, though I guess 80% is considerably better than 0. Still, I think it used to be better back before Google caught a glimpse of a different squirrel to chase.

7bit · 2025-03-28T19:06:07 1743188767

It can't do shit, especially in some EU countries, where it can do even less shit.

Setting timers reminders, calendar events. Nothing. If they kill the assistant, I'll go Apple, no matter how much I hate it.

GrayShade · 2025-03-30T10:57:04 1743332224

Just tested, you need to enable "Gemini Apps", but they remember your interactions for 3, 18 or 36 months instead of 3 days.

Mindwipe · 2025-04-02T13:06:21 1743599181

Gemini Apps doesn't offer the ability to talk to the clock app on Samsung devices.

7bit · 2025-03-30T11:00:47 1743332447

Yeah, I disabled that when I tested it. No go for me, but thanks for informing me!

nosrepa · 2025-03-28T23:57:43 1743206263

I just want the assistant voice. I hate the Gemini ones.

whatevertrevor · 2025-03-29T06:05:33 1743228333

I'm with you on that. I prefer a human trying to sound like a robot instead of a robot trying to sound human.

jwlake · 2025-03-29T04:16:59 1743221819

Is there any reason it's not documented?

ed_elliott_asc · 2025-03-29T09:16:33 1743239793

This is why hacker news is so cool

KennyBlanken · 2025-03-28T19:56:59 1743191819

Can you get someone to fix the CSS crap on the website? When I have it open it uses 40-50% of my GPU (normally ~5% in most usage)...and when I try to scroll, the scrolling is jerky mess?

topsycatt · on July 20, 2024

I noticed seemingly all deaths in Manhattan are labeled as having occurred in flushing, a nearby neighborhood. Perhaps an off by one error?

Bencarneiro · on July 21, 2024

NYC is no longer called flushing in the DB <3 thank you for flagging this

topsycatt · on Jan 18, 2024

Oh hey! I ghost-wrote an earlier article that references the same idea (https://blog.google/technology/ai/bard-improved-reasoning-go...). I wonder if they came to it independently or if they read mine and it planted some seeds? Either way, super cool work.

j2kun · on Jan 18, 2024

They have been working on these ideas since like 2021 they said, so I don't think it's that recent.

topsycatt · on Dec 9, 2023

If only it was called GNP...

topsycatt · on Dec 5, 2023

Very cool, but some basic stuff is broken having used the app. For example, I'm unable to create a new conversation since I'm unable to click on a contact after having searched for their name in the "New Chat" flow. Looking forward to using it once this is fixed!

Edit: Seems like I have to double tap instead? Not super easy to use but it works so I'll take it!

topsycatt · on Nov 22, 2023

Hey I work on that! I can only share information that's already public unfortunately, which isn't much beyond what's on that page...

rrrrrrrrrrrryan · on Nov 22, 2023

Any word on the (public) timeline for Gemini?

topsycatt · on March 18, 2022

I'm working on it, but part of the motivation for the question is wanting to find people to get ideas / guidance / friendship from who are more interesting than myself.

Kind of an “If you're the most interesting person in the room, you're in the wrong room.” vibe.

collinthecorgi · on March 22, 2022

Ah, in this case you can find your friends in the places you like. For example you like an outdoor setting, then you should go to the park. The same applies when you come to the reading book club if you're a bookworm (like me) :">