
Here is the link to the blog post, which actually describes what this is: https://github.com/google-research/timesfm?tab=readme-ov-fil...



Wish they gave some numbers for total GPU hours to train this model. It seems comparatively tiny next to LLMs, so I'm interested to know how close this is to something trainable by your average hobbyist/university/small lab.


Edit: it looks like the paper does.

A TPUv5e with 16 tensor cores for 2 days for the 200M-param model.

Claude reckons this is ~60 hours on an 8xA100 rig, so very accessible compared to LLMs for smaller labs.
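For what it's worth, the ~60-hour figure checks out as a back-of-envelope conversion. A minimal sketch, assuming peak bf16 throughput specs (TPU v5e at roughly 197 TFLOPS per chip, A100 at roughly 312 TFLOPS) and similar utilization on both platforms:

```python
# Back-of-envelope: convert "16 TPU v5e tensor cores for 2 days"
# into equivalent hours on an 8xA100 rig.
# Throughput numbers are assumed peak bf16 specs, not measured utilization.

TPU_V5E_TFLOPS = 197   # peak bf16 per TPU v5e chip (assumed spec)
A100_TFLOPS = 312      # peak bf16 per A100 GPU (assumed spec)

tpu_chips, tpu_hours = 16, 48                          # 16 chips for 2 days
tpu_compute = TPU_V5E_TFLOPS * tpu_chips * tpu_hours   # total TFLOP-hours

rig_tflops = A100_TFLOPS * 8                           # 8xA100 aggregate
a100_hours = tpu_compute / rig_tflops

print(f"~{a100_hours:.0f} hours on 8xA100")
```

That lands right around the 60-hour ballpark, with the obvious caveat that real utilization differs between TPU and GPU stacks.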


That takes me to the same content as the submission, a GitHub repo (Chrome on iOS)



And https://arxiv.org/pdf/2310.10688 if you want the full paper.




