
Have you seen this: https://chatjimmy.ai/

It's quite impressive what purpose-built inference can/will do once everyone stops trying to build the one best general-purpose model.




Wow, impressive. What's the story with this?

It's a tech demonstrator for a company that turns models into custom silicon for fast inference, in this case Llama 3.1 8B: https://taalas.com/products/

Is this an ASIC? Or FPGA? Or something even more exotic?

I’m guessing it’s some form of ASIC because I can’t imagine crafting the logic of Llama on silicon is a very quick or easy job. Not that doing it on an ASIC is a piece of cake either.


An ASIC is custom silicon, no?

Anyways, I found this article discussing it a bit more: https://www.eetimes.com/taalas-specializes-to-extremes-for-e...

"Taalas is borrowing some ideas from the structured ASICs of the early 2000s to make its hardwired model-specific chips. Structured ASICs used gate arrays and hardened IP blocks, changing only the interconnect layers to adapt the chip to a specific workload. At the time, this was seen as a more cost-effective alternative to a full-custom ASIC that was more performant than an FPGA."

"Taalas changes only two masks to customize a chip for a specific model, but the two masks can change both model weights and dataflow through the chip. On the HC1, the model and its weights are stored on the chip using a mask-ROM-based recall fabric paired with a (programmable) SRAM, which can be used to hold fine-tuned weights and/or the KV cache. Future generations of chips may split the SRAM onto a separate chip, meaning they could be denser than the HC1."


Taalas' hardware implementation of Llama 3.1 8B. They claim 16k tok/s vs. Cerebras at 2k. https://taalas.com/products/
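If those figures hold (both numbers are just the claims above, and the 1,000-token response is an arbitrary example), the gap is easy to put in perspective:

  taalas_tps, cerebras_tps = 16_000, 2_000
  tokens = 1_000                              # example response length
  print(1000 * tokens / taalas_tps)           # ~62.5 ms on Taalas
  print(1000 * tokens / cerebras_tps)         # ~500 ms on Cerebras
  print(taalas_tps / cerebras_tps)            # 8x claimed speedup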


