Their findings on KV-cache invalidation are spot on for a single-context approach.
Strata's architecture is philosophically different. Instead of loading a large toolset and masking it, we guide the LLM through a multi-step dialogue: each step (e.g., choosing an app, then a category within it) is a separate, very small, cheap LLM call.
So we trade one massive prompt for a few tiny ones. This sidesteps the KV-cache invalidation issue because the context for each decision is minimal, and it prevents model confusion because the agent only ever sees the handful of tools relevant to its current step. It's a different path to the same goal: making the agent smarter by not overwhelming it. Thanks for the great link!
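
To make the shape of it concrete, here's a rough Python sketch of the narrowing flow. This is not Strata's actual code: `call_llm`, `TOOL_TREE`, and the prompt wording are all placeholders I'm using for illustration.

```python
# Hypothetical tool hierarchy: app -> category -> tool names.
TOOL_TREE = {
    "github": {
        "issues": ["create_issue", "list_issues"],
        "pulls": ["create_pr", "merge_pr"],
    },
    "slack": {
        "messages": ["post_message", "search_messages"],
    },
}


def call_llm(prompt: str) -> str:
    """Placeholder for any small, cheap completion endpoint."""
    raise NotImplementedError


def choose(task: str, label: str, options: list[str]) -> str:
    # Each decision is its own tiny prompt: the model only ever sees
    # the few options relevant to this step, never the full toolset.
    prompt = (
        f"Task: {task}\n"
        f"Pick the best {label} from: {', '.join(options)}\n"
        f"Answer with exactly one option."
    )
    answer = call_llm(prompt).strip()
    return answer if answer in options else options[0]  # naive fallback


def select_tool(task: str) -> str:
    app = choose(task, "app", list(TOOL_TREE))
    category = choose(task, "category", list(TOOL_TREE[app]))
    return choose(task, "tool", TOOL_TREE[app][category])
```

Each call's context is a couple of lines, so there's nothing big enough to cache (or invalidate) in the first place.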