AI shouldn’t block the conversation just because a tool is busy. To evaluate that behavior properly, we needed good data—so we made it.
AsyncTool is a Hugging Face dataset of 270 high‑quality, multi‑turn (and I mean up to 60 turns) conversations where the assistant keeps talking while tools work in the background. Each conversation is distinct, grounded in real JSON‑Schema tool definitions, and the tool calls and results stay consistent, with no fabricated states or magical shortcuts.
What’s inside
- 18 scenario templates × 15 renders = 270 conversations.
- Conversations run 10–30 “in‑world” minutes with filler chat, retries, status checks, and out‑of‑order returns.
- Every row includes messages, tools, and meta so you can replay transcripts, inspect schemas, and trace provenance.
- Protocol features: <tool_ack /> placeholders, -FINAL handoffs, mixed sync/async chains, transient failures, and fatal‑error surfacing.
- License: Apache‑2.0.
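To make the protocol features concrete, here is a hedged sketch of what one of these transcripts looks like, plus a tiny consistency check of the kind the dataset enforces (no tool result referencing a call that was never issued). The field names are illustrative, not necessarily the dataset's exact schema:

```python
# Hedged sketch of an async transcript shape; field names are assumptions,
# not the dataset's exact schema.
transcript = [
    {"role": "user", "content": "Book me a flight to Berlin."},
    {"role": "assistant", "content": "<tool_ack />",   # ack now, answer later
     "tool_calls": [{"id": "call_1", "name": "search_flights",
                     "args": {"dest": "BER"}}]},
    {"role": "user", "content": "What's the weather like there in May?"},
    {"role": "assistant", "content": "Usually mild, around 15-20 C."},
    {"role": "tool", "tool_call_id": "call_1",          # result lands later
     "content": '{"flights": [{"price": 120}]}'},
    {"role": "assistant", "content": "Found a flight for $120. -FINAL"},
]

def results_follow_calls(messages):
    """Every tool result must refer to a call issued earlier in the log."""
    issued = set()
    for m in messages:
        for call in m.get("tool_calls", []):
            issued.add(call["id"])
        if m["role"] == "tool" and m["tool_call_id"] not in issued:
            return False
    return True

print(results_follow_calls(transcript))  # True
```

The same shape accommodates out‑of‑order returns: results can land several turns after their call, interleaved with unrelated chat, and the check still holds.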
We’re exploring how agents can ack now and answer later, waiting for the right signal (last relevant tool result vs. last user question) while staying natural and helpful. This dataset gives you supervised signals to:
- finetune assistants that acknowledge async work without hallucinating tool states,
- build guardrails/regression tests for routers juggling retries and reordered responses,
- evaluate “answered at the right time” behavior.
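A minimal sketch of the third point, assuming (as an illustration, not a guarantee) that the final answer is marked with a trailing `-FINAL` as in the protocol notes above; adjust the predicate to the dataset's actual convention:

```python
# Hedged sketch of an "answered at the right time" check. The "-FINAL"
# suffix convention is an assumption taken from the protocol notes.
def answered_at_right_time(messages):
    final_idx = next((i for i, m in enumerate(messages)
                      if m["role"] == "assistant"
                      and m["content"].endswith("-FINAL")), None)
    last_tool_idx = max((i for i, m in enumerate(messages)
                         if m["role"] == "tool"), default=-1)
    # The final answer should come after the last relevant tool result.
    return final_idx is not None and final_idx > last_tool_idx

msgs = [
    {"role": "user", "content": "Summarize my unread email."},
    {"role": "assistant", "content": "On it, checking your inbox now."},
    {"role": "tool", "content": '{"unread": 3}'},
    {"role": "assistant", "content": "You have 3 unread emails. -FINAL"},
]
print(answered_at_right_time(msgs))  # True
```

A regression test built on this flags transcripts where the model "answers" before the result it depends on has arrived.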
We’re also publishing the generator so you can reproduce or extend everything locally. If you’re building tool‑using agents, or just tired of UIs that freeze, this should help you train, test, and iterate faster.
We’ve spent a while trying to generate a very specific, complex, multi‑turn conversation dataset. Most tools pushed us toward glue scripts and one‑off pipelines that were hard to review or reuse. Torque is our attempt to make this boring and predictable: a small, MIT‑licensed framework that treats dataset generation like building a UI. You define small, declarative “components” and compose them into pipelines. The goal is clear code and repeatable runs, not another heavy DSL.
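To show the idea (and only the idea) of small declarative components composed into a repeatable pipeline, here is a conceptual sketch in plain Python. This is NOT Torque's actual API; see the docs linked below for that:

```python
# Conceptual sketch only -- NOT Torque's real API. It illustrates the idea:
# small declarative components, composed into a pipeline, with a seeded RNG
# so runs are repeatable.
import random

def user_turn(template):
    def component(rng, state):
        state["messages"].append(
            {"role": "user", "content": template.format(**state["vars"])})
        return state
    return component

def assistant_ack():
    def component(rng, state):
        state["messages"].append(
            {"role": "assistant", "content": "<tool_ack />"})
        return state
    return component

def run_pipeline(components, seed=0):
    rng = random.Random(seed)                # fixed seed -> identical reruns
    state = {"messages": [],
             "vars": {"city": rng.choice(["Berlin", "Oslo", "Tokyo"])}}
    for component in components:
        state = component(rng, state)
    return state["messages"]

messages = run_pipeline([user_turn("Find flights to {city}."),
                         assistant_ack()])
print([m["role"] for m in messages])  # ['user', 'assistant']
```

The point of the component/pipeline split is reviewability: each piece is a few lines you can read and test in isolation, and the seeded run makes every generated conversation reproducible.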
It’s early and open source. We’d love feedback on the API design, examples you’d like to see, and rough edges we should fix first.
Docs and code: https://github.com/qforge-dev/torque
Fully understand the WhatsApp part. Do you use any other messaging apps? Discord, Signal, or something else? We are looking for more "interfaces".
When it comes to async, that's exactly what we're trying to "solve". Right now, models are built to expect tool results immediately after tool calls.
For the tooling attached to it, we're currently using Pipedream integrations, and we're planning to move to an open‑source, user‑configurable solution so you can set up whatever you want.
So imagine a single chat interface that steers a fleet of other agents in the background for code, not by handing off memory, but by navigating the tasks asynchronously.
Right now it can:
- handle real tasks in the background — emails, calendar stuff, research, finding info, organizing data
- chat naturally without feeling like you're talking to a bot
- remember context and keep conversations flowing
- work with integrations (gmail, calendar, docs, maps, etc.) so it can actually do stuff, not just talk about it
- multi-task — ask it multiple things and it handles them in parallel; if you misspeak, it can update the tasks already in progress
It's like you're talking to a real person. No stop buttons, no waiting. Interrupt, add details, or change direction anytime. Just like a natural conversation.
This is our vision for the future of computer use through voice assistance: available everywhere in your system to do whatever you'd like, including advanced agent‑based tool usage.
The beta is completely free right now, so we hope you'll test it out!
Built with Torque → https://usetorque.dev/