This is fantastic. There are a ton of use cases where you'd want to be able to build an integration that hooks back to your running agent session. OpenClaw has this today, but it's pretty janky. Hopefully this is coming to Claude Cowork as well.
My use case is that I have a separate system that provides human approvals for what my agent can do. Right now, I've had to resort to long-polling to give a halfway decent user experience. But webhooks are clearly the right solution. Curious to see how it ends up being exposed outside of these initial integrations.
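To make the webhook side concrete, here's a minimal sketch of the handler I'd want on the approvals side (payload shape and field names are hypothetical, not any real API):

```python
import json

# Hypothetical payload shape; a sketch of an approval webhook handler,
# not any real agent platform's API.
def handle_approval_webhook(raw_body: str, pending: dict) -> str:
    """Resolve a pending agent action from a webhook delivery."""
    event = json.loads(raw_body)
    action_id = event["action_id"]
    if action_id not in pending:
        return "unknown"  # stale or duplicate delivery; webhook senders retry
    pending.pop(action_id)  # no more long-polling needed for this action
    return "approved" if event.get("approved") else "denied"

pending = {"act_123": {"tool": "send_email"}}
print(handle_approval_webhook('{"action_id": "act_123", "approved": true}', pending))
```

The duplicate-delivery branch matters because most webhook senders retry on timeout, so the handler has to be idempotent.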
> If you’ve built something agents want, please let us know. Comments welcome!
I'll bite! I've built a self-hosted, open-source tool intended to solve exactly this problem. It lets you approve an agent's purpose rather than specific scopes. An LLM then checks that every request fits that purpose, and only injects the credentials if the request is in line with the approved purpose. I (and my early users) have found this substantially reduces the likelihood of agent drift or injection attacks.
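Roughly, the credential-gating step looks like this (a sketch only; the judge LLM is stubbed out as a plain callable, and the function and field names are mine, not the tool's actual API):

```python
# Purpose-scoped credential injection, sketched. The judge LLM is stubbed
# as a callable so the example runs standalone.
def gate_request(request: dict, approved_purpose: str, judge, secrets: dict) -> dict:
    """Attach credentials only if the judge says the request fits the purpose."""
    verdict = judge(f"Purpose: {approved_purpose}\nRequest: {request['summary']}")
    if verdict != "fits":
        raise PermissionError(f"outside approved purpose: {request['summary']}")
    authorized = dict(request)
    authorized["headers"] = {"Authorization": f"Bearer {secrets['api_token']}"}
    return authorized

# Stub judge that only approves billing reads
judge = lambda prompt: "fits" if "invoice" in prompt else "reject"
ok = gate_request({"summary": "fetch invoice list"}, "read billing data",
                  judge, {"api_token": "tok_example"})
```

The key design point is that the raw credential never reaches the agent: it's attached to the outgoing request only after the purpose check passes.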
Just scanning these evals, but they seem pretty basic, and not at all representative of what I'd expect the failure modes to be.
For example, 'slack_wrong_channel' is an ask to post a standup update, with the failing result being an announcement of free pizza in #general. Does this get rejected for targeting #general (as it looks like it's supposed to be), or does it get rejected because it's not a standup update (which I'd expect is more likely)?
Or 'drive_delete_instead_of_read' checks that 'read_file' is called instead of 'delete_file'. But LLMs are pretty good at getting the right text transform (read vs. delete); the real problem would be if, for example, the LLM decides the file is no longer necessary and _aims_ to delete it for the wrong reasons. Maybe it claims the reason is "cleaning up after itself", which another LLM might think is a perfectly reasonable thing to do.
Or 'stripe_refund_wrong_charge', which uses a different ID format for the requested action than for the actual refund. I'd wonder whether this would prevent any refunds from working at all, since Stripe doesn't speak your order-ID format.
It seems these are all synthetic evals rather than being based on real usage. I understand why some synthetic evals are useful, but they do seem to be much less valuable in general.
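To illustrate the ambiguity in the slack case: an eval that records *which* check fired would distinguish the two failure modes instead of collapsing them into one pass/fail (the shapes and names here are my own, not the project's):

```python
# Two-part check for the slack_wrong_channel scenario: report wrong
# channel and wrong intent separately rather than as a single boolean.
def evaluate_action(action: dict, expected: dict) -> list:
    failures = []
    if action["channel"] != expected["channel"]:
        failures.append("wrong_channel")
    if action["intent"] != expected["intent"]:
        failures.append("wrong_intent")
    return failures

result = evaluate_action(
    {"channel": "#general", "intent": "free_pizza_announcement"},
    {"channel": "#standup", "intent": "standup_update"},
)
print(result)  # both failure modes surface independently
```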
Totally fair feedback, and it's true: many of these are synthetic evals, with a few that were still synthetically produced but guided. At this point, because it's all self-hosted, I only have my own data set. The places where it fails for me today are due to feature gaps rather than LLM mistakes. This is a new project that hasn't been widely announced, so my user base is small but growing. If you give it a whirl and find it making mistakes, please send them my way! :)
Completely agree. As soon as I had OpenClaw working, I realized actually giving it access to anything was a complete nonstarter after all of the stories about it going off the rails due to context limitations [1]. I've been building a self-hosted, open-source tool to try to address this by using an LLM to police the activity of the agent. Having the inmates run the asylum (by having an LLM police the other LLM) seemed like an odd idea, but I've been surprised how effective it's been. You can check it out here if you're curious: https://github.com/clawvisor/clawvisor (or clawvisor.com)
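The core loop is just a supervisor gate in front of every tool call, roughly like this (the supervisor is stubbed out as a callable; this is a sketch of the idea, not the actual clawvisor code):

```python
# One LLM policing another: every tool call passes through a supervisor
# verdict before it executes. The supervisor is stubbed for the example.
def supervised_call(tool_name: str, args: dict, supervisor, execute) -> dict:
    verdict = supervisor(f"Agent wants {tool_name} with {args}. allow/deny?")
    if verdict != "allow":
        return {"status": "blocked", "tool": tool_name}
    return {"status": "ok", "result": execute(tool_name, args)}

supervisor = lambda prompt: "allow" if "read_file" in prompt else "deny"
execute = lambda name, args: f"contents of {args['path']}"
allowed = supervised_call("read_file", {"path": "notes.txt"}, supervisor, execute)
blocked = supervised_call("delete_file", {"path": "notes.txt"}, supervisor, execute)
```

Because the gate wraps execution rather than prompting, the supervised agent never gets a chance to act before the verdict lands.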
I have to agree with the other poster. My typing speed was 70+ WPM on qwerty before learning dvorak, and now I'm a glorified hunt-and-pecker on qwerty keyboards.
The only exception is typing on my mobile device, which is still configured for qwerty.
Berbix | Full-stack software engineer | Full Time | Onsite | San Francisco, CA
Our stack: Go, React, TypeScript, iOS, Android, Google Cloud
We're an Initialized Capital-backed, YC startup (S18) making it easy for companies to collect and instantly verify photo IDs online. We use ML and computer vision techniques to effectively extract and validate the IDs in our system without any human intervention. This is a game changer for companies that require age verification, fraud deterrence or KYC. We are growing quickly and have new customers coming on board weekly.
Our founding team led the Trust & Safety team at Airbnb for several years. We implemented the initial versions of Airbnb's Verified ID product and saw many of the problems with the existing solutions.
We have a modern stack and a ton of interesting problems to solve. We're a SaaS, API-first company building a best-in-class solution for identity verification.
Absolutely understand where you’re coming from. It can be jarring to be asked to go through those steps by a set of companies with whom you have no direct relationship. That said, data access requests can contain some extremely sensitive information and it’s important companies responding to such requests don't share information with the wrong person.
Regarding your question on data deletion; we abide by the retention policies chosen by our customers, which are typically much shorter than 3 years. For Sift specifically, the retention policy is indeed 14 days, after which point we automatically delete all the personally identifiable information we've collected on Sift's behalf. We'll be taking in your feedback, however, as this could be made clearer in both our privacy policy and our product.
Completely understand and sympathize with that. We absolutely can (and will) do a better job of conveying the intent of the different checks here. The pose change requested is randomized, but I get that this can be frustrating.
I know you already get this, but for posterity, the idea here is to make sure the person submitting their ID is actually in front of the computer (and can react to a prompt). Attempting to use a still photo is a common way a bad actor may try to circumvent these protections. Obviously correctly identifying someone in the case you described is extremely important given the sensitivity of some of these data access requests.
Orwellian isn’t exactly the vibe we’re looking for, though, so we can do better here.
So you require someone's PICTURE to deliver the data you gathered on that person? To further augment your digital stash? Or train your models to recognize said person? (after which you delete the picture, logical - storage space costs money)
I hope I'm wrong somewhere.
If I'm not, I don't think I want to do business with you, or to ever have my ID checked by you, if it means you'll get to keep my data and then ask me for an up-to-date picture to improve your collection when I object to that.
So the purpose of taking a picture of yourself is to make sure that the photo on the ID matches the person completing the flow. This is important: a stolen ID should not be usable for the purposes of online identity verification.
We’re not in the business of selling your data, but of providing a secure, privacy-oriented way for businesses that have to perform ID checks to do so. In the situation described above, we’re providing identity verification services for Sift in the context of the data subject access requests they’re receiving.
That's really nice to hear -- thanks for the reply! A "Why do we ask this?" link would probably be optimal.
I do get the aim, but it took me a while; I'd wondered if it was simply data collection for more classifier training or something, which felt like a dodgy extra ask alongside a verification service (even if it's the same strategy as recaptcha).
On the point of why images leave our system at all, we provide a way to show our work to our customers — they won’t trust our results if they can’t see that they’re accurate. When they access information on our dashboard, if we render the images, they’ve left our systems. To be clear, we’re not syndicating this information to any third parties, just showing this information directly to our customer (who is the owner and controller of this data).
As for what procedures we put in place, we enforce short retention periods for the data we store in our systems for precisely the reason you are worried about. At the expiration of that period, the data is permanently deleted. Furthermore, in the event of a change of control, the contracts we’ve put in place with our existing customers govern how the information can be used. This is super important to us as we personally take privacy extremely seriously.
The aggressive watermarking is important for several reasons. First, in the worst case scenario, we can trace how a breach happened and when. Second, it is watermarked in such a way that the images become much less functional than they would be otherwise — the intent is to ensure that the images cannot be used to verify an identity on any other service. We take security very seriously — we’ve already secured SOC 2 certification and continue to invest heavily in security using industry best practices.
Thank you! And yes, this space is definitely becoming increasingly important. Our focus has been to provide a lightweight, low-friction means by which to confidently check IDs. While we have been primarily serving North America-based customers, our product can work well for any US or Canadian IDs or ICAO compliant travel documents (which includes many European IDs).
There are a lot of companies in this space, like Veriff. What is the differentiator other than low friction? Most of the competitors offer ~$1 per use and are highly automated and frictionless.