Hacker News
Demystifying Evals for AI Agents (anthropic.com)
2 points by theptip 10 hours ago | past | discuss
Anthropic Economic Index economic primitives (anthropic.com)
3 points by atlasunshrugged 1 day ago | past | discuss
Finding bugs across the Python ecosystem with Claude and property-based testing (anthropic.com)
1 point by mmaaz 2 days ago | past | discuss
Anthropic Labs (anthropic.com)
2 points by kerim-ca 2 days ago | past | discuss
Advancing Claude in healthcare and the life sciences (anthropic.com)
32 points by ta_u 4 days ago | past | 2 comments
Demystifying Evals for AI Agents (anthropic.com)
3 points by dvorka 5 days ago | past | discuss
Advancing Claude in healthcare and the life sciences (anthropic.com)
1 point by meetpateltech 5 days ago | past | 1 comment
Anthropic: Demystifying Evals for AI Agents (anthropic.com)
4 points by Bayram 6 days ago | past | 1 comment
Writing Evals for AI agents (anthropic.com)
3 points by seshagiric 6 days ago | past | discuss
More efficient protection against universal jailbreaks (anthropic.com)
2 points by pretext 7 days ago | past | 1 comment
Demystifying Evals for AI Agents (anthropic.com)
2 points by vinhnx 7 days ago | past | 1 comment
Constitutional Classifiers++: More efficient protection against jailbreaks (anthropic.com)
3 points by Tiberium 7 days ago | past | discuss
Demystifying Evals for AI Agents (anthropic.com)
5 points by pretext 7 days ago | past | 1 comment
Experimenting with AI to defend critical infrastructure (anthropic.com)
2 points by Kristopher1337 8 days ago | past | discuss
Bloom: An open source tool for automated behavioral evaluations (anthropic.com)
2 points by pbd 13 days ago | past | discuss
Bloom: an open source tool for automated behavioral evaluations (anthropic.com)
3 points by maluta 24 days ago | past
Bloom: An open source tool for automated behavioral evaluations (anthropic.com)
2 points by sonabinu 25 days ago | past
Bloom: an open source tool for automated behavioral evaluations (anthropic.com)
1 point by gangtao 25 days ago | past
Project Vend: Phase Two (anthropic.com)
197 points by kubami 25 days ago | past | 92 comments
Bloom: an open source tool for automated behavioral evaluations (anthropic.com)
1 point by gmays 26 days ago | past
Bloom: an open source tool for automated behavioral evaluations (anthropic.com)
4 points by Garbage 27 days ago | past
Training and Evaluating LLMs as General-Purpose Activation Explainers (anthropic.com)
1 point by not4uffin 28 days ago | past
Project Vend: Phase Two (anthropic.com)
1 point by Anon84 28 days ago | past
Protecting the well-being of our users (anthropic.com)
2 points by amrrs 29 days ago | past
Project Vend: Phase Two (anthropic.com)
2 points by dcre 29 days ago | past | 1 comment
Natural Emergent Misalignment from Reward Hacking in Production RL [pdf] (anthropic.com)
1 point by samlinnfer 34 days ago | past
Donating the Model Context Protocol and establishing the Agentic AI Foundation (anthropic.com)
288 points by meetpateltech 38 days ago | past | 145 comments
Anthropic Interviewer (anthropic.com)
1 point by erhuve 40 days ago | past
Anthropic Interviewer (anthropic.com)
2 points by ta_u 42 days ago | past
Anthropic Interviewer (anthropic.com)
1 point by victorbuilds 42 days ago | past