Hacker News | mustaphah's submissions
1.Tell HN: Claude's two rate limits don't know about each other
2 points by mustaphah 26 days ago | past
2.Enhancing gut-brain communication reversed cognitive decline in aging mice (stanford.edu)
386 points by mustaphah 27 days ago | past | 185 comments
3.Many SWE-bench-Passing PRs would not be merged (metr.org)
278 points by mustaphah 28 days ago | past | 154 comments
4.AGI is an unscientific myth (tandfonline.com)
4 points by mustaphah 41 days ago | past | 2 comments
5.Web Verbs (github.com/nlweb-ai)
1 point by mustaphah 45 days ago | past
6.OpenAI's 5-month experiment: building a product with no human-written code (openai.com)
2 points by mustaphah 49 days ago | past
7.SkillsBench: Benchmarking how well agent skills work across diverse tasks (arxiv.org)
364 points by mustaphah 51 days ago | past | 171 comments
8.Evaluating AGENTS.md: are they helpful for coding agents? (arxiv.org)
232 points by mustaphah 51 days ago | past | 161 comments
9.Cursor: Expanding our long-running agents research preview (cursor.com)
3 points by mustaphah 53 days ago | past
10.Measuring Time Horizon Using Claude Code and Codex (metr.org)
1 point by mustaphah 53 days ago | past
11.SWE-ContextBench: context learning benchmark in coding (arxiv.org)
1 point by mustaphah 54 days ago | past
12.SWE-AGI: benchmarking spec-driven software construction (arxiv.org)
1 point by mustaphah 55 days ago | past | 1 comment
13.Code Formatting Silently Consumes Your LLM Budget (arxiv.org)
1 point by mustaphah 58 days ago | past
14.Agent Trace by Cursor: open spec for tracking AI-generated code (agent-trace.dev)
1 point by mustaphah 67 days ago | past
15.METR releases Time Horizon 1.1 with 34% more tasks (metr.org)
1 point by mustaphah 68 days ago | past
16.Coffee timing isn't one-size-fits-all (examine.com)
4 points by mustaphah 69 days ago | past
17.ChatGPT subscription support in Kilo Code (kilo.ai)
1 point by mustaphah 72 days ago | past
18.Imposter Syndrome Predicts Perfectionism (psypost.org)
2 points by mustaphah 73 days ago | past
19.Motivation acts as a camera lens that shapes how memories form (psypost.org)
2 points by mustaphah 73 days ago | past
20.Claude Code: Merging Slash Commands into Skills (x.com)
2 points by mustaphah 73 days ago | past | 2 comments
21.The visual feedback tool for coding agents (agentation.dev)
2 points by mustaphah 75 days ago | past
22.Agent Skills to help developers using AI agents with Supabase (github.com/supabase)
1 point by mustaphah 75 days ago | past
23.METR AI Benchmark: Clarifying Limitations of Time Horizon (metr.org)
2 points by mustaphah 75 days ago | past
24.Scaling PostgreSQL to power 800M ChatGPT users (openai.com)
348 points by mustaphah 76 days ago | past | 136 comments
25.Claude Code plugin that rings your phone when a run needs you (github.com/zeframlou)
2 points by mustaphah 84 days ago | past | 1 comment
26.Exercise can be nearly as effective as therapy for depression (sciencedaily.com)
402 points by mustaphah 89 days ago | past | 393 comments
27.Party of One for Code Review (tidyfirst.substack.com)
2 points by mustaphah 3 months ago | past
28.FrontierScience Benchmark by OpenAI (openai.com)
17 points by mustaphah 3 months ago | past | 1 comment
29.Open Scouts: AI-driven web monitoring (firecrawl.dev)
1 point by mustaphah 3 months ago | past
30.A Rosetta Stone for AI Benchmarks (epoch.ai)
2 points by mustaphah 4 months ago | past
