Most benchmarks use GPT4 as the grader. What does your benchmark use and do you ...

patrickhogan1 on March 17, 2024 | on: Ask HN: If you've used GPT-4-Turbo and Claude Opus...

Most benchmarks use GPT-4 as the grader. What does your benchmark use, and do you believe this causes any bias in the results?