Interesting point about #2. I've been doing something similar but from a different angle: running the same question through Claude, GPT-4o, and Gemini to see where they disagree. Turns out they give completely different root causes about 30% of the time, which honestly surprised me.

What's your experience with qwen3.5 for debugging tasks? I've mostly stuck with the big models so far.