ICLR 2026 - Submissions

Submissions

Quantity AI Content: 0-10%10-30%30-50%50-70%70-90%90-100%All

Avg Rating: 0-1 1-2 2-3 3-4 4-5 5-6 6-7 7-8 8-9 9-10 All

Title	Abstract	Avg Rating	Quantity AI Content	Reviews	Pangram Dashboard
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering	Medical question answering (QA) benchmarks often focus on multiple-choice or fact-based tasks, leaving open-ended answers to real patient questions underexplored. This gap is particularly critical in ...	6.67	14%	See Reviews	View AI Dashboard

PreviousPage 1 of 1 (1 total rows)Next