Microsoft Research Podcast

Researchers across the Microsoft research community

Abstracts: NeurIPS 2024 with Jindong Wang and Steven Euijong Whang

Researcher Jindong Wang and Associate Professor Steven Euijong Whang discuss the NeurIPS 2024 work ERBench. ERBench uses relational databases to build LLM benchmarks that check not only whether a model's answer is correct but also whether its rationale contains the expected keywords.
