Publications

2025

NATURE

General Scales Unlock AI Evaluation with Explanatory and Predictive Power

Lexin Zhou, Lorenzo Pacchiardi, Fernando Martínez-Plumed, Katherine M. Collins, Yael Moros-Daval, Seraphina Zhang, and 20 more authors

Nature (In Press), 2025


2024

NATURE

Larger and More Instructable Language Models Become Less Reliable

Lexin Zhou, Wout Schellaert, Fernando Martínez-Plumed, Yael Moros-Daval, Cèsar Ferri, and José Hernández-Orallo

Nature, 2024

🌟 Extensive Media Coverage

This work has been featured by Nature, Forbes, MIT Tech Review, IEEE Spectrum, El País, New Scientist, QbitAI, and IBM, among other media outlets.

EMNLP

An LLM Feature-based Framework for Dialogue Constructiveness Assessment

Lexin Zhou, Youmna Farag, and Andreas Vlachos

EMNLP, 2024

🌟 Top 0.5% of Submissions

This work received an average review score of 4.17 out of 5, placing it in the top 0.5% of all submissions in ARR June 2024.

2023

AIJ

Predictable Artificial Intelligence

Lexin Zhou, Pablo A. Moreno-Casares, Fernando Martínez-Plumed, John Burden, Ryan Burnell, Lucy Cheke, and 9 more authors

Artificial Intelligence Journal (In Press), 2023


2022

EBeM@IJCAI

Reject Before You Run: Small Assessors Anticipate Big Language Models

Lexin Zhou, Fernando Martínez-Plumed, José Hernández-Orallo, Cèsar Ferri, and Wout Schellaert

In Workshop on AI Evaluation Beyond Metrics at IJCAI, 2022
