Lexin Zhou

I am an incoming intern at Microsoft Research, advised by Dr. Xing Xie and Prof. Jose Hernandez-Orallo. I did my master’s in ML at the University of Cambridge, supervised by Prof. Andreas Vlachos. Prior to that, I did my BSc in Data Science at the Universitat Politècnica de València, where I got into research by working with Prof. Jose Hernandez-Orallo.

I am interested in research on AI Evaluation, social computing, human-AI interaction and AI safety, regularly taking inspiration from psychometrics and cognitive science. At present, I mostly spend my day thinking about (i) designing robust evaluation methods that offer explanatory and predictive power about AI’s capabilities, limitations and risks, and (ii) finding pathways to positively shape the reliability and predictability of AI systems in the quest to mitigate their harms and amplify their benefits. I am especially intrigued by general-purpose systems like LLMs.

At various times, I’ve held research/consultancy roles on AI Evaluation at Meta AI, OpenAI, Krueger AI Safety Lab, VRAIN, and the European Commission JRC. My work has been featured in the Financial Times, Nature, IEEE Spectrum, El País, IBM, and New Scientist, among others.

If you are drawn to everything relevant to AI Evaluation and want to stay informed, I highly recommend the monthly AI Evaluation Digest, led by a few amazing colleagues I’ve worked with, to which I also make occasional contributions. If you want to talk about something I do, feel free to reach out via email or on Twitter.

I am seeking Fall 2025 PhD positions to continue my research journey. Please do reach out if you think I would be a good fit!

news

Oct 30, 2024	💡 Invited talk on Larger and More Instructable Language Models Become Less Reliable at Microsoft Research!
Sep 25, 2024	📜 Larger and More Instructable Language Models Become Less Reliable is out in Nature! Takeaways on X and a fairly well-written article in Chinese. This may be the most computationally expensive proof of Goodhart’s law as of 2024.
Sep 20, 2024	📜 An LLM Feature-based Framework for Dialogue Constructiveness Assessment is accepted by EMNLP 2024!
Mar 01, 2024	👨‍💻 Participated in the Red Team at Meta AI for their new foundation models, focusing on adversarial testing.
Oct 09, 2023	📜 Predictable Artificial Intelligence preprint on arXiv.

selected publications

NATURE

Larger and More Instructable Language Models Become Less Reliable

Lexin Zhou, Wout Schellaert, Fernando Martínez-Plumed, Yael Moros-Daval, Cèsar Ferri, and José Hernández-Orallo

Nature, 2024

PDF Code

EMNLP

An LLM Feature-based Framework for Dialogue Constructiveness Assessment

Lexin Zhou, Youmna Farag, and Andreas Vlachos

EMNLP, 2024

PDF Code

arXiv

Predictable Artificial Intelligence

Lexin Zhou, Pablo A. Moreno-Casares, Fernando Martínez-Plumed, John Burden, Ryan Burnell, Lucy Cheke, and 9 more authors

Under Review, 2023

PDF