Lexin Zhou
Hi! I am a 1st-year CS PhD candidate at Princeton University, advised by Prof. Peter Henderson at the POLARIS Lab. Before joining Princeton, I was a StarBridge Scholar at Microsoft Research for a year, working with Dr. Xing Xie. I did my master’s in CS at the University of Cambridge, funded by Open Philanthropy and supervised by Prof. Andreas Vlachos, and my BSc in Data Science at the Universitat Politècnica de València, where I got into research by working with Prof. Jose Hernandez-Orallo.
I am a computer scientist by training but also regularly draw insights from cognitive science. At present, I’m particularly interested in designing systematic evaluation frameworks that allow for understanding the capabilities and generalization patterns of LLMs, as well as their associated risks (e.g. unreliability). I also search for ways to disprove Goodhart’s Law through generalizable optimization targets and evaluations.
I’ve spent time in research/consultancy roles at Microsoft Research, OpenAI, Meta AI, European Commission JRC, Krueger AI Safety Lab, and VRAIN. My work has been featured in Nature, Financial Times, Microsoft Research, MIT Tech Review, Forbes, IEEE Spectrum, El País, New Scientist, QbitAI, and IBM, among others.
If you wanna talk about something I do, feel free to reach out via email or on Twitter!
news
| Sep 11, 2025 | 💡 Invited talk about General Scales Unlock AI Evaluation with Explanatory and Predictive Power at Future of Life Institute. |
|---|---|
| Sep 02, 2025 | 🔥 Starting my PhD studies at the CS department of Princeton University! |
| Mar 20, 2025 | 💡 Invited talk about General Scales Unlock AI Evaluation with Explanatory and Predictive Power at Princeton University. |
| Mar 09, 2025 | 📜 New preprint introducing conceptual and technological innovations for a science of AI Evaluation: General Scales Unlock AI Evaluation with Explanatory and Predictive Power! Takeaways on X. An open platform calling for collaborations and extensions of our methodology. An accessible Microsoft Research Blog post summarizing our work for a general audience. This is the work I personally feel most excited about to date. |
| Oct 30, 2024 | 💡 Invited talk on Larger and More Instructable Language Models Become Less Reliable at Microsoft Research! |
| Sep 25, 2024 | 📜 Larger and More Instructable Language Models Become Less Reliable is finally out in Nature! Takeaways on X. This reminds me of Goodhart’s law. |
| Sep 20, 2024 | 📜 An LLM Feature-based Framework for Dialogue Constructiveness Assessment is accepted by EMNLP 2024, receiving high review scores that placed it in the top 0.5% of all submissions! |
| Sep 09, 2022 | 👨‍💻 Participated in the Red Team of GPT-4 at OpenAI, focusing on capability assessment, reliability evaluation, and adversarial testing. |
selected publications
- General Scales Unlock AI Evaluation with Explanatory and Predictive Power. Nature (In Press), 2025
- Larger and More Instructable Language Models Become Less Reliable. Nature, 2024