About

I am a researcher at OpenAI.

I study agents.

Selected papers

  • τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
    Shunyu Yao, Noah Shinn, Pedram Razavi, Karthik Narasimhan
    paper | repo | blog

  • Language Agents: From Next-Token Prediction to Digital Automation
    Shunyu Yao
    PhD Thesis
    paper | slides | talk

  • SWE-agent: Agent-Computer Interfaces Enable Software Engineering Language Models
    John Yang*, Carlos E. Jimenez*, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
    paper | repo | tweet | project

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
    Carlos E. Jimenez*, John Yang*, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan
    ICLR 2024 (Oral)
    paper | repo | tweet | project

  • Cognitive Architectures for Language Agents
    Shunyu Yao*, Theodore Sumers*, Karthik Narasimhan, Thomas L. Griffiths
    TMLR 2024
    paper | repo | tweet

  • InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
    John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
    NeurIPS 2023 Datasets and Benchmarks Track
    paper | repo | tweet | project

  • Reflexion: Language Agents with Verbal Reinforcement Learning
    Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
    NeurIPS 2023
    paper | repo | tweet

  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models
    Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
    NeurIPS 2023 (Oral)
    paper | repo | tweet

  • ReAct: Synergizing Reasoning and Acting in Language Models
    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
    ICLR 2023 (Oral, top 5%)
    paper | repo | tweet | project | Google AI blog post

  • WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
    Shunyu Yao*, Howard Chen*, John Yang, Karthik Narasimhan
    NeurIPS 2022
    paper | repo | tweet | project | demo | Quanta Magazine

Recent readings

  • The Double Helix (James Watson)
  • Lectures on General Relativity (David Tong)
  • What Babies Know (Elizabeth Spelke)
  • The Art of Doing Science and Engineering (Richard Hamming)

(last updated: Aug 2024)