Qally's
Home
Team
Our work

Publications

RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts

H Wijk et al. (2024), Working Paper

Blog post

Qally’s

joel@qallys.com