Modulo Research has partnered with Hidden Variable Limited on projects supporting a leading frontier AI lab’s Frontier Red Team. This collaborative work includes developing an eval that tests the degree to which models can build reinforcement learning pipelines; we understand that the eval is in use at this lab and that the lab has shared it with another lab. These projects have also included investigations of AI models’ capacity to generate economically valuable products, focusing specifically on simple computer games.
Additionally, we led the analysis team on a recent collaborative study investigating LLM persuasive capabilities (link), and are conducting other investigations of LLM capabilities to be announced in the coming months.
Although it is not an evaluation of dangerous capabilities per se, our recent preprint, FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research, evaluates the ability of frontier models to detect and correctly characterize errors in long-form solutions to difficult problems.