{"id":2,"date":"2023-08-29T15:54:49","date_gmt":"2023-08-29T15:54:49","guid":{"rendered":"https:\/\/www.moduloresearch.com\/?page_id=2"},"modified":"2025-06-07T11:16:47","modified_gmt":"2025-06-07T11:16:47","slug":"data","status":"publish","type":"page","link":"https:\/\/www.moduloresearch.com\/index.php\/data\/","title":{"rendered":"Data"},"content":{"rendered":"\n<p>Modulo Research is developing specialized datasets to help advance empirical alignment research in artificial intelligence. These datasets will not only serve as the foundation for our own follow-up studies but will also be made available to the broader AI safety research community.<\/p>\n\n\n\n<p>You can&nbsp;<a href=\"https:\/\/forms.gle\/xBSiN7buHoA7Cw4K8\">sign up<\/a>&nbsp;to be notified when we release future datasets.<\/p>\n\n\n\n<p><strong>Currently Available<\/strong><\/p>\n\n\n\n<p><strong>FindTheFlaws<\/strong> is a set of datasets of long-form solutions to difficult questions in medicine, physics, chemistry and more. It includes (1) expert-verified correct solutions and (2) flawed solutions with annotations highlighting specific errors. Several of the questions are drawn from existing benchmarks such as GPQA Diamond, and the set also includes the novel CELS dataset, which contains detailed expert annotations of LLM responses to difficult questions in surgical medicine, law, and Lojban. 
The <a href=\"https:\/\/github.com\/modulo-research\/findtheflaws\" data-type=\"link\" data-id=\"https:\/\/github.com\/modulo-research\/findtheflaws\">repository<\/a> includes the datasets presented in the <a href=\"https:\/\/arxiv.org\/abs\/2503.22989\" data-type=\"link\" data-id=\"https:\/\/arxiv.org\/abs\/2503.22989\">paper<\/a> and the scripts used to run model evaluations with UK AISI&#8217;s Inspect library.<\/p>\n\n\n\n<p><br><strong style=\"font-size: 18px;\">Datasets<\/strong><strong> Under Development<\/strong><br><br>We&#8217;re now finalizing a dataset of textual representations of the research processes followed by high-performing participants in an experiment involving an online research task, which we hope will be useful for improving LLM capability elicitation. We have also written up the results of the\u00a0<a href=\"https:\/\/moduloresearch.com\/papers\/Confirmation_bias_A_challenge_for_scalable_oversight.pdf\">experiment<\/a>\u00a0in which the data were collected.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Modulo Research is developing specialized datasets to help advance empirical alignment research in artificial intelligence. These datasets will not only serve as the foundation for our own follow-up studies but will also be made available to the broader AI safety research community. You can&nbsp;sign up&nbsp;to be notified when we release future datasets. 
Currently Available FindTheFlaws [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"open","template":"","meta":{"_themeisle_gutenberg_block_has_review":false,"footnotes":""},"class_list":["post-2","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/pages\/2","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/comments?post=2"}],"version-history":[{"count":10,"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/pages\/2\/revisions"}],"predecessor-version":[{"id":118,"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/pages\/2\/revisions\/118"}],"wp:attachment":[{"href":"https:\/\/www.moduloresearch.com\/index.php\/wp-json\/wp\/v2\/media?parent=2"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}