Modulo Research is developing specialized datasets to help advance empirical AI alignment research. These datasets will serve as the foundation for our own follow-up studies and will also be made available to the broader AI safety research community.
Datasets Under Development
• Expert-annotated Arguments in Specialized Fields: This dataset will include sentence-by-sentence annotations in which domain experts point out issues in arguments produced by large language models. Domains include surgical medicine, contract law, evidence law, and the Lojban language.
• Meta-dataset of Annotated Arguments: A collection of datasets from the existing literature containing multi-sentence arguments/explanations with annotated flaws.
• Unique Questions for Sandwiching Experiments: A dataset of questions designed specifically for sandwiching experiments, with a focus on those that cannot be readily answered through a simple Google search.
• We are also exploring other datasets to be announced at a later date.
In addition to providing these datasets, we are committed to sharing our methodologies, experiences, and insights on data collection, expert selection, and dataset validation.