GDPval Resources

Access the dataset and automated grading service

About These Resources

The GDPval benchmark includes 1,320 specialized tasks across 44 occupations from the top 9 industries contributing to U.S. GDP. We're releasing a gold subset of 220 tasks (5 per occupation) for public use, along with an automated grading service to facilitate research.
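
As a quick sanity check on the figures above, the short Python sketch below recomputes the gold subset size and its share of the full benchmark. The variable names are illustrative only. The average of 30 tasks per occupation is derived from the totals and assumes an even split, which the text states only for the gold subset.

```python
# Figures quoted in the description above.
TOTAL_TASKS = 1320          # full GDPval benchmark
OCCUPATIONS = 44
GOLD_PER_OCCUPATION = 5     # stated for the gold subset

# Gold subset size: 5 tasks for each of the 44 occupations.
gold_subset_size = OCCUPATIONS * GOLD_PER_OCCUPATION

# Average tasks per occupation in the full set.
# Assumption: an even split; only the average is implied by the totals.
avg_tasks_per_occupation = TOTAL_TASKS // OCCUPATIONS

# Fraction of the full benchmark that is publicly released.
gold_share = gold_subset_size / TOTAL_TASKS

print(gold_subset_size)           # 220
print(avg_tasks_per_occupation)   # 30
print(f"{gold_share:.1%}")        # 16.7%
```

In other words, the public gold subset covers every occupation but amounts to roughly one sixth of the full task set.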

Each task in the dataset includes a realistic prompt, reference files, and context, reflecting real work products from experienced professionals. The automated grader is an experimental research service that helps researchers quickly evaluate model outputs.