
OpenAI, working with training data firm Handshake AI, is asking third-party contractors to submit real assignments and work products from their current or past jobs so that it can evaluate the capabilities of its next-generation AI models, according to documents reviewed by WIRED. The initiative is part of OpenAI’s broader effort to establish a “human baseline” across professions as it measures progress toward artificial general intelligence (AGI). One confidential OpenAI document explains the rationale: “We’ve hired folks across occupations to help collect real-world tasks modeled off those you’ve done in your full-time jobs, so we can measure how well AI models perform on those tasks.”
Under the programme, contractors are instructed to upload “real, on-the-job work” they have “actually done,” spanning a wide range of formats, including “Word doc, PDF, Powerpoint, Excel, image, repo.” An OpenAI presentation cited by WIRED states that each submission should include both the original request and the final output, enabling evaluators to see how humans respond to authentic workplace demands. Examples referenced in the materials range from everyday business documents to highly detailed outputs, such as a customised Bahamas yacht itinerary created for a luxury concierge client.
While OpenAI permits fabricated or hypothetical examples, the guidance repeatedly emphasises that genuine work artefacts are strongly preferred. The company argues that real assignments more accurately capture the nuance, constraints, and decision-making involved in professional tasks, providing a clearer benchmark for how AI systems compare with human performance across industries.
However, the strategy has sparked concern among legal and intellectual property experts. Evan Brown, an intellectual property lawyer at Neal & McDevitt, warned that the approach could expose AI developers to significant legal risk. “The AI lab is putting a lot of trust in its contractors to decide what is and isn’t confidential,” he told WIRED. “If they do let something slip through, are the AI labs really taking the time to determine what is and isn’t a trade secret? It seems to me that the AI lab is putting itself at great risk.”
The concerns centre on the possibility that contractors may unintentionally share proprietary, confidential, or trade-secret information belonging to employers or clients. While OpenAI’s intent is to ground its evaluations in realistic human work, critics argue that relying on contractors to self-police sensitive material could create legal exposure. As AI labs race to measure and demonstrate progress toward more general intelligence, the episode highlights the growing tension between data realism, ethical safeguards, and legal accountability in AI development.
