Leaked Document Reveals Troubling Details About How AI Is Really Being Trained
Jul 19, 9:45 AM EDT by Joe Wilkins
Talk about a brain teaser.
Under the hood of a huge amount of artificial intelligence is an immense amount of human labor.
This can take many forms, but a particularly prominent one is "data labeling": the process of annotating material like written text, audio, or video, so that it can be used to train an algorithm.
Fueling the multi-billion dollar AI industry is a vast army of remote contract workers, often from less wealthy countries like the Philippines, Pakistan, Kenya, and India. Data labelers are typically overworked and underpaid, and have to contend with the mental impact of repetitive work and punitive bosses, as well as exposure to hate speech, violent rhetoric, or other harmful and desensitizing material.
Recently, a trove of "safety guidelines" from billion-dollar data labeling company Surge AI was uncovered by the magazine Inc. Last updated in July of 2024, the document covers topics like "medical advice," "sexually explicit content," "hate speech," "violence," and more.
As Inc. notes, Surge AI is a middleman firm that hires contractors through a subsidiary, DataAnnotation.Tech, to train commercial large language models (LLMs) like Anthropic's Claude. Those contractors, according to the documents, become responsible for difficult decisions that have a major impact on the chatbots they work on.
More:
https://futurism.com/documents-ai-training-surge