The essential labor of data work, such as moderation and annotation, is systematically hidden from those who benefit from the fruits of that labor. A new project is putting the lived experiences of data workers around the world into the spotlight, providing a first-hand account of the costs and opportunities of tech work abroad.
Many jobs that are tedious, thankless, or damaging to mental health are outsourced to poorer countries, where workers are happy to take them on for a fraction of American or European wages. This labor market sits alongside other "tedious, dirty or dangerous" jobs such as electronics "recycling" and shipbreaking. Moderation or annotation work is unlikely to cost you an arm or give you cancer, but that doesn't make it safe, much less pleasant or fulfilling.
The Data Workers' Inquiry, a collaboration between the AI ethics research group DAIR and the Technical University of Berlin, is nominally modeled on Marx's work in the late 19th century to document labor conditions in reports that are "collectively produced and politically actionable."
All reports are available free of charge and were launched at an online event today:
https://data-workers.org/about/
The ever-expanding scope of AI applications is necessarily built on human expertise, which until now has been bought at the lowest price companies can offer without creating a public relations problem. When you report a post, the response isn't, "Great, we'll send this to a guy in Syria who will be paid 3 cents to take care of it." But the volume of reports (and thus the volume of content worth reporting) is so high that, for the companies involved, solutions other than wholesale outsourcing to a cheap labor market don't really make sense.
The reports are mostly anecdotal, and intentionally so. They are closer to systematic anthropological observation than to quantitative analysis.
Quantifying such experiences often fails to capture the true costs: the statistics you end up with are the type that companies like to trumpet (and therefore solicit in studies), such as higher wages than other companies in the area, jobs created, and cost savings passed on to customers. Issues such as moderators losing sleep to nightmares, or rampant drug dependence, are rarely mentioned, let alone measured and presented.
Take Fasica Berhane Gebrekidan’s report on Kenyan data workers struggling with mental health and drug issues. She and her colleagues worked for Sama, a company that bills itself as a more ethical pipeline for data work, but the reality the workers themselves describe is one of endless misery and a lack of support from the local office.
They were recruited to handle reported content in local languages and dialects, which meant they were exposed to an endless stream of violence, gore, sexual abuse, hate speech and other content that they had to scan and "act on" quickly, lest their performance fall below expected levels and their pay be docked, the report said. For some, that meant viewing more than one piece of content per minute, or roughly 500 pieces per day at a minimum. (If you're wondering where the AI is, they were probably providing its training data.)
"It's absolutely heartbreaking. I've seen the worst things imaginable. I'm afraid I'll be scarred for life from doing this job," said Rahel Gebrekirkos, one of the contractors interviewed.
Support staff were "underequipped, unprofessional and underqualified," and moderators often turned to drugs to cope and complained of intrusive thoughts, depression and other problems.
We've heard some of this before, but it is worth knowing that it's still happening. There are several reports along these lines, while others are more personal accounts or take different forms.
For example, Yasser Yousef Alrayes is a data annotator in Syria who works to pay for his higher education. Together with his roommate, he works on visual annotation tasks, such as parsing images of text, which, as he notes, are often poorly defined and come with frustrating demands from clients.
He chose to document his work in the form of a short film that is well worth eight minutes of your time.
Workers like Alrayes are often buried behind many organizational layers, acting as subcontractors of subcontractors, so that lines of responsibility are blurred should a problem or lawsuit arise.
DAIR and TU Berlin's Milagros Miceli, one of the leaders of the project, told me they haven't yet seen any comments or changes from the companies named in the reports, but it's still early days. The results nevertheless seem reason enough to go back and continue the research, she wrote: "We plan to continue this work with a second wave of data workers, who will most likely come from Brazil, Finland, China and India."
No doubt some people will dismiss these reports for the very quality that makes them valuable: their anecdotal nature. But while statistics can easily lie, anecdotes always carry at least some truth, because these stories are taken directly from the source. Even if only a dozen moderators in Kenya, Syria, or Venezuela have these problems, what they say should concern everyone.