Racial Profiling and ChatGPT’s Other Sins
You've probably heard about ChatGPT by now, the AI wonder that holds a conversation with you while producing poems or essays about whatever you ask. It can help develop code, explain things, find and analyze information, and much more.
Unlike its bigger sibling GPT-3, everything is wrapped in a conversational layer that responds coherently to follow-up questions while staying courteous. Applications for this tool seem endless, from customer support to teaching.
Of course, it also has a nasty side, and people have been using it to pass exams, write university essays, and even hack. That's not all. As it turns out, ChatGPT is rife with bias, too.
It can’t help being a bigot
OpenAI has built safeguards so that ChatGPT refuses to produce certain kinds of content, including sexual material, criminal activity, and advice about self-harm, among other topics.
Even though these filters go a long way toward creating a safer environment, the model can still produce biased output in certain situations. For instance, when asked to write software code to check if someone would be a good scientist, ChatGPT defined a good scientist as "white" and "male".
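To make that failure concrete, here is an illustrative sketch of the kind of code reported. This is not ChatGPT's verbatim output; the function name and parameters are assumptions chosen to show what "defining a good scientist by race and gender" looks like in practice:

```python
# Illustrative sketch only -- NOT ChatGPT's verbatim output.
# It captures the shape of the reported failure: the model reduced
# "is a good scientist" to a check on demographic traits.
def is_good_scientist(race: str, gender: str) -> bool:
    # The flaw: scientific ability is judged purely on demographics,
    # with no reference to qualifications, publications, or skill.
    return race == "white" and gender == "male"
```

Any function of this shape discriminates by construction: every candidate who is not a white male is rejected regardless of merit, which is exactly why the example drew so much criticism.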
This is a flagrant example, but many subtler layers of bias may hide underneath each answer or task performed. This is, unfortunately, a side effect of how AIs are trained, and ChatGPT is not the only offender.
Notorious examples include Amazon's recruitment AI, which discriminated against female applicants; Galactica, which produced racist misinformation; and CLIP, which categorized Black men as criminals and women as homemakers. And let's not forget the AI systems currently being sued over plagiarism.
Why it happens
AI software like ChatGPT needs to be trained to provide accurate responses or actions. While the process is complicated, an important part of it consists of gathering data for the AI to learn from. The more data, the better.
What's the biggest source of written data available? You guessed right: the Internet. And which source of data is riddled with hate, bickering, fear-mongering, false information, and other negativity? Yes, you guessed right again.
ChatGPT was trained on some 300 billion words from the Internet, and one can only imagine how much of that content is biased. The problem, however, doesn't end there.
Because the training data is collected up to a certain cutoff point, AIs reflect past tendencies, a kind of regressive bias. This means the advice they give and the tasks they perform can rest on information that is no longer true.
"The purpose of their work? Well, without a filter over the top, ChatGPT would spew racism and sexism, just like its predecessor GPT-3. These Kenyan workers were helping OpenAI build that filter."
— Billy Perrigo (@billyperrigo), January 18, 2023
A persistent issue
Unfortunately, the problem won't go away easily, and not only because of the data. Many of these issues could be avoided with better data-selection procedures, but wouldn't that curation introduce bias of its own? There are ethical ramifications to every choice researchers make.
Another problem is that AIs are torn between academic research and commercial usage. When the two clash, academia is usually the one that gives way. This cuts short the refinement, further research, and problem-solving that should happen before a product like this is released.
AIs are not to blame, however. After all, they're just a reflection of who we are as a society. Perhaps the best use of these tools is as an introspective look at how we might become better.