Fake images of destruction permeated Twitter in the wake of a deadly typhoon in Japan in September.
Three images released two days after the storm claimed to show flooded homes and streets submerged in mud, water and debris. The caption read, “Drone photos of a flood in Shizuoka Prefecture. It’s just too horrifying.”
A 45-year-old man was killed in a town on the prefecture's south coast as high winds and record rainfall caused cave-ins and landslides.
However, the images showing the destruction were created using text-to-image software, an AI-driven tool that produces believable images from text prompts. It was only after the images garnered more than 5,600 retweets that people began questioning their authenticity.
Twitter users noticed that the floodwaters appeared to be flowing abnormally and that a roofline was distorted. Even local journalists were taken in and shared the images.
Everything from nothing
Image synthesis software uses AI to create original images from scratch. The tool is first "trained" on huge banks of images retrieved from the internet and learns to recognise concepts such as "man", "dog", "fluffy" or "Prime Minister". It then produces a fabricated image that closely matches the concepts in a prompt.
OpenAI, the makers of DALL-E, one of the most widely used text-to-image tools, made its powerful software fully available to the public in September. But it’s not open source.
Microsoft also announced on Wednesday that it would integrate DALL-E 2 into its Office suite, potentially putting the tool in the hands of millions of users.
Despite restrictions on creating extreme, celebrity and religious content, journalists are bracing for an increase in potentially false images circulating online.
Brendan Paul Murphy, senior lecturer in digital media at Central Queensland University in Australia, said journalists will need to pay more attention to small details, such as dates and locations of images.
"The traditional methods used by journalists to get things right will remain the gold standard: researching multiple sources and verifying information through investigation."
Fact-checkers may be more worried about the recent release of Stable Diffusion, an open-source competitor to DALL-E, and the algorithmic improvements it brings to AI training. Once such tools are released, their creators cannot control the images people generate with them.
"The creator generally cannot control how the media they create is used, even if they hold the legal rights to it," Murphy said. He adds that Google's text-to-image product, Imagen, was not made available to the public because it was deemed too "dangerous".
In the Limitations and Societal Impact section of Imagen's website, Google cites "potential risks of misuse" and a tendency to reinforce negative stereotypes in the images it creates as reasons for keeping the tool private.
Stability.AI, the team behind Stable Diffusion, said in a statement that it "hopes that everyone uses it in an ethical, moral and legal manner", but stressed that responsibility for how the software is used rests solely with the user.
The anonymous Twitter user created the images of the floods in Japan in less than a minute, entering the keywords “flood damage” and “Shizuoka” after using the software to create images of food.
“I thought [other Twitter users] would realize that the images were fake if they enlarged them. I never thought so many people would believe they’re real,” the original poster told Yomiuri Shimbun.
"If I'm held to account for the post, so be it. Posting this kind of image can cause big problems even when it is done on a whim. I want a lot of people to learn from my mistake that things done without careful thought can lead to big problems."
In February 2021, a doctored image circulated of Japan’s chief cabinet secretary, Katsunobu Kato, smiling in the aftermath of a devastating earthquake in Fukushima.
Former U.S. Presidents Donald Trump and Barack Obama, as well as Ukrainian President Volodymyr Zelensky, have all been the subject of AI-generated videos and images depicting speeches or official visits.
AI in the eye of the beholder
The current limitations of text-to-image tools leave some telltale signs that an image has been artificially generated, giving journalists a chance to avoid being duped.
Systems like DALL-E 2 and Stable Diffusion struggle to render complex body parts, while Google has admitted that Imagen is much weaker when it comes to images containing people.
The AI is trained to create images “close enough” to the input, so there are clues as to whether an image is fake or not.
Murphy says the AI tends to struggle with anatomy, so close inspection of the hands, ears, eyes and teeth of the people in the image may reveal a fake.
The images generated may contain errors, such as duplicate shadows or inappropriate lighting for the situation.
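Alongside visual inspection, one quick automated check is to look for camera metadata: genuine photos usually carry EXIF tags identifying the camera make and model, while files saved straight out of an AI image generator typically have none. This is only a weak signal, since social platforms often strip EXIF data on upload, but it can flag an image for closer scrutiny. A minimal sketch using the Pillow library (the function name and file path are illustrative, not part of any newsroom tool):

```python
from PIL import Image

# EXIF tag IDs defined by the EXIF standard
MAKE, MODEL = 271, 272

def has_camera_metadata(path: str) -> bool:
    """Return True if the file carries camera make/model EXIF tags.

    Genuine camera photos usually have them; AI-generated images
    typically do not. Platforms often strip EXIF on upload, so treat
    a missing tag as a weak signal, not proof of fabrication.
    """
    exif = Image.open(path).getexif()
    return MAKE in exif or MODEL in exif
```

An image saved directly from a text-to-image tool would normally return False here, while a photo straight off a camera's memory card would return True.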
Few UK newsrooms have offered their journalists training or advice on AI-generated imagery. A simple solution is to stop sourcing photos from social media and rely on reputable photographers.
Photojournalist Jess Hurd said: “There is always scrutiny of an image because [editors’] jobs and the respectability of their media are at stake.
“If there is an option for a professional photographer, then go for it. [As it becomes harder to tell what is accurate] we will put more emphasis on the value of the journalist.”