Social engineering attacks have challenged cybersecurity for years. No matter how strong your digital security, authorized human users can always be manipulated into opening the door for a clever cyber attacker.
Social engineering typically involves tricking an authorized user into taking an action that enables cyber attackers to bypass physical or digital security.
One common trick is to trigger a victim’s anxiety to make them more careless. Attackers might pose as a victim’s bank, with an urgent message that their life savings are at risk and a link to change their password. But of course, the link goes to a fake bank site where the victim inadvertently reveals their real password. The attackers then use that information to steal money.
But today, we find ourselves facing new technology that may completely change the playing field of social engineering attacks: synthetic media.
What is synthetic media?
Synthetic media is video, sound, pictures, virtual objects or words produced or aided by artificial intelligence (AI). This includes deepfake video and audio, text-prompted AI-generated art and AI-generated digital content in virtual reality (VR) and augmented reality (AR) environments. It also includes writing AI, which can enable a foreign-language speaker to interact as an articulate native speaker.
Deepfake data is created using an AI self-training methodology called generative adversarial networks (GANs). The method pits two neural networks against each other, where one tries to simulate data based on a large sampling of real data (pictures, videos, audio, etc.), and the other judges the quality of that fake data. Each learns from the other until the data-simulating network can produce convincing fakes. The quality of this technology will no doubt rapidly improve as it also becomes less expensive.
Text-prompted AI-generated art is even more complicated. Simply put, the AI takes an image and adds noise to it until it is pure noise. Then it reverses that process, but with text input that causes the de-noising system to refer to large numbers of images with specific words associated with each in its database. The text input can influence the direction of the de-noising according to subject, style, details and other factors.
Many tools are available to the public, and each specializes in a different area. Very soon, people may legitimately choose to make photos of themselves rather than take them. Some startups are already using online tools to have all staff appear to have been photographed in the same studio with the same lighting and photographer, when in fact, they fed a few random snapshots of each staffer into the AI and let the software create a visually consistent output.
Synthetic media already threatens security
Last year, a criminal ring stole $35 million by using deepfake audio to trick an employee at a United Arab Emirates company into believing that a director needed the money to acquire another company on behalf of the organization.
It’s not the first such attack. In 2019, a manager of a U.K. subsidiary of a German company got a call from his CEO requesting a transfer of €220,000 — or so he thought. It was scammers using deepfake audio to impersonate the CEO.
And it’s not just audio. Some malicious actors have reportedly used real-time deepfake video in attempts to get fraudulently hired, according to the FBI. They’re using consumer deepfake tools for remote interviews, impersonating actually qualified candidates. We can assume these were mostly social engineering attacks because most applicants were targeting IT and cybersecurity jobs, which would have given them privileged access.
These real-time video deepfake scams were mostly or entirely unsuccessful. The state-of-the-art consumer real-time deepfake tools aren’t quite good enough yet, but they soon will be.
The future of synthetic media-based social engineering
In the book “Deepfakes: The Coming Infocalypse,” author Nina Schick estimates that some 90% of all online content may be synthetic media within four years. Though we once relied upon photos and videos to verify authenticity, the synthetic media boom will upend all that.
The availability of online tools for creating AI-generated images will facilitate identity theft and social engineering.
Real-time video deepfake technology will enable people to show up on video calls as someone else. This could provide a compelling way to trick users into malicious actions.
Here’s one example. Using the AI art site “Draw Anyone,” I’ve demonstrated the ability to combine the faces of two people and end up with what looks like a photograph that looks like both of them at the same time. That enables a cyber attacker to create a photo ID of a person whose face is known to the victim. Then they can show up with a fake ID that looks like both the identity thief and the target.
No doubt AI media-generating tools will pervade future reality and augmented reality. Meta, the company formerly known as Facebook, has introduced an AI-powered synthetic media engine called Make-A-Video. As with the new generation of AI art engines, Make-A-Video uses text prompts to create videos for use in virtual environments.
How to protect against synthetic media
As with all defenses against social engineering attacks, education and awareness-raising are central to curtailing threats posed by synthetic media. New training curricula will be crucial; we must unlearn our basic assumptions. That voice on the phone that sounds like the CEO may not be the CEO. That Zoom call may appear to be a known qualified candidate, but it may not be.
In a nutshell, media — sound, video, pictures and written words — are no longer reliable forms of authentication.
Organizations must research and explore emerging tools from companies like Deeptrace and Truepic that can detect synthetic videos. HR departments must now embrace AI fraud detection to evaluate resumes and job candidates. And above all, embrace a zero trust architecture in all things.
We’re entering a new era in which synthetic media can fool even the most discerning human. We can no longer trust our ears and eyes. In this new world, we must make our people vigilant, skeptical and well-provisioned with the tools that will help us fight the coming scourge of synthetic media social engineering attacks.