
Commentary: Bringing people and technology together to combat the threat of deepfakes

President Richard Nixon meets with Secretary of State Henry Kissinger on January 21, 1974, in the Oval Office. Nixon famously recorded his conversations in the Oval Office. But in today’s AI era, the authors write, we are increasingly confronting the possibility that the media we encounter may not be genuine. Photo by National Archive/Newsmakers.

By Christine Mallinson and Vandana Janeja

Mallinson is professor of Language, Literacy & Culture at University of Maryland Baltimore County and director of the Center for Social Science Scholarship. Janeja is professor of Information Systems at UMBC and associate dean for research in the College of Engineering and Information Technology.

This is the latest commentary in a new partnership between Maryland Matters and the University of Maryland Baltimore County. Once a month, UMBC faculty, staff or students will provide an op-ed piece about a program, course or issue on campus and its broader relevance to the state.

In July of 1973, the American public learned of audio recordings that would reveal Richard Nixon’s prior knowledge of criminal activity and his participation in its coverup. What became known as the Watergate scandal set off a political earthquake.

As individuals and as a society, we rely on information and evidence (as in the case of the Watergate audio recordings) to make decisions, formulate beliefs, and take action. In today’s AI era, however, we are increasingly confronting the possibility that the media we encounter may not be genuine: robocalls impersonate political candidates, fabricated images masquerade as real photos of celebrities, and readily available apps can manipulate anyone’s voice, with or without authorization.

Images, videos, texts, and audio content that are synthetically generated or manipulated using AI tools and methods are known as deepfakes. Deepfakes are often hyper-realistic, and they can be created in seconds, opening instant opportunities for deception, fraud, threats, misinformation and disinformation, and social turmoil. In short, deepfakes are undermining the trustworthiness of human communication and threatening democracy and society.

Further complicating the problem is the fact that deepfake generation is quickly outpacing deepfake detection. While fake content is easy to create, it is difficult to catch. Typically, scientists have attempted to refine AI algorithms to detect deepfakes. But adversaries can use that same technology to generate even better deepfakes, leading to a vicious cycle. By focusing only on technology-driven solutions to technology-driven problems, we often inadvertently hand adversaries the keys to the technological safe.

Fortunately, we are not as powerless as it may appear. Focusing on the qualities that set humans apart from machines and machine-generated content can help us take on the challenge of detecting fake media. At UMBC, our team is working on novel approaches to bring human experts back into the deepfake detection process. With the support of a grant from the National Science Foundation, our interdisciplinary team is developing new tools for tagging suspect content to improve the detection and discernment of audio deepfakes.

Our approach incorporates insights from sociolinguistics — the study of language in society — to improve the science of audio deepfake detection. Everyday language contains unique and variable features, such as the breaths and pauses we take when we talk, our pronunciation patterns, and the melody of our voices, which are hard for algorithms to precisely identify. By incorporating such measures of spoken language into algorithms, we have improved their capability for detecting fake speech.
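
To make this concrete, here is a minimal, hypothetical sketch of how two such cues, pausing behavior and vocal melody, might be measured and fed into a standard machine-learning classifier. It is an illustration under assumptions, not our actual research pipeline: the open-source librosa and scikit-learn libraries stand in for whatever tooling a real system would use, and the feature choices and training files are placeholders.

    # Hypothetical illustration, not the authors' actual pipeline:
    # measure pause behavior and pitch (vocal melody) in a clip and
    # use them as features for a genuine-vs-deepfake classifier.
    import numpy as np
    import librosa
    from sklearn.ensemble import RandomForestClassifier

    def prosodic_features(path):
        """Extract simple pause- and pitch-based features from one audio clip."""
        y, sr = librosa.load(path, sr=16000)

        # Pauses: find non-silent stretches; natural speech contains
        # breaths and gaps that synthetic speech often renders oddly.
        voiced = librosa.effects.split(y, top_db=30)   # (start, end) sample pairs
        voiced_dur = sum(end - start for start, end in voiced) / sr
        total_dur = len(y) / sr
        pause_ratio = 1.0 - voiced_dur / total_dur     # share of clip that is silence
        n_pauses = max(len(voiced) - 1, 0)             # count of silent gaps

        # Melody: track the fundamental frequency (f0) and summarize
        # its level and variability across the clip.
        f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                fmax=librosa.note_to_hz("C7"), sr=sr)
        f0 = f0[~np.isnan(f0)]                         # keep voiced frames only
        pitch_mean = float(np.mean(f0)) if f0.size else 0.0
        pitch_std = float(np.std(f0)) if f0.size else 0.0

        return [pause_ratio, n_pauses, pitch_mean, pitch_std]

    # Placeholder corpus: lists of labeled clips (1 = genuine, 0 = deepfake).
    # X = np.array([prosodic_features(p) for p in train_paths])
    # clf = RandomForestClassifier().fit(X, train_labels)

A real detector would combine many more cues, breath noise and pronunciation patterns among them, and would be trained and validated on large labeled corpora of genuine and synthetic speech.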


But we can’t just rely on a computer to help us trust whether what we see and hear is real. People need to be able to reliably spot fake content when they encounter it in the real world, in real time. This task may seem daunting, as fake media becomes more and more sophisticated. However, we believe that the ability to perceive subtle and nuanced details of communication and language is a characteristic of human intelligence that can outpace artificial intelligence if we hone this skill.

Linguistic cues can augment our perception and help us discern fake media. They have the potential to become a powerful tool in our everyday deepfake detection toolbox, just as they can enhance the development of augmented AI models for deepfake detection.

Our team is also creating and testing short training sessions to help listeners spot common “tells” of real and fake speech, and we are developing open-access materials that help listeners learn more about audio deepfakes and sharpen their listening skills. All listeners can benefit from honing their perception, but we especially focus on college students, a digital generation facing the growing threat of misinformation online.

Technology is what we make of it. As social and traditional media are flooded with highly convincing computer manipulations, demand is growing for appropriate technological guardrails and for educational efforts to inform the public about the risks of synthetic content.

As researchers, we can contribute scientific inquiry and insight, but we cannot raise political and societal awareness of the deepfake challenge alone. We need motivated citizens who see deepfakes not as a battle already lost but as a challenge we are capable of meeting.

We also need informed legislators who are committed to taking steps now, as the landscape of AI legislation takes shape, to combat unauthorized uses of AI and to put guardrails around access to tools for generating fake content. Responsibility should also lie with the creators of these tools, subject to checks and balances, and those who create fake content should be held accountable for propagating misinformation.

With any powerful technology, guardrails and guidelines are essential. Working together, we can help ensure that the information and content that voters, consumers, and everyday citizens receive and encounter is as authentic, reliable, and trustworthy as possible.
