This is the second installment in a two-part series about generative adversarial networks. For the full story, be sure to also read part one.

Now that we’ve described the origin and general functionality of generative adversarial networks (GANs), let’s explore the role of this exciting new development in artificial intelligence (AI) as it pertains to cybersecurity.

PassGAN: Cracking Passwords With Generative Adversarial Networks

Perhaps the most famous application of this technology is described in a paper by researchers Briland Hitaj, Paolo Gasti, Giuseppe Ateniese and Fernando Perez-Cruz titled “PassGAN: A Deep Learning Approach for Password Guessing,” the code for which is available on GitHub.

In this project, the researchers first pitted a GAN against the password-cracking tools John the Ripper and HashCat, then used it to augment HashCat’s guessing rules. The GAN was remarkably successful: It was trained on 9.9 million unique leaked passwords — 23.7 million including duplicates — representing real human output. This is also a rare example of a security application of GANs that does not involve images.

According to the paper, PassGAN did twice as well as John the Ripper’s SpiderLabs rule set and was competitive with the best64 and gen2 rule sets for HashCat. However, the authors noted that they got the best results when they applied PassGAN as an augmentation to HashCat — the combination cracked 18 to 24 percent more passwords than HashCat alone. This is an impressive result: If HashCat were able to crack 1 million passwords from a data breach, the augmentation would add another 180,000 to 240,000 passwords to the cracked set. That scenario is not unrealistic given the massive size of many data breaches we’ve seen in the past.
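The arithmetic behind that claim is easy to verify. Here is a quick sketch using the 18 to 24 percent uplift reported in the paper and the hypothetical 1 million-password baseline above:

```python
# Back-of-the-envelope check of the augmentation figures cited above.
# Integer math avoids floating-point rounding in the percentages.
baseline_cracked = 1_000_000              # hypothetical breach scenario
uplift_low_pct, uplift_high_pct = 18, 24  # range reported in the paper

extra_low = baseline_cracked * uplift_low_pct // 100
extra_high = baseline_cracked * uplift_high_pct // 100

print(f"PassGAN augmentation adds {extra_low:,} to {extra_high:,} more passwords")
# prints "PassGAN augmentation adds 180,000 to 240,000 more passwords"
```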

What’s more, the authors claimed that their technique can guess passwords not covered by any rule set. Because the generator of PassGAN learned the password distribution of the training set, it picked up human patterns and generated guesses close to those patterns, including passwords that a typical rule-based cracker would never produce.
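To make that idea concrete, here is a minimal sketch of what “learning a password distribution” means. Note that PassGAN itself uses a deep GAN, not the simple character-level Markov chain below; the tiny wordlist and every identifier here are invented purely for illustration:

```python
# Toy stand-in for "learning a password distribution": a character-level
# Markov chain fitted on a made-up wordlist. PassGAN actually trains a
# GAN; this sketch only illustrates how learned guesses mimic human
# patterns instead of applying fixed mangling rules.
import random
from collections import defaultdict

def fit(passwords):
    """Record character transitions, including start (^) and end ($) markers."""
    model = defaultdict(list)
    for pw in passwords:
        chain = "^" + pw + "$"
        for a, b in zip(chain, chain[1:]):
            model[a].append(b)
    return model

def sample(model, rng, max_len=10):
    """Walk the chain from ^ until $ (or a length cap) to emit one guess."""
    out, cur = [], "^"
    while len(out) < max_len:
        cur = rng.choice(model[cur])
        if cur == "$":
            break
        out.append(cur)
    return "".join(out)

training = ["password1", "summer2019", "dragon", "letmein1"]  # made-up data
model = fit(training)
rng = random.Random(7)
guesses = {sample(model, rng) for _ in range(20)}
# Guesses recombine human-looking fragments rather than applying fixed rules.
```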

It’s important to note that the authors capped the password length, for both the training data and the guessing, at 10 characters. I would like to see the same experiments run with longer passwords: At the time of this writing, 13 characters is widely considered the minimum for a strong password.

This project is also notable because it generates text as output. Most security applications of GANs revolve around image recognition and manipulation, as we will see in another paper that describes the use of GANs to generate secure steganography.

SSGAN: Applying GANs to Steganography

Steganography is the practice of hiding information in otherwise normal-looking files. For example, changing the least significant bit of each RGB pixel value in an image allows information to be smuggled out without degrading the image to human perception. Statistically, however, such images are easy to detect.
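As a concrete illustration, here is a toy sketch of least-significant-bit embedding on raw RGB tuples. The pixel values and message are made up, and a real implementation would operate on actual image files:

```python
# Illustrative LSB steganography on raw RGB tuples. This is a toy sketch
# of the technique described above, not the SSGAN method itself.

def embed_bits(pixels, bits):
    """Overwrite the LSB of each channel value with one message bit."""
    flat = [c for px in pixels for c in px]
    if len(bits) > len(flat):
        raise ValueError("message too long for cover image")
    out = flat[:]
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b  # clear the LSB, then set it to the bit
    # Regroup the flat channel list back into (R, G, B) tuples
    return [tuple(out[i:i + 3]) for i in range(0, len(out), 3)]

def extract_bits(pixels, n):
    """Read the first n LSBs back out of the stego image."""
    flat = [c for px in pixels for c in px]
    return [c & 1 for c in flat[:n]]

cover = [(200, 13, 57), (88, 140, 255), (0, 64, 127)]  # toy 3-pixel "image"
message = [1, 0, 1, 1, 0, 0, 1]
stego = embed_bits(cover, message)
assert extract_bits(stego, len(message)) == message
# Each channel changes by at most 1, so a human sees no difference --
# but the bit-level statistics are exactly what steganalysis looks for.
assert all(abs(a - b) <= 1 for pa, pb in zip(cover, stego) for a, b in zip(pa, pb))
```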

A paper from the Chinese Academy of Sciences titled “SSGAN: Secure Steganography Based on Generative Adversarial Networks” described the researchers’ attempts to use GANs to create steganographic schemes. The SSGAN method improved upon earlier work in the field that relied on a less performant strategy.

This experiment used one generator and, unlike the PassGAN project, two discriminators. Here, the generator’s job is to attempt to create images that are well-suited to hide information, meaning images that are both visually consistent and resistant to steganalysis methods. These are called secure cover images.

The discriminators play two distinct roles: The first is built on a GAN-based steganalysis framework, which the authors claimed is more sophisticated than those used in previous research; it judges how suitable a candidate image is for steganography. The second “competes” against the generator to encourage diversity in the created images by assessing their visual quality. This way, the generator does not keep producing noisy images; instead, it receives feedback telling it which images are more visually convincing.
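One way to picture this dual feedback is as a weighted loss that the generator tries to minimize. The score functions, weights and names below are invented for illustration; the paper’s actual networks and loss terms differ:

```python
# Hedged sketch of a dual-discriminator feedback signal. The weights and
# score ranges are invented for illustration, not taken from the paper.

def generator_loss(visual_score, steg_score, alpha=0.5):
    """Blend two discriminator signals into one training objective.

    visual_score: how realistic the candidate cover image looks (0-1,
                  higher is better), from the visual-quality discriminator.
    steg_score:   how resistant the image is to steganalysis (0-1,
                  higher is better), from the steganalysis discriminator.
    The generator minimizes this loss, so both discriminators push it
    toward images that are natural-looking AND hard to flag.
    """
    return alpha * (1.0 - visual_score) + (1.0 - alpha) * (1.0 - steg_score)

# A noisy but steganalysis-resistant image is penalized on visual quality...
noisy = generator_loss(visual_score=0.2, steg_score=0.9)
# ...while a natural, equally resistant image earns a lower (better) loss.
good = generator_loss(visual_score=0.9, steg_score=0.9)
assert good < noisy
```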

Experimental results showed that, using the SSGAN architecture, the classification error of the steganalysis network increased, meaning that the generated steganographic images were better at hiding information. The dual-discriminator architecture succeeded in pushing the generator to produce images that were not only more resistant to steganalysis, but also of greater visual quality. This is a major win for the field of steganography, since the approach beat out heuristic-based algorithms.

The Tip of the Iceberg

Overall, these two projects proved that GANs of various architectures show promise in the field of cybersecurity. PassGAN demonstrated that GANs can be applied to fundamental security-related tasks, such as cracking passwords, and can improve and advance the state of the art. SSGAN showed that GANs can handle extremely complex tasks, such as hiding information in high-quality generated images that are resistant to steganalysis.

These projects are only the tip of the iceberg. As GANs are applied to more cybersecurity-related tasks, they will no doubt prove extremely effective in helping security analysts compete with ever-evolving threats.
