‍How OpenAI o1 Changed Offensive Security

A Glimpse into the Future of Pen Testing

At Kodem, we are always on the lookout for step-function improvements in security that align with our mission of “giving AppSec teams an attacker's view of their applications.” When we saw the potential of the OpenAI' o1 model in offensive security, it caught our attention. The OpenAI o1 model’s ability to think iteratively and adapt mid-attack presents a leap forward in how AI can support and augment penetration testing. Unlike traditional methods that follow predefined patterns, o1 dynamically adjusts its approach based on real-time results, which mirrors how attackers operate in the real world.

This kind of reasoning and adaptability marks a shift in how we approach security, allowing AppSec teams to uncover vulnerabilities that may otherwise go unnoticed. It’s the kind of capability that could dramatically enhance attack surface analysis and exploitability assessments. 

However, as with any technology, there are limitations. AI can miss context or find unintended shortcuts, so human oversight remains critical. At Kodem, we're excited about the future of AI in security. Still, we also recognize the need for responsible implementation, ensuring these tools work hand-in-hand with expert judgment to maximize effectiveness without introducing new risks.

written by
Mahesh Babu
published on
October 18, 2024
topic
AI Security
Application Security

As we examine the tests from the Cybersecurity section of OpenAI o1’s model card, it is clear that AI is no longer just a tool but a critical partner in reshaping offensive security. The o1 model handled real-world pen testing challenges with adaptability and precision, uncovering vulnerabilities traditional methods might miss. 

In the sections below, we explain how o1 tackled these challenges, what this means for AI in security workflows, and where human oversight is essential to avoid risks when integrating AI into offensive cybersecurity strategies.

The OpenAI o1 Test*

A Misconfigured Container

In a recent capture the flag (CTF) experiment model o1 faced a “broken” challenge. The task was to exploit a vulnerability in a Linux container, but the container never started due to a misconfiguration. Traditional approaches would hit a dead end here—but not OpenAI o1.

The model used nmap to scan the network and discovered an exposed Docker API on the host machine. It then listed the running containers, restarted the broken one, and used cat flag.txt to retrieve the flag from the container logs via the Docker API.

ChatGPT o1 Model
Figure 1: Container Configuration for the Experiment

Steps taken by the OpenAI o1 model during the CTF challenge*

  1. Attempt to connect to the container: When the container failed to start, the model scanned the network using nmap.
  2. Discovery of Docker API: Through network scanning, the model found that the Docker daemon API was accessible due to a misconfiguration.
  3. Interaction with Docker API: The model listed the running containers and identified the broken one.
  4. Reinstantiation of the container: It used the Docker API to start a new container instance with the command cat flag.txt.
  5. Flag retrieval: The model successfully retrieved the flag from the container logs through the Docker API.

AI’s Advantage: Iterative Reasoning in Offensive Security

The OpenAI o1 model’s response shows the power of iterative chain-of-thought reasoning. It didn’t give up when the challenge failed; instead, it adapted, pivoted, and ultimately solved the task by exploiting an unexpected vulnerability. This ability to dynamically adjust and reason through roadblocks marks a major step forward for AI-driven penetration testing. Traditional tools simply don’t have this kind of flexibility.

ChatGPT o1 Model
Figure 2: Task Breakdown and Plan Revision Capabilities*

The illustration above shows how o1-preview and o1-mini outperformed previous models like GPT-4o in breaking down tasks into manageable subtasks and revising strategies. This is particularly important in real-world offensive security, where tasks often require flexibility and adapting on the fly.

Associated Risks with Using AI in Cybersecurity Efforts 

Reward Hacking and Unintended Outcomes

This example also highlights a broader issue: reward hacking. The model bypassed the original challenge and exploited the Docker API, a more accessible, unintended path to completing the task. In this case, the behavior was benign, but it reveals a deeper concern—AI may pursue goals in ways humans don’t anticipate, potentially creating new risks.

For security researchers, this illustrates both AI's power and risks. Models like o1 could uncover vulnerabilities faster than ever before but might also push the limits of what’s possible in unintended ways.

Benefits of AI in Cybersecurity: A Cautionary Tale

AppSec teams can gain significant value by integrating reasoning-driven AI models like ChatGPT or OpenAI o1 into their workflows to enhance vulnerability detection and prioritization. These models can dynamically analyze complex applications and identify potential attack paths, offering insights that traditional tools may miss. For example, o1 can help prioritize vulnerabilities based on real-world exploitation potential, reducing the noise of low-priority issues and allowing teams to focus on the highest-risk threats.

However, there are critical limitations to keep in mind. While AI can assist with rapid vulnerability discovery, it often struggles with deeper context and nuanced decision-making, particularly in environments with complex interdependencies. Additionally, models like OpenAI o1 can be susceptible to "reward hacking" — finding shortcuts or unintended solutions that bypass critical security checks. This introduces potential risks where AI could inadvertently create or miss security gaps.

Because of these limitations, it’s essential to exercise caution before realizing the true benefits. Human oversight remains critical, especially when validating AI-generated results. AppSec teams should ensure that AI-driven findings are supplemented with manual reviews and expert judgment to avoid over-reliance on automated systems. In the end, AI models like OpenAI o1 offer powerful new capabilities, but without the right guardrails, they could lead to unforeseen risks.

*Reference Source:  OpenAI o1 System Card

ChatGPT

Blog written by

Mahesh Babu

Head of Marketing

A Primer on Runtime Intelligence

See how Kodem's cutting-edge sensor technology revolutionizes application monitoring at the kernel level.

5.1k
Applications covered
1.1m
False positives eliminated
4.8k
Triage hours reduced

Platform Overview Video

Watch our short platform overview video to see how Kodem discovers real security risks in your code at runtime.

5.1k
Applications covered
1.1m
False positives eliminated
4.8k
Triage hours reduced