AI system developers must first implement robust, mature approaches to security management and responsible disclosure; without these, reported safeguard bypasses might not be handled properly. That’s among the advice from the UK’s National Cyber Security Centre (NCSC) and AI Security Institute (AISI), which have been considering how traditional cyber security management tools might help mitigate safeguard bypasses, focusing initially on the approaches to vulnerability management and disclosure described in the NCSC’s Vulnerability Disclosure Toolkit.
They have developed suggested best-practice principles for running Safeguard Bypass Bounty Programmes (SBBPs) and Safeguard Bypass Disclosure Programmes (SBDPs), building on AISI’s experience collaborating on and judging the Gray Swan Agent Red-Teaming Challenge and evaluating frontier AI safeguards, as well as the NCSC’s own research. More on the NCSC’s blog.
Comments
Kevin Curran, IEEE senior member and professor of cyber security at Ulster University, says that GenAI presents both an innovation leap and a security concern. He says: “It can craft highly personalised, grammatically correct phishing emails, often indistinguishable from legitimate messages. It can also simulate customer service agents or internal communications to trick victims and generate functional malware or obfuscated code. The barrier to entry for cybercrime is being lowered drastically – with so-called ‘script kiddies’ becoming ‘AI kiddies’. Even attackers with limited programming skills can now create exploits or malicious scripts with natural language prompts.
“One of the more worrying developments is the way GenAI can refine proof-of-concept exploits published on GitHub, quickly turning them into weaponised tools. This shrinks the time between disclosure and attack, raising the stakes for defenders. We’re already seeing black-hat alternatives to ChatGPT, such as WormGPT and FraudGPT, openly marketed to cybercriminals. GenAI in the hands of hackers is no longer theoretical but an emerging reality. Safeguard bypasses should be taken seriously, but disclosure on its own is not enough – the real challenge is ensuring defences keep pace in what has become a high-stakes arms race.”
Keeley Crockett, IEEE member and professor of computational intelligence at Manchester Metropolitan University, says AI is advancing so quickly that many tools are drifting far beyond the purposes their developers originally intended. She says: “We’ve already seen it used not just to write code, but to help decide what data to steal, how to craft extortion demands – even suggesting ransom amounts. The time needed to exploit vulnerabilities is shrinking, and defenders can’t afford to stay reactive.
“This should be taken as a warning of what comes next. As AI continues to accelerate cybercrime, the emergence of agentic systems – models able to plan and act with some autonomy – could make attacks faster and more adaptive in real time. Clearly, that carries significant risks if such tools fall into the wrong hands. Safeguards, oversight and resilience need to be built into intelligent systems now. Even without full autonomy, AI is already lowering barriers for less skilled attackers and adding psychological sophistication to extortion. Waiting until agentic AI becomes mainstream would leave many organisations dangerously exposed.”