A ReversingLabs blog post by Jaikumar Vijayan titled "CVE-Genie raises stakes in the vulnerability race" discusses the impact of a new tool by ACTION researchers.
CVE-Genie is an automated, multi-agent system designed to solve a significant challenge in cybersecurity: the manual, time-consuming process of turning basic vulnerability descriptions (i.e., CVE entries) into verifiable, working exploits. When the arXiv paper was mentioned on LinkedIn, CVE-Genie received immediate attention.
The problem CVE-Genie addresses is one of data: high-quality security datasets are crucial for evaluating security tools and training AI, but they are scarce because gathering technical details, reconstructing the exact vulnerable software environment, and writing the attack code is difficult and costly. CVE-Genie automates this entire pipeline in four steps: (i) gathering initial vulnerability data; (ii) automatically rebuilding the specific vulnerable software version; (iii) generating the exploit code; and (iv) independently verifying that the attack actually works. The authors found this collaborative, multi-agent design necessary after single-model attempts failed, and it marks a major advance in creating reliable ground-truth data for the field.
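To make the four steps concrete, here is a minimal Python sketch of a staged pipeline with a final verification step. It is not CVE-Genie's implementation: the `ReproductionState` record, the `run_pipeline` helper, the stage functions, and the placeholder identifier `CVE-0000-0000` are all hypothetical names chosen for illustration, and each stage is a stub standing in for what the real system does with cooperating agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ReproductionState:
    """Hypothetical record handed from one pipeline stage to the next."""
    cve_id: str
    details: Optional[str] = None      # step i: gathered vulnerability data
    environment: Optional[str] = None  # step ii: rebuilt vulnerable version
    exploit: Optional[str] = None      # step iii: generated exploit code
    verified: bool = False             # step iv: independent verification
    log: List[str] = field(default_factory=list)


Stage = Callable[[ReproductionState], ReproductionState]


def run_pipeline(cve_id: str, stages: List[Stage]) -> ReproductionState:
    """Run the stages in order and stop at the first one that raises."""
    state = ReproductionState(cve_id=cve_id)
    for stage in stages:
        try:
            state = stage(state)
            state.log.append(f"{stage.__name__}: ok")
        except Exception as exc:
            state.log.append(f"{stage.__name__}: failed ({exc})")
            break
    return state


# Stub stages (assumptions): each one only records a placeholder string.
def gather_data(state: ReproductionState) -> ReproductionState:
    state.details = f"description and references for {state.cve_id}"
    return state


def rebuild_environment(state: ReproductionState) -> ReproductionState:
    if state.details is None:
        raise ValueError("no vulnerability data to build from")
    state.environment = "container with the vulnerable version"
    return state


def generate_exploit(state: ReproductionState) -> ReproductionState:
    if state.environment is None:
        raise ValueError("no environment to target")
    state.exploit = "proof-of-concept script"
    return state


def verify_exploit(state: ReproductionState) -> ReproductionState:
    # A real verifier would run the exploit against the rebuilt environment;
    # this stub only checks that both artifacts exist.
    state.verified = state.exploit is not None and state.environment is not None
    return state


STAGES: List[Stage] = [gather_data, rebuild_environment, generate_exploit, verify_exploit]

if __name__ == "__main__":
    result = run_pipeline("CVE-0000-0000", STAGES)
    print(result.verified, result.log)
```

Keeping verification as its own final stage mirrors the point made above: an exploit only counts as reproduced once a separate check confirms that it works.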
In a comprehensive evaluation, CVE-Genie reproduced verifiable exploits for approximately 51% (428 out of 841) of real-world vulnerabilities published in 2024 and 2025, covering a broad range of programming languages and vulnerability types. Critically, the system runs at an average cost of just $2.77 per vulnerability, which makes it practical to apply at scale. By providing a low-cost, automated method for generating reliable and reproducible security benchmarks, CVE-Genie creates a valuable resource for the security community, enabling better evaluation of security testing tools, improving patching efforts, and allowing standardized assessment of AI capabilities in the security domain.
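The headline figures can be checked with a few lines of Python. The success rate follows directly from the counts above. The total-spend estimate is only a rough illustration: it assumes the $2.77 average applies to each of the 841 attempted vulnerabilities, which the summary above does not state explicitly. The function names are illustrative.

```python
def success_rate(reproduced: int, attempted: int) -> float:
    """Fraction of attempted vulnerabilities that were reproduced."""
    return reproduced / attempted


def estimated_total_cost(attempted: int, avg_cost_usd: float) -> float:
    """Rough total spend, ASSUMING the average cost applies to every attempt."""
    return attempted * avg_cost_usd


if __name__ == "__main__":
    rate = success_rate(428, 841)
    total = estimated_total_cost(841, 2.77)
    print(f"success rate: {rate:.1%}")          # 50.9%, i.e. roughly 51%
    print(f"estimated total: ${total:,.2f}")    # $2,329.57 under the stated assumption
```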