
The enemy within: when AI becomes an accomplice to hackers

by Dario Ferrero (VerbaniaNotizie.it)

The story begins like many others in the open-source community: an anonymous pull request, a few lines of code, a plugin that promises to "better format" the workspace.

But that script snippet in the Amazon Q extension for Visual Studio Code hid something more sinister. It contained a command capable of simulating a cleanup operation while, in reality, preparing the complete destruction of the development environment: local files deleted, cloud resources eliminated via the AWS CLI, a silent and devastating wipe.

The author had left the payload deactivated, perhaps to test how easily malicious code could infiltrate the review process. The answer was unsettling: the code passed through all checks, ended up in release 1.84.0, and reached the computers of hundreds of thousands of developers before anyone noticed. Once the problem was discovered, Amazon reacted with the same discretion that often characterizes these incidents: the plugin was removed from the registry without public announcements, and the GitHub repository was left intact with its dangerous references still visible.

What might seem like yet another case of negligence in the software supply chain is actually a symptom of a much deeper transformation. Generative artificial intelligence, designed to accelerate and simplify the work of developers, is redefining the very boundaries of cybersecurity. And not always for the better.

The Amazon Q case: anatomy of a systemic failure

The mechanics of the attack on Amazon Q reveal a sophisticated understanding of the human and technological vulnerabilities that characterize the era of AI assistants. The inserted code exploited what researchers call "prompt injection," a technique that manipulates the instructions given to language models to achieve unintended behaviors. In this specific case, the author had inserted commands that the AI assistant would interpret as legitimate requests to clean up the development environment.
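To make the mechanics concrete, here is a minimal defensive sketch in Python (a hypothetical illustration, not Amazon's actual tooling): a scanner that walks an extension's prompt or template files and flags natural-language instructions that embed destructive shell or AWS CLI commands, essentially the pattern the Amazon Q payload relied on. The file extensions and regular expressions are assumptions chosen for illustration.

```python
import re
import sys
from pathlib import Path

# Illustrative patterns only: destructive shell and AWS CLI commands have no
# business appearing inside an assistant's prompt or template text.
SUSPICIOUS_PATTERNS = [
    r"\brm\s+-r?f\b",                              # recursive/forced file deletion
    r"\baws\s+\w+\s+(delete|terminate)\S*",        # destructive AWS CLI verbs
    r"ignore\s+(all\s+)?previous\s+instructions",  # classic injection phrasing
]

def scan_prompt_file(path: Path) -> list[str]:
    """Return the suspicious patterns found in one prompt/template file."""
    text = path.read_text(encoding="utf-8", errors="ignore").lower()
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, text)]

if __name__ == "__main__":
    # Usage: python scan_prompts.py <extension_directory>
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for f in root.rglob("*"):
        if f.suffix in {".md", ".txt", ".json"} and (hits := scan_prompt_file(f)):
            print(f"[!] {f}: {hits}")
```

A check this crude would not stop a determined attacker, but it illustrates the kind of gate that was missing: nothing in the review pipeline asked whether a "formatting" feature had any reason to mention file deletion or cloud teardown commands.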

The timeline of events is particularly significant. The pull request was approved without a thorough human check, a pattern that is rapidly spreading in organizations trying to keep up with the frantic pace of modern development. The compromised plugin remained available for several days after the initial discovery, while Amazon worked on a discreet removal. As reported by 404 Media, the company never released public communications about the incident, limiting itself to silently removing the plugin from official repositories.

The author's strategy demonstrates a deep understanding of modern workflows. Instead of targeting traditional exploits, they exploited the implicit trust that developers place in AI assistants. The malicious code was disguised as a formatting feature, an operation so common and harmless that it went unnoticed even during superficial reviews. The choice to keep the payload deactivated suggests that the primary goal was not immediate damage, but the demonstration of a systemic vulnerability.

Amazon, with its decades of experience in AI and open source, is no stranger to this type of challenge. However, the incident puts under the microscope the approval processes that govern VS Code extensions, programmatic access to the cloud, and automated decision-making. The fact that a single hidden prompt line could trigger a production wipe indicates that review standards have not yet adapted to the new attack surface created by generative AI.

The episode also reveals an often-overlooked aspect of the modern development ecosystem: the speed at which extensions and plugins spread through distribution platforms. The VS Code Marketplace, with its millions of daily downloads, represents such an effective distribution vector that a compromised plugin can reach a global user base within hours. When this mechanism is combined with the automation of AI assistants, the time window to detect and contain a threat shrinks dramatically.

The new generation of AI-native threats

The attack on Amazon Q is just the tip of the iceberg of an emerging category of threats that specifically exploit the characteristics of generative artificial intelligence. Academic research has identified several attack vectors that take advantage of the peculiarities of the large language models used in coding assistants.

The phenomenon of "controlled hallucinations" is emerging as one of the most insidious vulnerabilities. Studies by researchers at NYU have revealed that roughly 40% of the code GitHub Copilot generates in security-relevant scenarios contains vulnerabilities, while an analysis of 576,000 code samples from 16 popular language models showed that 19.7% of package dependencies (440,445 in total) refer to non-existent libraries. This phenomenon, dubbed "package hallucination" (and, when attackers exploit it, "slopsquatting"), creates attack opportunities unprecedented in the history of cybersecurity.


Image from Communications of the ACM

The dynamic is as simple as it is devastating: an AI assistant suggests importing a package that does not actually exist in the official repositories. The developer, trusting the suggestion, tries to install it. At that moment, an attacker who has anticipated this possibility and created a malicious package with that specific name can infiltrate the development environment. According to research reported by The Register, about 5.2% of package suggestions from commercial models do not actually exist, a percentage that rises to 21.7% for open-source models.
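As a simple guardrail, a team can refuse to install any suggested dependency whose name cannot be resolved against the official index. The sketch below, a hypothetical example, queries the public PyPI JSON API (https://pypi.org/pypi/<name>/json) before trusting an AI-suggested package name; the script name and workflow are assumptions for illustration.

```python
import sys
import requests

def exists_on_pypi(package: str, timeout: float = 5.0) -> bool:
    """Check the public PyPI JSON API before trusting an AI-suggested dependency name."""
    resp = requests.get(f"https://pypi.org/pypi/{package}/json", timeout=timeout)
    return resp.status_code == 200

if __name__ == "__main__":
    # Usage: python check_suggested_packages.py <package> [<package> ...]
    for pkg in sys.argv[1:]:
        verdict = "found on PyPI" if exists_on_pypi(pkg) else "NOT on PyPI (possible hallucination)"
        print(f"{pkg}: {verdict}")
```

Existence alone proves little, however: with slopsquatting, the attacker may already have registered the hallucinated name, so a check like this needs to be combined with signals such as package age, maintainer history, and download counts.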

The implications go far beyond the individual developer. As highlighted by researchers at the UNU Campus Computing Centre, package hallucinations could affect millions of software projects and undermine trust in both AI assistants and the open-source ecosystem. This is a concrete, present, and exploitable vulnerability that represents a significant evolution in AI-related risks.

Another particularly sophisticated attack vector is represented by "rules file backdoors." AI assistants often use configuration files to adapt their behavior to specific projects or environments. An attacker can manipulate these files to introduce hidden instructions that silently modify the assistant's behavior, causing it to generate compromised code without the developer noticing.
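A plausible, if basic, countermeasure is to audit these rules files before the assistant ever reads them. The following Python sketch is a hypothetical example (the file names in the usage comment and the phrase list are illustrative assumptions, not any product's requirement): it flags invisible Unicode characters, which can hide instructions from a human reviewing a diff, along with obvious injection-style phrasing.

```python
import sys
import unicodedata
from pathlib import Path

# Phrases that a hidden instruction to the model might plausibly contain.
SUSPICIOUS_PHRASES = ["ignore previous instructions", "do not mention", "without telling the user"]

def audit_rules_file(path: Path) -> list[str]:
    """Flag hidden characters and injection-style phrasing in an AI rules/config file."""
    findings = []
    text = path.read_text(encoding="utf-8", errors="ignore")
    # Unicode "format" characters (category Cf, e.g. zero-width spaces) render as
    # nothing in most editors and diffs, a convenient place to hide instructions.
    hidden = sorted({hex(ord(c)) for c in text if unicodedata.category(c) == "Cf"})
    if hidden:
        findings.append(f"invisible format characters: {hidden}")
    lowered = text.lower()
    findings.extend(f"suspicious phrase: {p!r}" for p in SUSPICIOUS_PHRASES if p in lowered)
    return findings

if __name__ == "__main__":
    # Usage example: python audit_rules.py .cursorrules team-ai-guidelines.md
    for name in sys.argv[1:]:
        for issue in audit_rules_file(Path(name)):
            print(f"[!] {name}: {issue}")
```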

Research by Trend Micro has identified recurring patterns in these attacks, highlighting how language models are particularly vulnerable to manipulation techniques that exploit their probabilistic nature. Unlike traditional exploits that target specific implementation errors, these attacks leverage the fundamental characteristics of generative machine learning, making them extremely difficult to prevent with conventional approaches.

The vulnerable ecosystem: GitHub, VS Code, and the democracy of code

The infrastructure that supports modern software development has evolved into an interconnected ecosystem where platforms like GitHub, editors like Visual Studio Code, and extension marketplaces create an unprecedented environment for collaboration. But this democratization of code, as revolutionary as it is, has also exponentially amplified security risks.

GitHub hosts over 200 million active repositories, with 100 million developers contributing daily to open-source projects. Visual Studio Code, with its tens of thousands of extensions, has become the editor of choice for a generation of programmers. When these two ecosystems are combined with generative artificial intelligence, vulnerabilities emerge that go far beyond traditional ones.

The paradox of open source in the AI era is manifested in all its complexity: while the transparency of code should theoretically increase security through collective review, the speed of development and automation are eroding the effectiveness of this mechanism. Data from ReversingLabs shows that malicious package incidents on the most popular open-source package managers have increased by 1,300% in the last three years, an increase that coincides with the massive adoption of AI assistants.

Statistics on compromised plugins reveal the alarming dimensions of the problem. Thousands of extensions for VS Code are published every month, many of them integrated with artificial intelligence features. The review process, although improved over the years, cannot keep up with the volume of publications. Research reported by The Hacker News identified over 22,000 PyPI projects vulnerable to "dependency confusion" attacks, a figure that becomes even more worrying when considering the integration of these packages into coding assistants.

Image from The Hacker News

The network effect of the GitHub ecosystem further amplifies the risks. A single compromised repository can affect hundreds of dependent projects, creating a cascading effect that propagates through the entire software supply chain. When this mechanism is combined with AI assistants that draw from these same repositories to generate suggestions, the result is an attack surface of unprecedented proportions.

The culture of "continuous integration" and "fast development" has also changed developers' approach to code review. The pressure for rapid releases and frequent iterations has led to a progressive automation of checks, often at the expense of a thorough human evaluation. AI assistants, in this context, are perceived as productivity accelerators rather than potential risk vectors.

The human factor: when trust becomes a weakness

The most subtle and dangerous element in the security equation of AI assistants is the human factor. The psychology of trust in digital assistants is creating vulnerabilities that go far beyond technological ones, introducing cognitive biases that cybercriminals are learning to exploit with increasing sophistication.

Academic research has identified a worrying phenomenon called "automation bias" - the tendency of humans to blindly accept the recommendations of algorithms. In the context of software development, this bias manifests as reduced critical attention to the code suggested by AI assistants. Developers, pressed by deadlines and reassured by the apparent competence of language models, tend to incorporate suggestions without due verification.

The situation is aggravated by what researchers call the "expertise transfer illusion." Developers, accustomed to recognizing elegant patterns and solutions in human code, apply the same evaluation criteria to AI-generated code, without considering that language models operate with probabilistic logics that are fundamentally different from human ones. As Mithilesh Ramaswamy, a senior engineer at Microsoft, explains, "hallucinations in AI coding tools occur due to the probabilistic nature of AI models, which generate outputs based on statistical probabilities rather than deterministic logic."

Empirical studies have quantified the impact of these cognitive biases on security practices. One academic study found that 29.8% of 452 code snippets generated by Copilot contained security weaknesses, while another found that Copilot's suggestions contained exploitable vulnerabilities about 40% of the time. Even more worrying, a roughly equal share of the vulnerable code was ranked as a "top-level choice," making it more likely to be adopted by developers.

The phenomenon of automation bias intensifies in high-pressure work environments, where development speed is prioritized over security. Junior developers, in particular, show an even more marked tendency to trust AI suggestions, often lacking the experience necessary to identify suspicious patterns or inadequate security practices.

A survey of IT leaders revealed that 60% consider the impact of AI coding errors to be very or extremely significant, yet organizations continue to adopt these tools without implementing adequate risk mitigation measures. This contradiction highlights a critical gap between risk perception and the implementation of effective controls.

The psychological dynamic becomes particularly insidious when considering the "conversational" nature of many modern AI assistants. The chat interface, which simulates human interaction, unconsciously activates social trust mechanisms, leading users to treat the AI assistant as an expert colleague rather than a fallible algorithmic tool.

Countermeasures: emerging technologies and methodologies

The response to the emerging threat of compromised AI assistants requires a multi-layered approach that combines advanced technological solutions, renewed development methodologies, and security frameworks specifically designed for the era of generative artificial intelligence. The industry is developing a new generation of defense tools that go far beyond traditional approaches to code security.

The concept of "human-in-the-loop" is evolving from a simple design principle to a structured security control methodology. The most advanced implementations involve multi-level review systems, where the output of AI assistants is subjected to specialized automated checks before reaching the developer. These systems use advanced static analysis, behavioral pattern matching, and machine learning techniques to identify anomalies that could indicate the presence of malicious code or unintentionally introduced vulnerabilities.

Automatic auditing of exploit patterns represents a particularly promising frontier. Researchers are developing systems that can identify signs of prompt injection, package hallucination, and other AI-native attack techniques in real time. These tools use semantic analysis of the code to detect patterns that may be syntactically harmless but behaviorally dangerous.
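In its simplest form, such an audit can be a static gate in front of the assistant's output. The sketch below is a deliberately minimal, hypothetical example in Python, far simpler than the semantic analysis described above: it parses a suggested snippet with the standard ast module and flags constructs that are syntactically harmless but behaviorally risky, such as eval/exec or subprocess calls with shell=True.

```python
import ast

RISKY_CALLS = {"eval", "exec"}
RISKY_ATTRS = {("os", "system"), ("subprocess", "call"), ("subprocess", "Popen"), ("subprocess", "run")}

def audit_snippet(source: str) -> list[str]:
    """Flag risky constructs in an AI-suggested snippet before it is accepted."""
    findings = []
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"does not parse: {exc}"]
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in RISKY_CALLS:
            findings.append(f"line {node.lineno}: call to {func.id}()")
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            if (func.value.id, func.attr) in RISKY_ATTRS:
                findings.append(f"line {node.lineno}: call to {func.value.id}.{func.attr}()")
                # shell=True widens the attack surface further
                for kw in node.keywords:
                    if kw.arg == "shell" and isinstance(kw.value, ast.Constant) and kw.value.value is True:
                        findings.append(f"line {node.lineno}: shell=True")
    return findings

if __name__ == "__main__":
    suggestion = "import subprocess\nsubprocess.run(user_input, shell=True)"
    for finding in audit_snippet(suggestion):
        print("[!]", finding)
```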

Sandboxing of AI assistants is emerging as a standard practice in the most secure organizations. Instead of allowing assistants to directly access the development environment, these systems create isolated environments where the generated code can be tested and examined before integration. The most sophisticated implementations use dedicated Docker containers and virtualized environments that simulate the production environment without exposing critical resources.
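As an illustration, a minimal sandbox wrapper might look like the sketch below. It assumes Docker is installed locally and uses the public python:3.12-slim image; the resource limits and mount options are standard Docker flags chosen for illustration. Generated code runs with no network access, a read-only filesystem, and capped memory and CPU.

```python
import subprocess
import tempfile
from pathlib import Path

def run_in_sandbox(generated_code: str, timeout: int = 30) -> subprocess.CompletedProcess:
    """Execute AI-generated code in a throwaway container with no network access."""
    workdir = Path(tempfile.mkdtemp())
    script = workdir / "snippet.py"
    script.write_text(generated_code, encoding="utf-8")
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",          # no outbound connections
            "--read-only",                # immutable root filesystem
            "--memory", "256m",           # cap memory
            "--cpus", "0.5",              # cap CPU
            "-v", f"{workdir}:/work:ro",  # mount the snippet read-only
            "python:3.12-slim",
            "python", "/work/snippet.py",
        ],
        capture_output=True, text=True, timeout=timeout,
    )

if __name__ == "__main__":
    result = run_in_sandbox("print('hello from the sandbox')")
    print(result.stdout, result.stderr)
```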

Security frameworks specific to generative AI are defining new industry standards. In July 2024, NIST released a framework dedicated to generative artificial intelligence risk management that includes over 200 suggested actions for managing 12 different categories of AI risk, while organizations like OWASP are updating their recommendations to cover AI-native vulnerabilities such as prompt injection and package hallucination.

On the front of emerging best practices, many organizations are implementing "zero-trust AI" policies, where every suggestion generated by artificial intelligence must pass through explicit security checks before adoption. This approach includes automatic verification of the existence of suggested packages, behavioral analysis of the proposed code, and validation of dependencies through real-time updated security databases.
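For the dependency-validation step, one option is to query a public vulnerability database before a suggested package is accepted. The sketch below assumes the public OSV.dev query API (https://api.osv.dev/v1/query); the package name and version in the example are arbitrary, and a real gate would also cover the existence and provenance checks described above.

```python
import requests

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

def known_vulnerabilities(package: str, version: str, ecosystem: str = "PyPI") -> list[str]:
    """Ask the public OSV.dev database whether a proposed dependency has known advisories."""
    payload = {"version": version, "package": {"name": package, "ecosystem": ecosystem}}
    resp = requests.post(OSV_QUERY_URL, json=payload, timeout=10)
    resp.raise_for_status()
    return [v.get("id", "unknown") for v in resp.json().get("vulns", [])]

if __name__ == "__main__":
    # Example: a dependency an assistant might propose; block the merge if advisories exist.
    advisories = known_vulnerabilities("requests", "2.19.1")
    if advisories:
        print("blocked, known advisories:", advisories)
    else:
        print("no known advisories for this version")
```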

The most innovative solutions are exploring the use of AI to fight AI, developing specialized language models for detecting malicious code generated by other models. These "guardian models" are trained specifically to recognize the typical patterns of AI-native attacks and can operate as real-time filters on the output of coding assistants.

The future of security in the era of generative AI

The evolution of the threat posed by compromised AI assistants is forcing the cybersecurity industry to fundamentally rethink its paradigms. The regulatory challenges on the horizon require a delicate balance between technological innovation and user protection, while security standards will have to evolve to address risks that were unthinkable just a few years ago.

Gartner's predictions indicate that by 2025, 45% of organizations worldwide will experience attacks on their software supply chains, a threefold increase compared to 2021. This trend, combined with the growing reliance on AI assistants, suggests that we are only at the beginning of a radical transformation of the cybersecurity threat landscape.

The exponential growth of the Python ecosystem, estimated to reach 530 billion package requests by the end of 2024 with an 87% year-on-year increase, is largely driven by the adoption of AI and the cloud. However, this growth brings with it proportional risks: the infiltration of open-source malware into development ecosystems is occurring at an alarming rate.

The industry is already taking the first steps towards more rigorous security standards. Initiatives such as the Software Package Data Exchange (SPDX) and the Supply Chain Levels for Software Artifacts (SLSA) are evolving to incorporate specific considerations for generative AI. Emerging frameworks provide for attestation systems that can verify not only the provenance of the code, but also the process through which it was generated and validated.

Government regulation is beginning to move towards recognizing these emerging risks. The European Union, with the AI Act, has already laid the groundwork for regulation that includes considerations for high-risk AI systems used in critical contexts. The United States is developing similar frameworks through the National Institute of Standards and Technology (NIST).

The future will likely see the emergence of new professions and specializations in the field of cybersecurity. "AI Security Engineers" will become increasingly sought-after figures, with skills ranging from understanding language models to designing AI-native defense systems. Developer training will need to incorporate new skills related to the security of AI assistants and the recognition of AI-specific vulnerabilities.

Technological evolution suggests that we will witness the development of increasingly sophisticated digital "immune systems," capable of dynamically adapting to new types of AI-native threats. These systems will use adversarial machine learning techniques to anticipate and neutralize attacks before they can cause significant damage.

The Amazon Q case, with its combination of technical simplicity and strategic sophistication, is just a taste of what might be to come. Attackers are already developing more advanced techniques that exploit the peculiarities of next-generation language models, while the attack surface continues to expand with the integration of AI into every aspect of the software development lifecycle.

The fundamental challenge remains to maintain the revolutionary benefits of generative artificial intelligence in software development, while mitigating risks that could compromise the security of the entire global digital infrastructure. The response will require unprecedented collaboration between developers, security researchers, regulators, and technology providers, united in building a development ecosystem that is both innovative and resilient to the threats of the future.


The investigation into the Amazon Q case and the analysis of emerging threats in the AI assistant ecosystem are based on verified public sources and peer-reviewed academic research. The implications discussed reflect the current state of knowledge in a rapidly evolving field, where new vulnerabilities and solutions emerge daily.