Claude Code, Gemini CLI, GitHub Copilot Agents Vulnerable to Prompt Injection via Comments

A researcher has disclosed the details of a prompt injection attack method named ‘Comment and Control’, which has been found to work against several popular AI code security and automation tools.

The attack method was discovered by security engineer and vulnerability researcher Aonan Guan, with assistance from Johns Hopkins University researchers Zhengyu Liu and Gavin Zhong.

In a blog post published on Wednesday, Guan said the attack has been confirmed to work against several widely used AI agents: Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub Copilot Agent.

The researchers found that AI agents associated with these tools on GitHub Actions can be hijacked using specially crafted GitHub comments, including PR titles, comments, and issue bodies.

In the case of Claude Code Security Review, designed for automated security reviews, the researchers showed how an attacker could use a specially crafted PR title to trick the AI agent into executing arbitrary commands, extracting credentials, and revealing them as a security finding or an entry in the GitHub Actions log.

For Gemini CLI Action, which acts as an autonomous agent for routine coding tasks, the researchers used an issue comment with a prompt-injection title, along with specially crafted issue comments, to bypass guardrails and obtain a full API key.

Advertisement. Scroll to continue reading.

In the Comment and Control attack aimed at GitHub Copilot Agent, the experts leveraged an HTML comment, which hides the payload, to bypass environment filtering, scan for secrets, and bypass the network firewall.

The Comment and Control attack can pose a serious threat, as the attacker’s malicious prompt is automatically triggered by GitHub Actions workflows, without any action from the victim — except in the case of Copilot, where the attacker’s issue must be manually assigned to Copilot by the victim.

“The pattern likely applies to any AI agent that ingests untrusted GitHub data and has access to execution tools in the same runtime as production secrets — and beyond GitHub Actions, to any agent that processes untrusted input with access to tools and secrets: Slack bots, Jira agents, email agents, deployment automation. The injection surface changes, but the pattern is the same,” Guan explained.

The findings have been reported to Anthropic, Google, and GitHub, and all have confirmed them. Anthropic classified the issue as ‘critical’ and implemented some mitigations, awarding a $100 bug bounty to the researchers. Google paid out a $1,337 bug bounty.

GitHub awarded the researchers $500, saying that their work “sparked some great internal discussions”, but classified the security issue as a known architectural limitation.

“This is the first public cross-vendor demonstration of a single prompt injection pattern across three major AI agents. All three vulnerabilities follow the same pattern: untrusted GitHub data → AI agent processes it → agent executes commands → credentials exfiltrated through GitHub itself,” Guan said.

“The deeper issue is architectural: these AI agents are given powerful tools (bash execution, git push, API calls) and secrets (API keys, tokens) in the same runtime that processes untrusted user input. Even when multiple layers of defense exist — model-level, prompt-level, and GitHub’s additional three runtime layers — they can all be bypassed because the prompt injection here is not a bug; it is context that the agent is designed to process,” he added.

Originally published by SecurityWeek

Full Analysis