AGENTFORCE-9.1: Metadata Instruction Poisoning
๐จ Critical ยท Instruction Integrity
Detects adversarial content patterns in metadata instruction fields (GenAiPlugin instructions, GenAiFunction descriptions, Agent Script systemInstructionOverrides, PromptTemplate content) that could manipulate the LLM planner into performing unauthorized actions. These fields are read by the AI agent but are often invisible to human reviewers in the Agent Builder UI, making them the primary attack surface for indirect prompt injection (ref: ForcedLeak CVE, OWASP LLM01).
Detailsโ
| Field | Value |
|---|---|
| Rule ID | AGENTFORCE-9.1 |
| Severity | Critical |
| Category | Instruction Integrity |
| Compliance | EU_AI_ACT_HIGH_RISK, NIST_AI_RMF |
Remediationโ
Refer to the SquireX documentation for
remediation guidance specific to AGENTFORCE-9.1.