Custom Scoring Evals

SquireX automatically generates Custom Scoring Evaluations when creating Testing Center YAML specs. These map directly to Salesforce's Agentforce Testing Center evaluation framework.

How It Works

When you run `squireinterp generate-tests`, SquireX:

1. Scans your metadata for security violations
2. Groups violations by category
3. Generates test cases for each violation
4. Creates Custom Scoring Evals for each violation category
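
The grouping and generation steps above can be sketched in a few lines of Python. This is only an illustration, not SquireX's actual implementation: the `build_spec` function, the `category` field on each violation, and the `<category>_Score` naming rule are assumptions made for the example.

```python
from collections import defaultdict


def build_spec(violations):
    """Hypothetical sketch: one test per violation, one eval per category."""
    # Group violations by category (the "category" field is an assumption).
    by_category = defaultdict(list)
    for violation in violations:
        by_category[violation["category"]].append(violation)

    # One test case per violation.
    tests = [{"name": v["ruleId"], "ruleId": v["ruleId"]} for v in violations]

    # One scoring eval per category, linked to the rules seen in that category.
    evaluations = [
        {
            "name": f"{category}_Score",
            "linkedRules": sorted({v["ruleId"] for v in items}),
        }
        for category, items in by_category.items()
    ]
    return {"tests": tests, "evaluations": evaluations}


# Example input; a real scan would supply the violations.
violations = [
    {"ruleId": "AGENTFORCE-AF-02", "category": "Agent_Fabric"},
    {"ruleId": "AGENTFORCE-AF-01", "category": "Agent_Fabric"},
    {"ruleId": "RD-01", "category": "Runtime_Drift"},
]
spec = build_spec(violations)
```

Here `spec["evaluations"]` holds one entry per category, so the two Agent Fabric violations share a single `Agent_Fabric_Score` eval while each still gets its own test case.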

Generated Eval Categories

| Eval Name | Linked Rules | Metric Type |
| --- | --- | --- |
| `Runtime_Drift_Score` | RD-01, RD-02, RD-03, RD-04 | Binary |
| `MCP_Security_Score` | MCP-01 through MCP-07 | Percentage |
| `Supply_Chain_Score` | SC-10, SC-11, SC-12 | Binary |
| `Agent_Fabric_Score` | AF-01 through AF-05 | Percentage |
| `Agency_Boundary_Score` | 1.1, 2.1, 4.1, etc. | Binary |
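
The table does not spell out how the two metric types are calculated. One plausible reading, shown here purely as an assumption, is that a binary eval passes only when every linked rule passes, while a percentage eval reports the share of linked rules that pass. The `score` function and its argument shapes are hypothetical.

```python
def score(metric_type, results):
    """Hypothetical scoring sketch.

    results maps a rule ID to True (passed) or False (failed).
    """
    if not results:
        raise ValueError("at least one rule result is required")

    passed = sum(1 for ok in results.values() if ok)

    if metric_type == "binary":
        # Assumption: all-or-nothing across the linked rules.
        return 1.0 if passed == len(results) else 0.0
    if metric_type == "percentage":
        # Assumption: share of linked rules that passed, from 0 to 100.
        return 100.0 * passed / len(results)
    raise ValueError(f"unknown metric type: {metric_type}")


drift = score("binary", {"RD-01": True, "RD-02": False})
fabric = score(
    "percentage",
    {"AF-01": True, "AF-02": False, "AF-03": True, "AF-04": True},
)
```

Under these assumptions a single failing rule zeroes a binary eval, whereas a percentage eval degrades gradually.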

CLI Usage

# Generate YAML with scoring evals
squireinterp generate-tests scan-request.json --output tests.yaml

Output Example

tests:
  - name: "AGENTFORCE-AF-02 — LLM Provider Without Rate Limit"
    ruleId: AGENTFORCE-AF-02
    description: "LLM provider 'GPT4Provider' has no rate limiting configured."
    target:
      type: AgentFabric
      name: GPT4Provider
    assertion:
      expectsGuardrail: rate-limit

evaluations:
  - name: Agent_Fabric_Score
    description: "Evaluates MuleSoft Agent Fabric governance"
    metricType: percentage
    passCriteria: ">= 100% of fabric components pass governance checks"
    linkedRules:
      - AGENTFORCE-AF-01
      - AGENTFORCE-AF-02
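
A quick sanity check on a generated spec is to confirm that every test's `ruleId` appears in the `linkedRules` of some evaluation. The sketch below works on a plain dictionary that mirrors the example above; in practice you would first load `tests.yaml` with a YAML parser such as PyYAML's `yaml.safe_load`. The `unlinked_rule_ids` helper is hypothetical and not part of SquireX.

```python
def unlinked_rule_ids(spec):
    """Return the ruleIds of tests that no evaluation links to."""
    linked = {
        rule
        for evaluation in spec.get("evaluations", [])
        for rule in evaluation.get("linkedRules", [])
    }
    return [
        test["ruleId"]
        for test in spec.get("tests", [])
        if test["ruleId"] not in linked
    ]


# Dictionary mirroring the relevant fields of the example output.
spec = {
    "tests": [{"ruleId": "AGENTFORCE-AF-02"}],
    "evaluations": [
        {
            "name": "Agent_Fabric_Score",
            "linkedRules": ["AGENTFORCE-AF-01", "AGENTFORCE-AF-02"],
        }
    ],
}
missing = unlinked_rule_ids(spec)
```

An empty `missing` list means every generated test contributes to at least one scoring eval.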

Integration with `sf agent test run`

The generated YAML is compatible with the Salesforce CLI:

# Run generated tests
sf agent test run --spec-file tests.yaml

# Run with specific eval
sf agent test run --spec-file tests.yaml --eval Agent_Fabric_Score
