HexaSec/AI Assurance Gate

Flagship product · in pilot

Test, harden and evidence AI assistants before release.

HexaSec AI Assurance Gate (AAG) provides repeatable security regression testing for LLM and RAG-enabled assistants, producing deterministic gate decisions and audit-ready evidence.

Request a pilot · Contact HexaSec

Deployment

Local · offline-friendly

Evaluation

Deterministic

Output

Gate decision + bundle

Use point

Pre-release · post-change

The problem

AI assistants are connecting to everything — with very little evidence behind them.

Before release, organisations need evidence that their assistants will not leak sensitive data, follow poisoned retrieval content, ignore policy or trigger unsafe tool actions.

AAG provides that evidence on every change — not just at launch.

⌖Prompt injection

Direct and indirect injection through inputs, documents, search results and tool outputs.

⌖Data leakage

Sensitive content surfacing in responses, citations, retrieval traces or tool calls.

⌖Retrieval poisoning

Adversarial content embedded into corpora or hosted documents to alter behaviour.

⌖Unsafe tool actions

Assistants calling tools in ways the policy forbids — quietly, at scale.

What AAG does

Eight things, deterministically.

Scenario packs

Structured, versioned packs of LLM and RAG security scenarios you can extend.

Deterministic detectors

Pattern, structural and policy checks — not opaque judge-model scores.

Policy-as-code

Explicit rules for what assistants may say, retrieve, cite or call.

Tool-action validation

Catch unsafe, out-of-policy or unintended tool calls before they execute.

Retrieval trace analysis

Inspect what was retrieved, ranked and used — not just what was said.

Evidence pack generation

Signed bundles for assurance, procurement and change control.

Gate decision output

GO · CONDITIONAL · NO-GO — reproducible across runs.

Re-test on change

Re-run the gate on every model, prompt, policy, tool or corpus change.
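
To make "deterministic detectors" and "policy-as-code" concrete, here is a minimal Python sketch of one pattern check and one tool-call rule. It is an illustration only, not AAG's actual interface: the names `SECRET_PATTERNS`, `TOOL_POLICY`, `ToolCall`, `detect_secrets` and `check_tool_call`, the example tool names and the verdict strings are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical secret patterns; a real scenario pack would define its own.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

# Hypothetical policy: tool name -> argument keys the assistant may pass.
TOOL_POLICY = {
    "search_docs": {"query"},
    "create_ticket": {"title", "body"},
}


@dataclass(frozen=True)
class ToolCall:
    """One tool call the assistant attempted (assumed shape)."""
    name: str
    args: dict


def detect_secrets(text: str) -> list[str]:
    """Return the sorted names of every pattern that matches the text."""
    return sorted(name for name, pat in SECRET_PATTERNS.items() if pat.search(text))


def check_tool_call(call: ToolCall) -> str:
    """Return 'allow' or a 'deny:<reason>' verdict for one attempted call."""
    allowed = TOOL_POLICY.get(call.name)
    if allowed is None:
        return "deny:tool_not_in_policy"
    extra = sorted(set(call.args) - allowed)
    if extra:
        return "deny:unexpected_args:" + ",".join(extra)
    return "allow"
```

Both functions depend only on their inputs and fixed rules, so the same captured response or tool attempt yields the same verdict on every run. That property is what separates this style of check from a judge-model score.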

How it works

A repeatable assurance flow.

From adapter to evidence bundle — deterministic at every step.

Connect

Attach to the assistant through a local adapter.

Run packs

Execute structured security scenario packs.

Capture

Record responses, retrieval events and tool attempts.

Check

Apply deterministic detectors and policy rules.

Decide

Produce a reproducible gate decision.

Evidence

Generate a signed evidence bundle.

Re-test

Re-run on every relevant change.
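
One way to decide when a re-test is due is to fingerprint each part of the assistant's configuration and compare it with the previous run. The Python sketch below assumes a configuration dictionary with hypothetical keys (`model`, `prompt`, `policy`, `tools`, `corpus`) and hashes each part as canonical JSON. It does not describe how AAG itself tracks changes.

```python
import hashlib
import json

# Hypothetical component names, mirroring the change types named on this page.
COMPONENTS = ("model", "prompt", "policy", "tools", "corpus")


def fingerprint(config: dict) -> dict:
    """Hash each component's configuration with SHA-256 over canonical JSON."""
    out = {}
    for name in COMPONENTS:
        # sort_keys and fixed separators make the encoding order-independent.
        canonical = json.dumps(config.get(name), sort_keys=True, separators=(",", ":"))
        out[name] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return out


def changed_components(previous: dict, current: dict) -> list[str]:
    """List the components whose hash differs between two fingerprints."""
    return [name for name in COMPONENTS if previous.get(name) != current.get(name)]
```

A non-empty result from `changed_components` would be the trigger to run the gate again. Canonical JSON means that reordering keys in a policy file does not count as a change, while any edit to its values does.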

Gate outcomes

Three states. No grey-zone judgement.

Every run produces a single, reproducible decision tied to the bundle of evidence behind it.

OUTCOME · 01

GO

All scenarios passed and all policy checks were satisfied. The assistant meets the configured assurance bar.

Use point · pre-release approval

OUTCOME · 02

CONDITIONAL

Acceptable risk profile with named exceptions. Owner sign-off required against tracked conditions.

Use point · controlled release

OUTCOME · 03

NO-GO

Material failures, unsafe behaviours or policy violations were detected. Release is blocked pending remediation.

Use point · blocker
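
The three outcomes can be expressed as a small pure function. This Python sketch works under a simplifying assumption that is not taken from AAG's documentation: a run is CONDITIONAL when every failed scenario is covered by a named, accepted exception, and NO-GO when any failure is not. `Gate` and `decide` are hypothetical names.

```python
from enum import Enum


class Gate(Enum):
    GO = "GO"
    CONDITIONAL = "CONDITIONAL"
    NO_GO = "NO-GO"


def decide(failed_scenarios: set[str], accepted_exceptions: set[str]) -> Gate:
    """Map one run's failures to a single gate state (assumed rule)."""
    if not failed_scenarios:
        # Nothing failed: the configured assurance bar is met.
        return Gate.GO
    if failed_scenarios <= accepted_exceptions:
        # Every failure is a named exception that still needs owner sign-off.
        return Gate.CONDITIONAL
    # At least one failure has no accepted exception: block the release.
    return Gate.NO_GO
```

Because the function looks only at the set of failures and the set of accepted exceptions, replaying the same run produces the same state.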

What AAG tests for

A defensible surface, not a buzzword list.

AAG ships with structured packs covering the failure modes that matter for LLM and RAG-enabled assistants in regulated environments.

Direct prompt injection

User-side injection of override instructions

User input

Indirect prompt injection

Injected payloads in retrieved or tool-returned content

Retrieval · tools

Sensitive data leakage

Exfiltration paths for PII, secrets and policy-restricted content

Egress

Retrieval poisoning

Adversarial documents seeded into the RAG corpus

Corpus

Citation laundering

Plausible-looking but fabricated or mis-attributed citations

Trust

Unsafe tool calls

Out-of-policy actions, destructive operations, scope creep

Actions

Policy bypass attempts

Jailbreaks, role-shifts and instruction-override patterns

Policy

Evidence gaps

Cases where the assistant cannot ground or justify its output

Grounding

Evidence model

What a run leaves behind.

Every gate decision is anchored to a signed evidence bundle that an external reviewer can inspect — without needing access to the live system.

manifest.json

Run identity, config hashes, environment, target assistant.

results.ndjson

Per-scenario outcomes, inputs, captured outputs and detectors triggered.

policy.decisions

Policy-as-code evaluations and the outcome of each check.

tool.traces

Every attempted tool call, with its arguments and policy verdict.

retrieval.traces

Retrieval queries, candidate docs, ranking and grounded citations.

assurance.graph

Linked graph of scenarios → detectors → policies → decisions.

report.html

Printable, reviewer-friendly summary report.

bundle.sig

Detached signature over the full bundle for integrity.
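
Because results are stored as newline-delimited JSON, a reviewer can summarise a run with a few lines of standard-library Python. The sketch below assumes each record carries an `outcome` field. That field name, and the `scenario_id` field in the usage example, are illustrative guesses and not the documented schema.

```python
import json
from collections import Counter


def tally_outcomes(ndjson_text: str) -> Counter:
    """Count records per outcome in NDJSON text, skipping blank lines."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)      # one JSON object per line
        counts[record["outcome"]] += 1  # "outcome" is an assumed field name
    return counts
```

For example, `tally_outcomes('{"scenario_id": "a", "outcome": "pass"}\n{"scenario_id": "b", "outcome": "fail"}')` counts one `pass` and one `fail`.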

Evidence bundle · run-0247 · contents

manifest.json · 2.1 KB
results.ndjson · 412 KB
policy.decisions · 8.4 KB
tool.traces · 17.9 KB
retrieval.traces · 84.2 KB
assurance.graph · 11.0 KB
report.html · 196 KB
bundle.sig · 512 B

SIGNATURE · ed25519 · VALID
DIGEST · sha256:7a4f…9e02
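
A reviewer could recompute a bundle digest offline and compare it with the recorded one. The Python sketch below shows one possible scheme and is not AAG's actual format: it hashes every file except the signature, in name order, with length prefixes. The function name `bundle_digest` and the canonicalisation are assumptions. Checking the ed25519 signature is left out because it needs a library beyond the Python standard library.

```python
import hashlib


def bundle_digest(files: dict[str, bytes], signature_name: str = "bundle.sig") -> str:
    """Compute a SHA-256 digest over bundle files in a fixed, assumed order."""
    h = hashlib.sha256()
    for name in sorted(files):
        if name == signature_name:
            continue  # the detached signature is not part of what it signs
        encoded_name = name.encode("utf-8")
        data = files[name]
        # Length-prefix name and content so file boundaries are unambiguous.
        h.update(len(encoded_name).to_bytes(8, "big"))
        h.update(encoded_name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return "sha256:" + h.hexdigest()
```

The `files` mapping can be built from a bundle directory by reading each file's bytes with `pathlib.Path.read_bytes()`. Sorting by name keeps the digest independent of the order in which files are listed.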

Scope

What AAG is — and what it isn't.

Honest scoping makes assurance work. AAG is a focused tool; some adjacent jobs belong elsewhere.

AAG IS

An assurance gate for LLM & RAG assistants
A security regression test harness
A deterministic evidence generator
A local-first testing tool
A pre-release and post-change control mechanism

AAG IS NOT

A chatbot or general LLM platform
A SOC or SIEM platform
A GRC platform
A replacement for human approval
A live offensive red-team toolkit

Get in touch

Run AAG against your assistant.

Pilots are scoped, time-boxed and run against your environment. We share the pilot pack, one-pager and technical documents on request.

Request a pilot · Contact HexaSec