HexaSec/AI Assurance Gate
Flagship product · in pilot

Test, harden and evidence AI assistants before release.

HexaSec AI Assurance Gate provides repeatable security regression testing for LLM and RAG-enabled assistants, producing deterministic gate decisions and audit-ready evidence.

Deployment
Local · offline-friendly
Evaluation
Deterministic
Output
Gate decision + bundle
Use point
Pre-release · post-change

The problem

AI assistants are connecting to everything — with very little evidence behind them.

Before release, organisations need evidence that their assistants will not leak sensitive data, follow poisoned retrieval content, ignore policy or trigger unsafe tool actions.

AAG provides that evidence on every change — not just at launch.

Prompt injection

Direct and indirect injection through inputs, documents, search results and tool outputs.

Data leakage

Sensitive content surfacing in responses, citations, retrieval traces or tool calls.

Retrieval poisoning

Adversarial content embedded into corpora or hosted documents to alter behaviour.

Unsafe tool actions

Assistants calling tools in ways the policy forbids — quietly, at scale.


What AAG does

Eight things, deterministically.

01

Scenario packs

Structured, versioned packs of LLM and RAG security scenarios you can extend.

02

Deterministic detectors

Pattern, structural and policy checks — not opaque judge-model scores.

03

Policy-as-code

Explicit rules for what assistants may say, retrieve, cite or call.

04

Tool-action validation

Catch unsafe, out-of-policy or unintended tool calls before they execute.

05

Retrieval trace analysis

Inspect what was retrieved, ranked and used — not just what was said.

06

Evidence pack generation

Signed bundles for assurance, procurement and change control.

07

Gate decision output

GO · CONDITIONAL · NO-GO — reproducible across runs.

08

Re-test on change

Re-run the gate on every model, prompt, policy, tool or corpus change.


How it works

A repeatable assurance flow.

From adapter to evidence bundle — deterministic at every step.

01

Connect

Attach to the assistant through a local adapter.

02

Run packs

Execute structured security scenario packs.

03

Capture

Record responses, retrieval events and tool attempts.

04

Check

Apply deterministic detectors and policy rules.

05

Decide

Produce a reproducible gate decision.

06

Evidence

Generate a signed evidence bundle.

07

Re-test

Re-run on every relevant change.


Gate outcomes

Three states. No grey-zone judgement.

Every run produces a single, reproducible decision tied to the bundle of evidence behind it.

OUTCOME · 01
GO

All scenarios passed and all policy checks were satisfied. The assistant meets the configured assurance bar.

Use point · pre-release approval
OUTCOME · 02
CONDITIONAL

Acceptable risk profile with named exceptions. Owner sign-off required against tracked conditions.

Use point · controlled release
OUTCOME · 03
NO-GO

Material failures, unsafe behaviours or policy violations were detected. Release is blocked pending remediation.

Use point · blocker

What AAG tests for

A defensible surface, not a buzzword list.

AAG ships with structured packs covering the failure modes that matter for LLM and RAG-enabled assistants in regulated environments.

Direct prompt injection
User-side injection of override instructions
User input
Indirect prompt injection
Injected payloads in retrieved or tool-returned content
Retrieval · tools
Sensitive data leakage
PII, secrets and policy-restricted content exfil paths
Egress
Retrieval poisoning
Adversarial documents seeded into the RAG corpus
Corpus
Citation laundering
Plausible-looking but fabricated, mis-attributed citations
Trust
Unsafe tool calls
Out-of-policy actions, destructive operations, scope creep
Actions
Policy bypass attempts
Jailbreaks, role-shifts and instruction-override patterns
Policy
Evidence gaps
Cases where the assistant cannot ground or justify its output
Grounding

Evidence model

What a run leaves behind.

Every gate decision is anchored to a signed evidence bundle that an external reviewer can inspect — without needing access to the live system.

manifest.json
Run identity, config hashes, environment, target assistant.
results.ndjson
Per-scenario outcomes, inputs, captured outputs and detectors triggered.
policy.decisions
Policy-as-code evaluations and outcome for each check.
tool.traces
Every attempted tool call, arguments and policy verdict.
retrieval.traces
Retrieval queries, candidate docs, ranking and grounded citations.
assurance.graph
Linked graph of scenarios → detectors → policies → decisions.
report.html
Printable, reviewer-friendly summary report.
bundle.sig
Detached signature over the full bundle for integrity.
Evidence bundle · run-0247 · contents
manifest.json2.1 KB
results.ndjson412 KB
policy.decisions8.4 KB
tool.traces17.9 KB
retrieval.traces84.2 KB
assurance.graph11.0 KB
report.html196 KB
bundle.sig512 B
SIGNATURE · ed25519 · VALID
DIGEST · sha256:7a4f…9e02

Scope

What AAG is — and what it isn't.

Honest scoping makes assurance work. AAG is a focused tool; some adjacent jobs belong elsewhere.

AAG IS
  • An assurance gate for LLM & RAG assistants
  • A security regression test harness
  • A deterministic evidence generator
  • A local-first testing tool
  • A pre-release and post-change control mechanism
AAG IS NOT
  • A chatbot or general LLM platform
  • A SOC or SIEM platform
  • A GRC platform
  • A replacement for human approval
  • A live offensive red-team toolkit
Get in touch

Run AAG against your assistant.

Pilots are scoped, time-boxed and run against your environment. We share the pilot pack, one-pager and technical documents on request.