Privacy
GPT-2 generates real-looking PII on 1 in 5 prompts: the other models are clean
Zero memorisation across all models: the risk is generative fabrication, not training-data leakage
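A minimal sketch of how the memorisation check behind this finding might work, under one stated assumption: `training_corpus` is a hypothetical stand-in for an index of training-data strings. A generated PII string counts as leakage only if it appears verbatim in the training data; anything else is fabricated, i.e. generative risk.

```python
def classify_pii_risk(generated_pii, training_corpus):
    """Split PII-looking strings from model output into leaked
    (verbatim in training data) vs fabricated (generated) strings.

    training_corpus is a hypothetical set of training-data strings.
    """
    leaked = [p for p in generated_pii if p in training_corpus]
    fabricated = [p for p in generated_pii if p not in training_corpus]
    return leaked, fabricated

# Zero overlap with training data -> the risk is generative, not leakage
leaked, fabricated = classify_pii_risk(
    ["jane.doe@example.com", "555-0142"],
    training_corpus={"real.person@corp.com"},
)
```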
Fairness
Sexual orientation bias is the worst blind spot: all models score below 0.25
Every model scores below random chance in this category
Robustness
BLOOM and OPT collapse on typos: the GPT-2 family holds steady
A single character swap is enough to derail BLOOM completely
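The single-character-swap probe referred to above can be sketched as a minimal typo perturbation; the function name and seeding are illustrative assumptions, not the benchmark's actual implementation.

```python
import random

def swap_one_char(text, seed=0):
    """Swap two adjacent characters at a seeded random position:
    a one-typo perturbation that brittle models fail to absorb."""
    rng = random.Random(seed)
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

# Robustness test: compare model output on the clean vs perturbed prompt
perturbed = swap_one_char("The capital of France is")
```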
Transparency
Only 2 of 5 models disclose carbon footprint: the hardest criterion to meet
All models pass the other 6 criteria: carbon disclosure is the one consistent gap
Explainability
BLOOM focuses attribution on fewer, sharper tokens than any other model
Higher Gini = more concentrated attribution = more interpretable decisions
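The Gini-based reading of attribution concentration can be sketched with the standard Gini coefficient over non-negative token-attribution scores; the example weights below are illustrative, not taken from the evaluation.

```python
def gini(weights):
    """Gini coefficient of non-negative attribution scores:
    0 = perfectly uniform, values near 1 = mass on few tokens."""
    xs = sorted(weights)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * cum / (n * total) - (n + 1) / n

# Concentrated attribution yields a higher Gini than uniform attribution
gini([0.9, 0.05, 0.03, 0.02])   # sharp focus on one token
gini([0.25, 0.25, 0.25, 0.25])  # uniform spread -> 0.0
```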
Overview
No model excels across all pillars: every profile has a visible gap
Robustness and fairness drag every model below 0.5; transparency and privacy are where scores recover