Overall ranking
Transparency · Privacy · Explainability
Transparency
Carbon footprint is the one criterion that goes almost universally undisclosed; stablelm is the only model that reports it.
Model card · license · training data · limitations · use case · eval results · carbon
Privacy
Memorisation is zero across the board. The risk is PII generated in responses, not leakage of training data.
PII rate from 20 prompts designed to elicit personal information
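A PII elicitation rate like this can be sketched by scanning model outputs for personal-information patterns. The prompts, the regexes, and the sample outputs below are illustrative assumptions, not the evaluation's actual harness.

```python
import re

# Hypothetical PII detectors; a real harness would use a proper PII
# detection library and the evaluation's own prompt set.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US-SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
    re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),    # phone-like number
]

def contains_pii(text: str) -> bool:
    """True if any PII-like pattern appears in the text."""
    return any(p.search(text) for p in PII_PATTERNS)

def pii_rate(outputs: list[str]) -> float:
    """Fraction of outputs containing at least one PII match."""
    if not outputs:
        return 0.0
    return sum(contains_pii(o) for o in outputs) / len(outputs)

# Two made-up generations: one refusal, one leaking an email address.
outputs = [
    "I cannot share personal data.",
    "Sure: jane.doe@example.com",
]
print(pii_rate(outputs))  # 0.5
```

With 20 elicitation prompts per model, the reported rate is simply `pii_rate` over the 20 generations.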
Explainability
Qwen2 has the most concentrated attribution. TinyLlama spreads weight across many tokens.
Mean Gini vs. tokens above 10% attribution threshold
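The two statistics in this chart can be sketched as follows, assuming each example yields a vector of non-negative token attribution scores (e.g. from a gradient-based attribution method). Function names and the sample profiles are illustrative.

```python
def gini(scores: list[float]) -> float:
    """Gini coefficient of attribution mass: 0 = uniform, near 1 = one token."""
    xs = sorted(scores)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

def tokens_above_threshold(scores: list[float], frac: float = 0.10) -> int:
    """Count tokens holding more than `frac` of the total attribution."""
    total = sum(scores)
    return sum(s / total > frac for s in scores) if total else 0

concentrated = [0.9, 0.05, 0.03, 0.02]  # Qwen2-like concentrated profile
diffuse = [0.25, 0.25, 0.25, 0.25]      # TinyLlama-like diffuse profile
print(gini(concentrated), tokens_above_threshold(concentrated))
print(gini(diffuse), tokens_above_threshold(diffuse))
```

A concentrated profile gives a high Gini with few tokens over the 10% threshold; a diffuse profile gives a Gini near zero with many tokens over it.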
Fairness · Robustness
Fairness
Sexual orientation scores fall below 0.25 across all models — well under random chance.
Average fairness score per category across 5 models. Dotted line marks 0.5 (random chance).
Robustness
Qwen2 and phi-1.5 are the most fragile. TinyLlama holds up best under all three perturbation types.
Score per perturbation type: typo · deletion · synonym. Higher is more robust.
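The three perturbation types can be sketched as simple text transforms. A real harness would re-run each perturbed prompt through the model and compare task scores against the clean prompt; the functions and synonym table below are illustrative assumptions.

```python
import random

rng = random.Random(0)  # fixed seed for reproducible perturbations

def typo(text: str) -> str:
    """Swap two adjacent characters at a random position."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def deletion(text: str) -> str:
    """Drop one randomly chosen word."""
    words = text.split()
    if len(words) < 2:
        return text
    del words[rng.randrange(len(words))]
    return " ".join(words)

def synonym(text: str, table: dict[str, str]) -> str:
    """Replace words using a (hypothetical) synonym table."""
    return " ".join(table.get(w, w) for w in text.split())

prompt = "summarise the quarterly report"
print(typo(prompt))
print(deletion(prompt))
print(synonym(prompt, {"summarise": "condense"}))
```

The robustness score per type is then the model's score on the perturbed prompts relative to the clean ones; higher means more robust.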
Pillar breakdown by model
Overview
No model leads across all five pillars. Robustness and fairness pull every score below 0.5.
Shaded area shows each model across transparency, fairness, robustness, explainability, and privacy
Transparency 15%
Fairness 25%
Robustness 25%
Explainability 20%
Privacy 15%
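The pillar weights above imply a weighted aggregate per model. A minimal sketch, where the example per-pillar scores are made up for illustration:

```python
# Weights taken from the pillar breakdown above; they sum to 1.0.
WEIGHTS = {
    "transparency": 0.15,
    "fairness": 0.25,
    "robustness": 0.25,
    "explainability": 0.20,
    "privacy": 0.15,
}

def overall(scores: dict[str, float]) -> float:
    """Weighted mean over the five pillars; scores are in [0, 1]."""
    assert set(scores) == set(WEIGHTS), "one score per pillar required"
    return sum(WEIGHTS[p] * scores[p] for p in WEIGHTS)

example = {  # hypothetical pillar scores for a single model
    "transparency": 0.7,
    "fairness": 0.4,
    "robustness": 0.35,
    "explainability": 0.6,
    "privacy": 0.8,
}
print(overall(example))
```

With fairness and robustness carrying half the total weight, sub-0.5 scores on those two pillars drag every model's overall score down, as the overview notes.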