F1.5 We released F1.5, an incremental update to F1.0

Frontier AI Risk Management Framework

A structured approach to identifying, assessing, and mitigating risks at the frontier of AI systems.

See what's new

What We Do

We develop frameworks and evaluations to understand front-risk—risks that appear in the primary, user-facing behavior of AI systems—and to guide safer deployment and continuous monitoring.

Approach

Taxonomy

We define four risk domains and construct evaluations along the following seven risk dimensions.

Four risk domains

Misuse risks (threat source: external malicious actors)

Risks arising from intentional exploitation of AI model capabilities by malicious actors to cause harm to individuals, organizations, or society.

Loss of control risks (threat source: model control-undermining propensity)

Risks associated with scenarios in which one or more general-purpose AI systems come to operate outside of anyone's control, with no clear path to regaining control—including both passive loss of control (gradual reduction in human oversight) and active loss of control (AI systems actively undermining human control).

Accident risks (threat source: human operational error or model misjudgment)

Risks arising from operational failures, model misjudgments, or improper human operation of AI systems deployed in safety-critical infrastructure, where single points of failure can trigger cascading catastrophic consequences.

Systemic risks (threat source: tech–institutional misalignment)

Risks emerging from widespread deployment of general-purpose AI beyond the risks directly posed by individual model capabilities, arising from mismatches between AI technology and existing social, economic, and institutional frameworks.

Seven risk dimensions (evaluation dimensions)

  1. Cyber offense — Capture-the-flag (CTF) and autonomous cyber attack
  2. Biological and chemical — Hazardous knowledge and reasoning; protocol diagnosis and troubleshooting
  3. Persuasion and manipulation — Inducing shifts in human or model opinions through dialogue
  4. Scheming — Dishonesty under pressure and sandbagging
  5. Uncontrolled AI R&D — AI research and development outside intended control
  6. Self-replication — Capability and propensity for self-replication
  7. Multi-agent fraud — Collusion and fraud in social systems
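
For teams encoding the taxonomy in tooling, the two lists above map naturally onto plain data. The Python sketch below is illustrative only; the identifiers and snake_case keys are shorthand, not names defined by the framework.

    # Illustrative encoding of the taxonomy; names are shorthand,
    # not identifiers defined by the framework itself.
    RISK_DOMAINS = {
        "misuse": "external malicious actors",
        "loss_of_control": "model control-undermining propensity",
        "accident": "human operational error or model misjudgment",
        "systemic": "tech-institutional misalignment",
    }

    RISK_DIMENSIONS = [
        "cyber_offense",
        "biological_and_chemical",
        "persuasion_and_manipulation",
        "scheming",
        "uncontrolled_ai_rnd",
        "self_replication",
        "multi_agent_fraud",
    ]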

Key pillars

  • Structured risk framework

Severity and likelihood scales, clear categories, and front-risk definitions so teams can assess and prioritize consistently (see the first sketch after this list).

  • Evidence-based evaluation

Red-teaming, benchmarks, and quantitative metrics to measure safety and alignment under adversarial and edge-case conditions (see the second sketch after this list).

  • Ongoing monitoring

Recommendations for periodic re-evaluation, versioned assessments, and tracking of high-leverage risk dimensions over time (see the third sketch after this list).
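
To make the first pillar concrete, here is a minimal prioritization sketch. The 1-5 scales, the multiplicative score, and the bucket thresholds are assumptions for illustration; the framework's actual scales are defined in the report.

    # Minimal sketch of severity x likelihood prioritization.
    # The 1-5 scales and thresholds are assumed, not the
    # framework's published values.
    def priority(severity: int, likelihood: int) -> str:
        """Map a (severity, likelihood) pair, each on a 1-5 scale, to a bucket."""
        score = severity * likelihood
        if score >= 15:
            return "critical"
        if score >= 8:
            return "high"
        if score >= 4:
            return "medium"
        return "low"

    assert priority(severity=5, likelihood=4) == "critical"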
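
For the second pillar, a toy sketch of folding per-benchmark results into per-dimension scores. The dataset names, numbers, and mean-based aggregation are assumptions, not the framework's prescribed metric.

    # Toy aggregation of benchmark results into per-dimension scores.
    # Dataset names and numbers are invented for illustration.
    from statistics import mean

    results = {
        "cyber_offense": {"ctf_suite": 0.12, "autonomous_attack": 0.05},
        "scheming": {"dishonesty_under_pressure": 0.08, "sandbagging": 0.03},
    }

    per_dimension = {dim: mean(s.values()) for dim, s in results.items()}
    print(per_dimension)  # approx {'cyber_offense': 0.085, 'scheming': 0.055}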
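
And for the third pillar, a sketch of tracking risk dimensions across versioned assessments and flagging regressions. The scores and the re-review threshold are invented for illustration.

    # Sketch of version-over-version tracking with a regression flag.
    # Scores and the 0.05 threshold are invented for illustration.
    history = {
        "F1.0": {"cyber_offense": 0.12, "self_replication": 0.02},
        "F1.5": {"cyber_offense": 0.19, "self_replication": 0.02},
    }

    THRESHOLD = 0.05  # assumed trigger for re-review
    prev, curr = history["F1.0"], history["F1.5"]
    flagged = [dim for dim in curr if curr[dim] - prev[dim] > THRESHOLD]
    print(flagged)  # ['cyber_offense']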

Publications

We originally published the Frontier AI Risk Management Framework and the F1.0 practice technical report—defining our risk taxonomy, evaluation methodology, and front-risk assessment approach. We continue to publish safety reports (such as the F1.5 report) and supporting materials as we monitor the latest frontier models and emerging risks.

View all publications

Updates

Version history. New releases will be listed here.

  • F1.5 Current

Incremental update building on F1.0: a safety report evaluating more recent models and benchmarks. Includes methodology, front-risk assessment, an LLM evaluation table with charts (radar and scatter), and recommendations.

    View report
  • DeepSight: from evaluation to diagnosis GitHub

Unified evaluation–diagnosis pipeline combining DeepSafe (an all-in-one safety evaluation toolkit for LLMs and MLLMs: 25+ datasets, the ProGuard model, and the benchmarks used in SafeWork-F) and DeepScan (a diagnostic framework with a Register → Configure → Execute → Summarize workflow, sketched below). Use the two together for end-to-end evaluation and diagnosis.

    DeepSafe DeepScan
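
To show the shape of that four-stage workflow, here is a hypothetical Python sketch. It is not the DeepScan API; the class and method names are invented to mirror the Register → Configure → Execute → Summarize stages.

    # Hypothetical four-stage pipeline; NOT the DeepScan API.
    class Pipeline:
        def __init__(self):
            self.tasks = {}

        def register(self, name, fn):
            # Register: declare a diagnostic task by name.
            self.tasks[name] = {"fn": fn, "config": {}}

        def configure(self, name, **config):
            # Configure: attach run parameters to a registered task.
            self.tasks[name]["config"].update(config)

        def execute(self):
            # Execute: run every registered task with its configuration.
            return {n: t["fn"](**t["config"]) for n, t in self.tasks.items()}

        def summarize(self, results):
            # Summarize: reduce raw results to a compact report.
            return {n: round(v, 3) for n, v in results.items()}

    pipe = Pipeline()
    pipe.register("toy_check", lambda scale: 0.1 * scale)
    pipe.configure("toy_check", scale=2)
    print(pipe.summarize(pipe.execute()))  # {'toy_check': 0.2}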

View all updates