F1.5: We released F1.5, an incremental update to F1.0.

Frontier AI Risk Management Framework

A structured approach to identifying, assessing, and mitigating risks at the frontier of AI systems.

See what's new

What We Do

We build structured tools to understand and manage front-risk — risks that emerge directly from the primary, user-facing behavior of frontier AI systems — and to guide safer deployment across the full model lifecycle.

How we work

  • Structured risk framework

    Severity and likelihood scales, clear categories, and front-risk definitions so teams can assess and prioritize consistently.

  • Evidence-based evaluation

    Red-teaming, benchmarks, and quantitative metrics to measure safety and alignment under adversarial and edge-case conditions.

  • Ongoing monitoring

    Recommendations for periodic re-evaluation, versioned assessments, and tracking of high-leverage risk dimensions over time.
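The severity and likelihood scales above can be sketched as a small scoring model. This is an illustrative sketch only: the framework's actual scale labels and point values are not given here, so the 5-point scales, names, and the severity × likelihood product below are assumptions.

```python
from dataclasses import dataclass

# Hypothetical 5-point scales; labels and values are illustrative, not the
# framework's actual definitions.
SEVERITY = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "catastrophic": 5}
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "frequent": 5}

@dataclass
class RiskAssessment:
    name: str
    severity: str    # key into SEVERITY
    likelihood: str  # key into LIKELIHOOD

    def score(self) -> int:
        # Classic risk-matrix combination: severity times likelihood.
        return SEVERITY[self.severity] * LIKELIHOOD[self.likelihood]

def prioritize(risks: list[RiskAssessment]) -> list[RiskAssessment]:
    # Highest combined score first, so teams can triage consistently.
    return sorted(risks, key=RiskAssessment.score, reverse=True)
```

Keeping the scales as shared tables (rather than free-form judgments) is what lets different teams assess and prioritize consistently.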

Approach

Risk Management Process

The six-stage process that structures every risk assessment — from identifying potential threats to continuously monitoring residual risks.
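A six-stage process like this can be modeled as an ordered enum. Note the text names only the first stage (identifying potential threats) and the last (continuously monitoring residual risks); the four middle stage names below are placeholders drawn from common risk-management practice, not the framework's actual stages.

```python
from enum import Enum

class Stage(Enum):
    # Only the first and last stages are named in the text; the middle four
    # are illustrative placeholders.
    IDENTIFY_THREATS = 1
    ASSESS_LIKELIHOOD_SEVERITY = 2
    EVALUATE_AGAINST_THRESHOLDS = 3
    MITIGATE = 4
    REASSESS_RESIDUAL_RISK = 5
    MONITOR_CONTINUOUSLY = 6

def run_process(risk: str) -> list[str]:
    # Walk the stages in definition order, producing an audit trail.
    return [f"{stage.name}: {risk}" for stage in Stage]
```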

Taxonomy

We define four risk domains and structure evaluation along the following seven risk respects.

Four risk domains

Misuse risks Threat source: External malicious actors

Risks arising from intentional exploitation of AI model capabilities by malicious actors to cause harm to individuals, organisations, or society.

Loss of control risks Threat source: Model control-undermining propensity

Risks associated with scenarios in which one or more general-purpose AI systems come to operate outside of anyone's control, with no clear path to regaining it. This includes both passive loss of control (gradual reduction in human oversight) and active loss of control (AI systems actively undermining human control).

Accident risks Threat source: Human operational error or model misjudgment

Risks arising from operational failures, model misjudgments, or improper human operation of AI systems deployed in safety-critical infrastructure, where single points of failure can trigger cascading catastrophic consequences.

Systemic risks Threat source: Tech–institutional misalignment

Risks emerging from widespread deployment of general-purpose AI beyond the risks directly posed by individual model capabilities, arising from mismatches between AI technology and existing social, economic, and institutional frameworks.
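The taxonomy above maps each domain to a single threat source, which makes it natural to encode as an enum. The domain names and threat-source strings come directly from the definitions above; the `threat_source` helper is just an illustrative accessor.

```python
from enum import Enum

class RiskDomain(Enum):
    # The four domains and their threat sources, as defined in the taxonomy.
    MISUSE = "External malicious actors"
    LOSS_OF_CONTROL = "Model control-undermining propensity"
    ACCIDENT = "Human operational error or model misjudgment"
    SYSTEMIC = "Tech–institutional misalignment"

def threat_source(domain: RiskDomain) -> str:
    # Look up the threat source associated with a risk domain.
    return domain.value
```

Tagging each assessed risk with exactly one domain keeps the taxonomy mutually exclusive at the level of threat source, even when a single incident touches several domains downstream.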

Seven risk respects (evaluation dimensions)

Updates

New releases, tools, and related work from the SafeWork-F project.

  • F1.5 Current

Incremental update: an updated safety report evaluating more recent models and benchmarks. Builds on F1.0. Includes methodology, front-risk assessment, an LLM evaluation table and charts (radar, scatter), and recommendations.

    View blog
  • DeepSight: from evaluation to diagnosis GitHub

Unified evaluation–diagnosis pipeline combining DeepSafe (an all-in-one safety evaluation toolkit for LLMs and MLLMs: 25+ datasets, the ProGuard model, and the benchmarks used in SafeWork-F) and DeepScan (a diagnostic framework with a Register → Configure → Execute → Summarize workflow). Use them together for end-to-end evaluation and diagnosis.

    DeepSafe DeepScan
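The Register → Configure → Execute → Summarize workflow mentioned for DeepScan is a common diagnostic-pipeline pattern, and can be sketched generically. This is not DeepScan's actual API; every class and method name below is an illustrative assumption showing only the four-step shape.

```python
# Generic sketch of a Register -> Configure -> Execute -> Summarize workflow.
# NOT DeepScan's actual API; all names here are illustrative assumptions.
from typing import Callable

class DiagnosticPipeline:
    def __init__(self) -> None:
        self._probes: dict[str, Callable[[dict], float]] = {}
        self._config: dict = {}

    def register(self, name: str, probe: Callable[[dict], float]) -> None:
        # Register: make a diagnostic probe available under a name.
        self._probes[name] = probe

    def configure(self, **options) -> None:
        # Configure: record run options (model, datasets, thresholds, ...).
        self._config.update(options)

    def execute(self, sample: dict) -> dict[str, float]:
        # Execute: run every registered probe on one sample.
        return {name: probe(sample) for name, probe in self._probes.items()}

    def summarize(self, results: list[dict[str, float]]) -> dict[str, float]:
        # Summarize: average each probe's scores across all samples.
        return {
            name: sum(r[name] for r in results) / len(results)
            for name in self._probes
        }
```

Separating registration from execution lets the same configured pipeline be re-run across model versions, which is what makes per-version diagnosis comparable.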

View all updates

Publications

View all publications