AI Risk Assessment: A Step-by-Step Guide

How to run an AI risk assessment: the process, the risk categories to evaluate, how to score and prioritize risks, and what to do with the results.

Phos Team · AI Strategy

An AI risk assessment is the structured process of identifying what can go wrong with a specific AI system, how likely and severe each failure would be, and what controls are needed. Done well, it is the foundation of every other governance control.

When to run an AI risk assessment

Risk assessments are triggered by specific events, not just scheduled reviews. Run a risk assessment before deploying any new AI system, before significantly expanding an existing system’s use or data access, before changing the model underlying an AI system, and when a regulatory change affects the system’s classification.

For high-risk AI systems already in production, annual assessments are a minimum. Quarterly assessments are appropriate for AI in credit, employment, healthcare, or other regulated decision-making contexts.
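
As a minimal sketch of the scheduled cadence described above, the Python function below returns the latest date for the next assessment. The function name, the `REGULATED_CONTEXTS` set, and the day counts (91 for quarterly, 365 for annual) are illustrative assumptions; the set is deliberately not exhaustive, and event triggers such as a model change apply regardless of the date it returns.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: a partial list of contexts that warrant quarterly review.
REGULATED_CONTEXTS = {"credit", "employment", "healthcare"}


def next_assessment_due(last_assessed: date, high_risk: bool, context: str) -> Optional[date]:
    """Return the latest date for the next scheduled assessment.

    Returns None when no scheduled cadence applies; event-based
    triggers (new deployment, model change, and so on) still apply.
    """
    if context.lower() in REGULATED_CONTEXTS:
        return last_assessed + timedelta(days=91)   # roughly quarterly
    if high_risk:
        return last_assessed + timedelta(days=365)  # annual minimum
    return None


due = next_assessment_due(date(2025, 1, 1), high_risk=True, context="credit")
```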

The assessment process

A structured AI risk assessment follows a defined sequence. Skipping steps or collapsing them produces assessments that miss material risks.

Step 1: Define the system scope. Document what the system does, what data it uses, what outputs it produces, and what decisions it influences or makes. Without a precise scope, risk identification misses the risks at the edges of the system’s actual use.

Step 2: Identify stakeholders. Map the people affected by the system’s outputs: the individuals whose data it processes, the people whose decisions it influences, the teams responsible for its operation, and any external parties in the AI supply chain, such as model vendors and data providers.

Step 3: Identify risks. For each risk category, systematically identify the specific risks associated with this system. Use the categories below as a starting checklist, not an exhaustive list. Domain experts and red-team exercises surface risks that checklist reviews miss.

Step 4: Score each risk. Assign each identified risk a likelihood score (1-5) and an impact score (1-5). Multiply the two to get a risk priority score between 1 and 25. Document the rationale for each score.

Step 5: Map existing controls. For each identified risk, document what controls are already in place and assess how effectively they mitigate the risk. A risk with effective controls carries a lower residual risk than the same risk with no controls.

Step 6: Identify control gaps. For risks where existing controls are inadequate, document the gap and recommend the control needed. This is the action output of the assessment.

Step 7: Document and register. Record the assessment in the risk register, including the system description, identified risks, scores, existing controls, and recommended actions with owners and timelines.
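
The scoring in Step 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the `ScoredRisk` class, its field names, the scale labels in the comments, and the two example risks are all hypothetical. The only rules taken from the steps above are the 1-5 ranges, the multiplication, and the requirement to record a rationale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredRisk:
    description: str
    likelihood: int  # 1 to 5; what each level means is for your team to define
    impact: int      # 1 to 5
    rationale: str   # Step 4 asks for a documented rationale

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 scale and scores with no rationale.
        for name in ("likelihood", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")
        if not self.rationale.strip():
            raise ValueError("rationale is required for every score")

    @property
    def priority(self) -> int:
        """Risk priority score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


# Hypothetical example risks, for illustration only.
risks = [
    ScoredRisk("Service outage with no manual fallback", 2, 5, "Fallback never tested"),
    ScoredRisk("Model drifts as input data shifts", 4, 4, "Drift seen in prior releases"),
]
ranked = sorted(risks, key=lambda r: r.priority, reverse=True)
```

Sorting by `priority` puts the drift risk (16) ahead of the outage risk (10), which gives the assessment team an initial ordering to review before mapping controls.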

Risk categories to evaluate

Model risk

Model risk encompasses the risks that arise from the AI model itself: inaccurate outputs, hallucinations, performance degradation over time, and behavior in production that differs from what testing showed.

Questions to answer: How often does the model produce incorrect outputs in testing? What happens when the model encounters inputs outside its training distribution? How does performance change as the data distribution shifts over time? What is the impact of a model error on downstream decisions?

Data risk

Data risk encompasses the risks arising from the data the AI system uses: regulatory obligations for personal data, data quality issues that affect model performance, weak data access controls, and the risk that training data contains biases that the model perpetuates.

Questions to answer: Does the system process personal data, and if so, what legal basis applies? How is data quality validated before use? Who has access to training and production data? Were any protected characteristics or proxies for them present in training data?

Operational risk

Operational risk encompasses the risks of system failure, integration failure, and unexpected behavior at scale. It also includes over-reliance: an organization that depends heavily on AI systems for critical decisions can lose the human capacity to make those decisions independently.

Questions to answer: What happens if the system is unavailable? What other systems depend on this AI’s outputs? Are human reviewers genuinely capable of evaluating AI outputs, or have they lost the expertise to do so?

Compliance risk

Compliance risk encompasses regulatory obligations that apply to this specific AI system: EU AI Act requirements if the system is placed on the market or used in the EU, or if its outputs are used there, GDPR obligations if it processes personal data within GDPR’s scope, sector-specific regulations in finance or healthcare, and contractual obligations to customers or partners.

Questions to answer: Does this system fall under any specific AI regulations? What data protection obligations apply? Are there audit or reporting requirements for this system’s decisions?

Scoring and prioritization

A risk score is only useful if it drives action. The scoring methodology should be consistent across systems so that you can compare risk levels and prioritize investments.

A 5x5 likelihood-impact matrix is the standard approach. Scores of 20-25 (high likelihood and high impact) require immediate control implementation. Scores of 12-19 require control development and implementation within a defined timeframe. Scores below 12 require monitoring but not necessarily active control development.
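
To make the thresholds concrete, here is a small Python sketch that maps a priority score to one of the three bands and counts how many of the 25 matrix cells fall into each. The band names (`immediate`, `planned`, `monitor`) and the function name are labels chosen for this example; the cut-offs of 20 and 12 come from the paragraph above.

```python
from collections import Counter


def priority_band(score: int) -> str:
    """Map a likelihood x impact score (1-25) to a response band."""
    if not 1 <= score <= 25:
        raise ValueError(f"score must be between 1 and 25, got {score}")
    if score >= 20:
        return "immediate"  # implement controls now
    if score >= 12:
        return "planned"    # develop controls within a defined timeframe
    return "monitor"        # monitor; active control development is optional


# Count how many (likelihood, impact) cells land in each band.
cells = Counter(
    priority_band(likelihood * impact)
    for likelihood in range(1, 6)
    for impact in range(1, 6)
)
```

Under these cut-offs, only 3 of the 25 cells are `immediate`, 5 are `planned`, and 17 are `monitor`. A rare but severe risk (likelihood 1, impact 5) scores 5 and lands in `monitor`, so the numeric score alone should not be the last word on priority.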

Qualitative factors that should override purely quantitative scores include: regulatory bright-line requirements (some systems require controls regardless of score), irreversibility of harm (errors that cannot be corrected warrant higher priority than reversible ones), and visibility (errors that reach customers or regulators warrant higher priority than internal errors).
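
One way to encode these overrides is sketched below in Python. The escalation rules are assumptions made for illustration: a regulatory bright line forces the top band, while irreversibility and external visibility each raise the band by one level. The band names and the `apply_overrides` function are hypothetical, and your organization may weight these factors differently.

```python
BANDS = ["monitor", "planned", "immediate"]  # lowest to highest priority


def apply_overrides(
    band: str,
    *,
    bright_line: bool = False,
    irreversible: bool = False,
    externally_visible: bool = False,
) -> str:
    """Adjust a score-derived band using qualitative factors.

    Assumed rules: a regulatory bright line always yields the top band;
    each of the other two factors raises the band by one level.
    """
    level = BANDS.index(band)  # raises ValueError for an unknown band
    if bright_line:
        return BANDS[-1]
    level += int(irreversible) + int(externally_visible)
    return BANDS[min(level, len(BANDS) - 1)]


adjusted = apply_overrides("monitor", irreversible=True)
```

Recording both the original band and the adjusted band, along with the reason for the override, keeps the adjustment auditable.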

The risk register

The risk register is the master document that records all identified risks, their scores, their controls, and their status. It is the primary output of the risk assessment process and the ongoing reference for risk management.

Each entry in the risk register should include: system name and scope, risk description, risk category, likelihood score, impact score, priority score, existing controls, residual risk, recommended additional controls, control owner, and remediation deadline.
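
The fields listed above translate naturally into a record type. The following Python dataclass is a sketch only: the class name, field names, types, and the example entry are hypothetical, and `residual_risk` is modeled as a free-text rating because the format is left to each organization.

```python
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RiskRegisterEntry:
    system_name: str
    system_scope: str
    risk_description: str
    risk_category: str   # e.g. "model", "data", "operational", "compliance"
    likelihood: int      # 1 to 5
    impact: int          # 1 to 5
    residual_risk: str   # assumption: a rating recorded after existing controls
    control_owner: str
    remediation_deadline: Optional[date] = None
    existing_controls: List[str] = field(default_factory=list)
    recommended_controls: List[str] = field(default_factory=list)

    @property
    def priority_score(self) -> int:
        return self.likelihood * self.impact


# Hypothetical entry, for illustration only.
entry = RiskRegisterEntry(
    system_name="resume-screener",
    system_scope="Ranks inbound applications for recruiter review",
    risk_description="Training data contains proxies for protected characteristics",
    risk_category="data",
    likelihood=3,
    impact=4,
    residual_risk="medium",
    control_owner="Head of Talent Analytics",
    remediation_deadline=date(2026, 3, 31),
    existing_controls=["Quarterly bias audit"],
    recommended_controls=["Add adverse-impact monitoring"],
)

# asdict() skips properties, so the derived score is added explicitly.
row = {**asdict(entry), "priority_score": entry.priority_score}
```

Deriving `priority_score` from likelihood and impact, rather than storing it as a separate field, prevents the three numbers from drifting out of sync when scores are revised.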

The risk register is not a static document. It is updated when assessments are completed, when controls are implemented, when incidents reveal new risks, and when periodic reviews change risk scores.

Acting on assessment results

An assessment that produces a risk register and no action is a documentation exercise. The assessment output drives a specific set of actions.

Immediate actions. High-priority risks without adequate controls should trigger immediate control development. For risks above a defined threshold, consider pausing or restricting the AI system’s use until controls are in place.

Planned control implementation. Medium-priority risks should enter a prioritized control implementation roadmap with owners and timelines.

Monitoring enhancements. Even for risks with adequate controls, the assessment should identify whether monitoring is sufficient to detect if controls fail.

Escalation to leadership. High-priority risks and significant control gaps should be reported to the governance committee or executive leadership, not handled at the operational level.
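
The four action types above can be expressed as a simple routing function. This Python sketch assumes three priority bands named `immediate`, `planned`, and `monitor`, and it treats the `immediate` band as the threshold for considering a pause; both choices, along with the function name and action strings, are assumptions made for illustration.

```python
from typing import List

VALID_BANDS = ("immediate", "planned", "monitor")


def plan_actions(band: str, controls_adequate: bool) -> List[str]:
    """Return the follow-up actions for one assessed risk."""
    if band not in VALID_BANDS:
        raise ValueError(f"unknown band: {band}")
    actions: List[str] = []
    if band == "immediate":
        if not controls_adequate:
            actions.append("develop controls immediately")
            actions.append("consider pausing or restricting use")
        # High-priority risks go to leadership whether or not controls exist.
        actions.append("escalate to governance committee")
    elif band == "planned" and not controls_adequate:
        actions.append("add to control roadmap with owner and timeline")
    # Every risk gets a monitoring check, even with adequate controls.
    actions.append("confirm monitoring can detect control failure")
    return actions


todo = plan_actions("immediate", controls_adequate=False)
```

For a high-priority risk with inadequate controls, `todo` contains all four actions; a low-priority risk with adequate controls yields only the monitoring check.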

For the broader risk management program that this process feeds, see AI risk management. For a professional assessment of your organization’s AI risk posture, visit the AI audit service.

Frequently asked questions

How long does an AI risk assessment take?

A risk assessment for a single AI system of moderate complexity takes two to five days of dedicated effort. A comprehensive assessment of all AI systems across an organization can take four to eight weeks depending on the number of systems and available expertise. External assessors typically accelerate the process because they bring assessment methodology and domain experience.

Who should conduct the AI risk assessment?

The assessment team should include the AI system owner, a risk or compliance professional, a technical expert who understands the model’s architecture and data, and a representative of the business function that uses the system’s outputs. For high-risk systems, an independent reviewer who was not involved in the system’s development adds objectivity.

What is the difference between an AI risk assessment and an AI audit?

A risk assessment evaluates the specific risks of a specific AI system and recommends controls. An AI audit evaluates whether a system or program is operating according to documented standards. Risk assessments feed the governance program. Audits verify that the governance program is working. Both are necessary components of mature AI governance.

Ready to assess your AI risk?

You have the methodology. The next step is applying it to your specific AI systems and translating the results into a governed AI program.

Path one: start with a professional audit. An AI audit provides an expert-led assessment of your AI systems’ risk posture, with a prioritized remediation roadmap.

Path two: work with Phos AI Labs. If you want a complete AI risk assessment program designed for your organization, Phos AI Labs is a CCA-F certified Claude implementation partner. Thirty minutes, no deck. Start here.
