// Level 3 · Controls

Measure

Analyze, assess, benchmark, and monitor AI risks.

MEASURE-1.1 · High

Approaches Selected

Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for implementation.

MEASURE-1.2 · Medium

Appropriate Metrics

Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated.

MEASURE-1.3 · Medium

Internal Experts

Internal experts who did not serve as front-line developers for the system, independent assessors, or both are involved in regular assessments and updates.

MEASURE-2.1 · High

Test Sets

Test sets, metrics, and details about the tools used during test, evaluation, verification, and validation (TEVV) are documented.
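As an illustration only, the documentation this control asks for could be captured in a small structured record. The sketch below assumes a hypothetical `TevvRecord` type; the field names and example values are invented for illustration and are not prescribed by the control.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TevvRecord:
    """Hypothetical record of one TEVV run; all field names are illustrative."""

    system: str                 # system under evaluation
    test_set: str               # name or version of the evaluation data
    test_set_size: int          # number of examples in the test set
    metrics: dict               # metric name -> measured value
    tools: list = field(default_factory=list)  # tools (with versions) used

    def to_json(self) -> str:
        # Serialize with sorted keys so stored records diff cleanly.
        return json.dumps(asdict(self), sort_keys=True)


# Example values are made up for the sketch.
record = TevvRecord(
    system="example-classifier",
    test_set="holdout-v1",
    test_set_size=1000,
    metrics={"accuracy": 0.91},
    tools=["example-eval-tool 1.0"],
)
print(record.to_json())
```

Keeping the record serializable makes it straightforward to store alongside the model version it describes.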

MEASURE-2.2 · Medium

Human Subject Evaluations

Evaluations involving human subjects meet applicable requirements and are representative of the relevant population.

MEASURE-2.3 · High

Performance Demonstrated

AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment.

MEASURE-2.4 · High

Functionality and Behavior

The functionality and behavior of the AI system and its components — as identified in the MAP function — are monitored when in production.
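One possible way to monitor behavior in production is to compare the distribution of a live input or output signal against a reference sample. The sketch below uses the population stability index (PSI), a common drift statistic that the control itself does not prescribe. The function name, bin count, and `eps` floor are assumptions made for this example.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample and a production sample.

    Bins are equal-width over the range of the reference sample; production
    values outside that range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = list(range(100))        # stand-in for scores seen during testing
production = list(range(50, 150))   # stand-in for shifted production scores
print(population_stability_index(reference, reference))   # no drift
print(population_stability_index(reference, production))  # positive under drift
```

What PSI value should trigger review is an organizational choice and is not defined here.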

MEASURE-2.5 · High

Validity and Reliability

The AI system to be deployed is demonstrated to be valid and reliable. Limits on its generalizability beyond the conditions under which the technology was developed are documented.

MEASURE-2.6 · High

Safety Demonstrated

The AI system is evaluated regularly for safety risks, as identified in the MAP function.

MEASURE-2.7 · Critical

Security and Resilience

AI system security and resilience — as identified in the MAP function — are evaluated and documented.

MEASURE-2.8 · High

Transparency and Accountability

Risks associated with transparency and accountability — as identified in the MAP function — are examined and documented.

MEASURE-2.9 · Medium

Explainability

The AI model is explained, validated, and documented, and AI system output is interpreted within its context — as identified in the MAP function — to inform responsible use and governance.

MEASURE-2.10 · High

Privacy Risk

Privacy risk of the AI system — as identified in the MAP function — is examined and documented.

MEASURE-2.11 · High

Fairness and Bias

Fairness and bias — as identified in the MAP function — are evaluated and results are documented.
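As one hedged example of a quantitative fairness evaluation, the sketch below computes the demographic parity difference: the gap between the highest and lowest positive-prediction rates across groups. This metric is only one of many and is not mandated by the control. The function names and the toy data are assumptions for illustration.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Fraction of positive (== 1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data: group "a" is selected 3 of 4 times, group "b" 1 of 4 times.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(selection_rates(preds, grps))                # {'a': 0.75, 'b': 0.25}
print(demographic_parity_difference(preds, grps))  # 0.5
```

Documenting the per-group rates alongside the summary gap keeps the result interpretable for reviewers.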

MEASURE-2.12 · Low

Environmental Impact

Environmental impact and sustainability of AI model training and management activities — as identified in the MAP function — are assessed and documented.

MEASURE-2.13 · Medium

Effectiveness of Measures

Effectiveness of the employed TEVV metrics and processes in the MEASURE function is evaluated and documented.

MEASURE-3.1 · High

Risk Tracking Approaches

Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts.

MEASURE-3.2 · Medium

Risk Tracking Information

Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available.

MEASURE-3.3 · Medium

Feedback Mechanisms

Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated.

MEASURE-4.1 · Medium

Measurement Approaches Validated

Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users.

MEASURE-4.2 · Medium

Measurement Results

Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended.

MEASURE-4.3 · Medium

Measurable Performance

Measurable performance improvements or declines are identified and documented, based on consultations with relevant AI actors, including affected communities, and on field data about context-relevant risks and trustworthiness characteristics.