Measure
Analyze, assess, benchmark, and monitor AI risks.
Approaches Selected
Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for implementation.
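As a concrete illustration, selections like these can be kept in a machine-readable registry so each downstream measurement run knows which metric covers which mapped risk. A minimal sketch; the risk IDs and metric names below are hypothetical placeholders, not framework content:

```python
# Hypothetical registry linking MAP-enumerated risks to selected metrics.
SELECTED_METRICS = {
    "MAP-RISK-001": {"risk": "disparate error rates", "metrics": ["equalized_odds_gap"]},
    "MAP-RISK-002": {"risk": "distribution shift", "metrics": ["population_stability_index"]},
    "MAP-RISK-003": {"risk": "privacy leakage", "metrics": ["membership_inference_advantage"]},
}

for risk_id, entry in SELECTED_METRICS.items():
    print(f"{risk_id}: {entry['risk']} -> {', '.join(entry['metrics'])}")
```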
Appropriate Metrics
Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated.
Internal Experts
Internal experts who did not serve as front-line developers for the system and/or independent assessors are involved in regular assessments and updates.
Test Sets
Test sets, metrics, and details about the tools used during test, evaluation, verification, and validation (TEVV) are documented.
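One way to satisfy this is to emit a structured record for every evaluation run. A minimal sketch, assuming a hypothetical record schema; all field names and values are illustrative:

```python
# Sketch of a per-run TEVV documentation record; the schema is hypothetical.
import json
from dataclasses import dataclass, asdict

@dataclass
class TEVVRecord:
    test_set: str           # name/version of the evaluation dataset
    test_set_hash: str      # checksum so the exact data can be reproduced
    metrics: list[str]      # metrics computed in this run
    tools: dict[str, str]   # tool name -> version

record = TEVVRecord(
    test_set="holdout-v2",
    test_set_hash="sha256:<placeholder>",   # illustrative, not a real digest
    metrics=["accuracy", "f1", "equalized_odds_gap"],
    tools={"scikit-learn": "1.4.2", "numpy": "1.26.4"},
)
print(json.dumps(asdict(record), indent=2))
```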
Human Subject Evaluations
Evaluations involving human subjects meet applicable requirements and are representative of the relevant population.
Performance Demonstrated
AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment.
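For the quantitative case, reporting a confidence interval alongside the point estimate makes "demonstrated" concrete. A sketch using a nonparametric bootstrap; the labels and predictions here are synthetic stand-ins, not real evaluation outputs:

```python
# Bootstrap 95% confidence interval for accuracy on deployment-like data.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                          # placeholder labels
y_pred = np.where(rng.random(500) < 0.9, y_true, 1 - y_true)   # ~90% accurate stand-in

correct = (y_true == y_pred).astype(float)
boot = np.array([
    correct[rng.integers(0, len(correct), len(correct))].mean()  # resample with replacement
    for _ in range(2000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"accuracy={correct.mean():.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```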
Functionality and Behavior
The functionality and behavior of the AI system and its components — as identified in the MAP function — are monitored when in production.
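Production monitoring often starts with distribution-shift signals on model inputs or scores. A sketch of the population stability index (PSI), one common drift statistic; the 0.2 alert threshold is a widespread rule of thumb, not a framework requirement:

```python
# Population Stability Index (PSI) as a simple production drift signal.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Compare a production score distribution to its validation-time baseline."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    e_frac = np.clip(e_frac, 1e-6, None)   # avoid log(0) in empty bins
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)     # simulated validation-time scores
production = rng.normal(0.3, 1.0, 10_000)   # simulated shifted production scores
print(f"PSI = {psi(baseline, production):.3f}  (>0.2 often flags drift)")
```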
Validity and Reliability
The AI system to be deployed is demonstrated to be valid and reliable. Limitations of generalizability beyond the conditions under which the technology was developed are documented.
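Documenting limits of generalizability typically means evaluating per condition (slice) rather than only in aggregate. A sketch with simulated per-slice outcomes; the slice names and accuracy levels are invented for illustration:

```python
# Per-condition evaluation to show where performance generalizes and where it degrades.
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical deployment conditions with simulated "true" accuracy levels.
slices = {"in_domain": 0.92, "low_light_images": 0.74, "out_of_region": 0.61}

for name, true_acc in slices.items():
    n = 400
    correct = rng.random(n) < true_acc        # simulated per-example outcomes
    acc = correct.mean()
    se = np.sqrt(acc * (1 - acc) / n)         # normal-approximation standard error
    print(f"{name:>18}: acc={acc:.3f} +/- {1.96 * se:.3f}")
```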
Safety Demonstrated
The AI system is evaluated regularly for safety risks — as identified in the MAP function.
Security and Resilience
AI system security and resilience — as identified in the MAP function — are evaluated and documented.
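Resilience evaluations can include simple perturbation probes before deeper adversarial testing. A sketch that measures prediction stability under input noise, using a stand-in model; a real evaluation would target the deployed system with a threat-informed perturbation set:

```python
# Simple resilience probe: prediction agreement under increasing input noise.
import numpy as np

def model(x: np.ndarray) -> np.ndarray:
    """Stand-in classifier: sign of the feature sum."""
    return (x.sum(axis=1) > 0).astype(int)

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 8))
y = model(x)   # clean-input predictions used as the reference

for sigma in (0.0, 0.5, 1.0):
    x_noisy = x + rng.normal(scale=sigma, size=x.shape)
    agreement = (model(x_noisy) == y).mean()
    print(f"sigma={sigma:.1f}: prediction agreement {agreement:.3f}")
```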
Transparency and Accountability
Risks associated with transparency and accountability — as identified in the MAP function — are examined and documented.
Explainability
The AI model is explained, validated, and documented, and AI system output is interpreted within its context — as identified in the MAP function — to inform responsible use and governance.
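Model-agnostic attribution methods are one common route to explainability evidence. A sketch using scikit-learn's permutation importance on synthetic data; the model and dataset are placeholders for the system under review:

```python
# Permutation importance: how much does shuffling each feature hurt performance?
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=600, n_features=6, n_informative=3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, (mean, std) in enumerate(zip(result.importances_mean, result.importances_std)):
    print(f"feature_{i}: importance {mean:.3f} +/- {std:.3f}")
```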
Privacy Risk
Privacy risk of the AI system — as identified in the MAP function — is examined and documented.
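One coarse empirical probe of privacy leakage is a loss-threshold membership-inference test: if a simple threshold on per-example loss separates training members from non-members, the model is memorizing its training data. A sketch with simulated loss distributions; a real assessment would use the trained model's actual losses:

```python
# Loss-threshold membership-inference probe (simulated losses).
import numpy as np

rng = np.random.default_rng(0)
train_loss = rng.exponential(scale=0.3, size=5000)   # members: typically lower loss
test_loss = rng.exponential(scale=0.6, size=5000)    # non-members: typically higher loss

threshold = np.median(np.concatenate([train_loss, test_loss]))
tpr = (train_loss < threshold).mean()   # members correctly flagged as members
fpr = (test_loss < threshold).mean()    # non-members wrongly flagged as members
print(f"membership-inference advantage = {tpr - fpr:.3f} (0 = no leakage signal)")
```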
Fairness and Bias
Fairness and bias — as identified in the MAP function — are evaluated and results are documented.
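Two widely used group-fairness measurements are the demographic parity gap (difference in positive-prediction rates between groups) and the equalized-odds TPR gap. A sketch on synthetic data; the group labels and injected skew are illustrative only:

```python
# Demographic parity gap and equalized-odds TPR gap on synthetic predictions.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)                                  # protected attribute: 0/1
y_true = rng.integers(0, 2, n)
y_pred = np.where(rng.random(n) < 0.85, y_true, 1 - y_true)    # ~85% accurate stand-in
flip = (group == 1) & (rng.random(n) < 0.10)                   # inject mild skew for group 1
y_pred = np.where(flip, 0, y_pred)

pos_rate = lambda g: y_pred[group == g].mean()                       # P(pred=1 | group)
tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()          # P(pred=1 | y=1, group)

print(f"demographic parity gap = {abs(pos_rate(0) - pos_rate(1)):.3f}")
print(f"equalized-odds TPR gap = {abs(tpr(0) - tpr(1)):.3f}")
```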
Environmental Impact
Environmental impact and sustainability of AI model training and management activities — as identified in the MAP function — are assessed and documented.
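A first-order sustainability estimate multiplies accelerator hours by average power draw, a datacenter overhead factor (PUE), and grid carbon intensity. A sketch; every number below is an illustrative assumption, not a measured value:

```python
# Back-of-envelope training-emissions estimate from GPU hours.
gpu_hours = 1200            # total accelerator hours for a training run (assumed)
avg_power_kw = 0.35         # average draw per GPU in kW (assumed)
pue = 1.2                   # datacenter power usage effectiveness (assumed)
grid_kgco2_per_kwh = 0.4    # grid carbon intensity, kg CO2e per kWh (assumed)

energy_kwh = gpu_hours * avg_power_kw * pue
emissions_kg = energy_kwh * grid_kgco2_per_kwh
print(f"energy ~ {energy_kwh:.0f} kWh, emissions ~ {emissions_kg:.0f} kg CO2e")
```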
Effectiveness of Measures
Effectiveness of the employed TEVV metrics and processes in the MEASURE function is evaluated and documented.
Risk Tracking Approaches
Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts.
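In practice this is often a living risk register with named owners and review dates. A minimal sketch; the schema, IDs, and entries are hypothetical:

```python
# Hypothetical lightweight risk-tracking record.
from dataclasses import dataclass

@dataclass
class RiskEntry:
    risk_id: str
    description: str
    status: str          # e.g., "existing", "unanticipated", "emergent"
    owner: str
    last_reviewed: str   # ISO date of the most recent assessment

register = [
    RiskEntry("R-014", "score drift on a new traffic segment", "emergent", "ml-ops", "2024-05-01"),
]
for entry in register:
    print(entry)
```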
Difficult-to-Assess Risks
Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available.
Feedback Mechanisms
Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated.
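Integration usually means feedback lands in the same tracking system as other risks, with an explicit appeal status. A minimal sketch of an intake record; all fields and values are hypothetical:

```python
# Hypothetical feedback/appeal intake record feeding the risk-tracking process.
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackItem:
    source: str                           # "end_user" or "impacted_community"
    summary: str
    outcome_appealed: bool
    linked_risk_id: Optional[str] = None  # set once triaged into the risk register

item = FeedbackItem(
    source="end_user",
    summary="application scored inconsistently across resubmissions",
    outcome_appealed=True,
)
print(item)
```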
Measurement Approaches Validated
Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users. Approaches are documented.
Measurement Results
Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended.
Measurable Performance
Measurable performance improvements or declines based on consultations with relevant AI actors, including affected communities, and field data about context-relevant risks and trustworthiness characteristics, are identified and documented.