Data Quality Dimensions for AI
Core Contribution
Framework for assessing data quality specifically in AI/ML contexts. Extends traditional data quality dimensions with AI-specific considerations.
Traditional Dimensions Applied
- Accuracy, completeness, consistency, timeliness (standard DQ)
- Additional focus on label quality, representativeness
AI-Specific Dimensions
- Label Quality: Annotation accuracy, inter-annotator agreement
- Representativeness: Coverage of target distribution
- Provenance: Source documentation for audit
- Freshness: Temporal relevance for deployment context
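Two of these dimensions lend themselves to simple numeric proxies. The sketch below is illustrative only and is not taken from the framework itself: it assumes Cohen's kappa as one common measure of inter-annotator agreement for two annotators, and a category-coverage ratio as a crude stand-in for representativeness. The function names `cohens_kappa` and `category_coverage` are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: agreement is measured between exactly two annotators
    with categorical labels; the source does not prescribe a metric.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def category_coverage(dataset_values, target_categories):
    """Fraction of target categories that appear at least once in the data.

    Assumption: representativeness is approximated by category presence,
    which ignores how well the proportions match the target distribution.
    """
    target = set(target_categories)
    if not target:
        raise ValueError("target_categories must not be empty")
    return len(target & set(dataset_values)) / len(target)


annotator_1 = ["cat", "cat", "dog", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog"]
kappa = cohens_kappa(annotator_1, annotator_2)
coverage = category_coverage(["a", "b", "a"], ["a", "b", "c", "d"])
print(f"kappa={kappa:.2f}, coverage={coverage:.2f}")
```

A kappa near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. The coverage ratio only flags missing categories; comparing full distributions would need a stronger measure.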
Quality-Performance Relationship
Examines how data quality dimensions relate to model performance metrics. Not all quality dimensions are equally important for all tasks.
Related: 04-atom—data-quality-dimensions-consensus-gap, 04-molecule—data-cascades-concept