KPIs for AI Agents and Generative AI: A Rigorous Framework for Evaluation and Accountability
DOI: https://doi.org/10.38124/ijsrmt.v3i4.572

Keywords: AI Agents, Generative AI, KPIs, Evaluation Framework, Performance Metrics, Ethical Drift, Adaptability, Machine Learning, Multi-Modal AI, Deep Learning

Abstract
AI agents and generative AI systems are increasingly becoming integral across sectors such as healthcare, finance, and the creative industries. However, the rapid evolution of these systems has outpaced traditional evaluation methods, leaving significant gaps in how they are assessed. This paper proposes a comprehensive Key Performance Indicator (KPI) framework spanning five vital dimensions – Model Quality, System Performance, Business Impact, Human-AI Interaction, and Ethical and Environmental Considerations – to holistically evaluate these systems. Drawing insights from multiple studies, from benchmarks such as MLPerf and the AI Index, and from standards such as the EU AI Act [1] and the NIST AI RMF, the framework blends established metrics such as accuracy, latency, and efficiency with novel metrics such as “ethical drift” and “creative diversity” for tracking AI’s moral compass in real time. Evaluated on systems including GPT-4, DALL-E 3, and MidJourney, and validated through case studies such as Waymo [1] and Claude 3, this framework addresses technical, operational, and ethical dimensions to enhance accountability and performance.
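As a rough illustration of how scores across the paper's five dimensions might be rolled up into a single headline KPI, consider the following sketch. The dimension names come from the abstract; the individual scores, the weights, and the weighted-mean aggregation rule are illustrative assumptions, not the paper's prescribed method.

```python
# Hypothetical sketch: a weighted composite KPI over the framework's
# five dimensions. Weights and scores below are invented for
# illustration only.

DIMENSIONS = [
    "Model Quality",
    "System Performance",
    "Business Impact",
    "Human-AI Interaction",
    "Ethical and Environmental Considerations",
]

def composite_kpi(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-dimension scores, each assumed to lie in [0, 1]."""
    if set(scores) != set(DIMENSIONS) or set(weights) != set(DIMENSIONS):
        raise ValueError("scores and weights must cover all five dimensions")
    total_weight = sum(weights.values())
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight

# Example: equal weighting across dimensions.
scores = dict(zip(DIMENSIONS, [0.9, 0.8, 0.7, 0.85, 0.75]))
weights = {d: 1.0 for d in DIMENSIONS}
print(round(composite_kpi(scores, weights), 2))  # 0.8
```

In practice, an organization would tune the weights to its deployment context (e.g., weighting the ethical dimension more heavily in healthcare), which is precisely the kind of judgment the framework is meant to make explicit.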
License
Copyright (c) 2024 International Journal of Scientific Research and Modern Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.