KPIs for AI Agents and Generative AI: A Rigorous Framework for Evaluation and Accountability

Authors

  • Vivek Lakshman Bhargav Sunkara Citi, USA

DOI:

https://doi.org/10.38124/ijsrmt.v3i4.572

Keywords:

AI Agents, Generative AI, KPIs, Evaluation Framework, Performance Metrics, Ethical Drift, Adaptability, Machine Learning, Multi-Modal AI, Deep Learning

Abstract

AI agents and generative AI systems are becoming integral across sectors such as healthcare, finance, and the creative industries. However, the rapid evolution of these systems has outpaced traditional evaluation methods, leaving significant gaps in how they are assessed. This paper proposes a comprehensive Key Performance Indicator (KPI) framework spanning five vital dimensions – Model Quality, System Performance, Business Impact, Human-AI Interaction, and Ethical and Environmental Considerations – to evaluate these systems holistically. Drawing on multiple studies, benchmarks such as MLPerf and the AI Index, and standards such as the EU AI Act [1] and the NIST AI RMF, the framework blends established metrics like accuracy, latency, and efficiency with novel metrics like “ethical drift” and “creative diversity” for tracking AI’s moral compass in real time. Applied to systems such as GPT-4, DALL-E 3, and MidJourney, and validated through case studies including Waymo [1] and Claude 3, the framework addresses technical, operational, and ethical dimensions to enhance accountability and performance.
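To illustrate the kind of evaluation the abstract describes, the sketch below aggregates normalized metric scores across the five named dimensions. The dimension names come from the abstract, but the individual metric names (beyond accuracy, latency, efficiency, ethical drift, and creative diversity), the example scores, the equal weighting, and the aggregation rule are hypothetical placeholders, not the paper's actual methodology.

# Minimal sketch, assuming equal-weighted aggregation of normalized [0, 1] metric scores;
# metric names and values below are illustrative only.
from dataclasses import dataclass, field

@dataclass
class KPIDimension:
    """Scores for one evaluation dimension, each metric normalized to [0, 1]."""
    name: str
    metrics: dict[str, float] = field(default_factory=dict)

    def score(self) -> float:
        """Unweighted mean of this dimension's metric scores (0.0 if empty)."""
        return sum(self.metrics.values()) / len(self.metrics) if self.metrics else 0.0

def composite_kpi(dimensions: list[KPIDimension], weights: dict[str, float]) -> float:
    """Weighted aggregate across dimensions; weights are assumed to sum to 1."""
    return sum(weights.get(d.name, 0.0) * d.score() for d in dimensions)

# Hypothetical scores for a generative system under evaluation.
dims = [
    KPIDimension("Model Quality", {"accuracy": 0.91, "creative_diversity": 0.78}),
    KPIDimension("System Performance", {"latency": 0.85, "efficiency": 0.80}),
    KPIDimension("Business Impact", {"task_completion": 0.88}),
    KPIDimension("Human-AI Interaction", {"user_satisfaction": 0.82}),
    KPIDimension("Ethical and Environmental Considerations",
                 {"ethical_drift": 0.95, "energy_use": 0.70}),
]
weights = {d.name: 0.2 for d in dims}  # equal weighting, for illustration only
print(f"Composite KPI: {composite_kpi(dims, weights):.3f}")

In practice, the weighting would reflect deployment priorities (for example, weighting the ethical dimension more heavily in regulated sectors); the equal split here is only to keep the example self-contained.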

Published

2024-04-28

How to Cite

Lakshman Bhargav Sunkara, V. (2024). KPIs for AI Agents and Generative AI: A Rigorous Framework for Evaluation and Accountability. International Journal of Scientific Research and Modern Technology, 3(4), 22–29. https://doi.org/10.38124/ijsrmt.v3i4.572

