InfraLLM: A Generic Large Language Model Framework for Production-Grade Microservice Auto-Scaling in Cloud Infrastructure

Authors

  • Muhamed Ramees Cheriya Mukkolakkal

DOI:

https://doi.org/10.38124/ijsrmt.v4i11.1023

Abstract

Current microservice auto-scaling solutions operate in isolation, focusing on individual service metrics without considering global cloud resource availability, cross-datacenter performance, or mission-critical application priorities. This paper presents InfraLLM, a novel framework leveraging large language models to orchestrate intelligent, context-aware auto-scaling decisions across entire cloud infrastructures. Our approach integrates three key components: a distributed Collection Service for comprehensive metric aggregation, an LLM Service for predictive resource allocation, and an Execution Service for policy enforcement. Evaluation across large-scale Kubernetes deployments demonstrates up to 57.2% reduction in CPU overutilization, 51.1% improvement in resource allocation efficiency, 48% reduction in average response time, and 16× reduction in SLO violations compared to traditional per-service auto-scaling approaches. InfraLLM represents a paradigm shift from reactive, service-level scaling to proactive, infrastructure-wide resource orchestration.
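To make the three-component pipeline concrete, below is a minimal sketch of the Collection → LLM → Execution control loop the abstract describes. All names here (CollectionService, LLMService, ExecutionService, ScalingDecision) are hypothetical illustrations, not the paper's actual API, and the LLM decision step is stood in by a simple threshold rule, since the abstract does not specify the model's prompting or inference details.

    # Hypothetical sketch of InfraLLM's three-service loop (names invented
    # for illustration; the real framework's interfaces are not shown here).
    from dataclasses import dataclass

    @dataclass
    class ScalingDecision:
        service: str
        replicas: int
        reason: str

    class CollectionService:
        """Aggregates per-service metrics into one infrastructure-wide view."""
        def snapshot(self) -> dict:
            # A real deployment would pull these from cluster telemetry;
            # hard-coded values stand in for a live metrics feed.
            return {"checkout": {"cpu": 0.91, "p95_ms": 480, "replicas": 3},
                    "catalog":  {"cpu": 0.35, "p95_ms": 120, "replicas": 5}}

    class LLMService:
        """Turns the global snapshot into proactive scaling decisions.
        A threshold rule substitutes for the LLM call in this sketch."""
        def decide(self, snapshot: dict) -> list[ScalingDecision]:
            decisions = []
            for name, m in snapshot.items():
                if m["cpu"] > 0.80:  # overutilized: scale out
                    decisions.append(ScalingDecision(
                        name, m["replicas"] + 2, "high CPU and latency"))
                elif m["cpu"] < 0.40 and m["replicas"] > 2:  # reclaim slack
                    decisions.append(ScalingDecision(
                        name, m["replicas"] - 1, "low utilization"))
            return decisions

    class ExecutionService:
        """Enforces decisions against the cluster (stubbed as a print)."""
        def apply(self, decision: ScalingDecision) -> None:
            print(f"scale {decision.service} -> {decision.replicas} "
                  f"({decision.reason})")

    if __name__ == "__main__":
        collector = CollectionService()
        planner = LLMService()
        executor = ExecutionService()
        for d in planner.decide(collector.snapshot()):
            executor.apply(d)

The point of the sketch is the data flow: decisions are made from a global snapshot of all services at once, rather than each service reacting to its own metrics in isolation.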


Published

2025-12-08

How to Cite

Cheriya Mukkolakkal, M. R. (2025). InfraLLM: A Generic Large Language Model Framework for Production-Grade Microservice Auto-Scaling in Cloud Infrastructure. International Journal of Scientific Research and Modern Technology, 4(11), 113–123. https://doi.org/10.38124/ijsrmt.v4i11.1023
