Continuous performance and scale validation of Red Hat OpenShift AI model-serving stack

2024-01-17 in RHOAI, PSAP, Work

Today, my blog post on continuous performance and scale testing of the Red Hat OpenShift AI KServe model-serving stack was published!

It was a great collaboration with several people from the PSAP team and the RHOAI QE/dev teams.

Continuous performance and scale validation of Red Hat OpenShift AI model-serving stack

It presents the results of three different flavors of performance and scale testing. Each flavor focuses on a particular aspect of the KServe model serving stack:

Single-model performance testing
- Focuses on the performance of the model and its serving runtime, to verify that it does not regress across releases.
Multi-model performance and scale testing
- Focuses on the performance of the model serving stack when running under heavy load but at low scale.
Many-model scale testing
- Focuses on the scalability of the model deployment stack when running at large scale.
