Introducing KubeAI: Open AI Inference Operator

This content originally appeared on DEV Community and was authored by Sam Stoelinga

We recently launched KubeAI. The goal of KubeAI is to get LLMs, embedding models, and speech-to-text models running on Kubernetes with ease.

KubeAI provides an OpenAI-compatible API endpoint, so it works out of the box with most software that already speaks the OpenAI APIs.
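For example, an existing client such as the official OpenAI Python SDK can simply be pointed at the KubeAI endpoint. Here is a minimal sketch; the base URL, model name, and placeholder API key are assumptions for illustration and will depend on how your cluster and models are set up.

```python
# Minimal sketch: talking to KubeAI through the OpenAI Python client.
# The base_url and model name below are assumptions for illustration --
# use whatever Service address and model names your KubeAI install exposes.
from openai import OpenAI

client = OpenAI(
    base_url="http://kubeai/openai/v1",  # assumed in-cluster Service address
    api_key="not-needed",                # placeholder; a self-hosted endpoint typically ignores it
)

response = client.chat.completions.create(
    model="gemma2-2b-cpu",               # hypothetical model name
    messages=[{"role": "user", "content": "Say hello from inside the cluster."}],
)
print(response.choices[0].message.content)
```

Because the API surface matches OpenAI's, switching an existing application over is usually just a matter of changing the base URL.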

Repo on GitHub: substratusai/kubeai

When it comes to LLMs, KubeAI directly operates vLLM and Ollama servers in isolated Pods, configured and optimized on a model-by-model basis. You get metrics-based autoscaling out of the box, including scale-from-zero. When you hear scale-from-zero in Kubernetes land you probably think Knative and Istio, but not in KubeAI: we made an early design decision to avoid any external dependencies (Kubernetes is complicated enough as-is).
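To give a flavor of what per-model configuration might look like, here is a hedged sketch of registering a model from Python with the official Kubernetes client. The API group/version and spec fields shown are assumptions based on the project's general design, not the authoritative schema; check the repo for the actual Model CRD before relying on this.

```python
# Hedged sketch: creating a KubeAI "Model" custom resource from Python.
# The group/version and spec fields are assumptions for illustration --
# consult the KubeAI repo for the real schema.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside a Pod
api = client.CustomObjectsApi()

model = {
    "apiVersion": "kubeai.org/v1",         # assumed API group/version
    "kind": "Model",
    "metadata": {"name": "llama-3.1-8b"},  # hypothetical model name
    "spec": {
        "engine": "VLLM",                  # assumed; KubeAI also operates Ollama servers
        "url": "hf://meta-llama/Llama-3.1-8B-Instruct",  # assumed model source format
        "minReplicas": 0,                  # scale-from-zero: no Pods while the model is idle
        "maxReplicas": 3,                  # upper bound for metrics-based autoscaling
    },
}

api.create_namespaced_custom_object(
    group="kubeai.org", version="v1", namespace="default",
    plural="models", body=model,
)
```

With a spec along these lines, the autoscaler can keep zero replicas around when a model is idle and spin Pods back up when requests arrive.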

We are hoping to release more functionality soon. Next up: model caching, metrics, and a dashboard.

If you need any help or have any feedback, reach out directly here or via the channels listed in the repo. Assisting the project's early adopters is currently our top priority. So far, users have seen success in use cases ranging from large-scale batch processing in the cloud to lightweight inference at the edge.
