Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
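As a rough sketch of what the core distributed runtime provides (assuming a local Ray installation; the `square` task is made up for illustration, not taken from Ray's docs):
```python
import ray

ray.init()  # start or connect to a local Ray runtime

@ray.remote
def square(x):
    # An ordinary Python function turned into a distributed task.
    return x * x

# Launch tasks in parallel and gather their results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```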
A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.
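As a hedged illustration of the kind of dynamic micro-batching such a framework puts under your control, here is a generic asyncio sketch in plain Python; it is not that framework's actual API, and `model_fn` stands in for any batched model call:
```python
import asyncio

MAX_BATCH = 8       # flush when this many requests are queued
MAX_WAIT_S = 0.01   # or after this many seconds, whichever comes first

async def batch_worker(queue: asyncio.Queue, model_fn):
    """Collect requests into a batch, run one batched model call, fan out results."""
    while True:
        items = [await queue.get()]  # block until the first request arrives
        deadline = asyncio.get_running_loop().time() + MAX_WAIT_S
        while len(items) < MAX_BATCH:
            remaining = deadline - asyncio.get_running_loop().time()
            if remaining <= 0:
                break
            try:
                items.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        inputs, futures = zip(*items)
        for fut, out in zip(futures, model_fn(list(inputs))):  # single batched call
            fut.set_result(out)

async def infer(queue: asyncio.Queue, x):
    """Client-side helper: enqueue one input and await its result."""
    fut = asyncio.get_running_loop().create_future()
    await queue.put((x, fut))
    return await fut
```
A real server would expose `infer` behind an HTTP handler and run `batch_worker` as a background task.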
High-performance inference and deployment toolkit for LLMs and VLMs based on PaddlePaddle
Database system for AI-powered apps
TensorFlow template application for deep learning
DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/
Python + Inference: a model deployment library in Python, and the simplest model inference server ever.
A unified end-to-end machine intelligence platform
Lineage metadata API, artifact streams, sandbox, API, and spaces for Polyaxon
ML pipeline orchestration and model deployments on Kubernetes.
MLModelCI is a complete MLOps platform for managing, converting, profiling, and deploying MLaaS (Machine Learning-as-a-Service), bridging the gap between current ML training and serving systems.
Serving inside PyTorch
ClearML - Model-Serving Orchestration and Repository Solution
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
Deploy DL/ML inference pipelines with minimal extra code.
A Streaming-Native Serving Engine for TTS/STS Models
[⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI