Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
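As a rough sketch of what the core distributed runtime provides (assuming a local Ray installation; the `square` task is made up for illustration, not taken from Ray's docs):
```python
import ray

ray.init()  # start or connect to a local Ray runtime

@ray.remote
def square(x):
    # An ordinary Python function turned into a distributed task.
    return x * x

# Launch tasks in parallel and gather their results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```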
A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.
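As a hedged illustration of the kind of dynamic micro-batching such a framework puts under your control, here is a generic asyncio sketch in plain Python; it is not that framework's actual API, and `model_fn` stands in for any batched model call:
```python
import asyncio

MAX_BATCH = 8       # flush when this many requests are queued
MAX_WAIT_S = 0.01   # or after this many seconds, whichever comes first

async def batch_worker(queue: asyncio.Queue, model_fn):
    """Collect requests into a batch, run one batched model call, fan out results."""
    while True:
        items = [await queue.get()]  # block until the first request arrives
        deadline = asyncio.get_running_loop().time() + MAX_WAIT_S
        while len(items) < MAX_BATCH:
            remaining = deadline - asyncio.get_running_loop().time()
            if remaining <= 0:
                break
            try:
                items.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        inputs, futures = zip(*items)
        for fut, out in zip(futures, model_fn(list(inputs))):  # single batched call
            fut.set_result(out)

async def infer(queue: asyncio.Queue, x):
    """Client-side helper: enqueue one input and await its result."""
    fut = asyncio.get_running_loop().create_future()
    await queue.put((x, fut))
    return await fut
```
A real server would expose `infer` behind an HTTP handler and run `batch_worker` as a background task.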
High-performance inference and deployment toolkit for LLMs and VLMs based on PaddlePaddle
Database system for AI-powered apps
TensorFlow template application for deep learning
DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/
Python + Inference: a model deployment library in Python, and the simplest model inference server ever.
A unified end-to-end machine intelligence platform
Lineage metadata API, artifact streams, sandbox, API, and spaces for Polyaxon
ML pipeline orchestration and model deployments on Kubernetes.
MLModelCI is a complete MLOps platform for managing, converting, profiling, and deploying MLaaS (Machine Learning-as-a-Service), bridging the gap between current ML training and serving systems.
Serving inside PyTorch
ClearML - Model-Serving Orchestration and Repository Solution
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
Deploy DL/ML inference pipelines with minimal extra code.
A Streaming-Native Serving Engine for TTS/STS Models
[⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI