feat: add MLX local model support for Apple Silicon #544
wangericx wants to merge 2 commits into virattt:main
Conversation
Enables running LLMs locally on Apple Silicon (M1/M2/M3/M4) via the mlx-lm library, with no API key required.

- src/utils/mlx_lm.py: MLX inference engine wrapping mlx-lm generate; exposes a LangChain-compatible chat interface
- src/llm/mlx_models.json: curated list of MLX-compatible Hugging Face model IDs (Llama, Mistral, Gemma, Qwen, Phi families)
- src/llm/models.py: registers 'mlx' as a provider; get_model() returns the MLX chat model when the provider is 'mlx'
- src/utils/llm.py: passes MLX_API_KEY / the mlx base URL through LangChain call_options so the inference-server URL is configurable
- app/backend/routes/mlx.py: GET /mlx/models endpoint returns the available MLX models for the frontend model selector
- app/frontend/src/components/settings/models/mlx.tsx: UI panel to configure the MLX server URL and browse available models
- app/frontend/src/data/models.ts: fetches and merges MLX models into the global model list used by all node selectors
- docker/run.sh: passes the MLX_API_KEY env var through to the container

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
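The core of the change is the LangChain-compatible chat wrapper around mlx-lm. The PR's actual code is not shown here, so the sketch below is a hypothetical illustration of the interface shape: the class name, ChatMessage type, and injected generate function are all assumptions. The generator is injected so the invoke() contract can be demonstrated without an Apple Silicon machine; the real engine would obtain it from mlx_lm.load()/generate().

```python
# Hypothetical sketch of a LangChain-style chat wrapper such as the one
# described for src/utils/mlx_lm.py. Names and structure are assumptions,
# not the PR's actual code. The real engine would do roughly:
#   model, tokenizer = mlx_lm.load(model_id)
#   generate_fn = lambda p: mlx_lm.generate(model, tokenizer, prompt=p)
from dataclasses import dataclass
from typing import Callable

@dataclass
class ChatMessage:
    role: str      # "system" | "user" | "assistant"
    content: str

class MLXChatModel:
    def __init__(self, generate_fn: Callable[[str], str]):
        # Injected text-generation callable (prompt -> completion).
        self._generate = generate_fn

    def _format_prompt(self, messages: list[ChatMessage]) -> str:
        # Naive chat-template flattening for illustration; real models
        # would use their tokenizer's chat template instead.
        lines = [f"{m.role}: {m.content}" for m in messages]
        lines.append("assistant:")
        return "\n".join(lines)

    def invoke(self, messages: list[ChatMessage]) -> ChatMessage:
        # Mirrors the shape of LangChain's BaseChatModel.invoke():
        # a list of messages in, one assistant message out.
        reply = self._generate(self._format_prompt(messages))
        return ChatMessage(role="assistant", content=reply)
```

With this shape, get_model() can hand back an MLXChatModel anywhere the rest of the codebase expects a LangChain chat model, which is presumably why the PR exposes the wrapper through the existing provider registry.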
Prevents a backend crash when a previously selected MLX model (or any removed model) is sent to a backend that no longer supports it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
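The guard in this second commit can be illustrated with a small sketch: validate the requested model against the currently supported set and fall back gracefully instead of raising deep inside the request path. The function name, model IDs, and fallback policy below are assumptions for illustration, not the PR's code.

```python
# Hypothetical illustration of the crash guard described above: a model
# that was selected in the frontend earlier may no longer exist on this
# backend (e.g. an MLX model on a non-Apple host), so resolve it up
# front instead of failing later. Names/defaults are assumptions.
SUPPORTED_MODELS = {
    "gpt-4o",
    "mlx-community/Llama-3.2-3B-Instruct-4bit",
}

def resolve_model(requested: str, default: str = "gpt-4o") -> str:
    # Fall back to a known-good default when the stored selection
    # is no longer supported by this backend.
    if requested in SUPPORTED_MODELS:
        return requested
    return default
```

A design note: falling back silently keeps old sessions working, though logging a warning at the fallback point would make the substitution visible to the user.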
Summary
- Run models locally via mlx-lm (no API key required)
- src/utils/mlx_lm.py: MLX inference engine with a LangChain-compatible chat interface
- src/llm/mlx_models.json: curated list of MLX-compatible models (Llama, Mistral, Gemma, Qwen, Phi)
- Registers mlx as a provider in src/llm/models.py; get_model() returns the MLX model when the provider is mlx
- GET /mlx/models backend endpoint to serve available models to the frontend
- Passes the MLX_API_KEY env var through Docker

Test plan

- Install mlx-lm and start a local MLX server
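The GET /mlx/models endpoint mentioned above pairs with the curated mlx_models.json file. A minimal sketch of that pairing is below; the JSON schema ({"models": [...]}) and function name are assumptions, since the PR does not show the file's contents, and the commented route shows roughly how it could be wired into a FastAPI router.

```python
# Hypothetical sketch of the model-list loader behind GET /mlx/models.
# The JSON schema and names here are assumptions, not the PR's code.
import json
from pathlib import Path

def load_mlx_models(path: Path) -> list[str]:
    # Assumed file shape for src/llm/mlx_models.json:
    #   {"models": ["mlx-community/Llama-3.2-3B-Instruct-4bit", ...]}
    data = json.loads(path.read_text())
    return data.get("models", [])

# In a FastAPI backend the route could look roughly like:
#   @router.get("/mlx/models")
#   def mlx_models():
#       return {"models": load_mlx_models(MODELS_JSON_PATH)}
```

Serving the curated list from the backend (rather than bundling it in the frontend) lets app/frontend/src/data/models.ts fetch and merge it at runtime, which matches the PR's description of the frontend model selector.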