
Running Large Language Models Locally with Docker

April 24, 2025
Tags: Docker, LLMs, AI, local development, Docker Model Runner, AnythingLLM, llama.cpp, OpenAI, RAG, AI agents
Docker Model Runner and AnythingLLM simplify the process of setting up, managing, and experimenting with large language models (LLMs) on local machines: Model Runner offers a quick command-line path to local inference, while AnythingLLM provides a full-featured AI application.

Running LLMs Locally with Docker

Video: Run LLMs Locally with Docker Model Runner | Simplify AI Dev with Docker Desktop

Running large language models (LLMs) locally using Docker has become more accessible with tools like Docker Model Runner and AnythingLLM. These tools simplify the process of setting up, managing, and experimenting with LLMs on your local machine.

Docker Model Runner

Docker Model Runner, introduced in beta in Docker Desktop 4.40 for macOS on Apple silicon, allows developers to easily pull, run, and test LLMs locally. It provides:

  • Local LLM inference powered by an integrated engine built on top of llama.cpp.
  • GPU acceleration on Apple silicon.
  • A collection of popular, ready-to-use models packaged as standard OCI artifacts.

To enable Docker Model Runner, run docker desktop enable model-runner. You can also expose the service on a TCP port so that processes on the host can reach it, as shown below.
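
For example, to enable Model Runner and expose it on TCP port 12434 (the port used in Docker's documentation; adjust to suit your setup):

docker desktop enable model-runner --tcp 12434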

For example, to pull and run a model like SmolLM2, use:

docker model pull ai/smollm2:360M-Q4_K_M
docker model run ai/smollm2:360M-Q4_K_M "Give me a fact about whales."
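
To confirm the pull succeeded, list the models available locally:

docker model list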

Model Runner exposes an OpenAI-compatible API, making it easy to integrate with existing applications.
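
As a minimal sketch, assuming Model Runner was exposed on TCP port 12434 as above (the exact API path may vary between Docker Desktop releases, so check the Model Runner docs), a chat completion request looks like this:

curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:360M-Q4_K_M",
    "messages": [
      {"role": "user", "content": "Give me a fact about whales."}
    ]
  }'

Because the API follows the OpenAI schema, existing OpenAI client libraries can target this endpoint by changing only the base URL.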

AnythingLLM

AnythingLLM is an all-in-one AI application that supports both desktop and Docker installations. It offers:

  • Built-in RAG (Retrieval-Augmented Generation) and AI agents.
  • Multi-user support and permissions.
  • Compatibility with various LLMs, vector databases, and embedding models.

AnythingLLM allows you to chat with your documents, create custom AI agents, and manage workspaces efficiently. It supports a wide range of LLM providers, including OpenAI and Azure OpenAI, as well as open-source models run locally through engines such as llama.cpp.

A typical invocation is sketched below; for full setup instructions, refer to the AnythingLLM Docker documentation.
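
The following is a minimal sketch based on the AnythingLLM Docker documentation at the time of writing; the storage path is an arbitrary choice, and the currently recommended flags may differ:

# Create a persistent storage location on the host
export STORAGE_LOCATION=$HOME/anythingllm
mkdir -p $STORAGE_LOCATION
touch "$STORAGE_LOCATION/.env"

# Start AnythingLLM with the web UI on port 3001
docker run -d -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v ${STORAGE_LOCATION}:/app/server/storage \
  -v ${STORAGE_LOCATION}/.env:/app/server/.env \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm

Once the container is running, open http://localhost:3001; data and progress persist across container rebuilds because they live in the mounted storage directory.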

Conclusion

Both Docker Model Runner and AnythingLLM provide powerful solutions for running LLMs locally. Whether you're looking for a quickstart with Docker Model Runner or a comprehensive AI application with AnythingLLM, these tools make it easier to integrate AI into your local development workflow.

Sources

  • Mintplex-Labs/anything-llm: The all-in-one Desktop & Docker AI ... A full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting.
  • Local Docker Installation - AnythingLLM Docs: Go to http://localhost:3001 and you are now using AnythingLLM! All your data and progress will persist between container rebuilds or pulls from Docker Hub.
  • Run LLMs Locally with Docker: A Quickstart Guide to Model Runner: Now available in Beta with Docker Desktop 4.40 for macOS on Apple silicon, Model Runner makes it easy to pull, run, and experiment with LLMs ...