An MCP server enabling LLMs to interact with Databricks for SQL queries and job management.
Databricks MCP Server Integration
Overview
The Databricks MCP Server is a Model Context Protocol (MCP) server that connects to the Databricks API, enabling Large Language Models (LLMs) to interact with Databricks. It allows users to run SQL queries, list jobs, and retrieve job statuses directly from their Databricks workspace.
Features
- Run SQL Queries: Execute SQL queries on Databricks SQL warehouses.
- List Jobs: Retrieve a list of all Databricks jobs in your workspace.
- Job Status: Get the status of specific Databricks jobs.
- Job Details: Access detailed information about specific Databricks jobs.
Prerequisites
- Python 3.7+: Ensure you have Python 3.7 or later installed.
- Databricks Workspace: You need a Databricks workspace with:
  - Personal access token
  - SQL warehouse endpoint
  - Permissions to run queries and access jobs
Setup
- Clone the Repository:
bash
git clone https://github.com/JordiNeil/mcp-databricks-server.git
cd mcp-databricks-server
- Create and Activate a Virtual Environment:
bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install Dependencies:
bash
pip install -r requirements.txt
- Create a .env File:
bash
DATABRICKS_HOST=your-databricks-instance.cloud.databricks.com
DATABRICKS_TOKEN=your-personal-access-token
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/your-warehouse-id
- Test Your Connection:
bash
python test_connection.py
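Before running the connection test, it can help to confirm that the .env file actually defines all three variables. The following is a minimal sketch using only the standard library; the helper names `parse_env` and `missing_keys` are hypothetical, and the repository itself may load the file differently (for example with python-dotenv).

```python
REQUIRED_KEYS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_HTTP_PATH")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty, in a fixed order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

To use it, read the file contents (for example with `pathlib.Path(".env").read_text()`), pass them to `parse_env`, and stop early if `missing_keys` returns anything.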
Obtaining Databricks Credentials
- Host: Your Databricks instance hostname, without the https:// prefix (e.g., your-instance.cloud.databricks.com).
- Token: Create a personal access token in Databricks:
  - Go to User Settings > Developer > Manage Access Tokens.
  - Generate a new token and save it immediately; it cannot be viewed again later.
- HTTP Path: For your SQL warehouse:
  - Go to SQL Warehouses in Databricks.
  - Select your warehouse and copy the HTTP Path.
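These three values typically map onto a SQL warehouse connection as sketched below. The sketch assumes the databricks-sql-connector package, whose `databricks.sql.connect` accepts `server_hostname`, `http_path`, and `access_token`. Whether this repository uses that package is an assumption, and `connection_kwargs` and `run_query` are hypothetical helper names.

```python
def connection_kwargs(env):
    """Map the .env variable names onto connector keyword arguments."""
    return {
        "server_hostname": env["DATABRICKS_HOST"],
        "http_path": env["DATABRICKS_HTTP_PATH"],
        "access_token": env["DATABRICKS_TOKEN"],
    }


def run_query(env, statement):
    """Run one statement and return all rows.

    Requires `pip install databricks-sql-connector`; the import is lazy so
    the mapping above can be used and tested without the package installed.
    """
    from databricks import sql

    with sql.connect(**connection_kwargs(env)) as connection:
        with connection.cursor() as cursor:
            cursor.execute(statement)
            return cursor.fetchall()
```

For example, `run_query(env, "SELECT 1")` would serve as a quick end-to-end check once the warehouse is running.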
Running the Server
Start the MCP server:
python main.py
Test the MCP server using the inspector:
npx @modelcontextprotocol/inspector python3 main.py
Available MCP Tools
- run_sql_query(sql: str): Execute SQL queries on your Databricks SQL warehouse.
- list_jobs(): List all Databricks jobs in your workspace.
- get_job_status(job_id: int): Get the status of a specific Databricks job by ID.
- get_job_details(job_id: int): Get detailed information about a specific Databricks job.
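As a rough illustration of what the job tools might do underneath, the sketch below builds authenticated requests against the Databricks Jobs REST API (`/api/2.1/jobs/list` and `/api/2.1/jobs/get`). It assumes the server calls the REST API directly with a bearer token. The helper names are hypothetical, and the actual tool implementations in the repository may differ.

```python
import json
import urllib.parse
import urllib.request


def build_jobs_request(host, token, endpoint, params=None):
    """Build a GET request for a Jobs API endpoint such as 'jobs/list'."""
    url = f"https://{host}/api/2.1/{endpoint}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_json(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Requests that list_jobs() and get_job_details(job_id=123) could correspond to.
# The host and token values are placeholders.
list_request = build_jobs_request(
    "your-instance.cloud.databricks.com", "your-token", "jobs/list"
)
details_request = build_jobs_request(
    "your-instance.cloud.databricks.com", "your-token", "jobs/get", {"job_id": 123}
)
```

Building the request separately from sending it keeps the URL and header logic easy to inspect without network access; `fetch_json(list_request)` performs the actual call.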
Example Usage with LLMs
When used with LLM clients that support MCP, this server enables natural-language interaction with your Databricks environment:
- "Show me all tables in the database"
- "Run a query to count records in the customer table"
- "List all my Databricks jobs"
- "Check the status of job #123"
- "Show me details about job #456"
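Behind a prompt such as "Check the status of job #123", an MCP client sends the server a JSON-RPC `tools/call` request naming the tool and its arguments. The sketch below shows roughly what such a message looks like for the tools listed above. The request `id` values, the SQL text, and the helper name `tool_call` are illustrative, and how a given LLM picks the tool is up to the client.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "Check the status of job #123"
status_request = tool_call(1, "get_job_status", {"job_id": 123})

# "Run a query to count records in the customer table"
query_request = tool_call(
    2, "run_sql_query", {"sql": "SELECT COUNT(*) FROM customer"}
)

print(json.dumps(status_request, indent=2))
```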
Troubleshooting
Connection Issues
- Ensure your Databricks host is correct and does not include the https:// prefix.
- Check that your SQL warehouse is running and accessible.
- Verify your personal access token has the necessary permissions.
- Run the included test script: python test_connection.py
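A common cause of the first issue is pasting the full workspace URL into DATABRICKS_HOST. The following is a minimal sketch of normalizing the value before use; `normalize_host` is a hypothetical helper and is not part of the repository.

```python
def normalize_host(value):
    """Strip a URL scheme and any trailing path or slash from a host value."""
    host = value.strip()
    for scheme in ("https://", "http://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
            break
    # Keep only the hostname; drop anything after the first slash.
    return host.split("/", 1)[0]
```

A value that is already a bare hostname passes through unchanged, so the helper is safe to apply unconditionally.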
Security Considerations
- Token Security: Your Databricks personal access token provides direct access to your workspace.
- Environment File: Secure your .env file and never commit it to version control.
- Permission Scopes: Consider using a Databricks token limited to the permission scopes the server needs.
- Secure Environment: Run this server in a secure environment.
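Two of these points can be checked mechanically. The sketch below masks a token before it is written to logs and checks whether .gitignore lists the .env file. Both helper names are hypothetical, and the .gitignore check only looks for a literal entry rather than evaluating glob patterns.

```python
def mask_token(token, visible=4):
    """Return the token with all but the last few characters hidden."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


def env_is_ignored(gitignore_text):
    """Check for a literal .env entry; glob patterns are not evaluated."""
    entries = {line.strip() for line in gitignore_text.splitlines()}
    return ".env" in entries or "/.env" in entries
```

Logging `mask_token(token)` instead of the raw value keeps enough of the token visible to tell two tokens apart without exposing it.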
About
The Databricks MCP Server is designed to enhance the capabilities of LLMs by integrating them with Databricks, allowing for seamless interaction and data retrieval.