Simplify Databricks metadata retrieval with MCP Server using FastMCP and Databricks SDK automation
This project introduces an MCP (Model Context Protocol) server designed to simplify interactions with a Databricks workspace. Built on the FastMCP framework, it exposes tools that let users query and retrieve detailed information about Databricks resources such as schemas, tables, table samples, and job results directly via MCP commands. The primary goal is to streamline common metadata retrieval tasks for users interacting through the MCP interface, while relying on the robust capabilities of the Databricks SDK and CLI.
The core design of this server emphasizes efficiency and ease of use. It allows AI applications such as Claude Desktop, Continue, and Cursor to connect seamlessly to specific data sources within a Databricks workspace. By adhering to the Model Context Protocol (MCP), the server ensures interoperability across platforms, making it easier for developers to build AI workflows.
Authentication relies on `databricks-cli` profiles to ensure secure access.

The architecture of this server is designed to adhere closely to the Model Context Protocol (MCP). Key components include:
- The `init.py` script configures the connection to your Databricks workspace, ensuring that all necessary authentication steps and metadata queries are properly set up.
- The `uv` package is used for easy virtual environment creation and dependency installation.
- Tools such as `get_schemas`, `get_table_sample_tool`, and others are defined in Python scripts using FastMCP to handle MCP communication via standard input/output.

```mermaid
graph TD
    A[AI Application] -->|MCP Client| B[MCP Server]
    B --> C[Databricks Workspace]
    C --> D[Data Source/Tool]
    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style C fill:#e8f5e8
```
This diagram illustrates how an AI application interacts with the MCP server, which then communicates with the Databricks workspace and retrieves necessary data.
Install the `uv` package manager (for example, via pip):

```bash
pip install uv
```
Clone this repository:

```bash
git clone https://github.com/your-repo-url.git
cd MCP-Server-for-Databricks-Interaction
```

Create and activate a virtual environment with dependencies:

```bash
uv venv                     # Create a virtual environment (e.g., .venv)
uv sync                     # Install dependencies using pyproject.toml and uv.lock
source .venv/bin/activate   # On Windows, use `.venv\Scripts\activate`
```
Run the initialization script:

```bash
python init.py
```

The `config.yaml` file will be populated with the necessary settings.

Use Case 1: Data Quality Verification
```python
def verify_data_quality():
    # Step 1: Use the MCP server to retrieve a sample of the table
    result = mcp.send_command(
        "get_table_sample_tool",
        {"catalog": "warehouse1", "schema_name": "sales", "table": "orders"},
    )
    # Step 2: Analyze the sample and validate data quality
    if not is_data_valid(result["sample_values"]):
        raise ValueError("Data quality issues detected")
```
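The `is_data_valid` helper is assumed but not defined in the snippet above. A minimal illustrative implementation (an assumption, not part of this project) could simply reject samples containing nulls:

```python
# Hypothetical validator for the use case above: a sample is "valid"
# if no row contains a None value. Real checks would be richer
# (types, ranges, referential rules, etc.).
def is_data_valid(sample_values):
    return all(
        value is not None
        for row in sample_values
        for value in row
    )

print(is_data_valid([[1, "a"], [2, "b"]]))  # True
print(is_data_valid([[1, None]]))           # False
```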
Use Case 2: Job Execution Monitoring

```python
def monitor_job():
    latest_run = mcp.send_command("get_job_run_result", {"job_name": "data-processing-job"})
    # Check for errors or failures in the job run
    if "error" in latest_run:
        print(f"Job encountered an error: {latest_run['error']}")
    else:
        print("Recent job run executed successfully")
```
The server is compatible with select MCP clients, including:
| MCP Client | Resources | Tools | Prompts | Status |
|---|---|---|---|---|
| Claude Desktop | ✅ | ✅ | ✅ | Full Support |
| Continue | ✅ | ✅ | ✅ | Full Support |
| Cursor | ❌ | ✅ (limited) | ❌ | Tools Only |
Advanced configuration options live in `config.yaml`. To register the server with an MCP client, add an entry like the following to the client's configuration:

```json
{
  "mcpServers": {
    "MCP-Server-for-Databricks-Interaction": {
      "command": "./server/uv",
      "args": [
        "--directory",
        "/path/to/MCP-Server-for-Databricks-Interaction",
        "run",
        "main.py"
      ],
      "env": {
        "API_KEY": "your_secret_api_key"
      }
    }
  }
}
```
Q: How does the server handle authentication securely?
A: It relies on `databricks-cli` profiles to ensure secure access, with detailed configuration managed in `config.yaml`.

Q: Can this be integrated into a multi-tenant environment?
Q: How does the server handle job failure during execution?
Q: What level of customization is possible with this server?
Q: Are there any compatibility issues with older versions of the SDKs?
By following this documentation, developers can leverage the power of MCP servers to build robust, scalable AI applications that integrate seamlessly with Databricks workspaces.