Advanced web search and content extraction tool with Python leveraging Firecrawl API for powerful scraping and crawling
WebSearch is an advanced web search and content extraction tool built with Python, leveraging the Firecrawl API for sophisticated web scraping, searching, and content analysis capabilities. As an MCP server, it aims to enhance AI applications like Claude Desktop by providing a standardized interface that enables seamless integration. This documentation will guide you through the setup, usage, and optimization of WebSearch as part of your AI workflow.
WebSearch MCP Server integrates deeply with various AI frameworks, including Claude Desktop, Continue, Cursor, and more through MCP v1.0 compliance. It supports advanced web search functionalities, content extraction from web pages using natural language prompts, web crawling for discovery purposes, and web scraping with support for various output formats.
Imagine an AI assistant like Claude Desktop needs to research a topic quickly and effectively. With WebSearch configured as an MCP server, Claude Desktop (acting as the MCP client) can use the server's search capabilities. For instance, when Claude receives the query "best practices for machine learning," it could call the WebSearch server through MCP, which would run a web search and return the most relevant results.
Consider an AI writer that needs to extract specific data points from multiple web pages. The MCP client can make requests to the WebSearch server, providing URLs and natural language instructions: "Extract information on recent advancements in autonomous vehicle technology." WebSearch would then analyze these sites, extract relevant details, format them as per the request, and return the structured results.
The WebSearch MCP server architecture follows a client-server model where the AI application (MCP client) communicates with the server using standardized messages. The following diagram illustrates the protocol flow:
graph TD
A[AI Application] -->|MCP Client| B[MCP Protocol]
B --> C[MCP Server]
C --> D[Data Source/Tool]
style A fill:#e1f5fe
style C fill:#f3e5f5
style D fill:#e8f5e8
The MCP protocol ensures secure and efficient data transmission between the client and server, adhering to version 1.0 standards as defined by the MCP community.
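To make the "standardized messages" concrete, here is a minimal sketch of how a client might frame a tool invocation. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The tool name `search` and its `query` and `limit` arguments are hypothetical placeholders, not names confirmed by this project; check the server's tool listing for the real ones.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to requests.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, used only for illustration.
request = build_tool_call(
    "search",
    {"query": "best practices for machine learning", "limit": 5},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop builds and sends these messages for you; the sketch only shows the shape of what travels between client and server.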
To get started with WebSearch MCP Server, follow these steps:
Install the uv package manager:
# On Windows (using pip)
pip install uv
# On Unix/MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh
# Add to PATH (Unix/MacOS)
export PATH="$HOME/.local/bin:$PATH"
# Add to PATH (Windows - add to Environment Variables)
# Add: %USERPROFILE%\.local\bin
Clone the repository:
git clone https://github.com/yourusername/websearch.git
cd websearch
Create and activate a virtual environment with uv:
# Create virtual environment
uv venv
# Activate on Windows
.\.venv\Scripts\activate.ps1
# Activate on Unix/MacOS
source .venv/bin/activate
Install dependencies with uv:
# Install dependencies (uv sync reads pyproject.toml and uv.lock)
uv sync
Set up environment variables:
touch .env
# Add your API keys
FIRECRAWL_API_KEY=your_firecrawl_api_key
OPENAI_API_KEY=your_openai_api_key
TAVILY_API_KEY=your_tavily_api_key
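Before starting the server, it can help to confirm that the keys above are actually present. The following is a minimal sketch using only the standard library; `parse_env_file` and `missing_keys` are hypothetical helpers written for this example, not part of WebSearch, and the list of expected keys is taken from the step above. Real projects often use the python-dotenv package instead of parsing the file by hand.

```python
from pathlib import Path

# Key names copied from the setup step above; which ones are strictly
# required by the server is an assumption here.
EXPECTED_KEYS = ("FIRECRAWL_API_KEY", "OPENAI_API_KEY", "TAVILY_API_KEY")


def parse_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate stray spaces and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, expected=EXPECTED_KEYS) -> list:
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not values.get(key)]


if __name__ == "__main__":
    missing = missing_keys(parse_env_file(".env"))
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All expected keys are set.")
```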
WebSearch MCP Server excels in several key use cases that benefit from its rich set of features:
WebSearch MCP Server is compatible with several popular AI clients like Claude Desktop, Continue, Cursor, etc., as seen in the following compatibility matrix:
MCP Client | Resources | Tools | Prompts | Status |
---|---|---|---|---|
Claude Desktop | ✅ | ✅ | ✅ | Full Support |
Continue | ✅ | ✅ | ✅ | Full Support |
Cursor | ❌ (No support) | ✅ | ❌ (No support) | Tools Only |
WebSearch MCP Server is designed to work with a range of MCP-compatible tools and platforms. The clients it specifically supports are those listed in the compatibility matrix above.
The server architecture ensures that these clients can send requests, receive responses, and process data efficiently.
The .env file is crucial for configuring WebSearch MCP Server:
# OpenAI API key - Required for AI-powered features
OPENAI_API_KEY=your_openai_api_key_here
# Firecrawl API key - Required for web scraping and searching
FIRECRAWL_API_KEY=your_firecrawl_api_key_here
Ensure all required keys are correctly set in the .env file.
Here’s a sample configuration snippet to get you started:
{
  "mcpServers": {
    "websearch": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/WebSearch",
        "run",
        "main.py"
      ],
      "env": {
        "FIRECRAWL_API_KEY": "your_firecrawl_api_key",
        "OPENAI_API_KEY": "your_openai_api_key"
      }
    }
  }
}
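If you prefer to generate this entry rather than edit JSON by hand, the snippet above can be produced with a small script. This is a sketch only: `build_server_entry` and `merge_into_config` are hypothetical helper names, and the path and key values are the same placeholders used in the sample configuration.

```python
import json


def build_server_entry(project_dir: str, env: dict) -> dict:
    """Mirror the sample entry: run main.py through uv in project_dir."""
    return {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "main.py"],
        "env": dict(env),
    }


def merge_into_config(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with the entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Placeholder values copied from the sample configuration.
entry = build_server_entry(
    "/path/to/WebSearch",
    {
        "FIRECRAWL_API_KEY": "your_firecrawl_api_key",
        "OPENAI_API_KEY": "your_openai_api_key",
    },
)
config = merge_into_config({}, "websearch", entry)
print(json.dumps(config, indent=2))
```

Merging into a copy rather than overwriting keeps any other servers already defined in the client's configuration.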
A1: Check the compatibility matrix provided in this documentation to ensure your specific AI application or tools are listed as fully supported.
A2: Yes, WebSearch is designed for efficient and scalable web crawling. You can set maximum depth and limit parameters to control crawl size.
A3: Review the troubleshooting section in the README; ensure all API keys are correctly configured and that environment variables are loaded properly.
A4: No, Firecrawl is essential for web scraping and searching functionalities. An OpenAI or Tavily API key is optional but enhances additional features like prompt-based extraction.
A5: WebSearch adheres to secure protocol standards to protect client-server communication. Ensure all keys and secrets are stored securely.
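To illustrate what the depth and limit parameters mentioned in A2 control, here is a minimal sketch of a bounded breadth-first crawl over an in-memory link graph. It is not WebSearch's or Firecrawl's implementation; `bounded_crawl`, `max_depth`, `limit`, and the `SITE` data are hypothetical names chosen for this example.

```python
from collections import deque


def bounded_crawl(start, get_links, max_depth=2, limit=10):
    """Visit pages breadth-first, stopping at max_depth hops or limit pages."""
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < limit:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth cap
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# A toy site: each page maps to the pages it links to.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/setup", "/docs/api"],
    "/blog": ["/blog/post-1"],
    "/docs/setup": ["/docs/setup/windows"],
}

shallow = bounded_crawl("/", lambda u: SITE.get(u, []), max_depth=1)
capped = bounded_crawl("/", lambda u: SITE.get(u, []), max_depth=2, limit=4)
print(shallow)  # ['/', '/docs', '/blog']
print(capped)   # ['/', '/docs', '/blog', '/docs/setup']
```

The depth cap bounds how far the crawl wanders from the start page, while the limit bounds total work regardless of how densely the site is linked.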
By integrating WebSearch as an MCP server, you can significantly enhance the capabilities of your AI applications, making them more powerful and versatile in handling complex data extraction and analysis tasks.