Your AI Needs Eyes on the Web
AI models are incredibly smart, but they have a blind spot. They only know what they were trained on, which means anything that happened after their training cutoff is invisible to them. Ask about last week's news, a GitHub issue from this morning, or the latest package release, and they'll either shrug or make something up.
That's where the Model Context Protocol (MCP) comes in. Think of MCP as a way to give your AI a toolbox. Each tool is a small program the AI can run when it needs fresh information. There might be a calculator tool for math, a file system tool for reading your code, and, most importantly, a search tool for looking things up on the web.
The Web Search MCP server is one of those toolboxes. It bundles search engines, social platforms, academic databases, and web page fetchers into a single server that any MCP-compatible AI client can use. Instead of guessing, your AI can go find the answer.
What the Model Context Protocol Actually Is
MCP is a standard way for AI applications to talk to tools and data sources. If you've used Claude Code, Cursor, or any AI coding assistant that can run commands or read files, you've already seen this idea in action - just through proprietary implementations.
MCP makes it universal. Any MCP client can connect to any MCP server. Your AI doesn't care whether the server is written in Python, TypeScript, or Go. As long as it speaks MCP, the tools just work.
The Web Search MCP server is written in Python using the FastMCP library, which handles all the protocol plumbing so the author could focus on the actual search logic.
A Tour of the Toolbox
The server groups its tools by what they're good at. Here's what you get out of the box.
General Web Search
Two search engines, one interface. DuckDuckGo for fast, free lookups. Exa as a fallback or primary provider when you need semantic search. You can scope results to a specific domain, filter by date, switch to news mode, or target a geographic region.
# Broad search
search_web(query="uv package manager")
# Targeted docs search
search_web(query="useEffect cleanup", domain="react.dev")
# News with region
search_web(query="elections", search_type="news", region="us-en")
# Last week only
search_web(query="Python 3.13 features", time_range="w")
Page Fetching
Search results give you snippets. Sometimes you need the full article. The fetch_page tool extracts clean text from any URL, stripping away ads, navigation, and JavaScript cruft. It supports a dozen output formats including markdown, JSON, and plain text.
fetch_page(url="https://docs.python.org/3/library/os.html")
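How fetch_page performs its extraction is not described here. As a rough illustration of the idea, this stdlib-only sketch drops script, style, and navigation content and keeps the visible text. TextExtractor, extract_text, and the SKIP_TAGS set are hypothetical names; a production extractor would likely rely on a dedicated library and handle far more edge cases.

```python
from html.parser import HTMLParser
from typing import List

# Tags whose contents are treated as page chrome (an assumed, minimal list).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collect visible text while ignoring anything inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep text only when we are not inside a skipped element.
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling extract_text on a page with a nav bar and an inline script returns only the paragraph text, which is the part an AI client actually needs to read.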
Social and Community Intelligence
This is where the server really shines. It doesn't just search the open web - it taps into the platforms where real people discuss technology.
Reddit searches community discussions and returns top comments with scores. This is great for finding real user opinions, troubleshooting threads, and product recommendations.
Hacker News gives you access to technical discourse and startup discussions. The results include top comments and pre-extracted insights, so the AI can quickly understand the consensus without reading every reply.
GitHub lets the AI search issues and pull requests. It returns state labels, reaction counts, and top comments. This is invaluable for tracking bugs, understanding feature requests, or checking whether a breaking change has been discussed upstream.
X/Twitter provides real-time signal from breaking news and expert threads. It requires session cookies from a logged-in account, so it's not as plug-and-play as the others, but the real-time coverage is unmatched.
# Reddit
search_reddit(query="Best mechanical keyboards 2024",
              subreddits=["MechanicalKeyboards"])
# Hacker News
search_hackernews(query="MCP server architecture")
# GitHub
search_github(query="uv package manager")
# X/Twitter
search_x(query="Llama 4 release")
Academic and Reference
The arXiv tool searches academic papers and accepts Lucene-style field prefixes, so you can search by author, category, title, or abstract.
The Wikipedia tool pulls factual summaries for background research.
search_arxiv(query="transformer attention", max_results=10)
search_wikipedia(query="Model Context Protocol")
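To make the field prefixes concrete, here is a small helper that assembles a prefixed query string. The prefixes ti, au, abs, and cat are the ones arXiv's own search API uses; build_arxiv_query is a hypothetical helper, and it is an assumption that search_arxiv passes the query string through to arXiv unchanged.

```python
from typing import Optional


def build_arxiv_query(title: Optional[str] = None,
                      author: Optional[str] = None,
                      abstract: Optional[str] = None,
                      category: Optional[str] = None) -> str:
    """Combine field-prefixed terms with AND, quoting multi-word phrases."""
    fields = [("ti", title), ("au", author), ("abs", abstract), ("cat", category)]
    parts = []
    for prefix, value in fields:
        if value:
            # Quote phrases so the whole phrase binds to the prefix.
            term = f'"{value}"' if " " in value else value
            parts.append(f"{prefix}:{term}")
    return " AND ".join(parts)
```

The resulting string could then be passed as the query argument, for example search_arxiv(query=build_arxiv_query(title="transformer attention", category="cs.CL")).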
How It Actually Works Under the Hood
The server follows a clean architecture. A central server.py file creates a FastMCP instance and registers each tool with the @mcp.tool decorator. The tool's docstring becomes its description, which the AI reads to understand when to use it.
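The registration pattern described above can be sketched without FastMCP itself. What follows is a stdlib-only stand-in, not the server's code and not the FastMCP API: ToolRegistry and its tool method are hypothetical names that mimic how a decorator can capture a function and reuse its docstring as the description the AI reads.

```python
import inspect
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for an MCP server object (not the FastMCP API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable] = {}
        self.descriptions: Dict[str, str] = {}

    def tool(self, func: Callable) -> Callable:
        # Register the function under its own name and keep the cleaned-up
        # docstring as the description shown to the client.
        self.tools[func.__name__] = func
        self.descriptions[func.__name__] = inspect.getdoc(func) or ""
        return func


registry = ToolRegistry("web-search")


@registry.tool
def search_wikipedia(query: str) -> str:
    """Return a short factual summary for background research."""
    # Placeholder body: a real tool would call the Wikipedia API here.
    return f"summary for {query!r}"
```

The point of the pattern is that a well-written docstring does double duty: it documents the function for developers and tells the model when the tool is worth calling.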
The actual logic lives in separate modules:
- search/ handles DuckDuckGo and Exa integration
- social/ handles Reddit, Hacker News, GitHub, and X
- tools/ handles arXiv and Wikipedia
- _http/ provides a centralized HTTP client with consistent timeouts and error handling
- _config/ manages rate limits and environment variables
- _models/ defines Pydantic models for type-safe requests and responses
The X/Twitter integration is the most interesting piece. Instead of using the official API (whose paid tiers can be expensive), the server vendors a stripped-down version of the Bird CLI, a third-party tool that reverse-engineers Twitter's internal GraphQL API. This is why it needs browser session cookies rather than an API key.
The server also implements a rate limiter to prevent abuse. Each social tool has three depth tiers - quick, default, and deep - that control how many results and comments to fetch. This lets the AI request a quick skim or a deep dive without overwhelming the source platforms.
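The two ideas in the previous paragraph, depth tiers and rate limiting, can be sketched as follows. This is an illustration under assumptions rather than the server's implementation: the quick and deep result counts come from this article, the default count of 20 is invented for the example, and SlidingWindowLimiter is a hypothetical class using a sliding-window strategy that the real limiter may not share.

```python
import time
from collections import deque
from typing import Callable, Deque

# quick and deep follow the article; the default value is an assumption.
DEPTH_LIMITS = {"quick": 5, "default": 20, "deep": 60}


def results_for_depth(depth: str) -> int:
    """Map a depth tier name to a maximum result count."""
    if depth not in DEPTH_LIMITS:
        raise ValueError(f"unknown depth: {depth!r}")
    return DEPTH_LIMITS[depth]


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls: int, period: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested
        self.calls: Deque[float] = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

A tool handler would call allow() before contacting a platform and use results_for_depth() to decide how much to fetch, so a quick skim costs the source far less than a deep dive.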
Setting It Up in Five Minutes
You need Python 3.11 or newer and the uv package manager.
uv tool install git+https://github.com/sydasif/web-search-mcp.git
Then add it to your MCP client configuration:
{
  "mcpServers": {
    "web-search": {
      "command": "web-search-mcp"
    }
  }
}
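That's it for basic usage. Most of the search tools work without any API keys or authentication. The exceptions are X/Twitter search, which needs session cookies, and Exa, which needs an API key.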
Optionally, you can set environment variables for additional features:
- GITHUB_TOKEN or the gh CLI for authenticated GitHub searches
- AUTH_TOKEN and CT0 (browser cookies from x.com) for X/Twitter search
- EXA_API_KEY for Exa as an alternative search provider
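If you want to check which optional features your environment would enable, a small helper like the one below can help. It only inspects the variable names listed above; optional_features is a hypothetical function, it does not detect the gh CLI, and the server's own configuration logic may differ.

```python
import os
from typing import Dict, Mapping


def optional_features(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Report which optional features have the credentials they need."""
    return {
        "github_auth": bool(env.get("GITHUB_TOKEN")),
        # X/Twitter search needs both cookies, not just one of them.
        "x_search": bool(env.get("AUTH_TOKEN")) and bool(env.get("CT0")),
        "exa": bool(env.get("EXA_API_KEY")),
    }
```

Passing a plain dictionary instead of the real environment makes the check easy to try out before you export any secrets.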
A Real Research Workflow in Action
The repository includes a research skill that shows how these tools chain together.
A deep research session goes through three phases:
Broad discovery starts with search_web to map the landscape, then uses fetch_page to read the most promising articles in full.
Community mining fans out across Reddit, Hacker News, GitHub, and X using the platform-specific tools. Each platform reveals a different angle - HN for technical critique, Reddit for user experiences, GitHub for development signals, X for real-time reaction.
Synthesis combines everything into a structured report with source citations, confidence levels, and identified gaps.
The skill uses a source weighting system: official docs are highest confidence, community discussions are medium, and social media is treated as real-time signal that needs cross-referencing.
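One way to picture the weighting system is as a lookup table plus a sort. In this sketch only the relative ordering comes from the article; the numeric weights, the source_type labels, and the function names are assumptions made for illustration.

```python
from typing import Dict, List

# The ordering follows the article; the numbers themselves are assumptions.
SOURCE_WEIGHTS: Dict[str, float] = {
    "official_docs": 1.0,
    "community": 0.6,
    "social": 0.3,
}


def rank_findings(findings: List[dict]) -> List[dict]:
    """Sort findings from highest to lowest source weight (stable sort)."""
    return sorted(findings,
                  key=lambda f: SOURCE_WEIGHTS[f["source_type"]],
                  reverse=True)


def needs_cross_reference(finding: dict) -> bool:
    """Social media counts as real-time signal that must be corroborated."""
    return finding["source_type"] == "social"
```

A synthesis step could use rank_findings to order citations in the report and needs_cross_reference to flag claims that rest on social media alone.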
Best Practices
Let the fallback work for you. The default provider is auto, which tries DuckDuckGo first and falls back to Exa if the results are empty. This handles most cases without configuration.
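The auto behavior boils down to a few lines of control flow. The sketch below uses stand-in provider functions instead of the real DuckDuckGo and Exa integrations, and search_with_fallback is a hypothetical name. It falls back only on empty results, as described above; whether the server also falls back on errors is not stated here.

```python
from typing import Callable, List

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(query: str, primary: SearchFn,
                         fallback: SearchFn) -> List[dict]:
    """Try the primary provider; use the fallback only if it returns nothing."""
    results = primary(query)
    if results:
        return results
    return fallback(query)


# Hypothetical stand-ins for the real provider integrations.
def fake_duckduckgo(query: str) -> List[dict]:
    return []  # simulate an empty result set


def fake_exa(query: str) -> List[dict]:
    return [{"title": f"Exa result for {query}", "url": "https://example.com"}]
```

With these stand-ins, search_with_fallback("uv", fake_duckduckgo, fake_exa) returns the Exa result because the primary provider came back empty.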
Match result depth to your need. Use quick depth for simple answers (5 results). Use deep for comprehensive research (up to 60 results). The rate limiter will thank you.
Use JSON output for programmatic use. The default markdown output is great for human reading, but if you're building a pipeline, set response_format="json" and work with the structured data.
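As a sketch of what pipeline code might look like, the example below parses a JSON response and pulls out the URLs. The payload shape is hypothetical: the results, title, and url field names are assumptions, so check the server's actual output before relying on them.

```python
import json
from typing import List

# Hypothetical payload: the real schema is defined by the server's models
# and may use different field names.
sample_response = json.dumps({
    "query": "uv package manager",
    "results": [
        {"title": "First result", "url": "https://example.com/one"},
        {"title": "Second result", "url": "https://example.com/two"},
    ],
})


def extract_urls(raw: str) -> List[str]:
    """Parse a JSON response string and return the result URLs in order."""
    data = json.loads(raw)
    return [item["url"] for item in data.get("results", [])]
```

Working from structured fields like this avoids scraping URLs back out of markdown meant for human readers.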
Cross-reference before trusting. No single platform gives you the full picture. If a claim appears on Reddit, Hacker News, and a blog post, that's stronger evidence than any one source.
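Cross-referencing can be made mechanical by counting how many distinct platforms mention a claim. This is a minimal sketch: the (claim, platform) pair format, the function names, and the threshold of two platforms are all assumptions, and matching claims by exact string is a simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def corroboration(mentions: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group (claim, platform) pairs into the set of platforms per claim."""
    platforms: Dict[str, Set[str]] = defaultdict(set)
    for claim, platform in mentions:
        platforms[claim].add(platform)
    return dict(platforms)


def well_supported(mentions: Iterable[Tuple[str, str]],
                   min_platforms: int = 2) -> List[str]:
    """Return claims seen on at least min_platforms distinct platforms."""
    grouped = corroboration(mentions)
    return sorted(c for c, p in grouped.items() if len(p) >= min_platforms)
```

Repeated mentions on the same platform do not raise the count, which matches the idea that independent sources matter more than volume.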
What You Can Build With This
The Web Search MCP server turns your AI from a static knowledge base into an active researcher. You can ask it to investigate a bug by searching GitHub issues, check community sentiment on Reddit, read the relevant documentation, and summarize everything - all in one conversation.
The best next step is to install it and try a multi-source search. Ask your AI to research a topic using search_web for background, search_hackernews for technical discussion, and search_github for current development status. You'll quickly see the difference between an AI that guesses and an AI that finds out.
Acknowledgment: This blog is inspired by publicly available materials, standards, and community research or similar work. See full configuration at https://github.com/sydasif/web-search-mcp/wiki.












