MCP Servers scraper

This project scrapes https://mcpservers.org/all?sort=newest&page=1, paginates through the full listing, follows each server detail page, and writes one JSON object per server to output/mcpservers.jsonl.

Each JSONL record includes:

  • title
  • summary
  • github_url
  • repo_title
  • content_text
  • content_html
  • _source_url
  • _listing_url
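
As a rough illustration of how the output might be consumed, the sketch below reads the JSONL file line by line and reports which of the fields listed above are absent from a record. The helper names `iter_records` and `missing_fields` are hypothetical and not part of this project. The sketch assumes only that each non-empty line is one JSON object.

```python
import json
from pathlib import Path
from typing import Iterator

# Field names taken from the record list above.
FIELDS = (
    "title",
    "summary",
    "github_url",
    "repo_title",
    "content_text",
    "content_html",
    "_source_url",
    "_listing_url",
)


def iter_records(path) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL file (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def missing_fields(record: dict) -> list[str]:
    """Return the expected field names that are not present in the record."""
    return [name for name in FIELDS if name not in record]


if __name__ == "__main__":
    # Assumes the scraper has already written its output to this path.
    for record in iter_records("output/mcpservers.jsonl"):
        gaps = missing_fields(record)
        if gaps:
            print(record.get("_source_url"), "is missing", gaps)
```

Reading one line at a time keeps memory use flat however large the listing grows.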

Run

uv sync
uv run mcpservers-scraper

You can also run it directly with:

uv run main.py

The crawler uses silkworm with:

  • pagination from the Next button on the /all listing
  • detail-page follows for every /servers/... link
  • JSONL output in output/mcpservers.jsonl
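
The link-handling logic described above can be sketched without silkworm, using only the Python standard library. This is not the project's actual spider and does not reproduce silkworm's API. The class name `ListingLinks` is hypothetical, and the sketch assumes the Next button is an `<a>` element whose visible text is "Next", which may not match the real page markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListingLinks(HTMLParser):
    """Collect /servers/... detail links and the Next link from a listing page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.detail_urls = []   # absolute URLs of detail pages, in page order
        self.next_url = None    # absolute URL of the next listing page, if any
        self._href = None       # href of the <a> currently being parsed
        self._text = []         # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        url = urljoin(self.base_url, self._href)
        label = "".join(self._text).strip().lower()
        if self._href.startswith("/servers/"):
            # Skip duplicates so each detail page is followed once.
            if url not in self.detail_urls:
                self.detail_urls.append(url)
        elif label == "next":
            # Assumption: the pagination control is labelled "Next".
            self.next_url = url
        self._href = None
```

A crawl loop built on this would fetch a listing page, feed its HTML to the parser, queue every entry in `detail_urls`, and repeat with `next_url` until it is `None`.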