generated from best-of-lists/best-of
Open
Labels
add-project (Add new project to best-of list)
Description
Project details:
- Project Name: Crawl4AI
- GitHub URL: https://github.com/unclecode/crawl4ai
- Category: Web Crawling & Scraping
- License: Apache-2.0
- Package Managers: pypi:crawl4ai dockerhub:unclecode/crawl4ai
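
For whoever picks this up, the details above should map fairly directly onto a `projects.yaml` entry. Below is a minimal Python sketch of that mapping, assuming the list uses best-of-generator style keys (`github_id`, `pypi_id`, `dockerhub_id`). The helper names `parse_package_managers` and `build_entry` are hypothetical, and the `category` value is a placeholder that would need to be swapped for whatever category id the list actually defines.

```python
def parse_package_managers(field: str) -> dict[str, str]:
    """Split a 'manager:package' list into '<manager>_id' keys.

    Assumes whitespace-separated tokens, as in the issue template above.
    """
    ids: dict[str, str] = {}
    for token in field.split():
        manager, _, package = token.partition(":")
        if not package:
            raise ValueError(f"expected 'manager:package', got {token!r}")
        ids[f"{manager}_id"] = package
    return ids


def build_entry(name: str, github_url: str, category: str, package_managers: str) -> dict[str, str]:
    """Assemble a projects.yaml-style entry from the issue fields (hypothetical helper)."""
    # Reduce the full URL to the 'owner/repo' form (requires Python 3.9+ for removeprefix).
    github_id = github_url.rstrip("/").removeprefix("https://github.com/")
    entry = {"name": name, "github_id": github_id, "category": category}
    entry.update(parse_package_managers(package_managers))
    return entry


entry = build_entry(
    name="Crawl4AI",
    github_url="https://github.com/unclecode/crawl4ai",
    category="Web Crawling & Scraping",  # placeholder: replace with the list's category id
    package_managers="pypi:crawl4ai dockerhub:unclecode/crawl4ai",
)

# Print the entry as simple 'key: value' lines for review.
for key, value in entry.items():
    print(f"{key}: {value}")
```

The printed lines are only meant for a quick review before the entry is added by hand.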
Additional context:
Crawl4AI is an open-source, AI-friendly web crawler designed to extract clean Markdown or structured data for use in RAG pipelines, LLM agents, or custom automation. It supports:
- Automatic crawl-depth detection
- Stealth crawling via Playwright
- Proxy rotation and headless browser control
- Output in Markdown, JSON, or HTML
- A simple CLI and Python SDK
It is actively developed, well documented, and offers performance advantages over alternatives like Firecrawl. It is well suited to scraping AI training data or building agent-ready corpora.