Robots.txt

A robots.txt file tells crawlers which paths they may access (e.g., Disallow: /private) and may suggest a minimum pause between requests (Crawl-delay). Ethical scraping means fetching and honoring this file before downloading any pages. Note that Crawl-delay applies to your crawler as a whole, not to each source IP, so rotating addresses through a proxy service such as Proxied does not lift the limit; when access is allowed, throttle your requests to the stated delay while gathering the permitted data.
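As a minimal sketch of the check described above, Python's standard-library `urllib.robotparser` can parse a robots.txt file and answer both questions: is a path allowed, and what delay is requested. The rules and the `MyCrawler` user-agent string below are hypothetical examples, and the file is parsed from a string here for illustration (in practice you would use `set_url(...)` and `read()` to fetch the live file).

```python
import urllib.robotparser

# A sample robots.txt, parsed offline for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private
Crawl-delay: 5
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

agent = "MyCrawler"  # hypothetical user-agent string

# Check permissions before downloading.
allowed_private = rp.can_fetch(agent, "https://example.com/private/report.html")
allowed_public = rp.can_fetch(agent, "https://example.com/index.html")

# Minimum seconds to wait between requests, or None if unspecified.
delay = rp.crawl_delay(agent)

print(allowed_private, allowed_public, delay)  # → False True 5
```

A polite crawler would sleep for `delay` seconds between requests to this host, regardless of how many outbound IPs it uses.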