What robots.txt does
It gives crawl instructions to bots that choose to honor it; compliance is voluntary.
It can disallow crawling of certain paths, or re-allow more specific paths inside a broader disallowed area.
What it does not do
It does not guarantee a page stays out of search results; a blocked URL can still be indexed if other sites link to it.
It does not securely protect private content: the file itself is publicly readable, and any client can ignore it.
Common directives
User-agent identifies which crawler the rules that follow apply to.
Disallow blocks crawling for matching paths.
Allow can re-open a specific path within a broader Disallow rule.
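The three directives combine like this in a minimal file (the paths are hypothetical, for illustration only). Placing Allow before Disallow keeps the file unambiguous for parsers that match rules in order:

```
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
```

Here everything under /private/ is blocked for all crawlers, except the single page opened by the Allow line.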
Common mistakes
- Blocking important pages accidentally
- Thinking robots.txt is a security layer
- Forgetting to add or maintain the Sitemap reference
Good practice
Keep the file simple and intentional.
Test important pages after any crawl rule changes.
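One way to test rules before deploying a change is Python's standard-library urllib.robotparser. A minimal sketch, assuming the hypothetical rules and URLs below:

```python
import urllib.robotparser

# Hypothetical rules; replace with the contents of your own robots.txt.
rules = """
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Check whether a given user agent may fetch a given URL.
print(parser.can_fetch("*", "https://example.com/private/secret.html"))       # blocked
print(parser.can_fetch("*", "https://example.com/private/public-page.html"))  # allowed
print(parser.can_fetch("*", "https://example.com/blog/post.html"))            # allowed
```

For a live site, parser.set_url("https://example.com/robots.txt") followed by parser.read() fetches the file directly instead of parsing a string.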
FAQ
What is robots.txt?
It is a text file that gives crawl instructions to search engine bots.
Can robots.txt hide private pages securely?
No. It is not a security system; the file is publicly readable and bots can simply ignore it. Use authentication or server-side access controls for anything truly private.
Should I add my sitemap to robots.txt?
It is a common good practice and can help discovery.
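A Sitemap reference is a single line with an absolute URL, placed anywhere in the file (the URL here is hypothetical):

```
Sitemap: https://example.com/sitemap.xml
```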