Robots.txt Configuration

Configuration options:

- Crawl delay: time between requests (0-30 seconds)
- Crawling directives: allow or disallow access to specific paths
- Block file types: keep crawlers away from specific file extensions
- Common directives: ready-made rules for frequent cases
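A file built from these options might look like the sketch below. The paths and the PDF rule are hypothetical placeholders; note that Crawl-delay is honored by crawlers such as Bingbot and Yandex but ignored by Googlebot, and the `*`/`$` wildcard syntax is an extension supported by major engines rather than part of the original REP.

```
User-agent: *
# Wait 10 seconds between requests (ignored by Googlebot)
Crawl-delay: 10
# Hypothetical paths
Allow: /public/
Disallow: /admin/
# Hypothetical file-type block using wildcard extension syntax
Disallow: /*.pdf$
```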
About Robots.txt
What is Robots.txt?
A robots.txt file is a plain text file that tells search engine crawlers which pages or sections of your site they may or may not request. It's part of the Robots Exclusion Protocol (REP), a standard used by websites to communicate with web crawlers and other web robots.
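A minimal example: the two rules below (with a hypothetical path) tell every crawler to stay out of one directory and leave the rest of the site crawlable.

```
# Applies to all crawlers
User-agent: *
# Hypothetical private directory
Disallow: /private/
```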
Benefits
- Control how search engines crawl your site
- Keep crawlers out of private areas (pair with noindex or authentication to reliably keep pages out of the index)
- Optimize crawl budget by steering bots away from low-value URLs (see the sketch below)
- Signal which content should not be crawled (robots.txt is advisory, not a security control)
- Improve SEO efficiency
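As one way to spend crawl budget wisely, the hypothetical rules below keep crawlers off internal search results and session-tagged URLs; the wildcard pattern relies on the extension syntax major engines support.

```
User-agent: *
# Hypothetical low-value URL patterns
Disallow: /search?
Disallow: /*?sessionid=
```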
Best Practices
- Place robots.txt at the root of your domain (crawlers only request /robots.txt)
- Use specific user-agent directives when needed (see the example below)
- Test thoroughly before deployment
- Keep it simple and clear
- Update it when your site structure changes
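Putting these together, a root-level file with per-crawler rules might look like this sketch. Googlebot is a real crawler token, but the paths and sitemap URL are hypothetical placeholders.

```
# Rules for Google's main crawler
User-agent: Googlebot
Disallow: /staging/

# Rules for every other crawler
User-agent: *
Disallow: /admin/

# Optional: point crawlers to your sitemap (absolute URL)
Sitemap: https://example.com/sitemap.xml
```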