Robots.txt

Robots.txt is a plain text file located in the root directory of a website, used to instruct web robots (typically search engine crawlers) which pages or sections of the site should not be crawled.

It serves as a request to the robots not to crawl certain parts of a website, helping site owners manage which content well-behaved crawlers access.

Example of a simple robots.txt file

User-agent: *
Disallow: /private/
Disallow: /temp/

In this example, the User-agent: * line indicates that the rules apply to all web robots. The Disallow lines specify the folders or pages that the robots should not crawl.
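
Rules can also be scoped to a specific crawler, carve out an exception inside a blocked folder, or point robots at a sitemap. Allow and Sitemap are widely supported extensions to the original rules; in this hypothetical example the paths and sitemap URL are placeholders:

User-agent: Googlebot
Disallow: /temp/
Allow: /temp/public-report.html

User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml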

How to create a robots.txt file

  1. Create a plain text file using a text editor such as Notepad or TextEdit.
  2. Add the necessary rules (User-agent and Disallow lines) to the file.
  3. Save the file as “robots.txt” (lowercase, without the quotes).
  4. Upload the file to the root directory of your website so it is reachable at https://yourdomain.com/robots.txt; you can verify the rules with a short script like the one below.
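
Python's standard library includes a robots.txt parser, so one way to sanity-check your rules is a script like this minimal sketch; the rules are the example file from earlier, and the URLs are placeholders:

from urllib.robotparser import RobotFileParser

# The example rules from above, parsed locally (no live site needed).
rules = """\
User-agent: *
Disallow: /private/
Disallow: /temp/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch(user_agent, url) reports whether a compliant robot may crawl the URL.
print(parser.can_fetch("*", "https://example.com/private/notes.html"))  # False
print(parser.can_fetch("*", "https://example.com/index.html"))          # True

To check the deployed file instead, replace parse(...) with set_url("https://yourdomain.com/robots.txt") followed by read().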

Things to consider

Be careful when using the Disallow directive, as it can unintentionally block important pages or sections of your site from being crawled. Also keep in mind that robots.txt controls crawling, not indexing: a disallowed URL can still end up in search results if other sites link to it, so use a noindex meta tag or authentication for pages that must stay out of search engines.
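
Disallow rules match by path prefix, so a rule that is too short can block more than intended. A hypothetical example:

User-agent: *
# Meant to block only /temp/, but without the trailing slash this
# rule also blocks /template.html and /temperature-guide/.
Disallow: /temp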

Remember that the robots.txt file is publicly accessible to anyone who requests it, so don’t include sensitive information in it.
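
In fact, a Disallow rule can itself leak information: a hypothetical entry like the one below tells every visitor exactly where the sensitive area lives. Protect such pages with authentication instead.

User-agent: *
Disallow: /internal-admin/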

Not all search engines interpret the robots.txt file the same way, and compliance is voluntary: reputable crawlers honor the rules, while malicious bots such as scrapers and spam harvesters often ignore the file entirely. Treat robots.txt as guidance for well-behaved robots, not as access control.
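
One concrete difference is wildcard support. Major crawlers such as Googlebot and Bingbot understand * (match any characters) and $ (end of URL) in Disallow paths, but simpler parsers that only implement the original convention may treat those characters as literal text. A hypothetical rule that depends on this extension:

User-agent: *
Disallow: /*.pdf$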