A robots.txt file gives instructions to web crawlers and search engine robots about which pages of your website they may crawl.
After you upload and test your robots.txt file, Google’s crawlers will discover and fetch it automatically. You don’t need to do anything else.
If you’ve made updates to your robots.txt file and want to promptly refresh Google’s cached version, you can request a recrawl in the robots.txt report inside Google Search Console.
In this blog post, learn how to submit your robots.txt file to Google Search Console with a step-by-step guide.
How to Submit Your Robots.txt File to Google Search Console
1. Create and review your robots.txt file
To create a robots.txt file, you can use a text editor or an online tool like Free Robots.txt Generator.
In the file, use the appropriate syntax to define rules for web crawlers, specifying which areas of your website they are allowed or disallowed to access. You can also include directives for sitemaps, crawl delay, and other instructions. Once you have created your robots.txt file, save it as “robots.txt” (no quotes) without any extension.
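For example, a basic robots.txt file might look like the sketch below; the paths and the sitemap URL are placeholders you would swap for your own:

```
# Allow all crawlers, but keep them out of the admin and cart areas
User-agent: *
Disallow: /admin/
Disallow: /cart/

# Some crawlers honor Crawl-delay; Google ignores this directive
Crawl-delay: 10

# Point crawlers to your sitemap (must be an absolute URL)
Sitemap: https://example.com/sitemap.xml
```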
Review your robots.txt file to make sure you are not accidentally blocking crawlers from essential pages on your site. Write the rules so that crawlers can navigate your website and reach the important pages. Avoid disallow rules for pages that you want crawled and indexed.
2. Upload your robots.txt file
Once the file is created and reviewed, the next step is to upload it to the root of your site. You can do this by connecting to your server via FTP or using a file manager provided by your hosting provider.
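If you prefer to script the upload, here’s a minimal sketch using Python’s built-in ftplib. The host, credentials, and web-root directory are placeholders for your own server’s details, and many hosts require FTPS (ftplib.FTP_TLS) instead of plain FTP:

```python
from ftplib import FTP  # use ftplib.FTP_TLS if your host requires FTPS

HOST = "ftp.example.com"   # placeholder: your hosting provider's FTP host
USER = "your-username"     # placeholder credentials
PASSWORD = "your-password"

with FTP(HOST) as ftp:
    ftp.login(USER, PASSWORD)
    # Change into the web root so the file is served at https://example.com/robots.txt
    ftp.cwd("/public_html")  # placeholder: your server's web-root directory
    with open("robots.txt", "rb") as f:
        ftp.storbinary("STOR robots.txt", f)
```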
If you are using a Content Management System (CMS) like WordPress, you can access the root directory of your site through the CMS’s built-in file manager, or install a plugin that manages the robots.txt file for you.
According to RFC 9309, the robots.txt file must sit at the root of each protocol and host combination of your site. To determine the correct URL, cut off everything after the host (and optional port) in any URL on your site and append “/robots.txt”.
If you have subdomains, make sure each one has its own robots.txt file, as Google treats every subdomain as a separate host.
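As a rough illustration of that rule, here is a small Python sketch (standard library only; the URLs are placeholders) that derives the robots.txt URL for any page, including pages on subdomains:

```python
from urllib.parse import urlsplit

def robots_url(page_url: str) -> str:
    """Strip everything after the host (and optional port), then append /robots.txt."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"

# Each protocol + host combination gets its own file:
print(robots_url("https://example.com/blog/post?id=1"))  # https://example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))       # https://shop.example.com/robots.txt
print(robots_url("http://example.com:8080/page"))        # http://example.com:8080/robots.txt
```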
3. Request a recrawl or wait patiently
After you upload your robots.txt file, Google’s crawlers will discover and fetch it automatically. Just sit back and wait patiently.
If you’ve updated the file and want Google’s cached version refreshed promptly, request a recrawl in the robots.txt report instead.
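Before requesting a recrawl, it can be worth confirming that the updated file is actually being served. A quick way to do that is a one-off fetch like this Python sketch (example.com is a placeholder for your domain):

```python
from urllib.request import urlopen

# Fetch the live file and print it, so you can confirm the new rules are served
with urlopen("https://example.com/robots.txt") as response:
    print(response.status)                  # expect 200
    print(response.read().decode("utf-8"))  # should match your updated rules
```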
4. Test and monitor your robots.txt file
Changes in your site’s structure or content may require updates to the robots.txt file. Keep track of any changes and regularly review and update your file to avoid any potential issues with search engine crawling and indexing.
As Google has discontinued its robots.txt testing tool, you can use a third-party tool, such as Logeix’s Robots.txt Testing Tool, to check whether you have properly allowed or disallowed pages on your site. Alternatively, run a “Live Test” with Search Console’s URL Inspection tool.
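If you’d rather script a quick check, Python’s built-in urllib.robotparser can evaluate your live rules locally; note that it implements the original robots.txt convention and may differ from Google’s parser in some edge cases (the URLs below are placeholders):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")  # placeholder: your live file
rp.read()

# Check whether a given user agent may fetch a given URL under your rules
print(rp.can_fetch("Googlebot", "https://example.com/blog/my-post"))  # expect True
print(rp.can_fetch("Googlebot", "https://example.com/admin/"))        # expect False if /admin/ is disallowed
```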
If there are any problems detected, make the necessary adjustments and reupload the file. Keep in mind that it may take some time for changes to reflect in search engines, so be patient and monitor your site’s performance regularly.
To see how Google fetches your robots.txt file, you can now use the Robots.txt report in Google Search Console. Follow these steps to see the report:
- Open Google Search Console
- Click “Settings” in the left sidebar
- Navigate to Crawling > Robots.txt
- Select a file in the report to see the directives of that robots.txt file
Final Words
For more information on robots.txt and best practices, refer to the Google Developer documentation. Remember that a well-managed robots.txt file plays a crucial role in your site’s SEO and overall performance.
Regularly review and update your file, and use the available tools to ensure that search engines can properly crawl and index your site.