Robots.txt Generator
Free robots.txt generator. WordPress, WooCommerce, Shopify templates plus AI crawler blocks (GPTBot, ClaudeBot). Copy or download.
About the tool
What is the SBMM Robots.txt Generator?
The SBMM Robots.txt Generator is a free online tool that builds a production-ready robots.txt file from a guided wizard. Pick a stack template (Generic, WordPress, WooCommerce, or Shopify), toggle AI crawler blocks (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Bytespider, Applebot-Extended, Meta-ExternalAgent, CCBot, cohere-ai), paste your sitemap URL, add any custom Disallow rules, and download the finished robots.txt.
A correct robots.txt is the first file well-behaved search engine and AI crawlers request on your site. It is a crawl directive rather than a ranking signal, but a misconfigured rule can still wipe out organic traffic by blocking Googlebot from your money pages. A missing rule can let crawlers waste requests on admin paths or quietly opt your content into AI training datasets you wanted out of. Writing the file by hand in a blank text editor is unforgiving; this generator gives you safe defaults plus the precision to customise.
The output is plain text you can paste straight into your hosting control panel, copy into a Git repo, or download as a robots.txt file ready to upload to the root of your domain. Validate the generated file with the SBMM Robots.txt Analyzer before going live to confirm every check passes on your production setup.
Step by step
How to use this tool in 3 steps
-
Step 01
Pick a template that matches your stack
Choose Generic (any site), WordPress (with wp-admin and wp-includes covered), WooCommerce (cart, checkout, my-account excluded), or Shopify (collection / cart / checkout handled). The template drops in safe baseline rules for that platform.
-
Step 02
Toggle AI crawlers + custom rules
Switch the AI crawler block on or off depending on whether you want your content used for training (most publishers block it). Paste your sitemap URL so search engines find every page. Add any custom Disallow / Allow rules in the text area.
-
Step 03
Copy or download robots.txt
Copy the generated text to your clipboard, download it as robots.txt, and upload it to the root of your domain so it lives at your-site.com/robots.txt. Test the live file with the Robots.txt Analyzer to confirm every check passes.
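The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the generator's actual code: the function name build_robots_txt, the TEMPLATES table, the shortened AI_CRAWLERS list, and the example rules are all assumptions made for this sketch.

```python
# Hypothetical sketch: template contents and names are illustrative assumptions.
TEMPLATES = {
    "generic": [],
    "wordpress": ["Disallow: /wp-admin/", "Allow: /wp-admin/admin-ajax.php"],
}
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot"]


def build_robots_txt(template, block_ai=True, sitemap_url=None, custom_rules=()):
    """Assemble robots.txt text from a template, AI blocks, custom rules, and a sitemap."""
    lines = ["User-agent: *"]
    rules = TEMPLATES[template] + list(custom_rules)
    # An empty Disallow value means "nothing is blocked" for this group.
    lines += rules or ["Disallow:"]
    if block_ai:
        # One group per AI crawler, each blocked from the whole site.
        for bot in AI_CRAWLERS:
            lines += ["", f"User-agent: {bot}", "Disallow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    "wordpress",
    sitemap_url="https://your-site.com/sitemap.xml",
    custom_rules=["Disallow: /?s="],
)
print(robots_txt)
```

The printed text is what you would save as robots.txt at the root of the domain.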
Why this tool
Why use this tool
-
WordPress / WooCommerce / Shopify templates
Pre-built baseline rules per CMS so the generated file covers admin paths, cart and checkout, search result pages, and pagination patterns specific to your stack. Generic mode is also available for custom stacks.
-
AI crawler blocking on by default
One-click block for GPTBot, OAI-SearchBot, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, Bytespider, Applebot-Extended, Meta-ExternalAgent, CCBot, Omgilibot, FacebookBot, and cohere-ai. Toggle it off if you are happy for your content to be used for AI training.
-
Sitemap directive included
Paste your sitemap URL and the generator drops a Sitemap directive into the output so Google, Bing, and other crawlers discover your sitemap on the first robots.txt fetch with no extra signal required.
-
Custom rule injection
Paste any custom Disallow / Allow rules into the dedicated text area and the generator inserts them into the output in the correct User-agent group. Useful for blocking SEO research bots (Ahrefs, Semrush, Moz, Majestic) or specific paths.
-
One-click download
Copy the generated robots.txt to your clipboard or download it as a plain-text file. The output is ready to upload to the root of your domain with no further editing required.
-
Free, no cap, no sign-up
Unlimited use, no daily cap, no email gate, no Pro paywall. SBMM Pro adds saved templates for multi-site agencies, automated diff against the live robots.txt, and a scheduled regeneration mode for sites that change crawler policy quarterly.
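As a rough illustration of custom rule injection, the sketch below builds one User-agent group that blocks several SEO research bots and checks it with Python's standard urllib.robotparser. The helper block_group is hypothetical, and the tokens AhrefsBot, SemrushBot, DotBot (Moz), and MJ12bot (Majestic) are the commonly published ones; confirm them against each vendor's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens for Ahrefs, Semrush, Moz, and Majestic.
SEO_BOTS = ["AhrefsBot", "SemrushBot", "DotBot", "MJ12bot"]


def block_group(user_agents, paths=("/",)):
    """Return one rule group: several User-agent lines sharing the same Disallow rules."""
    lines = [f"User-agent: {ua}" for ua in user_agents]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines)


custom_rules = block_group(SEO_BOTS)

# Parse the group locally to confirm who is and is not affected.
parser = RobotFileParser()
parser.parse(custom_rules.splitlines())
ahrefs_allowed = parser.can_fetch("AhrefsBot", "https://your-site.com/pricing")
googlebot_allowed = parser.can_fetch("Googlebot", "https://your-site.com/pricing")
print(custom_rules)
```

Grouping the bots under shared rules keeps the file short; crawlers not named in the group, such as Googlebot, are unaffected by it.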
FAQ
Frequently asked questions
What is a robots.txt file?
Robots.txt is a plain-text file at the root of a website that tells search-engine and AI crawlers which paths they may fetch and which they may not. It is the first thing a well-behaved bot reads before crawling. Its rules control crawling, not indexing: a blocked URL can still appear in Google without a description if other pages link to it, so use a noindex directive on a crawlable page when you need it kept out of results entirely.
Do I need a robots.txt file?
For most sites, yes. Without one, all paths are crawlable by default, which means admin endpoints, search-result pages, and other low-value URLs waste crawl budget that should be spent on your money pages. Even a minimal robots.txt that just declares your sitemap is meaningfully better than nothing.
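A minimal file of that kind might look like the string below. The sketch parses it with Python's standard urllib.robotparser to show that it blocks nothing and still exposes the sitemap; the sitemap URL is a placeholder, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Smallest useful file: allow everything, declare the sitemap (placeholder URL).
MINIMAL_ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://your-site.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(MINIMAL_ROBOTS_TXT.splitlines())

sitemaps = parser.site_maps()  # list of declared sitemap URLs
everything_allowed = parser.can_fetch("Googlebot", "https://your-site.com/any/page")
print(sitemaps, everything_allowed)
```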
Should I block AI crawlers in robots.txt?
It depends on your content rights strategy. Blocking GPTBot, ClaudeBot, and Google-Extended prevents your content being used to train future AI models but does not affect live AI search citations from PerplexityBot or OAI-SearchBot. Most publishers block training crawlers but allow live search crawlers.
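One way to express "block training crawlers, allow live search crawlers" is sketched below and checked with Python's urllib.robotparser. The grouping and the final catch-all group are assumptions chosen for illustration, not the generator's exact output.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# Training crawlers: blocked site-wide
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
Disallow: /

# Live AI search crawlers: explicitly allowed
User-agent: PerplexityBot
User-agent: OAI-SearchBot
Allow: /

# Everyone else: nothing blocked
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://your-site.com/guides/pricing"
access = {
    bot: parser.can_fetch(bot, page)
    for bot in ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot", "Googlebot")
}
print(access)
```

Running a check like this before deploying makes it harder to block a search crawler by accident when you only meant to opt out of training.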
Where do I upload the generated robots.txt?
Upload it to the root of your domain so it lives at your-site.com/robots.txt. The path is fixed by the robots exclusion standard; serving it from any other location means crawlers will not find it. Most hosting control panels accept the upload through the same file manager you use for the rest of your site.
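Because the location is fixed, the robots.txt URL for any page can be derived from the page URL alone. The helper name robots_url below is hypothetical; the sketch also shows that each host, including a subdomain, has its own file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs the given page's scheme and host."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


main_site = robots_url("https://your-site.com/blog/post?utm_source=x")
shop_site = robots_url("https://shop.your-site.com/cart")
print(main_site)
print(shop_site)
```

If you run a shop or blog on a subdomain, generate and upload a separate file for it.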
How do I test the generated file before going live?
Use our Robots.txt Analyzer to fetch the live robots.txt and grade it against more than 22 best-practice checks. Run the analyzer once before deploying to confirm the rules behave as intended, and once after deploying to confirm production matches your local copy. To verify which AI bots your rules actually allow, run the AI Crawler Access Checker on the live URL.
What is the difference between Disallow and Allow?
Disallow tells crawlers not to fetch the matching path. Allow explicitly permits a path that would otherwise be blocked by a parent Disallow rule. Most rules in a normal robots.txt are Disallow; Allow is used surgically to carve out exceptions like Allow: /wp-admin/admin-ajax.php inside an otherwise-blocked wp-admin path.
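When both an Allow and a Disallow rule match a path, Google and the Robots Exclusion Protocol standard (RFC 9309) pick the rule with the longest matching pattern, with Allow winning a tie. The function below is a simplified sketch of that precedence: it handles plain path prefixes only, ignores the * and $ wildcards, and the name is_allowed is made up for this example.

```python
def is_allowed(path, rules):
    """Decide access for `path` from (directive, pattern) pairs of one User-agent group.

    Simplified: prefix matching only, no * or $ wildcard support.
    """
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if pattern and path.startswith(pattern):
            is_allow = directive.lower() == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


wp_rules = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

ajax_ok = is_allowed("/wp-admin/admin-ajax.php", wp_rules)
options_ok = is_allowed("/wp-admin/options.php", wp_rules)
blog_ok = is_allowed("/blog/hello-world/", wp_rules)
print(ajax_ok, options_ok, blog_ok)
```

The longer Allow pattern carves admin-ajax.php out of the blocked wp-admin path, while every other wp-admin URL stays disallowed.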
Why does WordPress mode block wp-includes?
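wp-includes contains the core PHP libraries WordPress uses to render pages. Crawlers that request those PHP files directly see only error pages, blank responses, or directory listings, which wastes crawl budget and can leak software version information that attackers use to identify vulnerable installs. The directory also holds JavaScript and CSS that front-end pages load, so keep those assets fetchable if search engines need them to render your pages.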
Does robots.txt actually stop scrapers and AI training crawlers?
The major search engines and AI vendors (Google, OpenAI, Anthropic, Perplexity, Apple, Meta) publicly state that their crawlers respect robots.txt. Compliance is voluntary, though: the file is a request, not an access control, and some scrapers ignore it entirely. For stronger enforcement than the file alone, add server-side IP blocks or user-agent firewall rules on top.
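A user-agent rule of that kind can be sketched as a small WSGI middleware that answers 403 before the request reaches the site. This is an illustration under assumptions, not a recommended production setup: the names block_user_agents and site and the token list are invented for the example, and because user-agent strings can be spoofed it is usually paired with IP-based rules at the server or CDN.

```python
# Illustrative tokens only; choose the crawlers you actually want to refuse.
BLOCKED_AGENT_TOKENS = ("bytespider", "ccbot")


def block_user_agents(app, blocked=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app so requests from blocked user agents get 403 Forbidden."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


protected_site = block_user_agents(site)
```

Unlike robots.txt, this refuses the request outright, so it also affects crawlers that never read the file.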