SEO - v1.0.0
Robots.txt
ArtisanPack UI SEO provides dynamic robots.txt generation with support for global rules, bot-specific directives, and automatic sitemap inclusion.
Overview
The robots.txt file tells search engine crawlers which pages they can or cannot crawl. The package generates this file dynamically based on your configuration.
Configuration
// In config/seo.php
'robots' => [
    'enabled' => true,
    'global' => [
        'disallow' => ['/admin', '/api', '/private'],
        'allow' => ['/api/public'],
    ],
    'bots' => [
        'GPTBot' => ['disallow' => ['/']],
        'CCBot' => ['disallow' => ['/']],
        'Googlebot' => ['allow' => ['/']],
    ],
    'crawl_delay' => null,
    'sitemaps' => true,
    'host' => null,
],
| Option | Description |
|---|---|
| enabled | Enable dynamic robots.txt |
| global | Rules for all bots |
| bots | Bot-specific rules |
| crawl_delay | Delay between requests (seconds) |
| sitemaps | Auto-include sitemap URLs |
| host | Host directive (for Yandex) |
Serving Robots.txt
Via Routes
// In config/seo.php
'routes' => [
    'robots' => true,
],
// Registers: GET /robots.txt
Via Controller
use ArtisanPackUI\Seo\Services\RobotsService;

class RobotsController extends Controller
{
    public function __invoke(RobotsService $robots)
    {
        return response($robots->generate())
            ->header('Content-Type', 'text/plain');
    }
}
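If you serve robots.txt through your own controller, register a route for it and disable the package's built-in route so the two do not collide. A minimal sketch, assuming the controller above lives in App\Http\Controllers and that the built-in route can be turned off via the 'routes' config shown earlier:

// routes/web.php (assumes 'routes' => ['robots' => false] in config/seo.php)
use App\Http\Controllers\RobotsController;

Route::get('/robots.txt', RobotsController::class);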
Generated Output
# Robots.txt generated by ArtisanPack UI SEO
User-agent: *
Disallow: /admin
Disallow: /api
Disallow: /private
Allow: /api/public
User-agent: GPTBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Googlebot
Allow: /
Sitemap: https://example.com/sitemap.xml
Programmatic Management
Using the Service
use ArtisanPackUI\Seo\Services\RobotsService;
$robotsService = app('seo.robots');
// Generate content
$content = $robotsService->generate();
// Add disallow rule (default user-agent: *)
$robotsService->disallow('/secret');
// Add allow rule
$robotsService->allow('/public');
// Add bot-specific rules
$robotsService->disallow('/archive', 'Bingbot');
$robotsService->crawlDelay(5, 'Bingbot');
// Add sitemap
$robotsService->addSitemap('https://example.com/sitemap.xml');
// Set crawl delay for all bots
$robotsService->crawlDelay(10);
// Set host directive
$robotsService->setHost('example.com');
// Get rules for a specific user-agent
$rules = $robotsService->getRulesForUserAgent('Googlebot');
// Get all user agents with rules
$userAgents = $robotsService->getUserAgents();
// Clear all rules and start fresh
$robotsService->clearRules();
// Remove rules for a specific user-agent
$robotsService->removeUserAgent('GPTBot');
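These calls can be combined to build the rules at runtime. A minimal sketch placed in a service provider's boot() method (the placement is illustrative; any code that runs before the response is generated works):

// e.g. app/Providers/AppServiceProvider.php
public function boot(): void
{
    $robots = app('seo.robots');

    $robots->clearRules();                    // discard previously registered rules
    $robots->disallow('/admin');              // User-agent: *
    $robots->disallow('/api');
    $robots->allow('/api/public');
    $robots->disallow('/', 'GPTBot');         // bot-specific rule
    $robots->crawlDelay(10);                  // applies to all bots
    $robots->addSitemap(url('/sitemap.xml'));
}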
Using the Helper
// Get robots.txt content
$content = seoRobotsTxt();
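The helper is also useful outside the request cycle, for example to write a static copy of the file during deployment so the web server can serve it directly (a sketch; when and where you run it is up to you):

use Illuminate\Support\Facades\File;

// Write the generated content to public/robots.txt
// Note: a static file in public/ is typically served before Laravel's dynamic route
File::put(public_path('robots.txt'), seoRobotsTxt());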
Bot-Specific Rules
Common Bots
| Bot | Description |
|---|---|
| Googlebot | Google's crawler |
| Bingbot | Microsoft Bing's crawler |
| Slurp | Yahoo's crawler |
| DuckDuckBot | DuckDuckGo's crawler |
| Baiduspider | Baidu's crawler |
| YandexBot | Yandex's crawler |
AI/LLM Bots
| Bot | Description |
|---|---|
| GPTBot | OpenAI's crawler |
| ChatGPT-User | ChatGPT browsing |
| CCBot | Common Crawl |
| anthropic-ai | Anthropic's crawler |
| Claude-Web | Claude browsing |
Blocking AI Crawlers
'bots' => [
    'GPTBot' => ['disallow' => ['/']],
    'ChatGPT-User' => ['disallow' => ['/']],
    'CCBot' => ['disallow' => ['/']],
    'anthropic-ai' => ['disallow' => ['/']],
],
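The same rules can be added at runtime using the bot-specific disallow() signature from the service section above (a sketch):

$robotsService = app('seo.robots');

foreach (['GPTBot', 'ChatGPT-User', 'CCBot', 'anthropic-ai'] as $bot) {
    $robotsService->disallow('/', $bot);
}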
Advanced Configuration
Crawl Delay
'robots' => [
    // Global crawl delay for all bots
    'crawl_delay' => 10,
    // Per-bot crawl delay
    'bots' => [
        'Bingbot' => [
            'crawl_delay' => 5,
        ],
    ],
],
Request Rate
Some crawlers honor the non-standard Request-rate directive:
'bots' => [
    'Googlebot' => [
        'request_rate' => '1/10', // 1 request per 10 seconds
    ],
],
Visit Time
Some crawlers honor the non-standard Visit-time directive, which specifies preferred crawl hours:
'bots' => [
    'Googlebot' => [
        'visit_time' => '0400-0845', // Crawl between 4:00 AM and 8:45 AM
    ],
],
Environment-Based Configuration
Development/Staging
Block all crawlers in non-production environments:
'robots' => [
    'enabled' => true,
    'global' => [
        'disallow' => app()->environment('production') ? [] : ['/'],
    ],
],
Or in the service:
$robotsService = app('seo.robots');

if (!app()->environment('production')) {
    $robotsService->disallow('/');
}
Generated Output (Staging)
User-agent: *
Disallow: /
Multiple Sitemaps
Include multiple sitemap URLs:
'robots' => [
    'sitemaps' => [
        'https://example.com/sitemap.xml',
        'https://example.com/sitemap-news.xml',
        'https://example.com/sitemap-images.xml',
    ],
],
Or programmatically:
$robotsService->addSitemap('https://example.com/sitemap.xml');
$robotsService->addSitemap('https://example.com/sitemap-news.xml');
Clean URLs
Be deliberate with trailing slashes and wildcards; they change what a pattern blocks:
// Blocks the /admin/ directory and everything under it
'disallow' => ['/admin/'],
// Broader: blocks any path that starts with /admin (including /admin.php and /administrator)
'disallow' => ['/admin'],
// Wildcard form, equivalent to /admin/ for crawlers that support wildcards
'disallow' => ['/admin/*'],
Common Patterns
Standard Website
'robots' => [
    'global' => [
        'disallow' => [
            '/admin/',
            '/api/',
            '/private/',
            '/cart/',
            '/checkout/',
            '/account/',
            '/*.pdf$',
            '/*?*', // Block URLs with query strings
        ],
        'allow' => [
            '/api/public/',
        ],
    ],
],
E-commerce Site
'robots' => [
    'global' => [
        'disallow' => [
            '/admin/',
            '/cart/',
            '/checkout/',
            '/account/',
            '/wishlist/',
            '/compare/',
            '/search/',
            '/*?sort=',
            '/*?filter=',
        ],
    ],
],
Blog/Content Site
'robots' => [
    'global' => [
        'disallow' => [
            '/admin/',
            '/wp-admin/', // If migrated from WordPress
            '/tag/', // Avoid duplicate content
            '/author/',
            '/search/',
        ],
    ],
],
Testing Robots.txt
Google Search Console
Use the robots.txt report in Google Search Console to confirm that Google can fetch your file and to spot parsing issues.
Programmatic Testing
$robotsService = app('seo.robots');
// Get rules for a specific user-agent
$rules = $robotsService->getRulesForUserAgent('Googlebot');
// Check the disallow and allow arrays
$disallowedPaths = $rules['disallow'] ?? [];
$allowedPaths = $rules['allow'] ?? [];
// Get all configured user agents
$userAgents = $robotsService->getUserAgents();
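If the package route is enabled, a simple feature test can guard the output against regressions. A sketch that assumes the configuration shown at the top of this page:

// tests/Feature/RobotsTxtTest.php
namespace Tests\Feature;

use Tests\TestCase;

class RobotsTxtTest extends TestCase
{
    public function test_robots_txt_disallows_admin(): void
    {
        $this->get('/robots.txt')
            ->assertOk()
            ->assertSee('User-agent: *')
            ->assertSee('Disallow: /admin');
    }
}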
Caching
Robots.txt can be cached:
'robots' => [
    'cache' => true,
    'cache_ttl' => 3600, // 1 hour
],
Clear the cache using the CacheService:
$cacheService = app('seo.cache');
$cacheService->forget('robots');
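When you change rules at runtime, clear the cached copy so the next request regenerates the file. A sketch using the cache key shown above:

$robotsService = app('seo.robots');
$robotsService->disallow('/beta');

// Invalidate the cached robots.txt so the new rule is picked up
app('seo.cache')->forget('robots');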
Events
Robots.txt generation does not dispatch events by default. If you need to react to crawler requests, hook into the HTTP request instead, for example with middleware.
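For example, a global middleware could log or inspect requests for /robots.txt. A hypothetical sketch (the class name and logging behavior are illustrative, not part of the package):

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Log;

class LogRobotsRequests
{
    public function handle(Request $request, Closure $next)
    {
        if ($request->is('robots.txt')) {
            Log::info('robots.txt requested', ['user_agent' => $request->userAgent()]);
        }

        return $next($request);
    }
}

Register it as global middleware so it runs for the package route as well.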
Next Steps
- XML Sitemaps - Sitemap configuration
- Configuration - Full config reference
- Artisan Commands - CLI tools