Meta-ExternalAgent — Meta's AI Training Crawler
Meta-ExternalAgent collects training data for LLaMA and Meta AI. Learn about its robots.txt compliance, how to block it, and what still works after blocking.
What is Meta-ExternalAgent?
Meta-ExternalAgent is the web crawler Meta uses to gather training data for its AI initiatives, including the LLaMA large language models. It also supports Meta's development of independent search infrastructure. The crawler identifies itself as meta-externalagent/1.1 in the User-Agent HTTP header. Some website administrators have reported inconsistent compliance with robots.txt directives.
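To see whether the crawler is visiting your site, you can scan your access logs for its user-agent token. The sketch below is a minimal example that assumes logs in the common "combined" format, where the user agent is the last double-quoted field. The function name `count_meta_hits` and the sample log lines are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Match the product token case-insensitively; the version suffix may change.
META_AGENT = re.compile(r"meta-externalagent", re.IGNORECASE)

# Assumption: combined log format, where the request line is the first
# double-quoted field and the user agent is the last one.
QUOTED_FIELD = re.compile(r'"([^"]*)"')


def count_meta_hits(log_lines):
    """Count requested paths on lines whose user agent names Meta-ExternalAgent."""
    hits = Counter()
    for line in log_lines:
        fields = QUOTED_FIELD.findall(line)
        if len(fields) < 3:
            continue  # not a combined-format line, skip it
        request, user_agent = fields[0], fields[-1]
        if META_AGENT.search(user_agent):
            parts = request.split()
            path = parts[1] if len(parts) >= 2 else "?"
            hits[path] += 1
    return hits


# Hypothetical sample lines (documentation IP addresses, invented timestamps).
sample = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /post/1 HTTP/1.1" 200 512 "-" "meta-externalagent/1.1"',
    '203.0.113.9 - - [01/Jan/2025:00:00:02 +0000] "GET /post/1 HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(count_meta_hits(sample))
```

Running this against logs written after you add a robots.txt rule is one way to check for yourself whether disallowed paths are still being requested.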
How to Block Meta-ExternalAgent
Add the following to your robots.txt file (located at the root of your website):
User-agent: Meta-ExternalAgent
Disallow: /
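Before deploying the rule, you can check how a standards-following parser reads it. The sketch below uses Python's standard-library `urllib.robotparser`. The `example.com` URL and the `SomeOtherBot` agent name are placeholders. It shows how a well-behaved client would interpret the file, not how Meta's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule on its own, with the two directives on separate lines.
ROBOTS_TXT = """\
User-agent: Meta-ExternalAgent
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The crawler's reported identifier should be denied everywhere.
blocked = not parser.can_fetch("meta-externalagent/1.1", "https://example.com/any/page")

# Agents without a matching group are unaffected by this rule.
others_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/any/page")

print("Meta-ExternalAgent blocked:", blocked)
print("Other agents allowed:", others_allowed)
```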
What Happens When You Block Meta-ExternalAgent
Blocking Meta-ExternalAgent tells Meta not to collect your content for LLaMA and other Meta AI training. Facebook link previews, which are handled by the separate facebookexternalhit crawler, are not affected.
Enforcement Beyond robots.txt
robots.txt is advisory, and Meta-ExternalAgent has been reported to comply with it inconsistently. For stronger enforcement, consider using:
- Cloudflare WAF rules — Block requests matching the Meta-ExternalAgent user-agent string at the edge
- Server-level blocking — Use .htaccess (Apache) or nginx rules to return 403 for the user-agent
- Rate limiting — Throttle requests from the user-agent to reduce server load
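If you cannot change the web server or edge configuration, the same user-agent check can be done in the application itself. The sketch below is a minimal WSGI middleware that returns 403 for any request whose User-Agent contains the crawler's token. The names `block_meta_agent` and `demo_app` are hypothetical. It is an illustration of the technique, not a replacement for edge or server-level rules, and user-agent strings can be spoofed.

```python
# Lowercase token to look for in the User-Agent header.
BLOCKED_TOKEN = "meta-externalagent"


def block_meta_agent(app):
    """Wrap a WSGI app so requests from the blocked user agent get a 403."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if BLOCKED_TOKEN in user_agent.lower():
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Every other client is passed through to the wrapped app untouched.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = block_meta_agent(demo_app)
```

Any WSGI server can serve `application`. Blocking at the edge or in the web server is usually cheaper, because the request never reaches your application code.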
Should You Block Meta-ExternalAgent?
Meta-ExternalAgent is a training crawler: it collects data to build AI models. If you want to keep your content out of future AI training by Meta, block it. Blocking is not retroactive: it only affects future crawls, not data already collected.
Meta-ExternalAgent vs Other Meta Crawlers
Meta operates multiple crawlers, each serving a different purpose:
| User-agent | Purpose | Type |
|---|---|---|
| Meta-ExternalAgent | Collects training data for Meta's LLaMA models | AI Training |
| FacebookBot | Indexes content for Meta's AI features beyond link previews | AI Feature Indexing |
Each crawler operates independently. Blocking Meta-ExternalAgent does not block FacebookBot — you must add a separate rule for each.
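Because each crawler needs its own rule, it can help to generate the groups from a list and check the result. The sketch below builds one `Disallow: /` group per user agent and verifies it with the standard-library `urllib.robotparser`. The helper name `build_robots_txt` is hypothetical, and the check reflects how a standards-following parser reads the file, not a guarantee of crawler behavior.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agents):
    """Return robots.txt text with one full-site Disallow group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in user_agents]
    # A blank line separates the groups.
    return "\n".join(groups)


robots_txt = build_robots_txt(["Meta-ExternalAgent", "FacebookBot"])
print(robots_txt)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Both listed crawlers are denied; facebookexternalhit has no matching
# group, so link previews remain allowed.
for agent in ("Meta-ExternalAgent", "FacebookBot", "facebookexternalhit"):
    print(agent, "allowed:", parser.can_fetch(agent, "https://example.com/"))
```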