GPTBot identifies itself with the user agent GPTBot and respects robots.txt. Allowing it lets OpenAI include your content in training data; disallowing it keeps your pages out of future model training. The crawl is substantial: Botify's April 2026 analysis of about 7 billion log files found OpenAI's total crawling roughly tripled after GPT-5 launched in August 2025, with GPTBot alone up 2.9x.
The common mistake is assuming GPTBot controls whether you appear in ChatGPT. It does not. The pages ChatGPT cites in its live search answers are fetched by a different crawler, OAI-SearchBot. If you block GPTBot to stay out of training but leave OAI-SearchBot allowed, you can still be cited in ChatGPT search. A blanket User-agent: * rule with Disallow: /, however, applies to every crawler that has no more specific group of its own, so it blocks both.
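To check how a given robots.txt treats the two crawlers, one option is to run it through a parser before deploying it. The sketch below uses Python's standard-library urllib.robotparser. The two policies, the is_allowed helper, and the example.com URL are illustrative assumptions rather than anything published by OpenAI, and a real crawler's rule matching may differ in edge cases from this parser's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training, stay eligible for ChatGPT search.
SELECTIVE = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

# Blanket policy: no crawler-specific groups, so the wildcard applies to both.
BLANKET = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str,
               url: str = "https://example.com/article") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Print the verdict for each crawler under each policy.
    for name, policy in (("selective", SELECTIVE), ("blanket", BLANKET)):
        for agent in ("GPTBot", "OAI-SearchBot"):
            print(f"{name:9} {agent:13} allowed={is_allowed(policy, agent)}")
```

Under the selective policy only OAI-SearchBot is allowed; under the blanket policy neither is. If you rely on a wildcard block for other bots, adding an explicit OAI-SearchBot group is one way to keep search citations possible, since a crawler-specific group takes precedence over the wildcard.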