But what makes Grok different is its direct access to posts made on X. This enables Grok to have “real-time knowledge of the world,” according to the company, which gives it a “massive advantage over other models,” as Musk put it.
Yeah. I can see where information gleaned from “X” would be more accurate and reliable than information gleaned from the internet-at-large. (Though I do enjoy seeing other AI-type entities scrape my site, because everything is better with some obscure 19th-century novels mixed in.)
Unlike other AIs, Grok's training emphasizes X data. Based on xAI disclosures and LLM industry analyses, estimated top sources by percentage: X (Twitter) posts 35%, Common Crawl 25%, Wikipedia 10%, academic papers 8%, books 7%, code repositories 5%, news sites 4%, blogs 3%, forums 2%, other public datasets 1%. Exact details are proprietary; I supplement with real-time tools.
Twitter/X crawls them anyway to get metadata
That brings up the obvious query: is it still called the Twitterbot, or has it too renamed itself?
If Grok or xAI is following URLs posted on X, what would show up in the logs?
I'll be adding "Version/17.0 Mobile/15E148" to my list of UA bot-detection
:: quick run to raw logs ::
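For anyone who wants to make that same quick run to the raw logs, here is a minimal sketch in Python. It assumes the server writes the common "combined" log format, where the user agent is the last double-quoted field on each line. The names `SUSPECT_FRAGMENTS`, `extract_user_agent`, and `count_suspect_hits` are made up for illustration. The two fragments are simply the strings mentioned in this thread, not a confirmed signature for Grok or xAI.

```python
import re
from collections import Counter

# Hypothetical watch list: both strings come from this thread.
# Neither is a confirmed Grok/xAI signature.
SUSPECT_FRAGMENTS = ["Version/17.0 Mobile/15E148", "Twitterbot"]

# Assumption: combined log format, so the user agent is the
# last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def extract_user_agent(line):
    """Return the last quoted field of a log line, or None if there is none."""
    match = UA_PATTERN.search(line)
    return match.group(1) if match else None


def count_suspect_hits(lines, fragments=SUSPECT_FRAGMENTS):
    """Count how many log lines carry each watched fragment in the user agent."""
    hits = Counter()
    for line in lines:
        user_agent = extract_user_agent(line)
        if user_agent is None:
            continue  # skip lines that do not look like combined-format entries
        for fragment in fragments:
            if fragment in user_agent:
                hits[fragment] += 1
    return hits
```

To try it, pass any iterable of log lines, for example an open file handle on your own access log (the path depends on your server). One caution: as far as I know, "Version/17.0 Mobile/15E148" also appears in ordinary iPhone Safari user agents, so a hit is a reason to look closer at the request, not proof of a bot.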