The beef between Microsoft and Reddit came to light after I published a story revealing that Reddit is currently blocking every crawler from every search engine except Google, which earlier this year agreed to pay Reddit $60 million a year to scrap the site for its generative AI products.
I know the author meant “scrape”, but sometimes it really does feel like AI is just scrapping the old internet for parts.
A search engine shouldn’t have to pay a website for the honor of bringing it visits and ad views.
Fuck reddit, get delisted, no problem.
Weird that google is ignoring their robots.txt though.
Even if they pay them for being able to say that glue is perfect on pizza, having
User-agent: *
Disallow: /
should block googlebot too. That means google programmed an exception on googlebot to ignore robots.txt on that domain and that shouldn’t be done. What’s the purpose of that file then?
Because robots.txt is based entirely on the honor system (a bot has no need to pretend to be another bot, it could just ignore the file), it should be
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
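For anyone who wants to check how a well-behaved crawler would read those two files, here is a rough sketch using Python's standard `urllib.robotparser`. The URL and the `bingbot` agent name are just placeholders I picked for illustration, and the two rule sets are the ones quoted in this thread, not necessarily what Reddit actually serves.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if a compliant parser lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Blanket rule: every crawler is disallowed, Googlebot included.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# An empty Disallow for Googlebot means "nothing is disallowed" for it,
# while every other crawler falls through to the catch-all group.
GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

URL = "https://www.reddit.com/r/all/"  # illustrative URL, an assumption

if __name__ == "__main__":
    print(allowed(BLOCK_ALL, "Googlebot", URL))
    print(allowed(GOOGLE_ONLY, "Googlebot", URL))
    print(allowed(GOOGLE_ONLY, "bingbot", URL))
```

This only shows what a parser that chooses to follow the file would conclude. Nothing here enforces anything, which is the whole point about it being honor-based.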