• @zod000@lemmy.ml
    252 days ago

    Bullshit. This bot doesn’t identify itself as a bot and doesn’t rate limit itself to anything close to an appropriate level. We were seeing more traffic from this thing than from all other crawlers combined.

      • @Zangoose@lemmy.world
        42 days ago

        Even if they were rate limiting, they’re still just using the bot to train an AI. If it’s from a company, there’s a 99% chance the bot is bad. I’m leaving 1% for whatever the Internet Archive (are they even a company tho?) is doing.

      • @zod000@lemmy.ml
        31 days ago

        I don’t hate all bots, I hate this bot specifically because:

        • they intentionally hide that they are a bot to evade our, and everyone else’s, methods of restricting which bots we allow and how much activity we allow.
        • they do not respect robots.txt
        • they do not rate limit themselves, as already mentioned
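
For contrast, here is a rough sketch of what a crawler that did behave on all three points might look like, using Python's standard `urllib.robotparser`. The bot name, contact URL, and the `PoliteCrawlerPolicy` class are made up for illustration; this is not how any particular crawler is implemented, just one way the three rules could be checked before each request.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical bot name and contact URL, for illustration only.
# An honest crawler sends a User-Agent like this so sites can recognize it.
USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot-info)"


class PoliteCrawlerPolicy:
    """Decides whether and when a crawler may fetch a URL (hypothetical helper)."""

    def __init__(self, robots_lines, user_agent=USER_AGENT,
                 default_delay=1.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.parser = RobotFileParser()
        # robots_lines is the site's robots.txt content, already split into lines.
        self.parser.parse(robots_lines)
        # Use the site's Crawl-delay if it declares one, else a fallback delay.
        delay = self.parser.crawl_delay(user_agent)
        self.delay = float(delay) if delay is not None else default_delay
        self.clock = clock
        self.last_request = None

    def allowed(self, url):
        # Respect robots.txt: only fetch what the site permits for this agent.
        return self.parser.can_fetch(self.user_agent, url)

    def seconds_to_wait(self):
        # Rate limiting: how long to sleep before the next request is polite.
        if self.last_request is None:
            return 0.0
        elapsed = self.clock() - self.last_request
        return max(0.0, self.delay - elapsed)

    def record_request(self):
        # Call this right after each request is sent.
        self.last_request = self.clock()
```

A crawl loop would call `allowed(url)` first, sleep for `seconds_to_wait()`, send the request with `USER_AGENT` in its headers, then call `record_request()`. The bot being complained about here reportedly does none of these.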