(cont’d)

…with “key takeaways” and regurgitated paragraphs that all follow the same format. It’s gross, and yet it generates an article long enough with enough keywords to show up on Google.

  • fearout
    20
    11 months ago

    Reposting my comment from another similar thread to show that this is easily fixable, and you should be wary of any non-reputable news source anyway.

    I was curious how current LLMs might handle this with proper instructions, so I asked ChatGPT this: “What can you tell me about this Reddit post? Would you write a news article about this? Analyze the trustworthiness of this information:” and pasted the text from the post. Here’s a part of its reply:

    This Reddit post appears to be discussing updates in the context of […] Hearthstone and World of Warcraft. However, there are several factual inaccuracies which raise questions about the trustworthiness of the post:

    1. Hearthstone was first released in 2014, not 1994. This error may be a typo or a misunderstanding by the user.
    2. Glorbo does not appear to be a known character in either the Hearthstone or World of Warcraft universe […]
    3. Chen Stormstout is a known character from the Warcraft universe, but the claim about him being the end boss of a new version of Karazhan doesn’t match with any known information […]
    4. Dragonflight is not known as a video game developer or associated term in the context of Hearthstone or World of Warcraft […]

    Given these points, it seems this post might be either a joke, speculation, or misinformation. […] So I’d recommend taking this information with a grain of salt and verifying it from reliable sources.

    […] I would caution against using the information in this post for a news article due to the factual inaccuracies and lack of verifiable sources. Good journalism relies on accuracy, fact-checking, and use of reliable sources. This Reddit post contains several claims that contradict established knowledge, and without further verification from reliable, official sources (like game developers or official news releases), it would not be advisable to use this information in a news article.

    So it’s not even an issue with current models, just bad setup. An autoGPT with several fact-checking questions added in can easily filter this stuff.
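
    A rough sketch of what that kind of filter might look like. The prompt wording is the one quoted in this comment; everything else is an assumption made for illustration: the names `build_prompt` and `should_publish`, the `ask_model` callable (a stand-in for whatever LLM client a scraper would actually use), and the `VERDICT:` line the model is asked to end with. It fails closed, so a reply without a clear verdict is treated as a rejection.

```python
from typing import Callable

# Prompt wording quoted from the comment; the rest is hypothetical.
FACT_CHECK_PROMPT = (
    "What can you tell me about this Reddit post? "
    "Would you write a news article about this? "
    "Analyze the trustworthiness of this information:"
)

# Assumed convention so the reply can be parsed mechanically.
VERDICT_INSTRUCTION = (
    "Finish with a single line that is exactly "
    "'VERDICT: PUBLISH' or 'VERDICT: REJECT'."
)


def build_prompt(post_text: str) -> str:
    """Combine the fact-checking question, the post, and the verdict request."""
    return f"{FACT_CHECK_PROMPT}\n\n{post_text}\n\n{VERDICT_INSTRUCTION}"


def should_publish(post_text: str, ask_model: Callable[[str], str]) -> bool:
    """Return True only if the model's last non-empty line approves publishing."""
    reply = ask_model(build_prompt(post_text))
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    # Fail closed: an empty or unparseable reply counts as a rejection.
    return bool(lines) and lines[-1].upper() == "VERDICT: PUBLISH"


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call, used only to show the control flow."""
    return "Hearthstone was first released in 2014, not 1994.\nVERDICT: REJECT"


print(should_publish("Glorbo is coming to Hearthstone!", stub_model))  # False
```

    Whether the verdict is any good still depends entirely on the model behind `ask_model`; the sketch only shows where the check would sit in the pipeline.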

      • fearout
        5
        11 months ago

        Half of the deleted […] things are chatGPT mentioning its 2021 knowledge cutoff and suggesting double-checking that info. It was mentioned in this case as well.

        If it were an autoGPT with internet access, I think these would prompt an automated online lookup to fact-check it.

  • Dr. Moose
    English
    8
    11 months ago

    I don’t really get the hostility towards AI scraping. Don’t we want to have a healthy shared graph of human knowledge? This data is also used by open source models. It’s poisoning the well for everybody to spite some companies who also have the resources to filter this - so you’re really just hurting the good guys.

    Hate is making people do stupid things. It’s embarrassing to call yourself a gamer these days ngl.

    • @abrasiveteapot@sh.itjust.works
      English
      6
      11 months ago

      Because regurgitation without understanding leads to demonstrably untrue information being propagated as fact. There have also been a number of instances where AIs have straight up made stuff up.

      • Dr. Moose
        English
        2
        11 months ago

        It shouldn’t be used as a fact tool tho, and it’s not intended for that.

  • @jocanib@lemmy.world
    English
    6
    11 months ago

    tbf this is not very much different from how many flesh’n’blood journalists have been finding content for years. The legendary crack squirrels of Brixton was nearly two decades ago now (yikes!). Fox was a little late to the party with U.K. Squirrels Are Nuts About Crack in 2015.

    Obviously, I want flesh’n’blood writers getting paid for their plagiarism-lite, not the cheapskates who automate it. But this kind of embarrassing error is a feature of the genre. And it has been gamed on social media for some time now (eg Lib Dem leader Jo Swinson forced to deny shooting stones at squirrels after spoof story goes viral)

    I don’t know what it is about squirrels…

    • BarqsHasBite (OP)
      English
      2
      11 months ago

      I can’t edit the top box, but I edited mine to take out the duplicate text. The link was automatic though.

    • Max-P
      English
      2
      11 months ago

      Automatic feature. Anything that looks like a valid domain gets autolinked

  • exohuman
    2
    11 months ago

    Isn’t this what /u/spez was also concerned about? AI mining Reddit for content?