• Cethin@lemmy.zip
    6 months ago

    Also, captchas are meant to gather data to train on. That’s why we used to get pictures of writing, but that’s basically solved now. It’s why a lot of them are now focused on self-driving vehicles: identifying buses, bikes, traffic lights/signs, and that sort of thing.

    Captchas get humans to label data so the ML algorithms can train on it, eventually becoming able to solve the tests themselves.
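
    As a rough sketch of that pipeline (all names and fields here are made up for illustration): each solved captcha is a human-provided label, and the answers can be aggregated into a training set.

    ```python
    # Toy illustration of captchas as a labeling pipeline (hypothetical names).
    from dataclasses import dataclass

    @dataclass
    class CaptchaResponse:
        image_id: str  # which tile the user was shown
        label: str     # what the user said it contains, e.g. "bus"

    def aggregate_labels(responses):
        """Majority-vote the human answers per image to build training labels."""
        votes = {}
        for r in responses:
            votes.setdefault(r.image_id, {}).setdefault(r.label, 0)
            votes[r.image_id][r.label] += 1
        return {img: max(counts, key=counts.get) for img, counts in votes.items()}

    responses = [
        CaptchaResponse("tile_001", "bus"),
        CaptchaResponse("tile_001", "bus"),
        CaptchaResponse("tile_001", "truck"),
        CaptchaResponse("tile_002", "traffic light"),
    ]
    print(aggregate_labels(responses))  # {'tile_001': 'bus', 'tile_002': 'traffic light'}
    ```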

    • AwkwardLookMonkeyPuppet@lemmy.world
      6 months ago

      Now it’s making me identify developed pictures from a photo negative. I’m not quite sure what they’re going to do with that training since computers can already perform that task.

      • TheOakTree@lemm.ee
        6 months ago

        Also the “select the image below containing the example image above.”

        Like… we already have computers that can recognize image repetitions.

        • Cethin@lemmy.zip
          6 months ago

          So that’s almost certainly trying to gather data to defeat data poisoning. The other image is probably slightly altered in a way you can’t detect.
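
          As a toy illustration of “altered in a way you can’t detect” (purely hypothetical numbers, NumPy only): a per-pixel change of at most one intensity level is invisible to a human, yet the stored data is no longer the same.

          ```python
          # Purely illustrative: a perturbation too small to see still changes the data.
          import numpy as np

          rng = np.random.default_rng(0)
          image = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float32)

          # Noise of at most +/-1 out of 255 -- imperceptible to a human viewer.
          poisoned = np.clip(image + rng.uniform(-1.0, 1.0, size=image.shape), 0, 255)

          print(np.abs(poisoned - image).max())   # <= 1.0, i.e. invisible
          print(np.array_equal(poisoned, image))  # False: the data is no longer identical
          ```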

      • bitwolf
        6 months ago

        A common OCR tactic is to turn the image negative and bump the contrast to make text easier to recognize.

        It could be a precursor for that step.
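
        That inversion-plus-contrast step is simple to express with Pillow; a minimal sketch (the filenames are placeholders, not a real pipeline):

        ```python
        # Sketch of the preprocessing described above, using Pillow.
        from PIL import Image, ImageEnhance, ImageOps

        img = Image.open("captcha.png").convert("L")   # load and convert to grayscale
        img = ImageOps.invert(img)                     # flip negative <-> positive
        img = ImageEnhance.Contrast(img).enhance(2.0)  # bump the contrast
        img.save("preprocessed.png")                   # ready to feed to an OCR engine
        ```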