Study shows AI image-generators being trained on explicit photos of children

Hidden inside the foundation of popular artificial intelligence image-generators are thousands of images of child sexual abuse, according to a new report that urges companies to take action to address a harmful flaw in the technology they built.

  • cyd@lemmy.world · 43 points · 1 year ago

    3,200 images is a vanishingly small fraction of the dataset in question (well under 0.001%), obviously swept in by mistake. The problematic images ought to be removed from the dataset, but they do not “contaminate” models trained on it in any plausible way.
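    A quick back-of-the-envelope check of that fraction (a sketch; it assumes the dataset in question is LAION-5B, whose published size is roughly 5.85 billion image-URL pairs):

    ```python
    # Rough scale check: what share of the dataset do the flagged entries represent?
    # Assumes LAION-5B's published size of ~5.85 billion image-URL pairs.
    flagged = 3_200
    dataset_size = 5_850_000_000

    fraction = flagged / dataset_size
    print(f"{fraction:.10f}")        # ~0.0000005470
    print(f"{fraction * 100:.7f}%")  # ~0.0000547% of the dataset
    ```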

    • L_Acacia · 2 points · 1 year ago

      It’s not even 3,200 images used; it’s 3,200 hashed URLs found in the dataset. Most of the images have most likely since been removed and the URLs are dead, so no model was trained on them.
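
      The distinction matters because LAION-style datasets store rows of URL, caption, and image-hash metadata rather than the images themselves, so flagged entries can be dropped by hash without ever fetching anything. A minimal sketch of that kind of filtering, assuming hypothetical file names and a `hash` column (not the report’s actual tooling):

      ```python
      import csv

      # Hypothetical blocklist of image hashes flagged by an external audit.
      with open("flagged_hashes.txt") as f:
          flagged = {line.strip() for line in f}

      # Each dataset row holds (url, caption, hash) metadata, not an image.
      # Filtering removes flagged rows before any training-time download.
      with open("dataset.csv", newline="") as src, \
           open("dataset_clean.csv", "w", newline="") as dst:
          reader = csv.DictReader(src)
          writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
          writer.writeheader()
          for row in reader:
              if row["hash"] not in flagged:  # keep only unflagged entries
                  writer.writerow(row)
      ```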