I feel like with the rise of AI something that anonymizes writing styles should exist. For example it could look for differences in American versus British spelling like color versus colour or contextual things like soccer versus football and make edits accordingly. ChatGPT could be fed a prompt that says “Rewrite the following paragraphs as if they were written by an Australian” but I don’t know if it would have a good enough grasp on the objective or if it would start shoehorning in references to koalas and fairy floss.

I tried searching online to see if something like this existed and found a few articles from around the 2010s such as Software Helps Identify Anonymous Writers or Helps Them Stay That Way by the New York Times. It talks about stylometry and Anonymouth but it seems like Anonymouth hasn’t been updated in years. All recent articles seem to be about plagiarism and AI.

For context what got me thinking about the topic was remembering JK Rowling being revealed to be the author of a mystery novel called The Cuckoo’s Calling. Smithsonian wrote an article about it called How Did Computers Uncover J.K. Rowling’s Pseudonym?. I thought it could make for a neat post here.

  • @MigratingtoLemmy@lemmy.world
    link
    fedilink
    44 months ago

    I had asked for the same thing a while back but didn’t really get much. The round-about method that I have found is to finetune FOSS LLMs on data you want it to represent (largely text) and then diving into some prompt engineering to get it to say something you like.

    However, I haven’t been able to find a test which can accurately point towards text not having specific weights that it relies on. Cue the attacks on GPT-4 which deanonymises data it was trained on. You might also want to read about DPT and Shadowing techniques to red-team LLMs and LLM-generated text as literature.

    Cheers