• Lvxferre [he/him]@mander.xyz
    2 days ago

    You got me curious, so I checked it.

    I downloaded this wordlist with 479k words, and used find+replace to count four strings: cie, cei, ie, ei. Here’s the result:

    • 16566 (75%) ie vs. 5649 (25%) ei
    • 875 (74%) cie vs. 302 (26%) cei

    So the basic rule (i before e) holds some merit, but the “except after c” part is bullshit - it’s practically the same distribution.
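For anyone who wants to repeat the count without find+replace, here is a minimal sketch. It assumes the wordlist is simply a collection of strings, and that "ie" and "ei" are counted as raw substrings, so "cie" and "cei" hits are included in those totals. The function names and the six-word sample are made up for illustration; they are not the 479k-word list.

```python
from collections import Counter

PATTERNS = ("ie", "ei", "cie", "cei")


def count_patterns(words, patterns=PATTERNS):
    """Count substring occurrences of each pattern across all words."""
    counts = Counter({p: 0 for p in patterns})
    for word in words:
        w = word.lower()
        for p in patterns:
            # str.count returns the number of non-overlapping matches
            counts[p] += w.count(p)
    return counts


def share(a, b):
    """Percentage share of a within a + b, rounded to a whole percent."""
    return round(100 * a / (a + b))


# Tiny illustrative sample, NOT the real wordlist
sample = ["piece", "either", "receive", "science", "ceiling", "believe"]
counts = count_patterns(sample)

# Percentages recomputed from the totals quoted in the post
ie_share = share(16566, 5649)
cie_share = share(875, 302)
```

With a real wordlist file, the words would be read one per line and passed to `count_patterns` in place of `sample`.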

    Of course, this takes all words as equiprobable; results would be different if including the odds of a word appearing in the text into the maths.
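A rough sketch of what that weighting could look like: each word's hits are multiplied by how often the word occurs in text. The frequency numbers below are invented purely for illustration, and `weighted_counts` is a hypothetical helper, not something from the original count.

```python
def weighted_counts(freqs, patterns=("ie", "ei", "cie", "cei")):
    """Sum pattern occurrences, weighting each word by its frequency."""
    totals = {p: 0 for p in patterns}
    for word, freq in freqs.items():
        w = word.lower()
        for p in patterns:
            totals[p] += w.count(p) * freq
    return totals


# Made-up frequencies (occurrences per some fixed amount of text)
toy_freqs = {"piece": 5, "either": 10, "receive": 4, "science": 3}
totals = weighted_counts(toy_freqs)
```

With real corpus frequencies, a handful of very common words could dominate the totals, which is why the weighted split may look quite different from the equiprobable one.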

    • CanadaPlus@lemmy.sdf.org
      1 day ago

      Of course, this takes all words as equiprobable; results would be different if including the odds of a word appearing in the text into the maths.

      I feel like it works more like 90% of the time when it comes up, so maybe that’s the explanation. And could it be that the words where “ie” appears are more ambiguous somehow, as in they don’t fit neatly into some existing pattern?

      I don’t remember the “after c” bit ever being of use, though, so that part totally makes sense.

      Edit: For example, I’d never forget the spelling of “either”, because it’s so common and initial letters are more memorable. But “piece” is tricky - “peice” is my first instinct, and I literally say “i before e” in my head when I write it now.