• Echo Dot@feddit.uk
    1 year ago

    That’s because regular expressions are a terrible way to try to solve the problem. You don’t do exact matching; you do probabilistic pattern matching, and if the probability of something exceeds a preset threshold, you block it. Then you adjust that threshold based on how often the comment comes up in your data set. After that, it’s just a matter of massaging your probability values.
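
To make that concrete, here is a rough Python sketch of the idea, not anything a real filter is known to use. It assumes a tiny naive Bayes word model for the probability, and it assumes that "adjusting the threshold on frequency" means lowering the bar each time the same comment shows up again. The class name `CommentFilter` and the parameters `base_threshold`, `step`, and `min_threshold` are all made up for illustration.

```python
import math
from collections import Counter


def tokens(text):
    """Very crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class CommentFilter:
    """Hypothetical probabilistic filter with a frequency-adjusted threshold."""

    def __init__(self, base_threshold=0.9, step=0.1, min_threshold=0.5):
        self.base_threshold = base_threshold  # bar for a comment seen once
        self.step = step                      # how much each repeat lowers the bar
        self.min_threshold = min_threshold    # the bar never drops below this
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.seen = Counter()                 # how often each comment has appeared

    def train(self, text, is_spam):
        """Add one labelled example to the word counts."""
        target = self.spam_counts if is_spam else self.ham_counts
        target.update(tokens(text))

    def probability(self, text):
        """Naive Bayes spam probability with add-one smoothing and equal priors."""
        vocab = len(set(self.spam_counts) | set(self.ham_counts)) + 1
        spam_total = sum(self.spam_counts.values())
        ham_total = sum(self.ham_counts.values())
        log_odds = 0.0
        for tok in tokens(text):
            p_spam = (self.spam_counts[tok] + 1) / (spam_total + vocab)
            p_ham = (self.ham_counts[tok] + 1) / (ham_total + vocab)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long comments.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))

    def threshold_for(self, key):
        """Assumption: each repeat of the same comment lowers the threshold."""
        repeats = max(0, self.seen[key] - 1)
        return max(self.min_threshold, self.base_threshold - self.step * repeats)

    def should_block(self, text):
        """Record the comment, then block if its score meets the current threshold."""
        key = " ".join(tokens(text))
        self.seen[key] += 1
        return self.probability(text) >= self.threshold_for(key)
```

You would call `train()` on some labelled comments first, then `should_block()` on each incoming one. The "massaging" part is tuning `base_threshold`, `step`, and `min_threshold` against your own data until the false positives are tolerable.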