see Methods and materials section for our definition of toxic comments […] We define a toxic comment as a comment that has a score of at least 0.8 on any of the six dimensions provided by Perspective API. The score means that on average 8 out of 10 raters would mark it as toxic.
In other words, they did not define it. How was this even published???
Digging into the project’s page (the reader shouldn’t need to do it - this info should be in the article), the six dimensions are “Severe Toxicity, Insult, Profanity, Identity attack, Threat, Sexually explicit” (note the recursiveness in “severe toxicity” - it’s fine for the project’s goals, but not a real definition). With the in-line pop-up saying “We define toxicity as a rude, disrespectful, or unreasonable comment that is likely to make someone leave a discussion.”
With that into account, you actually get what the researchers “discovered”: that comments likely to make someone leave a discussion also reduce activity of volunteer editors on Wikipedia. That is almost tautological - “things likely to make you leave are likely to make you leave”.
It highlights yet again that the word “toxic” is mostly useless, as it gives a façade of objectiveness to what’s intrinsically subjective; doubly so in a scientific context.
(I apologise beforehand for vomiting uncalled advice: if you want to complain about “toxic” behaviour, identify exactly what the other person is doing, that rubs you off. You’ll have better grounds to promote change this way.)
A better approach would be to focus on a specific type of behaviour, and then seeing its impact on Wikipedia contribution. They could even do it through the API, if they focused on one or more of the dimensions.