Detecting Toxic Spans with spaCy
Introduction
An expression is toxic if it uses rude, disrespectful, or unreasonable language that is likely to make someone leave a discussion. Toxic language can be short like “idiot” or longer like “your ‘posts’ are as usual ignorant of reality”.
We will use the SpanCategorizer from spaCy to detect toxic spans. For illustration we will use a well-researched dataset. The present article focuses on the spaCy configuration and usage. Part two, “Custom evaluation of spans in spaCy”, focuses on integrating the metric calculations presented here as a scorer in the spaCy pipeline.
You can find the full code for this article tagged as release 1.0.0 at: https://github.com/ceesroele/toxic_spans
To download this release:
git clone https://github.com/ceesroele/toxic_spans.git --branch v1.0.0
Categorizing spans
A span is a fragment of a text which we may label as belonging to some category. A text can consist of multiple spans, which may have different labels. Span categorization is similar to Named Entity Recognition (NER). However, unlike entities in NER, spans may overlap: a word may be part of two different spans at the same time.
spaCy has contained a SpanCategorizer since version 3.1. This is an element in the spaCy pipeline, named spancat for short, that can detect spans for which it has been trained.
Toxic spans dataset
We use the dataset from task 5 of SemEval-2021. You can find a description of the task and a discussion of the outcomes for the different participants in “SemEval-2021 Task 5: Toxic Spans Detection” (2021) by John Pavlopoulos et al¹. The dataset and baseline code are available on GitHub.
The dataset contains some ten thousand posts with annotations for toxic spans. Each annotation is a list of the character indexes of the post that are part of a toxic span.
Example from the CSV file containing the data:
“[11, 12, 13, 14, 16, 17, 18, 19, 20]”,you’re one sick puppy.
The character indexes 11 to 20 refer to “sick puppy”; index 15, the space between the two words, is not included.
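A quick way to verify this (a small illustrative snippet, not part of the project code):

```python
text = "you're one sick puppy."
toxic_offsets = [11, 12, 13, 14, 16, 17, 18, 19, 20]

# The space at offset 15 is not annotated, so the join yields "sickpuppy"
print("".join(text[i] for i in toxic_offsets))
```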
Generally, about the dataset:
- Many of the posts contain one span
- The vast majority of the spans consist of a single word, e.g. “idiot”.
Note that there is only one category (“toxic”) and the identification of spans is per character. So even though it would theoretically be possible to have two partially overlapping toxic spans, in practice this does not occur in our dataset. One consequence is that it is possible to treat the problem as an NER problem rather than as a span problem. In fact, this is how detecting toxic spans was implemented in a spaCy NER example script provided with the dataset.
Project
Our goal is to create a spaCy pipe that detects toxic spans in a text. To do this, we must train a model using our dataset. Once the model is trained we want to evaluate it. If it is to our satisfaction, we will deploy our model.
Here are our steps:
- Download the CSV data files (already divided into train, dev, and test sets)
- Convert the data into a format that can be used for training
- Train
- Evaluate
- Deploy the resulting model
spaCy provides standard functionality for training and evaluating models. This can be invoked from Python code, but also through a project. We will go the latter way and define a spaCy project for our steps. As a basis we take the project.yml from the experimental NER-spancat project, which provides NER for the Indonesian language using SpanCategorizer functionality.
Download data files
In our project file we define the location of our assets, that is, the data we need. Here we take all files from a path in a repository on GitHub and place them in the destination subdirectory “assets”.
Let’s download the assets:
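spaCy fetches the assets defined in project.yml with its built-in assets command, run from the project root:

```
python -m spacy project assets
```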
We see that the “1 asset(s)” consists of three CSV files. These are the files that were in the path SemEval2021/data in the repository defined above.
Convert
Now that we have our data files, we need to convert the data to a format suitable for training. Here we turn the lists of indexes and the text strings in the rows of our three CSV files into Doc objects with defined spans. These Doc objects we bundle into a single DocBin object, which we save to a file in spaCy’s custom binary .spacy format.
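The actual conversion code is in the repository; as a rough sketch (assuming each CSV file has a header row and two columns, spans and text, and using the custom spans_key txs introduced later in this article), it could look like this:

```python
import ast
import csv

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")


def offsets_to_ranges(offsets):
    """Turn a list of character offsets into contiguous (start, end) ranges."""
    ranges = []
    for i in sorted(offsets):
        if ranges and i == ranges[-1][1]:
            ranges[-1][1] = i + 1      # extend the current range
        else:
            ranges.append([i, i + 1])  # start a new range
    return [(start, end) for start, end in ranges]


def csv_to_docbin(csv_path, output_path, spans_key="txs"):
    """Convert one CSV file of (spans, text) rows into a binary .spacy file."""
    doc_bin = DocBin()
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for spans_field, text in reader:
            doc = nlp.make_doc(text)
            spans = []
            for start, end in offsets_to_ranges(ast.literal_eval(spans_field)):
                # alignment_mode="expand" snaps character offsets to token boundaries
                span = doc.char_span(start, end, label="toxic", alignment_mode="expand")
                if span is not None:
                    spans.append(span)
            doc.spans[spans_key] = spans
            doc_bin.add(doc)
    doc_bin.to_disk(output_path)
```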
We call these methods from a file named make_corpus.py. Note that we rename the original data files to train, dev, and eval.
In our project definition we define a command for calling the above make_corpus.py.
We can now create the corpus of .spacy files from the CSV assets by running that command.
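Assuming the command in project.yml is named corpus, that is:

```
python -m spacy project run corpus
```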
Train
We now have train and dev datasets in a format we can process. In order to actually train a model to recognise toxic spans using this data, we first need to configure such a model and set some parameters for our training process.
spaCy provides an easy way to define such a configuration through a quickstart widget on the webpage documenting the training process.
The only change we make to the default settings in the widget above is that we check spancat to enable the SpanCategorizer pipe. Note that selecting accuracy for “optimize for” would lead to a large language model being used.
The configuration displayed in the widget is only part of the entire generated file.
We copy the contents of the configured widget and paste it into a file configs/base_config.cfg. We then fill it in with all default values and save the result as configs/config.cfg.
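spaCy’s built-in init fill-config command does this:

```
python -m spacy init fill-config configs/base_config.cfg configs/config.cfg
```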
Note on the next step: the default spans_key for SpanCategorizer is sc. You can use this default, which will spare you some of the extra steps I take below. However, I think it is good practice to let each application that adds spans, here toxic spans, set its own spans_key, as it allows you to mix spans from different sources. For example, spaCy’s SpanRuler uses spans_key=ruler by default.
We make a few modifications in our new configs/config.cfg:
- We set our spans_key to txs so as not to use the default sc value.
- We set the ngram sizes for the Suggester to everything from 1 to 8. (The Suggester is beyond the scope of this article; for more information see the spaCy SpanCategorizer documentation.)
- We define the columns for reporting scores during training by excluding those for the default sc spans_key and including those for our newly defined txs spans_key.
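A sketch of the relevant fragments of configs/config.cfg after these changes (the exact contents of your generated file may differ):

```ini
[components.spancat]
factory = "spancat"
spans_key = "txs"

[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1, 2, 3, 4, 5, 6, 7, 8]

[training.score_weights]
spans_txs_f = 1.0
spans_txs_p = 0.0
spans_txs_r = 0.0
```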
In our project file we define how to run our training. Most importantly, we refer to our config file (configs/${vars.config}.cfg, where vars.config was defined as “config”), use our created corpus/train.spacy for training, and corpus/dev.spacy for the development set.
Now, let’s start the training process and follow the results as the training is executed.
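Assuming the training command in project.yml is named train, that is:

```
python -m spacy project run train
```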
We see that the highest score, which equals spans_txs_f, is 0.57. The model that generated the best score is saved in ./training/model-best.
Evaluate
As we already saw during training, the standard metrics used for evaluating spans are precision, recall, and F1.
F1 combines precision and recall into a single measure that captures both properties.
Precision = TruePositives / (TruePositives + FalsePositives)
Recall = TruePositives / (TruePositives + FalseNegatives)
F1 = (2 * Precision * Recall) / (Precision + Recall)
By default, the scorer for the spaCy SpanCategorizer calculates these metrics over entire tokens. As we want to compare our outcome with the published scores of participants in the SemEval-2021 task, where a character-based score was used, we also want to generate a character-based F1 score for our evaluation. This means that in “Hello idiot world!” we count idiot not as 1 TruePositive, but as 5 TruePositives, that is, one for each character of the token idiot. There is a slight numerical difference between these two methods of calculation.
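As a small sketch (mirroring the formulas above, not the exact SemEval evaluation script), character-based scores can be computed from sets of character offsets:

```python
def char_f1(predicted, gold):
    """Character-based F1: predicted and gold are sets of character offsets."""
    if not predicted and not gold:
        return 1.0  # nothing predicted and nothing annotated counts as a perfect match
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "Hello idiot world!": the gold span "idiot" covers offsets 6..10,
# a prediction of "idiot world" covers offsets 6..16
gold = set(range(6, 11))
predicted = set(range(6, 17))
print(char_f1(predicted, gold))  # 0.625
```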
In our project file we define how to run our evaluation. Most importantly, we use our created corpus/eval.spacy to provide the data for evaluating the model, use training/model-best as the model, and invoke scripts for both the standard token-based and the custom character-based scores.
Let’s start the evaluation:
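Assuming the evaluation command in project.yml is named evaluate:

```
python -m spacy project run evaluate
```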
At the bottom we see our character-based F1 score: F1 = 63.01, which is about one percentage point lower than the token-based F1 score.
How does this compare to the results of the participants in SemEval-2021 Task 5: Toxic Spans Detection? Let’s see:
We see that our outcome trails at the bottom of the ranking, but not below it. Interpretation of the outcome will be done in a separate article; for now it suffices to say that we created the functionality without any optimization.
Deploy the resulting model
For illustration purposes it suffices to load the best model from the directory where the training process saved it, training/model-best.
The script below uses the trained model and displays predicted spans using displaCy.
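A minimal version of such a script (assuming the custom spans_key txs defined earlier):

```python
import spacy
from spacy import displacy

# Load the best model saved by the training run
nlp = spacy.load("training/model-best")

doc = nlp("you're one sick puppy.")

# Render the predicted spans stored under our custom spans_key "txs"
displacy.render(doc, style="span", options={"spans_key": "txs"})
```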
Running the above script in a Jupyter notebook we get:
Good, this works!
Conclusion
We implemented toxic span detection using spaCy and came to a usable but non-optimized result. Although our dataset covers only the simplest case, non-overlapping spans with a single category, our steps illustrate how to train and use the spaCy SpanCategorizer.
Interpretation of the results was not part of the scope of the article. It will be covered in a follow-up article.
References
[1] “SemEval-2021 Task 5: Toxic Spans Detection” (2021) by John Pavlopoulos et al.
Update history
8 May 2022: linked to part 2: “Custom evaluation of spans in spaCy”
25 Aug 2022:
1. Added a link to clone release v1.0.0 at the top of the article.
2. Explained why I use a custom spans_key, rather than SpanCategorizer’s default sc.
6 Sept 2022:
Displaying spans now uses displaCy. Removed reference to custom-code for displaying spans.