Content Moderation for Online Communities

If you’ve ever developed or integrated technologies that scan, detect, and flag sensitive content in its various manifestations (for example, bullying, harassment, predatory behavior and self-harm), you know that there are no cheat sheets or black magic to rely on. It’s a data science challenge, plain and simple.

You also know that a ‘one size fits all’ attitude toward features and pricing is dead on arrival. That’s why we developed a product and pricing philosophy akin to cybersecurity sensibilities, where an organization’s ‘risk tolerance’ determines how much cost and how many resources it invests.

Comparison of anti-toxicity protection between L1ght’s approach and that of competitors.

What Is Your Toxicity Tolerance?

Similar to the concept of “risk tolerance” in cybersecurity, harmful content requires a custom, tunable approach that balances accuracy against cost in order to achieve a desired level of ‘tolerance’. At L1ght we achieve this by coupling Data Science with Deep Learning & AI.

Every one of L1ght’s customer environments is both similar to the others and unique. That’s why 80% of our AI works out of the box, while the rest requires a training period: as you know, there are no shortcuts when it comes to implementing negative behavior detection.

Our Approach to Data Science

From day one we recognized the importance of a multidisciplinary approach to understanding online toxic behavior. That’s why L1ght’s data scientists work alongside psychologists and behavioral scientists to harness the latest developments in their fields.

We extended this approach to how we construct our work teams so researchers, domain experts, analysts, scientists and moderators interact fluidly — in fact, before COVID-19 impacted our workplaces, they all sat together.

Monthly Classifier Improvement

This approach yields both high-quality definitions on which to build micro-behaviors and diverse data for the model.

Our Analysts team is then challenged to seek consensus, with the aim of helping our data scientists build accurate models.

L1ght’s NLP Neural Network.

Selecting the correct data to train on, choosing the best sequence of mathematical functions, handling overfitting and exploring the errors are all key to delivering the toxicity detection levels L1ght provides.

A main component of the non-linear neural network approach we chose to pursue is its ability to leverage ‘embeddings’, which represent each feature as a vector in a low-dimensional space. Given sufficient training data, this allows us to initialize the embedding vectors to random values and let the network-training procedure tune them into “good” vectors.
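To make the idea of randomly initialized embeddings concrete, here is a minimal sketch. It is not L1ght's model: the toy sentences, the two-dimensional embedding size, the averaging step and the logistic classifier are all assumptions chosen for illustration. It shows only that vectors which start as random noise can be tuned by gradient descent until they separate the two classes.

```python
import math
import random

random.seed(0)

# Hypothetical toy data: (tokens, label) where 1 = toxic and 0 = benign.
DATA = [
    (["you", "are", "awful"], 1),
    (["you", "are", "great"], 0),
    (["awful", "person"], 1),
    (["great", "person"], 0),
]
VOCAB = sorted({tok for tokens, _ in DATA for tok in tokens})
DIM = 2  # size of the low-dimensional embedding space (assumed)

# Every token starts as a small random vector; training tunes it.
emb = {tok: [random.uniform(-0.1, 0.1) for _ in range(DIM)] for tok in VOCAB}
w = [random.uniform(-0.1, 0.1) for _ in range(DIM)]  # classifier weights
b = 0.0  # classifier bias


def predict(tokens):
    """Average the token embeddings, then apply a logistic classifier."""
    avg = [sum(emb[t][d] for t in tokens) / len(tokens) for d in range(DIM)]
    z = sum(w[d] * avg[d] for d in range(DIM)) + b
    return 1.0 / (1.0 + math.exp(-z)), avg


def mean_loss():
    """Mean binary cross-entropy over the toy data set."""
    total = 0.0
    for tokens, y in DATA:
        p, _ = predict(tokens)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(DATA)


loss_before = mean_loss()
LR = 0.5
for _ in range(200):
    for tokens, y in DATA:
        p, avg = predict(tokens)
        g = p - y  # gradient of the loss with respect to the logit
        for t in tokens:  # the embeddings themselves are trained
            for d in range(DIM):
                emb[t][d] -= LR * g * w[d] / len(tokens)
        for d in range(DIM):
            w[d] -= LR * g * avg[d]
        b -= LR * g
loss_after = mean_loss()

print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

In a production network the embedding table would be one layer among many, but the principle is the same: the vectors are parameters, and backpropagation adjusts them together with the rest of the weights.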

When training a neural network, many factors need to be taken into consideration: the tokenization, network architecture, learning rate, loss function, regularization, and optimization technique. That’s because each neuron in the network is a mathematical function that receives a set of values through its input connections and computes an output value, which is passed along another connection to another neuron.
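The description of a neuron above can be sketched in a few lines. The sigmoid activation and the specific weights below are assumptions for illustration; the source does not say which activation functions L1ght uses.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Two hidden neurons read the same inputs; their outputs feed a third neuron.
features = [1.0, 0.0]
hidden = [
    neuron(features, [0.5, -0.5], 0.0),
    neuron(features, [-1.0, 1.0], 0.5),
]
output = neuron(hidden, [1.0, 1.0], -1.0)
print(output)
```

Each individual function is simple. The difficulty the paragraph refers to comes from composing many of them, which is why choices such as learning rate and regularization interact.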

L1ght’s NLP Neural Network architecture also includes an attention mechanism that allows us to boost certain parts of the fixed-size vector. By applying it to encoder-decoder architectures, we improve model training for enhanced results.
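As a rough illustration of what an attention mechanism does, the sketch below uses generic scaled dot-product attention to pool a sequence of token vectors into one fixed-size vector. This is the textbook formulation, not a description of L1ght's architecture; the query vector and the token vectors are made-up values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(query, token_vectors):
    """Weight each token vector by its similarity to the query, then sum."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, vec)) / math.sqrt(dim)
        for vec in token_vectors
    ]
    weights = softmax(scores)
    pooled = [
        sum(wt * vec[d] for wt, vec in zip(weights, token_vectors))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical 2-d vectors for three tokens; the query points at the second.
tokens = [[0.1, 0.0], [2.0, 2.0], [0.0, 0.2]]
weights, pooled = attention_pool([1.0, 1.0], tokens)
print(weights, pooled)
```

The token most similar to the query receives the largest weight, so its contribution to the pooled vector is boosted relative to the others.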

Deconstructing the ‘Personal Insult’ Micro-Behavior Model.

When we begin exploring a new behavior, we usually start with low agreement rates between our analysts. However, as the behavior is defined more granularly and split into micro-behaviors, agreement rates rise to ~95%.
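The source does not say how the agreement rate is computed. One simple possibility, assumed here purely for illustration, is the fraction of pairwise label comparisons on which two analysts agree. The labels below are invented.

```python
from itertools import combinations


def agreement_rate(labels_by_analyst):
    """Fraction of (analyst pair, item) comparisons where the labels match."""
    matches = 0
    total = 0
    for first, second in combinations(labels_by_analyst, 2):
        for x, y in zip(first, second):
            matches += x == y
            total += 1
    return matches / total


# Hypothetical labels from three analysts on the same four messages.
broad_definition = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
]
print(agreement_rate(broad_definition))  # 0.5
```

Raw agreement does not correct for agreement by chance; measures such as Cohen's or Fleiss' kappa do, and may be a better fit when one label dominates.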

The Outcome

Some of the challenges we had to overcome:


Tokenization

What do you do about the various uses of the apostrophe for possession and contractions? What about emojis and emoticons? Sometimes a term is written as a single word and sometimes as space-separated words (such as whitespace vs. white space). Overcoming this required assessing several tokenization algorithms, including Penn Treebank, Byte Pair Encoding, WordPiece, the Unigram Language Model and SentencePiece.
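Of the algorithms listed, Byte Pair Encoding is the easiest to sketch. The version below is a bare-bones teaching implementation, not the one L1ght evaluated: it learns merges by repeatedly fusing the most frequent adjacent symbol pair, and it omits details such as end-of-word markers.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one fused symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from a list of training words."""
    vocab = Counter(tuple(word) for word in words)  # symbol tuple -> count
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            merged_vocab[merge_pair(symbols, best)] += freq
        vocab = merged_vocab
    return merges


def encode(word, merges):
    """Split a word into characters, then apply the merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_bpe(["white", "space", "whitespace", "white"], num_merges=6)
print(encode("whitespace", merges))
```

Because subword units are shared, "whitespace" and "white space" end up built from overlapping pieces, which is one reason subword tokenizers cope well with inconsistent spelling and spacing.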

Network Architecture

Network architecture depends on the amount of data, the available computational power and the inference speed that is needed. Transformer-based architectures are the current state of the art in NLP. They are based on a multi-head attention module, which has shown substantial success in both vision and linguistic tasks. Common Transformer-based models such as BERT, XLNet and RoBERTa improve accuracy but have higher computational complexity. Methods such as pruning, distillation and quantization reduce that complexity in both training and prediction.
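Of the three compression methods mentioned, magnitude pruning is the simplest to illustrate. The sketch below assumes an unstructured, one-shot variant that zeroes the smallest weights in a single matrix; the source does not say which pruning scheme, if any, L1ght applies.

```python
def magnitude_prune(weights, sparsity):
    """Zero out roughly `sparsity` of the weights, smallest magnitudes first.

    Ties at the threshold are all pruned, so slightly more than the
    requested fraction can be removed.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to drop
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


layer = [[0.9, -0.05], [0.1, -0.7]]
print(magnitude_prune(layer, sparsity=0.5))  # [[0.9, 0.0], [0.0, -0.7]]
```

The zeroed weights can then be skipped or stored sparsely. In practice pruning is usually followed by a short fine-tuning pass to recover any lost accuracy.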

Dynamic Yield Optimization

The choice of optimization algorithm for a deep learning model can mean the difference between getting good results in minutes, hours, or days. We chose the Adam optimization algorithm, an extension of stochastic gradient descent that has seen broad adoption for deep learning applications in computer vision and NLP.
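For readers unfamiliar with Adam, the sketch below applies its update rule to a one-dimensional toy problem. The default values for `beta1`, `beta2` and `eps` are the commonly published ones; the learning rate, step count and objective function are assumptions chosen so the example converges quickly.

```python
import math


def adam_minimize(grad, x, steps=500, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar function given its gradient, using Adam updates."""
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x=0.0)
print(x_min)
```

The division by the square root of the second moment gives each parameter its own effective step size, which is the main difference from plain stochastic gradient descent.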

AI models traditionally classify content and report a confidence level as a percentage, which business rules can then interpret for decision making. We discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated: the confidence they report often does not match how often they are actually correct. Calibrated confidence estimates are also important for model interpretability. Therefore, we built an approach based on customer data and customer-specific attributes to provide accurate estimates and insights that help businesses make impactful decisions. For example, in human moderation applications, this approach allows customers to determine whether or not to send data for human moderation.
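One common way to quantify miscalibration is the expected calibration error (ECE), sketched below together with a simple routing rule. Both are generic illustrations: the bin count, the `auto_threshold` value and the `route` helper are hypothetical and are not taken from L1ght's customer-specific approach.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Size-weighted average gap between confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[index].append((conf, ok))
    ece = 0.0
    for items in bins:
        if items:
            avg_conf = sum(c for c, _ in items) / len(items)
            accuracy = sum(ok for _, ok in items) / len(items)
            ece += len(items) / len(confidences) * abs(avg_conf - accuracy)
    return ece


def route(confidence, auto_threshold=0.9):
    """Hypothetical business rule: act automatically only when confident."""
    return "auto" if confidence >= auto_threshold else "human_review"


# A model that reports 90% confidence but is right only half the time.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
print(overconfident)
print(route(0.95), route(0.6))
```

A threshold such as `auto_threshold` is only meaningful when the confidence is calibrated. Otherwise a reported 90% may correspond to a much lower real accuracy, and too little content is sent to human moderators.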


Hyperparameter Tuning

Is the learning rate right? Is the regularizer too strong? How many neurons should each layer have? What should the embedding size be? What is the appropriate sequence length? These parameters were fine-tuned according to our data and network structure.
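The source does not describe how these parameters were searched. As one possible approach, the sketch below runs an exhaustive grid search. The search space values and the `toy_score` function, which stands in for training a model and measuring validation F1, are invented for illustration.

```python
from itertools import product

# Hypothetical search space; the real values are not given in the source.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "embedding_size": [64, 128],
    "sequence_length": [128, 256],
}


def grid_search(space, evaluate):
    """Score every combination and return the best configuration found."""
    names = list(space)
    best_config = None
    best_score = float("-inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Stand-in for 'train the model and return its validation F1'."""
    return (
        -abs(config["learning_rate"] - 1e-3)
        - abs(config["embedding_size"] - 128) / 1000
    )


best_config, best_score = grid_search(SEARCH_SPACE, toy_score)
print(best_config)
```

Grid search grows exponentially with the number of parameters, so random or Bayesian search is often preferred once more than a handful of values are in play.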

The Process Visualized:

[Figure: Model F1 & Data Quality. Axes: Train Size, Average F1 Score, Agreement Rate.]