Disinfo Radar aims to identify new disinformation technologies at an early stage by way of automated text-analysis tools. By auto-collecting and auto-analysing electronic preprint repositories (e.g., arXiv), industry papers (e.g., syncedreview.com), and policy publications (e.g., IEEE), Disinfo Radar scans the environment for indications of emerging technologies. Through a daily updated pipeline, it collects, processes, and subjects texts to state-of-the-art machine learning models in order to identify technical innovations that could be abused for disinformation purposes.
Once texts have been collected automatically via web scraping, they are assessed using a pipeline of classifiers trained in-house. First, each text is broken down into sentences. Each sentence is then scanned for mentions of technologies or technology-related terms, using an in-house Span Categorization model (similar to Named Entity Recognition models) trained on data from similar sources.
Every mention of a tech-related term is then assigned a score for several disinformation-potential factors: Automation, Content-Generation, and Accessibility. These scores are assigned by Relationship Extraction (i.e., sentence-level text classification) models trained on curated and synthetic data, with a separate model for each factor.
Low-confidence mentions are filtered out, while the rest are aggregated across the entire dataset and normalized to single topics via Affinity Propagation clustering. For example, mentions of “neural networks”, “NN”, and “artificial neural nets” should be normalized to a single tech-related topic, such as “Neural Networks”. Note that since this is unsupervised learning without manual adjustment, the clustering and the resulting topic labels can sometimes be sub-optimal.
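The filtering and normalization step can be sketched roughly as follows. This is a minimal, standard-library illustration and not the project's implementation: the confidence cut-off, the character-level similarity measure, and all function names are assumptions. In particular, string similarity cannot link an abbreviation such as “NN” to “neural networks”; a real system would need richer features (for example, text embeddings).

```python
from difflib import SequenceMatcher
from statistics import median

MIN_CONFIDENCE = 0.5  # hypothetical cut-off; the text does not state one


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for real text features."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def affinity_propagation(items, damping=0.5, iterations=200):
    """Return, for each item, the index of the exemplar it is assigned to."""
    n = len(items)
    if n < 2:
        return list(range(n))
    sim = [[similarity(x, y) for y in items] for x in items]
    # Self-similarity ("preference") is set to the median pairwise similarity.
    preference = median(sim[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        sim[k][k] = preference
    resp = [[0.0] * n for _ in range(n)]   # responsibilities r(i, k)
    avail = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            for k in range(n):
                rival = max(avail[i][j] + sim[i][j] for j in range(n) if j != k)
                resp[i][k] = damping * resp[i][k] + (1 - damping) * (sim[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + positive responsibilities from other points)
        for k in range(n):
            positive = [max(0.0, resp[j][k]) for j in range(n)]
            support = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, resp[k][k] + support - positive[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if resp[k][k] + avail[k][k] > 0]
    if not exemplars:  # no exemplar emerged: leave every item as its own topic
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: sim[i][k])
            for i in range(n)]


def normalize_mentions(mentions):
    """Map each sufficiently confident surface form to its cluster exemplar."""
    kept = sorted({text for text, conf in mentions if conf >= MIN_CONFIDENCE})
    labels = affinity_propagation(kept)
    return {text: kept[label] for text, label in zip(kept, labels)}


# Made-up mentions with made-up confidence values, for illustration only.
mentions = [
    ("neural networks", 0.93), ("neural network", 0.88),
    ("artificial neural nets", 0.71), ("deepfakes", 0.95),
    ("deepfake", 0.90), ("unclear term", 0.12),
]
for text, topic in normalize_mentions(mentions).items():
    print(f"{text!r} -> {topic!r}")
```

Here the exemplar chosen by the algorithm doubles as the topic label, which is one simple way such labels could arise without manual adjustment, and also one way they could end up sub-optimal.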
For every tech-related topic, average scores for each disinformation-potential factor are calculated based on all of its mentions across the dataset. These factors are combined using a weighted average to create a final Disinfo Score for that tech-related topic.
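As a rough sketch of this aggregation, the snippet below averages each factor over a topic's mentions and then combines the averages with a weighted average. The weights, the factor keys, and the function names are hypothetical; the actual weights are not given here.

```python
from statistics import mean

# Hypothetical weights; the actual values are not stated in the text.
WEIGHTS = {"automation": 0.4, "content_generation": 0.4, "accessibility": 0.2}


def factor_averages(mentions):
    """Average each factor's score over all mentions of one topic."""
    return {factor: mean(m[factor] for m in mentions) for factor in WEIGHTS}


def disinfo_score(averages, weights=WEIGHTS):
    """Weighted average of the per-factor averages."""
    total_weight = sum(weights.values())
    return sum(weights[f] * averages[f] for f in weights) / total_weight


# Two made-up mentions of the same topic.
topic_mentions = [
    {"automation": 1.0, "content_generation": 2.0, "accessibility": 0.0},
    {"automation": 2.0, "content_generation": 1.0, "accessibility": 1.0},
]
averages = factor_averages(topic_mentions)
print(averages)
print(disinfo_score(averages))
```

Dividing by the sum of the weights keeps the result a true weighted average even if the weights do not add up to 1.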
The scores for each of the factors, as well as the final Disinfo Score, are z-scores, meaning that they typically fall between about -3 and +3. Each unit represents one standard deviation from the mean (0). For example, if “GPT-3” has an Accessibility score of 2.5, it is much more accessible than average (2.5 standard deviations above the mean, to be precise). Based on these scores, qualitative grades ranging from “Very High” to “Very Low” have also been defined for each factor and the final Disinfo Score. Topics that rate high in disinformation potential are more likely to be noteworthy.
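To make the z-score and grading idea concrete, here is a minimal sketch. It assumes the population standard deviation is used, and the grade boundaries and the names of the intermediate grades are invented for illustration; only “Very High” and “Very Low” are named in the text.

```python
from statistics import mean, pstdev

# Hypothetical lower bounds (in standard deviations) for each grade.
GRADES = [(1.5, "Very High"), (0.5, "High"), (-0.5, "Medium"), (-1.5, "Low")]


def z_scores(raw):
    """Convert {topic: raw score} into {topic: z-score}."""
    values = list(raw.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # all topics identical: nothing deviates from the mean
        return {topic: 0.0 for topic in raw}
    return {topic: (value - mu) / sigma for topic, value in raw.items()}


def grade(z):
    """Map a z-score to a qualitative grade using the assumed boundaries."""
    for lower_bound, label in GRADES:
        if z >= lower_bound:
            return label
    return "Very Low"


# Made-up raw scores for three placeholder topics.
raw_scores = {"Topic A": 0.2, "Topic B": 0.5, "Topic C": 0.8}
for topic, z in z_scores(raw_scores).items():
    print(f"{topic}: z = {z:+.2f}, grade = {grade(z)}")
```

Under these assumed boundaries, the Accessibility score of 2.5 from the example above would be graded “Very High”.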
Identifying outliers in the previous steps assists DRI’s disinformation experts in their qualitative analysis. Using the registry results, they evaluate the identified technologies and determine the threat potential of each by conducting additional desk research. When a technology is judged to pose a threat, meaning that it could be used to produce or amplify disinformation, DRI utilises the data obtained from the registry to inform potential stakeholders.