
Tools

The increasing sophistication of AI and lower technical barriers to entry are making it possible to scale up disinformation campaigns. It’s now more important than ever to monitor new technologies with the potential to spread lies and half-truths. Disinfo Radar provides users with a range of analytical tools to discover the disinformation technologies of tomorrow at an early stage. 


Text Outlier Map

Using a content-similarity algorithm, the Text Outlier Map converts texts published about technological developments into vectors and clusters them into readily identifiable topics. This allows for the identification of possible outliers, which may indicate emerging technologies in the disinformation space.
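A minimal sketch of how such a map could be built is shown below, using TF-IDF vectors, k-means clustering, and a two-dimensional projection; the vectoriser, the number of topics, and the example texts are illustrative assumptions rather than Disinfo Radar's actual configuration.

```python
# Sketch of a text outlier map: vectorise texts, cluster them into topics,
# and project them to 2-D so possible outliers can be spotted visually.
# All parameters below are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

texts = [
    "New diffusion model generates photorealistic faces from text prompts.",
    "Survey of large language models for automated news writing.",
    "A lightweight tool for detecting bot networks on social platforms.",
    # ... scraped articles and abstracts would go here
]

# Convert texts to vectors (a transformer-based sentence encoder could be
# substituted; TF-IDF keeps the sketch dependency-light).
vectors = TfidfVectorizer(stop_words="english").fit_transform(texts)

# Cluster the vectors into readily identifiable topics.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

# Project to two dimensions so topics and possible outliers can be mapped.
coords = PCA(n_components=2).fit_transform(vectors.toarray())
for text, label, (x, y) in zip(texts, labels, coords):
    print(f"topic={label}  x={x:+.2f}  y={y:+.2f}  {text[:50]}")
```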


Text Uniqueness Tracker

The Uniqueness Tracker provides another perspective on those texts identified as outliers. After scraping the web for published texts on technological developments and converting them to vectors, an algorithm harnessing cosine similarity ranks texts by how much they differ from the rest. This provides an early warning system for spotting technological innovations that may be used to disseminate disinformation.
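The ranking step could look roughly like the sketch below, which uses TF-IDF vectors and scikit-learn's cosine similarity; the vectoriser, the example texts, and the averaging scheme are assumptions made for illustration.

```python
# Sketch of a uniqueness ranking: the lower a text's average cosine
# similarity to all other texts, the more "unique" it is considered.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "Incremental update to a well-known summarisation benchmark.",
    "Another incremental update to a summarisation benchmark.",
    "A novel voice-cloning pipeline requiring only seconds of audio.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(texts)

# Pairwise cosine similarity between every pair of texts.
sim = cosine_similarity(vectors)
np.fill_diagonal(sim, 0.0)  # ignore self-similarity

# Average similarity to the other texts; low values signal uniqueness.
uniqueness = 1.0 - sim.sum(axis=1) / (len(texts) - 1)
for idx in np.argsort(-uniqueness):
    print(f"uniqueness={uniqueness[idx]:.2f}  {texts[idx][:60]}")
```

Texts at the top of such a ranking are the ones surfaced for closer inspection.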


Our methodology

Disinfo Radar aims to identify new disinformation technologies at an early stage by way of automated text-analysis tools. By auto-collecting and auto-analysing electronic preprint repositories (e.g., arXiv), industry papers (e.g., syncedreview.com), and policy publications (e.g., IEEE), Disinfo Radar scans the environment for indications of emerging technologies. Through a pipeline updated daily, it collects and processes texts, then subjects them to state-of-the-art machine learning models in order to identify technical innovations that could be abused for disinformation purposes.
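To illustrate the auto-collection step, the sketch below pulls recent abstracts from the public arXiv API with feedparser; the search terms, the fields retained, and the scheduling are assumptions for illustration, not the actual Disinfo Radar pipeline.

```python
# Sketch of auto-collection from one example source (arXiv's public API).
# Query terms and retained fields are illustrative assumptions.
import feedparser

ARXIV_API = "http://export.arxiv.org/api/query"
query = "search_query=all:%22synthetic+media%22&start=0&max_results=25"

feed = feedparser.parse(f"{ARXIV_API}?{query}")

documents = []
for entry in feed.entries:
    documents.append({
        "title": entry.title,
        "abstract": entry.summary,
        "published": entry.published,
        "link": entry.link,
    })

# Downstream steps (pre-selection, outlier detection) would consume
# `documents`, refreshed on a daily schedule.
print(f"collected {len(documents)} candidate texts")
```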

Once texts have been collected (auto-collection powered by web scraping), they are assessed using a self-trained classifier (a support vector machine). The classifier serves as a form of pre-selection: in evaluating a text, it determines (a) whether the text refers to a particular technology and (b) whether that technology has the potential to mislead or to increase mis- and disinformation. The classifier was trained on approximately 1,000 descriptions of diverse AI tools that DRI experts judged to have the greatest disinformation potential. Among these are the latest text-generation models (e.g., GPT-3) and text-to-image or text-to-video generators (e.g., Dall-E 2, Stable Diffusion, or Meta's Make-A-Video).
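A minimal sketch of such a pre-selection classifier is given below, using a linear support vector machine over TF-IDF features from scikit-learn; the toy training examples, the labels, and the feature choice are illustrative assumptions rather than DRI's trained model.

```python
# Sketch of the pre-selection step: a support vector machine trained on
# tool descriptions labelled for disinformation potential.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: 1 = judged to have disinformation potential.
descriptions = [
    "A text-to-video generator producing clips from a single prompt.",
    "A large language model fine-tuned for persuasive long-form text.",
    "A spreadsheet plugin for formatting financial reports.",
]
labels = [1, 1, 0]

classifier = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LinearSVC(),  # the support vector machine acting as a pre-selection filter
)
classifier.fit(descriptions, labels)

# Newly scraped texts are kept only if the classifier flags them as relevant.
candidates = ["A diffusion model that clones a speaker's face and voice."]
print(classifier.predict(candidates))
```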

Texts that pass this pre-selection are then evaluated for originality. A second machine learning algorithm determines whether a given text is an outlier. Outliers, in this context, are texts that contain novel textual information; such novel elements might, inter alia, be bespoke model names, new approaches to leveraging data, or new forms of synthetic content. It is important to note that outliers can occur for various reasons, including an author's unique style or vocabulary. As such, Disinfo Radar relies on the interplay between automation and expert assessment. Disinfo Radar identifies these outliers using transformer models (deep learning models built around self-attention mechanisms): because such models cluster texts by similarity, they can also be leveraged to identify outliers.
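A simplified sketch of this idea follows, in which texts are embedded with a sentence-transformer model and scored by their distance from the corpus centroid in embedding space (one global centroid standing in for per-topic clusters); the model name, the example texts, and the threshold are illustrative assumptions.

```python
# Sketch of transformer-based outlier scoring: embed texts with a
# self-attention model and measure how far each one sits from the centroid
# of the corpus; distant texts are surfaced as possible outliers.
import numpy as np
from sentence_transformers import SentenceTransformer

texts = [
    "Standard survey of machine translation benchmarks.",
    "Another survey of machine translation systems.",
    "An overview of evaluation metrics for translation quality.",
    "Benchmarking translation models on low-resource languages.",
    "Bespoke model generating interactive deepfake avatars in real time.",
]

# Transformer-based sentence embeddings (model name is an assumption).
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(
    texts, normalize_embeddings=True
)

centroid = embeddings.mean(axis=0)
centroid /= np.linalg.norm(centroid)

# Cosine distance from the centroid; higher means more unusual.
outlier_scores = 1.0 - embeddings @ centroid

threshold = outlier_scores.mean() + outlier_scores.std()  # illustrative cut-off
for text, score in zip(texts, outlier_scores):
    flag = "OUTLIER" if score > threshold else "typical"
    print(f"{flag:8s}  score={score:.3f}  {text[:55]}")
```

In practice, flagged texts are not treated as findings in themselves but are passed on for expert review, as described below.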

The outliers identified in the previous steps feed into the qualitative analysis of DRI's disinformation experts. Using the registry results, they evaluate the identified technologies and determine the threat potential of each through additional desk research. When a technology is judged to pose a threat, meaning that it could be used to produce or amplify disinformation, DRI uses the data obtained from the registry to inform relevant stakeholders.