System for Situation Assessment support

Situation assessment (SA) has to provide everything we need to know in order not to be surprised. Despite this simple definition, it is not an easy task, considering the large amount of available heterogeneous data and the turbulent environments involved. Therefore, we have to rely on increasingly efficient algorithms for data analysis and on innovative SA processes.

The COPKIT LTA SA solution presented here focuses on the identification and analysis of threats in semi-structured data such as forum messages, ads and emails.

Legind Technologies’ solution (Legind SA) developed in the COPKIT project

The concept of the solution is based on the assumption that anomalies can be connected with suspicious activities and that their analysis can reveal interesting details such as threats. Anomalies are considered to be increases in time and space, outliers, outstanding values, etc., in, for instance, the number of messages indicating a threat of a particular kind. The analysis of the anomalies and threats found is an important part of SA.

This concept is presented as an intuitive, cyclic 4-step SA process, as illustrated in the figure above. Step 1, data gathering, focuses on the automated gathering of semi-structured data, that is, texts with metadata, such as forum posts, messages, emails, conversation transcripts, documents, tweets, and so on. The majority of data analysed by law enforcement agencies (LEAs) can be transformed into this structure.

The COPKIT solution supports this step with a system for data import and transformation, identification of threat indicators, Named Entity Recognition (NER), and calculation of new indicators. For data import, the user drags and drops, or selects, one or more files (.csv or .xlsx). The next version of the LTA SA tool will also support direct connection to mailboxes.
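The import and transformation described above can be illustrated with a minimal sketch. It is not the tool's actual importer: the column names (conversation, author, timestamp, text) and the function load_messages are assumptions made for illustration, and only CSV input is shown.

```python
import csv
import io
from datetime import datetime


def load_messages(csv_text):
    """Parse CSV text into records of text plus metadata.

    The column names used here are hypothetical; a real export
    would need its own mapping onto this structure.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    messages = []
    for row in reader:
        messages.append({
            "conversation": row["conversation"].strip(),
            "author": row["author"].strip(),
            # ISO 8601 timestamps are assumed, e.g. 2021-03-01T10:00:00
            "timestamp": datetime.fromisoformat(row["timestamp"].strip()),
            "text": row["text"].strip(),
        })
    return messages


# Invented sample data standing in for an uploaded .csv file.
SAMPLE = """conversation,author,timestamp,text
c1,alice,2021-03-01T10:00:00,Selling accounts cheap
c1,bob,2021-03-01T10:05:00,How many do you have?
"""

messages = load_messages(SAMPLE)
```

Each record keeps the free text together with its metadata, which is the semi-structured form the later steps operate on.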

Step 2 is aimed at anomaly detection in the input data. The purpose of this step is the identification of time intervals containing suspicious activities. The COPKIT solution supports this step with a time chart showing threat indicators and an automatically generated textual summary of the threat situation in the time interval (see the figure below). These indicators are based on a threat table composed of threat names, keywords, and explanations. For instance, a threat can be CCG (Cyber Crime Group), where the search keywords are the names of the groups and the explanation can include a link to a page with more information about the group. Intervals with an increasing number of threats are anomalies.
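A threat table and the "increasing number of threats" rule can be sketched as follows. This is an illustration under stated assumptions, not the tool's algorithm: the keywords are invented placeholders, matching is a plain case-insensitive substring test, the interval is one day, and an interval is flagged when its count exceeds that of the previous interval present in the data.

```python
from collections import Counter
from datetime import date

# Hypothetical threat table: threat name -> keywords and explanation.
THREAT_TABLE = {
    "CCG": {
        "keywords": ["examplegroup", "samplecrew"],  # invented group names
        "explanation": "Cyber Crime Group",
    },
}


def match_threats(text, threat_table):
    """Return the names of threats with at least one keyword in the text."""
    lowered = text.lower()
    return [
        name
        for name, entry in threat_table.items()
        if any(keyword.lower() in lowered for keyword in entry["keywords"])
    ]


def daily_threat_counts(messages, threat_table):
    """Count matched threats per day; days without matches are kept as 0."""
    counts = Counter()
    for day, text in messages:
        counts[day] += len(match_threats(text, threat_table))
    return counts


def increasing_days(counts):
    """Days whose count is higher than that of the previous day in the data."""
    days = sorted(counts)
    return [cur for prev, cur in zip(days, days[1:]) if counts[cur] > counts[prev]]


# Invented sample messages as (day, text) pairs.
messages = [
    (date(2021, 3, 1), "hello world"),
    (date(2021, 3, 2), "examplegroup is recruiting"),
    (date(2021, 3, 3), "examplegroup and samplecrew again"),
    (date(2021, 3, 3), "samplecrew posted a new ad"),
]

counts = daily_threat_counts(messages, THREAT_TABLE)
anomalous = increasing_days(counts)
```

Note that days with no messages at all do not appear in the counts, so a production version would probably fill such gaps with zeros before comparing neighbouring intervals.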

The previous step identifies interesting intervals, but this is not enough for LEAs to take action. Step 3, anomaly analysis, provides details about such an interval in order to reveal new threats. The details comprise conversation subjects, threats, characteristic users and entities, etc.

The COPKIT solution supports this step by displaying the conversations in both table and graph form, together with the generated summary of the threat situation. The table shows conversations with indicators such as the number of messages in the conversation, the number of threats, characteristic entities, etc. Selected conversations are visualized in conversation graphs. The generated text lists entities that are characteristic of the selected time interval.

Here, we focus on the analysis of conversations, which is needed to understand the meaning of individual messages that are often very short, such as tweets and forum messages.
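The conversation table and conversation graph can be approximated with a small sketch. The record fields (id, conversation, author, reply_to, text), the keyword list, and both function names are assumptions for illustration; the actual indicators and graph model used by the tool are not specified here.

```python
from collections import defaultdict


def conversation_table(messages, keywords):
    """Aggregate per-conversation indicators: messages, keyword hits, authors."""
    rows = defaultdict(lambda: {"messages": 0, "threat_hits": 0, "authors": set()})
    for msg in messages:
        row = rows[msg["conversation"]]
        row["messages"] += 1
        row["authors"].add(msg["author"])
        text = msg["text"].lower()
        # One hit per keyword that occurs in the message text.
        row["threat_hits"] += sum(1 for keyword in keywords if keyword in text)
    return dict(rows)


def reply_edges(messages):
    """Directed edges (author, replied-to author) for a conversation graph."""
    by_id = {msg["id"]: msg for msg in messages}
    return [
        (msg["author"], by_id[msg["reply_to"]]["author"])
        for msg in messages
        if msg.get("reply_to") in by_id
    ]


# Invented sample messages.
messages = [
    {"id": 1, "conversation": "c1", "author": "alice", "reply_to": None,
     "text": "examplegroup is selling data"},
    {"id": 2, "conversation": "c1", "author": "bob", "reply_to": 1,
     "text": "What is the price?"},
    {"id": 3, "conversation": "c2", "author": "carol", "reply_to": None,
     "text": "Weather is nice"},
]

table = conversation_table(messages, ["examplegroup"])
edges = reply_edges(messages)
```

Grouping by conversation rather than by message reflects the point made above: a short reply such as "What is the price?" only becomes meaningful next to the message it answers.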

Step 4, threat analysis, assesses the threat and the details provided in the previous step. The LTA SA solution supports the insertion of new threat indicators and their visualization in the time chart, as well as the export of data for enrichment and analysis with other graph analysis tools.
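The export format is not described here, so the following is only one plausible shape for it: a weighted edge list written as CSV, which many graph analysis tools can import. The column names (source, target, weight) and the function edges_to_csv are assumptions for illustration.

```python
import csv
import io


def edges_to_csv(edges):
    """Serialize {(source, target): weight} as a CSV edge list (assumed format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "weight"])
    # Sort for a stable, reproducible file.
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Invented sample: number of replies between pairs of users.
edges = {("bob", "alice"): 2, ("carol", "alice"): 1}
csv_text = edges_to_csv(edges)
```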

Contact: (Subject: Legind SA)