The Nemesida AI module minimizes false positives, improves the accuracy of attack detection for web applications, and identifies new attack vectors, taking into account a set of attack indicators and a base of precedents.

Nemesida AI is a module of Nemesida WAF that implements machine learning. It forms the basis of the behavioral analytics used to predict attacks and to block intruders preemptively. Its results include:

  • reducing false positives to 0.01% (for comparison, signature analysis alone produces 5-7% false positives on average);
  • improving the accuracy of detecting attacks on web applications;
  • detecting new attacks on web applications.

Machine learning is a broad subfield of artificial intelligence that studies methods for building algorithms capable of learning. There are two types of learning:

  • learning from precedents, or inductive learning, which is based on identifying general patterns in particular empirical data;
  • deductive learning, which involves formalizing expert knowledge and transferring it to the computer as a knowledge base.

Deductive learning belongs to the field of expert systems, which is why the terms «machine learning» and «learning from precedents» are treated as synonyms.

The Nemesida AI workflow and feature space are based on an analysis of scientific research and existing prototypes. Because most of the features are textual, they were vectorized so that the detection algorithm can use them. Because request fields are not single words and often contain sequences of symbols, frequency analysis of n-grams (TF-IDF) was chosen.
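To illustrate the idea of n-gram TF-IDF vectorization, here is a minimal sketch in plain Python. It is not the Nemesida AI implementation: the n-gram length (3), the unsmoothed weighting formula, the function names, and the sample request fields are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(documents, n=3):
    """Compute a sparse TF-IDF vector (a dict) for each document."""
    counts = [Counter(char_ngrams(doc, n)) for doc in documents]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    total_docs = len(documents)
    vectors = []
    for c in counts:
        length = sum(c.values()) or 1  # avoid division by zero for short strings
        vectors.append({
            # TF = relative frequency in this document,
            # IDF = log(N / df), the plain unsmoothed form (an assumption).
            gram: (freq / length) * math.log(total_docs / df[gram])
            for gram, freq in c.items()
        })
    return vectors


# Hypothetical request fields: two ordinary queries and one SQL injection attempt.
requests = [
    "id=42&page=1",
    "id=43&page=2",
    "id=1' OR '1'='1",
]
vectors = tfidf_vectors(requests)
```

In this sketch an n-gram that occurs in every request, such as "id=", gets a weight of zero, while n-grams that occur only in the injection attempt, such as "' O", get a positive weight. This is the property that makes rare symbol sequences stand out as features.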

Analysis of the available data made it possible to form the feature space, which serves as the basis for the classifier. From a mathematical point of view, attack detection was formulated as a typical classification task with two classes: legitimate traffic and illegitimate traffic. The algorithms were selected based on the availability of implementations and the feasibility of testing them. In testing, gradient boosting performed best. Consequently, once the behaviour model has been trained, Nemesida AI decides whether to block a request based on the properties of the analyzed data.
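The two-class formulation with gradient boosting can be sketched as follows. This is a simplified illustration in plain Python, not the classifier used by Nemesida AI: it boosts one-split decision stumps on the gradient of the logistic loss, and the two numeric features (number of special characters, field length), the training rows, and the hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    # (feature index, threshold, left value, right value), or None if no split exists
    return None if best is None else best[1:]


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def train(X, y, rounds=20, lr=0.5):
    """Gradient boosting for labels 0/1; assumes both classes are present."""
    pos = sum(y) / len(y)
    base = math.log(pos / (1 - pos))  # initial score: log-odds of the attack class
    scores = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        scores = [s + lr * stump_predict(stump, row) for s, row in zip(scores, X)]
    return base, stumps, lr


def predict_proba(model, row):
    """Estimated probability that a request is illegitimate."""
    base, stumps, lr = model
    return sigmoid(base + lr * sum(stump_predict(s, row) for s in stumps))


# Hypothetical training data: [special character count, field length]
X = [[0, 12], [1, 20], [0, 8], [6, 30], [8, 45], [5, 25]]
y = [0, 0, 0, 1, 1, 1]  # 0 = legitimate, 1 = illegitimate
model = train(X, y)
```

A request would then be blocked when `predict_proba(model, features)` exceeds a chosen threshold, for example 0.5. In practice the input would be the vectorized request features rather than two hand-picked numbers.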

Compared with Nemesida AI, the signature method has a disadvantage: a high probability of WAF bypass combined with many false positives (5-15%). However, signature analysis remains the primary mechanism that provides basic web application security while Nemesida AI is being trained.