Literature
Advances in packet capture have produced both software and hardware solutions for capturing packets across a network. This research focuses mainly on network data gathered from multiple sources and on the security of that data in transit: data sent over the communication medium must be well secured, because a man-in-the-middle attacker could intercept it and re-insert a false message. It must also be ensured that the data collected from the various sources is the data used for network monitoring. The authors found that DAG technology is used as a hardware solution because it can capture traffic at 100G, but its high cost compared with software solutions makes it inefficient. They also note that identification of well-known ports and deep packet inspection (DPI) are the mainstays of conventional approaches to network traffic classification. However, many new cyber threats can easily bypass these traditional methods, and with the growing use of HTTPS, governments are promoting laws that prevent organizations from inspecting data.
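Why port-based classification is easy to bypass can be seen in a small sketch. The port table, the `classify_by_port` helper, and the sample flows are hypothetical and serve only to illustrate the limitation; they are not taken from the cited work.

```python
# Hypothetical table of well-known ports and the application each implies.
WELL_KNOWN_PORTS = {
    22: "ssh",
    53: "dns",
    80: "http",
    443: "https",
}


def classify_by_port(dst_port: int) -> str:
    """Label a flow from its destination port alone."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


# Illustrative flows: "actual" is the application really in use.
flows = [
    {"dst_port": 80, "actual": "http"},
    {"dst_port": 443, "actual": "ssh"},    # tunnel hidden on the HTTPS port
    {"dst_port": 8081, "actual": "http"},  # service on a non-standard port
]

labels = [classify_by_port(flow["dst_port"]) for flow in flows]
misclassified = [
    flow for flow, label in zip(flows, labels) if label != flow["actual"]
]
```

Two of the three flows end up mislabelled, because the port number says nothing about what the traffic really contains.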
One research study proposed a deep learning approach to building an IDS based on recurrent neural networks, which have a strong modeling ability for intrusion detection and achieve high accuracy. The model was trained on NSL-KDD, a commonly used dataset that is a revised and refined version of the KDD99 dataset. Another study used a deep learning method that combines autoencoders with the random forest algorithm for intrusion detection; it was trained on both KDD99 and NSL-KDD.
The authors of one study proposed analyzing network packets as paragraph vectors, using an NLP algorithm to learn the features of attacks and detect malicious traffic. Because Paragraph Vector is an unsupervised model, it learns the difference between anomalous and benign traffic. The authors used the TShark network protocol analyzer to decode network traffic. Paragraph Vector (the Doc2Vec algorithm) then converts the packet paragraphs into vector representations, identifying relationships between sentences through semantic and syntactic analysis and constructing a vector space from the corpus. The approach also uses the Distributed Bag of Words (DBoW) algorithm to calculate the semantic distance between words and detect anomalous traffic. Finally, it classifies the network traffic as either malicious or benign.
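The steps described above (decode each packet, treat its fields as a "paragraph" of tokens, embed it, and compare it with benign traffic) can be illustrated with a minimal sketch. To stay self-contained it substitutes plain token-count vectors and cosine similarity for Doc2Vec embeddings, so it is not the Paragraph Vector model itself. The packet fields, the `packet_to_tokens` and `is_anomalous` helpers, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def packet_to_tokens(packet: dict) -> list:
    """Flatten decoded packet fields into 'words' such as 'proto=tcp'."""
    return [f"{key}={value}" for key, value in sorted(packet.items())]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical decoded packets; field names are illustrative, not TShark's.
benign_packets = [
    {"proto": "tcp", "dport": 443, "flags": "ACK"},
    {"proto": "tcp", "dport": 443, "flags": "PSH-ACK"},
    {"proto": "tcp", "dport": 80, "flags": "ACK"},
]

# Profile of benign traffic: token counts summed over all benign packets.
benign_profile = Counter()
for pkt in benign_packets:
    benign_profile.update(packet_to_tokens(pkt))


def is_anomalous(packet: dict, threshold: float = 0.5) -> bool:
    """Flag a packet whose tokens are too dissimilar from the benign profile."""
    vector = Counter(packet_to_tokens(packet))
    return cosine(vector, benign_profile) < threshold
```

A packet such as `{"proto": "tcp", "dport": 443, "flags": "ACK"}` stays below the anomaly threshold, while `{"proto": "udp", "dport": 31337, "flags": "NONE"}` shares no tokens with the benign profile and is flagged. A learned embedding would additionally capture relationships between tokens that never co-occur exactly, which this count-based stand-in cannot do.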
The available research papers and the studies conducted show:
How data is captured and stored using an AI system.
How the captured data is stored in a blockchain.
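As a rough illustration of the second point, the sketch below links records of analyzed traffic into a hash chain, which is the core idea that makes tampering with stored data detectable. It is a minimal sketch under stated assumptions, not the storage design of the cited papers: there is no consensus, networking, or distribution, and the `make_block` and `is_valid` helpers and the record fields are hypothetical.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def make_block(index: int, records: list, prev_hash: str) -> dict:
    """Create a block whose hash covers its records and the previous hash."""
    body = {"index": index, "records": records, "prev_hash": prev_hash}
    return {**body, "hash": _digest(body)}


def is_valid(chain: list) -> bool:
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        body = {key: block[key] for key in ("index", "records", "prev_hash")}
        if block["hash"] != _digest(body):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Genesis block followed by two blocks of illustrative traffic verdicts.
chain = [make_block(0, [], "0" * 64)]
chain.append(
    make_block(1, [{"src": "10.0.0.5", "verdict": "benign"}], chain[-1]["hash"])
)
chain.append(
    make_block(2, [{"src": "10.0.0.9", "verdict": "malicious"}], chain[-1]["hash"])
)
```

Calling `is_valid(chain)` succeeds on the untouched chain. Editing any stored record afterwards changes that block's recomputed hash, so validation fails unless every later block is rebuilt as well.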
The available research papers show that AI-enabled databases have been developed by integrating relational database management systems with knowledge bases, offering a natural way to deal with information and making it easy to store, access and apply.
Research Problem
Most legacy network IDSs are built on signature-based detection mechanisms. They rely only on predefined network threats and cannot detect zero-day attacks.
Small and mid-sized organizations struggle to adopt AI technologies because of the high ongoing cost of the available systems, which also require high-end machines to run the IDS.
The available datasets that contain network traffic data are quite old.
There is no proper mechanism for storing analyzed network traffic data in an orderly and secure manner.
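The first limitation, that signature-based detection misses attacks it has never seen, can be made concrete with a small sketch. The signature set, the traffic figures, and the z-score check are all illustrative assumptions; in particular, the z-score is a simple statistical stand-in for behavior-based detection and not the deep learning model this work proposes.

```python
import statistics

# Hypothetical signature database: payload fragments of known attacks.
SIGNATURES = {"' OR 1=1 --", "../../etc/passwd"}


def signature_match(payload: str) -> bool:
    """Return True only if the payload contains a known signature."""
    return any(signature in payload for signature in SIGNATURES)


def zscore_anomaly(value: float, baseline: list, threshold: float = 3.0) -> bool:
    """Flag a value that deviates strongly from the observed baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold


# Illustrative baseline of bytes sent per minute by one host.
baseline_bytes_per_min = [980, 1020, 1005, 995, 1010, 990]

# A previously unseen attack: no known signature, but unusual volume.
novel_payload = "opaque-binary-blob"
novel_bytes_per_min = 25000
```

Here `signature_match(novel_payload)` finds nothing because no rule exists for the new attack, while `zscore_anomaly(novel_bytes_per_min, baseline_bytes_per_min)` flags it purely from its deviation from normal behavior.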
Main Objective
Develop a self-learning cyber-AI for real-time anomaly detection without human involvement.
Specific Objectives
- Conduct a deep investigation and gather data on cyber-AI.
- Train deep learning-based models to detect anomalous activities using existing datasets.
- Increase the accuracy of anomaly detection with real-time threat detection.
- Securely store and share data through a blockchain.
- Integrate the system into a cloud environment.
- Enhance the system to detect novel attacks by using a frequently updated dataset.
Solution
The Autonomous Cyber AI is meant to detect all kinds of network threats, including zero-day attacks, using an AI engine trained with deep learning techniques such as artificial neural networks and natural language processing. The engine analyzes the network logs for suspicious or abnormal behavior in the system.
By integrating the AI engine into a cloud environment, the Autonomous Cyber AI becomes a lightweight detection system that can be adopted even on low-specification machines.