Cyber Security Threats and Countermeasures using Machine and Deep Learning Approaches: A Survey

Abstract: Recent advancements in e-business, e-healthcare, e-governance, and online digital transactions have brought valuable benefits. Unfortunately, they have also given rise to severe cyber-attacks. Cyberattacks disrupt normal operations, attempt to retrieve confidential information and defense secrets, and subvert national defense systems and Internet-connected devices. Cyber security solutions are required to detect, analyze, and defend against threats and to protect sensitive data from unauthorized access. This study gives a detailed survey of different cybersecurity attacks, such as Denial-of-Service attacks, Botnet Evasion attacks, Malware attacks, Spam and Phishing attacks, Spoofing, Domain Generation Algorithms, Probing attacks, Remote-to-Local (R2L) attacks, and User-to-Root (U2R) attacks. The review emphasizes Machine Learning (ML) and Deep Learning (DL) based approaches to cybersecurity problems. Its key highlights are the research challenges, cybersecurity issues, cybersecurity domains, and tools for Intrusion Detection Systems. Datasets play a vital role in cybersecurity research; hence, private and publicly available datasets are reviewed in this study. The survey also discusses various performance metrics that can be used to evaluate the effectiveness of cybersecurity solutions.


Introduction
Recent advancements in cloud computing, wireless communication, big data, social networks, and the Internet of Things, together with the availability of high-speed Internet, have enabled rapid growth in cyberspace. Cyberspace is a global domain combining computer communication systems with technology (Starodubtsev et al., 2020). Although these advancements bring valuable benefits, they also give rise to serious cyber threats. Cyber security is also referred to as "Information Technology Security". With the massive growth in Internet-connected devices, securing these devices is a pressing need. The process of protecting networked hosts, applications, computing devices, and data from adversaries' attacks is referred to as 'Cyber security'. It involves a collection of policies, processes, techniques, and technologies to prevent vulnerabilities and attacks (Berman et al., 2019). Generally, cyber security systems protect user data and devices via encryption, user authentication, anti-virus software, Intrusion Detection Systems (IDS), and firewalls. In different contexts, cyber security is also termed network security, disaster recovery, end-user training, operational security, or information security.
A cyberattack is a deliberate attempt by an individual or organization to obtain the confidential data of other individuals or organizations. In a cyber-attack, the attacker targets the network, disabling applications, making devices malfunction to interrupt routine services, and stealing confidential information. The attacker can also make Internet-connected devices malfunction. Cyberattacks exploit vulnerabilities in software and hardware design through malware. Distributed Denial-of-Service (DDoS) attacks are launched to overwhelm target websites. Through hacking, attackers pierce the defenses of computer systems and interfere with their functionality. The attacker tries to retrieve confidential information from the organization; the retrieved information can be shared with unauthorized persons for financial or business benefit. Cyber-attacks interrupt the normal operation of software applications. Inexperienced or untrained personnel, improper system configuration, and insufficient procedures increase the vulnerability of computer network systems. The COVID-19 pandemic also pushed many people to adopt technology, thus exposing them to cyber vulnerabilities. During the pandemic, many cyber security firms reported a drastic increase of 35% in cyber-attacks (Andrade et al., 2020).
Novel and efficient cyber security methods are needed as new smart network technologies emerge (Duić et al., 2017). These systems should be protected from digital attack, damage, and unauthorized access. Efficient cyber security solutions are required in various domains such as business applications, online transactions, cloud computing, mobile computing, and software solutions. Such solutions are essential because they protect sensitive data from unauthorized access. Threat intelligence and Machine Learning approaches are required to identify, analyze, and defend against cyber security risks in real time. In recent years, emerging technologies that can be used to detect cyber-attacks include Dynamic Networks, Predictive Semantics, Quantum Computing, Behavioral Identity, and Cloud Computing (Geluvaraj et al., 2019). This study comprehensively reviews current research in the field of cybersecurity that uses ML and DL approaches.

Research Challenges and Issues in Cyber Security
Cybersecurity is a dynamic area in which challenges will always proliferate, and professionals and individuals must be ready to face them. Figure 1 gives the major sectors affected by cyber-attacks in the year 2021 (Gmcdouga, 2022). E-business and online transactions are more vulnerable to cyber threats. Attacks on them result in the loss of confidential information, reputational damage, and even legal liability. Based on the motives of the attacker, cyber threats can take one of several forms: Cybercrime, Cyberwarfare, Cyberterrorism, and Cyber Espionage (Rohith and Batth, 2019). Cybercrime covers various criminal activities that cause significant financial losses to businesses and individuals. Through Cyber Espionage, massive amounts of data, sensitive information, and intellectual property are extracted from government and private sector websites for economic benefit or political reasons. Statistics indicate that 11% of cyber-attacks are motivated by espionage.
The aerospace and defense sectors face cyber threats aimed at stealing intellectual property and defense secrets. In Cyberwarfare, attackers monitor, infiltrate, and subvert other nations' defenses to disrupt their critical infrastructure (Duić et al., 2017). Cyberattacks on defense have cascading effects and breach the national security system. Cyber-attacks are launched covertly to weaken or strike at an adversary in order to achieve political objectives. The enemy is unseen, and the victim is unsure how and where to react. The attackers leave no proof of their involvement in these attacks. Such attackers are called non-state actors (Goel and Nussbaum, 2021). To name a few, non-state actors include criminal organizations, script kiddies, hacktivists, scammers, and black hat hackers. Future wars are expected to take the form of 'Cyberterrorism', or no-contact war, in which there is no 'physical' or 'kinetic' action across borders; this form of conflict is constantly increasing. In Cyberterrorism, cyberspace is deliberately used for devising terrorist attacks. Terrorists have recently been using cyberspace for communication, for command and control, to brainwash innocent people, and for training and funding. Providing cyber security in the defense system is a complex issue that requires multidimensional, multilayered initiatives and responses. According to Professor Dorothy E. Denning, cyber terrorism is defined as unlawful attacks and threats of attack against computers, networks, and the information stored in them, when done to intimidate or coerce a government or its people in furtherance of political or social objectives (Luiijf et al., 2013). Other terms for cyber terrorism are cyber jihad, electronic jihad, e-jihad, and Internet jihad (Malin et al., 2017).
Satellite communication systems, navigation systems, and Earth observation systems often face threats from cyberattacks (Caprolu et al., 2020). Cyber attackers can use software mechanisms, amplifiers, transmitters, and steerable antennas to interfere with or generate satellite signals. Vulnerabilities in satellite communication systems are mission-critical, as they can disturb launch systems, telemetry, tracking and command, and communications. Continuous monitoring and protection measures have to be taken to protect these space-based systems.
A myriad of cyber threats plague the health sector. Data privacy in healthcare is a major concern in many countries. Cyber threats to the health sector may arise from malware that compromises the integrity of the system, or from DDoS attacks that disrupt the facilities available to patients, and they can lead to the loss of patients' privacy. Cyber threats in the health sector have ramifications beyond financial loss and breach of privacy (Thamer and Alubady, 2021). For instance, ransomware that targets hospitals steals patient data and puts patient lives at risk. It is reported that more than 18 million patient records have been affected by ransomware attacks. This sector is especially prone to cyberattacks because patients' personal information and medical data are valuable to attackers. Illness data can be used to blackmail patients, and confidential information such as diagnosis results, severity, types of treatment, and diseases can be shared with marketing firms for advertising and recommending their products. In an Electronic Healthcare Record (EHR) system, patient records are maintained, and medical devices such as infusion pumps, remote e-patient observation equipment, and Heating, Ventilation, and Air Conditioning (HVAC) systems are connected to the EHR system. A cyber attacker can compromise the EHR system and the devices connected to it, and can then launch further cyber-attacks (de la Torre et al., 2017).
Table fragment (attack class: attack names; associated numeric indices as listed in the source; the label and first entries of the first row are missing):
[row label missing]: [...], teardrop, back, land, pod, smurf; 2, 3, 9, 26, 41, 4, 26, 27, 41
U2R: loadmodule, rootkit, buffer_overflow, perl; 6, 11, 29, 30, 3, 10, 14
R2L: imap, ftp_write, multihop, guess_passwd, phf, warezclient, warezmaster, spy; 1, 2, 7, 33, 3, 40, 34, 30, 21
Probe: ipsweep, nmap, portsweep, satan; 2, 3, 9, 30, 34, 38, 40
In the IoT (Internet of Things), where devices are connected to the network, devices are more susceptible to cyberattacks in which adversaries try to capture the IP address, application port, DNS server, and server IP address. IoT devices are exposed to cyber-attacks because most IoT nodes are constantly connected and share data over the Internet (Roopak et al., 2019). The current growth of portable and IoT devices has further amplified the consequences of malware attacks (He et al., 2021). As a result, the risks are far more significant for IoT devices. Safeguarding IoT devices is complicated by the scale and scope of the data being generated and collected. Software piracy and malware attacks put organizations and specific operational capabilities at risk. With the growth in ubiquitous devices, the number of security threats against the IoT has increased (Ullah et al., 2019; Le Jeune et al., 2021). Cyber security algorithms help defend hosts, applications, and data against these attacks and help systems recover from failure in a controlled, measurable way.
Although many cyber-attack detection mechanisms are available, the rapid improvement in hacking skills and the increase in the number of cyberattacks demand new cyber-attack detection systems (Duić et al., 2017). Today, many activities are carried out as digital transactions, such as online e-business transactions, online banking transactions, online stock market transactions, and online patient record maintenance. All of these online transactions are prone to cyberattacks in which the attacker intercepts and eavesdrops on vital information related to the transaction. Similarly, in online bank transactions, adversaries capture user login credentials, loan credentials, and similar data to learn the financial status of customers. They then reuse the login credentials and financial information to introduce further security threats.
Cyber security attacks on a country's power grid are considered critical because vital information on nuclear reactor design and reactor operations could be shared with other countries. A cyberattack on the power grid could interrupt power and abruptly terminate reactor functionality (Farooq et al., 2021). The primary threat to a nuclear power plant is Cyber Sabotage. Cyber Sabotage can physically disrupt nuclear equipment, introduce viruses or malware into the power plant, and potentially lead to a nuclear explosion. Some examples of cyberattacks on nuclear power plants and grids are the Stuxnet computer worm attack on Iran's nuclear facilities, the cyberattack on India's Kudankulam Nuclear Power Plant, and the Ukraine power grid hack (Kumar and Gupta, 2021).
The new generations of Cyber-Physical Systems (CPS), which consist of software and physical parts (Akazaki et al., 2018), are more vulnerable to threats that can easily breach the integrity of these systems. Moreover, the sensors of these systems can be hacked and false data can be injected into the system so that the controller works on malicious data. The attacker can even compromise the actuators so that they do not function properly (Humayed et al., 2017).
In tech support scams, cyber attackers use fear tactics to convince people to pay for overpriced "help services" to diagnose technical issues with their computer hardware and software. It has also been observed that attackers impersonate pension-disbursing officials and collect the information of senior citizens in order to claim their benefits. Online gaming has become a form of entertainment for young people, but it also gives adversaries an opportunity to launch cyber-attacks. The most common cyber security threats in online gaming include disclosure of personal information, location, and device IP addresses, as well as flooding, hacking, and server maintenance problems (Shabut et al., 2016).
The safety and data confidentiality of citizens are the primary concerns in a smart city. Hamid et al. (2019) discuss the major cybersecurity issues and challenges in smart cities. The issues are categorized from three perspectives: governance, technological, and socio-economic. The governance process uses Information and Communication Technology (ICT) tools and the Internet to deliver information and public services. Smart cities have to protect citizens' privacy by establishing confidentiality and security benchmarks.
Recent years have witnessed a boom in the popularity of social networks. Billions of people use social media such as Instagram, Facebook, Twitter, YouTube, and LinkedIn to socialize and interact with each other. However, targeted spam, phishing, defamation, impersonation, cyberbullying, and fake accounts are some of the most common threats in social cybersecurity (Thakur et al., 2019). Moreover, cyber attackers send phony copyright complaint notices containing harmful links to Twitter and Facebook users; clicking such links can damage the device or corrupt the software on it.
Search engines rank web pages to return relevant results for user queries, and Search Engine Optimization (SEO) is used to improve a page's ranking (Dramilio et al., 2020). Although many organizations use such techniques legitimately to place their pages in search results, cybercriminals can design malicious websites and use SEO tactics to make them appear prominently in search results. This type of attack is referred to as SEO poisoning or search poisoning. Blockchain and cryptocurrency are proliferating and attracting more interest than ever (Tsochev et al., 2021). Crypto transactions are digital, and business entities must apply relevant cybersecurity measures to protect against security breaches, identity theft, and other potential threats. Secure key storage management and inviolable computing are critical needs for Blockchain devices (Urien, 2021).
Cloud computing relies on the Internet, and as the usage of the cloud increases tremendously and becomes a competitive need, securing cloud architecture is a significant concern (Krishnaveni et al., 2021). Some major threats to cloud architecture are DoS attacks, insider risks, account hijacking, data breaches, misconfiguration, and reduced infrastructure visibility.
An insider attack is one more security threat to any organization (Suresh and Madhavu, 2021). Here the attackers may be current or former employees, business partners, officeholders, consultants, or members of the Board of Directors. Disgruntled employees may decide to harm the organization purposefully. Employees with malicious intentions may disclose the organization's secrets to outsiders, as they may know the network design, vulnerabilities, and access codes. Reckless users may not intend to cause any harm, but they have access to the organization's information and proprietary data and may expose it accidentally. Detecting and rectifying insider fraud is one of the biggest challenges in cyber security (Pantelidis et al., 2021). Insider attacks pose severe threats to CPS such as smart cities and their components (Hossain et al., 2020).
5G is emerging quickly, providing high speed and responsiveness for wireless communication (Cabaj et al., 2018). However, new technologies come with unknown risks, and cyber security professionals have to find solutions for the potential threats. 5G networks play an important role in smart cities, identity authentication, online banking, and similar applications. Cybersecurity is essential to secure transactions, mitigate identity theft, protect user data and identities, and support additional Intelligent Access Control (IAC) mechanisms (Sedik et al., 2021).
Cyber threats are a global issue, and many countries are affected by them. Figure 2 gives a glimpse of the countries affected by major cyber-attacks (Statistics, 2021). There is an increased need to address cyber-attacks and defense mechanisms worldwide.

Domains In Cybersecurity
The domains of cyber security are discussed in this section. There is no rigid boundary between these domains, as they keep evolving. The major domains identified are the physical domain, the information domain, the cognitive domain, and the social domain. The physical domain covers protecting the system or desktop machine and its peripheral hardware components from cyber theft. The information domain focuses on the confidentiality, integrity, and availability of data. It proposes strategies for shielding programs, data, computers, and networks from unauthorized access or attacks, and its information security model captures the organizational policies that keep data safe. The cognitive domain concerns how data is perceived, analyzed, and used in decision-making. The social domain deals with the norms, ethics, and policies of the organization and the broader social landscape (Collier et al., 2013). The technical implementation of different forms of security involves application security, information security, vulnerability management, network security, cloud security, cryptography, critical infrastructure security, and so on.
Different forecasting and prediction methods are used to anticipate the consequences of cyber-attacks (Husák et al., 2018). The taxonomy of attack prediction and forecasting methods in cybersecurity is shown in Fig. 3. In attack projection and intention recognition, the intentions of the attacker and their next move are predicted. In intrusion prediction, upcoming cyber-attacks are predicted. In network security situation forecasting, cyber-attacks are predicted for the network as a whole. Cybersecurity threats can be formulated with discrete models such as Bayesian networks, Markov models, and attack graphs, or with continuous models such as grey models and time series. Cyber security issues can also be tackled with machine learning, deep multilayer representation learning, and knowledge discovery approaches.
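To make the Markov-model idea mentioned above concrete, the following is a minimal sketch of attack projection with a first-order Markov chain. It is not the method of any cited work: the stage names (scan, exploit, and so on) and the three training sequences are hypothetical, hand-made values used only for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count first-order transitions between consecutive attack stages."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return (most likely next stage, estimated probability), or (None, 0.0)."""
    followers = counts.get(current)
    if not followers:
        return None, 0.0
    stage, n = followers.most_common(1)[0]
    return stage, n / sum(followers.values())


# Hypothetical alert sequences (assumption: not taken from any dataset).
observed = [
    ["scan", "exploit", "escalate"],
    ["scan", "exploit", "exfiltrate"],
    ["scan", "bruteforce", "exploit", "escalate"],
]
model = fit_transitions(observed)
print(predict_next(model, "scan"))
```

A real projection system would estimate the transitions from labeled alert streams and would typically need smoothing for stages that were never observed; the sketch only shows the counting and lookup steps.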

Intrusion Detection Systems (IDS)
Modern networked enterprises require highly sophisticated technology to safeguard their organizations. Intrusion Detection Systems are used as security tools to detect possible intrusions in a network or a host (Berman et al., 2019; Sharma et al., 2016; Alom et al., 2015; Uğurlu and Doğru, 2019; Kim and Aminanto, 2017). An intruder within or outside the organization may initiate anomalous activity to disturb network operations. IDSs protect the system by supporting user authentication and by guarding against unauthorized users who try to gain additional system privileges and against users who misuse their privileges. They also help prevent the loss of data privacy (Alom et al., 2015; Gümüşbaş et al., 2020). IDSs can distinguish between malicious and benign actions (Ferrag et al., 2020). Based on functionality, IDSs can be network-based, host-based, or distributed (Alom et al., 2015; Karatas et al., 2018). Similarly, based on the detection method used while analyzing and detecting attacks, intrusion detection systems can be (i) rule-based (also called anomaly-based), (ii) signature-based (also called misuse-based), or (iii) hybrid (Gümüşbaş et al., 2020; Karatas et al., 2018; Macas and Wu, 2020; Chaudhary et al., 2020; Lakshminarayana et al., 2019).
In a rule-based system, the normal behavior states of the system are stored in a database. The program behavior is continuously monitored, and any deviation from the specified rules is flagged with an alarm. A malware detector consists of a data collector, a program interpreter, and a data matcher: the data collector gathers information about the program, the program interpreter converts the data into a useful representation, and the data matcher compares the interpreted data with the expected program behavior (Al-Janabi and Altamimi, 2020). Most of the literature classifies anomalies as follows (Alabadi and Celik, 2020):
1) Point Anomalies: a single data point that is abnormal when compared with the remaining data.
2) Contextual Anomalies: a data point that is abnormal only within a particular context.
3) Collective Anomalies: a collection of data points that is anomalous as a group.
Anomaly-based IDS performs better than signature-based IDS for unknown and complex attacks (Hindy et al., 2020). It is good at detecting unknown attack types, also called zero-day exploits (Rashid et al., 2020). The major challenge in anomaly detection is to separate normal from aberrant behavior. Anomaly detection solutions are not standard across applications, and prediction is difficult because malicious activities keep evolving. In addition, all unseen behaviors are treated as anomalies, which raises the false alarm rate.
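As an illustration of the point-anomaly case above, the sketch below learns a baseline (mean and standard deviation) from traffic assumed to be normal and flags values that deviate by more than a chosen number of standard deviations. The z-score rule, the threshold of 3.0, and the sample request counts are assumptions made for this example; the surveyed systems use far richer models.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Learn the mean and sample standard deviation of normal behavior."""
    return mean(normal_values), stdev(normal_values)


def is_point_anomaly(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-minute counts observed during normal operation.
normal = [98, 102, 100, 97, 103, 101, 99, 100]
baseline = fit_baseline(normal)

for observed in (104, 450):
    print(observed, is_point_anomaly(observed, baseline))
```

The sketch also shows why false alarms arise: any legitimate but previously unseen burst of traffic beyond the threshold would be flagged in exactly the same way as an attack.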
In signature-based (also known as knowledge-based) detection, known attack patterns are stored in a data repository. The IDS compares network activity against these patterns. Attack types that are already known can be detected with high accuracy and with few false alarms. Signature-based IDS usually achieves higher detection performance than anomaly-based IDS for known attack types (Kim and Aminanto, 2017). The drawback of the signature-based approach is that it can identify only the attacks recorded in the database. The administrator must update the database rules and signatures very frequently (Gümüşbaş et al., 2020; Kilincer et al., 2021; Mahdavifar and Ghorbani, 2019; Hwang et al., 2020). Extracting the different signatures requires a lot of time and effort. The approach does not give accurate results for zero-day attacks or for viruses with polymorphic behavior (Al-Janabi and Altamimi, 2020; Pu et al., 2020). "Zero-day" refers to newly discovered security vulnerabilities that attackers can exploit to attack systems. The vendor or developer has only just learned about the flaw and therefore has "zero days" to fix it (He et al., 2021).
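The matching step described above can be sketched as a lookup of known byte patterns in observed payloads. This is a simplified illustration, not the rule format of any real IDS: the two signatures and their names are hypothetical, and production systems use much more expressive rule languages.

```python
# Hypothetical signature repository: name -> byte pattern (assumed values).
SIGNATURES = {
    "sql_injection": b"' OR '1'='1",
    "path_traversal": b"../../",
}


def match_signatures(payload: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures found in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


print(match_signatures(b"GET /item?id=1' OR '1'='1"))   # known pattern
print(match_signatures(b"GET /item?id=1' oR '1'='1"))   # trivially altered variant
print(match_signatures(b"GET /index.html"))             # benign request
```

The second call illustrates the drawback noted in the text: a variant that changes a single character no longer matches the stored pattern, which is why signature databases must be updated frequently.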
Another distinct way of detecting intrusions is the hybrid method. This method integrates the advantages of both anomaly and misuse detection. It increases the intrusion identification rate and minimizes the false positive rate for unknown attack types. Most ML/DL techniques perform hybrid intrusion detection (Mahdavifar and Ghorbani, 2019). In hybrid detection, unknown attack types are identified by anomaly detection and known attacks are detected by misuse detection. Hybrid detection is divided into two categories (Rashid et al., 2020): (i) sequence-based detection and (ii) parallel-based detection. In the former, either misuse or anomaly detection is applied first, followed by the other. In the latter, multiple detectors are applied in parallel and their outputs are combined to reach a decision. The complete classification of IDS is shown in Fig. 4.
An Intrusion Prevention System (IPS) is a protection mechanism for interconnected devices that continuously monitors the network for malicious activity (Krishna et al., 2020; Chandre et al., 2018). It takes suitable action to prevent such activity by blocking, dropping, or reporting it. An IPS can be a signature-based or a statistical anomaly-based system (Krishna et al., 2020). With intrusion prevention systems, one can control access to an IT network and protect it from misuse and attack. Some open-source IDS/IPS systems are OSSEC (Open-Source Security), SNORT, Suricata, Zeek (formerly Bro-IDS), Samhain, Fail2ban, Security Onion, Kismet, OpenWIPS-ng, and Sagan (Sokolov et al., 2019; Chaabouni et al., 2019).
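The two hybrid arrangements can be sketched as follows. Both detectors here are deliberately trivial stand-ins (a set lookup for misuse detection and a size limit for anomaly detection), and the event fields, the known-bad payload, and the 1000-byte limit are assumptions made only to show how the detectors are composed.

```python
# Hypothetical known-bad payloads for the misuse (signature) detector.
KNOWN_BAD = {"' OR '1'='1"}


def misuse_detector(event):
    """Stand-in misuse detector: exact match against known-bad payloads."""
    return event["payload"] in KNOWN_BAD


def anomaly_detector(event, limit=1000):
    """Stand-in anomaly detector: flags unusually large transfers."""
    return event["bytes"] > limit


def sequential_hybrid(event):
    """Sequence-based: run misuse detection first, then anomaly detection."""
    if misuse_detector(event):
        return "known attack"
    if anomaly_detector(event):
        return "suspected unknown attack"
    return "benign"


def parallel_hybrid(event):
    """Parallel-based: run both detectors and combine their outputs (logical OR)."""
    votes = {"misuse": misuse_detector(event), "anomaly": anomaly_detector(event)}
    return any(votes.values()), votes


events = [
    {"payload": "' OR '1'='1", "bytes": 200},
    {"payload": "novel-exploit", "bytes": 50_000},
    {"payload": "hello", "bytes": 120},
]
for e in events:
    print(sequential_hybrid(e), parallel_hybrid(e))
```

The OR combination is one possible decision rule; weighted voting or a second-stage classifier over the detector outputs would be equally valid choices in a parallel design.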

Types of Cyber Threats
The cyber security world is not static. Cyber threats change rapidly, and both defense tactics and attack methods evolve every day. Some of the major cyber threats addressed in this study are given in Fig. 5.
Cyber threats are broadly categorized as follows.

A. Denial of Service (DoS)
Generally, the banking sector, government organizations, commercial applications, media companies, and similar entities are vulnerable to DoS attacks (Hamid et al., 2019). In a denial-of-service attack, the attacker floods systems, networks, or servers with unwanted requests, so that server resources and bandwidth are exhausted by the attacker's traffic. As a result, the system cannot fulfill requests from legitimate users.
When attackers use several compromised devices to launch a DoS attack, the attack is known as a Distributed Denial-of-Service (DDoS) attack. The literature documents several repercussions of DDoS attacks. The motivations behind DDoS attacks include disturbing the traffic of a targeted service for financial or economic gain, revenge, ideological belief, intellectual challenge, and cyberwarfare (Zargar et al., 2013). DDoS attacks are categorized into several types based on the objectives of the attack (Chen, 2020), as follows.

DDoS Flooding Attacks at Network/Transport Level
The attacker makes the network's bandwidth resources unavailable. These flooding attacks are further classified into the following types (Chaabouni et al., 2019).

a) Flooding Attacks
In flooding attacks, the attacker overwhelms the target's network bandwidth by sending bogus requests, mainly ICMP or UDP packets, ultimately disrupting service for legitimate users.
These kinds of attacks can be initiated with botnets.
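On the detection side, one simple countermeasure to such floods is to count packets per source within a sliding time window. The sketch below is an assumption-laden illustration: the class name, the one-second window, and the packet limit are hypothetical, and the addresses come from the documentation range 192.0.2.0/24.

```python
from collections import defaultdict, deque


class SourceRateMonitor:
    """Flags a source that sends more than max_packets within window_seconds."""

    def __init__(self, window_seconds=1.0, max_packets=100):
        self.window_seconds = window_seconds
        self.max_packets = max_packets
        self._recent = defaultdict(deque)  # source IP -> recent packet timestamps

    def observe(self, source_ip, timestamp):
        """Record one packet and return True if the source exceeds the limit."""
        recent = self._recent[source_ip]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_packets


monitor = SourceRateMonitor(window_seconds=1.0, max_packets=3)
flags = [monitor.observe("192.0.2.7", t) for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
print(flags)
```

Per-source counting is easy to evade when the flood is spread across the many hosts of a botnet, since each individual source can stay under the limit; aggregate or behavior-based detection is needed in that case.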

b) Protocol Exploitation Flooding Attacks
Protocol attacks consume the processing capacity of network infrastructure resources such as firewalls, servers, and load balancers. They target Layer 3 and Layer 4 protocols with malicious connection requests.

c) Reflection-Based Flooding Attacks
Here the attacker spoofs the target's IP address and sends requests to devices that provide a service. The server then sends its replies to the target's IP address. The attacker primarily uses UDP, or TCP in some cases, and the same protocol is used in both directions of the reflection.

d) Amplification-Based Flooding Attacks
In amplification attacks, attackers send a "trigger packet" to reflector devices with the source IP address set to the target's IP address. The reflectors' responses in turn overwhelm the victim's machine. The attacker can send millions of such requests to vulnerable services, generating responses that are considerably larger than the original requests and significantly boosting the volume of traffic directed at the target.
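The effect can be quantified with simple arithmetic: the amplification factor is the response size divided by the request size, and the traffic arriving at the victim scales with the response size rather than the request size. In the sketch below, the packet sizes (60-byte request, 3000-byte response) and the request rate are hypothetical numbers chosen only to make the calculation concrete.

```python
def amplification_factor(request_bytes, response_bytes):
    """Ratio of reflected response size to the spoofed request size."""
    return response_bytes / request_bytes


def bandwidth_mbps(packets_per_second, packet_bytes):
    """Traffic volume in megabits per second for a given packet rate and size."""
    return packets_per_second * packet_bytes * 8 / 1_000_000


# Hypothetical values: 60-byte trigger packets, 3000-byte responses, 10,000 requests/s.
request_size, response_size, rate = 60, 3000, 10_000

print(amplification_factor(request_size, response_size))  # factor applied by the reflector
print(bandwidth_mbps(rate, request_size))                  # bandwidth the attacker spends
print(bandwidth_mbps(rate, response_size))                 # bandwidth hitting the victim
```

Under these assumed numbers the attacker spends 4.8 Mbit/s while the victim receives 240 Mbit/s, a fifty-fold amplification.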

Attack on System's Resources (SYN Flooding Attack)
This attack abuses the TCP handshake required to establish a TCP connection. The attacker floods the server with SYN messages, each of which the server answers with a SYN + ACK. Because the requests are fake, the server waits for the client to complete the handshake and keeps retransmitting the SYN + ACK until a timeout occurs. The server is thus forced to hold many half-open connections, which eventually exhaust resources such as CPU time, memory, and other device resources, often to the point where the server crashes.
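The half-open state described above can be modeled with a small tracker that adds a client on SYN and removes it when the final ACK of the handshake arrives. This is a heavily simplified sketch under stated assumptions: real TCP stacks track sequence numbers, timers, and backlog queues, and the class name, the flag strings, and the limit of three half-open connections are hypothetical.

```python
class HalfOpenTracker:
    """Tracks clients that sent SYN but have not yet completed the handshake."""

    def __init__(self, max_half_open=3):
        self.max_half_open = max_half_open
        self.half_open = set()

    def on_segment(self, client, flags):
        """Update state for one observed segment.

        Returns True when the number of half-open connections exceeds the limit.
        """
        if flags == "SYN":
            self.half_open.add(client)       # server replies SYN+ACK and waits
        elif flags == "ACK":
            self.half_open.discard(client)   # handshake completed
        return len(self.half_open) > self.max_half_open


tracker = HalfOpenTracker(max_half_open=3)
# Five spoofed clients (documentation addresses) send SYN and never reply.
alerts = [tracker.on_segment(f"198.51.100.{i}:4000", "SYN") for i in range(1, 6)]
print(alerts)
```

A legitimate client leaves the set as soon as its ACK is seen, so only connections that never complete accumulate toward the limit.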

Application-Level DDoS Flooding Attacks
These sophisticated DDoS attacks exploit weaknesses at the application layer. They open connections, initiate processes, and perform transactions that deplete finite resources such as disk space and available memory. Application-level DDoS attacks are categorized into (Vanitha et al., 2017; Zargar et al., 2013):
a) Flooding attacks with reflection/amplification: these are similar to the network/transport-level attacks.
b) HTTP flooding attacks: the four varieties of this type of attack are as follows.

i) Session Flooding Attacks
This attack exhausts server resources by sending session connection requests to the server at a high rate, e.g., an HTTP GET/POST flooding attack.

ii) Request Flooding Attack
In a request flooding attack, attackers send sessions containing more requests than usual, which results in a DDoS flood that denies service to legitimate clients, e.g., single-session HTTP GET or HTTP POST flooding.

iii) Asymmetric Attacks
In an asymmetric attack, attackers send sessions that contain high-workload requests, with several HTTP requests embedded in a single packet, e.g., a multiple HTTP GET/POST flood.

iv) Slow Request/Response Attacks
In a slow request/response attack, the attacker sends partial HTTP requests that are continually extended in small increments and never completed. The attack persists until these requests occupy all available sockets and the web server becomes inaccessible. Examples include the Slowloris attack, the HTTP fragmentation attack, the slow POST attack, and the slow reading attack.
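A server-side check against this pattern is to look for connections whose request headers are still incomplete after a timeout. The sketch below assumes a hypothetical connection record with the fields id, opened_at, and headers_complete, and a 10-second header timeout; real web servers expose this through their own timeout settings rather than through code like this.

```python
def find_stalled(connections, now, header_timeout=10.0):
    """Return IDs of connections whose headers are still incomplete after the timeout."""
    return [
        c["id"]
        for c in connections
        if not c["headers_complete"] and now - c["opened_at"] > header_timeout
    ]


# Hypothetical connection table at time now = 100.0 seconds.
connections = [
    {"id": "c1", "opened_at": 40.0, "headers_complete": False},  # slow, never finishes
    {"id": "c2", "opened_at": 95.0, "headers_complete": False},  # recent, still in time
    {"id": "c3", "opened_at": 30.0, "headers_complete": True},   # normal long-lived connection
]
print(find_stalled(connections, now=100.0))
```

Closing the connections returned by such a check frees their sockets, which is the resource a slow request attack tries to exhaust.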

B. Botnet Evasion Attacks
The botnet attack is a prevalent, multi-stage cyberattack that begins with scanning network devices. It then infects the devices with malicious software, such as viruses (Hussain et al., 2021). To increase the magnitude of their attacks, attackers take control of many devices, forming a botnet, without the device owners' knowledge. The botnet can then overwhelm networked systems in a DDoS attack. Although botnets originally targeted computers, in recent years adversaries have been targeting Internet of Things (IoT) devices more often (Yamaguchi, 2020). In 2016, the Mirai botnet targeted half a million IoT devices with open telnet ports, using default usernames and passwords to log in to those devices and turn them into zombies (Kambourakis et al., 2017). The purpose of launching a botnet attack is to carry out malicious activities such as spam generation, key logging, and copyright violation. Bots habitually use various invasive approaches to gain the maximum benefit (Karim et al., 2014). The originator of a botnet is commonly known as the Bot Master, typically a person or a group of people who intend to launch malicious activities.

C. Malware Attack
'Malware' is a term derived from the words 'malicious' and 'software'.Malware is broadly used to refer to worms, ransomware, viruses, spyware, adware, Trojans, and other types of harmful software.In a Malware attack, adversary's trespasses network vulnerable links when a person connects a suspected link or opens an attachment of an email.It leads to the installation of unsecured or untrustworthy software on the system.Recently a large number of new malwares are generated by using metamorphic, polymorphic, and different evasive techniques (Vinayakumar et al., 2019).Initially, the malware is in an incubation period during which it will be propagating silently in the network by infecting the hosts.In the incubation period malware does not harm any system in the network and the attack is launched only when it's guaranteed enough systems are infected.During the expansion period, it propagates the entire network by launching/infecting bots.The malware detection probability and the incubation period are the key factors that determine the extent of malware attack severity (Xu et al., 2020).
Malware is given different names based on its behavior and purpose. The most common types of malware include Malvertising, Cryptojacking, Spyware, Adware, Ransomware, Trojan horses, Worms, Rootkits, Man-in-the-Middle (MitM), Backdoors, Viruses, Bots, Scareware, Man-in-the-Mobile (MitMo), etc. (Al-Janabi and Altamimi, 2020). According to TM (2020), recent malware attacks included Shlayer, ZeuS, Agent Tesla, and NanoCore. Generally, the analysis and detection techniques for malware attacks are classified into (Al-Janabi and Altamimi, 2020; Top 10 Malware, 2020; Albasir et al., 2018; Lin et al., 2020; Baptista et al., 2019): (1) dynamic, (2) static, and (3) hybrid.
Static analysis is faster because it analyzes the code without running it, although it has to deal with false positives. Techniques based on static analysis are computationally efficient and safer. Static analysis is less accurate, however, because it matches only some of the patterns. It can detect the most common types of malware but is ineffective against advanced malware that uses evasion techniques such as polymorphism and obfuscation (where evidence of malicious activity is hidden) (Mahdavifar et al., 2020).
Dynamic analysis works with executed code and is effective against obfuscation. It uses the characteristics and functionalities of the malware to determine its severity. The behavior of the malware is determined after running the executable code in a sandbox environment. Dynamic analysis can detect previously unseen samples because the file is analyzed in a virtual environment. As the file is actually executed, dynamic analysis achieves better accuracy and determines all matching patterns. Hybrid techniques leverage the advantages of both static and dynamic methods.
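The contrast between the two analysis styles can be illustrated with a minimal sketch. It is not taken from any of the cited works: the signature, the single-byte XOR "packer", and the function names are hypothetical, and the "dynamic" step only simulates what a sandbox would observe once a sample has decoded itself.

```python
# Hypothetical byte signature; real scanners rely on large curated signature sets.
SIGNATURE = b"EVIL_PAYLOAD"


def static_scan(blob: bytes, signature: bytes = SIGNATURE) -> bool:
    """Static check: search the raw bytes without executing anything."""
    return signature in blob


def xor_encode(data: bytes, key: int) -> bytes:
    """Single-byte XOR, a very simple stand-in for obfuscation or packing."""
    return bytes(b ^ key for b in data)


def simulated_dynamic_scan(blob: bytes, key: int,
                           signature: bytes = SIGNATURE) -> bool:
    """Stand-in for sandbox execution: the sample decodes itself at run time,
    so the monitor gets to inspect the decoded bytes."""
    return signature in xor_encode(blob, key)


plain = b"header" + SIGNATURE + b"trailer"
packed = xor_encode(plain, 0x5A)  # obfuscated copy of the same sample

print(static_scan(plain))                    # True: signature visible on disk
print(static_scan(packed))                   # False: obfuscation hides it
print(simulated_dynamic_scan(packed, 0x5A))  # True: visible after decoding
```

The sketch only shows why a pattern that is hidden on disk can still be observed at run time; it says nothing about the cost of running a sandbox.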

D. Spam and Phishing Attack
Phishing is the sending of fraudulent communications that seem to come from a reputable origin, usually by email. Through phishing attacks, intruders pose as trustworthy contacts and obtain sensitive information from the user (Sajal et al., 2019; Singh, 2020). Adversaries launch phishing attacks to capture sensitive information such as credit card numbers, PINs, and login details. In phishing, either login credentials are stolen or malware is installed on the victim's machine. Phishing is a common cyber threat on social media such as Twitter and Facebook. Phishing emails convince users with faultless wording and original logos. Phishing links lead to websites that are infected with malware. Phishing attacks exploit human vulnerabilities more than system vulnerabilities. They make the user enter his or her details into a fake website that resembles a legitimate one (Al-Janabi and Altamimi, 2020; Patil et al., 2017; Gupta et al., 2021). According to Singh (2020) and Tang and Mahmoud (2021), phishing attacks are classified into two major types: (i) social engineering (i.e., deceptive phishing) and (ii) malware-based phishing. Social engineering attacks rely on the psychological manipulation of users so that they make mistakes or share confidential information. In malware-based phishing, malicious software is executed on the user's machine to fetch the user's confidential information. Malware-based phishing attacks include DNS phishing, session hijacking, content injection phishing, key loggers, phone phishing, link manipulation, system reconfiguration, etc. Phishing attacks can also install malware that turns the victim's machine into a bot; the resulting botnet can then launch DDoS or other kinds of attacks.
Alam et al. (2020) explained the classification of phishing attacks. The different phishing attacks are as follows.

1) Algorithm-Based Phishing
It was first identified in the year 1996, wherein the phisher developed an algorithm to generate random credit card numbers to match the original credit card numbers of America Online (AOL) Accounts (Tang and Mahmoud, 2021;Khonji et al., 2013).

2) Deceptive Phishing
In a deceptive attack, the attacker uses emails or SMSs to send fraudulent links and trick people into clicking them. The websites behind the links capture and store the personal information of the victim.

3) URL Phishing
In this attack, the attackers use the phishing page's URL to infect the target. The link hides its real destination, which is the hacker's website. When the victim clicks on the URL, he or she is directed to the hacker's website, which captures the victim's information.
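One common form of hidden link is an anchor whose visible text shows one address while its target points elsewhere. The following sketch, which is an illustration rather than a method from the cited works, flags such anchors using only the Python standard library; the class and function names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html):
    """Return (shown, actual) pairs where the visible text is a URL whose
    host differs from the host of the real link target."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    flagged = []
    for href, text in parser.links:
        if not text.lower().startswith(("http://", "https://")):
            continue  # visible text is not a URL, nothing to compare
        shown = urlparse(text).hostname
        actual = urlparse(href).hostname
        if shown and actual and shown != actual:
            flagged.append((text, href))
    return flagged


sample = (
    '<p><a href="http://login.attacker.example/a">https://bank.example/login</a> '
    '<a href="https://bank.example/help">Help</a></p>'
)
print(mismatched_links(sample))
```

A host mismatch is only a heuristic: legitimate mail often uses redirectors, so such a check would normally feed a classifier rather than block messages on its own.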

4) Hosts File Poisoning
The hosts file of the operating system is poisoned so that when the user requests the desired website, the request is either rerouted to another website or answered with a "Page Not Found" error. When the user is redirected to a fake website, the user's data is stolen. Poisoning the hosts file alters the way the OS resolves a DNS name.
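As a hedged illustration of how such tampering could be checked, the sketch below parses hosts-file text and reports entries that map a trusted name to an unexpected address. The trusted list, the addresses (taken from documentation ranges), and the function names are assumptions made for the example.

```python
def parse_hosts(text):
    """Parse hosts-file text into {hostname: ip}; comments and blanks are skipped."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name.lower()] = ip
    return mapping


def suspicious_entries(hosts_text, trusted):
    """Return entries that override a trusted hostname with a different address."""
    found = parse_hosts(hosts_text)
    return {
        name: ip
        for name, ip in found.items()
        if name in trusted and ip not in trusted[name]
    }


# Hypothetical known-good addresses for the names we care about.
TRUSTED = {"bank.example": {"203.0.113.10"}}

hosts_text = """
127.0.0.1    localhost
198.51.100.7 bank.example   # injected line
"""

print(suspicious_entries(hosts_text, TRUSTED))
# {'bank.example': '198.51.100.7'}
```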

5) Content-Injection Phishing
Content injection phishing exploits a common web security vulnerability. Vulnerable web applications allow the actual content of a web page to be spoofed or modified. Content injection phishing occurs when the application does not properly handle user-supplied data, so that the attacker can supply content to the web application.

6) Clone Phishing
In clone phishing, a previously sent legitimate email that contains a link is used to create an identical copy with a malicious link. The new email is a replica of the original but with fake links or attachments. This duplicated email is sent to all contacts from the target's inbox. The person receiving the cloned mail clicks on the fake links, assuming it to be a legitimate email. This attack is hazardous because the recipients are unlikely to suspect the email.

7) Whaling
Whaling attacks target high-profile executives such as CEOs, CTOs, and Directors (Sajal et al., 2019). The attackers usually induce the victims to take an action, such as a fund transfer. These attacks are difficult to detect as they often do not use malicious URLs or weaponized attachments.

8) Spear Phishing
Phishing usually targets a large number of recipients, but spear phishing emails are carefully designed to obtain data in the form of a response from a particular person. Though the risk is high, spear phishing has a high success rate and has become one of the major factors affecting network security (Xiujuan et al., 2019).
Email phishing and URL phishing are difficult to identify as attackers frequently change their strategies (Alam et al., 2020). Protection approaches against phishing attacks include client-side tools, authentication, server-side filters and classifiers, network-level protection, and user education (Singh, 2020).

E. Domain Generation Algorithm (DGA)
A DGA attack is a type of attack in which adversaries design a software program that generates an extensive number of pseudo-random domain names (Shahzad et al., 2021). With a DGA, the malware generates hundreds to thousands of domain names in a short period. Some of the generated domain names are assigned to sites, and the malware contacts these sites to receive its instructions. DGAs are common in malware that endeavors to establish command and control communication between the botmaster and the infected machine (Chen et al., 2021). This is referred to as "command and control" or C2 (Yu et al., 2019). Since the domain names are short-lived, it is a challenge for defenders and analysts to detect them (Li et al., 2019). Using DGAs, the attackers can manage the infection-spreading websites and deploy the command and control (C&C) infrastructure. A DGA attack consists of the following phases: infection, C&C, lateral spreading, and data exfiltration. DGA attacks can be broadly classified into binary-based and script-based, depending on how they are deployed (Sood and Zeadally, 2016).
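To make the mechanism concrete, the following toy generator derives domain names from a shared seed and the current date, so that malware and botmaster can compute the same list independently. It is a simplified sketch, not the algorithm of any real malware family; the seed, the `.example` suffix, and the function names are assumptions. A character-entropy helper is included because entropy of the domain label is one lexical feature often considered for DGA detection.

```python
import hashlib
import math
from collections import Counter
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example"):
    """Derive `count` pseudo-random domain names from a shared seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to form a label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


def shannon_entropy(s: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


today_domains = toy_dga("hypothetical-seed", date(2021, 1, 1))
for name in today_domains:
    label = name.split(".")[0]
    print(name, round(shannon_entropy(label), 2))
```

Because the list changes every day, blocking yesterday's names does not help, which is the detection difficulty noted above.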

F. Spoofing
Spoofing is also called an impersonation attack. In spoofing, the attacker steals the user's authentication credentials to gain unauthorized access to services. The user credentials can be obtained by eavesdropping on the network or can be stolen from the device using a phishing attack. The attacker links their MAC address to an IP address of an unprotected network. It then becomes easy for the attacker to steal or delete data in this vulnerable network. Spoofing is commonly categorized into ARP (Address Resolution Protocol) spoofing, IP spoofing, and DNS spoofing (Hamid et al., 2019; Chaabouni et al., 2019). In ARP spoofing, the attacker sends spoofed ARP messages into the LAN. The Media Access Control (MAC) address of the attacker is thereby associated with the IP address of one of the legitimate users in the LAN. With this, the attacker is able to modify, steal, or even stop network traffic. IP address spoofing is done by modifying the source IP address of packets, which falsifies the sender's identity.
DNS spoofing occurs by modifying the entries of a DNS server (which maps domain names to IP addresses). The attacker can then reroute a particular domain name to a malicious or infected system.
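A simple way to notice ARP spoofing is to watch for an IP address whose MAC address suddenly changes. The sketch below assumes that (IP, MAC) pairs have already been extracted from ARP replies by some capture tool; the function name and the sample addresses are hypothetical.

```python
def detect_arp_conflicts(observations):
    """observations: iterable of (ip, mac) pairs seen in ARP replies, in order.

    Returns a list of (ip, old_mac, new_mac) for every time a known IP
    address is announced with a different MAC address.
    """
    table = {}
    alerts = []
    for ip, mac in observations:
        mac = mac.lower()
        known = table.get(ip)
        if known is not None and known != mac:
            alerts.append((ip, known, mac))
        table[ip] = mac
    return alerts


seen = [
    ("192.0.2.1", "AA:AA:AA:AA:AA:01"),   # gateway announces itself
    ("192.0.2.20", "AA:AA:AA:AA:AA:20"),
    ("192.0.2.1", "aa:aa:aa:aa:aa:01"),   # same MAC, different letter case
    ("192.0.2.1", "BB:BB:BB:BB:BB:66"),   # gateway IP now claimed by another MAC
]
print(detect_arp_conflicts(seen))
# [('192.0.2.1', 'aa:aa:aa:aa:aa:01', 'bb:bb:bb:bb:bb:66')]
```

A changed mapping can also have benign causes, such as a replaced network card or DHCP reassignment, so an alert of this kind is a starting point for investigation rather than proof of an attack.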

G. Probing
The attacker uses probing to discover the weak points of a system and gain entry to it. The hackers send scan packets into the system and efficiently collect information and data. Examples of such attacks include Nmap, Satan, port sweep, IP sweep, mscan, etc. (Gümüşbaş et al., 2020; Dixit and Silakari, 2021).
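Probing leaves a recognizable footprint in connection logs: one source touching many ports on a host, or many hosts on a network. The following threshold-based sketch illustrates the idea; the thresholds, the labels, and the event format `(src_ip, dst_ip, dst_port)` are assumptions for the example and are not drawn from the cited works.

```python
from collections import defaultdict


def find_scanners(events, port_threshold=10, host_threshold=10):
    """events: iterable of (src_ip, dst_ip, dst_port) connection attempts.

    Flags a source as 'port_sweep' if it contacts at least `port_threshold`
    distinct ports on one destination, and as 'ip_sweep' if it contacts at
    least `host_threshold` distinct destinations.
    """
    ports = defaultdict(set)   # (src, dst) -> distinct destination ports
    hosts = defaultdict(set)   # src -> distinct destination hosts
    for src, dst, port in events:
        ports[(src, dst)].add(port)
        hosts[src].add(dst)

    flagged = defaultdict(set)
    for (src, _dst), seen in ports.items():
        if len(seen) >= port_threshold:
            flagged[src].add("port_sweep")
    for src, seen in hosts.items():
        if len(seen) >= host_threshold:
            flagged[src].add("ip_sweep")
    return dict(flagged)


events = (
    [("10.0.0.5", "10.0.0.9", p) for p in range(20, 32)]        # 12 ports, 1 host
    + [("10.0.0.6", f"10.0.1.{i}", 80) for i in range(1, 12)]   # 11 hosts, 1 port
    + [("10.0.0.7", "10.0.0.9", 443)]                           # ordinary client
)
print(find_scanners(events))
```

Fixed thresholds are easy to evade with slow scans, which is one reason the learning-based detectors surveyed later are of interest.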

H. Remote-to-Local (R2L)
The invader, in an R2L attack, identifies the device's vulnerability by sending packets over the network. Then the attacker acquires unauthorized access to the victim's machine (Elsayed et al., 2020). The attacks are usually caused by buffer overflow (as in imap, named, sendmail), misconfigured security policies (as in ftpwrite), or Trojans (xsnoop). An R2L attack may be challenging to detect as it involves both network-level and host-level features (Rodriguez et al., 2021).

I. User-to-Root (U2R)
In a U2R attack, the attacker first gains legitimate access to a user account on the target machine and then illegally attempts to acquire root (superuser) privileges by exploiting the vulnerabilities of the system (Sapre et al., 2021; Begli et al., 2019). Examples of this attack are Load Module, Eject, Buffer_overflow, and Perl attacks.

J. SQL Injection
SQL injection attacks mostly target web applications. Loopholes in a website's database access are exploited to compromise the database. With this, hackers can access confidential user information on the website (Kilincer et al., 2021). The hackers can even modify, delete, or update the user information on the website. These attacks allow attackers to spoof identity, cause repudiation problems, destroy data or make it inaccessible, and even change the administrator settings of the website server. Attackers can access the backend data using methods such as Out-of-band SQLi, In-band SQLi (Classic), and Inferential SQLi (Blind).
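The classic in-band case can be reproduced in a few lines. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table, the data, and the function names are hypothetical. It contrasts a query built by string concatenation with a parameterized query, the standard countermeasure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(name):
    # Parameterized: the driver passes `name` as data, never as SQL.
    cursor = conn.execute("SELECT secret FROM users WHERE name = ?", (name,))
    return [row[0] for row in cursor]


payload = "x' OR '1'='1"
print(lookup_unsafe(payload))  # every secret is returned
print(lookup_safe(payload))    # [] because no user has that literal name
```

With the payload above, the concatenated query becomes `... WHERE name = 'x' OR '1'='1'`, whose condition is true for every row.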

Data Sets used in ML and DL Techniques
Data and datasets play a crucial role in conducting cyber security research and evaluating its results. It is essential to identify and use a relevant dataset for the experiments that estimate the significance and performance of suggested cyber security solutions. The effectiveness of ML and DL models relies on the size of the datasets used to train them (Xin et al., 2018). To construct an efficient IDS, relevant, heterogeneous, and massive datasets are required for training the proposed models and evaluating their performance (Sohn, 2021). Some of the important datasets released over time are depicted in Fig. 6. Ferrag et al. (2020) classified the available public datasets into 7 categories, based on network traffic, internet traffic, electrical networks, virtual private networks, Android applications, IoT traffic, and internet-connected devices. Buczak and Guven (2015) illustrated the significance of ML approaches in intrusion detection systems. The authors used packet-level, NetFlow, and public data to evaluate the ML algorithms.

A. KDD99/KDD CUP 99 Dataset
The KDD99 datasets were created for the 3rd International Knowledge Discovery and Data Mining Tools Competition, which was held in association with KDD-99, the 5th International Conference on Knowledge Discovery and Data Mining (KDD). These datasets are based on the DARPA 1998 PCAP files (Gümüşbaş et al., 2020). They are widely used for differentiating intrusions from normal traffic (Andresini et al., 2020). There are more than 5 million records in the dataset. In total, there are 41 traffic features, mainly giving information about the source IP address and source port. These 41 features can be categorized into 3 classes: content features, traffic features, and basic features (Ferrag et al., 2020). The basic features (also referred to as intrinsic attributes) are extracted from the network packet's header. The content attributes are extracted from the contents area of the network packets. Traffic attributes are calculated based on the previous connections. Traffic attributes are grouped into (1) time-based traffic features and (2) host-based (machine) traffic features (Chahira, 2020). The summary of the features is given in Table 1.

B. NSL-KDD Dataset
NSL-KDD is the next version of the KDD-CUP-99 dataset. It was formed by removing duplicate instances from KDD-CUP-99 and reconstructing the structure of the dataset (Sohn, 2021). The restructured data helps the classifier avoid biased results. Compared to KDD CUP 99, the NSL-KDD dataset gives a lower reduction ratio since there is no repetitive data. There are 41 attributes in the dataset describing different features of the traffic. The features are classified into numerical (38 features) and categorical (3 features: "flag", "service", and "protocol_type") (Li et al., 2019). The 4 major attack categories are listed in Karatas et al. (2018). The NSL-KDD dataset is divided into KDDTrain+ and KDDTest-21. KDDTrain+ is used to train an IDS model, and KDDTest-21 is used to test it (Sohn, 2021; Li et al., 2019).
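Because most ML models expect numeric input, the three categorical features are usually converted before training. One common option, shown here as a sketch and not as the preprocessing of any cited work, is one-hot encoding. The three sample records are made up for illustration and contain only two of the 38 numerical features.

```python
CATEGORICAL = ("protocol_type", "service", "flag")


def fit_vocab(records):
    """Collect the sorted set of values seen for each categorical feature."""
    return {col: sorted({r[col] for r in records}) for col in CATEGORICAL}


def encode(record, vocab):
    """Numeric features pass through; each categorical one becomes a 0/1 block."""
    vector = [float(v) for k, v in record.items() if k not in CATEGORICAL]
    for col in CATEGORICAL:
        vector.extend(1.0 if record[col] == value else 0.0 for value in vocab[col])
    return vector


# Illustrative records only; real NSL-KDD rows carry 41 features plus a label.
records = [
    {"duration": 0, "src_bytes": 181, "protocol_type": "tcp", "service": "http", "flag": "SF"},
    {"duration": 2, "src_bytes": 0, "protocol_type": "udp", "service": "domain_u", "flag": "SF"},
    {"duration": 0, "src_bytes": 0, "protocol_type": "tcp", "service": "private", "flag": "S0"},
]

vocab = fit_vocab(records)
print(vocab)
print(encode(records[0], vocab))
# [0.0, 181.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
```

The vocabulary should be fitted on the training split only and reused for the test split, so that both splits share the same vector layout.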

C. UNB ISCX 2012 Dataset
The UNB ISCX 2012 dataset was created at the University of New Brunswick (UNB) in 2012. It includes normal traffic as well as attack data for DoS, infiltration, DDoS, and brute-force SSH attacks (Gümüşbaş et al., 2020). The UNB ISCX 2012 dataset for intrusion detection systems comes from a realistic network and covers diverse intrusion scenarios. The dataset has statistical features such as protocol, source_bytes, direction, time_stamp, source_packets, dst_bytes, dst_packets, source_ip, Tag, and dst_ip. The real network traffic of the POP3, IMAP, SMTP, HTTP, FTP, and SSH protocols is analyzed to determine the expected behavior of computers (Ghurab et al., 2021). The dataset consists of traces of labeled network data that include the full packet payload in pcap (Packet Capture) format. It is open for public use by researchers (Shiravi et al., 2012).

D. UNSW-NB15 Dataset
The dataset was generated at the Cyber Range Laboratory of the Australian Center for Cyber Security (ACCS), a cyber-security research team, employing the IXIA PerfectStorm, Argus, Tcpdump, and Bro-IDS tools. The tools are specifically used to create DoS, generic, shellcode, reconnaissance, worm, and exploit attacks (Ferrag et al., 2020). The dataset has 49 features and 2,540,044 vectors. The features are grouped as basic, time, content, flow, connection, labeled, and general (Gümüşbaş et al., 2020; Ferrag et al., 2020; Sohn, 2021). The dataset is publicly available and consists of 45 distinct IP addresses.
Table 3 summarizes the data collection (Pantelidis et al., 2021; Sharafaldin et al., 2018b). The database consists of 2,830,540 samples containing 83 features. The important features extracted are flow duration, destination port, total backward packets, total forward packets, etc. (Ho et al., 2021). It covers a variety of insider and outsider attacks. The common attacks covered in the dataset are DoS, Web Attacks, Botnet, Brute Force SSH, DDoS, Brute Force FTP, Heartbleed, Infiltration, etc. (Rashid et al., 2020).

F. CSE-CIC-IDS2018 Dataset
The CSE-CIC-IDS2018 dataset was created by the Canadian Institute for Cyber-security (CIC) and the Communications Security Establishment (CSE). It addresses seven distinct types of attacks: Heartbleed, brute force, botnet, DoS, DDoS, infiltration, and web attack (Rashid et al., 2020; Shibahara et al., 2016). To collect the data, the victim organization comprised five departments with 420 personal computers and 30 servers, and the attacking infrastructure consisted of 50 machines. The CSE-CIC-IDS2018 dataset consists of network flow data, event files from each victim machine, and 80 different network traffic attributes extracted with CICFlowMeter-V3 (Rashid et al., 2020).
Li et al. (2019) discussed the Border Gateway Protocol (BGP) datasets. The datasets include the routing logs from Réseaux IP Européens (RIPE) and BCNET. They include the data of the day of the attack (anomalous data points) and the data collected two days before and two days after the invasion (regular data points). Each record has 37 features extracted from the dataset. This dataset holds information regarding attacks such as Code Red I, Nimda, and Slammer. BCNET contains the regular data.

H. ISCX URL-2016 Dataset
The dataset contains samples of different types of URLs (Mamun et al., 2016). The collected URLs are classified into five types:
1) Benign URLs: legitimate URLs that do not lead to any malicious websites.
2) Spam URLs: URLs that contain out-of-context links to websites, discussion forums, etc., to promote spammer sites.
3) Phishing URLs: URLs that make the user visit a fake website and thus steal the personal information of the victim.
4) Malware URLs: URLs that take the user to a malicious website that installs malware on the victim's device.
5) Defacement URLs: URLs of websites that hacktivists have penetrated and defaced for their own benefit.
Table 4 shows the various URL types, their sources, and the number of samples.
Mahdavifar et al. (2020) proposed a new dataset named CICMalDroid2020. CICMalDroid2020 is a collection of samples taken from five distinct classes of Android applications: SMS, Adware, Riskware, Banking, and Benign. It includes 17,341 samples capturing static and dynamic features from publicly available datasets.

K. Drebin
Drebin is a well-known, widely accepted malware dataset for the Android operating system and was part of the Mobile Sandbox project (Mishra et al., 2021; Arp et al., 2014; Salah et al., 2020). Four well-known malware families and their top 5 features are listed in Table 5 (Arp et al., 2014).

L. Kyoto University Honeypot Dataset
Kyoto 2006+ was created from traffic data collected over three years, from Nov. 2006 to Aug. 2009, using several types of honeypots (Song et al., 2011). The dataset includes 24 statistical features, of which 14 are conventional features extracted from the KDD Cup 99 dataset, a widely used evaluation dataset in intrusion detection (Krishnaveni et al., 2021). The dataset has ten additional features: Source_Port_Number, Ashula_detection, Source_IP_Address, IDS_detection, Malware_detection, Label, Destination_Port_Number, Destination_IP_Address, Protocol, and Start_Time (Ghurab et al., 2021).
Various honeypots, such as Windows machines, Linux/Unix machines, and dedicated honeypots placed in network printers, home appliances, etc., were used to collect the actual data. The honeypots were deployed on five different networks, namely 1 class A and 4 class B networks, inside and outside Kyoto University. Research on the Kyoto dataset has mainly concentrated on detecting anomalies, notably through feature analysis, ensemble classifiers, and dimensionality reduction (Salo et al., 2019).

M. The CTU-13 Dataset
The CTU-13 dataset consists of labeled data with botnet, normal, and background traffic captured at the Czech Technical University (CTU), Czech Republic, in 2011 (Garcia et al., 2014). It consists of thirteen different captures, referred to as scenarios, of distinct botnet samples. Every botnet scenario runs a specific malware with a specific infection of the virtual machines, captured in different pcap files. The dataset includes various types of botnets with HTTP, IRC, and P2P-based communication techniques and invasions such as Click Fraud, Port Scan, DDoS, Spam, and FastFlux (Kim et al., 2021). The CTU-13 botnet dataset scenarios are shown in Table 6 (Huang et al., 2021; Garcia and Uhlir, 2014).

N. ADFA Dataset
Created at the Australian Defence Force Academy (ADFA) by Creech and Hu (2013), this public dataset was devised to overcome the constraints of the KDD Cup 99 dataset. The two variants of the ADFA dataset are ADFA-LD (ADFA Linux Dataset) and ADFA-WD (ADFA Windows Dataset), containing the data from each operating system. ADFA includes 833 normal training traces and 4373 normal validation traces. Constructed for evaluating system-call-based HIDS, it represents the format and process of recent attacks. The dataset contains attack types such as Webshell, Adduser, Hydra-SSH, Hydra-FTP, and Java-Meterpreter.

O. AWID Dataset
The Aegean WiFi Intrusion Dataset (AWID) is a public dataset consisting of real traces of regular and malicious traffic in 802.11 networks (Kolias et al., 2015). This dataset is intended exclusively for detecting intrusions in wireless networks.
The data was collected from a dedicated WEP-protected 802.11 network with actual network utilization. A physical lab emulating a typical Small Office/Home Office (SOHO) infrastructure was set up to collect the AWID data. Depending on the labeling method, the dataset is available in two sets: AWID-CLS (for classes) and AWID-ATK (for attacks). Each of these sets has a full subset (AWID-CLS-F and AWID-ATK-F) and a reduced subset (AWID-CLS-R and AWID-ATK-R). Each subset has two versions, one for training and one for testing. AWID-CLS-R-Trn and AWID-ATK-R-Trn consist of 1,795,575 records, of which 1,633,190 are regular traffic and 162,385 are intrusive. AWID-ATK-F-Trn and AWID-CLS-F-Trn have 37,817,835 records, of which 1,085,372 contain some kind of attack. The AWID dataset can also be used for other wireless technologies such as WiMax, UMTS, LTE, or different 802.11 settings (e.g., vehicular networks, mesh mode).

P. Other Datasets
Besides the well-known datasets, many researchers have used other datasets for cyber security. Xiujuan et al. (2019) used the Enron email dataset from the CALO (A Cognitive Assistant that Learns and Organizes) project. The data is a collection of emails from 150 users, mainly senior managers of Enron Corp. The Federal Energy Regulatory Commission examined this email dataset and published it online. In the research work of Lorenzen et al. (2018), data was collected from the "Cybersecurity Environment for Detection, Analysis, and Reporting" (CEDAR).
These datasets are used to analyze deep learning algorithms that separate malicious from benign network activities. Li et al. (2019) used DGA-based domain data such as Nymaim, Tovar, CryptoLocker, and Locky to evaluate their proposed model for malware detection. Singhal et al. (2020) used public blocklists such as PhishTank and the Malware Domain List (MDL) to collect malicious URLs. OpenDNS operates PhishTank to distribute and verify phishing websites. MDL holds an archive of malware-infected websites. Yu et al. (2019) used the AlexaBamb training data, which combines domain names from Alexa, which are benign, and from Bambenek, which are non-benign. Apart from these, researchers have also used datasets from other sources, such as the Cisco Umbrella popularity list, the Alexa Top 1M domains, the OSINT DGA feed from Bambenek, and Netlab 360, covering the most popular domain names for DGA domain detection (Shahzad et al., 2021).

Machine Learning-Based Approaches for Cyber Security Problems
Machine Learning algorithms build behavior models using mathematical techniques across massive datasets and make predictions on new input data. Machine learning methods are well suited to intrusion discovery mechanisms. Machine Learning (ML) lets computers learn without being explicitly programmed. Arthur Samuel, a pioneer of ML, explained ML as a branch of computer science that emphasizes how to make the computer think (i.e., artificial intelligence) without giving explicit instructions to the machine (Gordon, 1995). ML performs classification and regression based on features previously learned from a set of training instances. The strategy consists of two phases: training and testing (Buczak and Guven, 2015).
Machine Learning approaches are commonly classified as unsupervised, supervised, and reinforcement techniques. In supervised learning, the algorithm is trained with a set of labeled input and output data. Training uses the input feature set together with the correct output, which lets the model learn over time; that is, the training dataset contains the target vector. In unsupervised learning, by contrast, algorithms learn from the training data without any target vector (Sharma et al., 2016; Martínez et al., 2019; Apruzzese et al., 2018; Hu and Tan, 2017; Yavanoglu and Aydos, 2017; Djellali et al., 2019). Different algorithms and computation approaches are used in supervised techniques. The most commonly used supervised learning methods are classification and regression, depending on whether the target labels are discrete or numeric (Liang et al., 2019). Unsupervised learning includes dimensionality reduction, density estimation, and clustering (Liang et al., 2019). As unsupervised learning does not require labeled training data to detect malicious activity, it is better suited to cyber security than supervised learning, which needs labeled training data (Geluvaraj et al., 2019). In reinforcement learning, the machine learns by trial and error in an interactive setting, and the predicted output is evaluated based on a positive or negative reward (Alabadi and Celik, 2020). The major reinforcement methods are value function approximation and policy search (Liang et al., 2019).
The research experiment of Apruzzese et al. (2018) uses Feedforward Fully Connected Deep Multi-Layer Neural Network and Random Forest algorithms. The ML algorithms are applied to (i) intrusion identification, (ii) analysis of malware, and (iii) detection of spam. Within intrusion detection, the focus is on DGA detection, network intrusion detection, and botnets. Other machine learning algorithms used in cybersecurity are Bayesian classifiers and Markov models. K Nearest Neighbor (KNN), Naive Bayesian classification, SVM, and Neural Networks are the machine learning techniques used in spam filtering (Patil et al., 2017). Pu et al. (2020) proposed a blended unsupervised method for anomaly detection combining One-Class SVM (OCSVM) and Subspace Clustering (SSC). SSC is an extension of the traditional clustering approaches. SVM is a supervised approach that investigates data and identifies patterns. OCSVM is an extension of the SVM model and is specifically appropriate for unlabeled data. The proposed method is evaluated using the well-known NSL-KDD dataset (Sohn, 2021).
Attackers use malicious websites to acquire control of a system and inject malware to collect user details or harm the system. Generally, the attackers keep changing the URLs of the malicious websites. Singhal et al. (2020) suggested a method to categorize website URLs as malicious or benign. The authors used machine learning classifiers such as Gradient Boosted Decision Trees, Random Forests, and Deep Neural Networks for the classification. For these classifiers, they used content-based, host-based, and lexical features of the URLs. The authors also addressed drift in websites to account for the dynamic nature of malicious websites. Drift is observed as a change in the association between the input data and the target variable.
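Lexical features are those computed from the URL string alone, without fetching the page. The sketch below extracts a handful of commonly used ones with the Python standard library. The specific feature set is an assumption chosen for illustration and is not the feature list of Singhal et al. (2020).

```python
import re
from urllib.parse import urlparse

IPV4_PATTERN = re.compile(r"(\d{1,3}\.){3}\d{1,3}")


def lexical_features(url):
    """Compute simple string-based features of a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ipv4": bool(IPV4_PATTERN.fullmatch(host)),
        "num_path_segments": len([s for s in parsed.path.split("/") if s]),
    }


print(lexical_features("http://192.0.2.7/login/update?id=42"))
print(lexical_features("https://www.example.org/"))
```

The resulting dictionaries can be turned into numeric vectors and passed to any of the classifiers named above; host-based and content-based features would require DNS lookups and page downloads, which this sketch does not attempt.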
In malware analysis, ML approaches are used to detect malware and to classify it into different categories (Li et al., 2020). In malware detection, algorithms classify software as malicious or benign. The major challenge is that malware incorporates metamorphic, polymorphic, and other evasive techniques that can modify its behavior and create new types of malware (Vinayakumar et al., 2019). Hackers use these obfuscation techniques against traditional signature-based detection. Baptista et al. (2019) present a method for malware detection based on Self-Organizing Incremental Neural Networks (SOINN) and binary visualization. The binary data of a file is converted into an image, and malicious content is analyzed and detected using SOINN. The converted images are preprocessed, and the extracted features are given to SOINN for clustering and classification. A similar process happens during the testing phase. The algorithm achieves an overall detection rate of 74%, with false positives at 12% and false negatives at 14%.
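The idea of turning a binary into an image can be sketched very simply: each byte becomes one grey-level pixel and the bytes are laid out row by row. This is only an assumed, simplified layout for illustration; the encoding actually used by Baptista et al. (2019) may differ, and the function name and default width are hypothetical.

```python
def bytes_to_image(data: bytes, width: int = 16, pad: int = 0):
    """Lay the bytes out row by row as a 2-D grid of grey levels (0-255)."""
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([pad] * (width - len(row)))  # pad the final short row
        rows.append(row)
    return rows


image = bytes_to_image(bytes(range(20)), width=8)
for row in image:
    print(row)
# [0, 1, 2, 3, 4, 5, 6, 7]
# [8, 9, 10, 11, 12, 13, 14, 15]
# [16, 17, 18, 19, 0, 0, 0, 0]
```

Once a file is in this form, image preprocessing and feature extraction can be applied before the features are handed to a clustering or classification model.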
To effectively detect malware, Li et al. (2018) designed the Significant Permission IDentification (SigPID) framework, which adopts an SVM classifier. The SigPID framework extracts the significant permissions from applications and uses the extracted data to detect malware with supervised learning algorithms. To extract significant permissions, the authors proposed a Multilevel Data Pruning (MLDP) approach with Support-based Permission Ranking (SPR), Permission Mining with Association Rules (PMAR), and Permission Ranking with Negative Rate (PRNR). The authors then used an SVM classifier to distinguish malware from benign applications. The proposed framework achieves better accuracy, precision, and recall in malware detection, which is its main objective.
The SVM classifier is one more efficient method for the detection of malware. Hegde et al. (2020) demonstrated the effectiveness of SVM classifiers in detecting botnet activities in a home IoT environment. The performance metrics used are false alarm rate, detection rate, and testing accuracy. The classifiers used for detecting botnet activities are Random Forest, Decision Trees, Two-class Neural Networks, Multiclass Decision Trees, and Multiclass Neural Networks. The authors concluded that the performance of the classifiers increased with the dataset size and with the amount and diversity of the malicious activities.
Intrusion detection systems are used to monitor malicious activities in a system. ML-based IDS approaches fall into three categories: data classification, anomaly-based methods, and data clustering (Bahl and Sharma, 2015). Data classification is a supervised machine learning strategy in which the data is classified into different types of attacks. An anomaly-based method, a semi-supervised machine learning technique, identifies deviations from the expected behavior. In data clustering, the data is clustered based on patterns.
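The anomaly-based idea, learning what is expected and flagging deviations from it, can be shown with a deliberately simple statistical sketch. It models a single traffic measurement with its mean and standard deviation and flags values more than a chosen number of standard deviations away. The measurement, the threshold of 3, and the function names are assumptions; practical systems model many features at once.

```python
from statistics import mean, pstdev


def fit_baseline(values):
    """Learn the expected behavior from attack-free observations."""
    return mean(values), pstdev(values)


def is_anomalous(x, baseline, threshold=3.0):
    """Flag x if it lies more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return x != mu
    return abs(x - mu) / sigma > threshold


# Hypothetical packets-per-second readings taken during normal operation.
normal_rates = [100, 102, 98, 101, 99]
baseline = fit_baseline(normal_rates)

print(is_anomalous(101, baseline))  # False: within the usual range
print(is_anomalous(130, baseline))  # True: far outside the learned baseline
```

Only normal data is needed to fit the baseline, which is why this family of methods can react to attacks that were never seen during training.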
Adaptive Bayesian Algorithm (ABA), Artificial Neural Networks (ANN), KNN, DT, and SVM are machine learning techniques that have been used extensively in the literature to detect intrusions. The Radial Basis Function SVM (RBF-SVM) model achieved the highest accuracy (Chaudhary et al., 2020). Otoum et al. (2018) proposed an adaptive intrusion detection system, called the Adaptively Supervised and Clustered Hybrid Intrusion Detection System (ASCH-IDS), to classify the aggregated data. This model uses a random forest-based classifier as the misuse detection subsystem to detect known attacks and an enhanced-DBSCAN classifier as the anomaly detection subsystem to detect unknown attacks. Begli et al. (2019) and Hagos et al. (2017) used SVM in designing an intrusion detection system to prevent possible attacks such as U2R and DoS. The proposed methodology uses the SVM to separate the malicious traffic pattern from the typical traffic pattern, a separation that happens to be non-linear.
Although intrusion detection employs many machine learning algorithms, each has its benefits and drawbacks, and each performs differently on different attacks. Ensemble learning is a technique in which several base machine learning models are combined to obtain a better predictive model. Ensemble models have proved efficient in detecting cyber-attacks. Feng et al. (2018) designed a unique Intelligent Intrusion Detection System framework to address multi-attack classification based on the CIC-IDS 2018 dataset. Their ensemble technique uses a blended feature-selection approach employing Random Forest (RF) and Principal Component Analysis (PCA). The other machine learning algorithms used in that work are KNN, DT, Extra Trees, LightGBM, Histogram-based Gradient Boosting (HBGB), and Extreme Gradient Boosting (XGB). The framework is tested and compared with other approaches as well. Krishnaveni et al. (2021) proposed an ensemble method for efficient feature selection and classification in network intrusion detection for current threats in cloud computing. This approach relies on a univariate ensemble feature selection technique, with reduced feature sets selected from intrusion datasets such as the Honeypot real-time dataset, Kyoto, and NSL-KDD. Wang et al. (2018) combined the KNN technique (supervised learning) with the K-Means method (unsupervised learning) inside the KNN classifier to enhance the performance of the intrusion classifier on U2R attacks. To achieve this, the authors introduce feature weighting and unsupervised learning into the KNN process. The reported results show that the approach can efficiently classify network attacks and significantly improves the classification of U2R attacks. Buczak and Guven (2015) reviewed ML- and DL-based approaches for detecting cyber intrusion and misuse attacks in wired and wireless networks. The authors focused on misuse detection, anomaly detection, and hybrid detection for various ML and DL models: (i) Bayesian Networks, (ii) Evolutionary Computation, (iii) Artificial Neural Networks, (iv) Clustering, (v) Decision Trees, (vi) Association Rules and Fuzzy Association Rules, (vii) Sequential Pattern Mining, (viii) Inductive Learning, (ix) Support Vector Machine, (x) Hidden Markov Models, and (xi) Naive Bayes. The models are compared on criteria such as the time to train a model, the time to classify unseen examples with a trained model, the understandability of the final results (classification), and accuracy. This work highlights the need for labeled data and for retraining.
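The effect of feature weighting inside a KNN classifier can be illustrated with a small sketch. It assumes a weighted Euclidean distance and hand-picked weights; the weighting scheme and the K-Means step of Wang et al. (2018) are not reproduced, and the two features (a duration-like value and a root-access indicator) are hypothetical.

```python
import math
from collections import Counter

def weighted_knn_predict(train_x, train_y, query, weights, k=3):
    """Classify `query` by majority vote among the k nearest training points,
    using a Euclidean distance in which each feature is scaled by its weight."""
    scored = []
    for x, label in zip(train_x, train_y):
        dist = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query)))
        scored.append((dist, label))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical samples: (duration, root_access_indicator).
train_x = [(9.0, 0.0), (10.0, 0.0), (11.0, 0.0), (0.0, 1.0), (20.0, 1.0), (30.0, 1.0)]
train_y = ["normal", "normal", "normal", "u2r", "u2r", "u2r"]
query = (10.0, 0.9)

print(weighted_knn_predict(train_x, train_y, query, weights=[1.0, 1.0]))     # duration dominates
print(weighted_knn_predict(train_x, train_y, query, weights=[0.0001, 1.0]))  # indicator dominates
```

With equal weights the large-scale duration feature decides the vote; down-weighting it lets the more discriminative indicator drive the result.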
Xin et al. (2018) detailed the ML and Deep Multilayered Representative Learning strategies that are employed in detecting network intrusion. They considered SVM, KNN, Decision Trees, Deep Belief Networks (DBN), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN) in their study. They highlighted problems such as the unavailability of benchmark datasets, inconsistent evaluation metrics, and insufficient measurement of the efficiency of the algorithms. Feng et al. (2018) use ML approaches to detect distributed cyber-attacks. The work focuses on identifying Command and Control (C&C) communication between the C&C server and the compromised bots. The C&C contact occurs in the preparation stage of distributed attacks. The authors used 55 features to identify C&C traffic so that DDoS attacks can be detected early. They mainly used PCA and SVM for feature selection, and SVM and RF to build the classifier. The experiment focused on decreasing the number of features used and finding the critical features necessary for the early detection of C&C communication. The study concluded that although more features can be used in the detection, once the count reaches around 40 the detection performance does not vary much.
The literature indicates that machine learning algorithms are well suited to detecting phishing attacks, since such attacks share most of their characteristics (Lakshmanarao et al., 2021). Many ML-based results have been published for thwarting phishing attacks. However, the existing ML-based solutions have long response times and high false-positive rates, and they involve (unauthenticated) third-party information. Gupta et al. (2021) proposed a solution that detects URL phishing attacks in a real-time environment. The authors used well-known techniques such as Random Forest, Spearman correlation, and K best for identifying phishing attacks. The proposed work used nine lexical features to achieve high accuracy with Random Forest at a very low response time. The authors studied the response time in detail, including the time for feature extraction, dataset preparation, loading of modules, and predicting whether a URL is valid or phishing. The authors concluded that the Random Forest algorithm has the highest response time and SVM has the minimum response time.
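Lexical features of the kind mentioned above are computed from the URL string alone, without fetching the page or consulting third parties. The sketch below is illustrative only: the nine features used by Gupta et al. (2021) are not listed in this survey, so the feature names here are assumptions chosen for demonstration.

```python
from urllib.parse import urlparse

def lexical_features(url):
    """Extract simple, hypothetical lexical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "path_depth": parsed.path.count("/"),
    }

features = lexical_features("http://secure-login.example.com/account/verify")
print(features)
```

A feature dictionary like this would then be passed to a classifier such as Random Forest; that step is omitted here.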
The effectiveness of other classification algorithms is verified by Iyer et al. (2021). The classification algorithms used are DT, K-NN, SVM, Logistic Regression (LR), RF, and ensemble learning. Iyer et al. (2021) applied fusion classifiers based on two priority-based algorithms, Priority Algorithm 1 (PA1) and Priority Algorithm 2 (PA2). A final fusion is then applied based on the priorities obtained in PA1 and PA2 to achieve an accuracy of 97%.
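The general idea of fusing the outputs of several classifiers can be sketched with plain majority (hard) voting. This is a generic ensemble technique used here only for illustration; it is not the PA1/PA2 priority-based fusion of Iyer et al. (2021), and the per-classifier predictions below are made up.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse label lists from several classifiers by per-sample majority vote.

    `predictions` is a list of equally long label lists, one per classifier.
    """
    fused = []
    for labels in zip(*predictions):
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused

# Hypothetical outputs of three base classifiers on four URLs.
dt_pred = ["phish", "legit", "phish", "legit"]
knn_pred = ["phish", "phish", "legit", "legit"]
svm_pred = ["legit", "phish", "phish", "legit"]

fused = majority_vote([dt_pred, knn_pred, svm_pred])
print(fused)
```

Using an odd number of binary classifiers avoids ties in the vote.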
Phishing prediction can be done using different machine learning methods such as SVM, Decision Tree, Random Forest, Naive Bayes, Bayesian Classification, K-Nearest Neighbor, and Artificial Neural Networks. The features are classified as source code features, URL features, and image features, and these are based on rules (Singh, 2020; Tang and Mahmoud, 2021; Alam et al., 2020). Random Forests and Decision Trees are used for detecting phishing attacks. The datasets are collected from Kaggle, and feature selection is performed with Principal Component Analysis (PCA), which identifies and classifies the dataset components. Decision Trees are used to categorize the website, and Random Forest is used for classification. High accuracy was achieved with Random Forest. Xiujuan et al. (2019) propose spear-phishing email detection based on authentication (SPBA), which uses personality, stylometric, and gender features extracted from the emails of the same sender to create an identity portrait model of the sender. For authentication, KNN, SVM, and Random Forest are used as classifiers. The real portrait of the sender is then compared with the portrait of the uncertain email. If they are identical, the email is treated as normal; otherwise, it is classified as spear phishing from a disguised sender. This approach outperforms PHILFER and FSSPD in detection rate and accuracy. Apruzzese et al. (2018) showed that ML algorithms are used in many problems, whereas DL algorithms are mainly used for malware investigation and less for intrusion detection. Unsupervised DL algorithms are used in spam detection. The results provided strong evidence that ML techniques have shortfalls in their effectiveness for cyber security. A lack of human supervision can allow professional attackers to penetrate an enterprise, steal its data, and even vandalize it. The authors concluded that ML methods are prone to adversarial attacks, that the algorithms need continuous re-training, and that the parameters need to be carefully tuned (Apruzzese et al., 2018). An attacker can perform adversarial attacks on machine learning algorithms during the training or the testing (inference) period (Chaudhary et al., 2020).
Adversarial Machine Learning is the ML method that makes a model malfunction by providing wrong input to the model while it is being trained, which forces the model to make false predictions. The attack can be targeted, where a specific part of the training sample is targeted, or random, where any part of the training sample is targeted. In both cases, the ultimate goal is to make the model misclassify. Depending on the adversary's goals, the adversarial effect can be an integrity violation, an availability violation, or a privacy violation (Dixit and Silakari, 2021). A targeted attack on a neural network that leads to misclassification is referred to as an integrity violation. If the targeted system is unavailable to users for a certain period, it is called an availability violation. A privacy violation occurs if the adversary succeeds in compromising confidential information. However, adversarial examples can also be leveraged to enhance ML models' performance or robustness.
Research showed that ML techniques in IDS attain a high detection rate together with a low false positive rate. However, it is also observed that ML algorithms can misclassify network data due to poison learning (Sharma et al., 2016; Xin et al., 2018).
The process of making machine learning algorithms perform an undesirable activity or function is referred to as an adversarial machine learning attack. Such attacks are categorized as (Liang et al., 2019): 1) poisoning (also known as a causative attack), 2) evasion attack, and 3) exploratory attack. A poisoning attack is a kind of adversarial attack in which the adversary manipulates the training dataset of a machine learning model. The adversary supplies carefully designed training data, which is introduced into the system at the training stage. The contaminated (poisoned) dataset causes incorrect model behavior and therefore a decrease in performance, which affects the accuracy of the system. Poisoning attacks can be of two types: poisoning that changes the features (labels) and poisoning that does not change the features (Chaudhary et al., 2020). In an exploratory attack, the adversary learns the model algorithm and can manipulate the parameters of the system to reach their goals (Yu and Deng, 2010).
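How injected training data degrades a model can be shown with a toy example. The sketch assumes a one-dimensional nearest-centroid classifier and a handful of made-up samples; it illustrates the poisoning idea in general and does not correspond to any specific attack from the cited works.

```python
def centroid(values):
    return sum(values) / len(values)

def train(samples):
    """Fit a nearest-centroid model; samples are (feature, label), 0 = benign, 1 = malicious."""
    c0 = centroid([x for x, y in samples if y == 0])
    c1 = centroid([x for x, y in samples if y == 1])
    return c0, c1

def predict(model, x):
    c0, c1 = model
    return 0 if abs(x - c0) <= abs(x - c1) else 1

def accuracy(model, samples):
    return sum(predict(model, x) == y for x, y in samples) / len(samples)

clean_train = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
test_set = [(2.5, 0), (4.0, 0), (6.0, 1), (7.5, 1)]

# Crafted points injected by the adversary: extreme values labeled as benign,
# which drag the benign centroid away from the real benign samples.
poison = [(20.0, 0), (20.0, 0), (20.0, 0)]

clean_acc = accuracy(train(clean_train), test_set)
poisoned_acc = accuracy(train(clean_train + poison), test_set)
print(clean_acc, poisoned_acc)
```

Three injected points are enough here to move the benign centroid past the malicious one, so every test sample is then classified as malicious.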
Another commonly known attack is the evasion attack. In an evasion attack, malicious samples are misclassified as valid at test time. The attack targets the learned model during the testing phase and produces adversary-selected outputs. By altering the test samples, the adversary can pass through the test process, and the model produces incorrect output (Liang et al., 2019). Evasion attacks can be classified into three types. A black-box attack is the most frequently used type, where the attacker has zero knowledge of the ML/DL model. In a white-box attack, the attacker has access to the parameters of the model, whereas in a grey-box attack the attacker has moderate knowledge of the model (Dixit and Silakari, 2021; Taheri et al., 2020). Testing-phase attacks include DeepFool, the Fast Gradient Sign Method (FGSM), optimization-based methods, and the Jacobian-based Saliency Map Approach (JSMA) (Chaudhary et al., 2020). The techniques for thwarting attacks on ML models are categorized into four types: security assessment mechanisms, countermeasures in the training and testing stages, data security, and privacy. Examples of defensive techniques include Adversarial Training, Ensemble Method, Data Deduplication, Secure Data Deduplication, Data Sensitization, Reject on Negative Impact (RONI), Identity Based Encryption, Defense Distillation, Differential Privacy, Blockchain Based Solution, and Homomorphic Encryption (Chaudhary et al., 2020). Guo et al. (2021) also propose a black-box attack method for models that detect anomalous network flows using machine learning algorithms. The proposed black-box adversarial example generation method applies a white-box attack to a substitute model. The target model and the substitute model are trained identically on the KDD99 dataset and the CSE-CIC-IDS2018 dataset. The attacker launches a white-box attack on the substitute model. The crafted adversarial examples are then fed to the target model to check whether they cause it to misclassify. The experimental results showed that the authors effectively generated adversarial examples based on network flows, which can mislead machine learning-based detection models.
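Among the testing-phase attacks listed above, FGSM is the simplest to write down: the input is moved by a fixed step in the direction of the sign of the loss gradient. The sketch below applies one FGSM step to a logistic-regression detector in a white-box setting. The weights, the sample, and the step size are hypothetical, and the example is not the procedure of Guo et al. (2021).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability that x is malicious under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(value):
    return (value > 0) - (value < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(dLoss/dx) for binary cross-entropy."""
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]  # gradient of the loss with respect to the input
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w, b = [2.0, -1.0], -0.5  # hypothetical, already-trained parameters (white-box access)
x, y = [1.0, 0.5], 1      # a sample the model scores as malicious (true label 1)

x_adv = fgsm(w, b, x, y, eps=0.6)
print(predict_proba(w, b, x), predict_proba(w, b, x_adv))
```

In the substitute-model setting described above, the same step would be computed on the substitute and the resulting sample replayed against the target model.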
In general, adversaries use Adversarial Machine Learning (AML) algorithms so that machine learning algorithms misclassify the benign sample. The main aim is to make a machine learning model malfunction. For this, adversaries use poison data, which may exploit particular vulnerabilities and compromise the outcomes. Some of the AML models are DroidAPIMiner, Mystique, PIndroid, and DroidChameleon, which reduce the detection rate of machine learning classifiers (Taheri et al., 2020). The adversarial classification can be a false negative or a false positive. In a false positive, the attacker crafts a negative instance so that it is wrongly classified as positive. In contrast, in a false negative, benign data is added to malware so that it can bypass detection (Taheri et al., 2020). Taheri et al. (2020) propose an Ant Colony Optimization (ACO) algorithm to produce poison malware samples. In this approach, a linear regression algorithm is first applied to choose the malware instances that are almost identical to the benign examples in the training dataset. Next, the ACO function is used to find the adversarial sample data. The ACO pheromone value is the number of features changed. The algorithm starts with one feature, and new samples are produced by modifying the malware samples without attributes present in legitimate applications. This is repeated using additional features. The distance between the newly generated sample and the discriminator is estimated. If it is within the specified range between the malware and the discriminator, the newly generated sample is added to the set of generated samples; otherwise, the sample is discarded. The feature values are changed and the distance is recalculated. This is repeated until the maximum number of iterations is reached or the classifier misclassifies the malware samples.
The domain names generated by DGAs are generally detected by extracting features of the DNS traffic and statistical characteristics of the domain name language. ML algorithms then analyze the extracted features to identify and classify the DGA domain names (Chen et al., 2021). Nowadays, ML approaches are susceptible to adversarial instances produced by Generative Adversarial Networks (GANs). A GAN is an unsupervised ML technique that combines a generator and a discriminator (Gümüşbaş et al., 2020; Rao et al., 2020). GANs pose severe problems for security-critical cybersecurity applications, and more work is required to study the effect of adversarial examples in cybersecurity. The generator produces data from a random distribution that could easily be mistaken for real data, and the discriminator separates real data from fake data. They learn the data distribution through unsupervised methods (Gümüşbaş et al., 2020). The generator is a convolutional neural network and the discriminator is a deconvolutional neural network. The data produced by the generator matches the probability distribution of the training data, whereas the discriminator distinguishes the training data from the generated data (Rao et al., 2020). The generated samples can increase detection performance. GANs can be used to address missing data problems (Ren and Xu, 2019) and to generate the negative samples needed to train deep networks. Zhang et al. (2020) suggested a brute-force black-box method for attacking systems that rely on machine learning. The proposed method targets Network Intrusion Detection Systems (NIDS), since ML techniques are vulnerable to adversarial examples. The Brute-Force Attack Method (BFAM) framework evaluates the resilience of the ML classifiers used for detection in cyber security. It uses the confidence scores from the target classifiers to develop the adversarial examples, so BFAM can be used for other adversarial attacks in cyber security. To exploit the excellent performance of Wasserstein GAN (WGAN), the authors used it in their GAN model. Other GAN models, such as MalGAN by Kim et al. (2018) and IDSGAN by Lin et al. (2020), are capable of generating adversarial malware that misleads ML-based malware detection systems.
Table 7 summarizes the various ML algorithms discussed in this section. ML techniques can be used efficiently for defending against cyberattacks; moreover, ML-based systems can also be used offensively to mount all types of attacks. Kamoun et al. (2020) studied various AI/ML models for cyber security defense. The authors also list the misuse of AI/ML itself for cyber security threats. Because AI/ML models, frameworks, and tools are generally available as open source, hackers can easily adapt them for their benefit. Adversarial AI/ML-based attack models are characterized by speed, automation, scale, and sophistication. Based on the activities or actions involved, Kamoun et al. (2020) categorize AI/ML-powered cyberattacks into Probing, Scanning, Spoofing, Flooding, Misdirection, Execution of malicious processes, and Bypassing. Nguyen and Armitage (2008) systematically explain that ML algorithms perform differently for different applications of cyber security. The authors concluded that it is better to use a combination of classification models (Nguyen and Armitage, 2008).

Deep Learning Solutions to Cyber Security
Deep Learning (DL) is considered a sub-category of ML that builds a layered neural network to simulate human intelligence for coherent thinking (Martínez et al., 2019; Hu and Tan, 2017). Deep learning algorithms have been shown to overcome constraints of machine learning algorithms. Their benefit over traditional machine learning algorithms (Aslan and Yilmaz, 2021) is that high-level features are generated automatically from existing features.
DL algorithms lower the requirements for feature engineering and feature space. They perform well in supervised, unsupervised, and semi-supervised learning. DL algorithms process enormous datasets and handle unstructured data efficiently. They play a vital role in solving problems in various research domains: image processing, bioinformatics, game playing, speech recognition, object detection, segmentation, classification, pattern recognition and matching, Customer Relationship Management automation, vehicle automation systems, etc. (Karatas et al., 2018; Mahdavifar and Ghorbani, 2019; Rodriguez et al., 2021).
The robustness, speed, accuracy, and ability of deep learning techniques to handle extensive data have drawn researchers' attention in recent years.
Deep Learning (DL) algorithms efficiently detect advanced cyber security threats, and there is evidence that DL techniques can be applied to cybersecurity problems. Deep learning algorithms can identify known and unknown attacks, and they can manage incomplete, inconsistent, and composite data (Geluvaraj et al., 2019). Lakshminarayana et al. (2019) and Kim and Aminanto (2017) studied various DL algorithms and classified them into Generative (Unsupervised), Discriminative (Supervised), and Hybrid. Table 8 describes some of the DL techniques under these categories (Sarker, 2022). Aslan and Yilmaz (2021) suggested a framework for DL models and explored the use of deep learning models for detecting several cybersecurity problems such as Intrusion, Malware, Spam and Phishing, and Website Defacement. The authors used generative deep learning models in preference to discriminative or hybrid approaches and highlighted the advantages of semi-supervised learning for unlabeled data.
Compared with classic ML techniques, deep networks can learn features automatically from data, which reduces the effort of pre-processing the input data and removes the reliance on human-engineered features. This makes deep learning algorithms suitable for many real-time processing tasks. However, the performance of DL algorithms declines if they are not provided with a sufficient amount of appropriate training data (Mahdavifar and Ghorbani, 2019; Yu et al., 2019). Existing machine learning techniques do not scale to huge volumes of data, and detecting cyberattacks across large numbers of loosely coupled devices is a great challenge. It is observed that ML techniques are inefficient in detecting intrinsic attacks or unidentified malware and are very poor at preserving users' privacy (Sapre et al., 2021).
DL methods can overcome the drawbacks of ML models in existing cyber security solutions. DL has the potential to handle complex patterns and to build robust and reliable models. DL techniques are faster and more accurate in processing because their self-learning capabilities improve both the processing speed and the accuracy of applications (Imamverdiyev and Abdullayeva, 2020). DL methods are suitable for Malware Detection, Network Intrusion Detection, DDoS attacks, Phishing/Spam Detection, Behavior Anomaly Detection, Botnet Detection, and Website Defacement Detection (Chen, 2020). Lee et al. (2019) used various artificial neural network methods, such as CNN, LSTM, and FCNN, to develop an Artificial Intelligence-Security Information and Event Management (AI-SIEM) system. The proposed model can discriminate between true positive and false positive alerts, which enables cyber security analysts to identify cyber threats and defend against them quickly. The authors inferred that AI-SIEM is relevant to learning-based network intrusion detection models. They also concluded that multiple deep learning approaches could be used efficiently to enhance threat prediction and avoid cyber-attacks.
The research by Xin et al. (2018) and Karatas et al. (2018) highlights the differences between DL and ML techniques used for cybersecurity. DL algorithms perform well when a large volume of data is available, and they require high-performance machines with GPUs, which ML algorithms do not need. In machine learning, feature extraction is done by an expert, whereas in deep learning the algorithm tries to extract the features automatically. The performance of an ML algorithm depends on the accuracy of the extracted features, which is not the case for DL algorithms. With respect to problem solving, ML divides the problem into sub-problems and then solves those sub-problems, whereas DL algorithms solve the problem end to end. The training period is longer for DL models, but their testing time is much shorter than that of ML algorithms; this is reversed in the case of ML models. Machine learning algorithms can run on any normal CPU, but high-performance machines are required to run deep learning algorithms. Feature extraction is manual in ML approaches, whereas deep learning algorithms automatically extract abstract and flexible features by generalization in classification (Mahdavifar et al., 2020). Hossain et al. (2020)  It is not easy to find anomalous features in extensive network traffic samples. Feedforward neural network autoencoders are well suited to network anomaly detection, since they are simple to train to reconstruct their input at the output (Xu et al., 2021). Autoencoders are an unsupervised neural network learning approach. They reduce dimensionality by compressing the input data and rebuilding the output from the compressed representation, and in doing so they can discover structure within the data. Xu et al. (2021) presented a novel autoencoder-based method consisting of five layers for detecting anomalous traffic in the network. The approach transforms the input dataset into datasets that are balanced with respect to data size and data types by removing outliers, which avoids bias in anomaly detection. In the 5-layer architecture, the hidden layer has an optimized number of neurons, and the latent space layer provides the best performance compared to other architectures.
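The compress-and-reconstruct principle behind autoencoder-based anomaly detection can be sketched with the smallest possible case: a linear autoencoder with one latent unit and tied weights, trained by gradient descent on "normal" points only. This is a deliberately simplified assumption and is not the five-layer architecture of Xu et al. (2021); the two-feature data is synthetic.

```python
def encode(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))  # one-dimensional latent code

def reconstruct(w, x):
    z = encode(w, x)
    return [z * wi for wi in w]                  # decoder reuses (ties) the encoder weights

def recon_error(w, x):
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(w, x)))

def train_autoencoder(data, w, lr=0.01, epochs=300):
    """Minimize the squared reconstruction error with per-sample gradient steps."""
    for _ in range(epochs):
        for x in data:
            z = encode(w, x)
            r = [xi - z * wi for xi, wi in zip(x, w)]  # residual x - x_hat
            rw = sum(ri * wi for ri, wi in zip(r, w))
            grad = [-2.0 * (z * ri + rw * xi) for ri, xi in zip(r, x)]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Synthetic "normal" traffic: the second feature is always twice the first.
normal_traffic = [[t, 2.0 * t] for t in (-1.0, -0.5, 0.5, 1.0)]
w_trained = train_autoencoder(normal_traffic, [0.5, 0.5])

print(recon_error(w_trained, [1.0, 2.0]))   # follows the learned pattern
print(recon_error(w_trained, [2.0, -1.0]))  # violates the learned pattern
```

A sample is flagged as anomalous when its reconstruction error exceeds a threshold chosen on normal data; the threshold selection is left out of this sketch.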
Autoencoders can also be used for feature learning and feature extraction. Andresini et al. (2020) used multi-channel deep feature learning to detect intrusions in the system. Their MINDFUL (MultI-chanNel Deep FeatUre Learning) framework uses autoencoders. Autoencoders are also used by Hindy et al. (2020) for detecting zero-day attacks. This study tries to overcome the drawback of outlier-based zero-day detection, which has high false-negative rates. The authors built an IDS model to reduce the false negative rate (i.e., miss rate) while achieving high recall (i.e., true-positive rate). The authors used the CICIDS2017 and NSL-KDD datasets and reported excellent accuracy compared to the One-Class Support Vector Machine (SVM).
An unsupervised Stacked Auto Encoder (SAE) is combined with weighted feature selection (Kim and Aminanto, 2017) to improve the feature learning process for IDS. The authors described SAE as efficient and valuable for feature extraction, clustering, and classification, and used it for classification and clustering. The results are validated using the Aegean Wi-Fi Intrusion Dataset (AWID), which consists of benign, injection, impersonation, and flooding classes. The authors concluded that an IDS using SAE as a classifier resulted in a low impersonation detection rate. Thus, SAE is better used as a feature extractor than as a classifier.
An exhaustive investigation of deep learning-based intrusion detection is presented by Otoum et al. (2019). The work builds on the Adaptively Supervised and Clustered Hybrid (ASCH-IDS) methodology (Otoum et al., 2018). The intrusion detection model, Restricted Boltzmann-based Clustered IDS (RBC-IDS), is intended for critical applications based on Wireless Sensor Networks (WSN). The results showed that an ML-based IDS is desirable when it is comparable to a DL-based IDS with respect to accuracy, training time, and testing time for WSN-based critical infrastructure monitoring. Alom et al. (2015) performed a series of experiments on intrusion detection using DBN. With these experiments, they could identify unknown attacks and, after 50 iterations, achieved an accuracy of 97.5%. Djellali et al. (2019) compared two deep learning training techniques, Batch Gradient Descent and Stochastic Gradient Descent, and tested them with a resampling method for cybersecurity. Batch Gradient Descent is an iterative technique that uses the complete set of input training patterns to optimize a cost function. In Stochastic Gradient Descent, the input training patterns are randomly selected to update the weights. The authors concluded that Stochastic Gradient Descent provides an efficient optimization algorithm for cybersecurity, with good performance and lower computational cost. Sohn (2021) presented a survey that describes the basics of the DBN-based intrusion detection model. The author compares the fundamental algorithms, the different training methods, and the datasets, and interprets the results of various research works published since 2016. DBN-based intrusion detection has used the ADFA, NSL-KDD, UNSW-NB15, and KDD Cup 99 datasets. The DBN-IDS framework consists of components such as a data preprocessor, training, classifier, optimizer, and fine-tuning algorithm (Sohn, 2021).
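The difference between the two update rules described above can be made concrete on a one-parameter least-squares problem. This is a minimal sketch under assumed settings (noise-free synthetic data, a fixed learning rate, a seeded random generator); it is not the experimental setup of Djellali et al. (2019).

```python
import random

def batch_gd(xs, ys, lr=0.01, steps=500):
    """Each update uses the gradient averaged over the complete training set."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

def stochastic_gd(xs, ys, lr=0.01, steps=500, seed=0):
    """Each update uses the gradient of a single randomly selected pattern."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))
        w -= lr * 2.0 * (w * xs[i] - ys[i]) * xs[i]
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]  # synthetic targets generated with a true weight of 3

w_batch = batch_gd(xs, ys)
w_sgd = stochastic_gd(xs, ys)
print(w_batch, w_sgd)
```

The stochastic variant touches one pattern per update, which is where its lower per-step cost comes from; on noisy data its path to the optimum would fluctuate more than the batch variant.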
Malicious activity is detected in network traffic using anomaly detection, and many deep learning techniques have been proposed for this purpose. To detect anomalies, Kim et al. (2018) designed a group of approaches based on Variational Autoencoder (VAE), Fully Connected Network (FCN), and LSTM Seq2Seq structures and concluded that deep learning methods are a sound choice for network anomaly detection. The authors evaluated the proposed architectures on various public traffic datasets, including IDS2017, UNSW-NB15, Kyoto-Honeypot, and NSL-KDD. In data preprocessing, numerical features are normalized using the z-score, and categorical features are converted to numerical ones by one-hot encoding. The preprocessed data is fed into the network for training. The authors used ReLU as the activation function in the hidden layers. The Softmax layer, trained with a cross-entropy cost function, produces the final output, which can be either normal or attack. Next, two variants of the VAE model, VAE-Pure and VAE-FCN, are tested. The original data and the detected data are compared to calculate the loss. The LSTM-Seq2Seq model is based on RNN and yields a target sequence and conditional probability through an encoder and decoder. The LSTM Seq2Seq structure showed a promising result of 99% binary classification accuracy on both the NSL-KDD dataset and the Kyoto University Honeypot data ("Kyoto-Honeypot"). SVM and RF show lower accuracy on the NSL-KDD dataset and high accuracy on the UNSW-NB15 dataset. Maimó et al. (2018) proposed a two-level DL model that acts as a robust system for detecting anomalies and defending against cyber-attacks in a 5G mobile network architecture. At the first level, a supervised or semi-supervised learning method is used to implement a DBN or an SAE operating on every RAN. At the second level, a supervised LSTM recurrent network is used to confine the cyberattacks.
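The preprocessing and output steps mentioned above (z-score normalization, one-hot encoding, softmax with cross-entropy) can be written out in a few lines. The sketch below uses the population standard deviation and a sorted category order, both of which are assumptions; it does not reproduce the pipeline of Kim et al. (2018), and the sample values are invented.

```python
import math

def z_score(values):
    """Normalize a numerical feature column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std else 0.0 for v in values]

def one_hot(values):
    """Turn a categorical feature column into 0/1 indicator vectors."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    return -math.log(probs[true_index])

durations = z_score([2.0, 4.0, 6.0])        # hypothetical numerical feature
protocols = one_hot(["tcp", "udp", "tcp"])  # hypothetical categorical feature
probs = softmax([0.0, 0.0])                 # two classes: normal, attack
print(durations, protocols, probs, cross_entropy(probs, 1))
```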
For anomaly detection with multi-dimensional input, little investigation has been done using Convolutional Neural Networks (CNN) (Alabadi and Celik, 2020). Although deep learning mechanisms are well suited to anomaly detection, the challenges are to identify threats faster and to profile the traffic automatically. The traffic profile includes flow statistics such as transmission rate, packet count, and flow size. In a CNN, these kinds of features are extracted automatically from the traffic profile. In Hwang et al. (2020), the traffic patterns are built by inspecting the starting bytes of the first few packets of a flow. Because only the first few packets are used for anomaly detection, the speed of threat detection increases. The proposed system uses the CNN module to learn the features of the source data automatically. The model achieves 99.77% accuracy in detecting malicious activity, with FNR and FPR below 1%. The dataset comprises four DDoS attack classes: HTTP flood, ACK flood, UDP flood, and SYN flood (Hwang et al., 2020).
A machine learning approach that combines deep learning with Reinforcement Learning (RL) is named Deep Reinforcement Learning (DRL). Apruzzese et al. (2020) used DRL mechanisms to propose a design approach that protects botnet detectors from adversarial attacks. The strategy leverages DRL to improve the robustness of detectors. The botnet detectors use the Wide and Deep (WnD) and Random Forest (RF) classifiers. The agents in the proposed model are based on deep reinforcement learning approaches, namely Double Deep Q-Network (2DQN) and Deep State-Action-Reward-State-Action (Sarsa), which use off-policy and on-policy methods, respectively. In the next phase, the trained DRL agent produces adversarial samples that can evade a botnet detector. Through adversarial training, the model uses these samples to harden the botnet detectors.
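The off-policy versus on-policy distinction mentioned above comes down to which next-state value the update bootstraps from. The sketch below shows the two targets in their basic tabular form; it is a simplification that omits the neural networks and the double-Q refinement used in 2DQN, and all numbers are illustrative.

```python
def q_learning_target(reward, gamma, next_q_values):
    """Off-policy target: bootstrap from the best next action, whichever is taken."""
    return reward + gamma * max(next_q_values)

def sarsa_target(reward, gamma, next_q_values, next_action):
    """On-policy target: bootstrap from the action the agent actually takes next."""
    return reward + gamma * next_q_values[next_action]

def td_update(q_old, target, alpha):
    """Move the current estimate a fraction alpha toward the target."""
    return q_old + alpha * (target - q_old)

next_q = [0.5, 2.0]  # hypothetical action values in the next state
off_policy = q_learning_target(reward=1.0, gamma=0.9, next_q_values=next_q)
on_policy = sarsa_target(reward=1.0, gamma=0.9, next_q_values=next_q, next_action=0)
print(off_policy, on_policy, td_update(0.0, off_policy, alpha=0.5))
```

When the agent explores and picks the lower-valued action, the Sarsa target reflects that choice while the Q-learning target does not.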
Deep Neural Networks are well suited to detecting Domain Generation Algorithms, as they can efficiently classify domain names as malicious or benign (Yu et al., 2019).
For DGA detection, the authors examine the advantages of labeled data for training DL classifiers. For this, they used RNN, LSTM, CNN, and hybrid CNN/RNN models. Shahzad et al. (2021) used RNN architectures such as Bidirectional LSTM (Bi-LSTM), Long Short-Term Memory networks (LSTM), and Gated Recurrent Units (GRU) to evaluate the performance of a DGA classifier. The suggested DGA classifier takes the domain names from DNS queries and does not require manual feature creation. Without any contextual information, the model performs multiclass classification to determine the DGA family to which a domain belongs.
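Classifiers that work on raw domain names without manual features typically start by turning each name into a fixed-length sequence of character indices, which a recurrent or convolutional model then consumes. The following sketch shows only that encoding step; the vocabulary, the padding length, and the handling of unknown characters are assumptions and are not taken from Shahzad et al. (2021).

```python
import string

VOCAB = string.ascii_lowercase + string.digits + "-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(VOCAB)}  # 0 is reserved for padding
UNKNOWN_ID = len(VOCAB) + 1                             # any character outside VOCAB

def encode_domain(domain, max_len=16):
    """Map a domain name to a fixed-length list of integer character ids."""
    ids = [CHAR_TO_ID.get(ch, UNKNOWN_ID) for ch in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

print(encode_domain("ab.c", max_len=6))
print(encode_domain("xkqjzv3f9a.com"))
```

The resulting integer sequences would normally be passed through an embedding layer before the LSTM or GRU; that part is outside this sketch.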
Compared with classical ML algorithms, DL algorithms are better suited to malware detection (Apruzzese et al., 2018; Li et al., 2020). The reason is that the performance of ML algorithms diminishes as the data size increases, whereas DL algorithms continue to improve with larger inputs. As malware multiplies along with technology, malware detection must cope with scalability issues. Vinayakumar et al. (2019) proposed a hybrid, scalable deep learning framework named ScaleMalNet, which handles large volumes of malware samples. The model collects the malware samples and pre-processes them in a distributed way. In the first stage, executable files are classified as benign or malware using static and dynamic analysis. In the second stage, the malware executables are grouped into their families. Although the work does not focus on the robustness of the DL techniques, the authors conclude that deep learning architectures outperform classical machine learning models.
To classify malware, Aslan and Yilmaz (2021) proposed a framework based on a hybrid deep neural network. This hybrid approach combines several pretrained network models, and the test results showed that the suggested framework classifies malware with improved precision, recall, accuracy, and F-score.
Mahdavifar et al. (2020) present a framework for Android malware category classification. The framework performs dynamic malware category classification using semi-supervised deep neural networks. The experimental results show an improved F1-score and a false positive rate of 2.76%, outperforming typical machine learning algorithms. The input layer consists of 470 neurons and the output layer of 5 neurons. The sigmoid function is used for activation, and mini-batch gradient descent is used for optimization.
Like machine learning algorithms, deep learning methods are affected by adversarial attacks; deep learning models are fragile under such attacks (Li et al., 2020). Adversarial attacks can be white-box, gray-box, or black-box. Many attack algorithms have been proposed for generating adversarial samples under these threat models. They include DeepFool, the Fast Gradient Sign Method (FGSM), optimization-based methods, the Jacobian-based Saliency Map Approach (JSMA), the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm, the Basic Iterative Method (BIM)/Projected Gradient Descent (PGD), Carlini and Wagner (C&W) attacks, and the Distributionally Adversarial Attack (DAA) (Ren et al., 2020; Li et al., 2021).
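Of the listed algorithms, FGSM is the simplest to illustrate: it perturbs each input feature by a fixed step `eps` in the direction of the sign of the loss gradient with respect to the input. The sketch below applies it to a two-feature logistic-regression model, for which the input gradient of the cross-entropy loss is `(p - y) * w`. The weights, the sample, and `eps` are made-up values chosen for illustration; a real attack would target a trained deep model.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: List[float], b: float, x: List[float]) -> float:
    """Probability that x belongs to the positive (attack) class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(w: List[float], b: float, x: List[float], y: int,
         eps: float) -> List[float]:
    """One FGSM step: x_adv = x + eps * sign(dLoss/dx)."""
    p = predict_proba(w, b, x)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and sample (true label 1 = attack).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.6)
```

With these values the original sample is scored above 0.5 and the perturbed one below 0.5, so the attack input would be classified as benign.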
Intrusion Detection Systems based on deep neural networks are susceptible to attacks in white-box and backdoor adversarial scenarios (Alrawashdeh and Goldsmith, 2020). Much research has been carried out in this field. One such work investigates how adversarial examples affect Intrusion Detection Systems built on Deep Neural Networks (DNN) (Yang et al., 2018). The authors show that an adversary can generate adversarial examples that mislead the DNN model even when the model's internal information is hidden from the adversary. These adversarial examples are generated and evaluated in the black-box setting: although the internal details of the model are not accessed, the adversary can still cause the classifier to misclassify attack input as normal input. Shi and Sagduyu (2017) studied how to launch and defend against evasion and causative attacks on a machine learning classifier, building on a DL-based exploratory attack. Initially, the adversary uses an exploratory attack based on Deep Learning (DL) to build a classifier similar to the original one. Samples are collected from the built classifier and given to the original classifier. In an evasion attack on the trained classifier, the adversary tries to deceive the machine learning algorithm by providing crafted input data that receives the wrong label, so the samples are misclassified. In a causative attack, the adversary provides the target classifier with false class information during training, thus reducing the precision of the trained classifier. The study demonstrated that the evasion attack increases the error in the test phase, whereas the causative attack increases it through the training phase. The authors concluded the work by providing an aggressive defense mechanism with small perturbations, showing that the error under attack is identical to the error when there is no attack. Li and Li (2020) propose a mixture of attacks to produce adversarial malware examples. For this, the authors use multiple generative methods and manipulation sets. To validate the robustness of malware detectors, the authors use 26 evasion attacks, categorized into gradient-based, gradient-free, transfer, obfuscation, and mixture-of-attacks approaches. Table 9 summarizes the various DL algorithms discussed in this section.
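The effect of a causative (training-time) attack can be shown with a toy example in which the adversary flips the labels of some training samples. The nearest-centroid classifier, the one-dimensional data, and the choice of flipped samples below are assumptions made purely for illustration; they are not the setup of Shi and Sagduyu (2017).

```python
from typing import List, Set, Tuple

Sample = Tuple[float, int]  # (feature value, label); 1 = attack, 0 = benign


def fit_centroids(train: List[Sample]) -> Tuple[float, float]:
    """Return the mean feature value of the benign and the attack class."""
    benign = [x for x, y in train if y == 0]
    attack = [x for x, y in train if y == 1]
    return sum(benign) / len(benign), sum(attack) / len(attack)


def predict(centroids: Tuple[float, float], x: float) -> int:
    c_benign, c_attack = centroids
    return 1 if abs(x - c_attack) < abs(x - c_benign) else 0


def error_rate(centroids: Tuple[float, float], test: List[Sample]) -> float:
    wrong = sum(1 for x, y in test if predict(centroids, x) != y)
    return wrong / len(test)


def flip_labels(train: List[Sample], targets: Set[float]) -> List[Sample]:
    """Causative attack: flip the label of every sample whose value is targeted."""
    return [(x, 1 - y) if x in targets else (x, y) for x, y in train]


train = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
test = [(2.5, 0), (4.0, 0), (5.5, 1), (7.5, 1)]

clean_error = error_rate(fit_centroids(train), test)
poisoned_train = flip_labels(train, {8.0, 9.0})
poisoned_error = error_rate(fit_centroids(poisoned_train), test)
```

Flipping two attack samples to benign drags the benign centroid toward the attack region, so one test attack is now missed even though the test data itself is unchanged.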

Performance Metrics
The essential metrics for estimating the performance of DL and ML techniques are the confusion matrix, precision, Detection Rate (DR) (also called recall or true positive rate), false negative rate, false positive rate, true negative rate, F1-score, and accuracy (Chaudhary et al., 2020). The Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC) are also used to estimate classification performance (Sohn, 2021).
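The AUC can be read as the probability that a randomly chosen attack sample receives a higher score than a randomly chosen benign sample. The sketch below computes it directly from that pairwise definition, counting ties as one half. The function name `auc_score` and the example scores are illustrative assumptions; for large datasets, a sorting-based routine from an established library would be preferable to this quadratic loop.

```python
from typing import List


def auc_score(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores (higher means "more likely an attack").
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = auc_score(labels, scores)
```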
In a dataset of arbitrary size, each element can be subject to binary or n-ary classification. In binary classification, an element is considered either an attack or benign. The attack class is treated as positive and the benign class as negative (Vinayakumar et al., 2019). A True Positive (TP) is an element from the positive class that the algorithm treats as positive. Similarly, a True Negative (TN) is an element from the negative class that the algorithm correctly treats as negative. A False Positive (FP) is an element identified as an attack when in actuality it is not. A False Negative (FN) is an attack that the algorithm fails to identify.
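The four outcomes can be counted with a few lines of code. In this minimal sketch, the label encoding (1 for attack, 0 for benign) and the example label lists are assumptions made for illustration.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP, and FN for binary labels (1 = attack, 0 = benign)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["TP"] += 1   # attack correctly flagged
        elif truth == 0 and pred == 0:
            counts["TN"] += 1   # benign correctly passed
        elif truth == 0 and pred == 1:
            counts["FP"] += 1   # benign wrongly flagged (false alarm)
        else:
            counts["FN"] += 1   # attack missed
    return counts


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
```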
Accuracy is measured as the fraction of elements that are correctly predicted:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

The Detection Rate (DR) is the fraction of attacks that are identified (Vinayakumar et al., 2019; Jayakumar et al., 2015):

$$\text{DR} = \frac{TP}{TP + FN} \tag{3}$$

The False Positive Rate (FPR) is the fraction of benign elements that are wrongly identified as attacks:

$$\text{FPR} = \frac{FP}{FP + TN} \tag{4}$$

Recall is the ratio of correctly classified attack data to the total count of attack data in a given dataset. The higher the recall, the better the performance of the machine learning model:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

The F1-Score (F1-Measure) is calculated as the harmonic mean of Precision and Recall. A higher F1-Score indicates that the machine learning algorithm performs well:

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$

A high false negative rate may indicate that the NIDS failed to identify known or unknown attacks. In contrast, a high false positive rate indicates false alarms generated when there is no attack in the network (Kilincer et al., 2021).

Some of the metrics used in generating Adversarial Examples (AEs) are the Total Time Cost (TTC), the Adversarial Detection Rate (ADR), and the Original Detection Rate (ODR) (Zhang et al., 2020). TTC is the total time required to build a set of AEs. The ODR measures the detection performance of the target classifiers against the original attack examples. The ADR measures the detection performance of the target classifiers against the adversarial attack examples:

$$\text{ODR} = \frac{\text{No. of correctly identified original attack examples}}{\text{No. of all original attack examples}} \tag{7}$$

$$\text{ADR} = \frac{\text{No. of correctly identified adversarial attack examples}}{\text{No. of all adversarial attack examples}} \tag{8}$$
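The ratios in this section follow directly from the four confusion-matrix counts, and ODR and ADR are plain detection ratios over the original and adversarial attack sets. The sketch below computes them; the function names, the zero-denominator convention in `safe_div`, and the example counts are assumptions for illustration.

```python
from typing import Dict, Sequence


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    detection_rate = safe_div(tp, tp + fn)   # also called recall or TPR
    precision = safe_div(tp, tp + fp)        # positive predictive value
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "detection_rate": detection_rate,
        "fpr": safe_div(fp, fp + tn),
        "precision": precision,
        "f1": safe_div(2 * precision * detection_rate,
                       precision + detection_rate),
    }


def detection_ratio(detected: Sequence[bool]) -> float:
    """Fraction of attack examples flagged by the classifier (used for ODR and ADR)."""
    return safe_div(sum(detected), len(detected))


metrics = classification_metrics(tp=3, tn=3, fp=1, fn=1)
# Hypothetical detector outcomes on four original and four adversarial examples.
odr = detection_ratio([True, True, True, False])
adr = detection_ratio([True, False, False, False])
```

A large gap between `odr` and `adr`, as in this made-up example, would indicate that the adversarial examples evade the detector far more often than the original attacks do.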

Conclusion
Nowadays, cyber security attacks are increasing tremendously. The prevailing cyber security attacks are DoS attacks, phishing, malware attacks, botnet evasion attacks, spoofing, R2L, probing attacks, and U2R attacks. This survey details the different cyber security attacks and the tools available for intrusion detection. It also identifies cyber security domains and significant research challenges. Many traditional approaches are inefficient in detecting, analyzing, and defending against cyberattacks. In recent years, it has become evident that ML and deep feature learning approaches handle cyber security attacks efficiently. This study reviewed several efficient machine learning and deep feature learning algorithms for solving many cyber security problems. The article also addresses adversarial attacks on machine learning and deep learning algorithms and the defense mechanisms against them. The survey gives insights into private and publicly available datasets that are significant for analyzing the effectiveness of algorithms proposed to defend against cyber security threats. The paper concludes with the various performance metrics used to estimate the efficiency of the suggested algorithms.

Sharafaldin et al. (2018a); Ring et al. (2017) of the Canadian Institute for Cybersecurity Intrusion Detection System (CICIDS) created this dataset in 2018. The dataset comprises normal data and attack data gathered over five days.

Table used a deep neural LSTM network to propose a DGA domain name detection model. Li et al. (2019) presented a model to handle DGA threats, as conventional malware control approaches (like blacklisting) cannot handle them. The paper focuses on a machine learning framework that can identify and detect DGA attacks. It also proposes a Deep Learning technique (DNN) to classify the large number of domain names. The study presents a machine learning framework with a two-level model and a prediction model. In the first level of classification, the paper identifies Decision Tree-J48 (DT-J48) as the best classifier among NB, ANN, LR, SVM, RF, and Gradient Boosting Tree (GBT) for classifying DGA domains. The DT-J48 classification algorithm worked with high accuracy and minimum classification time. For second-level clustering, the framework uses the DBSCAN algorithm, which is a density-based clustering method. Because the HMM model performs well, with a quick run time and high matching accuracy, it is used to analyze the clustering results. Compared with the DT-J48 classification algorithm, the DNN model works better for classifying large datasets. The work concludes that deep learning algorithms perform better than machine learning algorithms when classifying large datasets.
Precision (Positive Predictive Value) is the fraction of predicted attacks that are predicted correctly. It identifies how many of the elements classified as positive are actual attacks:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Table 2: Different attack classes with relevant features of the KDD CUP99 dataset

Table 5: Four well-known malware families and their top 5 features in the DREBIN dataset

Table 7: A bird's-eye view of state-of-the-art machine learning techniques for cyber security

Table 8: Three major categories of DL techniques

Table 9: A bird's-eye view of state-of-the-art deep learning techniques for cyber security

Table 9: Continued

in their work demonstrated that the LSTM Deep Learning approach outperforms Machine Learning classifiers like J48, RF, KNN, NB, DT, and algorithms to detect FTP and SSH brute-force invasions effectively. Ferrag et al. (2020) performed an exhaustive investigation of intrusion detection systems and datasets, along with a comparative analysis of various DL models. The authors used Deep Learning strategies like DNN, RNN, RBM, CNN, DBN, DBM, and DA for detecting intrusions such as Brute Force, DoS, DDoS, SQL Injection, and Botnet attacks, and compared them with machine learning approaches like RF, NB, SVM, and ANN with respect to the global detection rate. DBM, RNN, and CNN are the DL models incorporated for detecting network-based intrusions. Karatas et al. (2018) listed the components involved in IDSs to enhance network security. The components of an IDS are data collection, feature selection, and the decision engine. The third component is the critical one, where the collected data is classified as benign or malicious based on previous knowledge.