Private Island Networks Inc.

Machine Learning and Data Networks

A summary of recent articles and issues related to machine learning for data networks.

Overview

This article lists and summarizes relevant studies, papers, tools, and datasets related to Machine Learning (ML) for data networks. The objective is to tie together the ML capabilities of the Private Island® open source project with ongoing research and deployments in both industry and academia.

ML Concepts

"Physical AI lets autonomous systems like cameras, robots, and self-driving cars perceive, understand, reason, and perform or orchestrate complex actions in the physical world."
"A SmartNIC (Network Interface Card) is a specialized type of NIC that offloads certain network processing tasks from the host CPU."

Surveys / Primers

Primer paper that provides "an efficient start for anybody doing research regarding machine learning for networks or using networks for machine learning."
Summary article by a network giant
Summary of "fundamental techniques, specific frameworks, and access to relevant datasets"
"This paper explores the crucial role of AI and ML in enhancing cybersecurity defenses against increasingly sophisticated cyber threats, while also highlighting the new vulnerabilities introduced by these technologies. Through a comprehensive analysis that includes historical trends, technological evaluations, and predictive modeling, the dual-edged nature of AI and ML in cybersecurity is examined. Significant challenges such as data privacy, continuous training of AI models, manipulation risks, and ethical concerns are addressed."
"Explores the application of ML in selecting and optimizing cybersecurity models for enterprise ICT systems"

ML Community / Online Databases

"The platform where the machine learning community collaborates on models, datasets, and applications."
"Join over 29M+ machine learners to share, stress test, and stay up-to-date on all the latest ML techniques and technologies. Discover a huge repository of community-published models, data & code for your next project."
"Explore and extend models from the latest cutting edge research"

ML and Real-Time Inferencing on Ubuntu Linux

"Build enterprise-grade AI projects with secure and supported Canonical MLOps. Develop on your Ubuntu workstation using Charmed Kubeflow or Charmed MLFlow and scale up quickly with open source tooling in every part of your stack."
"Install a well-known model like DeepSeek R1 or Qwen 2.5 VL with a single command, and get the silicon-optimized AI engine automatically."
"An end-to-end demo that will walk you through setting up a scalable model training environment"
"This blog aims to provide an in-depth look at Ubuntu AI, covering fundamental concepts, usage methods, common practices, and best practices."

ML Frameworks

"Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters."
"OpenVINO is an open-source toolkit for deploying performant AI solutions in the cloud, on-prem, and on the edge alike. Develop your applications with both generative and conventional AI models, coming from the most popular model frameworks. Convert, optimize, and run inference utilizing the full potential of Intel hardware."

Network Data Sets for Machine Learning

"A comprehensive dataset derived from 40 weeks of traffic transmitted by 275,000 active IP addresses in the CESNET3 network—an ISP network serving approximately half a million customers daily."

Using ML to Detect & Thwart Denial of Service (DoS) Attacks

Figure 1: Network-based Machine Learning Utilizing Local PC
"A mathematical model for distributed denial-of-service attacks is proposed in this study. Machine learning algorithms such as Logistic Regression and Naive Bayes, are used to detect attacks and normal scenarios."
"This study proposes a machine-learning-based framework to enhance DDoS attack detection and mitigation, employing Random Forest, XGBoost, and Long Short-Term Memory (LSTM) models."
"A PCA-based Enhanced Distributed DDoS Attack Detection (EDAD) framework has been proposed. Various Machine Learning (ML) algorithms and feature selection techniques have been used to detect DDoS attacks. Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbours (KNN), Decision Tree (DT) supervised models, and Principle Component Analysis (PCA) feature selection method are used to differentiate between attack and regular traffic."
"Examines the application of twelve leading Machine Learning (ML) techniques, utilizing the Pycaret module, to effectively analyze Distributed Denial of Service (DDoS) attacks."
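
Several of the studies above use Naive Bayes as one of their baseline classifiers. As a rough illustration of that technique only, the sketch below trains a Gaussian Naive Bayes model on synthetic flow records. The two features (packets per second and mean packet size), their distributions, and every function name are assumptions made for this example; none of them come from the cited papers or from a real dataset.

```python
import math
import random
from statistics import mean, pvariance

def make_flows(rng, n, label):
    """Generate n synthetic flows as (packets_per_sec, mean_packet_bytes) pairs."""
    # Assumed, illustrative distributions: attack flows are fast with small packets.
    if label == 1:
        return [(rng.gauss(5000, 800), rng.gauss(90, 20)) for _ in range(n)]
    return [(rng.gauss(300, 120), rng.gauss(600, 150)) for _ in range(n)]

def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature Gaussian (mean, variance) for each class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        # Small constant keeps the variance strictly positive.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
        model[c] = (len(rows) / len(X), stats)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    best_class, best_score = None, -math.inf
    for c, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

rng = random.Random(42)  # fixed seed so the run is repeatable
train_X = make_flows(rng, 200, 0) + make_flows(rng, 200, 1)
train_y = [0] * 200 + [1] * 200
test_X = make_flows(rng, 50, 0) + make_flows(rng, 50, 1)
test_y = [0] * 50 + [1] * 50

model = fit_gaussian_nb(train_X, train_y)
predictions = [predict(model, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy on synthetic flows: {accuracy:.2f}")
```

The synthetic classes are deliberately well separated, so the score printed here says nothing about performance on real traffic, where benign and attack flows overlap far more.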

Contributed by Jacob:

"Distributed denial of service (DDoS) attacks are a subclass of denial of service (DoS) attacks. A DDoS attack involves multiple connected online devices, collectively known as a botnet, which are used to overwhelm a target website with fake traffic."
"The Denial of Service (DoS) attack is focused on making a resource (site, application, server) unavailable for the purpose it was designed. There are many ways to make a service unavailable for legitimate users by manipulating network packets, programming, logical, or resources handling vulnerabilities, among others. If a service receives a very large number of requests, it may cease to be available to legitimate users. In the same way, a service may stop if a programming vulnerability is exploited, or the way the service handles resources it uses."
"Adversaries may perform Network Denial of Service (DoS) attacks to degrade or block the availability of targeted resources to users. Network DoS can be performed by exhausting the network bandwidth services rely on. Example resources include specific websites, email services, DNS, and web-based applications. Adversaries have been observed conducting network DoS attacks for political purposes and to support other malicious activities, including distraction, hacktivism, and extortion."
"Distributed Denial of Service (DDoS) attack is a menace to network security that aims at exhausting the target networks with malicious traffic. Although many statistical methods have been designed for DDoS attack detection, designing a real-time detector with low computational overhead is still one of the main concerns. On the other hand, the evaluation of new detection algorithms and techniques heavily relies on the existence of well-designed datasets."

"In this paper, we first review the existing datasets comprehensively and propose a new taxonomy for DDoS attacks. Secondly, we generate a new dataset, namely CICDDoS2019, which remedies all current shortcomings. Thirdly, using the generated dataset, we propose a new detection and family classification approach based on a set of network flow features. Finally, we provide the most important feature sets to detect different types of DDoS attacks with their corresponding weights."
"In this project, the machine learning-based model was proposed to detect DDoS attacks. The proposed model used the DDoS-CICIDS2017 dataset with 79 features, and applied four algorithms: Logistic Regression (LR), Support Vector Machine (SVM) with different kernels, Random Forest (RF), and Gradient Boosting (GB). The results highlight the outstanding performance of the Random Forest model, achieving an exceptional 99.99% accuracy, precision, recall, and F1 Score. Notably, this model demonstrated a perfect precision of 100.00%, underscoring its efficacy in accurately classifying DDoS traffic and solidifying its role as a formidable defense against these cyber threats."
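
The results quoted above are reported as accuracy, precision, recall, and F1 score. As a reminder of how those four numbers relate, here is a minimal sketch that computes them from true and predicted labels. The function names and the example counts (1,000 flows, 10 of them attacks) are hypothetical and chosen only for illustration; this is not code from any of the cited projects.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary detector."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1, using 0.0 when undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced capture: 990 benign flows (0) and 10 attack flows (1).
y_true = [0] * 990 + [1] * 10

# A detector that never raises an alarm still looks accurate.
always_benign = classification_metrics(y_true, [0] * 1000)

# A detector with 4 false alarms that catches 8 of the 10 attacks.
y_pred = [1] * 4 + [0] * 986 + [1] * 8 + [0] * 2
detector = classification_metrics(y_true, y_pred)

print(always_benign)
print(detector)
```

With attack traffic this rare, the detector that flags nothing reaches 99% accuracy while its recall is zero, which is why precision, recall, and F1 are worth reading alongside any headline accuracy figure.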
