CyberSecurity Classification

Overview

This repository contains an experimental machine learning study conducted on a cybersecurity-related text classification dataset.

The dataset was provided by junior students as part of their undergraduate project work. I explored several feature extraction techniques and classification models to compare approaches to text classification and predictive analysis.

The objective was to experiment with multiple machine learning and deep learning techniques and compare their effectiveness on the given dataset.

Dataset

The study used labeled text datasets of discussions about Donald Trump and the U.S. presidential election.

Files included:

LabelledDataset.csv
LabelledDataset.xls
UpdatedDataset.csv
cleaned_datasetChecked.csv

Data Preprocessing

The dataset was cleaned and prepared through:

Data cleaning
Duplicate removal
Missing value handling
Text normalization
Feature preparation
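
The steps listed above can be sketched with Pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `text` and `label`, the helper `clean_dataset`, and the specific normalization rules (lowercasing, URL stripping, punctuation removal) are all assumptions.

```python
import re

import pandas as pd


def clean_dataset(df, text_col="text", label_col="label"):
    """Apply the listed cleaning steps to a labeled text DataFrame.

    The column names are hypothetical; adjust them to the real CSV headers.
    """
    out = df.copy()

    # Missing value handling: drop rows that lack either the text or the label.
    out = out.dropna(subset=[text_col, label_col])

    # Text normalization: lowercase, strip URLs and punctuation, collapse spaces.
    def normalize(value):
        s = str(value).lower()
        s = re.sub(r"https?://\S+", " ", s)
        s = re.sub(r"[^a-z0-9\s]", " ", s)
        return re.sub(r"\s+", " ", s).strip()

    out[text_col] = out[text_col].map(normalize)

    # Rows whose text is empty after normalization carry no signal.
    out = out[out[text_col] != ""]

    # Duplicate removal, done after normalization so near-identical rows merge.
    return out.drop_duplicates(subset=[text_col]).reset_index(drop=True)


sample = pd.DataFrame(
    {
        "text": ["Vote NOW! https://example.com", "vote now", None, "Election   results"],
        "label": [1, 1, 0, 0],
    }
)
cleaned = clean_dataset(sample)
print(cleaned)
```

Removing duplicates after normalization is a design choice: rows that differ only in case, links, or spacing collapse into one.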

Feature Extraction Techniques

Word Embedding

GloVe (Global Vectors for Word Representation)
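
One common way to turn GloVe vectors into fixed-length document features is to average the vectors of the words in each text. The sketch below assumes the standard plain-text GloVe format (a token followed by its space-separated vector components on each line) and uses a tiny in-memory stand-in instead of a real GloVe file. The helper names `load_glove` and `embed_text` are hypothetical, and the notebook may build its features differently.

```python
import io

import numpy as np


def load_glove(handle):
    """Parse GloVe's text format into a {token: vector} dictionary."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors


def embed_text(text, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    hits = [vectors[tok] for tok in text.lower().split() if tok in vectors]
    if not hits:
        return np.zeros(dim, dtype=np.float32)
    return np.mean(hits, axis=0)


# Toy 3-dimensional "GloVe file" used only for illustration.
toy_file = io.StringIO("vote 1.0 0.0 2.0\nelection 3.0 2.0 0.0\n")
glove = load_glove(toy_file)
doc_vec = embed_text("Vote election unknownword", glove, dim=3)
print(doc_vec)
```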

Graph-Based Features

Graph Neural Network (GNN)

Probabilistic Features

Probabilistic Neural Network (PNN)

Models Implemented

Deep Learning Models

LSTM with GloVe Embeddings

Sequence modeling
Context-aware text representation

Graph Convolutional Network (GCN)

Graph-based classification
Node relationship learning
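
To illustrate what "node relationship learning" means in a GCN, here is a single graph-convolution layer in NumPy using the widely used symmetric normalization, ReLU(D^-1/2 (A + I) D^-1/2 H W). It is a sketch only: how the notebook builds its graph from the text data is not stated here, and the function name `gcn_layer` and the toy graph are assumptions.

```python
import numpy as np


def gcn_layer(adj, feats, weight):
    """One GCN propagation step: each node mixes its own and its neighbors' features."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)                      # node degrees (>= 1 due to self-loops)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)  # linear map followed by ReLU


# Toy path graph 0 - 1 - 2 with one-hot node features.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
feats = np.eye(3)
weight = np.ones((3, 2))
hidden = gcn_layer(adj, feats, weight)
print(hidden.shape)
```

In a full model, several such layers are stacked and the weights are learned by gradient descent; a deep learning framework would normally handle that part.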

CNN-Based Classification

Convolutional Neural Network for text classification

Probabilistic Models

Probabilistic Neural Network (PNN)

Probabilistic classification approach
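
A classic PNN can be viewed as a Parzen-window classifier: a Gaussian kernel is placed on every training sample, kernel responses are averaged per class, and the class with the largest average wins. The NumPy sketch below follows that textbook formulation. The class name `SimplePNN`, the smoothing parameter `sigma`, and the toy data are assumptions, and the notebook's PNN may be implemented differently.

```python
import numpy as np


class SimplePNN:
    """Minimal probabilistic neural network with Gaussian kernels."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma  # kernel width (smoothing parameter)

    def fit(self, X, y):
        # "Training" only stores the samples; they act as the pattern layer.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distances, shape (n_queries, n_train).
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        kernel = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: mean kernel response per class.
        scores = np.stack(
            [kernel[:, self.y == c].mean(axis=1) for c in self.classes], axis=1
        )
        # Guard against division by zero when every kernel underflows.
        total = np.maximum(scores.sum(axis=1, keepdims=True), 1e-300)
        return scores / total

    def predict(self, X):
        return self.classes[self.predict_proba(X).argmax(axis=1)]


X_train = [[0, 0], [0, 1], [5, 5], [5, 6]]
y_train = [0, 0, 1, 1]
pnn = SimplePNN(sigma=1.0).fit(X_train, y_train)
pred = pnn.predict([[0.2, 0.3], [5.1, 5.4]])
print(pred)
```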

PNN + Random Forest

Hybrid classification model

PNN + Voting Classifier

Ensemble learning approach

GloVe + Voting Classifier

Ensemble learning using word embeddings
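
The voting ensembles listed above combine the predictions of several base classifiers. As a library-independent illustration, the sketch below implements hard (majority) voting over per-model label lists. The function name `majority_vote` and the toy predictions are assumptions. scikit-learn provides a `VotingClassifier` for this purpose, though which voting mode the notebook uses is not stated.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by majority vote.

    predictions: list of label sequences, one per model, all the same length.
    Ties go to the label that appears first among the models' votes.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Toy predictions from three hypothetical base models on three samples.
model_a = [1, 0, 0]
model_b = [1, 1, 0]
model_c = [0, 1, 0]
ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```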

Probabilistic CNN

Combination of probabilistic and convolutional techniques

Technologies Used

Python
Jupyter Notebook
Pandas
NumPy
Scikit-Learn
TensorFlow
Keras

Repository Contents

Dataset Files
Data Preprocessing
Feature Extraction
Classification Models
Experimental Results
Jupyter Notebook Implementations

Purpose

This repository represents an independent exploration of machine learning techniques applied to a provided dataset. The work was performed to gain practical experience with:

Text Classification
Deep Learning Models
Graph-Based Learning
Ensemble Methods
Feature Engineering

Learning Outcomes

Through this study, I gained experience in:

Applying multiple classification algorithms
Comparing traditional and deep learning approaches
Working with GloVe embeddings
Understanding graph-based learning concepts
Evaluating ensemble classification techniques

Disclaimer

This was an experimental learning exercise conducted on a dataset provided by undergraduate students. The repository serves as a demonstration of machine learning experimentation and comparative model evaluation rather than a formal research project.

