This repository contains an experimental machine learning study conducted on a cybersecurity-related text classification dataset.
The dataset was provided by my juniors as part of their undergraduate project work.
The objective was to experiment with several feature extraction techniques, machine learning models, and deep learning models, and to compare their effectiveness for text classification and predictive analysis on this dataset.
The study used labeled datasets of discussions about Trump and the U.S. presidential election.
Files included:
- LabelledDataset.csv
- LabelledDataset.xls
- UpdatedDataset.csv
- cleaned_datasetChecked.csv
The dataset was prepared for modeling through the following steps:
- Data cleaning
- Duplicate removal
- Missing value handling
- Text normalization
- Feature preparation
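The preparation steps above can be sketched with Pandas roughly as follows. This is a minimal illustration, not the notebook code: the column names `text` and `label`, the helper names `normalize_text` and `prepare`, and the toy rows are all assumptions made for the example.

```python
import re

import pandas as pd


def normalize_text(text: str) -> str:
    """Lowercase, strip URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()


def prepare(df: pd.DataFrame, text_col: str = "text", label_col: str = "label") -> pd.DataFrame:
    # Column names are hypothetical; adjust them to the actual CSV header.
    df = df.dropna(subset=[text_col, label_col]).copy()           # missing value handling
    df[text_col] = df[text_col].astype(str).map(normalize_text)   # text normalization
    df = df[df[text_col] != ""]                                   # rows emptied by normalization
    df = df.drop_duplicates(subset=[text_col])                    # duplicate removal
    return df.reset_index(drop=True)


# Toy rows standing in for the real dataset
raw = pd.DataFrame({
    "text": ["Vote NOW! https://example.com", "vote now", None, "Election results are in."],
    "label": [1, 1, 0, 0],
})
clean = prepare(raw)
```

Normalizing before removing duplicates means rows that differ only in case, punctuation, or links are treated as the same text.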
Feature extraction techniques, models, and approaches explored:
- GloVe (Global Vectors for Word Representation)
- Graph Neural Network (GNN)
- Probabilistic Neural Network (PNN)
- Sequence modeling
- Context-aware text representation
- Graph-based classification
- Node relationship learning
- Convolutional Neural Network for text classification
- Probabilistic classification approach
- Hybrid classification model
- Ensemble learning approach
- Ensemble learning using word embeddings
- Combination of probabilistic and convolutional techniques
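One common way to combine models such as the probabilistic and convolutional classifiers listed above is soft voting, where the class probabilities from each model are averaged. The sketch below assumes that scheme for illustration only; the notebooks may combine models differently, and `soft_vote`, `cnn_probs`, and `pnn_probs` are hypothetical names with made-up values.

```python
import numpy as np


def soft_vote(prob_list, weights=None) -> np.ndarray:
    """Average per-model class probabilities and return predicted class indices."""
    probs = np.stack(prob_list)  # shape: (n_models, n_samples, n_classes)
    avg = np.average(probs, axis=0, weights=weights)
    return avg.argmax(axis=1)


# Hypothetical outputs of two models on three samples with two classes
cnn_probs = np.array([[0.90, 0.10], [0.40, 0.60], [0.55, 0.45]])
pnn_probs = np.array([[0.60, 0.40], [0.20, 0.80], [0.30, 0.70]])

labels = soft_vote([cnn_probs, pnn_probs])                     # equal weights
weighted = soft_vote([cnn_probs, pnn_probs], weights=[9, 1])   # trust the first model more
```

With equal weights the third sample follows the more confident model; with a heavy weight on the first model it follows that model instead.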
Tools and libraries used:
- Python
- Jupyter Notebook
- Pandas
- NumPy
- Scikit-Learn
- TensorFlow
- Keras
The repository contains:
- Dataset Files
- Data Preprocessing
- Feature Extraction
- Classification Models
- Experimental Results
- Jupyter Notebook Implementations
This repository represents an independent exploration of machine learning techniques applied to a provided dataset. The work was performed to gain practical experience with:
- Text Classification
- Deep Learning Models
- Graph-Based Learning
- Ensemble Methods
- Feature Engineering
Through this study, I gained experience in:
- Applying multiple classification algorithms
- Comparing traditional and deep learning approaches
- Working with GloVe embeddings
- Understanding graph-based learning concepts
- Evaluating ensemble classification techniques
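As an illustration of working with GloVe embeddings, the sketch below parses GloVe's plain-text format (a token followed by its vector components on each line) and represents a document as the average of its known word vectors. Averaging is an assumed, simple pooling choice, not necessarily what the notebooks use, and `load_glove`, `embed`, and the tiny in-memory vectors are hypothetical.

```python
import io

import numpy as np


def load_glove(handle) -> dict:
    """Parse GloVe text format: one token per line followed by its vector."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors


def embed(text: str, vectors: dict, dim: int) -> np.ndarray:
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[tok] for tok in text.lower().split() if tok in vectors]
    if not known:
        return np.zeros(dim, dtype=np.float32)
    return np.mean(known, axis=0)


# Tiny in-memory stand-in for a downloaded GloVe text file
toy_file = io.StringIO("vote 1.0 0.0 2.0\nelection 3.0 2.0 0.0\n")
glove = load_glove(toy_file)
doc_vector = embed("Vote election unknownword", glove, dim=3)
```

For a real file, pass an open file handle (for example `open(path, encoding="utf-8")`) to `load_glove` and set `dim` to the embedding size of that file.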
This was an experimental learning exercise conducted on a dataset provided by undergraduate students. The repository serves as a demonstration of machine learning experimentation and comparative model evaluation rather than a formal research project.