📊 Data Science & Preprocessing Practice Portfolio

Welcome to my Data Science and Analytics practice repository! This repository serves as a centralized hub for all my exploratory data analysis (EDA), data cleaning, text parsing, feature engineering, and preprocessing projects.

The main goal of this repository is to track my learning journey, build solid algorithmic thinking, and showcase clean, industry-standard data processing workflows.

🛠️ Tech Stack & Tools Used

Language: Python
Libraries: Pandas, NumPy, Matplotlib, Seaborn, Plotly
Environment: Jupyter Notebook / VS Code

📂 Project Portfolio (Current Tracks)

🚗 1. CarDekho Dataset - Preprocessing & Text Parsing

File: car_project.ipynb / cardataset.csv
Description: A data cleaning and feature engineering project on 8,000+ car records. It parses numeric values out of alphanumeric strings (engine size in CC, power in bhp, mileage in kmpl), imputes missing values with column medians, extracts brand categories, and engineers time-based features such as car_age. It also covers outlier handling techniques and categorical encodings (One-Hot and Ordinal Mapping).
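The parsing, imputation, and feature steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the column names (`name`, `year`, `engine`, `max_power`), the helper `extract_number`, and the `REFERENCE_YEAR` constant are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real columns in cardataset.csv may differ.
cars = pd.DataFrame({
    "name": ["Maruti Swift Dzire VDI", "Hyundai i20 Sportz", "Honda City VX"],
    "year": [2014, 2017, 2012],
    "engine": ["1248 CC", "1197 CC", None],
    "max_power": ["74 bhp", "81.86 bhp", "bhp"],
})

def extract_number(series: pd.Series) -> pd.Series:
    """Pull the first numeric token out of strings such as '1248 CC'.

    Values with no digits (or missing values) become NaN.
    """
    digits = series.str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    return pd.to_numeric(digits, errors="coerce").astype("float64")

# Convert the alphanumeric columns to floats, then fill gaps with the median.
for column in ["engine", "max_power"]:
    cars[column] = extract_number(cars[column])
    cars[column] = cars[column].fillna(cars[column].median())

# Brand is assumed to be the first word of the listing name.
cars["brand"] = cars["name"].str.split().str[0]

# Assumed reference year for the age calculation; pick whatever suits the analysis.
REFERENCE_YEAR = 2024
cars["car_age"] = REFERENCE_YEAR - cars["year"]
```

Using the median rather than the mean keeps the imputed value from being pulled around by a handful of extreme engine sizes or power figures.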

🤖 2. Google Play Store Dataset Analysis

File: google_playstore.ipynb / google_play_store_dataset.csv
Description: Exploratory Data Analysis (EDA) of app store dynamics. Cleans and processes raw installation counts, ratings, app sizes, and prices, then summarizes their distributions. Also estimates revenue for paid apps and identifies high-volume market segments.
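A rough sketch of the cleaning and revenue estimate described above, assuming install counts are stored as strings like `10,000+` and prices as strings like `$4.99`. The sample rows and the derived column names (`installs_num`, `price_usd`, `est_revenue`) are hypothetical, and the revenue formula (installs times price) is a simplifying assumption rather than the notebook's confirmed method.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns and formats may differ.
apps = pd.DataFrame({
    "App": ["App A", "App B", "App C"],
    "Installs": ["10,000+", "1,000,000+", "500+"],
    "Price": ["0", "$4.99", "$0.99"],
})

# Strip thousands separators and the trailing '+' before converting to numbers.
apps["installs_num"] = pd.to_numeric(
    apps["Installs"].str.replace(r"[,+]", "", regex=True), errors="coerce"
)

# Strip the currency symbol; free apps end up with a price of 0.
apps["price_usd"] = pd.to_numeric(
    apps["Price"].str.replace("$", "", regex=False), errors="coerce"
)

# Assumption: every install is a one-time purchase at the listed price.
apps["est_revenue"] = apps["installs_num"] * apps["price_usd"]

paid_apps = apps[apps["price_usd"] > 0]
```

Because the install figures are bucketed ("10,000+" means at least 10,000), an estimate built this way is better read as a rough lower bound than as an actual revenue figure.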

🏢 3. Airbnb NYC 2019 Data Analysis

File: AB_NYC_2019.ipynb / AB_NYC_2019.csv
Description: Spatial and financial analysis of the New York City Airbnb market. Profiles neighborhood groups, investigates the right-skewed price distribution, maps price density across listing coordinates, and filters listings by availability.
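The skew check, neighborhood profiling, and availability filter described above might look like the sketch below. The sample rows are made up, the column names (`neighbourhood_group`, `price`, `availability_365`) are assumed to match the CSV, and the log transform is one common way to handle a right-skewed price column, not necessarily the one used in the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for AB_NYC_2019.csv.
listings = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Brooklyn", "Brooklyn", "Queens"],
    "price": [150, 1200, 90, 110, 60],
    "availability_365": [365, 0, 120, 30, 200],
})

# A positive skew value indicates a long right tail (a few very expensive listings).
price_skew = listings["price"].skew()

# log1p compresses the right tail and stays defined for a price of 0.
listings["log_price"] = np.log1p(listings["price"])
log_price_skew = listings["log_price"].skew()

# The median is less sensitive than the mean to the expensive outliers.
group_profile = listings.groupby("neighbourhood_group")["price"].agg(["median", "count"])

# Keep only listings that are bookable on at least one day of the year.
available = listings[listings["availability_365"] > 0]
```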

🛒 4. Supermarket Sales Analysis

File: SuperMarketAnalysis.ipynb / SuperMarketAnalysis.csv
Description: A business analytics project on retail sales data. Sorts transactions chronologically, analyzes product lines by customer gender, calculates gross income figures, and compares customer ratings across payment methods.
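As an illustration of the steps above, the following sketch sorts transactions by date, cross-tabulates product lines against gender, and averages ratings per payment method. The sample rows and column names (`Date`, `Product line`, `Gender`, `Payment`, `gross income`, `Rating`) are assumptions, and a per-group mean is a simple comparison rather than a formal correlation measure.

```python
import pandas as pd

# Hypothetical sample rows standing in for SuperMarketAnalysis.csv.
sales = pd.DataFrame({
    "Date": ["3/8/2019", "1/5/2019", "2/20/2019", "1/27/2019"],
    "Product line": ["Health and beauty", "Electronic accessories",
                     "Health and beauty", "Sports and travel"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Payment": ["Ewallet", "Cash", "Cash", "Ewallet"],
    "gross income": [26.14, 3.82, 16.22, 23.29],
    "Rating": [9.1, 9.6, 7.4, 8.4],
})

# Parse the month/day/year strings so sorting is chronological, not alphabetical.
sales["Date"] = pd.to_datetime(sales["Date"], format="%m/%d/%Y")
sales = sales.sort_values("Date").reset_index(drop=True)

# Count of transactions per product line, split by gender.
line_by_gender = pd.crosstab(sales["Product line"], sales["Gender"])

# Total gross income per product line.
income_by_line = sales.groupby("Product line")["gross income"].sum()

# Average customer rating for each payment method.
rating_by_payment = sales.groupby("Payment")["Rating"].mean()
```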

🗂️ How to Run the Notebooks

To set up and run these projects locally, follow these steps:

Clone this repository to your local directory:

git clone https://github.com/code-with-ayyan/Data-Science-practice-projects.git

Navigate into the repository folder:

cd Data-Science-practice-projects/

Launch Jupyter Notebook to review the source files:

jupyter notebook

Maintained with consistency and passion for backend development and data engineering.
