Database Normalization

What is Normalization?

Database normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them.

Why Normalize?

Eliminate Data Redundancy: Reduces duplicate data storage
Improve Data Integrity: Prevents inconsistent data
Reduce Storage Space: More efficient data storage
Easier Maintenance: Updates in one place affect the entire database

Normal Forms

Unnormalized Form (0NF)

Data stored in a single table with repeating groups and no structure.

Problems:

Data redundancy
Update anomalies
Insert anomalies
Delete anomalies

First Normal Form (1NF)

Rules:

Each column contains atomic (indivisible) values
No repeating groups
Each row is unique

Second Normal Form (2NF)

Rules:

Must be in 1NF
All non-key attributes must be fully dependent on the primary key
Eliminates partial dependencies

Third Normal Form (3NF)

Rules:

Must be in 2NF
No transitive dependencies
Non-key attributes should not depend on other non-key attributes

Practical Example: Student Database

Step 1: Create Database

CREATE DATABASE students;
USE students;

Step 2: Unnormalized Data (0NF)

-- Unnormalized Data
CREATE TABLE students_raw(
    student_id INT,
    student_name VARCHAR(30),
    course1 VARCHAR(50),
    course2 VARCHAR(50),
    course3 VARCHAR(50)
);

INSERT INTO students_raw (student_id, student_name, course1, course2, course3)
VALUES
(1, "rana", "java", "python", "javascript"),
(2, "buddha", "java", "c", "jython"),
(3, "rahul", "java", "c", "jython");

SELECT * FROM students_raw;

Problems with Unnormalized Form:

Repeating groups (course1, course2, course3)
Wasted space if student takes fewer courses
Difficult to add more courses

Step 3: First Normal Form (1NF)

-- 1NF: Remove repeating groups
CREATE TABLE student_1nf (
    student_id INT,
    student_name VARCHAR(30),
    course VARCHAR(30)
);

INSERT INTO student_1nf (student_id, student_name, course) VALUES
(1, 'rana', 'java'),
(1, 'rana', 'python'),
(1, 'rana', 'javascript'),
(2, 'buddha', 'java'),
(2, 'buddha', 'c'),
(2, 'buddha', 'jython'),
(3, 'rahul', 'java'),
(3, 'rahul', 'c'),
(3, 'rahul', 'jython');

SELECT * FROM student_1nf;

Improvements:

Atomic values in each column
No repeating groups
Flexible number of courses per student

Remaining Problems:

Data redundancy (student names repeated)
Update anomalies

Step 4: Second Normal Form (2NF)

-- 2NF: Remove partial dependencies
CREATE TABLE students(
    student_id INT PRIMARY KEY,
    student_name VARCHAR(30)
);

CREATE TABLE students_course(
    student_id INT,
    course VARCHAR(30),
    PRIMARY KEY (student_id, course),
    FOREIGN KEY (student_id) REFERENCES students(student_id)
);

INSERT INTO students (student_id, student_name) VALUES
(1, 'rana'),
(2, 'buddha'),
(3, 'rahul');

INSERT INTO students_course (student_id, course) VALUES
(1, 'java'),
(1, 'python'),
(1, 'javascript'),
(2, 'java'),
(2, 'c'),
(2, 'jython'),
(3, 'java'),
(3, 'c'),
(3, 'jython');

SELECT * FROM students;
SELECT * FROM students_course;

Improvements:

Eliminated partial dependencies
Student names stored only once
Separate tables for different entities

Step 5: Third Normal Form (3NF)

-- 3NF: Remove transitive dependencies
CREATE TABLE course(
    course_id INT PRIMARY KEY,	
    course_name VARCHAR(30),
    instructor VARCHAR(30)
);

-- Recreate students_course with course_id reference
DROP TABLE students_course;

CREATE TABLE students_course(
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES course(course_id)
);

INSERT INTO course (course_id, course_name, instructor) VALUES
(101, 'java', 'Navin Sir'),
(102, 'python', 'Hitesh Sir'),
(103, 'javascript', 'Pyuish Sir'),
(104, 'c', 'Sourabh Sir'),
(105, 'jython', 'Avishek Sir');

INSERT INTO students_course (student_id, course_id) VALUES 
(1, 101),
(1, 102),
(1, 103),
(2, 101),
(2, 104),
(2, 105),
(3, 101),
(3, 104),
(3, 105);

SELECT * FROM students;
SELECT * FROM course;
SELECT * FROM students_course;

Step 6: Query Normalized Data

-- Join all tables to get complete information
SELECT s.student_id, s.student_name, c.course_id, c.course_name, c.instructor
FROM students s
INNER JOIN students_course sc ON s.student_id = sc.student_id
INNER JOIN course c ON sc.course_id = c.course_id;

Final Database Structure (3NF)

Table: `students`

student_id (Primary Key)
student_name

Table: `course`

course_id (Primary Key)
course_name
instructor

Table: `students_course` (Junction Table)

student_id (Foreign Key)
course_id (Foreign Key)
Primary Key: (student_id, course_id)

Benefits Achieved

✅ No Data Redundancy: Each piece of information stored once
✅ Data Integrity: Consistent data across tables
✅ Flexibility: Easy to add new students, courses, or instructors
✅ Maintainability: Updates in one place reflect everywhere
✅ Storage Efficiency: Minimal wasted space

Key Takeaways

Normalization is a step-by-step process
Each normal form builds upon the previous one
3NF is usually sufficient for most applications
Higher normal forms exist (BCNF, 4NF, 5NF) for complex scenarios
Balance normalization with query performance needs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Database Normalization

What is Normalization?

Why Normalize?

Normal Forms

Unnormalized Form (0NF)

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Practical Example: Student Database

Step 1: Create Database

Step 2: Unnormalized Data (0NF)

Step 3: First Normal Form (1NF)

Step 4: Second Normal Form (2NF)

Step 5: Third Normal Form (3NF)

Step 6: Query Normalized Data

Final Database Structure (3NF)

Table: `students`

Table: `course`

Table: `students_course` (Junction Table)

Benefits Achieved

Key Takeaways

FilesExpand file tree

Normailazion.md

Latest commit

History

Normailazion.md

File metadata and controls

Database Normalization

What is Normalization?

Why Normalize?

Normal Forms

Unnormalized Form (0NF)

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Practical Example: Student Database

Step 1: Create Database

Step 2: Unnormalized Data (0NF)

Step 3: First Normal Form (1NF)

Step 4: Second Normal Form (2NF)

Step 5: Third Normal Form (3NF)

Step 6: Query Normalized Data

Final Database Structure (3NF)

Table: students

Table: course

Table: students_course (Junction Table)

Benefits Achieved

Key Takeaways

Table: `students`

Table: `course`

Table: `students_course` (Junction Table)