Normalisation is a step towards DB optimisation.
X -> Y, the left side of FD is known as a Determinant, the right side of the production is known as a Dependent.Example: {user_id, user_name} -> user_name
Since user_name is a subset of {user_id, user_name}, this dependency is trivial.
Example: user_id -> user_address
Here, user_address is not a subset of user_id. Finding an address based on an ID is exactly what non-trivial FD is about.
To avoid redundancy in the DB, not to store redundant data.
What happen if we have redundant data? Insertion, deletion, and updation anomalies arise. Due to these anomalies, DB size increases and DB performance becomes very slow. To rectify these anomalies and the effect of these of DB, we use Database optimisation technique called NORMALISATION.
Anomalies means abnormalities, there are three types of anomalies introduced by data redundancy.
When certain data (attribute) cannot be inserted into the DB without the presence of other data.
The delete anomaly refers to the situation where the deletion of data results in the unintended loss of some other important data.
Example: A user table with a multi-valued Phone column.
Unnormalized Table (Not 1NF):
| Emp_ID | Name | Phone |
|---|---|---|
| 1 | John | 123-456, 987-654 |
| 2 | Alice | 555-123 |
Converted to 1NF:
Rule: Create a new tuple (row) for each multi-valued item.
| Emp_ID | Name | Phone |
|---|---|---|
| 1 | John | 123-456 |
| 1 | John | 987-654 |
| 2 | Alice | 555-123 |
Example: Consider a relation R(A, B, C, D)
{A, B}.A, B are prime attributes.C, D are non-prime attributes.Functional Dependencies (FD):
B -> C (This is a Partial Dependency because C depends only on part of the PK, which is B)AB -> C (Valid)AB -> D (Valid)2NF Conversion (Decomposition):
To remove the partial dependency B -> C, split the table into two:
Example: A student/department relationship where Dept_Name depends on Dept_ID, which depends on Student_ID.
FD: Student_ID -> Dept_ID & Dept_ID -> Dept_Name
Not in 3NF (Transitive Dependency exists):
| Student_ID | Name | Dept_ID | Dept_Name |
|---|---|---|---|
| 101 | John | D1 | CS |
| 102 | Alice | D2 | IT |
Converted to 3NF:
Table 1: Student
| Student_ID | Name | Dept_ID |
|---|---|---|
| 101 | John | D1 |
| 102 | Alice | D2 |
Table 2: Department
| Dept_ID | Dept_Name |
|---|---|
| D1 | CS |
| D2 | IT |
Example: Student Enrollment Table
| Stud_ID | Subject | Professor |
|---|---|---|
| 101 | Java | PJ |
| 101 | CPP | PC |
| 102 | Java | PJ2 |
| 103 | C# | PC# |
| 104 | Java | PJ |
Rules in this Table:
Identifying Keys & Dependencies:
Primary Key: {Stud_ID, Subject}
{Stud_ID, Subject} -> Professor (Valid PK dependency)Professor -> Subject (Violation of BCNF!)Why it violates BCNF: The attribute Professor determines Subject (a prime attribute). However, Professor is not a Super Key on its own. To fix this, decompose the table into R1(Stud_ID, Professor) and R2(Professor, Subject).