Database normalization in Sql Server

Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships among them.

Database normalization in Sql Server
Normalization, Sql Server, Database - thetechfoyer,thetechfoyer.com

This Articles helps you to understand the concept of Normalization in Sql Server.



Normalization Resolves

  • Duplication of Data- The same data is listed in multiple lines of the database
  • Insert Anomaly- A record about an entity cannot be inserted into the table without first inserting information about another entity - Cannot enter a customer without a sales order
  • Delete Anomaly- A record cannot be deleted without deleting a record about a related entity. Cannot delete a sales order without deleting all of the customer's information.
  • Update Anomaly- Cannot update information without changing information in many places. To update customer information, it must be updated for each sales order the customer has placed


Table Before Normalization



First Normal Form (1st NF) :
An entity is in the first normal form if it contains no repeating groups, no repeating columns. Repeating columns make your data less flexible, waste disk space, and makes it more difficult to search for data.  In 1NF relation, the order of tuples (rows) and attributes (columns) does not matter. In this Normal Form, we tackle the problem of atomicity. Here atomicity means values in the table should not be further divided. In simple terms, a single cell cannot hold multiple values. . in 1st NF:-

  • The table cells must be of a single value.
  • Eliminate repeating groups in individual tables.
  • Create a separate table for each set of related data.
  • Identify each set of related data with a primary key.



Second Normal Form (2nd NF):
A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation.
In 2nd NF, Remove All Partial Dependencies. Partial dependency means the proper subset of candidate key determines a non-prime attribute

  1. Functional Dependency: The value of one attribute in a table is determined entirely by the value of another.
  2. Partial Dependency: A type of functional dependency where an attribute is functionally dependent on only part of the primary key (primary key must be a composite key).
    How to Achieve?

Create a separate table with the functionally dependent data and the part of the key on which it depends. The tables created at this step will usually contain descriptions of resources.
In the table below, the OrderId serves as the primary key. The customer and total amount are dependent upon the OrderId -- this data is specific to each order. , the contact person is dependent upon the customer.

Here, COURSE_FEE cannot alone decide the value of COURSE_NO or STUD_NO; Even COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO; Also COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO;
This shows that COURSE_FEE would be a non-prime attribute, as it does not belong to the only one candidate key {STUD_NO, COURSE_NO}; But, COURSE_FEE is dependent on COURSE_NO, which is a proper subset of the candidate key.
Non-prime attribute COURSE_FEE is dependent on a proper subset of the candidate key, which is a partial dependency and so this relation is not in 2NF.
To convert the above relation to 2NF, we need to split the table into two tables as below



Third Normal Form (3rd NF)
A relation is in third normal form if it is in 2NF and every non-key attribute of the relation is non-transitively dependent on each candidate key of the relation.
In 3rd NF, Remove transitive dependencies.

  • The first condition for the table to be in Third Normal Form is that the table should be in the Second Normal Form.
  • The second condition is that there should be no transitive dependency for non-prime attributes, which indicates that non-prime attributes (which are not a part of the candidate key) should not depend on other non-prime attributes in a table.
  • The third Normal Form ensures the reduction of data duplication. It is also used to achieve data integrity.

Transitive Dependency is a type of functional dependency in which A → C (A determines C) indirectly, because of A → B and B → C (where it is not the case that B → A), where an attribute is functionally dependent on an attribute other than the primary key. Thus its value is only indirectly determined by the primary key.
Create a separate table containing the attribute and the fields that are functionally dependent on it. The tables created at this step will usually contain descriptions of either resources or agents. Keep a copy of the key attribute in the original file.

IN above table Functional Dependency set:
{StudentId -> StudentName,   StudentId -> State,   State-> Country,   StudentId -> StudentAge}
Candidate Key: {StudentId }
In above tablle StudentId -> State and State-> Country are true. So Country is transitively dependent on StudentId . It violates the third normal form.
To convert it in third normal form, we will decompose the relation STUDENT as follows: -



Boyce-Codd Normal Form (BCNF)
The Boyce-Codd Normal Form, also referred to as the "third and half (3.5) normal form
For Boyce-Codd Normal Form, Table Should be in the form of the 3 NF. Also, there shouldn’t be any multi-valued dependencies i.e. Every determinant must be a candidate key.


Lets consider all possiblities: -

  • One student can enrol for multiple subjects.
  • There can be multiple professors teaching one subject
  •  For each subject, a professor is assigned to the student

In this table, all the normal forms are satisfied except BCNF. Why?
1. Student ID,  and Subject form the primary key, which means the Subject column is a prime attribute. But, there is one more dependency, Professor → Subject.
2. And while Subject is a prime attribute, Professor is a non-prime attribute, which is not allowed by BCNF.
Now in order to satisfy the BCNF, we will be dividing the table into two parts. One table will hold Student ID which already exists and newly created column Professor ID.


4th Normal Form
Satisfies all the requirements of the third normal form.
A relation is in 4NF if it has no multi-valued dependencies.



5th Normal Form
Satisfies all the requirements of the fourth normal form.
It cannot be decomposed into any number of smaller tables without loss of data.