Normalization

Database Normalization is a technique of organizing the data in the database. It is a systematic approach of decomposing tables into atomic entities to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies.

If a database design is not perfect, it may contain anomalies, which are like a bad dream for any database administrator. Managing a database with anomalies is difficult and next to impossible.

Normalization is a method to remove all these anomalies and bring the database to a consistent state.

It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.

Normalization is used or serves mainly two purposes,

Eliminating redundant (useless or repetitive) data.
Ensuring data dependencies make sense i.e. data is logically stored.

Anomalies in DBMS

There are three types of anomalies that occur when the database is not normalized. These are – Insertion, update and deletion anomaly.

Update anomalies− If data items are scattered and are not linked to each other properly, then it could lead to strange situations. For example, when we try to update one data item having its copies scattered over several places, a few instances get updated properly while a few others are left with old values. Such instances leave the database in an inconsistent state.
Deletion anomalies− Say we tried to delete a record, but parts of it were left undeleted because of unawareness, the data is also saved somewhere else.
Insert anomalies– Say we tried to insert data in a record that does not exist at all.

Let’s take an example to understand this.

Suppose a manufacturing company stores the employee details in a table named employee that has four attributes: emp_id for storing employee’s id, emp_name for storing employee’s name, emp_address for storing employee’s address and emp_dept for storing the department details in which the employee works. At some point of time the table looks like this:

emp_id emp_name emp_address emp_dept

101 Rick Delhi D001

101 Rick Delhi D002

123 Maggie Agra D890

166 Glenn Chennai D900

166 Glenn Chennai D004

The above table is not normalized. We will see the problems that we face when a table is not normalized.

Update anomaly: In the above table we have two rows for employee Rick as he belongs to two departments of the company. If we want to update the address of Rick then we have to update the same in two rows or the data will become inconsistent. If somehow, the correct address gets updated in one department but not in other then as per the database, Rick would be having two different addresses, which is not correct and would lead to inconsistent data.

Insert anomaly: Suppose a new employee joins the company, who is under training and currently not assigned to any department then we would not be able to insert the data into the table if emp_dept field doesn’t allow nulls.

Delete anomaly: Suppose, if at a point of time the company closes the department D890 then deleting the rows that are having emp_dept as D890 would also delete the information of employee Maggie since she is assigned only to this department.

To overcome these anomalies and having a peaceful life we need to normalize the data.

Bibliography source:

http://www.studytonight.com/dbms/database-normalization.php

https://www.tutorialspoint.com/dbms/database_normalization.htm

https://beginnersbook.com/2015/05/normalization-in-dbms/

In a Database Management System

A transitive dependency is a functional dependency which holds by virtue of transitivity.

Transitive dependencies occur when there is an indirect relationship that causes a functional dependency.

A transitive dependency can occur only in a relation that has three or more attributes.

The symbol “->” means than.

”A -> C” is a transitive dependency when it is true only because both “A -> B” and “B -> C” are true

Suppose A, B, and C designates three distinct attributes (or distinct collections of attributes) in the relation. Suppose all three of the following conditions hold:

A → B

(If A is true than B holds true)

It is not the case that B → A

(And when B is not true than A holds true)

B → C

(And when B is true C is also true in the relation.)

Consider the below table:

Book Author Author_age

Game of Thrones George R. R. Martin 66

Harry Potter J. K. Rowling 49

Dying of the Light George R. R. Martin 66

{Book} -> {Author}

If we know the book, we knows the authors name

{Author} does not -> {Book}

If we don’t know the authors name than we can still have a book.

{Author} -> {Author_age}

If we know Author, we also know Authors age.

Therefore as per the rule of transitive dependency: {Book} -> {Author_age} should hold, that makes sense because if we know the book name we can know the author’s age from the table.

Bibliography Source

https://en.wikipedia.org/wiki/Transitive_dependency

https://beginnersbook.com/2015/04/transitive-dependency-in-dbms/

Fidelity and Assets

Tag: Table

Why is normalization required?

Normalization

Transitive functional dependency.