Database Tables and Normalization
* Normalization is a process for assigning attributes to entities. It reduces data redundancies and helps eliminate data anomalies.
* Normalization works through a series of stages called normal forms:
  * First normal form (1NF)
  * Second normal form (2NF)
  * Third normal form (3NF)
  * Fourth normal form (4NF)
* The highest level of normalization is not always desirable.
* The Need for Normalization
  * Case of a Construction Company
    * Building project -- Project number, Name, Employees assigned to the project.
    * Employee -- Employee number, Name, Job classification.
    * The company charges its clients by billing the hours spent on each project. The hourly billing rate depends on the employee’s position.
    * Periodically, a report is generated.
    * The table whose contents correspond to the reporting requirements is shown in Table 5.1.
Scenario
Several employees work on one project.
Employee Num: 101, 102, 103, 105
Project Num: 15
Project Name: Evergreen
Sample Form
Table Structure Matches the Report Format
* Problems with Figure 5.1
  * The project number is intended to be a primary key, but it contains nulls.
  * The table displays data redundancies.
  * The table entries invite data inconsistencies.
  * The data redundancies yield the following anomalies:
    * Update anomalies.
    * Addition anomalies.
    * Deletion anomalies.
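The update and deletion anomalies listed above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: project 15 (Evergreen) and employee numbers 101 and 103 come from the scenario, while project 99, employee 104, the job classifications, and the hourly rates are made up for the example.

```python
# Each row mirrors the report layout:
# (proj_num, proj_name, emp_num, job_class, hourly_rate).
# Project 99, employee 104, the job classes and the rates are hypothetical.
rows = [
    (15, "Evergreen", 101, "Engineer", 50.0),
    (15, "Evergreen", 103, "Technician", 30.0),
    (99, "Hypothetical Project", 101, "Engineer", 50.0),
    (99, "Hypothetical Project", 104, "Analyst", 40.0),
]

# Update anomaly: employee 101's rate is stored in two rows. Changing it in
# only one of them leaves the table contradicting itself.
updated = [
    (p, n, e, j, 55.0) if (p, e) == (15, 101) else (p, n, e, j, r)
    for p, n, e, j, r in rows
]
rates_for_101 = {r for _, _, e, _, r in updated if e == 101}
print(sorted(rates_for_101))  # two different rates for the same employee

# Deletion anomaly: employee 104 appears only under project 99, so deleting
# that project's rows also erases everything known about employee 104.
remaining = [row for row in rows if row[0] != 99]
known_employees = {row[2] for row in remaining}
print(104 in known_employees)
```

An addition anomaly follows the same pattern: a newly hired employee with no project yet could only be stored by leaving the project number empty, which is exactly the null-in-the-key problem noted above.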
* Conversion to First Normal Form
  * A relational table must not contain repeating groups.
  * Repeating groups can be eliminated by adding the appropriate entry in at least the primary key column(s).
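The conversion step can be sketched in Python as a simple "fill down" over the report rows. This is a minimal sketch, not the only way to do it: the row layout, the function name `fill_repeating_groups`, and project 99 with employee 104 are assumptions made for illustration; project 15 (Evergreen) and employees 101, 102, 103, 105 come from the scenario. It also assumes that the pair (project number, employee number) serves as the key.

```python
# Report-style rows: (proj_num, proj_name, emp_num). The project number and
# name appear only on the first row of each group; the following rows leave
# them empty (None), which is the repeating group.
report_rows = [
    (15, "Evergreen", 101),
    (None, None, 102),
    (None, None, 103),
    (None, None, 105),
    (99, "Hypothetical Project", 101),  # made-up second project
    (None, None, 104),
]


def fill_repeating_groups(rows):
    """Copy the last seen project number and name into rows that lack them."""
    filled = []
    current_num, current_name = None, None
    for proj_num, proj_name, emp_num in rows:
        if proj_num is not None:
            current_num, current_name = proj_num, proj_name
        if current_num is None:
            raise ValueError("the first row must carry a project number")
        filled.append((current_num, current_name, emp_num))
    return filled


rows_1nf = fill_repeating_groups(report_rows)

# With no nulls left, (project number, employee number) identifies each row.
keys = [(proj_num, emp_num) for proj_num, _, emp_num in rows_1nf]
assert len(keys) == len(set(keys))
```

After the fill, every row carries a value in the key columns, so the table no longer depends on row order to be read correctly. The redundancy is still there (the project name repeats on every row), which is what the later normal forms address.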
Data Organization: First Normal Form
BEFORE
AFTER
First Normal Form (1NF)
* 1NF Definition