Describe what the term data redundancy means in database design. Explain whether it is bad and whether a normalized database design is always the best design. Discuss the pros and cons of a normalized database.

Answer:

Data Redundancy:

Data redundancy is a condition in a database or data storage technology in which the same piece of data is held in two separate places. This can mean two fields within a single database, or two locations across multiple software environments or platforms. Whenever the same data is repeated, that constitutes data redundancy. Redundancy can occur by accident, but it is also introduced deliberately for backup and recovery purposes.
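
To make the idea concrete, here is a minimal sketch (the table and column names are hypothetical, not from the original answer) using Python's built-in sqlite3 module. A denormalized orders table repeats each customer's email on every row, and updating one copy but not the other leaves the data inconsistent, the classic update anomaly caused by redundancy:

```python
import sqlite3

# Hypothetical denormalized table: the customer's email is
# repeated on every order row (data redundancy).
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_email TEXT,   -- duplicated on every order
        item TEXT
    )
""")
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "Ana", "ana@old.example", "Book"),
     (2, "Ana", "ana@old.example", "Pen")],
)

# Updating the email on only one row leaves the copies out of sync.
con.execute(
    "UPDATE orders SET customer_email = 'ana@new.example' WHERE order_id = 1"
)
for row in con.execute("SELECT order_id, customer_email FROM orders"):
    print(row)   # (1, 'ana@new.example') vs. (2, 'ana@old.example')
```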

Pros of a Normalized Database

Updates run quickly because no data is duplicated in multiple locations; each fact lives in exactly one place, so only one row has to change.

Inserts also run quickly, since each piece of data has a single insertion point and no duplication is required (see the sketch after this list).
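
As a sketch of both points (again with hypothetical table names), a normalized design stores each customer exactly once, so an update or an insert touches a single row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Normalized: customer data lives in one place; orders only reference it.
con.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name  TEXT,
        email TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );
    INSERT INTO customers VALUES (1, 'Ana', 'ana@old.example');
    INSERT INTO orders VALUES (1, 1, 'Book'), (2, 1, 'Pen');
""")

# One UPDATE fixes the email for every order: there is only one copy.
con.execute(
    "UPDATE customers SET email = 'ana@new.example' WHERE customer_id = 1"
)

# A new order is a single-row insert; no customer data is re-entered.
con.execute("INSERT INTO orders VALUES (3, 1, 'Notebook')")
```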

Cons of a Normalized Database

Since data is not duplicated, table joins are required. This makes queries more complicated, and read times are therefore slower.

In addition, since joins are required, indexing does not work as efficiently: queries that span several tables typically cannot be answered from a single index, which slows read times further (see the sketch after this list).
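
To illustrate the read-side cost, the sketch below (same hypothetical schema as above, rebuilt so it runs on its own) shows that reading the data back now requires a join, where the denormalized table needed only a single-table scan:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY,
                            name TEXT, email TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         item TEXT);
    INSERT INTO customers VALUES (1, 'Ana', 'ana@new.example');
    INSERT INTO orders VALUES (1, 1, 'Book'), (2, 1, 'Pen');
""")

# Reconstructing a simple report requires a join: the query is more
# complex than a single-table scan, and the planner must use indexes
# on both sides of the join rather than one covering index.
for row in con.execute("""
    SELECT o.order_id, c.name, c.email, o.item
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
"""):
    print(row)
```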

In database design, a normalized database helps remove redundant data. This saves storage space and, more importantly, avoids various data anomalies, such as insertion, update, and deletion anomalies. However, normalizing a database creates its own problems: data retrieval becomes more complex and time-consuming. For this reason, a little data redundancy is often desirable and allowable, because it makes reporting easier. Generally, an OLTP (transactional) database has a normalized design, while a reporting database, such as a data warehouse, has a highly denormalized design.
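
As a rough sketch of that split (hypothetical names once more), a reporting database often materializes the join once into a wide, denormalized table, so that report queries become simple single-table reads:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY,
                            name TEXT, email TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         item TEXT);
    INSERT INTO customers VALUES (1, 'Ana', 'ana@new.example');
    INSERT INTO orders VALUES (1, 1, 'Book'), (2, 1, 'Pen');

    -- Denormalized reporting table: the join is paid once at load time,
    -- so downstream report queries are plain single-table scans.
    CREATE TABLE order_report AS
        SELECT o.order_id, c.name, c.email, o.item
        FROM orders o
        JOIN customers c ON c.customer_id = o.customer_id;
""")

for row in con.execute("SELECT * FROM order_report"):
    print(row)
```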