Referential integrity is a system of rules that RDBMS uses to make sure that relationships between records in related tables are valid, and that the related data are not accidentally deleted or changed.
- Referential integrity ensures that the relationship between two tables remains synchronized during updates and deletes.
- For example:
- Let assume that an application has both a Titles table and a Publishers table as shown in the following table.
Title Table | Publishers Table |
ti_isbn (PK) | pu_id (PK) |
ti_title | pu_name |
ti_publishedyear | pu_address |
pu_id (foreign key) | pu_phone |
- Referential integrity requires that these two tables must be synchronized. That is, each publisher identification (pu_id) in the Titles table must also exist in the Publishers table.
- Your application cannot just delete the pu_id row from the Publishers table because that would leave the pu_id in the Titles table without a reference. It would be permissible, however, to delete the pu_id row from the Publishers table and also delete every row in the Titles table that has the same pu_id. Such an action would maintain referential integrity for these two tables.
- In a similar manner, your application cannot just add a row to the Titles table without a valid pu_id already in the Publishers table. To do so would insert "bad" data in the pu_id field. So, your application must ensure a valid pu_id key in the Publishers table before inserting the pu_id in the related Titles row.
- The actual implementation of referential integrity depends entirely on the data storage engine you choose and your application's design requirements. Historically, applications using mainframe VSAM files used application code to handle referential integrity. Today, even if your application uses SQL Server, that does not mean you must use triggers, foreign keys, constraints, and cascading deletes to maintain referential integrity. You might again choose to handle referential issues with application-based code.