Relating data in different tables

If SQL could only handle one table at a time, it would be convenient, but not very powerful. The true power of SQL and relational databases lies in the fact that tables can be related to each other in useful ways, and SQL queries can gather data from multiple tables and display that data in a logical fashion.

We will show how multiple tables are useful by using a bank as an example.

Each customer of a bank may have more than 1 account. There is no real limit to the number of accounts a person might have. One customer might have a checking account, savings account, certificate of deposit, mortgage, credit card, and so on. Furthermore, a person may have multiple accounts of the same type. For example, a customer might have one savings account with retirement money and another savings account (of the same type) that has money for her daughter's college fund. We describe the "relationship" between a person and her accounts as a "one to many" relationship -- one person may have many accounts.

Because there is no limit to the number of accounts a person may have, there is no way to design a record structure ahead of time that can handle all possible combinations of accounts. And if you created a record structure that held the maximum number of accounts that anyone actually owned, you'd have to waste a lot of space. Let's suppose that we tried to build a single table that held all the information about one bank customer and her accounts. Our first draft might look like the following:

Customer ID Number
Customer Name
Customer Address
Checking Account #1 ID
Checking Account #1 Balance
CD #1 ID
CD #1 Balance
CD #2 ID
CD #2 Balance
...

As you can see, we just do not know when to stop because there is no obvious limit to the number of accounts that each person might own.

Another solution is to create multiple records, one for each account, and duplicate the customer information for each account. So we have a table that looks like:

If a customer has more than one account, we merely create a complete record for each account. This works reasonably well, but it means that every single account record holds all the information about the customer. This wastes storage space and also makes it harder to update the customer's address if the customer moves (you may have to update the address in several places).

Relational databases, such as solidDB^®'s, are designed to solve this problem. We will create one table for customers, and another table for accounts. (In a real bank, we would probably divide the accounts into multiple tables, too, with one table for checking accounts, another table for savings accounts, and so on) Then we create a "link" between the customer and each of her accounts. This allows us to waste very little space and yet still have complete information available to us.

As we mentioned earlier, in our example of composers, every record should have a unique value that allows us to identify that record. The unique value is usually just an integer. We'll use that unique integer to help us "relate" a customer to her accounts. This is discussed in more detail in Getting started with SQL.

When we create an account for a customer, we store that customer's ID number as part of the account information. Specifically, each row in the accounts table has a customer_id value, and that customer_id value matches the id of the customer who owns that account. Smith has customer id 1, and each of Smith's accounts has a 1 in the customer_id field. That means that we can find all of Smith's account records by doing the following:

It's as though you taped a copy of your home telephone number onto the forehead of each of your children when they went to school. If there is an emergency and you need to send a taxi driver to find and pick up your children at school, you can simply tell the taxi driver your phone number and he can check every child in the school to see if the child has your phone number. (This isn't very efficient, but it works.) By knowing the parent's id number, you can identify all the children. Conversely, by knowing each child, you can identify the parent. If, for example, one of your children is lost on a field trip away from the school, any helpful person can simply read the telephone number off the child's forehead and call you.

As you can see, the parent and child are linked to each other without any sort of physical contact. Simply having the id number (or phone number) is enough to determine which children belong to a parent and which parent belongs to each child. The technique works regardless of how many children you have.

Relational databases use the same technique. Note that join operations are not limited to two tables. It's possible to create joins with an almost arbitrary number of tables. As a realistic extension of our banking example, we might have another table, "checks", which holds information about each check written. Thus we would have not only a 1-to-many relationship from each customer to her accounts, but also a 1-to-many relationship from each checking account to all of the checks written on that account. It's quite possible to write a query that will list all the checks that a customer has written, even if that customer has multiple checking accounts.