Section 2: Relational Foundations

Tables, Keys, and Relationships

§1 Relational Foundations #

This section talks about foundational concepts for understanding how data is structured and managed in most modern database systems. We will explore core components such as tables and keys, and learn how they connect to form useful data stores.

§2 relational database concepts #

Relational databases are based on the relational model, an idea from mathematics. It’s a way of seeing data organized in tables.

§2.1 Tables, rows, columns #

+-----------+-----------+----------+
| StudentID | FirstName | LastName |
+-----------+-----------+----------+
| 14378     | Alice     | Smith    |
| 28371     | Bob       | Jones    |
+-----------+-----------+----------+
+-----------+-----------+----------+
| StudentID | FirstName | LastName |
+-----------+-----------+----------+
| 14378     | Alice     | Smith    |  ⬅︎ One Row of data
| 28371     | Bob       | Jones    |
+-----------+-----------+----------+
+-----------+-----------+----------+
| StudentID | FirstName | LastName |
+-----------+-----------+----------+
| 14378     | Alice     | Smith    |
| 28371     | Bob       | Jones    |
+-----------+-----------+----------+
                 ⬆︎ 
        One column of data

§2.2 Domains and data types #

A domain is the set of all possible legal values for a column, put another way: a domain is the set of all values that the column is allowed to hold, defined by the data type plus any additional constraints (NOT NULL, CHECK, FOREIGN KEY, etc.). This is enforced by data types.
For example, a StudentID column might have a data type of INTEGER, meaning only whole numbers are allowed. A FirstName column would be a TEXT or VARCHAR (variable character) type. Using the right data type is crucial for data integrity.

§2.2.1 A Note on NULL #

All data types accept the value NULL unless the column was explicitly set as NOT NULL. NULL represents the absence of value, and is different than an empty string or a 0. A Database always considers NULL to be an unknown value, SQL considers NULL = NULL to be a False statement, because unknown values cannot be compared.

§2.3 Schema vs. instance #

Schema: This is the blueprint or structure of the database. It defines the tables, the columns in those tables, the data types for each column, the keys, and the relationships between tables. The schema is designed once and doesn’t change often.

Schema changes are inherently high-risk: they usually require coordinated updates to every application that touches the database, careful data migration scripts, extensive regression testing, and often a maintenance window. A single mis-migration can corrupt data or break downstream systems, so schema alterations need to be planned, reviewed, and rehearsed far more rigorously than routine code changes.

Instance: is the live, self-contained execution environment that makes the database files accessible to clients.

Simple Instance Architecture Diagram

graph TD Cl((Client)) subgraph B[Single Instance] direction LR EP[Database Endpoint] end subgraph C[Data Layer] DB[(Database File)] end Cl -->|connect| B EP --> DB

Complex Sharded Instance Architecture Diagram

graph TD Cl((Client)) subgraph B["Logical Instance"] EP[Database Endpoint] end subgraph C[Shard Layer] S1[(Shard 1)] S2[(Shard 2)] Sn[(Shard n)] end Cl -->|connect| B EP --> S1 EP --> S2 EP --> Sn

Notice how in both diagrams, the client only connects to a single endpoint, regardless of the underlying database architecture.

§3 Data Types #

Data types are essential for maintaining data integrity by defining the legal values a column can hold. While the specific names and features can vary between database systems, they generally fall into several common categories:

Choosing the correct data type is a balance between flexibility and performance. Here is a comparison of common data types across different database systems:

Data TypeOraclePostgreSQLMySQLSQLite
32-bit intINTEGERINTEGERINTINTEGER
64-bit intNUMBER(19)BIGINTBIGINTINTEGER
Variable textVARCHAR2(n)VARCHAR(n)VARCHAR(n)TEXT
Exact decimalNUMBER(p,s)NUMERICDECIMALREAL
BooleanNUMBER(1)BOOLEANBOOLEANINTEGER (0/1)
Date & timeDATE/TIMESTAMPTIMESTAMPDATETIMETEXT/NUMERIC
Small decimalNUMBER(p,s)NUMERIC/DECIMALDECIMAL(p,s)REAL
Big decimalNUMBER(p,s)NUMERIC/DECIMALDECIMAL(p,s)REAL
Fixed-length stringCHAR(n)CHAR(n)CHAR(n)TEXT (affinity)
Variable-length stringVARCHAR2(n)/VARCHAR(n)VARCHAR(n)VARCHAR(n)TEXT
Large textCLOBTEXTLONGTEXTTEXT
Date onlyDATEDATEDATETEXT (ISO8601)
Time onlyN/A (DATE with time)TIMETIMETEXT (ISO8601)
Date/timeTIMESTAMPTIMESTAMPDATETIMETEXT (ISO8601)
Binary/blobBLOB/RAWBYTEABLOBBLOB

Rule of thumb: choose the smallest type that safely holds every possible value.
e.g. prefer an INTEGER to a BIGINT when storing whole numbers that will never be larger than 2,147,483,647.

§4 Database Keys #

Keys are special columns used to enforce rules and relationships. They are fundamental to the relational model. There are seven major key types in relational databases, each enforces a different aspect of integrity or navigability.

Examples of each type of database key

Table: Department
dept_code CHAR(2)dept_name VARCHAR(40)dept_phone CHAR(10)
CSComputer Science5551234567
MAMathematics5552345678
ENEngineering5553456789
Table: Employee
emp_id INT AUTO_INCREMENTdept_code CHAR(2)ssn CHAR(11)first_name VARCHAR(40)last_name VARCHAR(40)birth_date DATE
1CS123-45-6789AliceSmith1990-05-15
2CS234-56-7890BobJohnson1985-11-03
3MA345-67-8901CarolLee1992-08-21
4EN456-78-9012DavidPatel1988-02-27
5EN567-89-0123EveChen1991-12-09

§5 Database Relationships #

Relationships define how tables are connected to each other, using foreign keys.

§6 Questions #

Always provide at least a brief explanations for your answers.
Please number each response to directly correspond with the question. Provide clear, concise answers that fully address each question without unnecessary elaboration or essay-style writing.

  1. What is the difference between a primary key and a foreign key?
  1. Provide an example of a Many to Many relationship that might be found in a database containing banking data. what would the “junction” table be?
  1. What is the difference between a candidate key and a superkey?
  1. When is a surrogate key useful?
  1. Explain the difference between a database schema and a database instance.
  1. Why should you choose to use an INTEGER instead of a BIGINT when storing numbers such as zip codes.
  1. Assume the column email VARCHAR(255) is declared as UNIQUE in a Users table. Does this make email a candidate key? Justify your answer.
  1. A university database has the following tables:
Students:
StudentIDFirstNameLastName
Courses:
CourseIDTitle
Enrollments:
StudentIDCourseIDSemesterGrade

List all the foreign keys in the Enrollments table and explain what each one enforces.