SQL Keys & Constraints

The rules that keep your data honest, explained step by step with examples, violations, and real syntax.

In SQL, keys identify rows uniquely and link tables together. Constraints are rules that enforce the integrity of your data. They prevent invalid, duplicate, or missing values from ever entering the database.

Together, keys and constraints are the backbone of a well-structured relational database. Without them, data becomes inconsistent, unreliable, and hard to query correctly.

Why this matters: A database without constraints is like a spreadsheet with no validation. Anyone can enter duplicate IDs, blank required fields, or references to rows that do not exist. Constraints make the database itself enforce correctness, not just your application code.
00

Types of Keys & Constraints

SQL organises keys and constraints into four categories based on what they protect or provide. Each category has a distinct purpose in keeping your database reliable.

🔑 Keys — Row Identity

Identify individual rows and establish relationships between tables.

🔑
PRIMARY KEY
Uniquely identifies every row in a table. Cannot be NULL or duplicated. Each table has exactly one.
🔗
FOREIGN KEY
References a primary key in another table. Enforces that relationships always point to real rows.
🛡️ Integrity Constraints — Data Uniqueness

Prevent duplicate or ambiguous data from entering the database.

UNIQUE
Ensures no two rows share the same value in a column. Unlike PRIMARY KEY, it does allow NULL values. Useful for emails, usernames and phone numbers.
📋 Value Constraints — Column Rules

Control what values are allowed or automatically assigned to a column.

🚫
NOT NULL
Requires a value to always be present. Blocks empty entries for mandatory fields like names and prices.
CHECK
Validates a value against a custom condition before it is stored. Rejects the row if the condition is false.
📋
DEFAULT
Automatically fills in a preset value when no value is provided during INSERT. Does not reject data.
⚡ Performance — Query Speed

Not a data rule, but a structure that makes finding data faster.

INDEX
Builds a sorted lookup structure on a column so the database can find rows instantly instead of scanning the entire table. Automatically created for PRIMARY KEY and UNIQUE columns.

01

PRIMARY KEY

Unique + Not Null

A PRIMARY KEY uniquely identifies every row in a table. No two rows can share the same primary key value, and it can never be NULL.

Every table should have exactly one primary key. It can be a single column (simple key) or a combination of columns (composite key). The database automatically creates an index on the primary key for fast lookups.

-- Single-column primary key
CREATE TABLE students (
    student_id  INT          PRIMARY KEY,
    name        VARCHAR(100) NOT NULL,
    email       VARCHAR(150)
);

-- Composite primary key (defined at table level)
CREATE TABLE enrollments (
    student_id  INT,
    course_id   INT,
    enrolled_on DATE,
    PRIMARY KEY (student_id, course_id)
);

How it works: step by step

1

You declare one column (or a combination) as the PRIMARY KEY.

2

On every INSERT or UPDATE, the database checks: is this value already used? Is it NULL?

3

If either check fails, the operation is rejected with an error.

4

If both checks pass, the row is accepted and stored.

Worked Trace: INSERT attempts on the students table

Existing Data
student_idname
1Alice
2Bob
New INSERT attempts | Primary Key: student_id
id=3 INSERT student_id = 3 ✅ Unique and not null: ACCEPTED
id=1 INSERT student_id = 1 ❌ Duplicate value: REJECTED
NULL INSERT student_id = NULL ❌ NULL not allowed: REJECTED
📖 What this means
student_id = 3 is brand new and not NULL, so it is inserted successfully.
student_id = 1 already exists, so the database rejects it with a duplicate key error.
student_id = NULL violates the not-null rule and is rejected immediately.
This guarantees every row has a unique, stable identity you can always refer to.
❌ Error Examples
ERROR: duplicate key value violates unique constraint "students_pkey"
ERROR: null value in column "student_id" violates not-null constraint
02

FOREIGN KEY

Referential Integrity

A FOREIGN KEY links a column in one table to the PRIMARY KEY of another. It ensures you can't reference a row that doesn't exist.

Foreign keys enforce referential integrity: the guarantee that every relationship in your database points to something real. You cannot add an order for a customer who does not exist, and you cannot delete a customer who still has orders (unless you use CASCADE).

CREATE TABLE orders (
    order_id    INT  PRIMARY KEY,
    customer_id INT,
    amount      DECIMAL(10,2),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- With cascade options
FOREIGN KEY (customer_id)
    REFERENCES customers(customer_id)
    ON DELETE CASCADE    -- delete orders when customer is deleted
    ON UPDATE CASCADE;   -- update orders if customer_id changes

How it works: step by step

1

You declare a column as a FOREIGN KEY referencing another table's primary key.

2

On every INSERT or UPDATE in the child table, the database checks: does this value exist in the parent table?

3

If the value does not exist in the parent, the operation is rejected.

4

On DELETE or UPDATE of the parent row, the ON DELETE or ON UPDATE rule determines what happens to children.

Worked Trace: INSERT attempts on the orders table

customers (parent)
customer_idname
101Alice
102Bob
orders (child)
order_idcustomer_id
1101
New INSERT attempts into orders | FK: customer_id references customers(customer_id)
cid=102 INSERT order with customer_id = 102 ✅ Exists in customers: ACCEPTED
cid=999 INSERT order with customer_id = 999 ❌ Not in customers: REJECTED
NULL INSERT order with customer_id = NULL ⚠️ Allowed unless NOT NULL is also set
📖 What this means
customer_id = 102 exists in the customers table → order inserted successfully.
customer_id = 999 has no matching row in customers → database rejects it. This prevents "orphan" orders that point to nobody.
NULL is allowed by default unless you also add NOT NULL. It means "order with no customer assigned yet".

CASCADE options explained

What happens to child rows when parent row is deleted
ON DELETE CASCADEChild rows are automatically deleted too
ON DELETE SET NULLFK column set to NULL in child rows
ON DELETE RESTRICTParent delete is blocked if children exist (default)
ON DELETE NO ACTIONLike RESTRICT but checked at end of transaction
03

UNIQUE

No Duplicates

A UNIQUE constraint ensures no two rows have the same value in the specified column(s). Unlike PRIMARY KEY, a UNIQUE column can hold NULL.

Use UNIQUE for columns that are natural identifiers but not the primary key, such as email addresses, usernames, or phone numbers. A table can have multiple UNIQUE constraints but only one PRIMARY KEY.

CREATE TABLE users (
    user_id  INT          PRIMARY KEY,
    email    VARCHAR(200) UNIQUE,        -- inline
    username VARCHAR(50),
    phone    VARCHAR(20),
    UNIQUE (username),                   -- table level
    UNIQUE (phone)
);

How it works: step by step

1

On INSERT or UPDATE, the database scans existing values in the UNIQUE column.

2

If the new value already exists, the operation is rejected.

3

If the value is NULL, it passes (most databases allow multiple NULLs in a UNIQUE column).

4

If the value is brand new, the row is inserted or updated successfully.

Worked Trace: INSERT attempts on the users table

Existing Data
user_idemailusername
1alice@x.comalice
2bob@x.combob
New INSERT attempts | UNIQUE constraints on email and username
email carol@x.com (new email) ✅ Not a duplicate: ACCEPTED
email alice@x.com (already exists) ❌ Duplicate: REJECTED
NULL email = NULL ✅ NULL allowed in UNIQUE: ACCEPTED
user username = "alice" (already exists) ❌ Duplicate username: REJECTED
📖 What this means
New unique values pass freely. Duplicate non-null values are always rejected.
NULL is a special case. It represents "unknown", so most databases treat multiple NULLs as not equal to each other, allowing several rows to have NULL in a UNIQUE column.
Difference from PRIMARY KEY: UNIQUE allows NULL; PRIMARY KEY does not. A table can have many UNIQUE constraints, but only one PRIMARY KEY.
04

NOT NULL

Value Required

NOT NULL ensures a column always has a value. It can never be left empty (NULL). It is the simplest and most common constraint.

By default, every column in SQL accepts NULL unless you explicitly add NOT NULL. Use it for fields that must always be present: names, dates, status codes, prices, and so on.

CREATE TABLE products (
    product_id   INT           PRIMARY KEY,
    product_name VARCHAR(200)  NOT NULL,   -- must be provided
    price        DECIMAL(8,2)  NOT NULL,   -- must be provided
    description  TEXT                        -- nullable (optional)
);

How it works: step by step

1

When a row is inserted or updated, the database checks every NOT NULL column.

2

If a NOT NULL column has no value supplied (or is explicitly NULL), it is rejected immediately.

3

If all NOT NULL columns have values, the row proceeds to other constraint checks.

Worked Trace: INSERT attempts on the products table

NOT NULL columns: product_name, price
INSERT (1, "Laptop", 999.99, "Fast laptop") ✅ All NOT NULL fields present: ACCEPTED
INSERT (2, "Mouse", 29.99, NULL) ✅ description is nullable: ACCEPTED
INSERT (3, NULL, 49.99, "Keyboard") ❌ product_name is NULL: REJECTED
INSERT (4, "Monitor", NULL, NULL) ❌ price is NULL: REJECTED
📖 What this means
Row 1: every required field has a value, so it is accepted.
Row 2: description is NULL but it has no NOT NULL constraint, so NULL is perfectly fine here.
Row 3: product_name was not provided. A product with no name makes no sense, so it is rejected.
Row 4: price is NULL. A product must always have a price, so it is rejected.
The description column is intentionally optional and valid to leave blank.
05

CHECK

Custom Condition

A CHECK constraint lets you define a custom rule that every row must satisfy. If the condition evaluates to false, the row is rejected.

CHECK is the most flexible constraint. You can write any valid SQL expression. Common uses include ensuring age is within range, salary is above minimum, or status is one of a fixed set of values.

CREATE TABLE employees (
    emp_id   INT          PRIMARY KEY,
    name     VARCHAR(100) NOT NULL,
    age      INT          CHECK (age >= 18 AND age <= 65),
    salary   DECIMAL(10,2) CHECK (salary > 0),
    status   VARCHAR(20)  CHECK (status IN ('active', 'inactive', 'on_leave'))
);

How it works: step by step

1

On INSERT or UPDATE, the database evaluates the CHECK expression with the new column value.

2

If the expression returns TRUE, the row is accepted and passes this constraint.

3

If the expression returns FALSE, the row is rejected with an error.

4

If the expression returns NULL (for example when the column is NULL), the row is accepted because NULL is not the same as FALSE.

Worked Trace: INSERT attempts on the employees table

CHECK constraints: age (18–65), salary (>0), status (active/inactive/on_leave)
age=30, salary=50000, status='active' ✅ All checks pass: ACCEPTED
age age = 15 (fails age >= 18) ❌ CHECK violated: REJECTED
salary salary = -500 (fails salary > 0) ❌ CHECK violated: REJECTED
status status = 'fired' (not in allowed list) ❌ CHECK violated: REJECTED
NULL age = NULL ⚠️ NULL skips CHECK (not FALSE): ACCEPTED
📖 What this means
Row 1: all three conditions evaluate to TRUE, so it is inserted.
age = 15: the expression 15 >= 18 is FALSE, so it is rejected.
salary = -500: the expression -500 > 0 is FALSE, so it is rejected.
status = 'fired': not in the IN list, so it evaluates to FALSE and is rejected.
NULL edge case: when the column is NULL, CHECK evaluates to NULL rather than FALSE, so the row passes. To also block NULLs, combine CHECK with NOT NULL.
06

DEFAULT

Auto-fill Value

DEFAULT automatically fills in a value when no value is provided during INSERT. It does not restrict data, it supplies it.

DEFAULT is the only constraint that adds data rather than rejecting it. It is useful for timestamps (created_at), boolean flags (is_active), counters, or any field with a sensible starting value.

CREATE TABLE orders (
    order_id    INT          PRIMARY KEY,
    customer_id INT          NOT NULL,
    status      VARCHAR(20)  DEFAULT 'pending',
    created_at  TIMESTAMP    DEFAULT CURRENT_TIMESTAMP,
    quantity    INT          DEFAULT 1,
    is_paid     BOOLEAN      DEFAULT FALSE
);

How it works: step by step

1

On INSERT, the database checks each DEFAULT column: was a value provided?

2

If yes, the provided value is used as-is.

3

If the column was omitted from the INSERT statement, the DEFAULT value is substituted automatically.

4

If NULL is explicitly inserted, NULL is used and the DEFAULT is not applied.

Worked Trace: INSERT attempts on the orders table

DEFAULT columns: status='pending', quantity=1, is_paid=FALSE
full INSERT (1, 101, 'shipped', NOW(), 3, TRUE) ✅ All values provided: used as-is
part INSERT (2, 101) — status, qty, is_paid omitted ✅ Defaults fill in: 'pending', 1, FALSE
NULL INSERT (3, 101, NULL, ...) ⚠️ Explicit NULL overrides DEFAULT: status stored as NULL
✅ Resulting stored data
order_idcustomer_idstatusquantityis_paid
1101shipped3TRUE
2101pending1FALSE
3101NULL1FALSE
📖 What this means
Row 1: all values were given explicitly, so nothing is substituted.
Row 2: only order_id and customer_id were provided. The database auto-filled status, quantity, and is_paid from their DEFAULT values.
Row 3: status = NULL was explicitly written. DEFAULT is not applied when NULL is given directly. To prevent this, also add NOT NULL to the column.
07

INDEX

Performance Booster

An INDEX is a lookup structure that speeds up queries on a column. It does not restrict data; it makes finding it faster.

Without an index, the database must scan every row (a full table scan) to find matching values. An index works like a book's index: you jump straight to the right page instead of reading every page. The trade-off is that indexes use extra storage and slow down INSERT and UPDATE operations slightly.

-- Create a regular index
CREATE INDEX idx_orders_customer
ON orders (customer_id);

-- Create a unique index (same effect as UNIQUE constraint)
CREATE UNIQUE INDEX idx_users_email
ON users (email);

-- Composite index (for queries that filter on both columns)
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date);

-- Drop an index
DROP INDEX idx_orders_customer;

How it works: step by step

1

When you create an index, the database builds a sorted B-tree structure of the column's values, each pointing to the actual row location.

2

When a query filters or sorts by that column, the database uses the index to locate matching rows instantly instead of scanning the whole table.

3

On INSERT, UPDATE, or DELETE, the index is automatically updated to stay in sync.

Worked Trace: query performance with and without an index

Table: orders (1,000,000 rows) | Query: WHERE customer_id = 102
slow Without index: full table scan Reads all 1,000,000 rows — approx 800ms
fast With index on customer_id: B-tree lookup Reads approx 20 rows directly — approx 1ms
📖 What this means
Without an index, finding all orders for customer_id = 102 requires checking every single row. With an index, the database jumps directly to the matching rows using the sorted B-tree structure.

When to add an index: columns used frequently in WHERE, JOIN ON, or ORDER BY clauses.
When not to add one: small tables where a full scan is already fast, columns with very few distinct values, or tables with very frequent INSERT and UPDATE operations where index maintenance adds overhead.
Note: PRIMARY KEY and UNIQUE constraints automatically create an index behind the scenes. You don't need to create one manually for those columns.

08

Quick Reference Summary

All seven keys & constraints at a glance:

Constraint Allows NULL? Allows Duplicates? Rejects Data? Per Table
PRIMARY KEYNoNoYesOnly 1
FOREIGN KEYYes*YesYesMany
UNIQUEYesNoYesMany
NOT NULLNoYesYesMany
CHECKYesYesYesMany
DEFAULTYesYesNoMany
INDEXYesYes*NoMany

* FOREIGN KEY allows NULL unless combined with NOT NULL. UNIQUE INDEX does not allow duplicates.