This section is continuing off of the previous section. covering more advanced concepts for writing Stored Programs in SQL

The SQL examples in this section will be using PostgreSQL syntax unless otherwise noted. Each DBMS’s language implementation will differ.

§1 Returning Tables from Functions #

While the previous section focused on functions returning a single scalar value, stored functions can also be designed to return a table (a set of rows and columns). This allows you to capture complex queries or data transformations within a function and use its result directly in your FROM clause, much like a regular table or view.

§1.1 Syntax for `RETURNS TABLE` #

To return a table, you specify RETURNS TABLE followed by the column definitions (name and data type) of the table that the function will output.

CREATE FUNCTION function_name (
  [ parameter_name data_type [ DEFAULT value ], ... ]
)
RETURNS TABLE (
  column1_name data_type,
  column2_name data_type,
  -- ... more columns
)
LANGUAGE plpgsql
AS $$
BEGIN
  -- SQL statements and procedural logic go here
  -- The final SELECT statement's result set must match the RETURNS TABLE definition
  RETURN QUERY SELECT ...;
END;
$$;

§1.2 Example: Function Returning a Table of Employees by Department #

Let’s create a function that returns a list of employees for a given department.

-- Assume an Employees table with employee_id, first_name, last_name, department_id
-- and a Departments table with department_id, department_name

CREATE OR REPLACE FUNCTION GetEmployeesByDepartment (
  p_department_name VARCHAR(100)
)
RETURNS TABLE (
  employee_id INT,
  full_name VARCHAR(101),
  department VARCHAR(100)
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    e.employee_id,
    e.first_name || ' ' || e.last_name AS full_name,
    d.department_name
  FROM Employees e
  JOIN Departments d ON e.department_id = d.department_id
  WHERE d.department_name = p_department_name;
END;
$$;

§1.3 Calling Functions that Return Tables #

Functions that return tables are typically called in the FROM clause of a SELECT statement, using the SELECT * FROM function_name(...) syntax.

SELECT * FROM GetEmployeesByDepartment('Sales');

 employee_id | full_name  | department 
-------------+------------+------------
           3 | Alice Brown| Sales      
           5 | Charlie Day| Sales

§2 Control Flow Statements #

Control flow statements allow you to introduce logic into your stored programs, allowing for logical decisions and repeat actions based on conditions.

§2.1 Selection Statements #

Selection statements allow your code to execute different blocks of code based on specific conditions.

IF-THEN-ELSE: Executes a block of code if a condition is true, and optionally another block if it’s false.

IF first_condition THEN
  -- runs if first_condition is true
ELSIF different_condition THEN --note the syntax is `ELSIF`
  -- runs if different_condition is true
ELSE
  -- runs if no previous condition is true
END IF; -- must always end an IF statement with END IF

Example: return different values based on salary value:

CREATE OR REPLACE FUNCTION GetEmployeeStatus(p_salary DECIMAL(10,2))
RETURNS VARCHAR(20)
LANGUAGE plpgsql
AS $$
BEGIN
  IF p_salary >= 70000.00 THEN
    RETURN 'Senior';
  ELSIF p_salary >= 50000.00 THEN
    RETURN 'Mid-Level';
  ELSE
    RETURN 'Junior';
  END IF;
END;
$$;

CASE Statements: Provides a more structured way to handle multiple conditional branches, especially when checking a single expression against several possible values.

CASE expression
  WHEN value1 THEN
    -- runs if expression = value1
  WHEN value2 THEN
    -- runs if expression = value2
  ELSE
    -- runs if no match
END CASE;

Example: return value based on department name:

CREATE OR REPLACE FUNCTION GetDepartmentCategory(p_department_name VARCHAR(100))
RETURNS VARCHAR(20)
LANGUAGE plpgsql
AS $$
DECLARE
  v_category VARCHAR(20);
BEGIN
  -- store result of CASE statement into v_category variable
  v_category = CASE p_department_name
    WHEN 'Sales' THEN 'Revenue Generating'
    WHEN 'Marketing' THEN 'Revenue Generating'
    WHEN 'HR' THEN 'Support'
    WHEN 'IT' THEN 'Support'
    ELSE 'Other'
  END;
  RETURN v_category;
END;
$$;

Another Example: return value based on data from another query:
This example returns an employee status based on data from a table.

CREATE OR REPLACE FUNCTION GetEmployeeStatus(p_employee_id INT)
RETURNS VARCHAR(20)
LANGUAGE plpgsql
AS $$
DECLARE
  v_status VARCHAR(20);
BEGIN
  SELECT
    CASE
      WHEN is_active = TRUE AND hire_date <= (CURRENT_DATE - INTERVAL) '90 days' THEN 'Permanent'
      WHEN is_active = TRUE AND hire_date > (CURRENT_DATE - INTERVAL) '90 days' THEN 'Probationary'
      ELSE 'Inactive'
    END
  INTO v_status  -- store result of the conditional in v_status
  FROM employees
  WHERE employee_id = p_employee_id;

  RETURN v_status;
END;
$$;

§2.2 Iteration Statements #

Iteration statements (better known as “loops”) allow you to execute a block of code repeatedly.

LOOP: An unconditional loop that requires an explicit EXIT condition.

LOOP
  -- statements go here
  IF condition THEN
    EXIT; -- Exit the loop
  END IF;
END LOOP;

WHILE LOOP: Executes a block of code as long as a condition is true.

WHILE condition LOOP
  -- statements (must eventually update condition to FALSE or this will run infinitely)
END LOOP;

FOR LOOP: Iterates over a range of numbers. Or more commonly in SQL, over the result set of a query.

-- Iterating over a range
FOR counter IN 1..10 LOOP
  -- statements run 10 times
END LOOP;

-- Iterating over query results
FOR record_var IN SELECT column1, column2 FROM my_table WHERE condition LOOP
  -- statements using record_var.column1, record_var.column2
END LOOP;

When working with queries in a loop, you will need somewhere to store the returned rows of data. the RECORD data type can hold an entire row of data.

Example: FOR LOOP printing contents of table:

CREATE OR REPLACE PROCEDURE LogAllEmployeeEmails()
LANGUAGE plpgsql
AS $$
DECLARE
  employee_record RECORD; -- variable to hold an entire row
BEGIN
  FOR employee_record IN SELECT first_name, last_name, email FROM Employees LOOP
    RAISE NOTICE 'Employee: % %, Email: %', employee_record.first_name, employee_record.last_name, employee_record.email;
    -- RAISE NOTICE is akin to a print statement in many programming languages. 
  END LOOP;
END;
$$;

NOTICE:  Employee: Alice Brown, Email: abrown@example.com
NOTICE:  Employee: David Evans, Email: devans@example.com
NOTICE:  Employee: Charlie Day, Email: cday@example.com
NOTICE:  Employee: Fiona Green, Email: fgreen@example.com

Practical Example: Insert rows from a query into a table

CREATE OR REPLACE PROCEDURE LogAllEmployeeEmailsToTable()
LANGUAGE plpgsql
AS $$
DECLARE
  employee_record RECORD;
BEGIN
  -- Loop through each employee returned from query 
  FOR employee_record IN SELECT first_name, last_name, email FROM Employees LOOP
    -- Insert the employee's details into the email_log table
    INSERT INTO email_log (employee_first_name, employee_last_name, employee_email)
    VALUES (employee_record.first_name, employee_record.last_name, employee_record.email);
  END LOOP;

  -- RAISE NOTICE 'All employee emails have been logged to the email_log table.';
END;
$$;

NOTICE:  All employee emails have been logged to the email_log table.

§3 Cursors #

A cursor is a database object that enables you to traverse and process the rows of a query result set one by one. Generally a set-based operation is preferred for performance, but cursors are sometimes necessary for row-by-row processing, especially when complex logic needs to be applied to each individual row.

A cursor allows you to iterate through the rows returned by a SELECT statement, fetching one row at a time and performing operations on it.

When to use cursors:

When you need to perform complex, row-specific logic that cannot be easily expressed with set-based SQL operations.
When you need to update or delete the current row being processed.
For administrative tasks or data migration scripts where row-by-row processing is unavoidable.

When to avoid cursors:

Cursors are generally much slower and resource-intensive than set-based SQL operations (e.g., UPDATE ... WHERE, INSERT ... SELECT). If you can achieve the same result with a single UPDATE, DELETE, or INSERT statement, do so.

§3.1 Working with Cursors #

This example demonstrates how to declare a cursor and use it to perform operations on rows returned from a query.

CREATE OR REPLACE PROCEDURE LogDepartment1EmployeeSalaries()
LANGUAGE plpgsql
AS $$
DECLARE
  -- Declare the cursor: [ NAME ] CURSOR FOR [query]...
  employee_cursor CURSOR FOR
    SELECT employee_id, first_name, last_name, salary 
    FROM Employees WHERE department_id = 1;

  -- Variables to hold temporary data from rows
  emp_id INT;
  f_name VARCHAR(50);
  l_name VARCHAR(50);
  emp_salary DECIMAL(10,2);
BEGIN
  -- Open the cursor
  OPEN employee_cursor;

  LOOP
    -- Fetch a row from the cursor
    FETCH employee_cursor INTO emp_id, f_name, l_name, emp_salary;

    -- Exit the loop after processing all rows (or if no rows returned)
    EXIT WHEN NOT FOUND; -- an exit statement is required to complete the cursor

    -- Logic to process the fetched row: 
    -- here we update salary for this specific employee
    UPDATE Employees SET salary = (emp_salary * 1.05) WHERE employee_id = emp_id;
    -- Note: this specific example would be better accomplished with a simple UPDATE, 
    -- this example is simple to focus on the syntax of CURSOR.
  END LOOP;

  -- Close the cursor
  CLOSE employee_cursor;
END;
$$;

This PL/pgSQL procedure iterates through each employee in department 1 using a cursor, fetching their details one by one. For each employee, it then updates their salary in the Employees table by increasing it by 5%.

An example of a task that would be suitable for a CURSOR would be generating individualized reports for a table of customers, where specific data dictates unique report content.

§4 Error Handling #

Stored programs should include mechanisms to handle errors gracefully. Error handlers can prevent unexpected failures and provide informative feedback.

NOTE: Different DBMS have their own syntax for error handling.

An error handler in SQL is similar in form to a try/catch block from other languages like C++ or Java.

BEGIN
  -- Your code that might raise an error
EXCEPTION
  WHEN division_by_zero THEN
    -- Handle specific error
    RAISE WARNING 'Division by zero occurred!';
  WHEN OTHERS THEN
    -- Catch-all to handle any other error, printing the error type and message
    RAISE EXCEPTION 'An unexpected error occurred (SQLSTATE: %): %', SQLSTATE, SQLERRM;
END;

Example:

CREATE OR REPLACE FUNCTION SafeDivide(numerator DECIMAL, denominator DECIMAL)
RETURNS DECIMAL
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN numerator / denominator;
EXCEPTION
  WHEN division_by_zero THEN
    RAISE WARNING 'Attempted to divide % by zero. Returning NULL.', numerator;
    RETURN NULL;
  WHEN OTHERS THEN
    RAISE EXCEPTION 'An unexpected error occurred (SQLSTATE: %): %', SQLSTATE, SQLERRM;
END;
$$;

Here is the output when you call SELECT SafeDivide(10, 0);

WARNING:  Attempted to divide 10 by zero. Returning NULL.
 safedivide 
------------
       NULL
(1 row)

§4.1 Error Types and SQLSTATE Codes #

SQL error handling relies on predefined error conditions or SQLSTATE codes. Each SQLSTATE is a five-character string, with the first two characters indicating the class of the error and the last three indicating the subclass.

You can catch SQL errors using either the specific names (such as division_by_zero) or the SQLSTATE code.

For example:

22012: division_by_zero
23502: not_null_violation

BEGIN
  -- Your SQL code
EXCEPTION
  WHEN SQLSTATE '22012' THEN -- division_by_zero
    RAISE WARNING 'Caught division by zero using SQLSTATE!';
  WHEN not_null_violation THEN -- 23502
    RAISE EXCEPTION 'A NOT NULL constraint was violated.';
  WHEN OTHERS THEN  -- Catch-all
    RAISE EXCEPTION 'An unexpected error occurred (SQLSTATE: %): %', SQLSTATE, SQLERRM;
END;

You can find the list of SQLSTATE codes along with their names in your DBMS documentation, for PostgreSQL this can be found here: https://www.postgresql.org/docs/current/errcodes-appendix.html

§5 More Resources #

PostgreSQL specific resources:

Questions: #

All of these questions will be assuming use of plpgsql

What clause must immediately follow the function parameter list when you want the function to return an entire table (CREATE FUNCTION function_name () ____?

provide the syntax for calling a function named my_example_function() that returns a table?

Which form of CASE statement is best suited for checking one expression against many constant values?

In a FOR loop that iterates over query results, what data type is used to hold each row?

What keyword raises a message to the console, similar to a print statement in other languages?

in a CURSOR, what does the statement EXIT WHEN NOT FOUND do? When will this statement run?

When should you avoid CURSOR statements? Why?

In the EXCEPTION clause, what keyword catches every error that is not explicitly listed?

Which SQLSTATE code corresponds to null_value_not_allowed in PostgreSQL?

Complete the body of this function so that it returns every order for the customer whose name matches p_customer_name. Assume you are getting data from a table named Orders which contains: order_id, order_date, total_amount, and customer_id columns, and a Customers table which has customer_id and customer_name columns.

CREATE OR REPLACE FUNCTION GetOrdersByCustomer (
  p_customer_name VARCHAR(100)
)
RETURNS TABLE (
  order_id INT,
  order_date DATE,
  total_amount NUMERIC(10,2)
)
LANGUAGE plpgsql
AS $$
BEGIN
  -- complete this function

END;
$$;

Complete the function so that it returns High when p_score is greater or equal to 90, Mid when p_score is greater or equal to 60, and Low otherwise.

CREATE OR REPLACE FUNCTION GetGradeBand(p_score INT)
RETURNS TEXT
LANGUAGE plpgsql
AS $$
BEGIN
  -- complete this function
END;
$$;

Complete the body of this function so it uses a CASE statement to return Fast when p_delivery_days is 2 or fewer, Standard for 3–5 days, and Slow for anything longer.

CREATE OR REPLACE FUNCTION GetShippingSpeed(p_delivery_days INT)
RETURNS VARCHAR(10)
LANGUAGE plpgsql
AS $$
DECLARE
  v_speed VARCHAR(10);
BEGIN
  -- complete using CASE statement
END;
$$;

Section 13: Advanced Stored Programs