Recursion in SQL - User Study

Like any programming language, SQL allows you to express complex computation in various (equivalent) forms. This survey studies two particular forms of authoring user-defined SQL functions (UDFs). Both forms are used to implement the same recursive programming problems—but the forms read differently:

This survey compares these two forms in terms of readability and thus focuses on your ability to read and understand snippets of SQL code. We therefore ask you to work through the study without the use of external programs: answer the questions below without executing the UDFs on a SQL database system. Use pen and paper only (if required at all). Thank you!

The SQL dialect presented to you in this survey leans heavily on PostgreSQL 12. If any feature found in the following queries is new to you, feel free to look them up here.

Important: We will also ask you for the time it took you to complete each task. As you complete each task, keep track of the required time and enter it in the boxes below. Please answer truthfully. Each task should take about 5 − 10 minutes, if you find yourself exceeding this timespan, please feel free to skip to the next task (in this case, please tick the corresponding box).

The entire survey should take about 30 − 45 minutes to complete. Once done, please make sure to press the Submit your results!-button at the end.

Graham et. al. define the Fibonacci number function in Concrete Mathematics, 2nd edition as:

We propose the following UDFs which aim to compute the Fibonacci number of

n

$n$ using SQL. You can assume that each of the UDFs are syntactically sound but only some of them implement the Fibonacci number function correctly.

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION 
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n - f.curr_n
    FROM   fib AS f
    WHERE  f.i <= n
  )
  SELECT f.curr_n 
  FROM   fib AS f
  WHERE  f.i = n;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    SELECT 1, 0, 1
      UNION 
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i BETWEEN 0 AND n
  )
  SELECT f.curr_n
  FROM   fib AS f
  WHERE  f.i < n;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION ALL
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i <= n
  )
  SELECT   f.prev_n
  FROM     fib AS f
  ORDER BY f.i DESC
  LIMIT 1;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION ALL
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i < n
  )
  SELECT   f.curr_n
  FROM     fib AS f
  ORDER BY f.i 
  LIMIT 1;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT 0
  WHERE  n <= 1
    UNION ALL
  SELECT fib(n-1) + fib(n-2)
  WHERE  n > 1
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN 0
    WHEN n = 1 THEN 1
    ELSE fib(n-1) + fib(n-2)
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN 0
    WHEN n = 1 THEN 1
    ELSE fib(n-1) - fib(n-2)
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT 1
  WHERE  n <= 2
    UNION ALL
  SELECT fib(n-1) + fib(n-2)
  WHERE  n > 2
$$ LANGUAGE SQL;

Knuth et. al. define the calculation of the greatest common divisor in The Art of Programming (vol. 2 Seminumerical Algorithms), 3rd edition as:

We propose the following SQL UDFs which aim to compute the greatest common divisor of

u

$u$ and

v

$v$ . You can assume that each of the UDFs are syntactically sound but only some of them implement the greatest common divisor function correctly.

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT CASE
    WHEN k = 0 THEN n
    ELSE gcd(k, MOD(n,k))
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT n
  WHERE  k = 0
    UNION ALL
  SELECT gcd(k, MOD(k,n))
  WHERE  k > 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN k
    ELSE gcd(k, MOD(k,n))
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT k
  WHERE  k = 0
    UNION 
  SELECT gcd(k, MOD(n,k))
  WHERE  k > 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k <> 0
  )
  SELECT g.n
  FROM   gcd AS g
  WHERE  g.k = 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION 
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k > 0
  )
  SELECT   g.n
  FROM     gcd AS g
  ORDER BY g.k
  LIMIT 1;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k <> 0
  )
  SELECT g.k
  FROM   gcd AS g
  WHERE  g.k = 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k = 0
  )
  SELECT   g.n
  FROM     gcd AS g
  ORDER BY g.k DESC
  LIMIT 1;
$$ LANGUAGE SQL;

CREATE FUNCTION f(i int, j int, k float) 
RETURNS float AS $$
  SELECT CASE 
    WHEN i > j THEN k 
    ELSE (SELECT f(i+1, j, s.b + 0.5 * k)
          FROM   s
          WHERE  s.a = i)
  END;
$$ LANGUAGE SQL;

Keep track on how long it takes you to describe (in 100 words or less) what the function f(i int, j int, 0) does:

CREATE FUNCTION g(a int) 
RETURNS bigint AS $$
  WITH RECURSIVE 
  r(x, y, z) AS (
    SELECT t.x, t.y, t.z
    FROM   t
    WHERE  t.x = a
      UNION 
    SELECT r.x, t.y, t.z
    FROM   r, t
    WHERE  r.y = t.x
  )
  SELECT   SUM(r.z) 
  FROM     r;
$$ LANGUAGE SQL;

Keep track on how long it takes you to describe (in 100 words or less) what the function g(a int) does:

Important: This part builds on functions found in part 3 and part 4 of this survey. Make sure to answer these parts first, before you continue here.

Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 3 when called as f(1,3,0).
Write down the result:

Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 4 when called as g(4).
Write down the result:

Martello et. al. define the 0-1 Knapsack Problem in Knapsack Problems - Algorithms and Computer Implementations as follows:

Consider items

i \in {1, \dots, n}

$i ∈ \{1,\ldots,n\}$ where each item has a weight

w_{i}

$w_i$ and a value

p_{i}

$p_i$ . Then

k n a p (n, w)

$\mathit{knap}(n,w)$ maximizes the sum of values of items that fit into a knapsack of weight

w

$w$ and is defined as:

Keep track of how long it takes you to complete the SQL function knap(n,w) which solves the 0-1 Knapsack problem either using a recursive or CTE-based UDF. Submit your solution in the following textbox in which we also provide to you the definition of table items which holds all items i and their respective weight w and value p to be used in this task.