RECURSION IN SQL

User Study

Introduction

Thank you for participating in this anonymous user study!

Like any programming language, SQL allows you to express complex computation in various (equivalent) forms. This survey studies two particular forms of authoring user-defined SQL functions (UDFs). Both forms are used to implement the same recursive programming problems—but the forms read differently:

This survey compares these two forms in terms of readability and thus focuses on your ability to read and understand snippets of SQL code. We therefore ask you to work through the study without the use of external programs: answer the questions below without executing the UDFs on a SQL database system. Use pen and paper only (if required at all). Thank you!

The SQL dialect presented to you in this survey leans heavily on PostgreSQL 12. If any feature found in the following queries is new to you, feel free to look them up here.

Important: We will also ask you for the time it took you to complete each task. As you complete each task, keep track of the required time and enter it in the boxes below. Please answer truthfully. Each task should take about 5 − 10 minutes, if you find yourself exceeding this timespan, please feel free to skip to the next task (in this case, please tick the corresponding box).

The entire survey should take about 30 − 45 minutes to complete. Once done, please make sure to press the Submit your results!-button at the end.

Before we begin, please list the three (regular) programming languages that you are most familiar with:
1.
2.
3.

I have had some exposure to Functional Programming:
Yes No

Part 1: Fibonacci Numbers

Graham et. al. define the Fibonacci number function in Concrete Mathematics, 2nd edition as:

(1)fib(0)=0fib(1)=1fib(n)=fib(n1)+fib(n2)

We propose the following UDFs which aim to compute the Fibonacci number of n using SQL. You can assume that each of the UDFs are syntactically sound but only some of them implement the Fibonacci number function correctly.

  1. Keep track of how long it takes for you to choose which of the following CTE-based UDFs implement the Fibonacci number function correctly (there may be more than one):

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION 
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n - f.curr_n
    FROM   fib AS f
    WHERE  f.i <= n
  )
  SELECT f.curr_n 
  FROM   fib AS f
  WHERE  f.i = n;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    SELECT 1, 0, 1
      UNION 
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i BETWEEN 0 AND n
  )
  SELECT f.curr_n
  FROM   fib AS f
  WHERE  f.i < n;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION ALL
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i <= n
  )
  SELECT   f.prev_n
  FROM     fib AS f
  ORDER BY f.i DESC
  LIMIT 1;
$$ LANGUAGE SQL;  

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  WITH RECURSIVE
  fib(i, prev_n, curr_n) AS (
    VALUES (1, 0, 1)
      UNION ALL
    SELECT f.i + 1,
           f.curr_n,
           f.prev_n + f.curr_n
    FROM   fib AS f
    WHERE  f.i < n
  )
  SELECT   f.curr_n
  FROM     fib AS f
  ORDER BY f.i 
  LIMIT 1;
$$ LANGUAGE SQL;

Solving this task took about minutes.
Or, tick this () if you want to skip it.

  1. Keep track of how long it takes for you to choose which of the following recursive UDFs implement the Fibonacci number function correctly (there may be more than one):

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT 0
  WHERE  n <= 1
    UNION ALL
  SELECT fib(n-1) + fib(n-2)
  WHERE  n > 1
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN 0
    WHEN n = 1 THEN 1
    ELSE fib(n-1) + fib(n-2)
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN 0
    WHEN n = 1 THEN 1
    ELSE fib(n-1) - fib(n-2)
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION fib(n int)
RETURNS int AS $$
  SELECT 1
  WHERE  n <= 2
    UNION ALL
  SELECT fib(n-1) + fib(n-2)
  WHERE  n > 2
$$ LANGUAGE SQL;

Solving this task took about minutes.
Or, tick this () if you want to skip it.

Part 2: Greatest Common Divisor

Knuth et. al. define the calculation of the greatest common divisor in The Art of Programming (vol. 2 Seminumerical Algorithms), 3rd edition as:

(2)gcd(n,0)=ngcd(n,k)=gcd(k,n mod k)

We propose the following SQL UDFs which aim to compute the greatest common divisor of u and v. You can assume that each of the UDFs are syntactically sound but only some of them implement the greatest common divisor function correctly.

  1. Keep track of how long it takes for you to choose which of the following recursive UDFs compute the greatest common divisor correctly (there may be more than one):

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT CASE
    WHEN k = 0 THEN n
    ELSE gcd(k, MOD(n,k))
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT n
  WHERE  k = 0
    UNION ALL
  SELECT gcd(k, MOD(k,n))
  WHERE  k > 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT CASE
    WHEN n = 0 THEN k
    ELSE gcd(k, MOD(k,n))
  END;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  SELECT k
  WHERE  k = 0
    UNION 
  SELECT gcd(k, MOD(n,k))
  WHERE  k > 0;
$$ LANGUAGE SQL;

Solving this task took about minutes.
Or, tick this () if you want to skip it.

  1. Keep track of how long it takes for you to choose which of the following CTE-based UDFs compute the greatest common divisor correctly (there may be more than one):

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k <> 0
  )
  SELECT g.n
  FROM   gcd AS g
  WHERE  g.k = 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION 
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k > 0
  )
  SELECT   g.n
  FROM     gcd AS g
  ORDER BY g.k
  LIMIT 1;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k <> 0
  )
  SELECT g.k
  FROM   gcd AS g
  WHERE  g.k = 0;
$$ LANGUAGE SQL;

CREATE FUNCTION gcd(n int, k int)
RETURNS int AS $$
  WITH RECURSIVE gcd(n,k) AS (
    SELECT n, k
      UNION ALL
    SELECT g.k, MOD(g.n,g.k)
    FROM   gcd AS g
    WHERE  g.k = 0
  )
  SELECT   g.n
  FROM     gcd AS g
  ORDER BY g.k DESC
  LIMIT 1;
$$ LANGUAGE SQL;


Solving this task took about minutes.
Or, tick this () if you want to skip it.

Part 3: Comprehension I

Consider the following table s:

CREATE TABLE s (
  a serial PRIMARY KEY,
  b float  NOT NULL
);

We define the following function f(i,j,k) which evaluates over table s:

CREATE FUNCTION f(i int, j int, k float) 
RETURNS float AS $$
  SELECT CASE 
    WHEN i > j THEN k 
    ELSE (SELECT f(i+1, j, s.b + 0.5 * k)
          FROM   s
          WHERE  s.a = i)
  END;
$$ LANGUAGE SQL;

Keep track on how long it takes you to describe (in 100 words or less) what the function f(i int, j int, 0) does:


Solving this task took about minutes.
Or, tick this () if you want to skip it.

Part 4: Comprehension II

Consider the following table t:

CREATE TABLE t (
  x int PRIMARY KEY, 
  y int REFERENCES t(x),
  z int NOT NULL 
);

Consider the following function g(a int) which evaluates over table t:

CREATE FUNCTION g(a int) 
RETURNS bigint AS $$
  WITH RECURSIVE 
  r(x, y, z) AS (
    SELECT t.x, t.y, t.z
    FROM   t
    WHERE  t.x = a
      UNION 
    SELECT r.x, t.y, t.z
    FROM   r, t
    WHERE  r.y = t.x
  )
  SELECT   SUM(r.z) 
  FROM     r;
$$ LANGUAGE SQL;

Keep track on how long it takes you to describe (in 100 words or less) what the function g(a int) does:


Solving this task took about minutes.
Or, tick this () if you want to skip it.

Part 5: Evaluation

Important: This part builds on functions found in part 3 and part 4 of this survey. Make sure to answer these parts first, before you continue here.

1. Consider the following instance for table s from part 3:

INSERT INTO s(a,b) VALUES (1,4),(2,2.5),(3,1.5);

Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 3 when called as f(1,3,0).
Write down the result:


Solving this task took about minutes.
Or, tick this () if you want to skip it.

2. Consider the following instance for table t from part 4:

INSERT INTO t(x,y,z) VALUES (1,2,5),(2,4,3),(3,2,2),(4,3,1);

Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 4 when called as g(4).
Write down the result:


Solving this task took about minutes.
Or, tick this () if you want to skip it.

Part 6: 0-1 Knapsack Problem

Martello et. al. define the 0-1 Knapsack Problem in Knapsack Problems - Algorithms and Computer Implementations as follows:

Consider items i{1,,n} where each item has a weight wi and a value pi. Then knap(n,w) maximizes the sum of values of items that fit into a knapsack of weight w and is defined as:

(3)knap(1,u)=0knap(k,u)={knap(k1,u), if wk>umax(knap(k1,u), knap(k1,uwk)+pk), otherwise

Keep track of how long it takes you to complete the SQL function knap(n,w) which solves the 0-1 Knapsack problem either using a recursive or CTE-based UDF. Submit your solution in the following textbox in which we also provide to you the definition of table items which holds all items i and their respective weight w and value p to be used in this task.

Solving this task took about minutes.
Or, tick this () if you want to skip it.




All done?

Thank you for participating!

formatted by Markdeep 1.10