Thank you for participating in this anonymous user study!
Like any programming language, SQL allows you to express complex computation in various (equivalent) forms. This survey studies two particular forms of authoring user-defined SQL functions (UDFs). Both forms are used to implement the same recursive programming problems—but the forms read differently:
This survey compares these two forms in terms of readability and thus focuses on your ability to read and understand snippets of SQL code. We therefore ask you to work through the study without the use of external programs: answer the questions below without executing the UDFs on a SQL database system. Use pen and paper only (if required at all). Thank you!
The SQL dialect presented to you in this survey leans heavily on PostgreSQL 12. If any feature found in the following queries is new to you, feel free to look them up here.
Important: We will also ask you for the time it took you to complete each task. As you complete each task, keep track of the required time and enter it in the boxes below. Please answer truthfully. Each task should take about 5 − 10 minutes, if you find yourself exceeding this timespan, please feel free to skip to the next task (in this case, please tick the corresponding box).
The entire survey should take about 30 − 45 minutes to complete. Once done, please make sure to press the Submit your results!-button at the end.
Graham et. al. define the Fibonacci number function in Concrete Mathematics, 2nd edition as:
We propose the following UDFs which aim to compute the Fibonacci number of 𝑛n using SQL. You can assume that each of the UDFs are syntactically sound but only some of them implement the Fibonacci number function correctly.
CREATE FUNCTION fib(n int) RETURNS int AS $$ WITH RECURSIVE fib(i, prev_n, curr_n) AS ( VALUES (1, 0, 1) UNION SELECT f.i + 1, f.curr_n, f.prev_n - f.curr_n FROM fib AS f WHERE f.i <= n ) SELECT f.curr_n FROM fib AS f WHERE f.i = n; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ WITH RECURSIVE fib(i, prev_n, curr_n) AS ( SELECT 1, 0, 1 UNION SELECT f.i + 1, f.curr_n, f.prev_n + f.curr_n FROM fib AS f WHERE f.i BETWEEN 0 AND n ) SELECT f.curr_n FROM fib AS f WHERE f.i < n; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ WITH RECURSIVE fib(i, prev_n, curr_n) AS ( VALUES (1, 0, 1) UNION ALL SELECT f.i + 1, f.curr_n, f.prev_n + f.curr_n FROM fib AS f WHERE f.i <= n ) SELECT f.prev_n FROM fib AS f ORDER BY f.i DESC LIMIT 1; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ WITH RECURSIVE fib(i, prev_n, curr_n) AS ( VALUES (1, 0, 1) UNION ALL SELECT f.i + 1, f.curr_n, f.prev_n + f.curr_n FROM fib AS f WHERE f.i < n ) SELECT f.curr_n FROM fib AS f ORDER BY f.i LIMIT 1; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ SELECT 0 WHERE n <= 1 UNION ALL SELECT fib(n-1) + fib(n-2) WHERE n > 1 $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ SELECT CASE WHEN n = 0 THEN 0 WHEN n = 1 THEN 1 ELSE fib(n-1) + fib(n-2) END; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ SELECT CASE WHEN n = 0 THEN 0 WHEN n = 1 THEN 1 ELSE fib(n-1) - fib(n-2) END; $$ LANGUAGE SQL;
CREATE FUNCTION fib(n int) RETURNS int AS $$ SELECT 1 WHERE n <= 2 UNION ALL SELECT fib(n-1) + fib(n-2) WHERE n > 2 $$ LANGUAGE SQL;
Knuth et. al. define the calculation of the greatest common divisor in The Art of Programming (vol. 2 Seminumerical Algorithms), 3rd edition as:
We propose the following SQL UDFs which aim to compute the greatest common divisor of 𝑢u and 𝑣v. You can assume that each of the UDFs are syntactically sound but only some of them implement the greatest common divisor function correctly.
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ SELECT CASE WHEN k = 0 THEN n ELSE gcd(k, MOD(n,k)) END; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ SELECT n WHERE k = 0 UNION ALL SELECT gcd(k, MOD(k,n)) WHERE k > 0; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ SELECT CASE WHEN n = 0 THEN k ELSE gcd(k, MOD(k,n)) END; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ SELECT k WHERE k = 0 UNION SELECT gcd(k, MOD(n,k)) WHERE k > 0; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ WITH RECURSIVE gcd(n,k) AS ( SELECT n, k UNION ALL SELECT g.k, MOD(g.n,g.k) FROM gcd AS g WHERE g.k <> 0 ) SELECT g.n FROM gcd AS g WHERE g.k = 0; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ WITH RECURSIVE gcd(n,k) AS ( SELECT n, k UNION SELECT g.k, MOD(g.n,g.k) FROM gcd AS g WHERE g.k > 0 ) SELECT g.n FROM gcd AS g ORDER BY g.k LIMIT 1; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ WITH RECURSIVE gcd(n,k) AS ( SELECT n, k UNION ALL SELECT g.k, MOD(g.n,g.k) FROM gcd AS g WHERE g.k <> 0 ) SELECT g.k FROM gcd AS g WHERE g.k = 0; $$ LANGUAGE SQL;
CREATE FUNCTION gcd(n int, k int) RETURNS int AS $$ WITH RECURSIVE gcd(n,k) AS ( SELECT n, k UNION ALL SELECT g.k, MOD(g.n,g.k) FROM gcd AS g WHERE g.k = 0 ) SELECT g.n FROM gcd AS g ORDER BY g.k DESC LIMIT 1; $$ LANGUAGE SQL;
Consider the following table s:
CREATE TABLE s ( a serial PRIMARY KEY, b float NOT NULL );
We define the following function f(i,j,k) which evaluates over table s:
CREATE FUNCTION f(i int, j int, k float) RETURNS float AS $$ SELECT CASE WHEN i > j THEN k ELSE (SELECT f(i+1, j, s.b + 0.5 * k) FROM s WHERE s.a = i) END; $$ LANGUAGE SQL;
Keep track on how long it takes you to describe (in 100 words or less) what the function f(i int, j int, 0) does:
xxxxxxxxxx
What does f(i int, j int, 0) do?
WRITE YOUR YOUR ANSWER HERE!
Consider the following table t:
CREATE TABLE t ( x int PRIMARY KEY, y int REFERENCES t(x), z int NOT NULL );
Consider the following function g(a int) which evaluates over table t:
CREATE FUNCTION g(a int) RETURNS bigint AS $$ WITH RECURSIVE r(x, y, z) AS ( SELECT t.x, t.y, t.z FROM t WHERE t.x = a UNION SELECT r.x, t.y, t.z FROM r, t WHERE r.y = t.x ) SELECT SUM(r.z) FROM r; $$ LANGUAGE SQL;
Keep track on how long it takes you to describe (in 100 words or less) what the function g(a int) does:
What does g(a int) do?
Important: This part builds on functions found in part 3 and part 4 of this survey. Make sure to answer these parts first, before you continue here.
INSERT INTO s(a,b) VALUES (1,4),(2,2.5),(3,1.5);
Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 3 when called as f(1,3,0). Write down the result:
The result of f(1,3,0)?
INSERT INTO t(x,y,z) VALUES (1,2,5),(2,4,3),(3,2,2),(4,3,1);
Keep track on how long it takes you to compute (using pen and paper) the result of the function from part 4 when called as g(4). Write down the result:
The result of g(4)?
Martello et. al. define the 0-1 Knapsack Problem in Knapsack Problems - Algorithms and Computer Implementations as follows:
Consider items 𝑖∈{1,…,𝑛}i∈{1,…,n} where each item has a weight 𝑤𝑖wi and a value 𝑝𝑖pi. Then knap(𝑛,𝑤)knap(n,w) maximizes the sum of values of items that fit into a knapsack of weight 𝑤w and is defined as:
Keep track of how long it takes you to complete the SQL function knap(n,w) which solves the 0-1 Knapsack problem either using a recursive or CTE-based UDF. Submit your solution in the following textbox in which we also provide to you the definition of table items which holds all items i and their respective weight w and value p to be used in this task.
CREATE TABLE items (
i serial PRIMARY KEY,
w int NOT NULL,
p int NOT NULL
);
CREATE FUNCTION knap(k int, u int)
RETURNS int AS $$
/* YOUR CODE HERE */
$$ LANGUAGE SQL;