## SQL Challenges

14 06 2012

June 14, 2012

Dominic Delmolino put together a very interesting challenge.  The challenge is to produce something called a Pascal matrix using Oracle Database… more specifically, just SQL.  I had a vague recollection of Pascal matrixes when I read Dominic’s challenge.  Basically, the goal is to create a matrix similar to the following:

The rule for generating the matrix is simply that a cell’s value is the sum of the value in the cell that is immediately to the left plus the value in the cell that is immediately above.  Sounds easy, right?

If we were just working in Microsoft Excel (or some other spreadsheet package), we could do something like this to quickly create the matrix:

Dominic’s challenge probably would not be much of a challenge if we could just type in formulas like the above into a SQL statement.  Give his challenge a try to see if you are able to derive a unique solution to the problem.  I probably spent a couple of minutes (maybe 60 seconds with the help of copy and paste) creating the above example using Microsoft Excel, but spent a couple of hours trying to produce a solution that worked using SQL.

——

Part 2 of the challenge.

Take a look at the bullet point items in this blog article about NULL values in table’s columns, in particular the first three large bullet point items.  Do you agree or disagree with the statements, and why?

### 14 responses

15 06 2012

Easy way is this

select
col_1,
col_2,
col_3,
col_4,
col_5,
col_6,
col_7,
col_8
from
dual connect by level 1] = col_1[cv()] + col_2[cv(lev) – 1] ,
col_3[lev > 1] = col_2[cv()] + col_3[cv(lev) – 1] ,
col_4[lev > 1] = col_3[cv()] + col_4[cv(lev) – 1] ,
col_5[lev > 1] = col_4[cv()] + col_5[cv(lev) – 1] ,
col_6[lev > 1] = col_5[cv()] + col_6[cv(lev) – 1] ,
col_7[lev > 1] = col_6[cv()] + col_7[cv(lev) – 1] ,
col_8[lev > 1] = col_7[cv()] + col_8[cv(lev) – 1]
);

15 06 2012

Raj,

Thanks for posting, but it appears that a portion of your SQL statement disappeared. Less than and greater than signs in comments can cause problems, because those characters are used to denote formatting keywords on web pages. I am interested in seeing the full SQL statement, please repost. See the note at the very bottom of the blue section at the right regarding what substitutions needs to be performed to include less than and greater than signs in comments. Thanks.

P.S. Please also post your solution at http://www.oraclemusings.com/?p=221 – you will need to make the same replacements for less than and greater than signs on that site.

16 06 2012

Hi Charles,

Apologies. I didn’t read the guidelines before posting the comments. My bad. I think I got it right this time.

```SQL> l
1  select
2  col_1,
3  col_2,
4  col_3,
5  col_4,
6  col_5,
7  col_6,
8  col_7,
9  col_8
10  from
11  dual connect by level <= 8
12  model
13  dimension by (level as lev)
14  measures (
15  1 as col_1, 1 as col_2, 1 as col_3, 1 as col_4, 1 as col_5, 1 as col_6, 1 as col_7, 1 as col_8)
16  rules
17  (
18  col_1[any] = 1,
19  col_2[lev > 1] = col_1[cv()] + col_2[cv(lev) - 1] ,
20  col_3[lev > 1] = col_2[cv()] + col_3[cv(lev) - 1] ,
21  col_4[lev > 1] = col_3[cv()] + col_4[cv(lev) - 1] ,
22  col_5[lev > 1] = col_4[cv()] + col_5[cv(lev) - 1] ,
23  col_6[lev > 1] = col_5[cv()] + col_6[cv(lev) - 1] ,
24  col_7[lev > 1] = col_6[cv()] + col_7[cv(lev) - 1] ,
25  col_8[lev > 1] = col_7[cv()] + col_8[cv(lev) - 1]
26* )
SQL> /

COL_1      COL_2      COL_3      COL_4      COL_5      COL_6      COL_7      COL_8
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1          1          1          1          1          1          1          1
1          2          3          4          5          6          7          8
1          3          6         10         15         21         28         36
1          4         10         20         35         56         84        120
1          5         15         35         70        126        210        330
1          6         21         56        126        252        462        792
1          7         28         84        210        462        924       1716
1          8         36        120        330        792       1716       3432

8 rows selected.
```

Thanks

Raj

16 06 2012

Raj,

That is impressive.

I have not worked with the model clause, but had hoped that someone would post an example using that clause. It appears that the model clause is quite powerful, allowing the specification of formulas that are similar to the formulas that I used in Microsoft Excel at the start of this article.

Thanks again for posting the example.

15 06 2012

There are clearly some very creative solutions to this problem, as seen on the other blog.

Formatting of code sections is a bit of a problem, so I thought that I would share my two solutions here.

The first solution that I created:
I have a degree in mathematics, but that degree is very, very rusty. Based on the provided link, we are trying to produce an array like the following, where the value in a cell in the array is the sum of the value immediately above and immediately to the left:

```1  1  1  1  1
1  2  3  4  5
1  3  6 10 15
1  4 10 20 35
1  5 15 35 70
```

The first rows have the following values:

```1^0 + 2^0
1^0 + 2^0 + 3^0
1^0 + 2^0 + 3^0 + 4^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0
```

The second row has the following values:

```1^0 + 2^0 + 1^0
1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 3^0 + 2 * 2^0 + 3 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 4^0 + 2 * 3^0 + 3* 2^0 + 4 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 5^0 + 2 * 4^0 + 3 * 3^0 + 4 * 2^0 + 5 * 1^0
```

The third row:

```1^0 + 2^0 + 1^0 + 1^0
1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 3^0 + 3 * 2^0 + 6 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 4^0 + 3 * 3^0 + 6 * 2^0 + 10 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0
```

Now we can just… cheat and look at the formula from the article that you provided:

```(i + j - 2)! / ((i - 1)! * (j - 1)!)
```

! is short for factorial

```4! = 4 * 3 * 2 * 1
3! = 3 * 2 * 1
2! = 2 * 1
1! = 1
```

Calculating the factorial is a bit of a challenge. The SUM analytic function can be used to generate running sums of a column, but there is no equivalent for running products of a column.
If we calculate the natural log of a column value, generate a running sum of those natural log values, and then generate the anti-log, we can essentially generate a running product (factorial) for a column:

```WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
N,
EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL
FROM
(SELECT
ROWNUM N,
LN(ROWNUM) RNL
FROM
DUAL
CONNECT BY
LEVEL<=20))
SELECT
N,
FACTORIAL F
FROM
FACTORIAL;

N          F
-- ----------
1          1
2          2
3          6
4         24
5        120
6        720
7       5040
8      40320
9     362880
10    3628800
11   39916800
12  479001600
13 6227020800
14 8.7178E+10
15 1.3077E+12
16 2.0923E+13
17 3.5569E+14
18 6.4024E+15
19 1.2165E+17
20 2.4329E+18
```

One problem is that with the formula (i + j – 2)!, we could very well need to calculate 0!, which is defined as 1. To accomplish this, we need to add the following inside the WITH block:

```   UNION ALL
SELECT
0 N,
1 FACTORIAL
FROM
DUAL
```

Now it is just a matter of defining how large of an array is needed, performing a Cartesian join to achieve the correct number of rows (I * J), and placing the factorial values where necessary:

```WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
N,
EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL
FROM
(SELECT
ROWNUM N,
LN(ROWNUM) RNL
FROM
DUAL
CONNECT BY
LEVEL<=20)
UNION ALL
SELECT
0 N,
1 FACTORIAL
FROM
DUAL),
I AS
(SELECT /*+ MATERIALIZE */
ROWNUM I
FROM
DUAL
CONNECT BY
LEVEL<=4),
J AS
(SELECT /*+ MATERIALIZE */
ROWNUM J
FROM
DUAL
CONNECT BY
LEVEL<=4)
SELECT
I.I,
J.J,
F1.FACTORIAL / (F2.FACTORIAL * F3.FACTORIAL) V
FROM
FACTORIAL F1,
FACTORIAL F2,
FACTORIAL F3,
I,
J
WHERE
F1.N=(I.I + J.J - 2)
AND F2.N=(I.I - 1)
AND F3.N=(J.J - 1)
ORDER BY
J.J,
I.I;
```
15 06 2012

The second solution that I created:
Here is another possible solution that uses SYS_CONNECT_BY_PATH to create formulas and XMLQUERY to calculate the value of those formulas.

First, the formulas are built:

```SELECT
ROWNUM N,
FF
FROM
(SELECT
'1'||SYS_CONNECT_BY_PATH(ROWNUM,' * ') FF
FROM
DUAL
CONNECT BY
LEVEL<=8
ORDER BY 1);

N FF
- ---------------------------------
1 1 * 1
2 1 * 1 * 2
3 1 * 1 * 2 * 3
4 1 * 1 * 2 * 3 * 4
5 1 * 1 * 2 * 3 * 4 * 5
6 1 * 1 * 2 * 3 * 4 * 5 * 6
7 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7
8 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8
```

As shown above, I am only planning to calculate through 8!, but that will later be changed to 20!.

Next, the value of the formulas are calculated:

```SELECT
ROWNUM N,
XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL
FROM
(SELECT
'1'||SYS_CONNECT_BY_PATH(ROWNUM,' * ') FF
FROM
DUAL
CONNECT BY
LEVEL<=8
ORDER BY 1);

N  FACTORIAL
- ----------
1          1
2          2
3          6
4         24
5        120
6        720
7       5040
8      40320
```

The above SQL statement does not calculate 0!, so that problem will be handled later by a DECODE statement.

The following will be used to derive the i and j values for an 8 x 8 matrix:

```SELECT
MOD(ROWNUM-1,8)+1 I,
TRUNC((ROWNUM-1)/8)+1 J
FROM
DUAL
CONNECT BY
LEVEL <= 8 * 8;
```

The final query uses two WITH block subqueries, and scalar subqueries to pull the correct factorial value for the calculation:

```WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
ROWNUM N,
XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL
FROM
(SELECT
'1'||SYS_CONNECT_BY_PATH(ROWNUM,' * ') FF
FROM
DUAL
CONNECT BY
LEVEL<=20
ORDER BY 1)),
NUMBERS AS
(SELECT
MOD(ROWNUM-1,8)+1 I,
TRUNC((ROWNUM-1)/8)+1 J
FROM
DUAL
CONNECT BY
LEVEL <= 8 * 8)
SELECT
N.I,
N.J,
DECODE((N.I + N.J – 2),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.I + N.J – 2))) /
DECODE((N.I - 1),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.I - 1))) /
DECODE((N.J - 1),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.J - 1))) V
FROM
NUMBERS N;

I  J     V
-- -- -----
1  1     1
2  1     1
3  1     1
4  1     1
5  1     1
6  1     1
7  1     1
8  1     1
1  2     1
2  2     2
3  2     3
4  2     4
5  2     5
6  2     6
7  2     7
8  2     8
1  3     1
2  3     3
3  3     6
4  3    10
5  3    15
6  3    21
7  3    28
8  3    36
1  4     1
2  4     4
3  4    10
4  4    20
5  4    35
6  4    56
7  4    84
8  4   120
1  5     1
2  5     5
3  5    15
4  5    35
5  5    70
6  5   126
7  5   210
8  5   330
1  6     1
2  6     6
3  6    21
4  6    56
5  6   126
6  6   252
7  6   462
8  6   792
1  7     1
2  7     7
3  7    28
4  7    84
5  7   210
6  7   462
7  7   924
8  7  1716
1  8     1
2  8     8
3  8    36
4  8   120
5  8   330
6  8   792
7  8  1716
8  8  3432
```
16 06 2012

[…] with the solutions to my little problem of generating a symmetric Pascal matrix using SQL. Charles Hooper in particular has provided some very nice commentary on the problem, complete with diagrams and 2 alternative […]

17 06 2012

select * from (
select *
from
(select level x from dual connect by level<=8),
(select level y from dual connect by level1, y>1] = s[cv(x)-1, cv()] + s[cv(), cv(y)-1]
)
)
pivot (max(s) for x in (1,2,3,4,5,6,7,8))
order by 1

17 06 2012

Mrpy,

An interesting looking SQL statement. Unfortunately, it appears that part of your SQL statement disappeared:

```ERROR at line 5:
ORA-00920: invalid relational operator
```

Unfortunately, less than and greater than signs are HTML formatting keywords. As such, it is necessary to replace less than (<) and greater than (>) signs in comments with the appropriate HTML equivalent – see the bottom of the blue section at the right.

I am looking forward to seeing this full SQL statement.

20 06 2012

[…] with the solutions to my little problem of generating a symmetric Pascal matrix using SQL. Charles Hooper in particular has provided some very nice commentary on the problem, complete with diagrams and 2 alternative […]

17 08 2012

Hi,

Here is another solutionusing the model clause:

```with m as
(select
*
from
dual
model
ignore nav
dimension by (1 x, 1 y)
measures (1 m)
rules
(
m[for x from 1 to 8 increment 1,1] = 1
,m[1, for y from 1 to 8 increment 1] = 1
,m[for x from 2 to 8 increment 1, for y from 2 to 8 increment 1] =  m[cv(x)-1,cv(y)] + m[cv(x), cv(y) -1]
)
)
select
*
from
m
pivot (sum(m) for x in ('1', '2', '3','4' ,'5' ,'6' ,'7' ,'8'))
order by
y;
```

The result is below:

```Y '1' '2' '3' '4' '5' '6' '7' '8'
- --- --- --- --- --- --- --- ---
1   1   1   1   1   1   1   1   1
2   1   2   3   4   5   6   7   8
3   1   3   6  10  15  21  28  36
4   1   4  10  20  35  56  84 120
5   1   5  15  35  70 126 210 330
6   1   6  21  56 126 252 462 792
7   1   7  28  84 210 462 924 1716
8   1   8  36 120 330 792 1716 3432

8 rows selected

```

Regards,

Peter

21 08 2012

Peter,

Sorry for the delay in responding. Very nice solution with the MODEL clause. I suspected that there was an elegant solution to the problem using the MODEL clause, but I have not worked with it often enough to put the solution together from the beginning to the end.

10 02 2013

Neat post! 🙂 I was able to solve it in a few minutes with this query (using postgres):

select x,y,factorial(x+y-2)/(factorial(x-1)*factorial(y-1)) as value from (select x,generate_series(1,8) as y from (select generate_series(1,8) as x) as s) as a group by x,y order by value;

10 02 2013

Morgan,

Thank you for offering a solution to this SQL challenge – there are a couple of other SQL challenges on this blog.

Postgres certainly seems to offer features/funtions that make this problem seem easy to solve, based on the solution that you offered.

If I recall correctly, I thought that one of Postgres greatest asset was its ability to nearly behave as a drop-in replacement for Oracle Database. Postgres ability to select rows without specifying a datasource is interesting, as is its function to calculate factorials.