## How Many Ways to Solve this Problem? Generate a Calendar that Displays Average Hours Per Day

18 07 2011

July 18, 2011

I am surprised at the significant number of unique solutions to the SQL problems that I have previously posed on this blog.  For fun I thought that I would give people another chance to demonstrate their unique approaches to solving another problem.

Supposed that a transaction table exists with the following definition:

```CREATE TABLE T1 (
TRANS_ID NUMBER,
EMPLOYEE_ID VARCHAR2(10),
PART_ID VARCHAR2(20),
CLOCK_IN DATE,
CLOCK_OUT DATE,
QTY_COMPLETE NUMBER,
PRIMARY KEY (TRANS_ID));  ```

The above table simulates a transaction table that might capture production run times for people (or even robots) to produce sets of parts.  The requirement is rather simple – we want to know the average number of hours required to produce a particular part by day in a calendar type layout.  The only restriction is that the output must appear in a SQL*Plus window (or SQL*Plus may be used to spool the result to a text file).  For example, the output might look like this (feel free to be creative):

```PART_ID    W     SUNDAY     MONDAY    TUESDAY  WEDNESDAY   THURSDAY     FRIDAY   SATURDAY
-------- --- ---------- ---------- ---------- ---------- ---------- ---------- ----------
PART1      1        .98        .56        .67       1.21        .23        .98        .02
PART1      2        .57        .87        .76        .01        .45        .55        .77
PART1      3        .01        .77        .66        .54        .02        .44        .92
PART1      4        .55        .02        .88        .11        .05        .33        .89
PART1      5        .11        .43        .55        .14        .03        .90        .78
PART1      6        .87        .45        .63        .02        .82        .78        .55
PART1      7        .02                                         .01 ```

If there was no transaction for a particular part on a particular date, a blank (NULL) should appear; to prevent division by zero errors, the hours should be output divided by one if the total QTY_COMPLETE is zero.

Let’s create some reproducible sample data in table T1:

```DROP TABLE TABLE T1_EMP PURGE;

CREATE TABLE T1_EMP AS
SELECT
'EMP'||TO_CHAR(ROWNUM) EMPLOYEE_ID
FROM
DUAL
CONNECT BY
LEVEL<=1000;

DROP TABLE T1_PART PURGE;

CREATE TABLE T1_PART AS
SELECT
'PART'||TO_CHAR(ROWNUM) PART_ID
FROM
DUAL
CONNECT BY
LEVEL<=200;

INSERT INTO
T1
ROWNUM TRANS_ID,
E.EMPLOYEE_ID,
P.PART_ID,
TO_DATE('01-JAN-2000','DD-MON-YYYY')+ROWNUM/500 CLOCK_IN,
TO_DATE('01-JAN-2000','DD-MON-YYYY')+ROWNUM/500 + ((MOD(ROWNUM,20) + 1) * 0.5) / 24 CLOCK_OUT,
ROUND(ABS(SIN(ROWNUM/180*3.141592)*10)) QTY_COMPLETE
FROM
(SELECT
EMPLOYEE_ID,
ROW_NUMBER() OVER (ORDER BY EMPLOYEE_ID) RN
FROM
T1_EMP) E,
(SELECT
PART_ID,
ROW_NUMBER() OVER (ORDER BY PART_ID) RN
FROM
T1_PART) P,
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=10) D
WHERE
MOD(E.RN,8)=MOD(P.RN,8);

COMMIT; ```

Taking a quick look at the sample data in table T1:

```COLUMN PART_ID FORMAT A8
COLUMN HOURS FORMAT 90.00
SET PAGESIZE 1000
SET LINESIZE 140
SET TRIMSPOOL ON

SELECT
PART_ID,
EMPLOYEE_ID,
TRUNC(CLOCK_IN) SHIFT_DATE,
CLOCK_OUT-CLOCK_IN HOURS,
QTY_COMPLETE
FROM
T1
WHERE
PART_ID='PART1'
ORDER BY
CLOCK_IN;

PART_ID  EMPLOYEE_I SHIFT_DAT  HOURS QTY_COMPLETE
-------- ---------- --------- ------ ------------
PART1    EMP811     31-JAN-01   0.19            9
PART1    EMP811     31-JAN-01   0.21            9
PART1    EMP811     31-JAN-01   0.23            9
PART1    EMP819     04-FEB-01   0.04           10
PART1    EMP819     04-FEB-01   0.06           10
PART1    EMP819     04-FEB-01   0.08           10
PART1    EMP819     04-FEB-01   0.10           10
PART1    EMP819     04-FEB-01   0.13           10
PART1    EMP819     04-FEB-01   0.15           10
...
PART1    EMP912     28-MAR-01   0.19            2
PART1    EMP912     28-MAR-01   0.21            2
PART1    EMP912     28-MAR-01   0.23            2
PART1    EMP92      01-APR-01   0.04            0
PART1    EMP92      01-APR-01   0.06            0
PART1    EMP92      01-APR-01   0.08            1
PART1    EMP92      01-APR-01   0.10            1
...
PART1    EMP992     11-MAY-01   0.19            5
PART1    EMP992     11-MAY-01   0.21            5
PART1    EMP992     11-MAY-01   0.23            5

1250 rows selected. ```

Now just a quick example of how you might calculate the average hours per piece (note that the HOURS column actually shows the number of days, NOT hours – multiply the numbers in that column by 24 to convert to hours):

```SELECT
PART_ID,
TRUNC(CLOCK_IN) SHIFT_DATE,
SUM(CLOCK_OUT-CLOCK_IN) HOURS,
SUM(QTY_COMPLETE) QTY_COMPLETE
FROM
T1
WHERE
PART_ID='PART1'
GROUP BY
PART_ID,
TRUNC(CLOCK_IN)
ORDER BY
TRUNC(CLOCK_IN);

PART_ID  SHIFT_DAT  HOURS QTY_COMPLETE
-------- --------- ------ ------------
PART1    01-JAN-00   1.35           90
PART1    05-JAN-00   1.35          100
PART1    09-JAN-00   1.35           98
PART1    13-JAN-00   1.35           81
PART1    17-JAN-00   1.35           56
PART1    21-JAN-00   1.35           25
PART1    25-JAN-00   1.35           10
PART1    29-JAN-00   1.35           44
...
PART1    29-APR-01   1.35           56
PART1    03-MAY-01   1.35           25
PART1    07-MAY-01   1.35           10
PART1    11-MAY-01   1.35           44

125 rows selected. ```

Interesting how the HOURS column always sums to 1.35 per day…

OK, for you self-starters, create the calendar output.  Think about any other kinds of analysis that might be done with this data – is it possible to determine which PART_ID takes the least (or the most) average time per piece on any given day.

For the non-self starters, scroll down.

One of the challenges that we face is determining the calendar row and column for any given shift date (the date associated with the CLOCK_IN date/time stamp).  Since in this case we know that the first row of data has a date of 01-JAN-2000, we need to determine the Sunday of that week.  One way to do that is to use the NEXT_DAY function, like this:

```SELECT
NEXT_DAY(TO_DATE('01-JAN-2000','DD-MON-YYYY')-7,'SUNDAY') S
FROM
DUAL;

S
---------
26-DEC-99 ```

The next step is to divide the shift dates into rows and columns based on the zero date of 26-DEC-99.  TRUNC and MOD could be used for this calculation, but to keep things interesting I will use TO_CHAR rather than MOD:

```COLUMN R FORMAT 99
COLUMN C FORMAT 9

SELECT
PART_ID,
NEXT_DAY(TO_DATE('01-JAN-2000','DD-MON-YYYY')-7,'SUNDAY') S,
TRUNC(CLOCK_IN) SHIFT_DATE,
TRUNC((TRUNC(CLOCK_IN)-NEXT_DAY(TO_DATE('01-JAN-2000','DD-MON-YYYY')-7,'SUNDAY'))/7)+1 R,
TO_NUMBER(TO_CHAR(TRUNC(CLOCK_IN),'D')) C,
SUM(CLOCK_OUT-CLOCK_IN) HOURS,
SUM(QTY_COMPLETE) QTY_COMPLETE
FROM
T1
WHERE
PART_ID='PART1'
GROUP BY
PART_ID,
TRUNC(CLOCK_IN)
ORDER BY
TRUNC(CLOCK_IN);

PART_ID  S         SHIFT_DAT   R  C  HOURS QTY_COMPLETE
-------- --------- --------- --- -- ------ ------------
PART1    26-DEC-99 01-JAN-00   1  7   1.35           90
PART1    26-DEC-99 05-JAN-00   2  4   1.35          100
PART1    26-DEC-99 09-JAN-00   3  1   1.35           98
PART1    26-DEC-99 13-JAN-00   3  5   1.35           81
PART1    26-DEC-99 17-JAN-00   4  2   1.35           56
PART1    26-DEC-99 21-JAN-00   4  6   1.35           25
PART1    26-DEC-99 25-JAN-00   5  3   1.35           10
...
PART1    26-DEC-99 16-NOV-00  47  5   1.35           72
PART1    26-DEC-99 20-NOV-00  48  2   1.35           90
PART1    26-DEC-99 24-NOV-00  48  6   1.35          100
PART1    26-DEC-99 28-NOV-00  49  3   1.35           98
PART1    26-DEC-99 02-DEC-00  49  7   1.35           81
...
PART1    26-DEC-99 25-APR-01  70  4   1.35           81
PART1    26-DEC-99 29-APR-01  71  1   1.35           56
PART1    26-DEC-99 03-MAY-01  71  5   1.35           25
PART1    26-DEC-99 07-MAY-01  72  2   1.35           10
PART1    26-DEC-99 11-MAY-01  72  6   1.35           44

125 rows selected. ```

The next step is to collapse all of the rows with the same R value into a single row – we will do this with a combination of the MAX and DECODE functions:

```SELECT
PART_ID,
R,
ROUND(MAX(DECODE(C,1,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) SUNDAY,
ROUND(MAX(DECODE(C,2,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) MONDAY,
ROUND(MAX(DECODE(C,3,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) TUESDAY,
ROUND(MAX(DECODE(C,4,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) WEDNESDAY,
ROUND(MAX(DECODE(C,5,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) THURSDAY,
ROUND(MAX(DECODE(C,6,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) FRIDAY,
ROUND(MAX(DECODE(C,7,DECODE(QTY_COMPLETE,0,HOURS,HOURS/QTY_COMPLETE),NULL)),2) SATURDAY
FROM
(SELECT
PART_ID,
TRUNC(CLOCK_IN) SHIFT_DATE,
TRUNC((TRUNC(CLOCK_IN)-NEXT_DAY(TO_DATE('01-JAN-2000','DD-MON-YYYY')-7,'SUNDAY'))/7)+1 R,
TO_NUMBER(TO_CHAR(TRUNC(CLOCK_IN),'D')) C,
SUM(CLOCK_OUT-CLOCK_IN) HOURS,
SUM(QTY_COMPLETE) QTY_COMPLETE
FROM
T1
WHERE
PART_ID='PART1'
GROUP BY
PART_ID,
TRUNC(CLOCK_IN))
GROUP BY
PART_ID,
R
ORDER BY
PART_ID,
R;

PART_ID    R     SUNDAY     MONDAY    TUESDAY  WEDNESDAY   THURSDAY     FRIDAY   SATURDAY
-------- --- ---------- ---------- ---------- ---------- ---------- ---------- ----------
PART1      1                                                                          .02
PART1      2                                         .01
PART1      3        .01                                         .02
PART1      4                   .02                                         .05
PART1      5                              .14                                         .03
PART1      6                                         .02
PART1      7        .02                                         .01
PART1      8                   .01                                         .02
...
PART1     69                              .01                                         .01
PART1     70                                         .02
PART1     71        .02                                         .05
PART1     72                   .14                                         .03

72 rows selected. ```

Of course there is a problem with the above – the scale of the time is actually in days, rather than hours, but that is easy enough to fix.

Anyone care to share their approach to solving the original problem?

Assume that the output will be used by a production supervisor – what other kinds of analysis can be performed?  For example, could you produce an analysis like the one below that compares recent transactions with those from prior periods and color codes those prior periods based on how the current period (the Avg Hrs Pc column in the below screen capture) compares with the prior periods:

Be creative – performance, while important, is not the primary objective.