March 22, 2012
Suppose that you receive a request stating that a particular table in one of your databases must be monitored for changes. One such table might be the one that lists the ERP system’s suppliers and their addresses – you would not want someone to be able to set up a legitimate supplier, and later have the billing remit-to name and address changed without the change being detected.
What approach would you take to solve the above problem? A question similar to the above arrived recently in an ERP mailing list email – the original poster (OP) is using a SQL Server RDBMS, so that might change the proposed solution just a bit.
Here is the “How simple is too simple?” suggestion that I offered:
Periodically, create a VENDOR_SAVE table (drop it if it already exists, or delete all rows and re-insert from the original source table):
CREATE TABLE VENDOR_SAVE AS
SELECT *
FROM   VENDOR;

Now, just wait a while.
The question then becomes, how do we detect:
- A new row (record) added to the original source (VENDOR) table.
- An old row (record) deleted from the original source (VENDOR) table.
- A change to any column (program field) in the original source (VENDOR) table since the last time the VENDOR_SAVE table was created/refreshed.
Let’s start with the first two bullet points. I will write the SQL statements so that they should work with Oracle Database 9.0.1 and above, and with SQL Server with very few changes, even if I feel a bit ANSI about doing so. I will use the COALESCE function rather than the NVL function, and CASE syntax rather than the equivalent DECODE syntax.
To identify cases where a row has appeared in, or disappeared from the original source (VENDOR) table, we can simply perform a full outer join between the original source table and the historical mirror image of the original table (VENDOR_SAVE). We are only interested in cases where the primary key column (ID) is found in exactly one of the two tables:
SELECT
  COALESCE(V.ID, V2.ID) AS ID,
  COALESCE(V.NAME, V2.NAME) AS NAME,
  COALESCE(V.ADDR_1, V2.ADDR_1) AS ADDR_1,
  CASE WHEN V.ID IS NULL THEN 'VENDOR DELETED' ELSE 'VENDOR ADDED' END AS PROBLEM
FROM
  VENDOR V
FULL OUTER JOIN
  VENDOR_SAVE V2
ON (V.ID = V2.ID)
WHERE
  V.ID IS NULL
  OR V2.ID IS NULL;

So, the above SQL statement satisfies the first two bullet points (note that when V.ID is NULL, the row exists only in the VENDOR_SAVE snapshot, meaning that the vendor was deleted; when V2.ID is NULL, the vendor was added). The third bullet point is a little more challenging to accomplish… unless of course we employ UNION labor. If we have two row sources with identical columns, and UNION the row sources together, the resulting row source will contain no entirely duplicated rows from the two original row sources (two identical rows will be reduced to a single row). If there were no changes to any of the column values (or if the row was added to or deleted from the original source table), there will be a single row for each primary key column value. If any columns were changed, there will be two rows containing that primary key column value.
Let’s build a SQL statement that UNIONs the rows from the two tables together, and counts the number of rows for each primary key value:
SELECT
  COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
  VL.ID,
  VL.NAME,
  VL.ADDR_1
FROM
  (SELECT * FROM VENDOR
   UNION
   SELECT * FROM VENDOR_SAVE) VL;

To complete the requirement for bullet point 3 above, we need to eliminate all rows from the result set where there is a single row for the primary key value:
SELECT DISTINCT
  VL2.ID,
  VL2.NAME,
  VL2.ADDR_1,
  'VALUE CHANGE' PROBLEM
FROM
  (SELECT
     COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
     VL.ID,
     VL.NAME,
     VL.ADDR_1
   FROM
     (SELECT * FROM VENDOR
      UNION
      SELECT * FROM VENDOR_SAVE) VL) VL2
WHERE
  VL2.CNT > 1;

As a final step, we should combine the two result sets into a single result set using UNION ALL:
SELECT
  COALESCE(V.ID, V2.ID) AS ID,
  COALESCE(V.NAME, V2.NAME) AS NAME,
  COALESCE(V.ADDR_1, V2.ADDR_1) AS ADDR_1,
  CASE WHEN V.ID IS NULL THEN 'VENDOR DELETED' ELSE 'VENDOR ADDED' END AS PROBLEM
FROM
  VENDOR V
FULL OUTER JOIN
  VENDOR_SAVE V2
ON (V.ID = V2.ID)
WHERE
  V.ID IS NULL
  OR V2.ID IS NULL
UNION ALL
SELECT DISTINCT
  VL2.ID,
  VL2.NAME,
  VL2.ADDR_1,
  'VALUE CHANGE' PROBLEM
FROM
  (SELECT
     COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
     VL.ID,
     VL.NAME,
     VL.ADDR_1
   FROM
     (SELECT * FROM VENDOR
      UNION
      SELECT * FROM VENDOR_SAVE) VL) VL2
WHERE
  VL2.CNT > 1
ORDER BY
  ID;
While somewhat limited in usefulness, the above approach will indicate which rows in the original source table should be examined because the column values in those rows changed (or were added or deleted).
Another, potentially more useful approach involves setting up a logging trigger and logging table. I previously shared a VBS script that helps to automate and standardize the process of creating the logging trigger and logging table. A very similar feature is built into my Hyper-Extended Oracle Performance Monitor program – but, much like the VBS script, this solution is useless for the OP who uses a SQL Server backend.
What are the other options?
- Oracle Database’s built-in auditing capabilities.
- Auditing built into the ERP system (this particular ERP system writes all audit/change records to a single table that uses a VARCHAR2 column to capture the before/after images of the values stored in columns).
- Any other options? (The mess that someone can create with a bit of idle time on their hands…)
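Oracle’s traditional audit facility can cover the monitoring requirement with very little code. A minimal sketch, assuming the AUDIT_TRAIL initialization parameter is set to DB,EXTENDED and that the table lives in a hypothetical APP schema:

```sql
-- Audit all DML against the vendor table (schema name APP is an assumption):
AUDIT INSERT, UPDATE, DELETE ON APP.VENDOR BY ACCESS;

-- Review the captured changes (SQL_TEXT requires AUDIT_TRAIL=DB,EXTENDED):
SELECT USERNAME, ACTION_NAME, TIMESTAMP, SQL_TEXT
FROM   DBA_AUDIT_TRAIL
WHERE  OWNER = 'APP'
  AND  OBJ_NAME = 'VENDOR'
ORDER BY TIMESTAMP;
```

Note that standard auditing records who changed the table and when, but not the before/after column values – a logging trigger approach is still needed if the values themselves must be captured.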
Hi
Have you seen my approach to this topic:
http://damir-vadas.blogspot.com/2012/02/browsing-history-data-changes-for.html
Rg
Damir
Damir,
It appears that the approach mentioned in your blog article is similar to my approach – thanks for leaving the link.
I remember thinking several years ago: “I need to create another silly logging table and trigger that is almost the same as the last 10 that I manually created?” As hard as I tried to standardize my naming convention, it seemed that every logging table that I manually created was just a bit different from the last. That is about the time that I decided to create the Configure Data Change Log feature in my program (https://hoopercharles.files.wordpress.com/2012/03/heopm_configure_data_change_log.jpg ).
Charles,
This program that you mention – is it public or private?
Any demo/download?
Rg
Damir
Damir,
A Windows VBS/Internet Explorer version:
https://hoopercharles.wordpress.com/2010/08/25/oracle-logging-trigger-creator/
An executable version that is part of a much larger project:
https://hoopercharles.wordpress.com/2012/03/15/hyper-extended-oracle-performance-monitor-6-0-beta/
Charles,
Thank you … will take a look
Rg,
Damir
Consider triggers on the table for all the DML actions, to allow you to put in any logic to prevent “unauthorized” (defined your way) inserts/updates/deletes. This should work for any engine: Oracle, Sybase, MSSQL.
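A minimal Oracle sketch of the trigger-based prevention idea; the trigger name, the approved user list, and the error message are assumptions for illustration, not a definitive implementation:

```sql
-- Reject name/address changes made by anyone other than an approved user;
-- RAISE_APPLICATION_ERROR causes the triggering statement to roll back.
CREATE OR REPLACE TRIGGER VENDOR_PROTECT
BEFORE UPDATE OF NAME, ADDR_1 ON VENDOR
FOR EACH ROW
BEGIN
  IF USER NOT IN ('AP_SUPERVISOR') THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Unauthorized change to vendor name/address');
  END IF;
END;
/
```

An equivalent trigger could log the attempt rather than reject it, which covers the monitoring half of the requirement.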
Mich
Oracle 11g Release 2 has “Total Recall” which will capture all transactions for a specified table.
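For reference, enabling Total Recall (the Flashback Data Archive feature) looks roughly like the following sketch; the tablespace name and the retention period are assumptions:

```sql
-- Create an archive and attach the vendor table to it:
CREATE FLASHBACK ARCHIVE VENDOR_FBA
  TABLESPACE FBA_TS
  RETENTION 1 YEAR;

ALTER TABLE VENDOR FLASHBACK ARCHIVE VENDOR_FBA;

-- Historical versions of the rows may then be queried with AS OF:
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '7' DAY);
```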
Jimmy,
Thanks for the mention of that feature – I was not aware that it existed. A search of the documentation suggests that it is an extra-cost option that may be added to the Enterprise Edition:
http://docs.oracle.com/cd/E11882_01/license.112/e10594/options.htm#CJABHEBI
It is also not very clear whether you just want monitoring, or want to prevent unauthorised changes from going through. Certainly triggers can be used for either, with the trigger raising an error to roll back the action if it is not deemed authorised.
Mich,
I think that the OP was just interested in monitoring the changes… but of course the next logical step is probably prevention, so that too is a possible discussion item.
Given that some prerequisites are met (flashback, undo retention, small data volume, probably not too much undo for your check intervals in the whole database), one could regularly subtract (MINUS operator) the existing table data from the table data as of some time ago (and vice versa) with the AS OF syntax to check for changed data.
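A sketch of that MINUS / AS OF approach; the one-hour check interval is an assumption – it must fit within the available undo retention:

```sql
-- Rows that were added or changed during the last hour:
SELECT * FROM VENDOR
MINUS
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR);

-- And vice versa: rows that were deleted or changed during the last hour:
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR)
MINUS
SELECT * FROM VENDOR;
```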
You could also use “Database Change Notification”, which is called “Continuous Query Notification” in 11g and available even in SE.
Marcus,
Nice ideas. “Continuous Query Notification” is a feature that I was unaware of – probably a sign that I need to stop reading/reviewing books, and instead read the official Oracle Database documentation. 🙂
“Continuous Query Notification”:
http://docs.oracle.com/cd/E11882_01/appdev.112/e25518/adfns_cqn.htm
Hi Charles,
Funny that I recently wrote a blog article about just such a solution
http://hourim.wordpress.com/2012/03/16/auditing-updates-and-deletes/
Because I was receiving e-mails and questions about auditing changes, I finally decided to write the above very simple article
Mohamed,
I just read your article – nice, and you covered the topic a week before it was mentioned here.
For your emp_history table, would you grant to PUBLIC just INSERT, and not SELECT, UPDATE, and DELETE, to minimize the chances of the problems mentioned by Pete? Or possibly call a definer’s rights (http://docs.oracle.com/cd/E11882_01/timesten.112/e21639/accesscntl.htm#BABDDCHC ) procedure within the trigger code to perform the insert into your _HISTORY tables?
I see that your method makes use of two table triggers – does that present more of a performance overhead than a single trigger?
Two of your _HISTORY columns are defined as follows:
I would probably create those two columns with the following definitions so that I would not need to populate those columns in a trigger (that is not to say that this method is better than your method):
I have not experimented yet, but could a DEFAULT value also be declared for your lv_module column?
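The column definitions from the referenced article are not reproduced above, but a hypothetical illustration of the DEFAULT idea might look like this (the column names and types are assumptions, not Mohamed’s actual definitions):

```sql
-- DEFAULT values populate the audit columns without any trigger code:
CREATE TABLE EMP_HISTORY (
  EMPNO    NUMBER,
  ENAME    VARCHAR2(10),
  SAL      NUMBER,
  DML_USR  VARCHAR2(30) DEFAULT USER,
  DML_DAT  DATE         DEFAULT SYSDATE
);
```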
Charles,
What I have implemented in several real life applications is a little bit different from the example I mentioned in my blog. In these applications the table to be audited (emp here) contains the columns dml_usr, dml_dat, and dml_pgm, so that the historical table (emp_history here) does not need a trigger at all.
Yes, your suggestion to use DEFAULT values could be envisaged and might give the same results.
As regards performance, I can assure you that this audit history has never caused us any noticeable performance problem. In addition, I can tell you that what was initially implemented as an audit task (to maintain a change history) has revealed itself, beyond what you could imagine, as a wonderful tool for debugging and understanding bugs in PRODUCTION, just by looking at those dml_usr, dml_dat, dml_pgm columns (which in reality have been implemented as user_ins, dat_ins, pgm_ins and user_upd, dat_upd, pgm_upd).
Hi!
I looked at your example and I do not see an easy way to see only the changes, rather than whole rows. Did I miss something?
Rg
Damir
Damir,
I am not sure that I understand your question. My VBS script and EXE referenced above allow the user to select which columns of a table to log, and which columns that undergo changes will trigger the logging. The existing and the new values may be captured in the same logging table row. If you capture the existing and the new values, you can determine how the value has changed, and you can determine how the row changed with the assistance of the LOG_TRANSACTION_TYPE column in the logging table.
Hi Charles,
I think that there are a lot of options here. The first is prevention: stop the changes from being made; sometimes this is harder than it seems when specific privs, sweeping privs, and privs granted via roles are taken into account. You should also use core audit to detect attempted change (assuming it is now locked down).

Don’t copy the table, for two reasons. First, a copy is just as good to steal from. Imagine that the copy is not locked down and someone forces the app to read the copy instead of the real table by use of the owner account via the app, or via alter session set current_schema….. or via private or public synonyms…..or….. Second, you should not duplicate the data per se anyway, as two copies then need to be secured generally.

Other options include redo mining to detect change, or flashback to detect change, or, if you already clone the database, comparing the clone to the real table if you can open the clone. For me it’s lockdown, audit, triggers… this seems simplest.
cheers
Pete
Hi Pete,
Great insight on this particular change logging problem; you mentioned several items/issues that I had not previously considered. Thank you for sharing your wide-ranging knowledge of Oracle security.
I agree with Pete that having another table (call it the audit table) is also vulnerable to being changed by anyone who has access to it. Obviously there is a need to record these changes somewhere, and excluding flat files, a table seems to be the best option.
My understanding was that the OP was working on MSSQL, so log mining, flashback, etc. are Oracle-specific options. It is also important to note that audit of this type should be easily accessible. In other words, it should be routine to access the audit information at any time.
To this end, I would think that having triggers on the main table, with an audit table kept in a schema accessible to DBA users only, will be best.
I also note that Charles wrote his SQL to be sort of generic (ANSI compliant) and database agnostic. I decided to write something similar for Sybase that I believe should also work for MSSQL. Anyway, here we go:
First let us have a table with two columns only. The first column is an identity (read: sequence) and the second column is varchar(30). I kept all columns small for the sake of this demo.
First let us create the base table. I call it source.
Now in another database (read schema) create the source_audit table. The database is called DBA_CONTROL
Now that both tables are created go back and create three triggers (insert/update/delete) on source
inserted and deleted are temporary tables created in the transaction log (redo + undo) of the database, in the image of the original table. They keep the after image and before image of rows, much like :new and :old in Oracle. suser_name() is a function that returns the login name of the process, much like USER. program_name and hostname are the application and host names, much like v$session.program and v$session.machine. These are all obtained from master..sysprocesses, much like sys.v$session.
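Mich’s trigger code does not appear above; a hedged Transact-SQL sketch of what the update trigger might look like, with the table and column names assumed from his description rather than taken from his actual code:

```sql
-- Log the before image of each updated row (deleted = :old in Oracle terms),
-- along with who made the change and from which program/host.
CREATE TRIGGER trg_source_upd ON source
FOR UPDATE
AS
  INSERT INTO DBA_CONTROL..source_audit
    (id, col1, dml_type, dml_user, dml_program, dml_host, dml_date)
  SELECT d.id, d.col1, 'UPDATE', suser_name(),
         p.program_name, p.hostname, getdate()
  FROM   deleted d,
         master..sysprocesses p
  WHERE  p.spid = @@spid
```

The insert and delete triggers would follow the same pattern, reading from the inserted and deleted tables respectively.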
Anyway if I try this we can see what is happening.
OK now check what is going on on source_audit table
Pretty much the changes we made to the source table
Mich
Mich,
Thanks for posting a working example for Sybase. I am not familiar with the Sybase scripting language, but I follow what the code in your example is accomplishing.
Hi Charles,
It is standard transact SQL used by Sybase and MSSQL. I believe the OP was working on MSSQL backend so the code should work for OP as well (hopefully).
Cheers
You are correct that the OP was using Microsoft SQL Server – he was previously running Oracle Database but switched to SQL Server to save money.
My comment was intended to state that while I probably could not write a trigger in SQL Server (or Sybase), I could understand what your trigger was accomplishing (even though the trigger syntax appears a bit different from that used by Oracle). I used to be able to say the same about Fortran, COBOL, ARREX and a couple of other programming languages, but those languages are just distant memories now.
The same code in Oracle
Well it works. Let us try it
I am sure you guys can write better code than me, so please suggest where it can be improved.
Mich