March 22, 2012
Suppose that you receive a request stating that a particular table in one of your databases must be monitored for changes. One such table might be the one that lists the ERP system’s suppliers and their addresses – you would not want someone to be able to set up a legitimate supplier, and later have the billing remit-to name and address changed without the change being detected.
What approach would you take to solve the above problem? A question similar to the above arrived recently in an ERP mailing list email – the original poster (OP) is using a SQL Server RDBMS, so that might change the proposed solution just a bit.
Here is the “How simple is too simple?” suggestion that I offered:
Periodically, create a VENDOR_SAVE table (drop it if it already exists, or delete all rows and re-insert from the original source table):
CREATE TABLE VENDOR_SAVE AS
SELECT *
FROM   VENDOR;

Now, just wait a while.
The question then becomes, how do we detect:
- A new row (record) added to the original source (VENDOR) table.
- An old row (record) deleted from the original source (VENDOR) table.
- A change to any column (program field) in the original source (VENDOR) table since the last time the VENDOR_SAVE table was created/refreshed.
Let’s start with the first two bullet points. I will write the SQL statements so that they should work with Oracle Database 9.0.1 and above, and with SQL Server with very few changes, even if I feel a bit ANSI about doing so. I will use the COALESCE function rather than the NVL function, and CASE syntax rather than the equivalent DECODE syntax.
To identify cases where a row has appeared in, or disappeared from the original source (VENDOR) table, we can simply perform a full outer join between the original source table and the historical mirror image of the original table (VENDOR_SAVE). We are only interested in cases where the primary key column (ID) is found in exactly one of the two tables:
SELECT
  COALESCE(V.ID, V2.ID) AS ID,
  COALESCE(V.NAME, V2.NAME) AS NAME,
  COALESCE(V.ADDR_1, V2.ADDR_1) AS ADDR_1,
  CASE WHEN V.ID IS NULL THEN 'VENDOR DELETED' ELSE 'VENDOR ADDED' END AS PROBLEM
FROM
  VENDOR V
FULL OUTER JOIN
  VENDOR_SAVE V2
ON (V.ID = V2.ID)
WHERE
  V.ID IS NULL
  OR V2.ID IS NULL;

So, the above SQL statement satisfies the first two bullet points (note that when V.ID is NULL, the row exists only in the VENDOR_SAVE snapshot, meaning that the vendor was deleted; when V2.ID is NULL, the vendor was added). The third bullet point is a little more challenging to accomplish… unless of course we employ UNION labor. If we have two row sources with identical columns, and UNION the row sources together, the resulting row source will contain no entirely duplicated rows from the two original row sources (two identical rows will be reduced to a single row). If there were no changes to any of the column values (or if the row was added to or deleted from the original source table), there will be a single row for each primary key column value. If any columns were changed, there will be two rows containing that primary key column value.
Let’s build a SQL statement that UNIONs the rows from the two tables together, and counts the number of rows for each primary key value:
SELECT
  COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
  VL.ID,
  VL.NAME,
  VL.ADDR_1
FROM
  (SELECT * FROM VENDOR
   UNION
   SELECT * FROM VENDOR_SAVE) VL;

To complete the requirement for bullet point 3 above, we need to eliminate all rows from the result set where there is a single row for the primary key value:
SELECT DISTINCT
  VL2.ID,
  VL2.NAME,
  VL2.ADDR_1,
  'VALUE CHANGE' PROBLEM
FROM
  (SELECT
     COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
     VL.ID,
     VL.NAME,
     VL.ADDR_1
   FROM
     (SELECT * FROM VENDOR
      UNION
      SELECT * FROM VENDOR_SAVE) VL) VL2
WHERE
  VL2.CNT > 1;

As a final step, we should combine the two result sets into a single result set using UNION ALL:
SELECT
  COALESCE(V.ID, V2.ID) AS ID,
  COALESCE(V.NAME, V2.NAME) AS NAME,
  COALESCE(V.ADDR_1, V2.ADDR_1) AS ADDR_1,
  CASE WHEN V.ID IS NULL THEN 'VENDOR DELETED' ELSE 'VENDOR ADDED' END AS PROBLEM
FROM
  VENDOR V
FULL OUTER JOIN
  VENDOR_SAVE V2
ON (V.ID = V2.ID)
WHERE
  V.ID IS NULL
  OR V2.ID IS NULL
UNION ALL
SELECT DISTINCT
  VL2.ID,
  VL2.NAME,
  VL2.ADDR_1,
  'VALUE CHANGE' PROBLEM
FROM
  (SELECT
     COUNT(*) OVER (PARTITION BY VL.ID) AS CNT,
     VL.ID,
     VL.NAME,
     VL.ADDR_1
   FROM
     (SELECT * FROM VENDOR
      UNION
      SELECT * FROM VENDOR_SAVE) VL) VL2
WHERE
  VL2.CNT > 1
ORDER BY
  ID;
While somewhat limited in usefulness, the above approach will indicate which rows in the original source table should be examined because the column values in those rows changed (or were added or deleted).
Another, potentially more useful approach involves setting up a logging trigger and logging table. I previously shared a VBS script that helps to automate and standardize the process of creating the logging trigger and logging table. A very similar feature is built into my Hyper-Extended Oracle Performance Monitor program – but, much like the VBS script, this solution is useless for the OP who uses a SQL Server backend.
What are the other options?
- Oracle Database’s built-in auditing capabilities.
- Auditing built into the ERP system (this particular ERP system writes all audit/change records to a single table that uses a VARCHAR2 column to capture the before/after images of the values stored in columns).
- Any other options? (The mess that someone can create with a bit of idle time on their hands…)
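Oracle’s traditional audit facility can cover the monitoring requirement with very little code. A minimal sketch, assuming the AUDIT_TRAIL initialization parameter is set to DB,EXTENDED and that the table lives in a hypothetical APP schema:

```sql
-- Audit all DML against the vendor table (schema name APP is an assumption):
AUDIT INSERT, UPDATE, DELETE ON APP.VENDOR BY ACCESS;

-- Review the captured changes (SQL_TEXT requires AUDIT_TRAIL=DB,EXTENDED):
SELECT USERNAME, ACTION_NAME, TIMESTAMP, SQL_TEXT
FROM   DBA_AUDIT_TRAIL
WHERE  OWNER = 'APP'
  AND  OBJ_NAME = 'VENDOR'
ORDER BY TIMESTAMP;
```

Note that standard auditing records who changed the table and when, but not the before/after column values – a logging trigger approach is still needed if the values themselves must be captured.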
Hi
Have you seen my approach to this topic:
http://damir-vadas.blogspot.com/2012/02/browsing-history-data-changes-for.html
Rg
Damir
Damir,
It appears that the approach mentioned in your blog article is similar to my approach – thanks for leaving the link.
I remember thinking several years ago: “I need to create another silly logging table and trigger that is almost the same as the last 10 that I manually created?” As hard as I tried to standardize my naming convention, it seemed that every logging table that I manually created was just a bit different from the last. That is about the time that I decided to create the Configure Data Change Log feature in my program (https://hoopercharles.files.wordpress.com/2012/03/heopm_configure_data_change_log.jpg ).
Charles,
This program that you mention – is it public or private?
Any demo/download?
Rg
Damir
Damir,
A Windows VBS/Internet Explorer version:
https://hoopercharles.wordpress.com/2010/08/25/oracle-logging-trigger-creator/
An executable version that is part of a much larger project:
https://hoopercharles.wordpress.com/2012/03/15/hyper-extended-oracle-performance-monitor-6-0-beta/
Charles,
Thank you … will take a look
Rg,
Damir
Consider triggers on the table for all the DML actions, to allow you to put in any logic to prevent “unauthorized” (defined your way) inserts/updates/deletes. This should work for any engine: Oracle, Sybase, MSSQL.
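A minimal Oracle sketch of the trigger-based prevention idea; the trigger name, the approved user list, and the error message are assumptions for illustration, not a definitive implementation:

```sql
-- Reject name/address changes made by anyone other than an approved user;
-- RAISE_APPLICATION_ERROR causes the triggering statement to roll back.
CREATE OR REPLACE TRIGGER VENDOR_PROTECT
BEFORE UPDATE OF NAME, ADDR_1 ON VENDOR
FOR EACH ROW
BEGIN
  IF USER NOT IN ('AP_SUPERVISOR') THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Unauthorized change to vendor name/address');
  END IF;
END;
/
```

An equivalent trigger could log the attempt rather than reject it, which covers the monitoring half of the requirement.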
Mich
Oracle 11g Release 2 has “Total Recall” which will capture all transactions for a specified table.
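For reference, enabling Total Recall (the Flashback Data Archive feature) looks roughly like the following sketch; the tablespace name and the retention period are assumptions:

```sql
-- Create an archive and attach the vendor table to it:
CREATE FLASHBACK ARCHIVE VENDOR_FBA
  TABLESPACE FBA_TS
  RETENTION 1 YEAR;

ALTER TABLE VENDOR FLASHBACK ARCHIVE VENDOR_FBA;

-- Historical versions of the rows may then be queried with AS OF:
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '7' DAY);
```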
Jimmy,
Thanks for the mention of that feature – I was not aware that it existed. A search of the documentation suggests that it is an extra-cost option that may be added to the Enterprise Edition:
http://docs.oracle.com/cd/E11882_01/license.112/e10594/options.htm#CJABHEBI
It is also not very clear whether you just want monitoring, or want to prevent unauthorised changes from going through. Certainly triggers can be used for either, with the trigger raising an error to roll back the action if it is not deemed authorised.
Mich,
I think that the OP was just interested in monitoring the changes… but of course the next logical step is probably prevention, so that too is a possible discussion item.
Given that some prerequisites are met (flashback, undo retention, small data volume, probably not too much undo for your check intervals in the whole database), one could regularly subtract (MINUS operator) the existing table data from the table data as of some time ago (and vice versa) with the AS OF syntax to check for changed data.
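A sketch of that MINUS / AS OF approach; the one-hour check interval is an assumption – it must fit within the available undo retention:

```sql
-- Rows that were added or changed during the last hour:
SELECT * FROM VENDOR
MINUS
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR);

-- And vice versa: rows that were deleted or changed during the last hour:
SELECT * FROM VENDOR AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR)
MINUS
SELECT * FROM VENDOR;
```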
You could also use “Database Change Notification”, which is called “Continuous Query Notification” in 11g and available even in SE.
Marcus,
Nice ideas. “Continuous Query Notification” is a feature that I was unaware of – probably a sign that I need to stop reading/reviewing books, and instead read the official Oracle Database documentation. 🙂
“Continuous Query Notification”:
http://docs.oracle.com/cd/E11882_01/appdev.112/e25518/adfns_cqn.htm
Hi Charles,
Funny that I recently wrote a blog article about just such a solution
http://hourim.wordpress.com/2012/03/16/auditing-updates-and-deletes/
Because I was receiving e-mails and questions about auditing changes, I finally decided to write the above very simple article
Mohamed,
I just read your article – nice, and you covered the topic a week before it was mentioned here.
For your emp_history table, would you grant to PUBLIC just INSERT, and not SELECT, UPDATE, and DELETE, to minimize the chances of the problems mentioned by Pete? Or possibly call a definer’s rights (http://docs.oracle.com/cd/E11882_01/timesten.112/e21639/accesscntl.htm#BABDDCHC ) procedure within the trigger code to perform the insert into your _HISTORY tables?
I see that your method makes use of two table triggers – does that present more of a performance overhead than a single trigger?
Two of your _HISTORY columns are defined as follows:
I would probably create those two columns with the following definitions so that I would not need to populate those columns in a trigger (that is not to say that this method is better than your method):
I have not experimented yet, but could a DEFAULT value also be declared for your lv_module column?
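The column definitions from the referenced article are not reproduced above, but a hypothetical illustration of the DEFAULT idea might look like this (the column names and types are assumptions, not Mohamed’s actual definitions):

```sql
-- DEFAULT values populate the audit columns without any trigger code:
CREATE TABLE EMP_HISTORY (
  EMPNO    NUMBER,
  ENAME    VARCHAR2(10),
  SAL      NUMBER,
  DML_USR  VARCHAR2(30) DEFAULT USER,
  DML_DAT  DATE         DEFAULT SYSDATE
);
```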
Charles,
What I have implemented in several real life applications is a little bit different from the example I mentioned in my blog. In these applications the table to be audited (emp here) contains the columns dml_usr, dml_dat, and dml_pgm, so that the historical table (emp_history here) does not need a trigger at all.
Yes, your suggestion to use DEFAULT values could be envisaged and might give the same results.
As regards performance, I can assure you that this audit history has never caused us any noticeable performance problem. In addition, I can tell you that what was initially implemented as an audit task (to maintain a change history) has revealed itself, beyond what you could imagine, as a wonderful tool for debugging and understanding bugs in PRODUCTION, just by looking at those dml_usr, dml_dat, dml_pgm columns (which in reality have been implemented as user_ins, dat_ins, pgm_ins and user_upd, dat_upd, pgm_upd).
Hi!
I looked at your example and I do not see an easy way to see only the changes, rather than whole rows. Did I miss something?
Rg
Damir
Damir,
I am not sure that I understand your question. My VBS script and EXE referenced above allow the user to select which columns of a table to log, and which columns that undergo changes will trigger the logging. The existing and the new values may be captured in the same logging table row. If you capture the existing and the new values, you can determine how the value has changed, and you can determine how the row changed with the assistance of the LOG_TRANSACTION_TYPE column in the logging table.
Hi Charles,
I think that there are a lot of options here. The first is prevention: stop the changes from being made; sometimes this is harder than it seems when specific privs, sweeping privs, and privs granted via roles are taken into account. You should also use core audit to detect attempted change (assuming it is now locked down).

Don’t copy the table, for two reasons. First, a copy is just as good to steal from. Imagine that the copy is not locked down and someone forces the app to read the copy instead of the real table by use of the owner account via the app, or via alter session set current_schema….. or via private or public synonyms…..or….. Second, you should not duplicate the data per se anyway, as two copies then need to be secured generally.

Other options include redo mining to detect change, or flashback to detect change, or, if you already clone the database, comparing the clone to the real table if you can open the clone. For me it’s lockdown, audit, triggers… this seems simplest.
cheers
Pete
Hi Pete,
Great insight on this particular change logging problem; you mentioned several items/issues that I had not previously considered. Thank you for sharing your wide-ranging knowledge of Oracle security.
I agree with Pete that having another table (call it the audit table) is also vulnerable to being changed by anyone who has access to it. Obviously there is a need to record these changes somewhere, and excluding flat files, a table seems to be the best option.
My understanding was that the OP was working on MSSQL, so log mining, flashback, etc. are Oracle-specific options. It is also important to note that audit of this type should be easily accessible. In other words, it should be routine to access the audit information at any time.
To this end, I would think that having triggers on the main table, with an audit table kept in a schema accessible to DBA users only, will be best.
I also note that Charles wrote his SQL to be sort of generic (ANSI compliant) and database agnostic. I decided to write something similar for Sybase that I believe should also work for MSSQL. Anyway, here we go:
First let us have a table with two columns only. The first column is an identity (read: sequence) and the second column is varchar(30). I kept all columns small for the sake of this demo.
First let us create the base table. I call it source.
Now in another database (read schema) create the source_audit table. The database is called DBA_CONTROL
Now that both tables are created go back and create three triggers (insert/update/delete) on source
inserted and deleted are temporary tables created in the transaction log (redo + undo) of the database, in the image of the original table. They keep the after image and before image of rows, much like :new and :old in Oracle. suser_name() is a function that returns the login name of the process, much like USER. program_name and hostname are the application and host names, much like v$session.program and v$session.machine. These are all obtained from master..sysprocesses, much like sys.v$session.
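Mich’s trigger code does not appear above; a hedged Transact-SQL sketch of what the update trigger might look like, with the table and column names assumed from his description rather than taken from his actual code:

```sql
-- Log the before image of each updated row (deleted = :old in Oracle terms),
-- along with who made the change and from which program/host.
CREATE TRIGGER trg_source_upd ON source
FOR UPDATE
AS
  INSERT INTO DBA_CONTROL..source_audit
    (id, col1, dml_type, dml_user, dml_program, dml_host, dml_date)
  SELECT d.id, d.col1, 'UPDATE', suser_name(),
         p.program_name, p.hostname, getdate()
  FROM   deleted d,
         master..sysprocesses p
  WHERE  p.spid = @@spid
```

The insert and delete triggers would follow the same pattern, reading from the inserted and deleted tables respectively.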
Anyway if I try this we can see what is happening.
OK now check what is going on on source_audit table
Pretty much the changes we made to the source table
Mich
Mich,
Thanks for posting a working example for Sybase. I am not familiar with the Sybase scripting language, but I follow what the code in your example is accomplishing.
Hi Charles,
It is standard transact SQL used by Sybase and MSSQL. I believe the OP was working on MSSQL backend so the code should work for OP as well (hopefully).
Cheers
You are correct that the OP was using Microsoft SQL Server – he was previously running Oracle Database but switched to SQL Server to save money.
My comment was intended to state that while I probably could not write a trigger in SQL Server (or Sybase), I could understand what your trigger was accomplishing (even though the trigger syntax appears a bit different from that used by Oracle). I used to be able to say the same about Fortran, COBOL, ARREX and a couple of other programming languages, but those languages are just distant memories now.
The same code in Oracle
Well it works. Let us try it
I am sure you guys can write better code than me, so please suggest where it can be improved.
Mich