See Sharp Objects, Run with Sharp Objects, Crash with Sharp Objects – Obscure and Incorrect Error Messages in Commercial Products

January 15, 2016

Since October 2015 I have been fighting with an application programming interface (API) that works with an enterprise resource planning (ERP) system.  The API, which uses Microsoft .Net 4.0 or higher, is intended to simplify the process of automating the creation of various types of documents within the ERP system.  For example, the API could be used to quickly create 1,000 new part numbers in the ERP system, using an Excel spreadsheet as the source data (a custom program would need to read the Excel spreadsheet and plug that information into the formatting syntax expected by the API).  The API could also be used for creating work orders in the ERP system which would then be used to produce the parts, receiving purchased parts into inventory, issuing parts in inventory to work orders, receiving completed parts into inventory, creating shipping documents to send the parts to customers, generating invoice records to accept payment from customers, and creating a variety of other types of documents.  I have been using Microsoft C# 2015 to interface with the API, and the ERP company helpfully includes a standalone executable (.EXE) that tests whether the API is able to communicate with the Oracle database through the Oracle Managed Data Provider for .Net.  This API replaces an older API that exposed a COM interface, which I previously used extensively to automate the creation of various types of documents.

Why have I been fighting with the application programming interface for this ERP system?  To err is human, but two errors does not compute.  While trying to determine how to use the API, I copied a code sample from the API documentation into a new project in Microsoft C# 2015.  The documentation indicates that the methods for using the API are similar to those that I used in the older API that exposed a COM interface, so it appeared that the transition to the new API would be relatively easy.  That was my thought, anyway, until I attempted to run the program.

[Screenshot: the C# test program throwing a “Failed to enable constraints” exception]

That cannot be good.  I suppose that it is possible that the database schema has non-enabled constraints, as it had just gone through a conversion from an earlier version of the ERP system to the latest version a couple of days prior.  Let’s check:

SELECT
  CONSTRAINT_NAME,
  CONSTRAINT_TYPE,
  TABLE_NAME,
  STATUS
FROM
  DBA_CONSTRAINTS
WHERE
  OWNER='SYSADM'
  AND STATUS<>'ENABLED';
 
no rows selected

I guess that is not the cause of the error.  Let’s take the code out of a try block, and run the program again (click the picture for a larger view):

[Screenshot: the System.Data.ConstraintException details with the code outside the try block – click the picture for a larger view]

A System.Data.ConstraintException in LsaCore.dll: “Failed to enable constraints.  One or more rows contain values violating non-null, unique, or foreign-key constraints.”  Ouch, what did I do wrong?  I was just trying to create a new part number in the database, and the exception was thrown before any part specifications were provided.  Maybe it is because I am trying to develop on a 64 bit computer, or maybe there is some missing Windows registry entry that I still need to create?

Let’s try the standalone executable (.EXE) that the ERP provider distributes to test the API – on a 32 bit computer that was used for a test deployment of the new version of the ERP system.  This computer was an otherwise fresh install of Windows 7 with all security updates installed prior to the deployment of the new version of the ERP system (click the picture for a larger view):

[Screenshot: the ERP provider’s API test executable failing with the “Failed to enable constraints” error – click the picture for a larger view]

That “Failed to enable constraints.  One or more rows contain values violating non-null, unique, or foreign-key constraints.” error message is a bit too familiar at this point.  I need a 10046 trace at level 12 to see what might be triggering the constraints error.  Logging into the database as the SYS user using SQL*Plus, I created an after logon trigger to enable a 10046 trace for my C# test program as well as the API test tool provided with the ERP system:

CREATE OR REPLACE TRIGGER LOGON_CAPTURE_10046 AFTER LOGON ON DATABASE
DECLARE
  SHOULD_EXECUTE INTEGER;
  TRACEFILE VARCHAR2(150);
BEGIN
  -- Enable tracing only for sessions whose program name begins with VISUA or VMFGC
  -- (the C# test program and the API test tool described above).
  SELECT
    DECODE(SUBSTR(UPPER(PROGRAM),1,5),'VISUA',1,0) +
    DECODE(SUBSTR(UPPER(PROGRAM),1,5),'VMFGC',1,0)
  INTO
    SHOULD_EXECUTE
  FROM
    V$SESSION
  WHERE
    SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);
  IF SHOULD_EXECUTE > 0 THEN
    TRACEFILE := 'ALTER SESSION SET TRACEFILE_IDENTIFIER = '''||USER||TO_CHAR(SYSDATE,'YYMMDDHH24MI')||'''';
    EXECUTE IMMEDIATE TRACEFILE;
    EXECUTE IMMEDIATE 'ALTER SESSION SET MAX_DUMP_FILE_SIZE=UNLIMITED';
    EXECUTE IMMEDIATE 'ALTER SESSION SET TIMED_STATISTICS=TRUE';
    EXECUTE IMMEDIATE 'ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 12''';
  END IF;
END;
/

Repeating the test with my C# test program, I found that a few 10046 trace files were created on the database server each time I ran the C# test program.  One of the 10046 trace files included:

PARSE ERROR #425887032:len=42 dep=0 uid=159 oct=3 lid=159 tim=2402951856734 err=904
SELECT DB_VERSION FROM APPLICATION_GLOBAL
WAIT #425887032: nam='SQL*Net break/reset to client' ela= 2 driver id=1413697536 break?=1 p3=0 obj#=-1 tim=2402951856800
WAIT #425887032: nam='SQL*Net break/reset to client' ela= 559 driver id=1413697536 break?=0 p3=0 obj#=-1 tim=2402951857369
WAIT #425887032: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2402951857387

That is interesting.  There is no column named DB_VERSION in the APPLICATION_GLOBAL table, but there is a column named DBVERSION.  Maybe that is the cause of the “Failed to enable constraints” error?  After double-checking the table definition, the column is in fact supposed to be named DBVERSION, not DB_VERSION, so maybe the programmer who created the API DLLs made a typo?

Taking a look at another trace file, I found another potential problem:

PARSE ERROR #415492776:len=45 dep=0 uid=159 oct=3 lid=159 tim=2402951781868 err=942
SELECT DB_VERSION FROM VQ_APPLICATION_GLOBAL
WAIT #415492776: nam='SQL*Net break/reset to client' ela= 2 driver id=1413697536 break?=1 p3=0 obj#=-1 tim=2402951781960
WAIT #415492776: nam='SQL*Net break/reset to client' ela= 4892 driver id=1413697536 break?=0 p3=0 obj#=-1 tim=2402951786862
WAIT #415492776: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2402951786911

There is no table in the database named VQ_APPLICATION_GLOBAL, and per the table creation script for the ERP system, that table probably should not exist.  Maybe the programmer who created the API DLLs made another typo?  I tried creating a virtual DB_VERSION column to see if there would be any improvement:

ALTER TABLE APPLICATION_GLOBAL ADD (DB_VERSION NVARCHAR2(10) GENERATED ALWAYS AS (TRIM(DBVERSION)));

Well, that took care of the Failed to enable constraints error, but now I have a new problem when I run the C# test program:

[Screenshot: the C# test program failing with the “Dataspace name VMFG not found” error]

What does “Dataspace name VMFG not found” mean?  Is that an improvement over the “Failed to enable constraints” error?  Let’s give the standalone executable (.EXE) that the ERP provider distributes to test the API a go on the 32 bit computer (click the picture for a larger view):

[Screenshot: the API test executable reporting “Dataspace name VMFG not found” – click the picture for a larger view]

“Dataspace name VMFG not found” – at least there is some level of consistency.

Switching back to the C# test program, I decided to enable a bit more of the options in the Exception Settings window (click the picture for a larger view):

[Screenshot: exception detail for the “Dataspace name VMFG not found” error with additional Exception Settings options enabled – click the picture for a larger view]

So, the “Dataspace name VMFG not found” error was thrown in the LsaDataLogic.TableDefinition.LoadTableDefinitions procedure in LsaCore.  That certainly is specific.  But where is “VMFG” coming from?  That is not the name of the database that I am trying to work with, although it is the name of a database that is frequently configured for this ERP system.  For fun I opened one of the API DLL files using WordPad (click the picture for a larger view):

[Screenshot: the API DLL opened in WordPad, showing VMFG. prefixes on the table names in the embedded SQL statements – click the picture for a larger view]

I see VMFG. prefixing what appears to be every table name in the SELECT type SQL statements within the DLL.  I checked with one of the most experienced ERP support people at the ERP company, and the “Dataspace name VMFG not found” error was a complete mystery – no one in support had ever seen that particular error message.  The “Failed to enable constraints.  One or more rows contain values violating non-null, unique, or foreign-key constraints.” error that I saw earlier was also a complete mystery at that line in the C# test code.

Taking another look at the multiple 10046 trace files that were created before I tried creating the virtual DB_VERSION column, I found that the API was actually sending at least four invalid SQL statements to the database:

SELECT DB_VERSION FROM APPLICATION_GLOBAL
 
SELECT DB_VERSION FROM VQ_APPLICATION_GLOBAL
 
SELECT DBVERSION , SITE_ID , COMPANY_MANAGER FROM APPLICATION_GLOBAL
 
SELECT APP_VERSION , PAY_HELP , HR_HELP FROM APPLICATION

The API was also using two different techniques to identify the name of the primary key column for the APPLICATION_GLOBAL table – a table that does not, and should not, have a primary key per the table creation script for the ERP system:

SELECT COLS.COLUMN_NAME AS COLNAME , COLS.POSITION AS PKCOLSEQNUM FROM ALL_CONSTRAINTS CONS , ALL_CONS_COLUMNS COLS WHERE COLS.TABLE_NAME = 'APPLICATION_GLOBAL' AND CONS.CONSTRAINT_TYPE = 'P' AND CONS.CONSTRAINT_NAME = COLS.CONSTRAINT_NAME AND CONS.OWNER = COLS.OWNER ORDER BY COLS.TABLE_NAME , COLS.POSITION
 
SELECT I.INDEX_NAME AS INDEX_NAME FROM USER_INDEXES I WHERE I.TABLE_NAME = 'APPLICATION_GLOBAL' AND I.TABLE_OWNER = 'SYSADM' AND I.UNIQUENESS = 'UNIQUE' ORDER BY I.INDEX_NAME

Wow!  I asked the experienced ERP support person if any of the above makes sense.  I am not sure if I ever received a response to that question.

So, I enabled tracing for the Oracle Managed Data Provider for .Net driver used by the API with a modification to the MACHINE.CONFIG file found in the folder C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config.  Just before the end of the configSections section of the file (about ½ way through the file – just before the line </configSections>) I added the following line:

    <section name="oracle.manageddataaccess.client" type="OracleInternal.Common.ODPMSectionHandler, Oracle.ManagedDataAccess" />

Next, just before the </configuration> line at the end of the file I added the following lines (specify the correct location of the tnsnames.ora file on the tns_admin line):

  <oracle.manageddataaccess.client>
    <version number="*">
       <settings>
          <setting name="tns_admin" value="C:\Oracle\product\11.2.0\client_1\network\admin\" />
          <setting name="TraceOption" value="1" />
          <setting name="TraceLevel" value="127" />
          <setting name="TraceFileLocation" value="C:\Trace\" />
       </settings>
    </version>
  </oracle.manageddataaccess.client>

I then created a folder named Trace in the root of the C: drive (note that if UAC (User Account Control) is enabled, it may be necessary to grant everyone full permissions to that folder if it is located in the root of the C: drive).  The generated trace files initially did not seem to provide any additional insight into the problem beyond what was found in the 10046 trace files.

Earlier this week the experienced ERP support person arranged for a cross-continent video conference call with several other people at the ERP company to help identify the source of the problems that I experienced with the API for the ERP system.  That video conference call lasted a bit over two and a half hours with no solution, but it did include a demonstration that the API for the ERP system works with the sample database that is used by the ERP company.  The support person even went so far as to record and send to me a Process Monitor trace of the successful test execution, and he also enabled a 10046 trace of the successful test execution.  We only spent a couple of minutes scanning through the 10046 trace files during the call.  Oddly enough, during the successful test execution, I noticed that the following two SQL statements appeared in his 10046 trace files with parse errors:

SELECT DB_VERSION FROM APPLICATION_GLOBAL
  
SELECT DB_VERSION FROM VQ_APPLICATION_GLOBAL

Interesting – when I pointed out those two parse errors (along with the other two that I mentioned earlier) to the people on the video conference call, no one knew why the API attempted to execute those SQL statements.  Along with the Process Monitor trace, the senior support person sent a copy of their sample database to me, exported using Data Pump, so that I could try to understand just what was corrupt on the essentially barebones Windows 7 32 bit computer.

After finding that the ERP company’s sample database worked fine with my essentially barebones Windows 7 32 bit computer, I reached a somewhat important conclusion.  The “Dataspace name VMFG not found” error is bad news; essentially, it means that the API does not think that it is accessing a Visual Manufacturing database.  The DB_VERSION queries against the APPLICATION_GLOBAL and VQ_APPLICATION_GLOBAL tables are apparently supposed to fail!  But why could no one on that 2.5+ hour cross-continent video conference call come to the same conclusion?  I know that two of the people on the conference call have been heavily involved with the ERP system development for well over 15 years, so they are undoubtedly very knowledgeable about the subject.

After dropping the virtual column that I created, I then repeated the C# test program’s tests with the two databases.  Comparing the trace files for my company’s database and the ERP company’s sample database, I found a difference (see the picture below – click for a larger view).

[Screenshot: side-by-side comparison of the ODP.NET trace files for my company’s database and the ERP company’s sample database – click for a larger view]

Ah, I found a difference.  With the sample database, after the API’s DLL performed an OracleDataReader.Read() call, it executed an OracleDataReader.NextResult() call.  With my company’s database, after the OracleDataReader.Read() call, the API’s DLL performed an OracleDataReader.Fill() call followed by an OracleDataReader.Close() call.  But why?  Here is the SQL statement that appeared in the trace file just prior to the highlighted line:

SELECT ACTIVATION_KEY , ACTIVITY_UDF_LABELS , ALLOW_EMAIL_DOC , AUDIT_REPORT_TIMES , BARCODE_MULTI_JOB ,
 COMPANY_ADDR_1 , COMPANY_ADDR_2 , COMPANY_ADDR_3 , COMPANY_CITY , COMPANY_COUNTRY , COMPANY_NAME ,
 COMPANY_STATE , COMPANY_ZIPCODE , CONFIGURATION_KEY , CUST_PRICE_EFF_REQ , CUST_UDF_LABELS , DBVERSION ,
 DEF_OLDEST_INV_TYPE , DOC_UDF_LABELS , DOCUMENT_DIRECTORY , DRAWING_FILE_PATH , DRAWING_LOCATOR ,
 DRAWING_VIEWER , EURO_CURRENCY_ID , FILTER_AVAIL_TRACE , ID , INSTALL_DATE , LAST_CONVERT_DATE ,
 MAXIMUM_USERS , MFG_INTERFACE_USED , PART_UDF_LABELS , PLM_ENABLED , PLM_URL , PROGRAM_PATCH_LVL ,
 PROJ_UDF_LABELS , PURC_QUOTE_TYPE , SD_SUBGROUP_MIN , SERIAL_NO , SESSION_TIMELIMIT , SHOP_UDF_LABELS ,
 SQLBASE_DATABASE_VERSION , SSO_ENABLED , STORE_MACROS_IN_DB , TABLE_PATCH_LVL , TRIGGER_PATCH_LVL ,
 VALIDATE_LOOKUPS , VEND_UDF_LABELS , VISUAL_USER_GROUP , VQ_DBNAME , VQ_DIRECTORY , VQ_ENABLED ,
 VQ_QUERY_USE , VR_UPDATES , WFL_CMNT_PWD_REQ , WO_PRIORITY_PATH , XBAR_SUBGROUP_MIN 
FROM APPLICATION_GLOBAL APPLICATION_GLOBAL 
ORDER BY ID

The Oracle Managed Data Provider’s OracleCommand object for that SQL statement, when I executed it in another C# test program, indicated that the expected row length is 10,238 bytes.  A problem?  After trying a couple of changes in my database’s APPLICATION_GLOBAL table, I noticed that the PART_UDF_LABELS, VEND_UDF_LABELS, and CUST_UDF_LABELS columns in the ERP company’s sample database were all set to NULL.  Below are the contents of the PART_UDF_LABELS column in my company’s APPLICATION_GLOBAL table:

 "% of Plate","Heat Code/Material","Customer ID","Gross Weight","Vendor","Price Each","Surcharge","Freight","EAU (Cpy to Planning Tab)","FAB/COMB"

What, a difference between a sample database and a production database (one with 20+ years of transaction history)?  :-)

I set the PART_UDF_LABELS, VEND_UDF_LABELS, and CUST_UDF_LABELS columns to all be NULL, as they are in the sample database.  All tests in the API test program were successful with my company’s database with all three of those columns set to NULL.  For the sake of completeness, I then copied only the original PART_UDF_LABELS column value from my company’s database to the sample database, and managed to cause the API test program to fail with the sample database.  Interesting…

I reported my findings to the senior support person at the ERP company.  While waiting for a follow up from him (he needs to be able to reproduce my results), I performed some additional testing.  The PART_UDF_LABELS, VEND_UDF_LABELS, and CUST_UDF_LABELS columns, as well as several other columns in that table, are defined as NVARCHAR2(250) data types, allowing up to 250 characters to be stored per row.  The PLM_URL column in that table is defined as NVARCHAR2(2000), allowing up to 2,000 characters to be stored per row.  I found that by putting the following value into the PART_UDF_LABELS column, the ERP company’s API test program would complete successfully:

"% of Plate","Heat Code/Material","Customer ID","Gross Weight"

Adding back a few more characters to that column value caused the ERP company’s API test program to fail the tests:

"% of Plate","Heat Code/Material","Customer ID","Gross Weight","Vendor"

I found that if the PART_UDF_LABELS, VEND_UDF_LABELS, and CUST_UDF_LABELS columns contained up to 64 characters, the ERP company’s API test program completed the tests successfully; 65 characters caused the API test program to fail the tests.  But what about the PLM_URL column, which permits up to 2,000 characters to be stored – surely that column cannot have a 64 character limit for companies that need to automate document creation through the ERP company’s API?  Yep, more than 64 characters in that column will also cause the API tests to fail.

To err is human, but two errors does not compute (or was it three errors identified in this article?).





See Sharp Objects, Run with Sharp Objects, Crash with Sharp Objects – Nothing but .Net, C#, and ODP.NET – 1

December 5, 2015

I have been writing computer programs for a few years – the first more than 30 years ago – and have written programs in a variety of languages including C++, Cobol, Pascal, BASIC, Visual Basic, eVB, a couple of scripting languages (including JavaScript and VBScript), and a few others that are long forgotten (I could read some Fortran, but never had a need for the language).  I used Visual Basic for DOS years ago, and after toying with Visual Basic 2.0 for an hour, I wondered about the purpose of that “toy”, yet interesting, programming language that allowed me to draw text boxes and drop-down lists on a form.  I bought and read cover to cover a number of books that described methods to utilize the quick development capabilities of Visual Basic to draw a user interface, and then jump out of the limited safety net of Visual Basic to harness the functionality provided by direct access to the Windows API.  The classic Visual Basic 6.0 is my favorite language so far, allowing for quick development and fast performance, and producing programs that are compatible with every Windows version from Windows NT 4.0 (and Windows 95) to Windows 10.

In my opinion, Microsoft mutated the clean, quick-development programming language of Visual Basic 6.0 with the release of VB.Net.  A decade ago I had to write a program that would run on a Symbol MC3090 handheld computer with a built-in barcode scanner (Symbol was later acquired by Motorola, part of Motorola’s operations were acquired by Google, and apparently Zebra is the current owner of this technology).  The MC3090 supported programs that were written in the .Net languages, such as VB.Net and C#, but not the eVB language (a language similar to VB 6.0) that was supported on the earlier Symbol PDT8146.  After a bit of head scratching, I wrote the program in VB.Net 2005.  Not only did the VB.Net solution work, it was also significantly more stable than the eVB-developed solution on the PDT8146 (the hardware change may be the primary reason for the change in stability).  I still found VB.Net’s language to be cumbersome to use compared to Visual Basic 6.0 (which is very similar to the macro language in Microsoft Excel, Access, PowerPoint, Word, and Outlook) and VBScript (which is the macro language used by the ERP system where I work).

The earlier versions of the ERP system used where I work offered a COM interface to an API (application programming interface) that could be used for a variety of activities, such as creating inventory transactions, creating purchase orders, modifying customer orders, creating work orders to produce the parts ordered by the customers, and creating the shipping records when the completed parts ship to the customers.  I relied heavily on this COM interface API with Visual Basic 6.0 (or Excel, or the macro language in the ERP system) to automate many processes that previously required manual data entry into the ERP system.  Fast forward a decade, or so, and the most recent versions of the ERP system offer only a .Net compatible API to replace the COM API.  Not only is it a .Net compatible API, but it is also not compatible with Visual Studio 2005 (I thought that .Net was supposed to mark the end of DLL Hell?).  Great, another opportunity to be forced into not using VB 6.0 for development.

I am in the process of translating an MRP (material requirements planning) program that I created 10 to 12 years ago in VB 6.0 into C# to permit the use of the new .Net API.  The program has some fairly complex programming logic that uses recursive programming to drill into the multi-level BOMs (bills of materials) – in some simple cases the multi-level BOM is associated with a single work order, while in more complex cases the multi-level BOM is associated with child, grandchild, great-grandchild, and great-great-grandchild work orders.  The goal of the MRP program is to take the open customer orders in the ERP system, and to plan the top-level, child, grandchild, great-grandchild, and great-great-grandchild work orders and the associated purchased parts so that the purchased parts are available when needed by the work orders, the parts produced by the great-great-grandchild work orders are available when required by the great-grandchild work orders, the parts produced by the great-grandchild work orders are available when needed by the grandchild work orders, the parts produced by the grandchild work orders are available when required by the child work orders, the parts produced by the child work orders are available when required by the top-level work orders, and the parts produced by the top-level work orders are available on the dates specified in the customer order.  The ideal batch (lot) quantities, of course, could be different at each of the work order levels, and the quantity due to the customer on a given date will also typically fluctuate from one day to the next.

A simple illustration of the intended end-goal of the MRP program is shown in the picture below, which is the screen capture of another program that I wrote probably 15 years ago.  The customer orders (the demand) are shown on the left half of the screen capture, and the top-level work orders (the supply) are shown on the right half of the screen capture.  The quantity ordered by the customer for shipment on a given day is variable, but the work order batch (lot) sizes are standardized by part ID.  The intended goal is to permit the completion of the production of the ordered parts one weekday (Monday through Friday) earlier than the customer’s requested ship date (so that the part from a top-level work order could complete at 11:50 PM and then ship out to the customer at 5 AM the following morning).

[Screenshot: customer order demand on the left half of the screen matched to top-level work order supply on the right half]

After using C# for a little while, for some reason the language reminds me of an odd, but very usable cross-pollination of Visual Basic 2.0 and Borland C++ that might have happened a bit over two decades ago.  Running with sharp objects is not so bad, although there are still exceptions.

After that long introduction, I will show a quick demonstration of accessing an Oracle database using Visual Basic 6.0 and C# (I will use the 2005 version of C# – the newer versions provide some alternate syntax that saves a few keystrokes while programming).  The VB 6.0 sample program uses Microsoft ADO (ActiveX Data Objects) and the Oracle Provider for OLE DB (an optional feature installed with the Oracle client, see this article).  The C# sample program uses ODP.NET (Oracle Data Provider for .NET, an optional feature installed with the Oracle client that may also be installed later from a download).

Some general guidelines for programs that interact with Oracle databases:

  • Open a database connection, and keep it open for the duration of the program, if possible.
  • Use bind variables when possible, rather than specifying literals (constants) in the SQL statement that change from one execution to the next (there are a few exceptions to this rule).  Doing so helps to minimize the performance penalty of excessive hard parses – see the sketch after this list.
  • Use a consistent format for all SQL statements submitted to the database – doing so maximizes the chances that a SQL statement will not need to be hard parsed when a similar SQL statement (possibly with just one extra space in the SQL statement) is already in Oracle’s library cache.
  • Perform tasks once, if possible, rather than over and over again.
  • Minimize the number of round trips to the database.  If an operation may be executed within a loop in your program, sending requests to the database for each iteration through the loop, think outside the loop to try to construct a single SQL statement that accomplishes the same task.  For instance, retrieve the required information into program variables either before or after the loop, or perform the processing within the database instance.
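
As a rough illustration of the bind variable guideline, the short sketch below contrasts the two approaches.  It assumes an already open OracleConnection named dbConnection, a hypothetical EMPLOYEE table with an EMPLOYEE_ID column, and the ODP.NET using directives that are added later in this article – it is a sketch of the technique, not code from the sample project:

int intEmployeeID = 100;
 
// Literal embedded in the SQL text - the statement text changes with every
// employee ID value, so each variation is likely to be hard parsed.
OracleCommand comLiteral = new OracleCommand(
    "SELECT LAST_NAME FROM EMPLOYEE WHERE EMPLOYEE_ID=" + intEmployeeID.ToString(),
    dbConnection);
 
// Bind variable - the SQL text never changes, only the value bound to :1 does,
// so the statement may be reused from the library cache on later executions.
OracleCommand comBind = new OracleCommand(
    "SELECT LAST_NAME FROM EMPLOYEE WHERE EMPLOYEE_ID=:1",
    dbConnection);
comBind.BindByName = false;
comBind.Parameters.Add("employee_id", OracleDbType.Int32, intEmployeeID, ParameterDirection.Input);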

For this article, we will create a couple of tables and a sequence in the database for use in the program by using SQL*Plus:

CREATE TABLE PO(
  ORDER_ID VARCHAR2(15),
  VENDOR_ID VARCHAR2(15),
  ORDER_DATE DATE,
  SHIP_TO_ID NUMBER(4),
  PRIMARY KEY (ORDER_ID));
 
CREATE TABLE PO_LINE(
  ORDER_ID VARCHAR2(15),
  LINE_NO NUMBER(4),
  PART_ID VARCHAR2(30),
  QTY NUMBER(14,4),
  PRICE NUMBER(14,4),
  DUE_DATE DATE,
  PRIMARY KEY (ORDER_ID, LINE_NO));
 
CREATE TABLE PO_TRANS(
  TRANSACTION_ID NUMBER(14),
  VENDOR_ID VARCHAR2(15),
  PART_ID VARCHAR2(30),
  QTY NUMBER(14,4),
  DUE_DATE DATE,
  UNKNOWN NUMBER(14,4),
  PRIMARY KEY (TRANSACTION_ID));
 
CREATE SEQUENCE PO_TRANS_ID START WITH 1000;

The first step for VB 6.0 is to add a reference to one of the Microsoft ActiveX Data Objects Library versions.  To access this window, select References… from the Project menu:

[Screenshot: the VB 6.0 Project References window with a Microsoft ActiveX Data Objects Library version selected]

The first step for C# is to add a reference to Oracle.DataAccess.  To access this window, select Add Reference… from the Project menu:

[Screenshot: the C# Add Reference window with Oracle.DataAccess selected]

C# also requires a couple of using entries to be added to the top of the program form and class code sections.  For example (the last two lines):

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using Oracle.DataAccess.Client; // ODP.NET Oracle managed provider
using Oracle.DataAccess.Types; // ODP.NET Oracle managed provider

 

Since I only want to create a single connection to the database, and maintain that connection for potential use within a couple of program forms or code modules, in VB 6.0 I will create a module with a Global variable to maintain the database connection (note that I did not include the New keyword before ADODB – while I have not personally performed a comparison of the performance, a couple of books indicate that including the New keyword in the declaration incurs a performance penalty with every access of that variable).

Global dbConnection As ADODB.Connection

In C# I would also want to have the ability to maintain that database connection for potential use within a couple of program forms or class modules, so in C# we will create a new class that I will name Global.  Inside the class module we will modify the class definition so that it is public, and create a public static variable named dbConnection that is of the type OracleConnection – the public static variable will be accessible from other code sections in the program without having to create an instance of the Global class.  The Global class should appear similar to what is seen below:

using System;
using System.Collections.Generic;
using System.Text;
using Oracle.DataAccess.Client; // ODP.NET Oracle managed provider
using Oracle.DataAccess.Types; // ODP.NET Oracle managed provider
 
namespace SeeSharpObjectsSample
{
    public class Global
    {
        public static OracleConnection dbConnection;
    }
}
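
As a quick illustration of why the dbConnection variable is declared public static, the hypothetical helper class below (not part of the project steps that follow) can use the shared connection directly through the class name, without ever creating an instance of Global:

using System;
using Oracle.DataAccess.Client; // ODP.NET Oracle managed provider
 
namespace SeeSharpObjectsSample
{
    public class ServerClock
    {
        // Hypothetical helper: retrieve the database server date/time using the
        // single shared connection that is opened in the form _Load procedure.
        public static DateTime GetServerDate()
        {
            OracleCommand comClock = new OracleCommand("SELECT SYSDATE FROM DUAL", Global.dbConnection);
            return Convert.ToDateTime(comClock.ExecuteScalar());
        }
    }
}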

In VB 6.0 we will add some code to the Form_Load event procedure to connect to the database.  In VB 6.0, double-click the form’s background to open the code module for the Form_Load event procedure.  In the procedure we will declare a strSQL variable that will be used later in the Form_Load procedure and a strConnection variable.  On Error Resume Next will permit the program to recover gracefully if a command, such as the connection attempt, fails.  The code will connect to the or1212 database (most likely using the tnsnames.ora file to locate the database), connecting as the testuser user which also has a password of testuser, set the array fetch size to 100 (to retrieve up to 100 rows with a single database fetch call), and allow up to 40 seconds for the connection to complete.  If the connection fails, an error message is displayed on screen, and the program ends:

Private Sub Form_Load()
    Dim strSQL As String
    Dim strConnection As String
    
    On Error Resume Next
    
    strConnection = "Provider=OraOLEDB.Oracle;Data Source=or1212;User ID=testuser;Password=testuser;FetchSize=100;"
    
    Set dbConnection = New ADODB.Connection
    dbConnection.ConnectionTimeout = 40
    dbConnection.CursorLocation = adUseClient
    dbConnection.ConnectionString = strConnection
    
    dbConnection.Open
    
    If (dbConnection.State <> 1) Then
        MsgBox "Not connected to the database, program ending." & vbCrLf & Error
        End
    Else
        MsgBox "Connected"
    End If
End Sub

In C# we will add some code to the _Load event procedure to connect to the database.  In C#, double-click the form’s background to open the code module for the _Load event procedure.  In the procedure we will declare a strSQL variable that will be used later in the _Load procedure and a strConnection variable.  The try { } and catch { } blocks permit the program to recover gracefully if a command, such as the connection attempt, fails.  The code will connect to the or1212 database (most likely using the tnsnames.ora file to locate the database), connecting as the testuser user which also has a password of testuser, and allow up to 40 seconds for the connection to complete.  In C# the array fetch size cannot be defaulted to 100 in the connection string; instead, the fetch size must be specified on the individual OracleCommand or OracleDataReader objects (a short sketch follows the connection code below).  If the connection fails, an error message is displayed on screen, and the program ends:

private void frmSeeSharpObjects_Load(object sender, EventArgs e)
{
    string strSQL = "";
    string strConnection = "Data Source=or1212;User ID=testuser;Password=testuser;Connection Timeout=40;";
    
    try
    {
        Global.dbConnection = new OracleConnection(strConnection);
        
        Global.dbConnection.Open();
        MessageBox.Show("Connected");
    }
    catch (OracleException ex)
    {
        MessageBox.Show ("Not connected to the database, program ending." + "\n" + ex.Message.ToString() );
        Application.Exit();
    }
}
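
As mentioned above, the fetch size in C# is set per query rather than in the connection string.  Below is a minimal sketch, assuming an OracleCommand named comSelect that has already been prepared against Global.dbConnection (RowSize and FetchSize are the same ODP.NET properties used later in this article):

OracleDataReader snpData = comSelect.ExecuteReader();
 
// Request roughly 100 rows per fetch call; RowSize reports the number of bytes
// needed to hold one row of this statement's result set.
snpData.FetchSize = comSelect.RowSize * 100;
 
while (snpData.Read())
{
    // process each row here
}
 
snpData.Close();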

At this point, both of the programs should be able to run and connect to the database (Connected should appear in a window on the screen).  I happen to be using the 64 bit version of Windows for development, and need to use the 32 bit Oracle Client (and ODP.NET), so I encountered a few complications – Visual Studio added a reference to the 64 bit Oracle.DataAccess.dll.  As a result, I received an error similar to “Data provider internal error(-3000)”.  Completely fixing this issue required a little bit of work, and might require a different procedure on your computer.  First, I had to select projectname Properties from the Project menu and change the Platform target from Any CPU to x86:

[Screenshot: the project properties page with the Platform target changed from Any CPU to x86]

I then saw the following warning in Visual Studio: “Assembly generation — Referenced assembly ‘Oracle.DataAccess.dll’ targets a different processor”.  The properties for the Oracle.DataAccess reference in the Solution Explorer showed that the Path property pointed to the 64 bit DLL.  In my case, the 32 bit DLL was in a different Oracle home, specifically located at C:\Oracle\product\11.2.0\client_1\ODP.NET\bin\2.x\Oracle.DataAccess.dll – I did not see any simple method to quickly reference the 32 bit DLL, rather than the 64 bit DLL.  So, I right-clicked Oracle.DataAccess in the References list and selected Remove from the menu.  To add the reference to the 32 bit DLL, I selected Add Reference… from the Project menu, selected the Browse tab, navigated to the C:\Oracle\product\11.2.0\client_1\ODP.NET\bin\2.x folder (the 32 bit ODP.NET driver may be in a different folder on your computer), selected the Oracle.DataAccess.dll file and then clicked OK:

[Screenshot: the Add Reference Browse tab pointing at the 32 bit Oracle.DataAccess.dll]

The above steps fixed the problems, and allowed the C# program to connect to the database (I thought that .Net was supposed to mark the end of DLL Hell?).  Moving on…

Next, I want to set up ADO Command objects in the VB 6.0 project and OracleCommand objects in the C# project to insert into the three test tables, and to execute a SELECT statement.  Since the routines for inserting into the tables and selecting the data could be called many times in the program, I only want to set up these command objects once, and then use those command objects multiple times.  In the General section of the VB 6.0 form code (above the Private Sub Form_Load() ) line, add the following to create four ADO command objects:

Dim comInsertPO As ADODB.Command
Dim comInsertPOLine As ADODB.Command
Dim comInsertPOTrans As ADODB.Command
Dim comSelectPO As ADODB.Command

In the C# form code, below the opening bracket of the line “public partial class frmSeeSharpObjects : Form”, add the following lines (note, frmSeeSharpObjects is the name of the form that I created for this project):

        OracleCommand comInsertPO;
        OracleCommand comInsertPOLine;
        OracleCommand comInsertPOTrans;
        OracleCommand comSelectPO;

In the VB 6.0 form code, just before the End Sub in the Form_Load() event procedure, we will add additional code to create and configure the four ADO command objects.  ADO uses ? characters as bind variable placeholders:

    Set comInsertPO = New ADODB.Command
    With comInsertPO
        strSQL = "INSERT INTO PO(" & vbCrLf
        strSQL = strSQL & "  ORDER_ID," & vbCrLf
        strSQL = strSQL & "  VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  ORDER_DATE," & vbCrLf
        strSQL = strSQL & "  SHIP_TO_ID)" & vbCrLf
        strSQL = strSQL & "VALUES(" & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?)"
    
        .CommandText = strSQL
        .CommandType = adCmdText
        .CommandTimeout = 30
        .ActiveConnection = dbConnection
        'Add the bind variables
        .Parameters.Append .CreateParameter("order_id", adVarChar, adParamInput, 15, "12345678910")
        .Parameters.Append .CreateParameter("vendor_id", adVarChar, adParamInput, 15, "ABCDEFGHIJKLM")
        .Parameters.Append .CreateParameter("order_date", adDate, adParamInput, 8, Now)
        .Parameters.Append .CreateParameter("ship_to_id", adNumeric, adParamInput, 8, 0)
    End With
        
    Set comInsertPOLine = New ADODB.Command
    With comInsertPOLine
        strSQL = "INSERT INTO PO_LINE(" & vbCrLf
        strSQL = strSQL & "  ORDER_ID," & vbCrLf
        strSQL = strSQL & "  LINE_NO," & vbCrLf
        strSQL = strSQL & "  PART_ID," & vbCrLf
        strSQL = strSQL & "  QTY," & vbCrLf
        strSQL = strSQL & "  PRICE," & vbCrLf
        strSQL = strSQL & "  DUE_DATE)" & vbCrLf
        strSQL = strSQL & "VALUES(" & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?)"
    
        .CommandText = strSQL
        .CommandType = adCmdText
        .CommandTimeout = 30
        .ActiveConnection = dbConnection
        'Add the bind variables
        .Parameters.Append .CreateParameter("order_id", adVarChar, adParamInput, 15, "12345678910")
        .Parameters.Append .CreateParameter("line_no", adNumeric, adParamInput, 8, 1)
        .Parameters.Append .CreateParameter("part_id", adVarChar, adParamInput, 15, "ABCDEFGHIJKLM")
        .Parameters.Append .CreateParameter("qty", adDouble, adParamInput, 15, 1)
        .Parameters.Append .CreateParameter("price", adDouble, adParamInput, 15, 1)
        .Parameters.Append .CreateParameter("due_date", adDate, adParamInput, 15)
    End With
    
    Set comInsertPOTrans = New ADODB.Command
    With comInsertPOTrans
        strSQL = "INSERT INTO PO_TRANS(" & vbCrLf
        strSQL = strSQL & "  TRANSACTION_ID," & vbCrLf
        strSQL = strSQL & "  VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  PART_ID," & vbCrLf
        strSQL = strSQL & "  QTY," & vbCrLf
        strSQL = strSQL & "  DUE_DATE," & vbCrLf
        strSQL = strSQL & "  UNKNOWN)" & vbCrLf
        strSQL = strSQL & "VALUES(" & vbCrLf
        strSQL = strSQL & "  PO_TRANS_ID.NEXTVAL," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?," & vbCrLf
        strSQL = strSQL & "  ?)"
   
        .CommandText = strSQL
        .CommandType = adCmdText
        .CommandTimeout = 30
        .ActiveConnection = dbConnection
        'Add the bind variables
        .Parameters.Append .CreateParameter("vendor_id", adVarChar, adParamInput, 15, "ABCDEFGHIJKLM")
        .Parameters.Append .CreateParameter("part_id", adVarChar, adParamInput, 15, "ABCDEFGHIJKLM")
        .Parameters.Append .CreateParameter("qty", adDouble, adParamInput, 15, 1)
        .Parameters.Append .CreateParameter("due_date", adDate, adParamInput, 15)
        .Parameters.Append .CreateParameter("unknown", adDouble, adParamInput, 15, 1)
    End With
    
    Set comSelectPO = New ADODB.Command
    With comSelectPO
        strSQL = "SELECT" & vbCrLf
        strSQL = strSQL & "  P.VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  PL.PART_ID," & vbCrLf
        strSQL = strSQL & "  PL.QTY," & vbCrLf
        strSQL = strSQL & "  PL.DUE_DATE" & vbCrLf
        strSQL = strSQL & "FROM" & vbCrLf
        strSQL = strSQL & "  PO P," & vbCrLf
        strSQL = strSQL & "  PO_LINE PL" & vbCrLf
        strSQL = strSQL & "WHERE" & vbCrLf
        strSQL = strSQL & "  P.ORDER_ID=PL.ORDER_ID" & vbCrLf
        strSQL = strSQL & "  AND P.ORDER_DATE>=TRUNC(SYSDATE-30)" & vbCrLf
        strSQL = strSQL & "ORDER BY" & vbCrLf
        strSQL = strSQL & "  P.VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  PL.PART_ID"
        
        .CommandText = strSQL
        .CommandType = adCmdText
        .CommandTimeout = 30
        .ActiveConnection = dbConnection
    End With

Now to add the equivalent code to the _Load event procedure code in C# just before the closing } of the procedure (ODP.NET uses : followed by an identifier as bind variable placeholders):

strSQL = "INSERT INTO PO(" + "\n";
strSQL = strSQL + " ORDER_ID," + "\n";
strSQL = strSQL + " VENDOR_ID," + "\n";
strSQL = strSQL + " ORDER_DATE," + "\n";
strSQL = strSQL + " SHIP_TO_ID)" + "\n";
strSQL = strSQL + "VALUES(" + "\n";
strSQL = strSQL + " :1," + "\n";
strSQL = strSQL + " :2," + "\n";
strSQL = strSQL + " :3," + "\n";
strSQL = strSQL + " :4)";
 
comInsertPO = new OracleCommand(strSQL, Global.dbConnection);
//comInsertPO.CommandText = strSQL;
comInsertPO.CommandType = CommandType.Text;
comInsertPO.CommandTimeout = 30;
comInsertPO.BindByName = false;
 
//Add the bind variables
comInsertPO.Parameters.Add("order_id", OracleDbType.Varchar2, 15, "12345678910", ParameterDirection.Input);
comInsertPO.Parameters.Add("vendor_id", OracleDbType.Varchar2, 15, "ABCDEFGHIJKLM", ParameterDirection.Input);
comInsertPO.Parameters.Add("order_date", OracleDbType.Date, 8, DateTime.Now, ParameterDirection.Input);
comInsertPO.Parameters.Add("ship_to_id", OracleDbType.Long, 8, 0, ParameterDirection.Input);
 
strSQL = "INSERT INTO PO_LINE(" + "\n";
strSQL = strSQL + " ORDER_ID," + "\n";
strSQL = strSQL + " LINE_NO," + "\n";
strSQL = strSQL + " PART_ID," + "\n";
strSQL = strSQL + " QTY," + "\n";
strSQL = strSQL + " PRICE," + "\n";
strSQL = strSQL + " DUE_DATE)" + "\n";
strSQL = strSQL + "VALUES(" + "\n";
strSQL = strSQL + " :1," + "\n";
strSQL = strSQL + " :2," + "\n";
strSQL = strSQL + " :3," + "\n";
strSQL = strSQL + " :4," + "\n";
strSQL = strSQL + " :5," + "\n";
strSQL = strSQL + " :6)";
 
comInsertPOLine = new OracleCommand(strSQL, Global.dbConnection);
//comInsertPOLine.CommandText = strSQL;
comInsertPOLine.CommandType = CommandType.Text;
comInsertPOLine.CommandTimeout = 30;
comInsertPOLine.BindByName = false;
 
//Add the bind variables
comInsertPOLine.Parameters.Add("order_id", OracleDbType.Varchar2, 15, "12345678910", ParameterDirection.Input);
comInsertPOLine.Parameters.Add("line_no", OracleDbType.Long, 8, 1, ParameterDirection.Input);
comInsertPOLine.Parameters.Add("part_id", OracleDbType.Varchar2, 15, "ABCDEFGHIJKLM", ParameterDirection.Input);
comInsertPOLine.Parameters.Add("qty", OracleDbType.Double, 15, 1, ParameterDirection.Input);
comInsertPOLine.Parameters.Add("price", OracleDbType.Double, 15, 1, ParameterDirection.Input);
comInsertPOLine.Parameters.Add("due_date", OracleDbType.Date, 15, ParameterDirection.Input);
 
strSQL = "INSERT INTO PO_TRANS(" + "\n";
strSQL = strSQL + " TRANSACTION_ID," + "\n";
strSQL = strSQL + " VENDOR_ID," + "\n";
strSQL = strSQL + " PART_ID," + "\n";
strSQL = strSQL + " QTY," + "\n";
strSQL = strSQL + " DUE_DATE," + "\n";
strSQL = strSQL + " UNKNOWN)" + "\n";
strSQL = strSQL + "VALUES(" + "\n";
strSQL = strSQL + " PO_TRANS_ID.NEXTVAL," + "\n";
strSQL = strSQL + " :1," + "\n";
strSQL = strSQL + " :2," + "\n";
strSQL = strSQL + " :3," + "\n";
strSQL = strSQL + " :4," + "\n";
strSQL = strSQL + " :5)";
 
comInsertPOTrans = new OracleCommand(strSQL, Global.dbConnection);
//comInsertPOTrans.CommandText = strSQL;
comInsertPOTrans.CommandType = CommandType.Text;
comInsertPOTrans.CommandTimeout = 30;
comInsertPOTrans.BindByName = false;
 
//Add the bind variables
comInsertPOTrans.Parameters.Add("vendor_id", OracleDbType.Varchar2, 15, "ABCDEFGHIJKLM", ParameterDirection.Input);
comInsertPOTrans.Parameters.Add("part_id", OracleDbType.Varchar2, 15, "ABCDEFGHIJKLM", ParameterDirection.Input);
comInsertPOTrans.Parameters.Add("qty", OracleDbType.Double, 15, 1, ParameterDirection.Input);
comInsertPOTrans.Parameters.Add("due_date", OracleDbType.Date, 15, ParameterDirection.Input);
comInsertPOTrans.Parameters.Add("unknown", OracleDbType.Double, 15, 1, ParameterDirection.Input);
 
strSQL = "SELECT" + "\n";
strSQL = strSQL + " P.VENDOR_ID," + "\n";
strSQL = strSQL + " PL.PART_ID," + "\n";
strSQL = strSQL + " PL.QTY," + "\n";
strSQL = strSQL + " PL.DUE_DATE" + "\n";
strSQL = strSQL + "FROM" + "\n";
strSQL = strSQL + " PO P," + "\n";
strSQL = strSQL + " PO_LINE PL" + "\n";
strSQL = strSQL + "WHERE" + "\n";
strSQL = strSQL + " P.ORDER_ID=PL.ORDER_ID" + "\n";
strSQL = strSQL + " AND P.ORDER_DATE>=TRUNC(SYSDATE-30)" + "\n";
strSQL = strSQL + "ORDER BY" + "\n";
strSQL = strSQL + " P.VENDOR_ID," + "\n";
strSQL = strSQL + " PL.PART_ID";
 
comSelectPO = new OracleCommand(strSQL, Global.dbConnection);
//comSelectPO.CommandText = strSQL;
comSelectPO.CommandType = CommandType.Text;
comSelectPO.CommandTimeout = 30;

It should be possible to run the program now without generating errors, but the program will not do much other than connect to the database at this point.  Now that the basic setup is out of the way, we will create a Command Button in VB 6.0 (a Button in C#) and set its name to cmdProcess.  In VB 6.0, double-click the command button to access the Click event code for the command button.  Add the following code, which will add a few rows to the PO and PO_LINE tables, select some of the rows from those tables with a join on the ORDER_ID column in each table, and then insert rows into the PO_TRANS table using the SELECT resultset.  Note that it is possible to generate a 10046 trace at level 4 (with bind variables) by removing the ' comment character on the line below the On Error Resume Next line.  Note too that it is only necessary to specify the value of bind variables if those values must change after each INSERT is executed:

Private Sub cmdProcess_Click()
    Dim snpData As ADODB.Recordset
    
    On Error Resume Next
    
    'dbConnection.Execute "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 4'"
    
    dbConnection.BeginTrans
    
    comInsertPO("order_id") = "A10000"
    comInsertPO("vendor_id") = "BUBBA GUMP"
    comInsertPO("order_date") = DateAdd("d", -62, Date)
    comInsertPO("ship_to_id") = 1
    comInsertPO.Execute
    
    comInsertPO("order_id") = "A10001"
    comInsertPO("order_date") = DateAdd("d", -25, Date)
    comInsertPO.Execute
    
    comInsertPO("order_id") = "A10002"
    comInsertPO("vendor_id") = "POPPA GUMP"
    comInsertPO("order_date") = DateAdd("d", -20, Date)
    comInsertPO("ship_to_id") = 1
    comInsertPO.Execute
    
    comInsertPO("order_id") = "A10003"
    comInsertPO("order_date") = DateAdd("d", -2, Date)
    comInsertPO("ship_to_id") = 1
    comInsertPO.Execute
    
    comInsertPO("order_id") = "A10004"
    comInsertPO("vendor_id") = "POPPA SMURF"
    comInsertPO("order_date") = DateAdd("d", -1, Date)
    comInsertPO("ship_to_id") = 1
    comInsertPO.Execute
    
    comInsertPOLine("order_id") = "A10000"
    comInsertPOLine("line_no") = 1
    comInsertPOLine("part_id") = "SPAM"
    comInsertPOLine("qty") = 5
    comInsertPOLine("price") = 10.25
    comInsertPOLine("due_date") = DateAdd("d", -1, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("line_no") = 2
    comInsertPOLine("due_date") = DateAdd("d", 3, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("line_no") = 3
    comInsertPOLine("due_date") = DateAdd("d", 6, Date)
    comInsertPOLine.Execute
    comInsertPOLine("order_id") = "A10001"
    comInsertPOLine("line_no") = 1
    comInsertPOLine("part_id") = "STEAK"
    comInsertPOLine("qty") = 3
    comInsertPOLine("price") = 21.85
    comInsertPOLine("due_date") = DateAdd("d", -20, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("line_no") = 2
    comInsertPOLine("qty") = 4
    comInsertPOLine("due_date") = DateAdd("d", 10, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("line_no") = 3
    comInsertPOLine("qty") = 3
    comInsertPOLine("due_date") = DateAdd("d", 15, Date)
    comInsertPOLine.Execute
    comInsertPOLine("order_id") = "A10002"
    comInsertPOLine("line_no") = 1
    comInsertPOLine("part_id") = "CHUCKBURGER"
    comInsertPOLine("qty") = 3
    comInsertPOLine("price") = 15.01
    comInsertPOLine("due_date") = DateAdd("d", 10, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("order_id") = "A10003"
    comInsertPOLine("line_no") = 1
    comInsertPOLine("part_id") = "BACON"
    comInsertPOLine("qty") = 3
    comInsertPOLine("price") = 16.49
    comInsertPOLine("due_date") = DateAdd("d", 9, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("line_no") = 2
    comInsertPOLine("part_id") = "EGGS"
    comInsertPOLine("qty") = 19
    comInsertPOLine("price") = 0.49
    comInsertPOLine("due_date") = DateAdd("d", 9, Date)
    comInsertPOLine.Execute
    
    comInsertPOLine("order_id") = "A10004"
    comInsertPOLine("line_no") = 1
    comInsertPOLine("part_id") = "ROASTBEEF"
    comInsertPOLine("qty") = 200
    comInsertPOLine("price") = 11.99
    comInsertPOLine("due_date") = DateAdd("d", 30, Date)
    comInsertPOLine.Execute
    Set snpData = comSelectPO.Execute
    
    If (snpData.State <> 1) Then
        MsgBox "SQL statement did not execute."
    Else
        If (snpData.EOF = True) Then
            MsgBox "No rows returned from the database."
        Else
            Do While Not (snpData.EOF)
                comInsertPOTrans("vendor_id") = snpData("vendor_id")
                comInsertPOTrans("part_id") = snpData("part_id")
                comInsertPOTrans("qty") = snpData("qty") * 1.5
                comInsertPOTrans("due_date") = DateAdd("m", -1, snpData("due_date"))
                comInsertPOTrans("unknown") = Rnd * 100
                comInsertPOTrans.Execute
                
                snpData.MoveNext
            Loop
        End If
        
        snpData.Close
    End If
    
    If Err = 0 Then
        dbConnection.CommitTrans
        
        MsgBox "Processing finished successfully."
    Else
        dbConnection.RollbackTrans
        
        MsgBox "Could not update the database due to an error." & vbCrLf & Error
    End If
End Sub

Now to add the nearly equivalent C# code – the only major difference is that the C# code jumps out of the try block to the catch block when the first error is encountered, while in the VB 6.0 code, due to the usage of the On Error Resume Next command, the processing simply continues when an error occurs.  Note that it is possible to generate a 10046 trace at level 4 (with bind variables) by removing the // characters on the two lines just after the try block begins:

private void cmdProcess_Click(object sender, EventArgs e)
{
    Random rndSequence = new Random();
    OracleDataReader snpData;
    OracleTransaction oraTransaction;
    oraTransaction = Global.dbConnection.BeginTransaction(IsolationLevel.ReadCommitted);
 
    try
    {
        //OracleCommand comTrace = new OracleCommand("ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 4'", Global.dbConnection);
        //comTrace.ExecuteNonQuery();
        
        comInsertPO.Parameters["order_id"].Value = "A10000";
        comInsertPO.Parameters["vendor_id"].Value = "BUBBA GUMP";
        comInsertPO.Parameters["order_date"].Value = DateTime.Today.AddDays(-62);
        comInsertPO.Parameters["ship_to_id"].Value = 1;
        comInsertPO.ExecuteNonQuery();
        
        comInsertPO.Parameters["order_id"].Value = "A10001";
        comInsertPO.Parameters["order_date"].Value = DateTime.Today.AddDays(-25);
        comInsertPO.ExecuteNonQuery();
        
        comInsertPO.Parameters["order_id"].Value = "A10002";
        comInsertPO.Parameters["vendor_id"].Value = "POPPA GUMP";
        comInsertPO.Parameters["order_date"].Value = DateTime.Today.AddDays(-20);
        comInsertPO.Parameters["ship_to_id"].Value = 1;
        comInsertPO.ExecuteNonQuery();
        
        comInsertPO.Parameters["order_id"].Value = "A10003";
        comInsertPO.Parameters["order_date"].Value = DateTime.Today.AddDays(-2);
        comInsertPO.Parameters["ship_to_id"].Value = 1;
        comInsertPO.ExecuteNonQuery();
        
        comInsertPO.Parameters["order_id"].Value = "A10004";
        comInsertPO.Parameters["vendor_id"].Value = "POPPA SMURF";
        comInsertPO.Parameters["order_date"].Value = DateTime.Today.AddDays(-1);
        comInsertPO.Parameters["ship_to_id"].Value = 1;
        comInsertPO.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["order_id"].Value = "A10000";
        comInsertPOLine.Parameters["line_no"].Value = 1;
        comInsertPOLine.Parameters["part_id"].Value = "SPAM";
        comInsertPOLine.Parameters["qty"].Value = 5;
        comInsertPOLine.Parameters["price"].Value = 10.25;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(-1);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["line_no"].Value = 2;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(3);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["line_no"].Value = 3;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(6);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["order_id"].Value = "A10001";
        comInsertPOLine.Parameters["line_no"].Value = 1;
        comInsertPOLine.Parameters["part_id"].Value = "STEAK";
        comInsertPOLine.Parameters["qty"].Value = 3;
        comInsertPOLine.Parameters["price"].Value = 21.85;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(-20);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["line_no"].Value = 2;
        comInsertPOLine.Parameters["qty"].Value = 4;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(10);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["line_no"].Value = 3;
        comInsertPOLine.Parameters["qty"].Value = 3;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(15);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["order_id"].Value = "A10002";
        comInsertPOLine.Parameters["line_no"].Value = 1;
        comInsertPOLine.Parameters["part_id"].Value = "CHUCKBURGER";
        comInsertPOLine.Parameters["qty"].Value = 3;
        comInsertPOLine.Parameters["price"].Value = 15.01;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(10);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["order_id"].Value = "A10003";
        comInsertPOLine.Parameters["line_no"].Value = 1;
        comInsertPOLine.Parameters["part_id"].Value = "BACON";
        comInsertPOLine.Parameters["qty"].Value = 3;
        comInsertPOLine.Parameters["price"].Value = 16.49;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(9);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["line_no"].Value = 2;
        comInsertPOLine.Parameters["part_id"].Value = "EGGS";
        comInsertPOLine.Parameters["qty"].Value = 19;
        comInsertPOLine.Parameters["price"].Value = 0.49;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(9);
        comInsertPOLine.ExecuteNonQuery();
        
        comInsertPOLine.Parameters["order_id"].Value = "A10004";
        comInsertPOLine.Parameters["line_no"].Value = 1;
        comInsertPOLine.Parameters["part_id"].Value = "ROASTBEEF";
        comInsertPOLine.Parameters["qty"].Value = 200;
        comInsertPOLine.Parameters["price"].Value = 11.99;
        comInsertPOLine.Parameters["due_date"].Value = DateTime.Today.AddDays(30);
        comInsertPOLine.ExecuteNonQuery();
        
        snpData = comSelectPO.ExecuteReader();
        if (snpData.IsClosed)
        {
            MessageBox.Show("SQL statement did not execute.");
        }
        else
        {
            if (!snpData.HasRows)
            {
                MessageBox.Show("No rows returned from the database.");
            }
            else
            {
                //snpData.FetchSize = comSelectPO.RowSize * 100;
                while (snpData.Read())
                {
                    comInsertPOTrans.Parameters["vendor_id"].Value = snpData["vendor_id"];
                    comInsertPOTrans.Parameters["part_id"].Value = snpData["part_id"];
                    comInsertPOTrans.Parameters["qty"].Value = Convert.ToDouble(snpData["qty"]) * 1.5;
                    comInsertPOTrans.Parameters["due_date"].Value = Convert.ToDateTime(snpData["due_date"]).AddMonths(-1);
                    comInsertPOTrans.Parameters["unknown"].Value = rndSequence.NextDouble() * 100;
                    comInsertPOTrans.ExecuteNonQuery();
                 } //while (snpData.Read())
            } //else of if (!snpData.HasRows)
 
            snpData.Close();
        }
        oraTransaction.Commit();
        MessageBox.Show("Processing finished successfully.");
    }
    catch (Exception ex)
    {
        oraTransaction.Rollback();
        MessageBox.Show("Could not update the database due to an error." + "\n" + ex.Message.ToString());
    }
}

With the above code in place, it should be possible to run the program and click the command button on the form to insert rows into the PO, PO_LINE, and PO_TRANS tables.  Of course, it is only possible to click that command button once without an error appearing on screen in either the VB 6.0 or C# programs, respectively:
SeeSharpObjectsUniqueConstraintVB  SeeSharpObjectsUniqueConstraint

To re-execute the code in the command button’s Click event, the tables must first be cleared of rows using SQL*Plus:

TRUNCATE TABLE PO;
TRUNCATE TABLE PO_LINE;
TRUNCATE TABLE PO_TRANS;

As a test, I executed the cmdProcess button in VB 6.0, and then the following SQL statement in SQL*Plus to see what ended up in the PO_TRANS table:

SELECT
  *
FROM
  PO_TRANS
ORDER BY
  TRANSACTION_ID;

Here is the output that I received (notice, no SPAM) from the processing in VB 6.0:

TRANSACTION_ID VENDOR_ID       PART_ID            QTY DUE_DATE     UNKNOWN
-------------- --------------- ----------- ---------- --------- ----------
          1057 BUBBA GUMP      STEAK              4.5 15-OCT-15    70.5547
          1058 BUBBA GUMP      STEAK                6 15-NOV-15    53.3424
          1059 BUBBA GUMP      STEAK              4.5 20-NOV-15    57.9519
          1060 POPPA GUMP      BACON              4.5 14-NOV-15    28.9562
          1061 POPPA GUMP      CHUCKBURGER        4.5 15-NOV-15    30.1948
          1062 POPPA GUMP      EGGS              28.5 14-NOV-15     77.474
          1063 POPPA SMURF     ROASTBEEF          300 04-DEC-15     1.4018

I then truncated the three tables and executed the cmdProcess button in C#.  Here is the output of the above SQL statement after executing the cmdProcess button in C#:

TRANSACTION_ID VENDOR_ID       PART_ID            QTY DUE_DATE     UNKNOWN
-------------- --------------- ----------- ---------- --------- ----------
          1064 BUBBA GUMP      STEAK              4.5 15-OCT-15    55.5915
          1065 BUBBA GUMP      STEAK                6 15-NOV-15    65.7347
          1066 BUBBA GUMP      STEAK              4.5 20-NOV-15    83.0905
          1067 POPPA GUMP      BACON              4.5 14-NOV-15    19.0149
          1068 POPPA GUMP      CHUCKBURGER        4.5 15-NOV-15    73.2245
          1069 POPPA GUMP      EGGS              28.5 14-NOV-15    61.9526
          1070 POPPA SMURF     ROASTBEEF          300 04-DEC-15    93.9571

The output is identical, with the exception of the UNKNOWN column, which is a pseudo-random number between 0 and 99.9999.  A job well done, with one complication.  What is the complication?  Think about the code logic for a minute before scrolling down.

Thought about the complication?  No, it is not that this blog article required approximately 11 hours to write (a large percentage of that time was due to WordPress’ auto-formatting of the article).  The code violates the fifth bullet point that is listed at the start of the article.  The insert into the PO_TRANS table could have been performed by sending a single SQL statement to the database, rather than using a while or Do While loop that executes the INSERT statement once for each row in the resultset.
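
For example, a single INSERT INTO … SELECT statement could populate the PO_TRANS table directly from the PO and PO_LINE tables.  The following is only a sketch, not the exact statement for this schema: it assumes that the PO table’s key column is named ID, that PO_LINE joins to it through ORDER_ID, that TRANSACTION_ID is populated by a sequence or trigger (as in the row-by-row version), and it substitutes DBMS_RANDOM.VALUE for the client-side random number generator:

INSERT INTO PO_TRANS (
  VENDOR_ID,
  PART_ID,
  QTY,
  DUE_DATE,
  UNKNOWN)
SELECT
  P.VENDOR_ID,
  PL.PART_ID,
  PL.QTY * 1.5,                 -- same 1.5 multiplier as the loop version
  ADD_MONTHS(PL.DUE_DATE, -1),  -- same one month adjustment as the loop version
  DBMS_RANDOM.VALUE(0, 100)     -- stand-in for rndSequence.NextDouble() * 100
FROM
  PO P,
  PO_LINE PL
WHERE
  PL.ORDER_ID = P.ID;           -- assumed join column names

One round trip to the database, one execution of the statement, and no row-by-row processing in the client program.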

If you would like to use a free version of C#, Microsoft offers the Express edition with no apparent licensing restrictions and a more powerful Community edition that carries licensing restrictions.  Both versions may be downloaded from Microsoft’s website.

On a remotely related note, a week ago a Microsoft Windows technician called about a computer problem that I was having.  My first thought was, “how did he know I was translating a VB 6.0 program to C#?”  My second thought was, “which problem?”  Sadly, I immediately recognized this call as one of those “Microsoft” support scam calls, and decided that the person on the other end of the line was not equipped to walk with, let alone run with, sharp objects.  As a result, I decided to terminate the call rather quickly:
“Microsoft” support scammer: I am calling about a problem with your computer.
Me: A computer? (in a concerned voice)
“Microsoft” support scammer: Yes, your laptop computer.
Me: I don’t have a laptop computer.
“Microsoft” support scammer: Oh, your Windows tower computer.
Me: I don’t have a computer.
“Microsoft” support scammer: (Click)

The scammer gave up a little too easily – I had planned to ask her if she knew how to troubleshoot sticky key problems on my typewriter.  That is the second Microsoft tech that gave up before we had a chance to troubleshoot the virtual machine.





Characterset Changes

9 10 2015

October 9, 2015

In 2006 I did a bit of experimentation with an Oracle characterset that supports Unicode (AL32UTF8 if I remember correctly) using Oracle Database 10.2.0.2.  That test, performed on a desktop computer with a bit less than 1GB of RDRAM (Rambus) memory, was a huge failure, and not just because the computer did not have enough memory to meet the minimum requirements for Oracle Database 10.2.0.2.  Sure, database access was slower with a characterset that uses one byte to store some characters, two bytes for others, and three bytes for the rest.  There was another, more serious problem.  I put together this article mostly as a reminder to myself to recall what I tested 9.5 years ago.

I am in the process of testing an upgrade to the most recent version of the ERP package used by the company where I work – in addition to being an Oracle DBA for the company, I am also the IT Manager and in-house expert for the ERP system.  Reading through the ERP installation documentation, I found the following:

“It is highly recommended that you perform the upgrade procedures in a test environment before you upgrade your production environment. The upgrade procedure potentially involves deleting your existing database instance. Make sure you understand the ramifications of deleting the database instance before you upgrade your production environment…  If you intend to use the same database instance name with your 8.0.0 installation as you did with your previous VISUAL installation, delete your existing database instance. Do not delete your existing instance until after you have verified that the data has been exported properly…”

“Create a new Oracle Instance. In 8.0.0, the AL32UTF8 character set and the Al16UTF16 universal national character set are used. You must create a new Oracle instance to specify these character sets.”

So, the ERP company would like for me to export the production WE8MSWIN1252 characterset (AL16UTF16 national characterset) database using Datapump, DROP DATABASE, create a new database with the AL32UTF8 characterset (AL16UTF16 national characterset), import the database using Datapump, and then start the ERP upgrade conversion utility.  The statement “you must create a new Oracle instance to specify these character sets,” was a bit of a red flag.  Something is not right with that statement.

If I followed those directions, something would certainly take a bite out of some of the data in a few critical tables:

SELECT
  *
FROM
  NLS_DATABASE_PARAMETERS
ORDER BY
  PARAMETER;
 
PARAMETER                VALUE
------------------------ ----------------------------
NLS_CALENDAR             GREGORIAN
NLS_CHARACTERSET         WE8MSWIN1252
NLS_COMP                 BINARY
NLS_CURRENCY             $
NLS_DATE_FORMAT          DD-MON-RR
NLS_DATE_LANGUAGE        AMERICAN
NLS_DUAL_CURRENCY        $
NLS_ISO_CURRENCY         AMERICA
NLS_LANGUAGE             AMERICAN
NLS_LENGTH_SEMANTICS     BYTE
NLS_NCHAR_CHARACTERSET   AL16UTF16
NLS_NCHAR_CONV_EXCP      FALSE
NLS_NUMERIC_CHARACTERS   .,
NLS_RDBMS_VERSION        11.2.0.2.0
NLS_SORT                 BINARY
NLS_TERRITORY            AMERICA
NLS_TIMESTAMP_FORMAT     DD-MON-RR HH.MI.SSXFF AM
NLS_TIMESTAMP_TZ_FORMAT  DD-MON-RR HH.MI.SSXFF AM TZR
NLS_TIME_FORMAT          HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT       HH.MI.SSXFF AM TZR

NLS_LENGTH_SEMANTICS: BYTE – so why is that a problem?

In this particular ERP system, which is designed for use by manufacturers, one of the critical tables in the database is named PART – there are quite a number of defined foreign keys in the database that point back to this particular table.  What do the columns in this table look like before the conversion?

SQL> DESC PART
 Name                    Null?    Type
 ----------------------- -------- ------------
 ID                      NOT NULL VARCHAR2(30)
 DESCRIPTION                      VARCHAR2(40)
 STOCK_UM                NOT NULL VARCHAR2(15)
 PLANNING_LEADTIME       NOT NULL NUMBER
 ORDER_POLICY            NOT NULL CHAR(1)
 ORDER_POINT                      NUMBER(14,4)
 SAFETY_STOCK_QTY                 NUMBER(14,4)
 FIXED_ORDER_QTY                  NUMBER(14,4)
...

Where is the potential problem with exporting everything, dropping the database, creating a new database with a different characterset, and then importing the data?  When I tested the procedure in 2006 I found that one or two rows would not re-import into the PART table due to the values of some columns being too long.  Too long?  Funny things (or not so funny things) happen when the foreign keys are created after the table data import if one or two rows referenced by other tables are not in the PART table.  It took me a couple of minutes to recreate the SQL statement that I set up in 2006 to see what caused the import problem – that SQL statement is shown below:

SELECT
  ID,
  DESCRIPTION,
  LENGTHB(DESCRIPTION) CUR_BYTE_LEN,
  LENGTHB(CONVERT(DESCRIPTION, 'AL32UTF8', 'WE8MSWIN1252')) NEW_BYTE_LEN
FROM
  PART
WHERE
  LENGTHB(CONVERT(DESCRIPTION, 'AL32UTF8', 'WE8MSWIN1252'))>LENGTHB(DESCRIPTION);

The above SQL statement instructs Oracle Database to take a database column’s values (DESCRIPTION), consider the text stored in that column to be in the WE8MSWIN1252 characterset, translate the column values to the AL32UTF8 characterset, and then compare the byte lengths of the translated column values with the byte lengths of the original column.  If there are differences, the primary key value (ID), DESCRIPTION, and column lengths of the DESCRIPTION column before and after conversion are output.  The output was slightly different using Oracle SQL Developer, SQL*Plus on a Windows client, and SQL*Plus on a Synology NAS connected by a Windows Telnet session, so I thought that I would show screen captures of the output from all three utilities.

Oracle SQL Developer:
CharactersetsSQLDeveloper

SQL*Plus on a Windows client:
CharactersetsWindowsSQLPlus

SQL*Plus on a Synology NAS connected by a Windows Telnet session:
CharactersetsSynologySQLPlus

The DESCRIPTION column output of Oracle SQL Developer matches what appears in the ERP system.  The degree character (°), center line/diameter character (Ø – might also be considered the empty set character), and registered trademark character (®) require a single byte for storage in the WE8MSWIN1252 characterset, but two bytes for storage in the AL32UTF8 characterset.  Once translated to the AL32UTF8 characterset, the description for part ID LINED021274 will be too long to fit into the maximum 40 byte long DESCRIPTION column.  Several utilities exist for scanning a database to identify similar characterset conversion issues.
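
The SQL statement above flags any row whose DESCRIPTION would grow during the conversion.  A slightly stricter test flags only the rows that would actually exceed the column’s declared byte length, pulling that length from the data dictionary (a minimal sketch, assuming the SYSADM schema owner used elsewhere in this article):

SELECT
  P.ID,
  P.DESCRIPTION,
  LENGTHB(CONVERT(P.DESCRIPTION, 'AL32UTF8', 'WE8MSWIN1252')) NEW_BYTE_LEN,
  C.DATA_LENGTH MAX_BYTE_LEN
FROM
  SYSADM.PART P,
  DBA_TAB_COLUMNS C
WHERE
  C.OWNER='SYSADM'
  AND C.TABLE_NAME='PART'
  AND C.COLUMN_NAME='DESCRIPTION'
  AND LENGTHB(CONVERT(P.DESCRIPTION, 'AL32UTF8', 'WE8MSWIN1252')) > C.DATA_LENGTH;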

The Oracle Database 11.2 documentation lists three ways to change the characterset of the database:

  • Migrating Character Data Using a Full Export and Import
  • Migrating a Character Set Using the CSALTER Script
  • Migrating Character Data Using the CSALTER Script and Selective Imports

I started to wonder: is it really necessary, in this case, to change the database characterset?  I examined the table creation script that the ERP system installer would use to create a new database – below is a portion of the script to create the PART table:

CREATE TABLE PART(
 ID NVARCHAR2(30) NOT NULL,
 DESCRIPTION NVARCHAR2(40),
 STOCK_UM NVARCHAR2(15) NOT NULL,
 PLANNING_LEADTIME INTEGER DEFAULT 0  NOT NULL,
 ORDER_POLICY NCHAR(1) DEFAULT ' '  NOT NULL,
 ORDER_POINT NUMBER(14,4),
 ORDER_UP_TO_QTY NUMBER (14, 4),
 SAFETY_STOCK_QTY NUMBER(14,4),
 FIXED_ORDER_QTY NUMBER(14,4),
...

Well, that table creation script would generate a slightly different table when compared with the DESC PART output that I posted above – the VARCHAR2 columns would be created as NVARCHAR2, and the CHAR columns would be created as NCHAR.  So, the new version of the ERP system will be working with NVARCHAR2 and NCHAR columns, which are already set up to use the AL16UTF16 national characterset.

I set up a test version of the database using the same Oracle Database version, and instructed the ERP conversion software to upgrade from version 6.5.2 to version 8.0 (temporarily passing the ERP database through versions 6.5.3, 6.5.4, 7.0, 7.1.1, and 7.1.2).  I measured the amount of redo generated to make certain that I would not run out of disk space in the fast (flash) recovery area during the production database upgrade (because I still had the test database in noarchivelog mode, I simply checked the redo size statistic in V$SYSSTAT at various points during the conversion).
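
Checking that statistic requires only a quick query; the difference between two snapshots of the value is the amount of redo generated between them:

SELECT
  NAME,
  VALUE
FROM
  V$SYSSTAT
WHERE
  NAME='redo size';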

The redo generated to upgrade from version 6.5.2 through version 7.1.2 was about 1.75 times the byte size of the tables and indexes in the ERP schema in the database.  The redo generated to upgrade from version 7.1.2 to 8.0 (the next version) required approximately 27.06 times the byte size of the tables and indexes in the ERP schema in the database!  To put those numbers in perspective with simple numbers, assume that the byte size of the tables and indexes in the ERP schema in the database is about 100GB.  The upgrade from version 6.5.2 to 7.1.2 will generate approximately 175GB of redo, so there must be at least that much space in the fast (flash) recovery area for the archived redo logs generated, if that is where the archived redo logs are placed.  The first RMAN backup might also back up the archived redo logs, and if those backups are also stored in the fast (flash) recovery area there will be a need for at least an additional 175GB in the fast (flash) recovery area.  For the upgrade from version 7.1.2 to 8.0 approximately 2706GB of redo would be generated, and again the first RMAN backup might also include those archived redo logs.

What is happening during the 7.1.2 to 8.0 upgrade?  The 7.1.2 to 8.0 upgrade finds all of the CHAR and VARCHAR2 columns in all SYSADM (schema owner) tables and, for each table, sorts those columns alphabetically, creates a new column (with a _NEW suffix in the column name) that is defined as NCHAR or NVARCHAR2, executes a SQL statement to set the new column’s value to the original column’s value, drops the original column, and renames the new column to match the old column name.  This process is then repeated for each of the CHAR and VARCHAR2 columns in the table before moving on to the next table.  This is likely NOT the ideal way to change the column data types: the method generates enormous amounts of redo due to the number of times each data block is modified (and also likely leads to a lot of row chaining), which, if you are correctly running the database in archived redo log mode, could very well completely fill your fast (flash) recovery area several times over with archived redo logs.  The approach implemented also causes the tables’ primary key columns to be scattered through the table definition – with Oracle Database it is faster to access the primary key columns during a full table scan if those columns are the first columns in the table definition (reference and the follow up article).
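
In effect, the conversion utility issues something like the following sequence for each affected column (sketched here for the PART table’s DESCRIPTION column; the exact statements generated by the utility may differ):

ALTER TABLE SYSADM.PART ADD (DESCRIPTION_NEW NVARCHAR2(40));
 
UPDATE SYSADM.PART SET DESCRIPTION_NEW = DESCRIPTION;
 
ALTER TABLE SYSADM.PART DROP COLUMN DESCRIPTION;
 
ALTER TABLE SYSADM.PART RENAME COLUMN DESCRIPTION_NEW TO DESCRIPTION;

Every UPDATE of this form rewrites essentially every row in the table, which is where the enormous redo volume (and much of the row chaining) comes from.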

There are two other methods that I can think of right now that might have been used to reduce the amount of redo generated and preserve the column order in the tables during the 7.1.2 to 8.0 upgrade (I wonder if the ERP vendor would be willing to modify their approach?).  They could have done something like this to simply change all of the column definitions for a single table all at once in a single ALTER TABLE command:

ALTER TABLE MYCUSTOM.AUDIT_SHEET MODIFY(
  WORKORDER_TYPE NCHAR(1),
  WORKORDER_BASE_ID NVARCHAR2(30),
  WORKORDER_LOT_ID NVARCHAR2(3),
  WORKORDER_SPLIT_ID NVARCHAR2(3),
  WORKORDER_SUB_ID NVARCHAR2(3),
  PART_ID NVARCHAR2(30),
  CUSTOMER_ID NVARCHAR2(15)
  ...);

A second option is to create a new table (with a _NEW suffix in the name) with all of the columns from the original table (just with CHAR changed to NCHAR and VARCHAR2 changed to NVARCHAR2), perform a statement such as INSERT INTO PART_NEW SELECT * FROM PART;, DROP PURGE the old table, then rename the new table as the old table.
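
A minimal sketch of that second option for the PART table (column list abbreviated), assuming that the primary key, the foreign keys that reference PART, the indexes, triggers, and grants are recreated afterward:

CREATE TABLE SYSADM.PART_NEW(
 ID NVARCHAR2(30) NOT NULL,
 DESCRIPTION NVARCHAR2(40),
 STOCK_UM NVARCHAR2(15) NOT NULL,
 ...);
 
INSERT INTO SYSADM.PART_NEW SELECT * FROM SYSADM.PART;
 
DROP TABLE SYSADM.PART CASCADE CONSTRAINTS PURGE;
 
ALTER TABLE SYSADM.PART_NEW RENAME TO PART;

Creating the new table with NOLOGGING and performing a direct path INSERT /*+ APPEND */ could reduce the redo even further, at the cost of requiring a fresh backup once the conversion is complete.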

Due to the method implemented, the converted PART table after the upgrade to version 8.0 appears something like this, where the primary key column (ID) is now the 68th column in the table definition:

SQL> DESC PART
 Name                      Null?    Type
 ------------------------- -------- --------------
 PLANNING_LEADTIME         NOT NULL NUMBER
 ORDER_POINT                        NUMBER(14,4)
 SAFETY_STOCK_QTY                   NUMBER(14,4)
 FIXED_ORDER_QTY                    NUMBER(14,4)
 DAYS_OF_SUPPLY                     NUMBER
...
 DEF_SLS_TAX_GRP_ID                 NVARCHAR2(15)
 DEMAND_HISTORY            NOT NULL NCHAR(1)
 DESCRIPTION                        NVARCHAR2(40)
 DETAIL_ONLY               NOT NULL NCHAR(1)
 DIMENSIONS_UM                      NVARCHAR2(15)
...
 HTS_CODE                           NVARCHAR2(20)
 ID                        NOT NULL NVARCHAR2(30)
 INSPECTION_REQD           NOT NULL NCHAR(1)

Other tables, such as the OPERATION table, have composite primary keys that are composed of character and number columns – after the conversion to 8.0 the numeric column SEQUENCE_NO (the last column of the composite primary key WORKORDER_TYPE, WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID, SEQUENCE_NO) is listed first in the table definition, with the other primary key columns scattered in alphabetical order after the numeric and date columns.
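
One way to see where the primary key columns landed in the converted table definitions is to join the constraint views to the column view and compare each key column’s position in the primary key with its position in the table definition:

SELECT
  CC.TABLE_NAME,
  CC.COLUMN_NAME,
  CC.POSITION KEY_POSITION,
  TC.COLUMN_ID TABLE_POSITION
FROM
  DBA_CONSTRAINTS C,
  DBA_CONS_COLUMNS CC,
  DBA_TAB_COLUMNS TC
WHERE
  C.OWNER='SYSADM'
  AND C.CONSTRAINT_TYPE='P'
  AND C.TABLE_NAME IN ('PART','OPERATION')
  AND CC.OWNER=C.OWNER
  AND CC.CONSTRAINT_NAME=C.CONSTRAINT_NAME
  AND TC.OWNER=CC.OWNER
  AND TC.TABLE_NAME=CC.TABLE_NAME
  AND TC.COLUMN_NAME=CC.COLUMN_NAME
ORDER BY
  CC.TABLE_NAME,
  CC.POSITION;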

Thoughts and opinions about the above?  Am I being overly critical?  I did not mention that it took about five or six tries for a successful ERP upgrade – two tries failed due to a bug in the conversion utility, and the other failed attempts were caused by the conversion utility stumbling over extensions (mostly logging triggers) that I added to the database over the years (but disabled before the ERP upgrade).

This blog is currently at 1,001,992 views, having crossed the 1,000,000 mark a couple of days ago.  I would like to take this time to say thanks to the readers of the blog for visiting, and thanks to the ERP vendor for giving me something a bit different to think about, even if it was something that I thought about almost a decade ago.





Web Pages Not Databases – Part 2: Fail2ban, Apache, IP Addresses, Linux, SELinux

23 08 2015

August 23, 2015 (Modified August 31, 2015, September 14, 2015)

(Back to the Previous Article in this Series)

I started using Linux in 1999, specifically Red Hat Linux 6.0, and I recall upgrading to Red Hat Linux 6.1 after downloading the files over a 56k modem – the good old days.  I was a little wiser when I upgraded to another release a couple of months later – I found a site on the Internet that offered Red Hat Linux CD sets for a couple of dollars.  In late 2001/early 2002 I picked up a very good book about creating Linux-based IPTables firewalls, so I set up a dual firewall configuration (with a DMZ in between) using a couple of spare computers.  That setup worked great in a corporate environment for several years – I even upgraded the hardware in 2006 to inexpensive Dell PowerEdge servers and installed the latest version of Red Hat Linux (I believe Fedora 5).  I was excited about the potential capabilities of this free operating system, even going so far in 2004 as to use it as the operating system for the primary file servers (Red Hat Enterprise Linux 3, if I remember correctly) in an effort to save a few thousand dollars in Microsoft licensing fees (it almost worked too).

F.A.I.L.S.?  I must have put those keywords in the blog article title for a reason, or maybe not.  In 2003 I tried setting up the Frees/wan VPN server on a spare Linux computer as an alternative to having to use a 28k/33k dial up modem connection.  It was around that time that I learned the dark side of Linux and the “free” software that could be installed.  I found an old message thread that I posted in 2003 related to Frees/wan where I mentioned that I spent in excess of 2.5 months trying to make this free VPN solution work correctly.  There were several how-to articles returned by a Google search, some of which were written for other Linux variants, others did not use X.509 certificates, and others almost worked.  Making matters worse, the Red Hat Linux kernel at the time did not support X.509 certificates, so I eventually ended up installing the Working Overloaded Linux Kernel.  I recall desperately looking for a program called Setup.exe that would just take care of the problem, but no such program was found.  A couple of months after I had Frees/wan working, a security compromise was reported in all products like Frees/wan, and Frees/wan development had been abandoned.  I learned a very important lesson that “free” software may not be free software when you consider the time that it takes to implement and maintain the free software.  I also learned another important lesson – Linux how-to articles that are more than a couple of months old may be misleading or nearly useless; Linux articles that are written for one of the other 790 Linux Distributions may be just as misleading or useless; and not everything on the Internet in a how-to article is true/correct (this article is no exception).

With that long introduction out of the way, I thought that I would share a couple of notes that I collected along the way when I setup Fedora 22 Linux as a server for a website that uses Apache and WordPress.  I have the headache inspiring SELinux enabled on the server, as well as the latest version of Fail2ban to temporarily block IP addresses used by the clowns on the Internet that want to make the Linux server running WordPress their new best friend.  So far, Fail2ban is working great, once the how-to articles that apply to Fedora 21 or Fedora 20 are ignored, although the current version does output apparently incorrect error messages when certain commands are executed:

[fedora 22]# fail2ban-client reload wordpress-login
ERROR  NOK: ('Cannot change database when there are jails present',)

Protecting Fedora 22 Linux with a Firewall

In one of the recent 17 Fedora releases, there was a transition from directly calling iptables commands in a script to using a command called firewall-cmd to accomplish the same task.  So, on Fedora 22 you should no longer execute commands like this:

iptables -t nat -A PREROUTING -i $INET_INTERFACE -p esp -j DNAT --to $VPN_IPADDR
 
iptables -A FORWARD -i $INET_INTERFACE -o $DMZ_INTERFACE -p udp --sport 4500 --dport 4500 -d $VPN_IPADDR -j ACCEPT
 
iptables -A FORWARD -i $INET_INTERFACE -o $DMZ_INTERFACE -p esp -j ACCEPT

Instead, with Fedora 22 the commands that are used to control the firewall have an entirely different syntax (allow access to port http 80, https port 443, ssh port 22, and ftp ports 20/21, remove access to FTP ports 20/21, and then reload and activate the changed rules):

firewall-cmd --set-default-zone=public 
 
firewall-cmd --permanent --zone=public --add-service=http 
 
firewall-cmd --permanent --zone=public --add-service=https 
 
firewall-cmd --permanent --zone=public --add-service=ssh
 
firewall-cmd --permanent --zone=public --add-service=ftp
 
firewall-cmd --permanent --zone=public --remove-service=ftp
 
firewall-cmd --reload

The changes do not take effect until the reload command is executed.  If you are planning to set up a publicly accessible website, and you do not want the server to respond to ping requests and similar icmp requests, you might add a couple of additional firewall rules:

firewall-cmd --permanent --zone=public --add-icmp-block=destination-unreachable
firewall-cmd --permanent --zone=public --add-icmp-block=echo-reply
firewall-cmd --permanent --zone=public --add-icmp-block=echo-request
firewall-cmd --permanent --zone=public --add-icmp-block=parameter-problem
firewall-cmd --permanent --zone=public --add-icmp-block=redirect
firewall-cmd --permanent --zone=public --add-icmp-block=router-advertisement
firewall-cmd --permanent --zone=public --add-icmp-block=router-solicitation
firewall-cmd --permanent --zone=public --add-icmp-block=source-quench
firewall-cmd --permanent --zone=public --add-icmp-block=time-exceeded
firewall-cmd --reload

You might also decide to block certain web content spiders that mercilessly drain your server’s Internet bandwidth without returning any benefit to your website.  I noticed that the Baiduspider web crawler is a frequent offender, using several ranges of IP addresses.  I put an end to a large portion of the bandwidth drain from this web content spider with a simple firewall rule that blocks the IP address range 180.76.15.1 through 180.76.15.254 (don’t forget to reload after):

firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='180.76.15.0/24' reject"

Note that you may see a message similar to the following when attempting to execute the reload command:

Error: 'NoneType' object has no attribute 'query_rule'

If you see the above error message when trying to reload the firewall rules, just shout “free Linux software” five times and execute this command to restart the firewall – this command should have the same end effect as the reload command, except that this command works:

systemctl restart firewalld

Now, assume that you have setup Fail2ban’s ssh jail.  After a couple of hours you have received over 200 emails from Fail2ban telling you that it has blocked 200+ computers wanting to be best ssh friends with your server.  Obviously, you skipped the step of setting up a different port for ssh.  Modify the sshd config file (if you forgot the basic vi commands: press i to be able to make changes in the file, Esc ZZ to save the changes and exit, Esc :q! to quit without saving changes):

vi /etc/ssh/sshd_config

Assume that you want to change the ssh port from 22 to 1492 (something about sailing the ocean blue?).  Below the #Port 22 line, add:

Port 1492

Then save the file and exit vi.  Since SELinux is enabled, we need to instruct SELinux to behave correctly when an ssh client attaches to port 1492:

semanage port -a -t ssh_port_t -p tcp 1492

Note: Using the semanage command requires another package to be installed first:

dnf install policycoreutils-python

Note 2: If you think that SELinux is blocking something that should not be blocked, SELinux may be temporarily disabled with this command:

setenforce 0

To re-enable SELinux, either reboot the server or execute this command:

setenforce 1

Next, we need to add a firewall rule to permit connections on port 1492, and reload the firewall rules (note that I am using the command to restart the firewall daemon instead due to the error that appeared with the reload command):

firewall-cmd --permanent --zone=public  --add-port=1492/tcp
systemctl restart firewalld

As a final verification, make certain that the Linux firewall and SELinux recognize the new port:

firewall-cmd --list-ports
semanage port -l | grep ssh

If there are no apparent problems with the above output, restart the ssh daemon:

systemctl reload sshd.service

You may also wish to confirm which services are enabled for the Linux firewall:

firewall-cmd --list-services

Beating on a Linux box that lacks a monitor and keyboard is only so much fun (that old reboot joke, I guess).  If you have a Windows computer handy, the free Putty program will allow access to the ssh interface on the Linux server.  WinSCP is a helpful utility that provides Windows Explorer-like views through the ssh interface on the Linux server.

Protecting Fedora 22 Linux with Fail2ban

Fail2ban is a utility that monitors various log files on the server, looking for unexpected activity that typically originates from another computer on the network or on the Internet.  Fail2ban may be setup to take various actions when a problem is noticed, such as the same IP address failing to connect to SSH 10 times in 15 minutes.  The action may be to send an email to an administrator and/or to configure a firewall rule that temporarily blocks the offender’s IP address.  There are a few how-to articles found through Google searches that describe how to install and configure Fail2ban.  Shockingly (not really), some of those articles are more than a couple of months old (so the articles may not work with Fedora 22) and/or instruct people to modify files that explicitly state in the header:

# YOU SHOULD NOT MODIFY THIS FILE.

What to do?  What to do?

If you have not done so recently, make certain that the installed Fedora packages are up to date (dnf… another new command, what happened to the rpm command?):

dnf update

If the Apache web server is running on the server, there is a good chance that you executed commands similar to the following at some point in the past:

dnf install httpd
systemctl start httpd.service
systemctl enable httpd.service

Fail2ban is able to send emails using Sendmail, so if Sendmail is not installed, consider installing it:

dnf install sendmail
systemctl start sendmail
systemctl enable sendmail

While not directly applying to Fail2ban, SELinux, by default, blocks Apache from using Sendmail.  It is possible to verify that this is the case, and remove the restriction with these two commands:

sestatus -b | grep -i sendmail
setsebool -P httpd_can_sendmail 1

With Sendmail installed and running, we are able to proceed with the Fail2ban installation and configuration:

dnf install fail2ban ipset
dnf install whois fail2ban-sendmail
systemctl start fail2ban
systemctl enable fail2ban

The configuration file for Fail2ban that should be modified is /etc/fail2ban/jail.d/local.conf – but that file does not exist after installation.  The local.conf file references files in the /etc/fail2ban/filter.d/ directory that tell Fail2ban how to read the various log files and recognize problems using regular expressions (they look pretty irregular to me, but then I have not done much with regular expressions since that Turbo Pascal programming class years ago).  A starting point for the local.conf file with Fedora 22 and Sendmail, blocking ssh connection requests after a few incorrect login attempts from the same IP address within an hour, would look like the following (replace my.IP.address.here with your IP address so that Fail2ban will ignore your incorrect login attempts):

[DEFAULT]
bantime = 2592000
banaction = firewallcmd-ipset
backend = systemd
sender = emailaddress1@mydomain.com
destemail = emailaddress2@mydomain.com
action = %(action_mwl)s
ignoreip = 127.0.0.1 my.IP.address.here
 
[sshd]
enabled = true
findtime = 3600

The settings listed under the [DEFAULT] heading apply to all of the other sections in this file, unless those settings are also mentioned in the other sections of the file.  For example, the bantime (number of seconds to block an IP address) applies to the [sshd] section, as does the backend = systemd setting.  If we want Fail2ban to help protect WordPress, we will want Fail2ban to monitor a variety of log files, which cannot be done with the backend = systemd setting, so that setting will need to be modified in other sections of the file.  [sshd] describes the sshd jail, so we will need to select logical names for the sections of the file that will be added later.  The sshd jail was not defined (actually, not enabled – it is defined in another configuration file) when Fail2ban was first started, so we need to let Fail2ban know that it should load/reload the sshd jail configuration, and then verify that the jail is functional:

fail2ban-client reload sshd
fail2ban-client status sshd

If you wait a couple of minutes between executing the first and second of the above commands, you may see output similar to this, which indicates that some candidates for blocking were identified and blocked, and a notification email was sent to the email address specified by the destemail setting:

Status for the jail: sshd
|- Filter
|  |- Currently failed: 0
|  |- Total failed:     0
|  `- Journal matches:  _SYSTEMD_UNIT=sshd.service + _COMM=sshd
`- Actions
   |- Currently banned: 307
   |- Total banned:     307
   `- Banned IP list:   1.215.253.186 101.78.2.106 103.15.61.138 103.224.105.7 103.248.234.3 103.253.211.244 ...

Protecting WordPress running on Fedora 22 with Fail2ban

When an attempt is made to access the password protected /wp-admin section of a WordPress site, and a bad password is entered, by default WordPress silently destroys that failed connection attempt, so Fail2ban is not able to help by blocking repeat offenders.  A partial solution that I found on several websites is to add the following code near the start of the WordPress theme’s functions.php file:

add_action('wp_login_failed', 'log_wp_login_fail'); // hook failed login
function log_wp_login_fail($username) {
        error_log("WP login failed for username: $username");
}
Once that code is in place, some of the bad login attempts will be written to either the /var/log/httpd/error_log or /var/log/httpd/ssl_error_log file.  You might then start seeing errors such as these buried in those files:

[Thu Aug 13 10:17:43.578391 2015] [auth_basic:error] [pid 30933] [client 75.145.nnn.nnn:50683] AH01618: user admin not found: /wp-admin/css/login.min.css, referer: http://www.websitehere.com/wp-login.php
[Thu Aug 13 19:12:53.054913 2015] [:error] [pid 2060] [client 50.62.136.183:33789] WP login failed for username: k-mm
[Thu Aug 13 20:13:02.316777 2015] [:error] [pid 1873] [client 50.62.136.183:42677] WP login failed for username: k-mm
[Thu Aug 13 21:13:12.012160 2015] [:error] [pid 15701] [client 50.62.136.183:52432] WP login failed for username: k-mm.com
[Thu Aug 13 21:28:32.073261 2015] [:error] [pid 15697] [client 50.62.136.183:58571] WP login failed for username: k-mm.com
[Thu Aug 13 21:58:43.118303 2015] [:error] [pid 21245] [client 50.62.136.183:52059] WP login failed for username: k-mm.com
[Thu Aug 13 22:03:49.150456 2015] [:error] [pid 21244] [client 50.62.136.183:60540] WP login failed for username: k-mm.com
[Thu Aug 13 22:23:28.348351 2015] [:error] [pid 15688] [client 50.62.136.183:52911] WP login failed for username: k-mm.com
[Thu Aug 13 23:14:14.453002 2015] [:error] [pid 19632] [client 50.62.136.183:37700] WP login failed for username: admin
[Fri Aug 14 01:14:15.455095 2015] [:error] [pid 5085] [client 50.62.136.183:45656] WP login failed for username: administrator
[Fri Aug 14 02:14:16.478660 2015] [:error] [pid 4114] [client 50.62.136.183:53068] WP login failed for username: administrator

In the above, note the behavior of the computer at IP address 50.62.136.183 – that computer is slowly hitting the server with different username and password combinations – slowly, so as not to set off blocking utilities like Fail2ban that might be configured to start blocking when there have been, for instance, five bad password attempts in an hour.  Note that I stated that the addition to the theme’s functions.php file would help to identify some of the bad login attempts – to see the others, the /var/log/httpd/access_log and /var/log/httpd/ssl_access_log files must also be monitored.  In those files you may see patterns such as these where a single IP address will try to rapidly and repeatedly post to the /wp-login.php file for more than eight hours straight:

85.97.41.164 - - [12/Aug/2015:17:17:34 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:35 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:36 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:37 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:38 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:38 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:40 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:42 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
85.97.41.164 - - [12/Aug/2015:17:17:43 -0400] "POST /wp-login.php HTTP/1.1" 200 1628 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
...
109.228.0.250 - - [13/Aug/2015:01:42:43 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"
109.228.0.250 - - [13/Aug/2015:01:42:48 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"
109.228.0.250 - - [13/Aug/2015:01:42:49 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"
109.228.0.250 - - [13/Aug/2015:01:42:50 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"
109.228.0.250 - - [13/Aug/2015:01:42:56 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"
109.228.0.250 - - [13/Aug/2015:01:42:56 -0400] "POST /wp-login.php HTTP/1.0" 403 3030 "-" "-"

Obviously, the computers at those IP addresses were up to no good, and should also be blocked.  Another interesting pattern that might be seen in the access_log or ssl_access_log files is an attacker trying to retrieve the login of the first author username in WordPress, working slowly to try logging into the website so as not to trip protection utilities like Fail2ban that identify multiple failed logins from the same IP address in a short period of time:

185.93.187.69 - - [20/Aug/2015:00:38:16 -0400] "GET /?author=1 HTTP/1.1" 302 -
185.93.187.69 - - [20/Aug/2015:00:38:20 -0400] "GET /wp-login.php HTTP/1.1" 403 221
185.93.187.69 - - [20/Aug/2015:00:58:35 -0400] "GET /?author=1 HTTP/1.1" 302 -
185.93.187.69 - - [20/Aug/2015:00:58:37 -0400] "GET /wp-login.php HTTP/1.1" 403 221
185.93.187.69 - - [20/Aug/2015:01:19:20 -0400] "GET /?author=1 HTTP/1.1" 302 -
185.93.187.69 - - [20/Aug/2015:01:19:22 -0400] "GET /wp-login.php HTTP/1.1" 403 221
185.93.187.69 - - [20/Aug/2015:01:39:45 -0400] "GET /?author=1 HTTP/1.1" 302 -
185.93.187.69 - - [20/Aug/2015:01:39:46 -0400] "GET /wp-login.php HTTP/1.1" 403 221
185.93.187.69 - - [20/Aug/2015:01:59:59 -0400] "GET /?author=1 HTTP/1.1" 302 -
185.93.187.69 - - [20/Aug/2015:02:00:00 -0400] "GET /wp-login.php HTTP/1.1" 403 221

You might also see something like this in the access_log or ssl_access_log file:

220.163.10.250 - - [17/Aug/2015:21:03:43 -0400] "DELETE / HTTP/1.1" 400 226

I strongly suspect that the computer at IP address 220.163.10.250 had other uses in mind for my website.  From the documentation:

“The DELETE method requests that the origin server delete the resource identified by the Request-URI. This method MAY be overridden by human intervention (or other means) on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. However, the server SHOULD NOT indicate success unless, at the time the response is given, it intends to delete the resource or move it to an inaccessible location. “

A quick method to determine if a potential attacker tried to use the above DELETE request is to use the grep command to search within the ssl_access_log and access_log files:

grep "DELETE" /var/log/httpd/ssl_access_log*
grep "DELETE" /var/log/httpd/access_log*

Another set of attempted compromises that is not directed at WordPress sites are also visible in the ssl_access_log and access_log files:

162.246.61.20 - - [29/Jul/2015:02:13:11 -0400] "GET /cgi-bin/php HTTP/1.1" 404 209 "-" "-"
162.246.61.20 - - [29/Jul/2015:02:13:11 -0400] "GET /cgi-bin/php5 HTTP/1.1" 404 210 "-" "-"
162.246.61.20 - - [29/Jul/2015:02:13:11 -0400] "GET /cgi-bin/php-cgi HTTP/1.1" 404 213 "-" "-"
162.246.61.20 - - [29/Jul/2015:02:13:11 -0400] "GET /cgi-bin/php.cgi HTTP/1.1" 404 213 "-" "-"
162.246.61.20 - - [29/Jul/2015:02:13:11 -0400] "GET /cgi-bin/php4 HTTP/1.1" 404 210 "-" "-"
195.145.157.189 - - [30/Jul/2015:12:07:38 -0400] "GET /cgi-bin/test-cgi HTTP/1.1" 404 214 "-" "the beast"
37.144.20.31 - - [01/Aug/2015:09:34:10 -0400] "GET /tmUnblock.cgi HTTP/1.1" 400 226 "-" "-"
69.64.46.86 - - [03/Aug/2015:01:48:28 -0400] "GET /cgi-bin/rtpd.cgi HTTP/1.0" 404 214 "-" "-"
69.64.46.86 - - [14/Aug/2015:01:24:35 -0400] "GET /cgi-bin/rtpd.cgi HTTP/1.0" 404 214 "-" "-"
23:46.148.18.122 - - [16/Aug/2015:20:30:17 -0400] "GET /tmUnblock.cgi HTTP/1.1" 403 - "-" "-"
23:46.148.18.122 - - [16/Aug/2015:20:30:17 -0400] "GET /hndUnblock.cgi HTTP/1.1" 403 - "-" "-"
88.202.224.162 - - [23/Aug/2015:07:05:15 -0400] "GET //cgi-bin/webcm?getpage=../html/menus/menu2.html&var:lang=%26%20allcfgconv%20-C%20voip%20-c%20-o%20-%20../../../../../var/tmp/voip.cfg%20%2 HTTP/1.1" 404 211
80.82.65.186 - - [01/Aug/2015:08:42:51 -0400] "GET //cgi-bin/webcm?getpage=../html/menus/menu2.html&var:lang=%26%20allcfgconv%20-C%20voip%20-c%20-o%20-%20../../../../../var/tmp/voip.cfg%20%26 HTTP/1.1" 404 211
46.165.220.215 - - [16/Aug/2015:20:51:51 -0400] "GET /cgi-bin/webcm?getpage=../html/menus/menu2.html&var:lang=%26%20allcfgconv%20-C%20voip%20-c%20-o%20-%20../../../../../var/tmp/voip.cfg%20%26 HTTP/1.1" 404 211
46.165.220.215 - - [17/Aug/2015:03:09:59 -0400] "GET /cgi-bin/webcm?getpage=../html/menus/menu2.html&var:lang=%26%20allcfgconv%20-C%20voip%20-c%20-o%20-%20../../../../../var/tmp/voip.cfg%20%26 HTTP/1.1" 404 211

If any of the above appear to be interesting, you might try a Google search to see what the remote computers were attempting to compromise.

Far less obnoxious are entries that show your Nagios monitoring utility checking the website availability:

50.196.nnn.nnn - - [19/Aug/2015:09:30:54 -0400] "GET / HTTP/1.1" 200 57465 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
50.196.nnn.nnn - - [19/Aug/2015:09:31:07 -0400] "GET / HTTP/1.1" 200 57465 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
50.196.nnn.nnn - - [19/Aug/2015:09:31:42 -0400] "GET / HTTP/1.1" 200 57465 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
50.196.nnn.nnn - - [19/Aug/2015:09:31:47 -0400] "GET / HTTP/1.1" 200 57465 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"

You may also see random computers trying to download a file named wpad.dat (in the webserver root directory, execute touch wpad.dat to create a zero byte file with that name – this is important if your client computers should not be trying to retrieve such a file and you have a feature rich custom error page configured for the website).  There is a chance that your client computers could be searching for this file due to a specific configuration setting:

WebNotDatabaseWPAD

Example output, showing repeated requests, is shown below:

76.29.115.160 - - [20/Aug/2015:02:07:40 -0400] "GET /wpad.dat HTTP/1.1" 200 - "-" "-"
76.29.115.160 - - [20/Aug/2015:02:07:46 -0400] "GET /wpad.dat HTTP/1.1" 200 - "-" "-"
76.29.115.160 - - [20/Aug/2015:02:08:03 -0400] "GET /wpad.dat HTTP/1.1" 200 - "-" "-"
76.29.115.160 - - [20/Aug/2015:02:08:14 -0400] "GET /wpad.dat HTTP/1.1" 200 - "-" "-"

Regular Expression Building Assistance:

If we intend to have Fail2ban help protect WordPress running on Apache on Fedora 22 Linux, we need to first create “filter” files that contain the regular expressions needed to recognize bad guy attempted access.  The filter files are located in the /etc/fail2ban/filter.d/ directory and all end with .conf, although the .conf portion of the filename is not specified in the /etc/fail2ban/jail.d/local.conf file that we created earlier.  I will create separate filter files for ssl and non-ssl log files, although that is not required.  The first filter file is apache-wp-login.conf:

vi /etc/fail2ban/filter.d/apache-wp-login.conf

I set that file to have four regular expressions to recognize a bad guy’s attempted access (one or two of the regular expressions below may be incorrect because I have not had enough recent practice at writing regular expressions):

[Definition]
failregex = [[]client <HOST>[]] WP login failed.*
            [[]client <HOST>[]] client denied.*wp-login.php
            .*\[auth_basic:error\] \[pid.*\] \[client <HOST>.*?
            .*\[:error\] \[pid.*\] \[client .*?(?P<host>\S+):\d+\] WP login failed.*
ignoreregex =

Save the file and exit vi.  Verification of the regular expression syntax is important.  The fail2ban-regex utility will process a Linux log file of your choice using one of the regular expression filters that you create in the /etc/fail2ban/filter.d/ directory.  For example, to test the filter that was created above, execute the following command:

fail2ban-regex --print-all-matched /var/log/httpd/error_log /etc/fail2ban/filter.d/apache-wp-login.conf

Your output may be similar to what appears below (note that I processed an error_log from a previous week):

Running tests
=============
 
Use   failregex filter file : apache-wp-login, basedir: /etc/fail2ban
Use         log file : /var/log/httpd/error_log-20150816
Use         encoding : UTF-8
 
 
Results
=======
 
Failregex: 40 total
|-  #) [# of hits] regular expression
|   3) [26] .*\[auth_basic:error\] \[pid.*\] \[client <HOST>.*?
|   4) [14] .*\[:error\] \[pid.*\] \[client .*?(?P<host>\S+):\d+\] WP login failed.*
`-
 
Ignoreregex: 0 total
 
Date template hits:
|- [# of hits] date format
|  [140] (?:DAY )?MON Day 24hour:Minute:Second(?:\.Microseconds)?(?: Year)?
`-
 
Lines: 144 lines, 0 ignored, 40 matched, 104 missed [processed in 0.24 sec]
|- Matched line(s):
...
|  [Thu Aug 13 22:03:49.150456 2015] [:error] [pid 21244] [client 50.62.136.183:60540] WP login failed for username: k-mm.com
|  [Thu Aug 13 22:23:28.348351 2015] [:error] [pid 15688] [client 50.62.136.183:52911] WP login failed for username: k-mm.com
|  [Thu Aug 13 23:14:14.453002 2015] [:error] [pid 19632] [client 50.62.136.183:37700] WP login failed for username: admin
|  [Fri Aug 14 01:14:15.455095 2015] [:error] [pid 5085] [client 50.62.136.183:45656] WP login failed for username: administrator
|  [Fri Aug 14 02:14:16.478660 2015] [:error] [pid 4114] [client 50.62.136.183:53068] WP login failed for username: administrator
|  [Fri Aug 14 13:02:10.181252 2015] [auth_basic:error] [pid 30239] [client 75.145.nnn.nnn:54787] AH01618: user test not found: /wp-admin/css/login.min.css, referer: http://www.mydomain.com/wp-login.php
|  [Fri Aug 14 13:02:12.819515 2015] [auth_basic:error] [pid 30239] [client 75.145.nnn.nnn:54787] AH01618: user test not found: /wp-admin/css/login.min.css, referer: http://www.mydomain.com/wp-login.php
|  [Fri Aug 14 13:02:14.880515 2015] [auth_basic:error] [pid 30239] [client 75.145.nnn.nnn:54787] AH01618: user test not found: /wp-admin/css/login.min.css, referer: http://www.mydomain.com/wp-login.php
|  [Fri Aug 14 13:02:29.497034 2015] [:error] [pid 3357] [client 75.145.nnn.nnn:54798] WP login failed for username: k-mm, referer: http://www.mydomain.com/wp-login.php
|  [Fri Aug 14 13:02:29.531482 2015] [auth_basic:error] [pid 3357] [client 75.145.nnn.nnn:54798] AH01618: user test not found: /wp-admin/css/login.min.css, referer: http://www.mydomain.com/wp-login.php
...

The /etc/fail2ban/filter.d/apache-wp-login-ssl.conf filter file that I created is identical to the /etc/fail2ban/filter.d/apache-wp-login.conf file:

[Definition]
failregex = [[]client <HOST>[]] WP login failed.*
            [[]client <HOST>[]] client denied.*wp-login.php
            .*\[auth_basic:error\] \[pid.*\] \[client <HOST>.*?
            .*\[:error\] \[pid.*\] \[client .*?(?P<host>\S+):\d+\] WP login failed.*
ignoreregex =

After saving the file and exiting vi, we are able to test the filter:

fail2ban-regex --print-all-matched /var/log/httpd/ssl_error_log /etc/fail2ban/filter.d/apache-wp-login-ssl.conf

The wordpress-login.conf and wordpress-login-ssl.conf filter files will be used to examine the /var/log/httpd/access_log and /var/log/httpd/ssl_access_log files, respectively.

The /etc/fail2ban/filter.d/wordpress-login.conf file (note once again that one or two of the regular expressions used for matching may need to be adjusted):

[Definition]
failregex = ^<HOST> .* "POST .*\/wp-login.php HTTP/1.0" 403 .*$
            ^<HOST> .* "POST .*\/wp-login.php HTTP/1.1" 403 .*$
            ^<HOST> .* "POST .*wp-login.php HTTP.1.*" 403
            ^<HOST> .* "POST .*wp-login.php HTTP.1.*" 200
            ^<HOST> .* "GET .*wp-login.php HTTP/1.*" 403 221
            ^<HOST> .* "GET ..author=1 HTTP/1.*" 302 -
ignoreregex =

The /etc/fail2ban/filter.d/wordpress-login-ssl.conf file:

[Definition]
failregex = ^<HOST> .* "POST .*\/wp-login.php HTTP/1.0" 403 .*$
            ^<HOST> .* "POST .*\/wp-login.php HTTP/1.1" 403 .*$
            ^<HOST> .* "POST .*wp-login.php HTTP.1.*" 403
            ^<HOST> .* "POST .*wp-login.php HTTP.1.*" 200
            ^<HOST> .* "GET .*wp-login.php HTTP/1.*" 403 221
            ^<HOST> .* "GET ..author=1 HTTP/1.*" 302 -
ignoreregex =

To test those two filters, use these commands:

fail2ban-regex --print-all-matched /var/log/httpd/access_log /etc/fail2ban/filter.d/wordpress-login.conf
fail2ban-regex --print-all-matched /var/log/httpd/ssl_access_log /etc/fail2ban/filter.d/wordpress-login-ssl.conf

Added August 31, 2015:

I have found that a couple of computers on the Internet are trying to access a variety of *.cgi files in rapid fashion, resulting in entries such as these being written to the /var/log/httpd/error_log file:

[Sun Aug 30 20:38:08.187093 2015] [cgi:error] [pid 6426] [client 64.15.155.177:53122] AH02811: script not found or unable to stat: /var/www/cgi-bin/webmap.cgi
[Sun Aug 30 20:38:08.271430 2015] [cgi:error] [pid 6230] [client 64.15.155.177:53316] AH02811: script not found or unable to stat: /var/www/cgi-bin/whois.cgi
[Sun Aug 30 20:38:08.599455 2015] [cgi:error] [pid 6094] [client 64.15.155.177:54035] AH02811: script not found or unable to stat: /var/www/cgi-bin/register.cgi
[Sun Aug 30 20:38:08.733852 2015] [cgi:error] [pid 6453] [client 64.15.155.177:54213] AH02811: script not found or unable to stat: /var/www/cgi-bin/download.cgi
[Sun Aug 30 20:38:09.048479 2015] [cgi:error] [pid 5353] [client 64.15.155.177:54516] AH02811: script not found or unable to stat: /var/www/cgi-bin/shop.cgi
[Sun Aug 30 20:38:09.533326 2015] [cgi:error] [pid 5673] [client 64.15.155.177:56107] AH02811: script not found or unable to stat: /var/www/cgi-bin/profile.cgi
[Sun Aug 30 20:38:09.736446 2015] [cgi:error] [pid 6455] [client 64.15.155.177:56274] AH02811: script not found or unable to stat: /var/www/cgi-bin/about_us.cgi
[Sun Aug 30 20:38:09.830315 2015] [cgi:error] [pid 6456] [client 64.15.155.177:56734] AH02811: script not found or unable to stat: /var/www/cgi-bin/php.fcgi
[Sun Aug 30 20:38:09.918823 2015] [cgi:error] [pid 4232] [client 64.15.155.177:56923] AH02811: script not found or unable to stat: /var/www/cgi-bin/calendar.cgi
[Sun Aug 30 20:38:10.013162 2015] [cgi:error] [pid 6423] [client 64.15.155.177:57115] AH02811: script not found or unable to stat: /var/www/cgi-bin/download.cgi
[Sun Aug 30 20:38:10.106597 2015] [cgi:error] [pid 6425] [client 64.15.155.177:57399] AH02811: script not found or unable to stat: /var/www/cgi-bin/light_board.cgi
[Sun Aug 30 20:38:10.193901 2015] [cgi:error] [pid 6426] [client 64.15.155.177:57574] AH02811: script not found or unable to stat: /var/www/cgi-bin/main.cgi
[Sun Aug 30 20:38:10.288724 2015] [cgi:error] [pid 6230] [client 64.15.155.177:57754] AH02811: script not found or unable to stat: /var/www/cgi-bin/search.cgi
[Sun Aug 30 20:38:10.516842 2015] [cgi:error] [pid 5349] [client 64.15.155.177:57949] AH02811: script not found or unable to stat: /var/www/cgi-bin/test.cgi
[Sun Aug 30 20:38:10.601953 2015] [cgi:error] [pid 6094] [client 64.15.155.177:58409] AH02811: script not found or unable to stat: /var/www/cgi-bin/file_up.cgi

If you have Fail2ban running on the webserver, and you are seeing entries like the above in the error_log file, consider creating a file named /etc/fail2ban/filter.d/apache-cgi-bin.conf with the following contents:

[Definition]
failregex   = ^.*\[cgi:error\] \[pid.*\] \[client .*?(?P<host>\S+):\d+\] AH02811: script not found or unable to stat: \/var\/www\/cgi-bin.*$
ignoreregex =

To test the above filter definition, execute this command:

fail2ban-regex --print-all-matched /var/log/httpd/error_log /etc/fail2ban/filter.d/apache-cgi-bin.conf

(Note that the steps that follow assume that the local.conf file has already been created; see the earlier steps in this article.)  To set up the jail that uses the above filter, in the /etc/fail2ban/jail.d/local.conf file, you would then add the following lines, which will set up blocking when a search locates five or more matching entries from the same IP address within two days:

[apache-cgi-bin]
enabled  = true
filter   = apache-cgi-bin
logpath  = /var/log/httpd/error_log
bantime  = 2592000
findtime = 172800
port     = http,https
maxretry = 5
backend  = polling
journalmatch =

To activate the jail, execute:

fail2ban-client reload apache-cgi-bin

To see the jail status, execute:

fail2ban-client status apache-cgi-bin

Below is sample output for the above command:

Status for the jail: apache-cgi-bin
|- Filter
|  |- Currently failed: 1
|  |- Total failed:     111
|  `- File list:        /var/log/httpd/error_log
`- Actions
   |- Currently banned: 4
   |- Total banned:     4
   `- Banned IP list:   118.219.233.133 27.254.67.157 118.163.223.214 64.15.155.177

Added September 14, 2015:

I noticed a couple of additional suspicious access entries in the access_log file.  The first set of entries appears to be from a computer looking for a wide range of web server vulnerabilities:

185.25.48.89 - - [13/Sep/2015:22:57:49 -0400] "GET /wp-content/uploads/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:57:50 -0400] "GET /wp-content/uploads/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:57:55 -0400] "POST /uploadify/uploadify.php HTTP/1.1" 301 - "http://k-mm.com/uploadify/uploadify.php" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:57:58 -0400] "GET /samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:57:59 -0400] "GET /samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:02 -0400] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 1 "http://k-mm.com/wp-admin/admin-ajax.php" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:58:06 -0400] "GET /wp-content/plugins/revslider/temp/update_extract/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:06 -0400] "GET /wp-content/plugins/revslider/temp/update_extract/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:09 -0400] "POST /php-ofc-library/ofc_upload_image.php?name=sample.php HTTP/1.1" 301 - "/php-ofc-library/ofc_upload_image.php?name=sample.php" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:58:12 -0400] "GET /tmp-upload-images/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:13 -0400] "GET /tmp-upload-images/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:13 -0400] "GET /large-machining-fabricating-capabilities/ HTTP/1.1" 200 50109 "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:17 -0400] "POST /components/com_creativecontactform/fileupload/index.php HTTP/1.1" 301 - "/components/com_creativecontactform/fileupload/index.php" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:58:20 -0400] "GET /components/com_creativecontactform/fileupload/files/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:21 -0400] "GET /components/com_creativecontactform/fileupload/files/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:27 -0400] "GET /wp-content/uploads/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:28 -0400] "GET /wp-content/uploads/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:31 -0400] "HEAD /plugins/editor.zoho/agent/save_zoho.php HTTP/1.1" 301 - "-" "-"
185.25.48.89 - - [13/Sep/2015:22:58:32 -0400] "HEAD /sites/all/libraries/elfinder/elfinder.html HTTP/1.1" 301 - "-" "-"
185.25.48.89 - - [13/Sep/2015:22:58:33 -0400] "POST /wp-admin/admin-ajax.php?page=pmxi-admin-settings&action=upload&name=samplc.php HTTP/1.1" 200 1 "/wp-admin/admin-ajax.php?page=pmxi-admin-settings&action=upload&name=samplc.php" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:58:34 -0400] "GET /wp-content/plugins/wpallimport/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:35 -0400] "GET /wp-content/plugins/wpallimport/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:38 -0400] "POST /server/php/ HTTP/1.1" 301 - "/server/php/" "Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Firefox/3.5.0"
185.25.48.89 - - [13/Sep/2015:22:58:41 -0400] "GET /server/php/files/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"
185.25.48.89 - - [13/Sep/2015:22:58:42 -0400] "GET /server/php/files/samplc.php HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)"

The second set of entries appears to be from two different computers that are apparently trying to use a SQL injection type attack to deface a website, or something similar:

122.154.24.254 - - [14/Sep/2015:03:29:38 -0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 301 - "-" "-"
122.154.24.254 - - [14/Sep/2015:03:29:41 -0400] "GET /pma/scripts/setup.php HTTP/1.1" 301 - "-" "-"
122.154.24.254 - - [14/Sep/2015:03:29:45 -0400] "GET /myadmin/scripts/setup.php HTTP/1.1" 301 - "-" "-"
122.155.190.132 - - [14/Sep/2015:07:52:22 -0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 301 - "-" "-"
122.155.190.132 - - [14/Sep/2015:07:52:27 -0400] "GET /pma/scripts/setup.php HTTP/1.1" 301 - "-" "-"
122.155.190.132 - - [14/Sep/2015:07:52:33 -0400] "GET /myadmin/scripts/setup.php HTTP/1.1" 301 - "-" "-"

While the hacking attempts were unsuccessful, I decided that there is little point in wasting the server’s resources with similar attempts.  I created a new Fail2ban filter with the filename /etc/fail2ban/filter.d/apache-block-misc-php.conf and added the following lines to recognize the above entries in the Apache access_log file:

[Definition]
failregex = ^<HOST> .* "POST .*uploadify.php HTTP.1.*" .*$
            ^<HOST> .* "HEAD .*uploadify.php HTTP.1.*" .*$
            ^<HOST> .* "POST .*ofc_upload_image.php.*" .*$
            ^<HOST> .* "POST .*fileupload.index.php .*" .*$
            ^<HOST> .* "HEAD .*save_zoho.php .*" .*$
            ^<HOST> .* "POST .*save_zoho.php .*" .*$
            ^<HOST> .* "HEAD .*elfinder.html .*" .*$
            ^<HOST> .* "POST .*elfinder.html .*" .*$
            ^<HOST> .* "GET .*scripts.setup.php .*" .*$
            ^<HOST> .* "POST .*scripts.setup.php .*" .*$
            ^<HOST> .* "GET .*\/samplc.php .*" .*$
            ^<HOST> .* "GET .*\/?author=.*" .*$
            ^<HOST> .* "GET .*abdullkarem.*" .*$
            ^<HOST> .* "GET .*\/uploadify.php.*" .*$
            ^<HOST> .* "GET .*\/bin\/perl .*$
            ^<HOST> .* "GET .*wp-admin\/admin-ajax.php .*" .*$
            ^<HOST> .* "GET <title>phpMyAdmin HTTP.*$
            ^<HOST> .* "GET \/phpmyadmin.*$
            ^<HOST> .* "GET \/phpMyAdmin.*$
            ^<HOST> .* "GET \/PMA\/.*$
            ^<HOST> .* "GET \/pma\/.*$
            ^<HOST> .* "GET \/admin\/.*$
            ^<HOST> .* "GET \/dbadmin\/.*$
            ^<HOST> .* "GET \/mysql\/.*$
            ^<HOST> .* "GET \/myadmin\/.*$
            ^<HOST> .* "GET \/sqlmanager\/.*$
            ^<HOST> .* "GET \/mysqlmanager\/.*$
            ^<HOST> .* "GET \/wcd\/top.xml.*$
            ^<HOST> .* "GET \/wcd\/system_device.xml.*$
            ^<HOST> .* "GET \/wcd\/system.xml.*$
            ^<HOST> .* "GET \/openurgencevaccin\/index.php.*$
            ^<HOST> .* "GET \/zeuscms\/index.php.*$
            ^<HOST> .* "GET \/phpcoin\/license.php.*$
            ^<HOST> .* "GET \/authadmin\/.*$
            ^<HOST> .* "GET \/backup\/.*$
            ^<HOST> .* "GET \/backups\/.*$
            ^<HOST> .* "GET \/bak\/.*$
            ^<HOST> .* "GET \/cbi-bin\/.*$
            ^<HOST> .* "GET \/ccard\/.*$
            ^<HOST> .* "GET \/ccards\/license.php.*$
            ^<HOST> .* "GET \/cd-cgi\/.*$
            ^<HOST> .* "GET \/cfide\/.*$
            ^<HOST> .* "GET \/cgi\/.*$
            ^<HOST> .* "POST .*\/fileupload\/index.php.*$
            ^<HOST> .* "POST .*\/php\/index.php.*$
            ^<HOST> .* "GET .*wp-config.php.*$
            ^<HOST> .* "POST .*\/examples\/upload.php.*$
ignoreregex =

Once the new filter file is created, test the filter to see if it allows Fail2ban to find any matching lines in the access_log:

fail2ban-regex --print-all-matched /var/log/httpd/access_log /etc/fail2ban/filter.d/apache-block-misc-php.conf

If it appears that the filter is finding matching lines, add a new jail definition in the /etc/fail2ban/jail.d/local.conf file (note that maxretry is set to 2):

[apache-block-misc-php]
enabled = true
filter   = apache-block-misc-php
logpath  = /var/log/httpd/access_log
bantime = 2592000
findtime = 86400
port    = http,https
maxretry = 2
backend = polling
journalmatch =

To activate the new jail, execute the reload command:

fail2ban-client reload apache-block-misc-php

To check the status of the new jail, execute the status command:

fail2ban-client status apache-block-misc-php

Sample output is shown below:

Status for the jail: apache-block-misc-php
|- Filter
|  |- Currently failed: 0
|  |- Total failed:     30
|  `- File list:        /var/log/httpd/access_log
`- Actions
   |- Currently banned: 4
   |- Total banned:     4
   `- Banned IP list:   114.27.9.31 122.154.24.254 122.155.190.132 185.25.48.89

For Fail2ban to use the filters that were created earlier, we must add additional lines (jail definitions) to the /etc/fail2ban/jail.d/local.conf file:

vi /etc/fail2ban/jail.d/local.conf

At the end of the file add the following four jail definitions (note that without the backend and journalmatch lines the jails will not work due to the settings in the [DEFAULT] section of this file):

[apache-wp-login]
enabled = true
filter   = apache-wp-login
logpath  = /var/log/httpd/error_log
bantime  = 2592000
findtime = 3600
port    = http,https
maxretry = 5
backend  = polling
journalmatch =
 
[apache-wp-login-ssl]
enabled = true
filter   = apache-wp-login-ssl
logpath  = /var/log/httpd/ssl_error_log
bantime  = 2592000
findtime = 3600
port    = http,https
maxretry = 5
backend  = polling
journalmatch =
  
[wordpress-login]
enabled = true
filter   = wordpress-login
logpath  = /var/log/httpd/access_log
bantime = 345600
findtime = 86400
port    = http,https
maxretry = 6
backend = polling
journalmatch =
 
[wordpress-login-ssl]
enabled = true
filter   = wordpress-login-ssl
logpath  = /var/log/httpd/ssl_access_log
bantime = 345600
findtime = 86400
port    = http,https
maxretry = 6
backend = polling
journalmatch =

Save the file and exit vi.  Next we need to instruct Fail2ban to recognize the four new jails:

fail2ban-client reload apache-wp-login
fail2ban-client reload apache-wp-login-ssl
fail2ban-client reload wordpress-login
fail2ban-client reload wordpress-login-ssl

As an alternative to the above, we could just restart Fail2ban, which will restart all of the jails, and potentially spam your inbox with ssh blocking notifications:

systemctl restart fail2ban.service

Checking the status of the jails is quite simple to accomplish:

fail2ban-client status apache-wp-login
fail2ban-client status apache-wp-login-ssl
fail2ban-client status wordpress-login
fail2ban-client status wordpress-login-ssl
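
One related setting that may save some frustration: to make certain that your own workstation’s IP address is never banned by any of these jails, an ignoreip line may be added to the existing [DEFAULT] section of the /etc/fail2ban/jail.d/local.conf file (123.123.123.123 below is simply a placeholder for your own address):

ignoreip = 127.0.0.1/8 123.123.123.123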

You might be curious about the emails that Fail2ban sends.  Below is a portion of an actual email that I received from Fail2ban recently:

Hi,

The IP 46.119.117.47 has just been banned by Fail2Ban after
12 attempts against wordpress-login.

Here is more information about 46.119.117.47:

[Querying whois.ripe.net]
[whois.ripe.net]
% This is the RIPE Database query service.
% The objects are in RPSL format.
%
% The RIPE Database is subject to Terms and Conditions.
% See http://www.ripe.net/db/support/db-terms-conditions.pdf

% Note: this output has been filtered.
%       To receive output for a database update, use the "-B" flag.

% Information related to '46.118.0.0 - 46.119.255.255'

% Abuse contact for '46.118.0.0 - 46.119.255.255' is 'abuse@kyivstar.net'

inetnum:        46.118.0.0 - 46.119.255.255
descr:          Golden Telecom LLC
netname:        UA-SVITONLINE-20100517
org:            ORG-SOGT1-RIPE
country:        UA
admin-c:        GTUA-RIPE
tech-c:         GTUA-RIPE
status:         ALLOCATED PA
mnt-by:         RIPE-NCC-HM-MNT
mnt-lower:      GTUA-MNT
mnt-lower:      GTUA-WO-MNT
mnt-domains:    GTUA-ZONE-MNT
mnt-domains:    GTUA-MNT
mnt-routes:     GTUA-RT-MNT
mnt-routes:     GTUA-MNT
created:        2010-05-17T08:47:45Z
last-modified:  2011-08-04T15:58:57Z
source:         RIPE # Filtered

organisation:   ORG-SOGT1-RIPE
org-name:       Golden Telecom LLC
org-type:       LIR
address:        15/15/6 V. Khvojki str.
address:        04080
address:        Kiev
address:        UKRAINE
phone:          +380444900000
fax-no:         +380444900048
admin-c:        AEL17-RIPE
admin-c:        NP1533-RIPE
mnt-ref:        RIPE-NCC-HM-MNT
mnt-ref:        GTUA-MNT
mnt-by:         RIPE-NCC-HM-MNT
abuse-c:        GTL6-RIPE
created:        2004-04-17T12:09:58Z
last-modified:  2015-07-17T13:48:48Z
source:         RIPE # Filtered

role:           Golden Telecom Ukraine NOC
address:        Golden Telecom
address:        4 Lepse blvr
address:        Kiev, 03067, Ukraine
phone:          +380 44 4900000
fax-no:         +380 44 4900048
remarks:        All abuse notifications have to be sent on:
abuse-mailbox:  abuse@kyivstar.net
admin-c:        AEL17-RIPE
admin-c:        NP1533-RIPE
nic-hdl:        GTUA-RIPE
mnt-by:         GTUA-MNT
created:        2007-07-25T09:02:04Z
last-modified:  2014-06-17T08:24:26Z
source:         RIPE # Filtered

% Information related to '46.119.112.0/20AS15895'

route:          46.119.112.0/20
descr:          Kyivstar GSM, Kiev, Ukraine
origin:         AS15895
mnt-by:         GTUA-MNT
created:        2012-03-21T09:29:14Z
last-modified:  2012-03-21T09:29:14Z
source:         RIPE # Filtered

% This query was served by the RIPE Database Query Service version 1.80.1 (DB-2)
Lines containing IP:46.119.117.47 in /var/log/httpd/access_log

I am not sure why, but this particular email did not list the lines from the access_log that matched the filter rule.

Protecting WordPress running on Fedora 22 with .htaccess Files

One step that you may want to take is to password protect the /wp-admin directory on your web server.  To do that, you would create a new Linux user with a username and password that cannot easily be guessed from your website name or your WordPress usernames – the password should be at least eight characters long with upper and lower case letters, numbers, and punctuation marks.  Then, using tips from the last post in this message thread, create a file named .htaccess in the /wp-admin directory.  Inside that file, add the following lines (replace /full/path/to/your/wp-admin with the directory where you will later create a .htpasswd file):

AuthName "Admin Area"
AuthType Basic
AuthUserFile /full/path/to/your/wp-admin/.htpasswd
require valid-user
 
<Files admin-ajax.php>
    Order allow,deny
    Allow from all
    Satisfy any
</Files>

Next use the htpasswd generator website to create an encrypted version of the password for the Linux username.  For example, if you created the Linux user hillbillyforpresident with a password of GreatScott1TrumpIsAhead? the htpasswd website would instruct you to create a .htpasswd file with the following contents:

hillbillyforpresident:$apr1$gAgbX0SU$YjtXg5pAvXrD6i.F2lh6z1
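
If you would rather not paste a password into a website, the htpasswd utility (provided by the httpd-tools package on Fedora) can generate the same type of entry locally – a quick sketch, assuming the same username and .htpasswd path used above:

htpasswd -c /full/path/to/your/wp-admin/.htpasswd hillbillyforpresident

The -c switch creates the file; omit it when adding additional usernames to an existing .htpasswd file.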

Make certain that the .htaccess file (and possibly the .htpasswd file also) has read/write access for the owner, read access for the group in which Apache runs (the Apache user should not own the files), and no access for other users – the files should not be world readable.  For example:

chmod 640 /var/www/html/wp-admin/.htaccess

The wp-config.php file should also be protected with similar file permissions:

chmod 640 /var/www/html/wp-config.php
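
To line the file ownership up with that permission scheme, something like the following could be used (webfiles is a hypothetical non-Apache user that owns the files, and apache is the group in which Apache runs on Fedora):

chown webfiles:apache /var/www/html/wp-admin/.htaccess
chown webfiles:apache /var/www/html/wp-admin/.htpasswd
chown webfiles:apache /var/www/html/wp-config.php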

The .htaccess file in the web server’s root directory should also be adjusted to control which files may be accessed.  Below the # END WordPress line in the file, consider adding the following (once you understand what the lines accomplish – note that the entry containing 123\.123\.123\.123 permits only the IP address 123.123.123.123 to access the wp-login.php file and the wp-admin area, so substitute your own address):

# Block access to files.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^wp-admin/includes/ - [F,L]
RewriteRule !^wp-includes/ - [S=3]
RewriteRule ^wp-includes/[^/]+\.php$ - [F,L]
RewriteRule ^wp-includes/js/tinymce/langs/.+\.php - [F,L]
RewriteRule ^wp-includes/theme-compat/ - [F,L]
 
RewriteCond %{REQUEST_URI} ^(.*)?wp-login\.php(.*)$ [OR]
RewriteCond %{REQUEST_URI} ^(.*)?wp-admin$
RewriteCond %{REMOTE_ADDR} !^123\.123\.123\.123$
RewriteRule ^(.*)$ - [R=403,L]
</IfModule>
 
<files wp-config.php>
order allow,deny
deny from all
</files>
 
<Files .htaccess>
 order allow,deny
 deny from all
</Files>
 
# Stop Apache from serving .ht* files
<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>
 
Options -Indexes

WordPress and SELinux – a Headache Waiting to Attack

From what I understand, everything in the webserver’s root directory is set by default to the httpd_sys_content_t SELinux context – and sometimes that context is not present when files are copied into various subdirectories that are accessible to Apache.  The following command resets the SELinux context to the default context:

chcon -R -v -t httpd_sys_content_t /var/www/

Using FTP integrated with WordPress to install updated plugins or new WordPress versions is a bit of a nightmare because different SELinux contexts are required for the different directories – I never did find a combination that worked.  As a result, I added the following line to the wp-config.php file so that FTP integration is not necessary:

define( 'FS_METHOD', 'direct');

Of course the WordPress upload directory must have the httpd_sys_rw_content_t SELinux context, so at some point the following command would need to be executed:

chcon -R -v -t httpd_sys_rw_content_t /var/www/html/wp-content/uploads/
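
Note that chcon changes the context directly on the files, and those changes could be lost if the filesystem is ever relabeled.  If you want the rule to survive a relabel, the semanage and restorecon commands (provided by the policycoreutils-python package, if I remember correctly) are the more permanent approach – a sketch using the same uploads directory:

semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/wp-content/uploads(/.*)?'
restorecon -R -v /var/www/html/wp-content/uploads/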

The same command may also need to be executed for the WordPress plugins and upgrade directories (and probably a tempfiles directory) so that it is possible to install and update plugins using the WordPress interface.  Right now I do not permit WordPress to auto-update when a new version is released (this is due to the file system permissions that I use, which only allow the apache user to read the files, not change them).  I previously created a downloads directory in the /var directory.  Whenever I need to upgrade WordPress to a new version I use a script with the following contents (note that the script was pieced together based on what the WordPress release notes stated needed to be updated):

cd /var/downloads
rm -rf /var/downloads/wordpress
rm /var/downloads/wordpress.tar.gz
wget https://wordpress.org/latest.tar.gz
mv latest.tar.gz wordpress.tar.gz
tar -xzf wordpress.tar.gz
chcon -R -v -t httpd_sys_content_t /var/downloads/wordpress/
chown -R FileOwnerHere:ApacheGroupHere /var/downloads/wordpress/
find /var/downloads/wordpress/ -type d -exec chmod 2755 {} +
find /var/downloads/wordpress/ -type f -exec chmod 2644 {} +
cp -av /var/downloads/wordpress/wp-admin/* /var/www/html/wp-admin/
cp -av /var/downloads/wordpress/wp-includes/* /var/www/html/wp-includes/
cp -v /var/downloads/wordpress/wp-content/* /var/www/html/wp-content/
cp /var/downloads/wordpress/*.php /var/www/html/
cp /var/downloads/wordpress/*.txt /var/www/html/
cp /var/downloads/wordpress/*.html /var/www/html/
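
The above lines could be saved as a shell script (the filename below is just an example) and executed whenever the WordPress release notes call for updating the core files:

sh /var/downloads/upgrade-wordpress.sh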

The above information is consolidated from weeks, maybe months, of hammering on a seemingly simple problem – 12 years later (OK, maybe 16 years later) and I am still in search of the Linux program named Setup.exe that configures everything that needs to be configured to get a job done quickly.  Oh, going out on a limb here, let’s ask for a GUI interface too that works with PuTTY.  Or, even further out on a limb, let’s ask for consistency of file paths, filenames, and commands across the 790+ Linux distributions and the versions within each distribution, so that a how-to article created two years ago is still valid today.  Stepping off the soap box… or SOAP box.

If any readers have comments or suggestions that improve upon the above information (or gently correct), please feel free to add a comment below.  Maybe someone else will find some of the above information useful to avoid putting a couple of extra dents in the top surface of their desk.





Hyper-Extended Oracle Performance Monitor 7.0 Beta

16 08 2015

August 16, 2015

Three and a half years ago I offered the Hyper-Extended Oracle Performance Monitor 6.0 Beta for download – that Beta version expired a year later.  This program has been somewhat of a pet project for the last 13 years (almost 10 years at the time of the previous Beta offering), so I was excited to read some of the feedback about the 6.0 Beta.  The minimal feedback had me wondering for a while, then the pet project was pushed to a dark corner for roughly three years.  I taught the pet a few new tricks on Windows 10, so I thought that I would make the latest version available.  Unlike the previous Beta, Beta 7.0 does not have an expiration date.

NOTE August 17, 2015: The first two download links are working now – WordPress was blocking the downloads.  Note that the first two downloads are actually compressed .zip files with a .doc extension – inside those .zip files are the files that are of interest.  Right-click the files, save the files to your computer, then rename the files to end with .zip.

  1. Program download (right-click Save As, then change the .doc extension to .zip): Hyper-ExtendedOraclePerformanceMonitor7.zip
  2. Documentation download (right-click Save As, then change the .doc extension to .zip – still incomplete, see old docs): Hyper-Extended Oracle Performance Monitor7.doc
  3. Old Documentation download: Hyper-Extended Oracle Performance Monitor3.doc
  4. Update August 20, 2015: If you have an old computer, you may need to put the MSCOMCTL.OCX file in your C:\Windows\SysWOW64 folder (on 64 bit Windows, or C:\Windows\System32 on 32 bit Windows).  You will then need to register (process) the file with REGSVR32 – see the command in the instructions below.  A recent version of MSCOMCTL.OCX may be downloaded here: MSCOMCTL.OCX (save the file, then rename the file as MSCOMCTL.OCX).

Requirements:

Windows 2000 through Windows 10, 32 bit Oracle Client, MDAC/ADO 2.8 (or greater – preinstalled starting with Windows XP), connection to the Oracle database by database SID (tnsnames.ora) using Oracle’s Oracle Provider for OLE DB (Oracle’s Oracle ODBC Driver is required for connections that are initiated by SYS – primarily for access to the Advanced Init Parameter functionality), Excel 2000-2013, Microsoft Grid control (provided in the download – put into the program’s folder), TIMED_STATISTICS set to TRUE. Most features require DBA permissions (SYSADM or SYS), or specific grants to views or packages. For example, sessions that use the DBMS Xplan and Trace functionality must have permission to execute ALTER SESSION, have execute permission on the DBMS_XPLAN package, and select permissions on V$SQL, V$SQL_PLAN, and V$SQL_PLAN_STATISTICS.  Configuring session tracing requires execute permission on the DBMS_SYSTEM package and/or the DBMS_MONITOR package.

Description:

The Hyper-Extended Oracle Performance Monitor provides a variety of functions to simplify working with Oracle databases and monitoring their performance. Various types of simple data dictionary and database instance related reports may be generated in Excel by simply putting a check in a box on the program’s main window, and clicking the Report button. Generating DBMS Xplan output (using the raw information from a 10046 level 4/12 trace) with optional 10046/10053/other traces, creating data change logging triggers using the data dictionary, peeking at the hidden and non-hidden initialization parameters, and access to a quick Oracle keyword search are also a click away on the program’s main window. Performance monitoring and tracing tools for checking a cross-reference of the time model and wait event statistics (at the system and session levels), checking a cross-reference of the system statistics and wait event statistics (with drill-down to the session level), reviewing potentially high-load SQL statements and their execution plans, cross-referencing the current session waits with enqueues, and enabling various types of Oracle traces are also a click away on the program’s main window. Permanent recording of performance monitoring statistics in a Microsoft Access compatible database (Microsoft Access is not required) is also a click away on the program’s main window.

General Command Line Parameters:

  • -D   The Database instance SID to which the program should connect.
  • -U   The user name to be used for connecting to the database instance.
  • -P   The password to be used for connecting to the database instance.

Logging-Specific Command Line Parameters:

  • -LC 20   Specifies Force a Log Capture when CPU Usage Exceeds value to 20%
  • -LI 30   Specifies Force a Log Capture if No Log Captured in Minutes value to 30 minutes
  • -LB   Specifies the Force a Log Capture when a Blocking Lock is Detected value to checked
  • -LW  Specifies the Force a Log Capture when a Wait Reason is Detected value to checked
  • -LR  Specifies the Capture SQL Execution Statistics for Wait Reasons value to checked
  • -LD   Specifies the Capture Segment Change Statistics value to checked
  • -LO   Specifies the Capture Operating System and Time Model Statistics value to checked
  • -LH   Specifies the Capture High Load SQL Statement Statistics value to checked
  • -LT   Specifies the Capture High Load SQL Statement Text value to checked
  • -LP   Specifies the Capture High Load SQL Statement Plan value to checked
  • -LHC 60   Specifies the minimum CPU time that is considered high load to 60 seconds accum.
  • -LHE 90   Specifies the minimum elapsed time that is considered high load to 90 seconds accum.
  • -LS   Specifies that Smart Logging should begin as soon as the login completes
  • -LE 240   Specifies that Smart Logging should end after 240 minutes
  • -LQ   Specifies that the program should quit (end) when logging ends

Using the above command line parameters, you could create a program shortcut for each of the databases that you administer.  For example, create a shortcut, and set its Target to (assuming that the program is on the root of the F: drive, the Oracle SID is OR1122P, the username is myuser, the password for that account is pword, and the smart logging should default to logging when 10% of CPU utilization is observed):

"F:\Hyper-Extended Oracle Performance Monitor.exe" -D OR1122P -U myuser -P pword -LC 10

What’s New:

The Hyper-Extended Oracle Performance Monitor runs on Windows 10, even though the touchpad on the Sony laptop sometimes stops working until the computer is put to sleep and then awoken.  The program should also work with Oracle Database 12c (and might still be compatible with Oracle Database 8i, although the Time Model Viewer will not work).

Hyper7HyperExtendedMainScreen

The Time Model Viewer main window is still essentially the same, showing the hierarchy of the time model statistics, with color-coded session-level time model statistics (color ranges from yellow to red to indicate the percentage of the total represented by the session).  The bottom of the window shows the system-wide wait events, and the sessions that contributed to those system level wait events:

Hyper7TimeModelViewer

When the Time Model Viewer is active, an Excel spreadsheet is created that shows the graphic history of the statistics for the 20 most recent time capture periods – if you find this view helpful, you might want to widen the DB Time and CPU chart:

Hyper7TimeModelViewerExcelCharts

The statistics also appear in numerical form on the Statistics tab in the Excel spreadsheet:

Hyper7TimeModelViewerExcelStatistics

The Real-Time Monitor is also essentially the same as in previous Beta versions:

Hyper7RealTimeMonitor

Like the Time Model Viewer, an Excel spreadsheet is created showing the graphical history of the statistics.  Which statistics?  Click a wait event or one of the white statistics boxes to add that statistic to the list for which charts are created:

Hyper7RealTimeMonitorExcelCharts

Previous Beta versions of the program wrote out a tab delimited file containing the statistic deltas after every 30 time capture periods (this version also creates such a file).  The Beta for version 7 writes those statistics to Excel after each time capture period to facilitate the chart creation:

Hyper7RealTimeMonitorExcelStatistics

Oracle’s OLEDB provider is used for connectivity; see this article for a description of how to fix the bug in the Oracle Database 11.2.0.3 Client installer that is related to the OLEDB provider (the problem may have been corrected in later releases).  I have not yet found a method for SYS to log in using Oracle’s OLEDB provider when “AS SYSDBA” must be specified for the login to complete.  As such, Oracle’s ODBC driver is used when SYS tries to log in and the login attempt using the OLEDB provider fails.  Oracle’s ODBC driver is also used when viewing the normally hidden parameters using the SYS login.  Because there could be multiple Oracle homes, you must create a 32 bit ODBC System DSN using the Oracle ODBC driver – name that DSN HYPEREXTEND:

Hyper7Login

Hyper7ODBC1

Hyper7ODBC2

Hyper7ODBC3

The ODBC connection driver information is used to login as the SYS user (AS SYSDBA) to view the Advanced Initialization Parameters:

Hyper7AdvancedInitParameters

Hyper-Extended Oracle Performance Monitor 7.0 Beta is also more compliant with restrictions placed on the computer by User Account Control (UAC), which prohibits programs from creating folders in the root of the C:\ drive and writing files into those folders.  Version 7 instead creates the OracleLog folder in the current user’s profile (typically C:\Users\username\OracleLog\), and separates the generated files by database.  The picture below shows a couple of Microsoft Access compatible logging databases that were created, as well as some of the tab-delimited statistic files, and various other generated files:

Hyper7LoggingFolder

As mentioned above, the documentation is still incomplete, with several pages containing nothing but pictures.  It takes a lot of effort and time to put the documentation together, so I will try to piece it together over the next couple of months.  You might be able to refer to the older documentation if the pictures are not self-explanatory.  If you find the program useful, please leave a note.  If you need help understanding a program window or the generated output, leave a comment attached to this blog article and I will try to help.

Hyper7b3.zip (Save with a .zip extension, not a .doc extension.)





On the Topic of Technology… 8 – First 48 hours with Windows 10

3 08 2015

August 3, 2015 (Updated August 5, 2015, August 10, 2015, August 29, 2015)

(Back to the Previous Post in the Series)

I have been testing Windows 10 under the Windows Insider Program for several months using an old Dell Precision Core 2 Extreme computer (with NVidia graphics card) that was retired from engineering service a couple of years ago. I had some issues upgrading from one preview version to the next, such as the new Start menu failing to display – forcing a format and reinstall of the operating system. One version that automatically installed over the top of build 10074 caused the computer to continuously reboot. Odd, I thought, that a Windows update could cause that problem. I traced that issue back to disabling data execution protection (DEP) several years earlier on the engineering computer to keep some of the software from spontaneously crashing – turning DEP back on in the BIOS immediately resolved the reboot loop. I was still seeing occasional Start menu display glitches a week or so before the official release date for Windows 10 (July 29, 2015), but those problems were diminished with the application of Windows updates as the official release date neared.

—–

Quick note on August 12, 2015: I came across an article on Spiceworks that led me down an interesting path, since I am fighting driver issues on a Sony Vaio laptop.  In short, if you have a Sony computer, don’t expect Windows 10 compatible drivers for a while.  How long?  October or November 2015.

Windows10SonyWarning

In a related article, the following statement appears, apparently quoting an unconfirmed Sony source:

“In a message to Sony laptop owners, the firm has pleaded for patience when it comes to installing the new operating system (OS) due to the real risk of software or driver corruption that could result in a catastrophic data loss.

Sony’s advice goes on to say that if customers don’t wait for the test results then there is a chance that their computer “may no longer work as intended.” This can mean anything from the system crashing, becoming unresponsive or suffering from hardware damage. The bottom line is that tapping Sony to service your PC following any of these issues could result in “losing all of the data” on your computer.”

—–

A couple of years ago I bought a Sony SVE14AE13L (SVE14A27CXH) touch screen laptop as an incentive to learn how to use and troubleshoot Windows 8. With a Core i7 processor and 8GB of memory, I anticipated having an easy time working through Windows 8, and after watching several hours of how-to videos I was reasonably comfortable with Windows 8 – for someone who has worked with computers since the early 1980s it was a bit of a struggle, but not as bad as some people claimed (I also thought that Windows Vista was reasonably good before the release of Windows 7). That said, this computer was updated to Windows 8.1 on the day that the new version became available. This Sony laptop computer, however, had a couple of annoying behaviors. It would randomly wake up in the middle of the night, the touchpad would occasionally stop working, Internet Explorer on the computer was occasionally slow, and the touchpad would send random zoom in/out messages to the current application.

Windows 8.1 worked reasonably well on the Sony laptop, but I went ahead and reserved a copy of Windows 10 for the computer to get some experience with the release version of Windows 10. The first 48 hours of Windows 10 began this past Saturday. I grew tired of waiting for Microsoft to tell me that my copy of Windows 10 was available for installation, so I opened a Windows command line and executed this command: wuauclt.exe /updatenow – that seemed to kick start the upgrade, but it failed to install once roughly 3.5GB of installer files were downloaded. Another search of the Internet indicated that I should have first deleted everything in the C:\Windows\SoftwareDistribution\Download folder. After downloading another 3.5GB copy of the Windows 10 installer, I found that this attempt also failed after I clicked the Continue (or whatever that button showed) button in the Windows Update window. I started wondering what I was doing wrong, so I downloaded the ISO DVD image for Windows 10 from Microsoft.  Progress – writing the ISO DVD image to a DVD permitted the Windows 10 installer to start with a simple double-click.
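
For reference, the Windows Update cleanup and re-trigger mentioned above would look something like the following when run from an elevated command prompt (stopping the Windows Update service before emptying the folder is my own addition – the advice that I found simply said to empty the folder):

rem stop the Windows Update service, empty the download cache, then restart the service
net stop wuauserv
del /f /s /q C:\Windows\SoftwareDistribution\Download\*.*
net start wuauserv
rem ask Windows Update to check for the upgrade again
wuauclt.exe /updatenow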

Five or six hours in now, roughly 10.5GB downloaded from the Internet (I sure am happy that I no longer have the Internet connection with the 6GB monthly limit), and the Windows 10 installer is showing a message “Your PC will restart several times. Sit back and relax.” Don’t worry, be happy (click the picture for a larger view).

Windows10-95PercentComputerMayRestart

The install completed… with an error message, “The installation failed in the FIRST_BOOT phase with an error during MIGRATE_DATA operation.” Fantastic, at least the installer brought back the previous Windows 8.1 (click the picture for a larger view).

Windows10CouldNotInstallErrorFailedFirstBootMigrateData

That sure is an obscure error message. Google searches seemed to indicate that the problem could be caused by a corrupt profile, or a half dozen other issues.

I made another attempt at the upgrade, this time telling the installer NOT to install any Windows updates during the initial install. Perhaps eight or nine hours in, success – the Windows 10 Pro logon window is a keyboard tap away (note that the picture below was taken the following day).

Windows10FirstLogin

Great, Windows updates are automatically downloading and installing. Five minutes later, the laptop is apparently unhappy with Windows 10, a frown on a blue screen of death with the message “If you’d like to know more, you can search online later for this error: INTERNAL_POWER_ERROR”.

Windows10BlueScreenInternalPowerError

E-Moe-Gee! (emoji). Sure, collect your error info and restart. Recalling the message from the earlier picture: “Your PC will restart several times. Sit back and relax.” And thus started an automatic reboot loop, sometimes lasting just long enough to log into Windows 10 and display the Device Manager. In this case, that INTERNAL_POWER_ERROR message means that Windows 10 does not like the AMD Radeon HD 7600M series graphics card. Congratulations Microsoft and AMD, I have not had this level of difficulty with video card drivers since 1999 when I set up a triple boot on a computer with the latest NVidia graphics card, booting into Windows 98 (drivers easy to find), Windows NT 4.0 (a little more challenging), and Red Hat Linux with the Gnome X-Window desktop (on par with Windows 10 on this Sony laptop).

A couple of tips at this point. Pressing F8 during the initial Windows boot apparently does not display the old Windows 95 style boot menu that allows the computer to start in Safe Mode with limited functionality. If the computer is able to boot to the logon screen, there is a power button at the bottom-right of that screen. Hold down the Shift key on the keyboard, click the power button, and select Restart – if you continue to hold down the Shift key, you will have the option to Troubleshoot the computer. If you select Troubleshoot, then Advanced options, then Startup Settings, you will be able to select to start the computer in Safe Mode. Shutting off the computer when it is booting into Windows will often result in the next boot attempt taking the computer to the screen that allows selecting Troubleshoot, and eventually the option to start in Safe Mode. Once in Windows 10, holding the Windows flag key and pressing the X key on the keyboard has the same effect as right-clicking the Start button. A menu will appear permitting quick access to the Device Manager, Computer Management, Task Manager, and a variety of other computer administration tasks.

One of the tips that I found online for dealing with the INTERNAL_POWER_ERROR message was to quickly navigate to the Device Manager, and delete the Windows installed AMD Radeon HD 7600M item under the Display adapters heading. I tried that without success, and even tried deleting the Intel HD Graphics 4000 item under the Display adapters heading without success. Instructing Windows to scan for new devices resulted in two “Microsoft Basic Display Adapter” items being added under the Display adapters heading, leading to another blue screen after a couple of minutes.

Windows10DeviceManager1

Booting the computer to Safe Mode with Networking to do some troubleshooting, Windows 10 was stable. A web search using a different computer suggested that I needed to download the latest video card drivers for the AMD video card to fix the blue screen. I found that the new Edge web browser, which is the new default web browser on Windows 10, cannot start in Safe Mode, resulting in a message stating “This app can’t open. Microsoft Edge can’t be opened using the Built-in Administrator account. Sign in with a different account and try again.”  I was logged into the computer using my personal account, not the Built-In Administrator account, but I guess that detail did not matter.

Windows10CannotStartEdgeInSafeMode

Where did Microsoft hide Internet Explorer on Windows 10? Hold down the Windows key and press R. Type iexplore.exe and press the Enter key. Now, if only the wireless network worked in Safe Mode with Network Support! Trying again with the laptop connected by Cat 5e cable to a network switch, I managed to download the correct drivers. Sorry, cannot install AMD video card drivers in Safe Mode, “Failed to load detection driver.” Clicking OK caused the installer to hang at 100% complete. The previously registered Microsoft Word 2010 claimed that it could not verify that I was running an authentic version of Word 2010 while in Safe Mode, so the problem is not just with AMD when in Safe Mode.

Windows10CannotInstallATIDriversInSafeMode

Now what? Tell the computer to try reinstalling Windows 10 without installing updates – tried that, only to be greeted with a near instantaneous frowny face (a blue screen of death) upon completion. It took roughly 10 attempts to make it into the BIOS setup on the laptop (F2 did not work at the start of the boot) – I eventually found that applying about 200 pounds of pressure to the pink Assist button while powering on the Sony laptop allowed access to the BIOS setup. I found an option titled “Discrete Graphics Adapter” that was set to Enabled, so I changed that to Disabled, saved the changes, and managed to log into a stable copy of Windows 10… about 30 to 36 hours after the first download of Windows 10 initiated.

Windows10BIOSDisableDiscrete

After verifying that the computer worked fine with the Discrete Graphics Adapter disabled, I re-enabled that setting and made another attempt with the AMD graphics card.  The AMD graphics drivers installed without issue in Windows 10, but again resulted in a continuous reboot blue screen loop. I found that by quickly navigating to Device Manager after logon, I was able to set the AMD graphics device to Disabled before the computer would blue screen – once again Windows 10 was stable (and Microsoft Word 2010 worked OK too). Windows 10 installed various Windows updates, but simply refused to install the video driver for the Intel HD Graphics 4000 device.

Windows10WindowsUpdateCannotInstallIntelDrivers

As a result, Device Manager still shows a “Microsoft Basic Display Adapter” in Device Manager. Instructing Device Manager to update the drivers also failed.

Windows10DeviceManagerCannotInstallVideo

Even though the correct video card drivers are not installed, Windows 10 seems to work OK. Unfortunately, the problem where the touchpad would send random zoom in/out messages is still present – note the size of the zoomed icons in the desktop background in the following picture. I have yet to find a logical way to send the zoom back to normal on the desktop. The right side of the below picture shows the notification Action Center that is displayed by clicking the icon near the clock at the bottom-right of the screen.

Windows10VideoPlaybackNotifications2

The following picture shows that the Windows 8 new style apps are now able to appear on the desktop with regular Windows applications. I was shocked to see that Windows 8 would not permit overlapped windows for the new style apps – I think that the original release of Windows had the same problem, even though the Commodore Amiga from the same time period (mid to late 1980s) supported windows arranged on top of other windows. Cortana is shown at the left of the window in the picture below – apparently Cortana lost an argument with the Internet shortly before this screen capture was saved (Cortana is a bit evasive when asked “who are your programmers”). The Windows 10 scientific calculator appears at top-center, Microsoft Edge at bottom-center, Paint.Net at top-right, and the Microsoft Solitaire Collection app at the bottom right.

Windows10NewOldAppsNoInternet

As I stated above, overlapping Windows 8 apps are supported, and Cortana sent me to a web page about Easter Eggs – I thought that maybe that would be a good way to learn more about Cortana’s programmers. The speech input recognition seems to work very well in Cortana – as long as the Internet connection is not down.

Windows10OverlappingWindows

The picture below shows the new Start menu at the left – it is possible to stretch or narrow the Start menu to show more or less of the Windows 8 style live tiles on the Start menu. Changing settings is still a little confusing – for some settings it is necessary to click All Settings in the Action Center (the gray background window at the top-right), and in other cases it is necessary to use the Windows Vista style Control Panel (the white background window at the bottom-right).

Windows10StartMenuControlPanelAllSettings

The task view trick that was introduced with Windows Vista (Windows key and Tab key) still works, but has changed for the better. The Task View may also be opened by clicking the Task View button on the task bar at the bottom of the screen.

Windows10TaskView

The location of important settings seems to change with just about every new release of Windows. It would be nice if (almost) all settings could be found within a single interface. A trick that worked with older versions of Windows also works with Windows 10. Right-click the desktop (or inside any folder) and select New – Folder. Give that folder the following name:

GodMode.{ED7BA470-8E54-465E-825C-99712043E01C}

Double-clicking that folder to open it gives you just what you need (until the touchpad starts randomly sending zoom in/out messages).
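
If the touchpad is misbehaving, the folder may also be created from a command prompt window rather than through the right-click menu:

mkdir "GodMode.{ED7BA470-8E54-465E-825C-99712043E01C}"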

Windows10GodMode1
Windows10GodMode2
Windows10GodMode3
Windows10GodMode4

Oh, it appears that the Oracle Database 11.2.0.3 database instance survived all of the upgrading nonsense, although it did take an exceptionally long time to start SQL*Plus.

I have been working with computers for a long time – started back in the early 1980s. My new boss (almost five months now) at work for some reason thinks that I am a programmer… just a programmer? Last week I casually mentioned to him that I had read 30+ books about Oracle Database, and hinted that I co-authored another. I wonder if he has a clue about the books that I read about network administration (including a couple of CNE books in the late 1990s), operating systems, Microsoft Exchange (two for Exchange 2013, and others for previous versions), computer hardware troubleshooting, and a variety of other computer related skills. Well, at least he thinks that I am a programmer, and not just one who fixes computers (not that there is anything wrong with fixing computers, right Microsoft/Sony/AMD?).

As a suggestion, before you attempt a Windows 10 upgrade on a computer with an AMD video card, ask yourself if you have 24+ hours to spare and how much you like seeing frowns on blue screens. If the above is not enough of a warning, consider the before and after webcam photos captured by the Sony laptop’s webcam that follow.

Before picture: staring down Windows 8.1, trying to figure out why the window zoom keeps randomly changing:

Windows10Before

After Windows 10 is installed picture: note that more than the background changed.  I think that Cortana is stalking me now (why do I have a purple square target on my head?).  Just what is she doing in the middle of the night when the computer spontaneously turns on from a deep sleep?:

Windows10After

Humor aside, like Windows NT 4.0, once you get through the blue screens of death during the driver installs, Windows 10 will likely be rock solid stable.  I am already liking it more than Windows 8.1, even if it is not as fast on the same hardware.

Update August 4, 2015:

Below is a possible fix for the random zoom in/zoom out problem if your computer has a Synaptics touchpad.  This solution will probably work on Windows 8/8.1 also.  Near the bottom right of the screen, click the up pointing arrow to show all of the tray program icons.  There may be a tray icon named Synaptics Pointing Device that looks like a rectangle with two smaller rectangles below – click that tray icon, and then click Pointing Device Properties from the menu.  Note that if you also want to turn off the potentially annoying left-clicks due to touching the touchpad with a bit too much force, click the Tap to Click menu item to remove the checkmark in front of that item.

aWindows10FixUnintentionalZoom1

On the Device Settings tab, click the Settings button.  Remove the checkmark in front of Pinch Zoom, click the OK button, and then click the OK button in the Mouse Properties window.  This random zoom behavior has plagued the laptop for some time, but only became beyond bearable when the randomly activating feature affected the desktop icons in Windows 10.

Windows10FixUnintentionalZoom2

Update August 5, 2015:

If your desktop icons are super-sized as shown below (icons are shown at actual size) due to random zoom-in/zoom-out messages from your touchpad, there is a simple quick-fix (after shutting off this feature using the instructions above).  Right-click an open area of the desktop, select View and then Medium icons.

Windows10FixOverZoomedDesktopIcons

Microsoft released the first cumulative update KB3081424 for Windows 10 64 bit computers within the last hour – that update is being rolled out and installed automatically on all Windows 10 64 bit computers (Windows 10 computers that are joined to a domain may or may not automatically install this update).  After the update installed, I told the computer to go ahead and reboot.  After rebooting the computer, the Windows Start menu would not appear, Cortana would not respond, and the Action Center would not display.  I saw this same behavior when testing the automatic updates in the various Technical Preview versions – in those cases I had to format the hard drive and reinstall Windows to recover.  Fortunately, in this case the functionality was restored by simply rebooting the computer (if this problem happens to you, there is no obvious way to tell Windows to reboot – just hold down Ctrl, then Alt, and press the Delete key once – a power option should appear near the bottom right of the screen to permit you to restart the computer).  This Start menu, Cortana, and Action Center functionality loss, if permanent and widespread, could cause a bit of havoc.

Windows10UpdateKillsStartMenu

Note that I am still fighting the Intel and AMD video card driver issues (Intel driver will not install, AMD driver installs with a blue screen).  I might have found a solution for the computer waking up unexpectedly in the middle of the night, but I will wait to post what I found until I confirm that the fix works.

Update 2 August 5, 2015:

Fixing the issue where the Intel HD Graphics 4000 device is listed as “Microsoft Basic Display Adapter” in Device Manager and Windows returns an error when installing the correct driver through Windows Update and Device Manager is a bit of a hassle.  The actual error message is not displayed when the driver fails to install.  If you experience this problem, immediately navigate to the C:\Windows\SoftwareDistribution\Download\Install folder, and locate a file that ends with .inf.  Right-click that file and select Install.  If Windows returns an error “There is no driver selected for the device information set or element.” that likely means that the company that set up the driver’s .inf file made an error in the file.

Windows10WindowsUpdateCannotInstallIntelDrivers2

The short-term solution for this problem is to download the Windows 7, 8, and 8.1 graphics driver package from the Intel website.  After a bit of searching, I found a suitable driver for the Intel HD Graphics 4000 device here.  After the installer completed, the computer had to restart – Device Manager then showed the HD Graphics 4000 device in place of the “Microsoft Basic Display Adapter” in Device Manager.  After the driver installation Windows Update still shows that the Windows 10 driver for the Intel graphics device fails to install, so there is still an issue that Microsoft/Intel need to rectify with the driver update.

Windows10FixIntel4000Driver

Now to tackle the blue screen of death caused by the AMD Radeon video driver…

Update August 10, 2015:

Peer-to-peer Windows Updates – potentially OK if the computer is connected to a trusted network, and the default Windows Update settings are adjusted; potentially dangerous at the default setting.  The default settings for Windows Updates essentially make your computer part of a world-wide Windows Update torrent.   If your computer is connected to a pay-as-you-go Internet provider, or if your Internet provider charges substantial fees for exceeding a certain monthly data cap (6GB per month, for instance), you should definitely modify the default update settings.  If you are concerned about your computer retrieving and automatically installing Windows updates from computers that have no association with Microsoft, other than the computers are running Windows 10, you should definitely modify the default update settings.  I was made aware of this particular issue when discussing Windows 10 with a relative who is an IT expert.  I also read a couple of articles today that described the torrent-like behavior of Windows 10 updates.

To fix the default setting, select Settings from the Start menu, then click Update & security.  Click Advanced options, then click Choose how updates are delivered.

Windows10DisableWindowsUpdateSharing1

If your computer is always connected to a trusted network (never taken to a coffee shop, hotel, LAN party, etc.), consider changing the default setting to PCs on my local network – this setting could be beneficial if you have a pay-as-you-go or monthly capped Internet connection.  If you ever connect the computer to an untrusted network, consider changing the default On setting to Off.

Windows10DisableWindowsUpdateSharing2

Update August 29, 2015:

On August 5, 2015 I stated, “I might have found a solution for the computer waking up unexpectedly in the middle of the night, but I will wait to post what I found until I confirm that the fix works.”  The fix that I implemented seems to have corrected this problem that has plagued the laptop for quite some time (possibly since new, nearly three years ago).  If your computer wakes up unexpectedly in the middle of the night, and you would like to end that behavior, click the Windows 10 Start Menu, then click Control Panel.  In the search box at the right type power and then click Edit power plan at the left.

Windows10DisableWakeUp1

In blue lettering near the top of the window you should see the words “Change settings for the plan: ” followed by the selected power plan name (High performance in the picture below) – make note of the selected power plan name.  Click Change advanced power settings.

Windows10DisableWakeUp2

Check to make certain that the same power plan appears selected in the list – if not, select the power plan that appeared in the previous window.  Click the + next to Sleep, then the + next to Allow wake timers.  Change both the On battery and Plugged in settings to Disable.  Finally, click OK to save the changed settings.  Note: it may be necessary to click the Change settings that are currently disabled link prior to making these changes.

Windows10DisableWakeUp3
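
If you would like to confirm what has actually been waking the computer before (or after) making this change, the powercfg utility may also help – for example, from an elevated command prompt:

powercfg /lastwake
powercfg /waketimers

The first command reports the device or timer that triggered the most recent wake, and the second lists any wake timers that are currently armed.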

I have now upgraded a computer from Windows 8.1 Pro (Ultimate?) to Windows 10 Pro (Sony laptop), another from Windows 8.1 Home to Windows 10 Home, another from Windows 7 Ultimate to Windows 10 Pro, and a fourth (and fifth) from Windows 7 Home to Windows 10 Home.  The Sony laptop has at least twice had a panic attack a couple of minutes after coming out of sleep.  During the panic attack the fan ramps up to maximum speed, the touch pad and keyboard stop responding, and the power button must be held in for 10 seconds to shut the computer off (the laptop does not go to sleep with a quick press of the power button).  The Sony computer has also crashed a couple of times with an irql_not_less_or_equal blue screen, at least once in the last week.  The computer that was upgraded from Windows 7 Ultimate to Windows 10 Pro has spontaneously rebooted three times (twice within 30 minutes) while building Windows large icon thumbnails for a number of Apple Quicktime .mov video files.  The computer is connected to a true sine wave UPS, so I know that the reboot is not caused by a power problem (I was next to the computer each time the reboot happened, no blue screen, no notification entries in the Windows event logs except that the previous shut down was unexpected).  One of the computers upgraded from Windows 7 Home to Windows 10 Home is a Dell laptop where the G and H keys did not work correctly before or after the upgrade (some people have reported that this is a known intermittent problem with this particular Dell laptop model) – BIOS upgrades for the laptop refused to install following the upgrade to Windows 10 Home.  I do not yet have a status update from the other two upgraded computers.





To N or Not to N, is SYS_OP_C2C the Question; Oracle NVARCHAR Slow?

8 02 2015

February 8, 2015

I was recently contacted about an odd performance issue with a software product named TDM, where certain SQL statements submitted to an Oracle 9.2.x database required roughly 30 seconds to execute when a nearly instantaneous response was expected.  The person who reported the issue to me provided a sample SQL statement that appeared similar to the following:

SELECT
  DATA
FROM 
  MVIS_DATA 
WHERE 
  TOOLID = N'112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;

Notice the N character immediately before ‘112’ in the SQL statement.  The person indicated that the SQL statement executed quickly if that N character were removed.  At this time the developer of that application is unwilling to release a bug fix to remove the N character from this (and likely other) SQL statements.

I did not initially have the table’s column datatype descriptions (as retrieved with DESC MVIS_DATA), so I made a couple of guesses about the datatypes.  What if the TOOLID column were defined as a number, and were also the primary key column for the table (indicating that there must be an index on that column)?  It might be the case that the developer of the application decided that, in all SQL statements submitted with literal values (rather than bind variables), all numbers would be enclosed in single quotes.  I created a testing table for a mock up in Oracle Database 11.2.0.2:

CREATE TABLE MVIS_DATA_NUM (
  TOOLID NUMBER,
  DATATYPE NUMBER,
  COMMISSIONID NUMBER,
  OPERATIONID NUMBER,
  COMMISSIONLISTPOS NUMBER,
  DATA VARCHAR2(100),
  PRIMARY KEY (TOOLID));
 
INSERT INTO
  MVIS_DATA_NUM
SELECT
  ROWNUM TOOLID,
  MOD(ROWNUM,2) DATATYPE,
  NULL COMMISSIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) OPERATIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) COMMISSIONLISTPOS,
  LPAD('A',100,'A') DATA
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;
 
COMMIT;
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'MVIS_DATA_NUM',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)

With the testing table created with 100,000 rows, and statistics gathered for the table and its primary key index, I tried executing a query and then retrieving the execution plan for that query to determine whether the Predicate Information section of the execution plan provided any clues.  I executed the following; the first SQL statement retrieved one row, and the second SQL statement retrieved the execution plan for the first SQL statement:

SET LINESIZE 140
SET PAGESIZE 1000
 
SELECT
  DATA
FROM 
  MVIS_DATA_NUM
WHERE 
  TOOLID = N'112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

The execution plan output is as follows:

SQL_ID  gzzrppktqkbmu, child number 0
-------------------------------------
SELECT   DATA FROM   MVIS_DATA_NUM WHERE   TOOLID = N'112'   AND
DATATYPE = 0   AND COMMISSIONID IS NULL   AND OPERATIONID IS NULL   AND
COMMISSIONLISTPOS IS NULL
 
Plan hash value: 1080991
 
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| MVIS_DATA_NUM |     1 |   113 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | SYS_C0050817  |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("OPERATIONID" IS NULL AND "COMMISSIONLISTPOS" IS NULL AND
              "DATATYPE"=0 AND "COMMISSIONID" IS NULL))
   2 - access("TOOLID"=112)

Nothing too unusual appears in the above execution plan: the N’112′ portion of the SQL statement was simply changed by the query optimizer to the number 112, which permitted the use of the table’s primary key index.  Obviously, Oracle Database 11.2.0.2 contains a few optimizations that are not available in Oracle Database 9.2.x, so maybe the outcome would be different in Oracle Database 9.2.x.  However, when a number value is compared to a character (for example VARCHAR2) value, Oracle Database will attempt to implicitly convert the character value to a number value when performing the comparison, so the outcome should be the same on Oracle Database 9.2.x.
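As a quick check of that implicit conversion direction, here is a small sketch against the MVIS_DATA_NUM test table created above (this is not something that was run against the reported system).  When a plain character literal is compared to the NUMBER column, the literal, rather than the indexed column, should be the item that is converted:

-- The Predicate Information section should again show access("TOOLID"=112),
-- indicating that the literal '112' (and not the indexed NUMBER column) was
-- implicitly converted to a number, leaving the primary key index usable.
SELECT
  DATA
FROM
  MVIS_DATA_NUM
WHERE
  TOOLID = '112'
  AND DATATYPE = 0
  AND COMMISSIONID IS NULL
  AND OPERATIONID IS NULL
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));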

What if that TOOLID column were defined as VARCHAR?  Below is another test table with that column defined as VARCHAR2:

CREATE TABLE MVIS_DATA (
  TOOLID VARCHAR2(15),
  DATATYPE NUMBER,
  COMMISSIONID NUMBER,
  OPERATIONID NUMBER,
  COMMISSIONLISTPOS NUMBER,
  DATA VARCHAR2(100),
  PRIMARY KEY (TOOLID));
 
INSERT INTO
  MVIS_DATA
SELECT
  TO_CHAR(ROWNUM) TOOLID,
  MOD(ROWNUM,2) DATATYPE,
  NULL COMMISSIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) OPERATIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) COMMISSIONLISTPOS,
  LPAD('A',100,'A') DATA
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;
 
COMMIT;
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'MVIS_DATA',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)

With the new test table created, let’s try the SQL statement again.  A 10053 trace will be enabled in case you are interested in examining any automatic transformations performed on the SQL statement:

SET LINESIZE 140
SET PAGESIZE 1000
 
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10053V';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
 
SELECT
  DATA
FROM 
  MVIS_DATA 
WHERE 
  TOOLID = N'112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));
 
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

The first of the above SQL statements output one row.  Here is the execution plan that was output:

SQL_ID  5pkwzs079jwu2, child number 0
-------------------------------------
SELECT   DATA FROM   MVIS_DATA WHERE   TOOLID = N'112'   AND DATATYPE =
0   AND COMMISSIONID IS NULL   AND OPERATIONID IS NULL   AND
COMMISSIONLISTPOS IS NULL
 
Plan hash value: 353063534
 
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   227 (100)|          |
|*  1 |  TABLE ACCESS FULL| MVIS_DATA |   122 | 13908 |   227   (3)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("OPERATIONID" IS NULL AND "COMMISSIONLISTPOS" IS NULL
              AND "DATATYPE"=0 AND SYS_OP_C2C("TOOLID")=U'112' AND "COMMISSIONID" IS
              NULL))

Notice the SYS_OP_C2C function in the Predicate Information section of the execution plan; that is a clue that a performance problem might be lurking.  Also notice that the INDEX UNIQUE SCAN operation was replaced with a TABLE ACCESS FULL operation, which is another clue.  This section of the execution plan also indicates that the N’112′ portion of the SQL statement was changed to U’112′.  Consulting the 10053 trace file shows that the query optimizer rewrote the submitted SQL statement to the following:

SELECT
  "MVIS_DATA"."DATA" "DATA"
FROM
  "TESTUSER"."MVIS_DATA" "MVIS_DATA"
WHERE
  SYS_OP_C2C("MVIS_DATA"."TOOLID")=U'112'
  AND "MVIS_DATA"."DATATYPE"=0
  AND "MVIS_DATA"."COMMISSIONID" IS NULL
  AND "MVIS_DATA"."OPERATIONID" IS NULL
  AND "MVIS_DATA"."COMMISSIONLISTPOS" IS NULL

SYS_OP_C2C is an internal character set conversion function.

What happens to the execution plan if the N character is removed from the SQL statement?

SELECT
  DATA
FROM 
  MVIS_DATA 
WHERE 
  TOOLID = '112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

Below is the execution plan that was output:

SQL_ID  d70jxj3ypy60g, child number 0
-------------------------------------
SELECT   DATA FROM   MVIS_DATA WHERE   TOOLID = '112'   AND DATATYPE =
0   AND COMMISSIONID IS NULL   AND OPERATIONID IS NULL   AND
COMMISSIONLISTPOS IS NULL
 
Plan hash value: 1051843381
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| MVIS_DATA    |     1 |   114 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | SYS_C0050814 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("OPERATIONID" IS NULL AND "COMMISSIONLISTPOS" IS NULL AND
              "DATATYPE"=0 AND "COMMISSIONID" IS NULL))
   2 - access("TOOLID"='112')

Notice that the SYS_OP_C2C function does not appear in the Predicate Information section of the execution plan this time, and that the primary key index is used rather than a full table scan.  Unfortunately, the DBMS_XPLAN.DISPLAY_CURSOR function does not exist in Oracle Database 9.2.0.x; otherwise, the reason for the performance problem might have been much more readily apparent to the person who reported the issue to me.
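As a side note, a rough equivalent that should work in Oracle Database 9.2.0.x (DBMS_XPLAN.DISPLAY, unlike DISPLAY_CURSOR, was introduced in that release) is EXPLAIN PLAN combined with DBMS_XPLAN.DISPLAY, keeping in mind that EXPLAIN PLAN shows a predicted execution plan rather than the plan actually used at run time – consider this a sketch rather than something the person with the problem actually executed:

-- Assumes that a PLAN_TABLE exists (it may be created with utlxplan.sql).
-- EXPLAIN PLAN populates the PLAN_TABLE, and DBMS_XPLAN.DISPLAY formats the
-- most recent plan, including the Predicate Information section.
EXPLAIN PLAN FOR
SELECT
  DATA
FROM
  MVIS_DATA
WHERE
  TOOLID = N'112'
  AND DATATYPE = 0
  AND COMMISSIONID IS NULL
  AND OPERATIONID IS NULL
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);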

So, what is the purpose of that N character in the SQL statement?  I recall seeing similar literal-prefix syntax in the past, such as the following, which converts a character string to a date:

SELECT DATE'2015-02-08' FROM DUAL;
 
DATE'2015
---------
08-FEB-15

After a fair amount of digging through the Oracle documentation, I located the following note about that N character:

“The TO_NCHAR function converts the data at run time, while the N function converts the data at compilation time.”

Interesting.  That quote suggests that the author of the SQL statement may have been trying to convert ‘112’ to an NVARCHAR2 (or NCHAR).  Time for another test: the below script creates a table with the TOOLID column defined as NVARCHAR2, populates the table with 100,000 rows, and then collects statistics on the table and its primary key index:

CREATE TABLE MVIS_DATA_N (
  TOOLID NVARCHAR2(15),
  DATATYPE NUMBER,
  COMMISSIONID NUMBER,
  OPERATIONID NUMBER,
  COMMISSIONLISTPOS NUMBER,
  DATA VARCHAR2(100),
  PRIMARY KEY (TOOLID));
 
INSERT INTO
  MVIS_DATA_N
SELECT
  TO_CHAR(ROWNUM) TOOLID,
  MOD(ROWNUM,2) DATATYPE,
  NULL COMMISSIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) OPERATIONID,
  DECODE(MOD(ROWNUM,2),0,NULL,MOD(ROWNUM,2)) COMMISSIONLISTPOS,
  LPAD('A',100,'A') DATA
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;
 
COMMIT;
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'MVIS_DATA_N',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)

Executing the query against this table also returns one row:

SELECT
  DATA
FROM 
  MVIS_DATA_N 
WHERE 
  TOOLID = N'112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

The execution plan follows:

SQL_ID  1yuzz9rqkvnpv, child number 0
-------------------------------------
SELECT   DATA FROM   MVIS_DATA_N WHERE   TOOLID = N'112'   AND DATATYPE
= 0   AND COMMISSIONID IS NULL   AND OPERATIONID IS NULL   AND
COMMISSIONLISTPOS IS NULL
 
Plan hash value: 1044325464
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| MVIS_DATA_N  |     1 |   119 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | SYS_C0050815 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("OPERATIONID" IS NULL AND "COMMISSIONLISTPOS" IS NULL AND
              "DATATYPE"=0 AND "COMMISSIONID" IS NULL))
   2 - access("TOOLID"=U'112')

Notice in the above Predicate Information section that the SYS_OP_C2C function does not appear, and that the N’112′ portion of the SQL statement was still changed to U’112′.  The execution plan also shows that the primary key index was used this time, while a full table scan was required when the TOOLID column was defined as a VARCHAR2.

The person who reported the issue to me later provided the output of DESC MVIS_DATA, which indicated that the TOOLID column was in fact defined as a VARCHAR2 column.  If this person were running a more recent version of Oracle Database, he might be able to create a function based index that uses the SYS_OP_C2C function on the TOOLID column.  Such an index might look something like this:

CREATE INDEX IND_TOOLID_FIX ON MVIS_DATA (SYS_OP_C2C("TOOLID"));

Gathering statistics on the table and its indexes, executing the original SQL statement, and outputting the execution plan:

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'MVIS_DATA',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)
 
SELECT
  DATA
FROM 
  MVIS_DATA 
WHERE 
  TOOLID = N'112' 
  AND DATATYPE = 0 
  AND COMMISSIONID IS NULL 
  AND OPERATIONID IS NULL 
  AND COMMISSIONLISTPOS IS NULL;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

The first of the above queries output the expected one row, while the second query output the following execution plan:

SQL_ID  5pkwzs079jwu2, child number 1
-------------------------------------
SELECT   DATA FROM   MVIS_DATA WHERE   TOOLID = N'112'   AND DATATYPE =
0   AND COMMISSIONID IS NULL   AND OPERATIONID IS NULL   AND
COMMISSIONLISTPOS IS NULL
 
Plan hash value: 1497912695
 
----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| MVIS_DATA      |     1 |   125 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_TOOLID_FIX |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("OPERATIONID" IS NULL AND "COMMISSIONLISTPOS" IS NULL AND "DATATYPE"=0
              AND "COMMISSIONID" IS NULL))
   2 - access("MVIS_DATA"."SYS_NC00007$"=U'112')

In the Predicate Information section of the execution plan, notice the absence of the SYS_OP_C2C function on the TOOLID column; those values are pre-computed in the hidden virtual column (SYS_NC00007$) created for the function based index.  An index range scan is reported in the execution plan, rather than an index unique scan (the function based index was not declared as unique when it was created), but either way the SQL statement should execute much faster than the roughly 30 seconds required without the function based index.
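If there is any question about which hidden virtual column supports the function based index (the SYS_NC00007$ name will differ from one system to the next), the indexed expression and the hidden column may be confirmed with a couple of data dictionary queries similar to the following sketch (run as the table owner on the 11.2.0.2 test database):

-- Show the expression used by the function based index.
SELECT
  COLUMN_EXPRESSION
FROM
  USER_IND_EXPRESSIONS
WHERE
  INDEX_NAME = 'IND_TOOLID_FIX';
 
-- Show the hidden virtual column that Oracle created to support the index.
SELECT
  COLUMN_NAME,
  DATA_DEFAULT,
  HIDDEN_COLUMN,
  VIRTUAL_COLUMN
FROM
  USER_TAB_COLS
WHERE
  TABLE_NAME = 'MVIS_DATA'
  AND HIDDEN_COLUMN = 'YES';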

So, what are the options that were mentioned above?

  • Have the application programmer fix the SQL statements.
  • Upgrade to a version of Oracle Database (if that version is supported by the application) that supports the SYS_OP_C2C function, and create a function based index using that function.
  • If the TOOLID column only contains numbers, that column should be defined as NUMBER rather than VARCHAR2.  Consider redefining that column as a NUMBER datatype (a rough sketch of one approach appears after this list).
  • If that N character is always used when this column is referenced, that column probably should be defined as NVARCHAR2 rather than VARCHAR2.  Consider redefining that column as a NVARCHAR2 datatype.
  • Consider that the application is working as designed, and that it is nice to receive 30 second breaks every now and then.
  • Take a DUMP and share it with the application developer.  On second thought, such an approach may have limited success in helping to solve this problem.
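Regarding the third bullet point above, the following is a rough sketch (not something that was actually executed against the production database) of one way the TOOLID column might be converted to a NUMBER datatype.  It assumes that a maintenance window is available, that every TOOLID value is numeric, that no foreign keys reference the column, and that no application code depends on the column order:

-- Add a new NUMBER column and copy the existing character values into it.
ALTER TABLE MVIS_DATA ADD (TOOLID_NEW NUMBER);
 
UPDATE MVIS_DATA SET TOOLID_NEW = TO_NUMBER(TOOLID);
 
COMMIT;
 
-- Drop the old primary key (and its index), drop the function based index
-- from the earlier test if it was created, then swap the columns and
-- re-create the primary key on the new NUMBER column.
ALTER TABLE MVIS_DATA DROP PRIMARY KEY;
DROP INDEX IND_TOOLID_FIX;
ALTER TABLE MVIS_DATA DROP COLUMN TOOLID;
ALTER TABLE MVIS_DATA RENAME COLUMN TOOLID_NEW TO TOOLID;
ALTER TABLE MVIS_DATA ADD PRIMARY KEY (TOOLID);
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'MVIS_DATA',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)

Note that the replacement column is added at the end of the column list, so any code that relies on SELECT * column positions would need to be reviewed.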

Here is a little experiment with the DUMP function, which reveals Oracle’s internal representation of data – refer to the Internal Datatypes and Datatype Codes section of the Oracle documentation for help in decoding the Typ= values.

SELECT DUMP(112) A FROM DUAL;
 
A
---------------------
Typ=2 Len=3: 194,2,13
 
/* ------------------- */
SELECT DUMP(TO_CHAR(112)) A FROM DUAL;
 
A
---------------------
Typ=1 Len=3: 49,49,50
 
/* ------------------- */
SELECT DUMP('112') A FROM DUAL;
 
A
----------------------
Typ=96 Len=3: 49,49,50
 
/* ------------------- */
SELECT DUMP(N'112') A FROM DUAL;
 
A
----------------------------
Typ=96 Len=6: 0,49,0,49,0,50
 
/* ------------------- */
SELECT DUMP(SYS_OP_C2C('112'))  A FROM DUAL;
 
A
----------------------------
Typ=96 Len=6: 0,49,0,49,0,50
 
/* ------------------- */
SELECT DUMP(TO_NCHAR('112')) A FROM DUAL;
 
A
---------------------------
Typ=1 Len=6: 0,49,0,49,0,50

It is possibly interesting to note that the internal representation for N’112′ is CHAR (or NCHAR), while the internal representation for TO_NCHAR(‘112’) (and TO_NCHAR(112)) is VARCHAR2 (or NVARCHAR2).

This blog’s statistics indicate that the search engine search term Oracle NVARCHAR slow resulted in two page views of this blog yesterday.  I can’t help but wonder if the person who performed that search might have been helped by some of the above analysis.







