Monthly Archives: February 2012

Oracle SCN Bug

Oracle community seems  to be in some what confused state after Infoworld  published an article about Oracle SCN flaw. Questions like “Am I affected” or “When should I apply the patch”  are common.  Personally I think this is a low risk bug but applying the patch to mitigate the risk is highly recommended for following reasons

  1. The patch also fixes other SCN unpublished bugs. so this patch is highly recommended even if you are doing hot backups with RMAN.
  2. Best practice to apply latest security patch.
  3. The patch introduces new checks/threshold against SCN contamination from db links.   Oracle has not disclosed the  threshold but very useful if you are not running the SCN health check on a daily routine. With the patch you are protected from databases with SCNs close to soft limits   via db links.
.
Oracle SCN
.
System Change Number or SCN  is internal timestamp used by Oracle database to comply with ACID properties of database.  Every transaction in the database is assigned a  (unique) SCN value.  Generally the SCN values assigned for each transaction is supposed to be unique but  quite often we have noticed transactions with duplicate SCN values  mostly under high concurrency.
.
Oracle SCN is incremented by 16384 for every second since 1/1/1988. This number is also called soft Limit and increase every second.  The maximum value or hard limit of SCN supported by Oracle is 281 trillion, good enough to last 500 years. At any point in time, the database SCN cannot be greater than SCN Soft limit.  This might temporarily make the database unavailable or  may cause the  database to crash with ORA-600 errors. The database will start to function normally as you start to accumulate SCNs with time. Without any bugs, it is possible but rare to use up 16384 SCN’s in 1 second. In unusual circumstances  and if required, you can also fool the database by  changing the system date to sometime in future but this may introduce data issues.
.
SCN soft limit can be calculated with following query
.

COLUMN SCN FORMAT 999999999999999999999999999

SELECT (SYSDATE – TO_DATE(’01-JAN-1988′)) * 24*60*60 *16384 AS SCN FROM DUAL;

.

Here is simple example to understand working of SCN and for sake of simplicity , lets assume the current Value of SCN at 10:00:01 AM is X.

Click on the diagram for larger view.


Now lets talk about the hot backup bug that impacts 11g databases

The hot backup bug( 12371955 ) causes SCN values to jump very high causing it to  reach the soft limit earlier than how Oracle intended.  This bug is not exposed with hot RMAN backups. As per Oracle, the bug is categorized as  hot backup bug which seems to indicate that both  “ALTER DATABASE/TABLESPACE …. BEGIN/END BACKUP command  instead of  “ALTER DATABASE BEGIN/END BACKUP”  command only.    There are other unknown triggers that may cause this jump too.

Database links and SCN Jump

Now introduce databases interconnected via database link.  When databases are interconnected via database link, Oracle synchronizes the SCN value of the database with the highest SCN to all the other interconnected databases whenever they communicate with each other. So even if one of your databases is infected with this bug, it impacts all interconnected databases irrespective of the version of database with respect to propagating the SCN values.

From the above diagram, databases db1 to db8 are interconnected by db links. So  even if one of the database gets infected with the bug, it propagates to all the databases.

The Fix

Apply Oracle recommended PSU or CPU patch. The patch fix resolves  the hot backup bug and many other unknowns but you are still not fully protected from SCN hacks or other unknown triggers. So it is recommended to run SCN health check at regular intervals to prevent any runaway growth.  The patch also collects additional diagnostic information if the issues occur due to some other undocumented reason.  Refer to  metalink doc 1393363.1 to download the  SCN health check.

Please refer to https://swervedba.wordpress.com/2011/05/30/oracle-patch-sets-psu-and-cpu  for differences between CPU and PSU

Advertisements

Improving performance of GoldenGate Replicat

It has be quite few years that I have been working with GoldenGate.  I am really impressed with this replication tool.  Its very easy to install, manage and learn this tool. In this blog, I am writing about how to improve the performance  of replicat.

As time permits , I will be adding more information regarding this topic.

Replicate can operate in three modes.

  1. Source mode: Transaction is applied in the same order as they occur in source database
  2. Normal mode: Multiple transaction are grouped together and applied.
  3. Batch mode:  Multiple similar SQL statements from different transactions are grouped together and applied.

You get better performance with batch mode off course there are some caveats.  The performance improvement is more for smaller row changes than large row changes. It basically follows law of diminishing returns where in he performance improvement decreases as the size of row change increases.

Batch mode should be avoided when the chances of data collision is more because replicat reverts back to normal mode when errors are encountered. If errors still persists , Replicat will apply transaction as they occurred in source database.

Trace Analyzer Explained

I am writing this blog to  discuss my experience  with Trace Analyzer.  I frequently use this tool due to nature of my job.  Sometimes , trace analyzer  is a silver bullet and sometimes it is your DBA experience or instinct.  This tool is basically an enhanced version of tkprof that not only formats Oracle trace files generated by  setting of  10046 event. It also  provides  diagnostic information like top SQL, Hot IO blocks ,  IO wait summary and  SQL  Genealogy to list  few of them. The output generated is both in text and HTML format.

Installation.

Download  trca.zip from metalink. Refer to  metalink doc ID  [ID 224270.1] for detailed instructions. Once you unzip trca.zip , you will find the following folders along with ” instructions.txt”.

  1. dict
  2. doc
  3. install
  4. instructions.txt
  5. run
  6. utl

All you need to do is “cd install” and execute “tacreate.sql”. The script will create user called “trcanlzr“. You  will be prompted for password, defaulted tablespace and  temporary tablespace.  The tool can be uninstalled  by executing “tadrop.sql” i.e.. user “trcanlzr” is dropped. More detailed instructions can be found in “instructions.txt”

One key advantage of  “trcanlzr” schema is that it stores  diagnostic/detailed information of the trace files in its own schema and therefore you can write your own queries to mine whatever data you want. The trcanlzr schema also captures data dictionary information and stores it within the schema. The downside of storing all that data in the schema is that  running diagnostics on a trace file takes longer as data volume increases. Tell me about tuning the  tool itself.  I have resolved this slowness by backing up existing data , uninstall and then install the tool once again.

How to generate formatted output?

Logon to database as trcanlzr and execute “start trcanlzr tracfile_name” from the run directory of trace analyzer.  The tool will create formatted  text and HTML output in the database udump directory.  A zip file with formated text, HTML, error and tkprof  will also get created in the run folder.

About using the tool itself, I generally start with Top SQL, look at each of the queries, their execution plan , top wait events for the query, bind variables used, any segment contention etc.

.
Sample output

Top SQL