
June 2011

Index Rebuilds – A new Myth arrives

Just over 9 years ago now, before usenet died, I posted a list of common myths regarding the Oracle RDBMS. You can find that list and the ensuing discussion here. Number 2 in my list was

Free Space in an index is never reused.

At the time it was commonly thought that indexes needed regular rebuilding to avoid fragmentation. The best work I am aware of that describes why this is not the case is Richard Foote’s Index Internals: Rebuilding the Truth paper. Over time more and more people have come to quote this work and it is often cited in discussions on the forums at OTN and elsewhere. Recently, though, I have noticed a tendency for people to produce answers like this one, namely:

In Enterprise Manager, the section named “Segment Advisor” can help you

This, it seems to me, is a fundamental misunderstanding of what the segment advisor is showing you, and sadly it is my second myth rising anew from the dead. The official documentation on the segment advisor describes it as:

Over time, updates and deletes on objects within a tablespace can create pockets of empty space that individually are not large enough to be reused for new data. This type of empty space is referred to as fragmented free space.

Objects with fragmented free space can result in much wasted space, and can impact database performance. The preferred way to defragment and reclaim this space is to perform an online segment shrink. This process consolidates fragmented free space below the high water mark and compacts the segment. After compaction, the high water mark is moved, resulting in new free space above the high water mark. That space above the high water mark is then deallocated. The segment remains available for queries and DML during most of the operation, and no extra disk space need be allocated.

You use the Segment Advisor to identify segments that would benefit from online segment shrink. Only segments in locally managed tablespaces with automatic segment space management (ASSM) are eligible. Other restrictions on segment type exist. For more information, see “Shrinking Database Segments Online”.

If a table with reclaimable space is not eligible for online segment shrink, or if you want to make changes to logical or physical attributes of the table while reclaiming space, then you can use online table redefinition as an alternative to segment shrink. Online redefinition is also referred to as reorganization. Unlike online segment shrink, it requires extra disk space to be allocated. See “Redefining Tables Online” for more information.

In other words, the segment advisor is all about reclaiming space. But a B*-Tree index will always tend to have free space within it – and will nearly always be able to reuse that space. Fragmentation as described above, that is unusable free space, does not apply to B*-Tree indexes in general in Oracle (the point of my previous post was to show one situation where it can be an issue). To run the segment advisor on an index is therefore fundamentally misguided. As the next section describes, the advisor is really intended for objects that actually store data (most commonly heap tables).

The Segment Advisor generates the following types of advice:

  • If the Segment Advisor determines that an object has a significant amount of free space, it recommends online segment shrink. If the object is a table that is not eligible for shrinking, as in the case of a table in a tablespace without automatic segment space management, the Segment Advisor recommends online table redefinition.
  • If the Segment Advisor determines that a table could benefit from compression with the OLTP compression method, it makes a recommendation to that effect. (Automatic Segment Advisor only. See “Automatic Segment Advisor”.)
  • If the Segment Advisor encounters a table with row chaining above a certain threshold, it records the fact that the table has an excess of chained rows.
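If you suspect an index really does hold unreclaimed space, a more direct check than the segment advisor is INDEX_STATS. This is only a hedged sketch – the index name is illustrative, and VALIDATE STRUCTURE locks out DML on the underlying table, so don’t run it casually against a busy production system:

```sql
-- Illustrative only: validate the index and inspect deleted leaf rows
analyze index my_schema.my_index validate structure;

select lf_rows,                                          -- leaf rows present
       del_lf_rows,                                      -- deleted, not yet reused
       round(100 * del_lf_rows / nullif(lf_rows, 0), 1) pct_deleted
from   index_stats;  -- one row, for the index last analysed in this session
```

Even a high pct_deleted is not by itself a reason to rebuild – as Foote’s paper explains, that space is normally reused by subsequent inserts.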

Please let’s not reinvent the old myths – there’s plenty of room for new ones out there.

Why is my SYSAUX Tablespace so Big? Statistics_level=ALL

One of my most popular postings is about why your SYSTEM tablespace could be rather large. Recently I’ve had issues with a SYSAUX tablespace being considerably larger than I expected, so I thought I would do a sister posting on the reason.

The client I was working with at the time was about to go live with a new application and database. For various reasons I was a little anxious about how the Java application (the User Interface) would actually call the stored PL/SQL code I had helped develop. Initial workloads would be low and so I asked that the STATISTICS_LEVEL be set to ALL, so that bind variables (amongst other things) would be gathered. This is on version 10.2.0.4, btw, enterprise edition and 4-node RAC.

We went live, issues were encountered and resolved, the usual way these things work. Then, a few weeks in and when everything was still very “exciting” from a problem resolution perspective, I got an odd email from the DBA team. Would we like them to add another datafile to the SYSAUX tablespace? Huh? I checked. I’d been watching the size of our application’s tablespaces but not the others {well, I was not supposed to be a DBA and I was watching an awful lot of other things}. Our SYSAUX tablespace was around 160GB in size, having pretty much filled its fifth datafile. Why? I checked to see what was taking up the space in the tablespace:

 select * from
 (select owner,segment_name||'~'||partition_name segment_name,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSAUX'
 ORDER BY BLOCKS desc)
 where rownum < 40

OWNER              SEGMENT_NAME                                             SIZE_M
------------------ -------------------------------------------------- ------------
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3911            27,648
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3911         26,491
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3537            23,798
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3537         22,122
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4296            17,378
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4296         16,818
SYS                WRH$_ACTIVE_SESSION_HISTORY~WRH$_ACTIVE_14459270_3911        136
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3911                       96
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3537                       72
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_4296                       47
SYS                WRH$_LATCH_MISSES_SUMMARY_PK~WRH$_LATCH__14459270_3537        45
SYS                I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST~                              41
SYS                WRH$_SYSMETRIC_SUMMARY~                                      40
SYS                WRH$_LATCH_MISSES_SUMMARY_PK~WRH$_LATCH__14459270_           37

As you can see, almost all the space is being taken up by WRH$_LATCH_CHILDREN and WRH$_LATCH_CHILDREN_PK partitions; they are massive compared to the other objects. A quick google did not come up with much: many hits simply listing what is in SYSAUX, and the odd person also seeing SYSAUX filled up with these objects and suggesting ways to clear down space, but nothing about the cause.

I had a chat with the DBAs and we quickly decided that this was going to be something to do with AWR, given the names of the objects – “WRH$_” objects are the things underlying AWR. The DBA suggested my settings of 15-minute intervals and 35-day retention were too aggressive. I knew this was not the case; I’ve had more aggressive snapshot intervals and longer retention periods on far busier systems than this. I did not have access to Metalink at that point so I asked the DBAs to raise a ticket, which they duly did.
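For reference, the snapshot interval and retention under discussion can be checked and changed with DBMS_WORKLOAD_REPOSITORY. A sketch – both values are specified in minutes:

```sql
-- Current AWR settings
select snap_interval, retention from dba_hist_wr_control;

-- The settings in use on this system: 15-minute snapshots, 35-day retention
begin
  dbms_workload_repository.modify_snapshot_settings(
    interval  => 15,             -- minutes between snapshots
    retention => 35 * 24 * 60);  -- retention in minutes (35 days)
end;
/
```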

Oracle support cogitated for a couple of days and came back with the advice to reduce the retention period. Hmmmm. Via the DBA I asked Oracle support to explain why those objects were so large when I had not seen this issue on several other systems. Was it a bug? I had by now compared notes with a friend from a previous site with 5-minute snapshot intervals and a two-month retention period, and their SYSAUX tablespace was about 10GB all in. I did not want to go changing things if we did not know it would fix the issue, as we really wanted to stop the growth of SYSAUX as soon as possible, not just try a hunch.

As you probably realise from the title of this blog, the issue was not the snapshot interval or retention period but the STATISTICS_LEVEL=ALL. The one and only hit you get in metalink if you search on WRH$_LATCH_CHILDREN is note 874518.1. From V10.1.0.2 to V11.1.0.7 setting this parameter to ALL is known to create a lot of data about Latch children and not clear it down when the AWR data is purged (Bug 8289729). The advice was to change STATISTICS_LEVEL and make the snapshot interval larger. I’d suggest you just need to alter the STATISTICS_LEVEL, unless you really, really need that extra information gathered. It seemed to take Oracle Support an extra day or two to find that note for us. {I’ve since checked out Metalink directly to confirm all this}.

So with a known issue we felt confident that altering the initialisation parameter would solve the issue. It took a while for us to change the STATISTICS_LEVEL on the production system – Change Control for that site is rather robust. This allowed us to see some other impacts of this issue.
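The change itself is a one-liner. A hedged sketch of what we eventually ran – STATISTICS_LEVEL is dynamic on 10.2, and SID='*' applies it across all four RAC instances:

```sql
alter system set statistics_level = typical scope=both sid='*';
```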

The mmon process, which looks after AWR data, was becoming a top session in our OEM performance screens. In particular, a statement with SQL id 2prbzh4qfms7u that inserted into the WRH$_LATCH_CHILDREN table was taking several seconds to run each time and was running quite often {I include the SQL ID as it may be the same on many Oracle V10 systems, as it is internal code}:


The internal SQL inserting into wrh$_latch_children was becoming demanding

This was doing a lot of IO, by far the majority of the IO on our system at the time – it was a new system and we had been able to tune out a lot of the physical IO.


The physical IO requirements and 15-20 second elapsed time made this our most demanding statement on the system

We also now started to have issues with mmon running out of undo space when it ran at the same time as our daily load. This was particularly unfortunate, as it coincided with a period of “intense management interest” in the daily load…

What was happening to the size of the SYSAUX tablespace?

Enter the tablespace (or leave null)> sys

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
SYSAUX               alloc  58,187,904  465,503,232  33,553,408       14
                     free       10,728       85,824      21,504       20
SYSTEM               alloc     128,000    1,024,000   1,024,000        1
                     free       68,360      546,880     546,752        3

4 rows selected.

select * from
(select owner,segment_name||'~'||partition_name segment_name,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSAUX'
 ORDER BY BLOCKS desc)
where rownum < 40

OWNER    SEGMENT_NAME                                                     SIZE_M
-------- ------------------------------------------------------------ ----------
SYS      WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6201                     30262
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5817                     29948
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5435                     28597
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4675                     28198
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3911                     27648
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5817                  27144
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6585                     26965
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6201                  26832
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4675                  26741
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3911                  26491
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4296                     26307
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5435                  26248
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4296                  25430
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6585                  25064
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5058                     24611
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5058                  23161
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6966                      9209
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6966                   8462
         WRH$_SYSMETRIC_SUMMARY~                                             152
         WRH$_ACTIVE_SESSION_HISTORY~WRH$_ACTIVE_14459270_3911               136
         WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3911                               96

@sysaux_conts

OWNER                          OBJ_PART_NAME                                SIZE_M
------------------------------ ---------------------------------------- ----------
SYS                            WRH$_LATCH_CHILDREN-WRH                  231745.063
SYS                            WRH$_LATCH_CHILDREN_PK-WRH               215573.063
SYS                            WRH$_SQLSTAT-WRH                           711.0625
SYS                            WRH$_LATCH_MISSES_SUMMARY_PK-WRH           439.0625
SYS                            WRH$_ACTIVE_SESSION_HISTORY-WRH            437.0625
SYS                            WRH$_LATCH_PARENT-WRH                      292.0625
SYS                            WRH$_LATCH-WRH                             276.0625
SYS                            WRH$_LATCH_MISSES_SUMMARY-WRH              273.0625
SYS                            WRH$_SEG_STAT-WRH                          268.0625
SYS                            WRH$_LATCH_PARENT_PK-WRH                   239.0625
SYS                            WRH$_SYSSTAT_PK-WRH                        237.0625

Yes, that is close to half a terabyte of SYSAUX and it is all used. More partitions have appeared, and the sizes of the largest segments in SYSAUX show how WRH$_LATCH_CHILDREN and WRH$_LATCH_CHILDREN_PK make up the vast majority of the space used.

Shortly after, we finally got permission to change the live system. The impact was immediate, mmon dropped from being the most demanding session, that SQL code dropped down the rankings and the issues with running out of undo ceased.

I was anxious to see if the old data got purged, as the Metalink note had suggested the data would not be purged. Thankfully, that was not the case. The space was slowly released as normal purging of data outside the retention period took place and after just over a month, the SYSAUX tablespace contained a lot less information and was mostly free space:

OWNER      OBJ_PART_NAME                                SIZE_M
---------- ---------------------------------------- ----------
SYS        WRH$_LATCH_MISSES_SUMMARY_PK-WRH           512.0625
SYS        WRH$_LATCH_MISSES_SUMMARY-WRH              350.0625
SYS        WRH$_LATCH-WRH                             304.0625
SYS        WRH$_SQLSTAT-WRH                           280.0625
SYS        WRH$_LATCH_PK-WRH                          259.0625
SYS        WRH$_SYSSTAT_PK-WRH                        247.0625
SYS        WRH$_SERVICE_STAT_PK-WRH                   228.0625
SYS        WRH$_PARAMETER_PK-WRH                      201.0625
SYS        WRH$_PARAMETER-WRH                         169.0625
SYS        WRH$_SYSSTAT-WRH                           169.0625
SYS        WRH$_SEG_STAT-WRH                          161.0625
SYS        WRH$_SYSTEM_EVENT_PK-WRH                   156.0625
SYS        WRH$_SYSMETRIC_SUMMARY-                         152
SYS        WRH$_SYSTEM_EVENT-WRH                      133.0625
SYS        WRH$_SERVICE_STAT-WRH                      123.0625
SYS        WRH$_ACTIVE_SESSION_HISTORY-WRH            115.0625

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
SYSAUX               alloc  58,251,904  466,015,232  33,553,408       15
                     free   57,479,400  459,835,200   4,063,232    1,208
SYSTEM               alloc     128,000    1,024,000   1,024,000        1
                     free       68,048      544,384     544,320        2

Now, how do we get that space back? I left that with the DBA team to resolve.
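For anyone facing the same clean-up: since the WRH$_ segments are SYS-owned and not candidates for online shrink, the realistic route is to wait for the purge (as above) and then resize the SYSAUX datafiles down. A sketch – the file name and target size below are purely illustrative:

```sql
-- Find the SYSAUX datafiles and their current sizes
select file_name, bytes/1024/1024 size_m
from   dba_data_files
where  tablespace_name = 'SYSAUX';

-- Then shrink each file; ORA-03297 means used extents still sit
-- beyond the requested size, so try a larger value
alter database datafile '/u01/oradata/PROD/sysaux01.dbf' resize 10g;
```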

Oh, one last thing. I mentioned the above to a couple of the Oaktable lot in the pub a few weeks back. Their instant response was to say “You set STATISTICS_LEVEL to ALL on a live system?!? You are mad!”
:-)

{Update, I’ve just spotted this posting by Colbran which is related. Last time I googled this I just got a stub with no information}

Tuning Log and Trace levels for Clusterware 11.2

With the introduction of Clusterware 11.2 a great number of command line tools have either been deprecated ($ORA_CRS_HOME/bin/crs_* and others) or merged into other tools. This is especially true for crsctl, which is now the tool to access and manipulate low level resources in Clusterware.

This also implies that some of the notes on Metalink are no longer applicable to Clusterware 11.2, such as the one detailing how to get more detailed information in the logs. Not that the log information wasn’t already rather comprehensive if you asked me…

And here comes a warning: don’t change the log levels unless you have a valid reason, or under the instructions of support. Higher log levels than the defaults tend to generate too much data, filling up the GRID_HOME and potentially killing the node.

Log File Location

The location for logs in Clusterware hasn’t changed much since the unified log structure was introduced in 10.2 and documented in “CRS and 10g/11.1 Real Application Clusters (Doc ID 259301.1)”. It has been extended though, and quite dramatically so in 11.2, which is documented as well in one of the better notes from support: “11gR2 Clusterware and Grid Home – What You Need to Know (Doc ID 1053147.1)”

The techniques for getting debug and trace information as described, for example, in “Diagnosability for CRS / EVM / RACG (Doc ID 357808.1)” don’t really apply any more, as the syntax has changed.

Getting Log Levels in 11.2

If you are interested in which log level a specific Clusterware resource operates at, you can use the crsctl get log resource resourceName call, as in this example:

# crsctl get log res ora.asm
Get Resource ora.asm Log Level: 1

Don’t forget to apply the “-init” flag when you want to query resources which are part of the lower stack:

[root@lonengbl312 ~]# crsctl get log res ora.cssd
CRS-4655: Resource 'ora.cssd' could not be found.
CRS-4000: Command Get failed, or completed with errors.
[root@lonengbl312 ~]# crsctl get log res ora.cssd -init
Get Resource ora.cssd Log Level: 1

Interestingly, most Clusterware daemons start leaving a detailed message in their log file as to which module uses what logging level. Take CSSD for example:

2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = CSSD, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCNM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCGM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCCM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = CLSF, LogLevel = 0, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = SKGFD, LogLevel = 0, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GPNP, LogLevel = 1, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = OLR, LogLevel = 0, TraceLevel = 0

This doesn’t match the output of the previous command. Now how do you check this on the command line? crsctl to the rescue again – it now has a “log” option:

# crsctl get log -h
Usage:
  crsctl get {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>,..."
 where
   mdns          multicast Domain Name Server
   gpnp          Grid Plug-n-Play Service
   css           Cluster Synchronization Services
   crf           Cluster Health Monitor
   crs           Cluster Ready Services
   ctss          Cluster Time Synchronization Service
   evm           EventManager
   gipc          Grid Interprocess Communications
   <name1>, ...  Module names ("all" for all names)

  crsctl get log res <resname>
 where
   <resname>     Resource name

That’s good information, but the crucial bit about the <name> and other parameters is missing. The next question I asked myself was: how do I find out which sub-modules are part of, say, crsd, apart from looking at the trace file? That information used to be documented in the OTN reference, but it’s now something you can query from Clusterware as well:

[root@lonengbl312 ~]# crsctl lsmodules -h
Usage:
  crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
 where
   mdns  multicast Domain Name Server
   gpnp  Grid Plug-n-Play Service
   css   Cluster Synchronization Services
   crf   Cluster Health Monitor
   crs   Cluster Ready Services
   ctss  Cluster Time Synchronization Service
   evm   EventManager
   gipc  Grid Interprocess Communications

Back to the question as to which modules make up CSSD, I got this answer from Clusterware, matching the log file output:

# crsctl lsmodules css
List CSSD Debug Module: CLSF
List CSSD Debug Module: CSSD
List CSSD Debug Module: GIPCCM
List CSSD Debug Module: GIPCGM
List CSSD Debug Module: GIPCNM
List CSSD Debug Module: GPNP
List CSSD Debug Module: OLR
List CSSD Debug Module: SKGFD

To get detailed information about the log level of a CSSD module, I can now finally use the crsctl get {log|trace} daemon:module command. My CSSD logfile stated that CSSD:OLR had log level 0 and trace level 0, confirmed by Clusterware:

# crsctl get log css "OLR"
Get CSSD Module: OLR  Log Level: 0

Note that the name of the module has to be in upper case. If you are lazy like me you could use the “all” keyword instead of the module name to list all the information in one command:

# crsctl get log css all
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0

Setting Log Levels

Let’s have a look at how to increase log levels for the critical components. I am picking CSSD here, simply because I want to know if Oracle records the chicken-and-egg problem of having voting files in ASM in the ocssd.log file (CSSD needs the voting files to start, and ASM needs CSSD to run, but if the voting files are in ASM, then we have gone full circle). With all the information provided in the section above, that’s actually quite simple to do.

# crsctl set log -h
Usage:
  crsctl set {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>=<lvl1>,..."
 where
   mdns          multicast Domain Name Server
   gpnp          Grid Plug-n-Play Service
   css           Cluster Synchronization Services
   crf           Cluster Health Monitor
   crs           Cluster Ready Services
   ctss          Cluster Time Synchronization Service
   evm           EventManager
   gipc          Grid Interprocess Communications
   <name1>, ...  Module names ("all" for all names)
   <lvl1>, ...   Module log/trace levels

  crsctl set log res <resname>=<lvl>
 where
   <resname>     Resource name
   <lvl>         Agent log levels

The object I would like to research is the voting file discovery. I know that the ASM disks have an entry in the header indicating whether or not a disk contains a voting file, in the kfdhdb.vfstart and kfdhdb.vfend fields as shown here:

# kfed read /dev/oracleasm/disks/OCR1  | grep kfdhdb.vf
kfdhdb.vfstart:                     256 ; 0x0ec: 0x00000100
kfdhdb.vfend:                       288 ; 0x0f0: 0x00000120

With the default log information, I can see the voting disk discovery happening as shown in this cut down version of the ocssd.log:

2011-06-02 09:06:19.941: [    CSSD][1078901056]clssnmvDDiscThread: using discovery string  for initial discovery
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Discovery with str::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]UFS discovery with ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]OSS discovery with ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Discovery with asmlib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: str ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Fetching asmlib disk :ORCL:OCR1:

2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d2a60 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR1:
2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d3290 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR2:
2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d3ac0 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR3:

2011-06-02 09:06:19.953: [    CSSD][1078901056]clssnmvDiskVerify: discovered a potential voting file
2011-06-02 09:06:19.953: [   SKGFD][1078901056]Handle 0x5c22670 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk  :ORCL:OCR1:
2011-06-02 09:06:19.954: [    CSSD][1078901056]clssnmvDiskVerify: Successful discovery for disk ORCL:OCR1, UID 4700d9a1-93094ff5-bf8b96e1-7efdb883, Pending CIN 0:1301401766:0, Committed CIN 0:1301401766:0
2011-06-02 09:06:19.954: [   SKGFD][1078901056]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x5c22670 for disk :ORCL:OCR1:

2011-06-02 09:06:19.956: [    CSSD][1078901056]clssnmvDiskVerify: Successful discovery of 3 disks
2011-06-02 09:06:19.956: [    CSSD][1078901056]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2011-06-02 09:06:19.957: [    CSSD][1078901056]clssnmCompleteVFDiscovery: Completing voting file discovery

Now I’m setting the new log level for SKGFD, as this seems to be responsible for the disk discovery.

# crsctl set log css "SKGFD:5"
Set CSSD Module: SKGFD  Log Level: 5

Does it make a difference? Restarting Clusterware should give more clues. Looking at the logfile I could tell there was no difference. I tried using the trace level now for the same module:

# crsctl get trace css SKGFD
Get CSSD Module: SKGFD  Trace Level: 5

Interestingly that wasn’t picked up!

$ grep "Module = SKGFD" ocssd.log | tail -n1
2011-06-02 11:31:06.707: [    CSSD][427846944]clsu_load_ENV_levels: Module = SKGFD, LogLevel = 2, TraceLevel = 0

So I decided to leave it at this point, as I was running out of time for this test. I’d be interested if anyone has successfully raised the log level for this component; on the other hand, I wonder if it is ever necessary to do so. I can confirm, however, that raising the log level for the CSSD module created a lot more output – actually so much that it seems to equal a system-wide SQL trace – so don’t change log levels unless support directs you to.


Insert Error on Oracle 11g but Not on 10g

June 2, 2011 The most popular article on this blog includes a script that demonstrates generating a deadlock on Oracle Database 11g R1 and R2 when the exact same script on Oracle Database 10g R2 does not trigger a deadlock.  I came across another interesting change between Oracle Database 10.2.0.5 and 11.2.0.2 (the change also [...]

Micron C400 SSD Arrives

I am probably the biggest fan of Micron SSDs in Japan:

So I received the latest C400 from ST Trade Co., Ltd. (株式会社エスティ トレード), Micron’s authorized distributor:

PCI Express 3.0 Samples Reportedly Shipping in September

I went to Computex TAIPEI.

Here is a photo of a PCIe Gen3 motherboard. From what I heard, the spec will only be handed to motherboard makers in September, so this is just a sample. (If you enlarge the photo, you can see “PCI Express 3.0” printed on the board.)

audsid

Here’s an example of how the passing of time can allow a problem to creep up on you.

A recent client had a pair of logon/logoff database triggers to capture some information that wasn’t available in the normal audit trail, and they had been using these triggers successfully for many years. But one day they realised that the amount of redo generated per hour had become rather large – and had actually been quite bad, and getting worse, over the last few months for no apparent reason. (You’ve heard this one before … “honest, guv, nothing has changed”.)

Here’s a clue to their problem. Their logoff trigger updated their own auditing table with the following code (although I’ve removed some columns):

begin
        if sys_context('USERENV', 'SESSIONID') = 0 then
                update system.user_audit a
                set
                        logoff_time = sysdate
                where    audsid = sys_context('USERENV', 'SESSIONID')
                and     logon_time = (
                                select  max(logon_time)
                                from    system.user_audit b
                                where   b.audsid = a.audsid
                        );
        else
                update system.user_audit
                set
                        logoff_time = sysdate
                where   audsid = sys_context('USERENV', 'SESSIONID');
        end if;
end;

The table was suitably indexed to make the correlated update (“most recent of” subquery) operate very efficiently, so that wasn’t the problem.

You might question the validity of updating just the rows with the most recent date when sys_context('userenv','sessionid') is zero, and why the SQL to do the update doesn’t then use literal zeros rather than calling the sys_context() function and using a correlation column – but those are minor details.

You might wonder why zero is a special case, of course, but then you may recall that when a user connects as SYS the audsid is zero – so this code is attempting to limit the update to just the row created by the most recent logon by SYS, which may introduce a few errors but really SYS shouldn’t be connecting to a production system very often.

At this point you might ask which version of Oracle the client was running. They had started using Oracle a long time ago, but this system had gone through Oracle 9i, and was now running 10g; and if you’ve kept an eye on things like autotrace you may have noticed that the audsid used by any SYS session changed from zero to 4294967295 in the upgrade to 10g. So this code is no longer treating SYS as a special case.

By the time I was on site, the client’s audit table held about 87,000 rows for the audsid 4294967295, and every time SYS logged off the session would update every single one of them. (It took a few seconds and about 45 MB of redo for SYS to log off – and SYS sessions connected rather more frequently than expected).
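A quick way to see the behaviour change the client fell foul of, and a version-safe fix, is sketched below (the magic number is the one quoted above; run the query as SYS):

```sql
-- As SYS on 10g this returns 4294967295; on 8i/9i it returned 0
select sys_context('USERENV', 'SESSIONID') audsid from dual;

-- A version-safe special case in the trigger would test both values:
-- if sys_context('USERENV','SESSIONID') in (0, 4294967295) then ...
```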

Warning:

If you still have to get through the upgrade from 8i or 9i to 10g, then watch out for code that does things with the deprecated userenv('sessionid') or sys_context('userenv','sessionid').

Paralyzed by a Full To-Do List…

The subject of to-do lists has come up in conversation a few times recently. I am a big fan of them. In fact, I find it kinda difficult to keep track of stuff without them. Over the years I have observed a specific trait in myself, which is a kind of mental paralysis brought on by an overly full to-do list. It doesn’t relate to the complexity of the tasks, just the number of them. Put enough silly little things together and I start to go into meltdown. My current to-do list is getting very full and I’m getting close to critical mass…

The trick for me is to move a few quick things to the top of the list. It sounds dumb and I know it’s not going to affect the total amount of work that needs doing, but there is something so gratifying about ticking items off the list.

I think I can halve the number of entries in a couple of days, and then maybe I can deal with something that requires more substantial effort.

Cheers

Tim…




A more user friendly multipath.conf

During some recent work I did involving a stretched 11.2.0.2 RAC for an SAP implementation at a customer site, I researched TimeFinder/Clone backups. As part of this exercise I have been able to experiment with RHEL (OEL) 5.6 and the new device-mapper multipath package on the mount host. I have been very pleasantly surprised by this new feature, which I’d like to share.

Background of this article

Device Mapper Multipath is the “native” Linux multipathing software, as opposed to vendor-supplied multipathing such as EMC’s PowerPath or Hitachi’s HDLM.

My customer’s setup is rather unique for a SAP environment as it uses Oracle Enterprise Linux and not Solaris/SPARC or AIX on the Power platform with an active/passive solution. Well if that doesn’t make it sound unique, the fact that there is a plan to run Oracle 11.2.0.2 RAC potentially across sites using ASM and ACFS certainly makes this deployment stand out from the rest.

The reason I mention this is simple: it requires a lot of engineering effort to certify components for this combination. For example, it was not trivial to get vendor support for Solutions Enabler, which the storage engineering team uses for connectivity with the VMAX arrays, and so on. After all, Oracle Enterprise Linux is a fairly new platform, and Red Hat certainly has an advantage when it comes to vendor certification.

What hadn’t been achieved was certification of the EMC PowerPath software for use with SAP on Linux, for reasons unknown to me. The result was simple: the setup would use the device-mapper multipath package that comes with the Linux distribution.

Configuring multipath

Now, with this said, I started looking at MOS for relevant support notes about Linux’s native multipath and found some; I have summarised that research in earlier articles on this blog.

What I didn’t know until now was that the multipath.conf file allows you to define the ownership and mode of a device. As an ASM user this is very important to me: experience has taught me that incorrect permissions are one of the main reasons for ASM failing to start. Remember that root owns block devices by default.

Consider this example:

multipaths {
    ...
    multipath {
        wwid    360a98000486e58526c34515944703277
        alias   ocr01
        mode    660
        uid     501
        gid     502
    }
    ...
}

The above entry in the multipaths section translates into English as follows:

  • If you encounter a device with the WWID 36a…277
  • Give it an alias of “ocr01” in /dev/mapper
  • Set the mode to 660 (i.e. rw-rw—)
  • Assign device ownership to the user with UID 501 (maps to “grid” on my system)
  • Assign the group of the device to 502 (maps to “asmdba” on my system)

The path settings are defined globally and don’t need to be mentioned explicitly for each device unless you prefer to override them. I like to use an alias, although it isn’t strictly necessary since ASM relies on a 4k header in the block device to store its identity. If you choose not to alias a device, I recommend you use the user-friendly name instead, mainly for aesthetic reasons.
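To check that the settings have taken effect, something along these lines works. A sketch only: run as root, and the exact reload command varies between multipath-tools versions:

```shell
# Reload the multipath maps after editing /etc/multipath.conf
service multipathd reload          # or: multipathd -k"reconfigure"

# Show the paths behind the alias, then confirm owner/group/mode
multipath -ll ocr01
ls -l /dev/mapper/ocr01            # expect something like: brw-rw---- grid asmdba
```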

Why is this really cool? For me it implies two things:

  • I can now be independent of ASMLib, which provided device name stability and correct permissions on the block devices for ASM
  • I don’t need to create udev rules to set the permissions (or /etc/rc.local or whichever other way you chose before to set permissions)

Nice! One less headache, as I have to say that I didn’t really like udev…