Oakies Blog Aggregator

iASH–my “infinite ASH” routine

I love Active Session History (ASH) data because a lot of the work I’ve done in my consulting life was “after the fact” diagnosis.  By this I mean that many of us have been in a similar circumstance, where you learn of a problem not while it is occurring, but only when you contact the customer for some other, potentially unrelated reason.  At which point you will hear that dreaded sentence:

“Yeah, the Order Entry screen was really slow a couple of hours ago”

And this is where ASH is an awesome resource.  With the ASH data available, there is a good chance you will be able to diagnose the issue without having to make an embarrassing request for the customer to repeat the task so that you can trace the underlying database activity.  Because no-one likes to be the person that says:

“Yeah that performance must have really sucked for you … Hey, let’s do it again!”

But even ASH has its limitations, because sometimes the customer’s sentence is phrased slightly differently:

“Yeah, the Order Entry screen was really slow last Tuesday”

In this case, it is unlikely that the ASH data will still be available.  Whilst a subset of the invaluable ASH data is retained in DBA_HIST_ACTIVE_SESS_HISTORY, I would prefer to keep the complete set of ASH data available for longer than is practical in memory.  So I wrote myself a simple little routine that keeps all of the ASH data around for longer.  It’s hardly rocket science – just a little partitioned table to capture V$ACTIVE_SESSION_HISTORY at regular intervals.  Let’s walk through it so you can understand it and adapt it for your own use.

First I’ll create a partitioned table to hold the ASH data.  I’m using partitioning to avoid the need to index the table, so the insertion cost is minimal.  I’m partitioning by day and the code assumes this, so take care if you intend to modify it.


SQL> create table ash_hist
  2  partition by range (sample_time)
  3  interval( numtodsinterval(1,'day'))
  4  (partition p1 values less than (timestamp' 2017-01-01 00:00:00'))
  5  as select * from sys.gv_$active_session_history;

Table created.

Here is my procedure to capture the data.  The essentials of the routine are:

  • Starting with the most recent partition, find the last recorded entry in ASH_HIST.  We’ll look back up to 10 days to find our starting point (hence the daily partitions).
  • If there is no data for the last 10 days, we’ll bomb out, because we haven’t been running the routine frequently enough.
  • Copy all the ASH data from this point to now into ASH_HIST using a nice efficient INSERT-APPEND, but we’ll skip the session that is doing the copying. (You can include it if you want just by removing line 8)
  • Once per week (you can control this by tinkering with the IF conditions on line 34) we’ll drop the oldest partitions.  By default I keep 90 days, but you can set this by altering “l_retention” on line 5.

SQL>
SQL> CREATE OR REPLACE procedure save_ash_hist is
  2    l_hi_val_as_string varchar2(1000);
  3    l_hi_val_as_date   date;
  4    l_max_recorded     timestamp;
  5    l_retention        number := 90;
  6
  7  begin
  8    dbms_application_info.set_module('$$SAVE_ASH$$','');
  9    -- we are looping to take advantage
 10    -- of partition elimination
 11
 12    for i in 0 .. 10 loop
 13       select max(sample_time)
 14       into   l_max_recorded
 15       from   ash_hist
 16       where  sample_time > systimestamp - i;
 17
 18       exit when l_max_recorded is not null;
 19    end loop;
 20
 21    if l_max_recorded is null then
 22      raise_application_error(-20000,'No max sample time within 10 days');
 23    end if;
 24    dbms_output.put_line('Last copied time was '||l_max_recorded);
 25
 26    insert /*+ append */ into ash_hist
 27    select *
 28    from sys.gv_$active_session_history
 29    where sample_time > l_max_recorded
 30    and   ( module != '$$SAVE_ASH$$' or module is null );
 31    dbms_output.put_line('Copied '||sql%rowcount||' rows');
 32    commit;
 33
 34    if to_char(sysdate,'DYHH24') between 'TUE01' and 'TUE06' then
 35
 36      begin
 37        execute immediate 'alter table ash_hist set interval ()';
 38      exception
 39        when others then null;
 40      end;
 41      execute immediate 'alter table ash_hist set interval (NUMTODSINTERVAL(1,''DAY''))';
 42
 43      for i in ( select *
 44                 from   user_tab_partitions
 45                 where  table_name = 'ASH_HIST'
 46                 and    partition_position > 1
 47                 order by partition_position )
 48      loop
 49        l_hi_val_as_string := i.high_value;
 50        execute immediate 'select '||l_hi_val_as_string||' from dual' into l_hi_val_as_date;
 51
 52        if l_hi_val_as_date < sysdate - l_retention then
 53          execute immediate 'alter table ash_hist drop partition '||i.partition_name;
 54        else
 55          exit;
 56        end if;
 57
 58      end loop;
 59    end if;
 60  end;
 61  /

Procedure created.

And that is all there is to it.  Each time we run the procedure, we’ll grab all the ASH data since the last time we ran and keep it in ASH_HIST.


SQL>
SQL> select count(*) from ash_hist;

  COUNT(*)
----------
       792

1 row selected.


SQL>
SQL> exec save_ash_hist

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from ash_hist;

  COUNT(*)
----------
       819

1 row selected.

A simple scheduler job to run the routine every couple of hours (I’m assuming your SGA holds at least 2 hours of samples in V$ACTIVE_SESSION_HISTORY – if not, you’d need to adjust the frequency) and you’re off and running.


SQL>
SQL>
SQL> BEGIN
  2      dbms_scheduler.create_job (
  3         job_name           =>  'ASH_CAPTURE',
  4         job_type           =>  'PLSQL_BLOCK',
  5         job_action         =>  'save_ash_hist;',
  6         start_date         =>  CAST((TRUNC(SYSDATE,'HH') + (1/24) + (55/24/60)) AS TIMESTAMP), -- job commences at 55 mins past the next hour
  7         repeat_interval    =>  'FREQ=HOURLY; INTERVAL=2',
  8         enabled            =>  true,
  9         comments           =>  'Permanent record of ASH data');
 10  END;
 11  /

PL/SQL procedure successfully completed.
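
Once the data starts accumulating, answering the “last Tuesday” style of question becomes a simple query against ASH_HIST. The following is just a usage sketch (the time window is an arbitrary example); because the table is partitioned by SAMPLE_TIME, the predicate prunes down to the relevant daily partitions, and each retained sample represents roughly one second of active session time:

select sql_id, event, count(*) samples
from   ash_hist
where  sample_time between timestamp '2017-12-05 09:00:00'
                       and timestamp '2017-12-05 10:00:00'
group by sql_id, event
order by samples desc;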

You can assume all of the standard disclaimers here. Use at own risk, blah blah, no warranty, blah blah, etc

The hang manager

Recently I was looking in the trace directory in the diag dest of my (12.2.0.1) instance, and found some trace files which were different from the others:

$ ls | grep dia0
test_dia0_9711_base_1.trc
test_dia0_9711_base_1.trm
test_dia0_9711_lws_1.trc
test_dia0_9711_lws_1.trm
test_dia0_9711.trc
test_dia0_9711.trm

The dia0 ‘test_dia0_9711.trc’ file is fully expected. But what are these ‘..lws_1.trc’ and ‘..base_1.trc’ files? And ‘base’ is something that I understand, but what would ‘lws’ mean? Lunatics Without Skateboards?

First, let’s look into the normally named trace file of the dia0 process:

$ cat test_dia0_9711.trc
HM: Early Warning and/or Long Waiting Session (LWS) is enabled.
    The output can be found in trace files of the form:
    test_dia0_9711_lws_n.trc where n is a number between 1 and 5.

*** 2017-11-26T12:18:25.849759+00:00
HM: Moving the default trace file to: test_dia0_9711_base_1.trc


*** 2017-11-26T16:45:33.193729+00:00
HM: Tracing moved from base trace file: test_dia0_9711_base_1.trc

HM: This is a single non-RAC instance.  Hang detection is enabled.
HM: Hang resolution scope is OFF.
HM: Cross Boundary hang detection is enabled.
HM: Cross Cluster hang detection is enabled.
HM: CPU load is not being monitored.
HM: IO load is NOT abnormally high - 0% outliers.

HM: Tracing is now being performed to this file.
2017-11-26 16:45:33.193 :kjzdshn(): Shutting down DIAG

Ah! So LWS means ‘Long Waiting Sessions’. This seems to be functionality that produces trace files for executions which do not perform optimally.

Let’s look into the ‘base’ trace file:

*** 2017-11-26T12:18:25.849773+00:00
HM: Tracing moved from default trace file: test_dia0_9711.trc

HM: This is a single non-RAC instance.  Hang detection is enabled.
HM: Hang resolution scope is OFF.
HM: Cross Boundary hang detection is enabled.
HM: Cross Cluster hang detection is enabled.
HM: CPU load is not being monitored.
HM: IO load is NOT abnormally high - 0% outliers.

*** 2017-11-26T12:18:25.849903+00:00
HM: Early Warning and/or Long Waiting Session (LWS) is enabled.
    The output can be found in trace files of the form:
    test_dia0_9711_lws_n.trc where n is a number between 1 and 5.

*** 2017-11-26T12:18:30.870494+00:00
HM: Detected incompatible setting 'INSTANCE' for Hang Resolution Scope
    - disabling hang resolution.

I get the feeling this was originally made for RAC, but now also works for single instance. Some RAC-related features seem to be turned off. It seems that the actual session information is stored in the ‘lws’ trace files.

This is a sample of what is in a lws trace file:

*** 2017-11-26T13:14:25.053171+00:00
HM: Early Warning - Session ID 58 serial# 53774 OS PID 30760 (FG) is
     'not in a wait' for 33 seconds
     last waited on 'resmgr:cpu quantum'
     p1: 'location'=0x3, p2: 'consumer group id'=0x4a2f, p3: ' '=0x0

                                                     IO
 Total  Self-         Total  Total  Outlr  Outlr  Outlr
  Hung  Rslvd  Rslvd   Wait WaitTm   Wait WaitTm   Wait
  Sess  Hangs  Hangs  Count   Secs  Count   Secs  Count Wait Event
------ ------ ------ ------ ------ ------ ------ ------ -----------
     1      0      0      0      0      0      0      0 not in wait


HM: Dumping Short Stack of pid[71.30760] (sid:58, ser#:53774)
Short stack dump:
ksedsts()+346<-ksdxfstk()+71<-ksdxcb()+912<-sspuser()+217<-__sighandler()<-sslnxadd()+38<-pfrinstr_ADDN()+103<-pfrrun_no_tool()+60<-pfrrun()+919<-plsql_run()+756<-peicnt()+288<-kkxexe()+717<-opiexe()+22531<-kpoal8()+2679<-opiodr()+1229<-ttcpip()+1257<-opitsk()+1940<-opiino()+941<-opiodr()+1229<-opidrv()+1021<-sou2o()+145<-opimai_real()+455<-ssthrdmain()+417<-main()+262<-__libc_start_main()+245

HM: Current SQL: declare
        x number := 0;
begin
        for c in 1..1000000000 loop
                x := x + c;
        end loop;
end;

I had a sample PLSQL loop running to generate test load (which is really silly), and this running for 33 seconds apparently generates an ‘Early Warning’. The trace shows what the session was doing from a wait-based analysis perspective (‘not in a wait’, ‘resmgr:cpu quantum’), it shows some statistics for the hang analyser, and it dumps a short stack, which is fully expected: ksedsts() to __sighandler() are the routines for generating the stack dump; the function that was actually executing was sslnxadd(), a system-specific linux add function, which is called by pfrinstr_ADDN(), the PLSQL add function, which is called by pfrrun_no_tool(), the PLSQL fast interpretation loop function, and so on. The last thing it dumps is the current SQL, which is the silly PLSQL code.

Is there anything inside the Oracle database that shows anything related to this? Yes, there are a few related views:

V$HANG_INFO
V$HANG_SESSION_INFO
V$HANG_STATISTICS
DBA_HANG_MANAGER_PARAMETERS

Doing some simple tests, I could not get V$HANG_INFO and V$HANG_SESSION_INFO populated. V$HANG_STATISTICS contains 87 statistics and displays current data, most of which are 0 in my single instance test database.
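
For reference, the sort of query I used to look at the statistics was along these lines – just a sketch, assuming the view exposes the NAME and VALUE columns it showed on my 12.2 instance (describe the view on your own version first):

-- list only the hang statistics that currently have a non-zero value
select name, value
from   v$hang_statistics
where  value != 0;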

One thing that confused me at first was the abbreviation ‘HM’ for hang monitor, because the very same abbreviation is used in the diagnostic repository for ‘health monitor’.

It seems the hang manager is quite an extensive framework for detecting hangs and hang-related issues. I peeked at the documented and undocumented parameters containing “hang”, of which there is quite an impressive number, most of them quite likely belonging to the hang manager framework:

SQL> @parms
Enter value for parameter: hang
Enter value for isset:
Enter value for show_hidden: Y
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
_global_hang_analysis_interval_secs                10                                                                     TRUE     FALSE      FALSE
_hang_allow_resolution_on_single_nonrac            FALSE                                                                  TRUE     FALSE      FALSE
_hang_analysis_num_call_stacks                     3                                                                      TRUE     FALSE      FALSE
_hang_base_file_count                              5                                                                      TRUE     FALSE      FALSE
_hang_base_file_space_limit                        100000000                                                              TRUE     FALSE      FALSE
_hang_blocked_session_percent_threshold            10                                                                     TRUE     FALSE      FALSE
_hang_bool_spare1                                  TRUE                                                                   TRUE     FALSE      FALSE

_hang_bool_spare2                                  TRUE                                                                   TRUE     FALSE      FALSE
_hang_cross_boundary_hang_detection_enabled        TRUE                                                                   TRUE     FALSE      FALSE
_hang_cross_cluster_hang_detection_enabled         TRUE                                                                   TRUE     FALSE      FALSE
_hang_delay_resolution_for_libcache                TRUE                                                                   TRUE     FALSE      FALSE
_hang_detection_enabled                            TRUE                                                                   TRUE     FALSE      FALSE
_hang_enable_processstate                          TRUE                                                                   TRUE     FALSE      FALSE
_hang_hang_analyze_output_hang_chains              TRUE                                                                   TRUE     FALSE      FALSE
_hang_hiload_promoted_ignored_hang_count           2                                                                      TRUE     FALSE      FALSE
_hang_hiprior_session_attribute_list                                                                                      TRUE     FALSE      FALSE
_hang_hung_session_ewarn_percent                   34                                                                     TRUE     FALSE      FALSE
_hang_ignored_hang_count                           1                                                                      TRUE     FALSE      FALSE

_hang_ignored_hangs_interval                       300                                                                    TRUE     FALSE      FALSE
_hang_ignore_hngmtrc_count                         1                                                                      TRUE     FALSE      FALSE
_hang_int_spare1                                   0                                                                      TRUE     FALSE      FALSE
_hang_int_spare2                                   0                                                                      TRUE     FALSE      FALSE
_hang_log_io_hung_sessions_to_alert                FALSE                                                                  TRUE     FALSE      FALSE
_hang_log_verified_hangs_to_alert                  FALSE                                                                  TRUE     FALSE      FALSE
_hang_long_wait_time_threshold                     0                                                                      TRUE     FALSE      FALSE
_hang_lws_file_count                               5                                                                      TRUE     FALSE      FALSE
_hang_lws_file_space_limit                         100000000                                                              TRUE     FALSE      FALSE
_hang_max_session_hang_time                        96                                                                     TRUE     FALSE      FALSE
_hang_monitor_archiving_related_hang_interval      300                                                                    TRUE     FALSE      FALSE

_hang_msg_checksum_enabled                         TRUE                                                                   TRUE     FALSE      FALSE
_hang_resolution_allow_archiving_issue_termination TRUE                                                                   TRUE     FALSE      FALSE
_hang_resolution_confidence_promotion              FALSE                                                                  TRUE     FALSE      FALSE
_hang_resolution_global_hang_confidence_promotion  FALSE                                                                  TRUE     FALSE      FALSE
_hang_resolution_percent_hung_sessions_threshold   300                                                                    TRUE     FALSE      FALSE
_hang_resolution_policy                            HIGH                                                                   TRUE     FALSE      FALSE
_hang_resolution_promote_process_termination       FALSE                                                                  TRUE     FALSE      FALSE
_hang_resolution_scope                             OFF                                                                    TRUE     FALSE      FALSE
_hang_running_in_lrg                               FALSE                                                                  TRUE     FALSE      FALSE
_hang_short_stacks_output_enabled                  TRUE                                                                   TRUE     FALSE      FALSE
_hang_signature_list_match_output_frequency        10                                                                     TRUE     FALSE      FALSE

_hang_statistics_collection_interval               15                                                                     TRUE     FALSE      FALSE
_hang_statistics_collection_ma_alpha               30                                                                     TRUE     FALSE      FALSE
_hang_statistics_high_io_percentage_threshold      25                                                                     TRUE     FALSE      FALSE
_hang_terminate_session_replay_enabled             TRUE                                                                   TRUE     FALSE      FALSE
_hang_trace_interval                               32                                                                     TRUE     FALSE      FALSE
_hang_verification_interval                        46                                                                     TRUE     FALSE      FALSE

For the sake of experimenting with the hang manager, I locked a table using ‘lock table t2 in exclusive mode’ and had another session do an insert against it.
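
In two sessions that looks roughly like this (a sketch only – the definition of t2 is not shown here, so the inserted values are just placeholders):

-- session 1: take and hold the TM lock
lock table t2 in exclusive mode;

-- session 2: this insert then blocks on 'enq: TM - contention'
insert into t2 values (1);

After a while, the ‘base’ trace file showed the following information: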

*** 2017-12-11T06:25:10.364037+00:00
HM: Hung Sessions (local detect) - output local chains
===============================================================================
Non-intersecting chains:

-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (test.test)
                   os id: 7526
              process id: 38, oracle@memory-presentation.local (TNS V1-V3)
              session id: 27
        session serial #: 32599
    }
    is waiting for 'enq: TM - contention' with wait info:
    {
                      p1: 'name|mode'=0x544d0003
                      p2: 'object #'=0x57e6
                      p3: 'table/partition'=0x0
            time in wait: 1 min 35 sec
           timeout after: never
                 wait id: 74
                blocking: 0 sessions
            wait history:
              * time between current wait and wait #1: 0.000123 sec
              1.       event: 'SQL*Net message from client'
                 time waited: 26.493425 sec
                     wait id: 73               p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000025 sec
              2.       event: 'SQL*Net message to client'
                 time waited: 0.000002 sec
                     wait id: 72               p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000005 sec
              3.       event: 'SQL*Net break/reset to client'
                 time waited: 0.000055 sec
                     wait id: 71               p1: 'driver id'=0x62657100
                                               p2: 'break?'=0x0
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (test.test)
                   os id: 7481
              process id: 24, oracle@memory-presentation.local (TNS V1-V3)
              session id: 55
        session serial #: 9654
    }
    which is waiting for 'SQL*Net message from client' with wait info:
    {
                      p1: 'driver id'=0x62657100
                      p2: '#bytes'=0x1
            time in wait: 1 min 39 sec
           timeout after: never
                 wait id: 6714
                blocking: 1 session
            wait history:
              * time between current wait and wait #1: 0.000006 sec
              1.       event: 'SQL*Net message to client'
                 time waited: 0.000004 sec
                     wait id: 6713             p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000203 sec
              2.       event: 'SQL*Net message from client'
                 time waited: 17.115734 sec
                     wait id: 6712             p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000007 sec
              3.       event: 'SQL*Net message to client'
                 time waited: 0.000004 sec
                     wait id: 6711             p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
    }

Chain 1 Signature: 'SQL*Net message from client'<='enq: TM - contention'
Chain 1 Signature Hash: 0x163c4cba
-------------------------------------------------------------------------------

===============================================================================

If you wait a little longer, another type of dia0 trace file appears: ‘..vfy_1.trc’, which is a file containing ‘verified hangs’. These appear once for a given hang situation, and look like this:

One or more possible hangs have been detected on your system.
These could be genuine hangs in which no further progress will
be made without intervention, or it may be very slow progress
in the system due to high load.

Previously detected and output hangs are not displayed again.
Instead, the 'Victim Information' section will indicate that
the victim is from an 'Existing Hang' under the 'Existing or'
column.

'Verified Hangs' below indicate one or more hangs that were found
and identify the final blocking session and instance on which
they occurred. Since the current hang resolution state is 'OFF',
there will be no attempt to resolve any of these hangs unless
hang resolution is enabled.  Terminating this session may well
free up the hang. Be aware that if the column named 'Fatal BG'
is marked with 'Y', terminating the process will cause the
instance to crash.

Any hangs with a 'Hang Resolution Action' of 'Unresolvable'
will be ignored. These types of hangs will either be resolved
by another layer in the RDBMS or cannot be resolved as they may
require external intervention.

Deadlocks (also named cycles) are currently NOT resolved even if
hang resolution is enabled.  The 'Hang Type' of DLCK in the
'Verified Hangs' output identifies these hangs.

Below that are the complete hang chains from the time the hang
was detected.

The following information will assist Oracle Support Services
in further analysis of the root cause of the hang.


*** 2017-12-11T07:05:07.989085+00:00
HM: Hang Statistics - only statistics with non-zero values are listed

                     number of hangs detected 1
                        number of local hangs 1
  hangs ignored due to application contention 1
      hangs:total number of impacted sessions 2
    hangs:minimum number of impacted sessions 2
    hangs:maximum number of impacted sessions 2
                      current number of hangs 1
            current number of active sessions 1
              current number of hung sessions 2

*** 2017-12-11T07:05:07.989122+00:00
Verified Hangs in the System
                     Root       Chain Total               Hang
  Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution
    ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action
 ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
     1 HANG VICSELTD    1    58     2     2    LOW  LOCAL IGNRD:HngRslnScpOFF
 Hang Ignored Reason: Hang resolution is disabled.

  Inst  Sess   Ser             Proc  Wait    Wait
   Num    ID    Num      OSPID  Name Time(s) Event
  ----- ------ ----- --------- ----- ------- -----
      1     45  5781     27709    FG     137 enq: TM - contention
      1     58 15676     27700    FG     146 SQL*Net message from client

Wait event statistics for event 'SQL*Net message from client'

Self-Resolved Hang Count                        : 0
Self-Resolved Hang Total Hang Time in seconds   : 0
Self-Resolved Hang Minimum Hang Time in seconds : 0
Self-Resolved Hang Maximum Hang Time in seconds : 0
Resolved Hang Count                             : 0
Total Hung Session Count                        : 1
Total Wait Time in micro-seconds                : 59799468
Total Wait Count                                : 67
Moving Average of Wait Time in micro-seconds    : 5220981

Victim Information
                                                         Existing or  Ignored
  HangID  Inst#  Sessid  Ser Num      OSPID  Fatal BG       New Hang    Count
  ------  -----  ------  -------  ---------  --------  -------------  -------
       1      1      58    15676      27700     N           New Hang        1

It appears Oracle adds more and more useful information to trace files – information which can only be fetched from the trace files themselves. Yet it doesn’t seem that any monitoring is actually picking this up, which, to my knowledge, includes Oracle’s own Enterprise Manager.
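
If you do want to pick this information up without going to the file system, one option worth trying on 12.2 is reading the trace files from inside the database via V$DIAG_TRACE_FILE_CONTENTS. This is only a sketch (verify the view and its column names on your own version):

-- pull the payload of the dia0 LWS trace files from the ADR
select payload
from   v$diag_trace_file_contents
where  trace_filename like '%dia0%lws%'
order by timestamp;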

The hang manager tracing talks about outliers. Upon grepping through the dia0 trace files, I found an explanation of what dia0 considers an outlier:

*** 2017-11-26T16:45:33.193479+00:00
HM: Hang Statistics - only statistics with non-zero values are listed

            current number of active sessions 1
  instance health (in terms of hung sessions) 100.00%

Dumping Wait Events with non-zero hung session counts
                       or non-zero outliers wait counts
(NOTE: IO outliers have wait times greater than 32 ms.
       non-IO outliers have wait times greater than 64 s.
       IDLE wait events are not output.)

                                                     IO
 Total  Self-         Total  Total  Outlr  Outlr  Outlr
  Hung  Rslvd  Rslvd   Wait WaitTm   Wait WaitTm   Wait
  Sess  Hangs  Hangs  Count   Secs  Count   Secs  Count Wait Event
------ ------ ------ ------ ------ ------ ------ ------ -----------
     0      0      0   1366      2      0      0      1 log file parallel write
     1      0      0   2206 4294966594      1 4294966272      0 resmgr:cpu quantum
     0      0      0      1      1      0      0      1 external table read
     1      0      0      0      0      0      0      0 not in wait


*** 2017-11-26T16:45:33.193686+00:00
TotalCPUTm:4230 ms TotalRunTm:16027340 ms %CPU: 0.03%

Tagged: hang monitor, HM, oracle, trace, tracefile

SQL_ADAPTIVE_PLAN_RESOLVED Is Broken

You can use the V$SQL.IS_RESOLVED_ADAPTIVE_PLAN column to know whether the execution plan associated with a child cursor is adaptive or not. Specifically, to know whether the query optimizer selected either an adaptive join method or an adaptive star transformation (notice that it is not set when the hybrid hash distribution is involved).

The column takes one of the following values:

  • NULL: the execution plan associated with the cursor is not adaptive
  • N: the execution plan is adaptive but the final execution plan is not yet determined
  • Y: the execution plan is adaptive and the final execution plan was determined

If you are interested in such information for a SQL statement that is neither cached in the library cache nor present in AWR, as of version 12.1.0.2 you might want to check the SQL_ADAPTIVE_PLAN_RESOLVED column provided by the views V$ACTIVE_SESSION_HISTORY and DBA_HIST_ACTIVE_SESS_HISTORY. In fact, according to the documentation, it provides the following information:

Indicates whether the SQL plan of the sampled database session is a resolved adaptive plan or not

Even though the documentation provides no information about the valid values for that column, according to my observations only the values “0” and “1” are used, so I conclude that adaptive plans should be marked with the value “1”.

Unfortunately, this is not the case. In fact, even executions that are not using an adaptive plan are marked with the value “1”!

I reproduced the issue with the following test case (notice that even though the OPTIMIZER_ADAPTIVE_PLANS parameter is set to FALSE and the cursor has the IS_RESOLVED_ADAPTIVE_PLAN column set to NULL, according to the SQL_ADAPTIVE_PLAN_RESOLVED column an execution involving an adaptive plan took place):

SQL> CREATE TABLE t1 (id PRIMARY KEY, n, pad)
  2  AS
  3  SELECT rownum, CASE WHEN rownum <= 10000 THEN rownum ELSE 666 END, lpad('*',100,'*')
  4  FROM dual
  5  CONNECT BY level <= 10150;

SQL> execute dbms_stats.gather_table_stats(user,'t1')

SQL> CREATE TABLE t2 (id PRIMARY KEY, n, pad)
  2  AS
  3  SELECT rownum, rownum, lpad('*',100,'*')
  4  FROM dual
  5  CONNECT BY level <= 10000;

SQL> execute dbms_stats.gather_table_stats(user,'t2')

SQL> CREATE OR REPLACE FUNCTION burn_cpu(p_wait IN INTEGER) RETURN INTEGER IS
  2    l_begin NUMBER;
  3  BEGIN
  4    l_begin := dbms_utility.get_time();
  5    LOOP
  6      EXIT WHEN l_begin+(p_wait*100) < dbms_utility.get_time();
  7    END LOOP;
  8    RETURN 1;
  9  END burn_cpu;
 10  /

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

SQL> show parameter optimizer_adaptive

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_adaptive_plans             boolean     FALSE
optimizer_adaptive_reporting_only    boolean     FALSE
optimizer_adaptive_statistics        boolean     FALSE

SQL> SELECT count(t1.pad), count(t2.pad)
  2  FROM t1, t2
  3  WHERE t1.id = t2.id
  4  AND t1.n = 666
  5  AND burn_cpu(t1.id/t1.id) = 1;

COUNT(T1.PAD) COUNT(T2.PAD)
------------- -------------
            1             1

SQL> SELECT sql_id, child_number, is_resolved_adaptive_plan, elapsed_time
  2  FROM v$sql
  3  WHERE (sql_id, child_number) = (SELECT prev_sql_id, prev_child_number
  4                                  FROM v$session
  5                                  WHERE sid = sys_context('userenv','sid'));

SQL_ID        CHILD_NUMBER IS_RESOLVED_ADAPTIVE_PLAN ELAPSED_TIME
------------- ------------ ------------------------- ------------
8ygb8f7cycp4f            0                              152687501

SQL> SELECT sql_id, sql_adaptive_plan_resolved, count(*)
  2  FROM v$active_session_history
  3  WHERE session_id = sys_context('userenv','sid')
  4  AND sample_time > systimestamp - INTERVAL '150' SECOND
  5  GROUP BY sql_id, sql_adaptive_plan_resolved;

SQL_ID        SQL_ADAPTIVE_PLAN_RESOLVED   COUNT(*)
------------- -------------------------- ----------
8ygb8f7cycp4f                          1        150
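
For a statement that is no longer in the library cache, the same check can be run against the AWR copy of the ASH data. This is just a sketch (DBA_HIST_ACTIVE_SESS_HISTORY requires Diagnostics Pack licensing and only contains the subset of samples flushed to AWR):

SELECT sql_id, sql_adaptive_plan_resolved, count(*)
FROM dba_hist_active_sess_history
WHERE sql_id = '8ygb8f7cycp4f'
GROUP BY sql_id, sql_adaptive_plan_resolved;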

Even though I still have to open a service request, in my opinion this is a bug.

Welcome to pgdfv: PostgreSQL data file visualizer

Introduction

As you may know, a PostgreSQL database page contains a lot of information, which is documented here. A great study has been done by Frits Hoogland in his series of blog posts. I strongly recommend reading Frits’ series before reading this blog post (unless you are familiar with PostgreSQL block internals).

By reading the contents of a page we can extract (a query sketch of how these figures can be derived follows the list):

  • The percentage of free space within a page
  • The percentage of current rows within a page
  • The percentage of unused rows within a page
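
Here is a rough sketch (my own approximation, not necessarily pgdfv’s exact algorithm) of how those three figures can be derived for a single page with the pageinspect extension, using the bdtable example table that is created in the Examples section below:

-- free space: the gap between the line pointer array and the row data
SELECT round(100.0 * (upper - lower) / pagesize, 1) AS pct_free
FROM   page_header(get_raw_page('bdtable', 0));

-- current rows (normal line pointers with no deleting xmax) and unused line pointers
SELECT round(100.0 * count(*) FILTER (WHERE lp_flags = 1 AND t_xmax = 0) / count(*), 1) AS pct_current,
       round(100.0 * count(*) FILTER (WHERE lp_flags = 0) / count(*), 1) AS pct_unused
FROM   heap_page_items(get_raw_page('bdtable', 0));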

Welcome to pgdfv

pgdfv stands for: PostgreSQL data file visualizer. It helps to visualize the data file pages in an easy way.

For each block:

  • A color is assigned (depending on the percentage of free space)
  • The color can be displayed in 2 ways (depending on whether more than 50 percent of the rows are current)
  • A number is assigned (based on the percentage of unused rows)

At the end the utility provides a summary for all the blocks visited. As a picture is worth a thousand words, let’s see some examples.

Examples

Let’s create a table, insert 4 rows in it and inspect its content:

pgdbv=# create table bdtable(id int not null, f varchar(30) );
CREATE TABLE
pgdbv=# insert into bdtable ( id, f ) values (1, 'aaaaaaaaaa');
INSERT 0 1
pgdbv=# insert into bdtable ( id, f ) values (2, 'aaaaaaaaaa');
INSERT 0 1
pgdbv=# insert into bdtable ( id, f ) values (3, 'aaaaaaaaaa');
INSERT 0 1
pgdbv=# insert into bdtable ( id, f ) values (4, 'aaaaaaaaaa');
INSERT 0 1
pgdbv=# checkpoint;
CHECKPOINT
pgdbv=# select * from heap_page_items(get_raw_page('bdtable',0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |              t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+----------------------------------
  1 |   8152 |        1 |     39 |   1945 |      0 |        0 | (0,1)  |           2 |       2050 |     24 |        |       | \x010000001761616161616161616161
  2 |   8112 |        1 |     39 |   1946 |      0 |        0 | (0,2)  |           2 |       2050 |     24 |        |       | \x020000001761616161616161616161
  3 |   8072 |        1 |     39 |   1947 |      0 |        0 | (0,3)  |           2 |       2050 |     24 |        |       | \x030000001761616161616161616161
  4 |   8032 |        1 |     39 |   1948 |      0 |        0 | (0,4)  |           2 |       2050 |     24 |        |       | \x040000001761616161616161616161
(4 rows)

In PostgreSQL, each table is stored in a separate file. When a table exceeds 1 GB, it is divided into gigabyte-sized segments.

Let’s check which file contains the table:

pgdbv=# SELECT pg_relation_filepath('bdtable');
 pg_relation_filepath
----------------------
 base/16416/16448

Let’s use the utility on this file:

[screenshot: pgdfv output for the bdtable data file]

So one block is detected, with more than 75% of free space (hence the green color), more than 50% of the rows are current (100% in our case, as t_xmax = 0 for all the rows), and unused rows are less than 10% (so 0 is displayed; 0% in our case, as there are no rows with lp_flags = 0).

Let’s delete 3 rows:

pgdbv=# delete from bdtable where id <=3;
DELETE 3
pgdbv=# checkpoint;
CHECKPOINT
pgdbv=# select * from heap_page_items(get_raw_page('bdtable',0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |              t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+----------------------------------
  1 |   8152 |        1 |     39 |   1945 |   1949 |        0 | (0,1)  |        8194 |        258 |     24 |        |       | \x010000001761616161616161616161
  2 |   8112 |        1 |     39 |   1946 |   1949 |        0 | (0,2)  |        8194 |        258 |     24 |        |       | \x020000001761616161616161616161
  3 |   8072 |        1 |     39 |   1947 |   1949 |        0 | (0,3)  |        8194 |        258 |     24 |        |       | \x030000001761616161616161616161
  4 |   8032 |        1 |     39 |   1948 |      0 |        0 | (0,4)  |           2 |       2306 |     24 |        |       | \x040000001761616161616161616161
(4 rows)

and launch the tool again:

[screenshot: pgdfv output after the delete]

As you can see, we still have more than 75% of free space in the block, but the way the color is displayed has changed (because now less than 50% of the rows are current, i.e. have t_xmax = 0), and unused is still less than 10% (0 is displayed).

Let’s vacuum the table:

pgdbv=# vacuum bdtable;
VACUUM
pgdbv=# checkpoint;
CHECKPOINT
pgdbv=# select * from heap_page_items(get_raw_page('bdtable',0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |              t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+----------------------------------
  1 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |       |
  2 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |       |
  3 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |       |
  4 |   8152 |        1 |     39 |   1948 |      0 |        0 | (0,4)  |           2 |       2306 |     24 |        |       | \x040000001761616161616161616161
(4 rows)

and launch the utility:

[screenshot: pgdfv output after the vacuum]

As you can see, we still have more than 75% of free space, less than 50% of the rows are current (only one in our case), and now between 70 and 80% of the rows in the block are unused (so 7 is displayed).

The legend and summary are dynamic and depend on the contents of the scanned blocks.

For example, on a newly created table (made of 355 blocks) you could end up with something like:

[screenshot: pgdfv output for a newly created table of 355 blocks]

Then, you delete half of the rows:

[screenshot: pgdfv output after deleting half of the rows]

Then, once the table has been vacuumed:

[screenshot: pgdfv output after the vacuum]

And once new rows have been inserted:

[screenshot: pgdfv output after new rows have been inserted]

Remarks

  • The tool is available in this repository.
  • The tool is 100% inspired by Frits Hoogland’s blog post series and by odbv (written by Kamil Stawiarski), which can be used to visualize Oracle database blocks.

Conclusion

pgdfv is a new utility that can be used to visualize PostgreSQL data file pages.

#UKOUG_TECH17

Award

I received an award for an article I wrote last year: the Most Read Oracle Scene Article in 2016 – Technology. I like to write for Oracle Scene magazine. It is rare today to have a magazine that is both available in a paper version and publicly available on the web. And I must say that, as an author, all the people behind it are doing great work. Thanks to them, and thanks for the award. Seeing that what I write helps is the motivation to keep doing it.

The article is: Is your AWR/Statspack Report Relevant.

This is the occasion to link to other articles I’ve written for the UKOUG Magazine. Statspack, because not everybody has Enterprise Edition with optional Diagnostic Pack: Improving Statspack Experience. One on the parameter I hate the most: CBO Choice between Index & Full Scan: the Good, the Bad & the Ugly parameters. Another about the statistics that should replace this awful parameter: Demystifying WORKLOAD System Statistics Gathering.

Two interesting features in 12c: 12c Online Statistics Gathering & Session Private Statistics. A big change that happened in 12.2.0.2 concerning availability: I/O Error on Datafile: Instance Crash or Datafile Offline?. My SLOB performance tests when ODA Lite came with NVMe SSD: Performance for All Editions on ODA X6-2S/M. And finally, the article on the great Oracle features for physical transport/clone/migration: From Transportable Tablespaces to Pluggable Databases.

I’ve stolen a few pictures to illustrate this blog post, from UKOUG photo albums, and from friends.

Presentations

The article on Transportable Tablespaces and Pluggable Databases is actually based on one presentation I did. I was in that big Hall 1 [I realize that some may make a joke about this, but I’m talking about the room], where I look like Ant-Man from the back of the room. But doing live demos is great on this huge screen.

The idea for this presentation came one year ago when preparing a 5-minute talk for the Oracle Open World 2016 ACED talks (blog post about this), and this subject is so exciting that I wanted to share more about it. Online PDB clone and relocate will be the features you will like the most when going to Multitenant.

My second presentation was more developer focused, covering all the join methods that can magically construct your query result in record time, or, when a bad join method is chosen, leave your 2-second query still running after one hour. I explained the join methods in three ways: theory with .ppt animations, execution plans with dbms_xplan, and live while running queries with SQL Monitor, showing the different operations and A-Rows increasing in real time.

I was also co-presenting in a roundtable on Oracle Cloud PaaS, sharing my little experience with DBaaS. Everybody talks about Cloud, and it is good to talk about the problems encountered and how to deal with them.

Round tables

Despite the strong competition from good sessions, I also attended a few round tables. These conferences are a good opportunity to meet and share with other users and product managers. Oracle is a big company, and we sometimes think that they care only about their biggest customers, but that is not accurate. There are several product managers who really listen to customers. A great example was the discussion about something that has been slowly changing for a few years: the documentation, which was really precise in the past – explaining the When, Why, and How – is now more vague.

Community

Geeking at the Oracle Developer Community lounge, eating something decent near the conference center, drinking while discussing technical stuff, the ACE dinner (and the first ACED briefing outside the US), posing with award winners, and the amazing chocolate tasting…

[photos from UKOUG Tech17]

 

The article #UKOUG_TECH17 appeared first on Blog dbi services.

Shooting the DBA isn’t a Silver Bullet to the Cloud

We’ve all been watching numerous companies see value in bypassing the Database Administrator and other critical IT roles in an effort to get IT to the cloud faster.  It may look incredibly attractive to sales, but the truth is, it can be like setting up land mines in your own yard.

Having the right people in the right roles, doing the right tasks, is essential to a complete cloud deployment.  Easy access without any checks and balances is just asking for trouble, which I am quickly realizing as I discuss this topic with more folks on my travels.  One company, which originally just sold cloud solutions, started offering audits of cloud environments.  They have experienced an incredible demand for this type of review and, upon inspection, discovered that over 80% of the existing cloud projects they audited failed basic security and standard environment requirements that a Database Administrator’s involvement would have helped them avoid.

Some of the most common failures?

  • Missing the common best practice of multifactor authentication, often with credentials embedded in the application or database layer.  The auditors often found that developers expected to address it at a later time and then never did so due to time limitations or missing review steps.
  • Weak or missing encryption at the production layer, and access to critical data in non-production.  This is a sore point for me, as no one needs access to critical data in a non-production environment given the number of advanced masking tools available on the market.  Some will complain about the cost, but what is the cost to your company of one breach?  Out of business, anyone?
  • APIs are often the most exposed layer of any multi-tier environment and if you’ve skipped deep penetration testing of this layer, you’re just asking to be some hacker’s weekend project.
  • Account hijacking is all too common these days.  There are advanced database and application monitoring systems that will flag potentially fraudulent activity and notify those responsible.  DBAs understand this responsibility, whereas a developer’s goal is to build out functionality.  Understand the difference in goals.

The expense of putting IT processes in place to discover and repair vulnerabilities is small compared to the potential damage, and understanding why different mindsets and roles are designated is important to the success of the cloud.  Even if sales is focused on reaching the people who will buy the product without much scrutiny, that doesn’t mean IT should stop taking full advantage of the DBA role, which may be the very thing standing between the company, the cloud, and staying in business for the long run.



Copyright © DBA Kevlar [Shooting the DBA isn't a Silver Bullet to the Cloud], All Right Reserved. 2017.

The post Shooting the DBA isn’t a Silver Bullet to the Cloud appeared first on DBA Kevlar.

Changing Physical Index Attributes without Down Time

Normally, we make an index invisible before we drop it, in order to see whether we can manage without it, and if not we can make it visible again.  In this blog, I will demonstrate how to use index invisibility to introduce an index change that I cannot do with an online operation.  I am also able to reverse it immediately, whereas an on-line operation would take time.

Problem Statement

I have a large partitioned table, but the primary key index on it is not partitioned.  Testing has shown that performance would improve if the index were locally partitioned.  It is not possible to introduce the partitioning by rebuilding the index online.  I cannot afford the downtime to drop and rebuild the index, and in any case I want an easy way back to the original state in case of a problem.

Demonstration

I encountered this problem in a PeopleSoft system with a unique index, but here I will demonstrate it with a primary key constraint also.  I will create and populate a simple range partitioned table with a primary key constraint.

DROP TABLE t PURGE;

CREATE TABLE t
(a number not null
,b number not null
,c number
,d number)
partition by range (a) interval (42)
(partition values less than (42))
/
ALTER TABLE t ADD CONSTRAINT t_pk PRIMARY KEY (a,b)
/
TRUNCATE TABLE t
/
INSERT INTO t
WITH x AS (select rownum n from dual connect by level <= 100)
SELECT a.n, b.n, a.n*b.n, a.n+b.n
FROM x a, x b
/

Note that the table is partitioned, but the unique index is not – I didn't have to explicitly build it without partitioning; a non-partitioned index is what gets created by default.

set long 5000
select dbms_metadata.get_ddl('INDEX','T_PK')
from dual
/
CREATE UNIQUE INDEX "SCOTT"."T_PK" ON "SCOTT"."T" ("A", "B")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
TABLESPACE "USERS"

I cannot just create the new index because the column list is already indexed.

CREATE UNIQUE INDEX SCOTT.T_PK_NEW ON SCOTT.T (A, B)
*
ERROR at line 1:
ORA-01408: such column list already indexed

So, now I will create a new unique index on the same columns, but I will create it invisible. I will also create it online so that I do not block the application from performing DML on the table while the index is built.

CREATE UNIQUE INDEX SCOTT.T_PK_NEW ON SCOTT.T (A, B)
LOCAL INVISIBLE ONLINE
/

So now I have two unique indexes, one visible, one invisible

set autotrace off
column constraint_name format a20
column table_name format a10
column index_name format a10
SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK VISIBLE
T_PK_NEW INVISIBLE

I cannot make the new index visible while the original index is also visible, because I cannot have two visible indexes defined on the same set of columns.

ALTER INDEX SCOTT.T_PK_NEW VISIBLE
/
*
ERROR at line 1:
ORA-14147: There is an existing VISIBLE index defined on the same set of columns.

Instead, I have to make the original index invisible first. However, even if both indexes are invisible the unique constraint is still enforced.

ALTER INDEX SCOTT.T_PK INVISIBLE
/
INSERT INTO t VALUES (1,2,3,4)
/
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T_PK) violated

ALTER INDEX SCOTT.T_PK_NEW VISIBLE
/

Now my new index is visible.

SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK INVISIBLE
T_PK_NEW VISIBLE

The constraint still points to the original index.

INSERT INTO t VALUES(1,2,3,4)
/
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T_PK) violated

SELECT constraint_name, table_name, index_name
FROM user_constraints
WHERE constraint_name = 'T_PK'
/

CONSTRAINT_NAME TABLE_NAME INDEX_NAME
-------------------- ---------- ----------
T_PK T T_PK

However, queries now use the new index.

set autotrace on lines 160
SELECT * FROM t WHERE a = 1 and b=2
/
Plan hash value: 3024084754
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE | | 1 | 52 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| T | 1 | 52 | 2 (0)| 00:00:01 | 1 | 1 |
|* 3 | INDEX UNIQUE SCAN | T_PK_NEW | 1 | | 1 (0)| 00:00:01 | 1 | 1 |
---------------------------------------------------------------------------------------------------------------

The original index can now be removed. However, unique indexes used by primary key constraints cannot be dropped directly. Instead, modifying the primary key constraint to use the new index drops the original index.

ALTER TABLE scott.t
MODIFY CONSTRAINT t_pk USING INDEX t_pk_new
/

SELECT constraint_name, table_name, index_name
FROM user_constraints
WHERE constraint_name = 'T_PK'
/
CONSTRAINT_NAME TABLE_NAME INDEX_NAME
-------------------- ---------- ----------
T_PK T T_PK_NEW

SELECT index_name, visibility FROM user_indexes WHERE table_name = 'T'
/

INDEX_NAME VISIBILIT
---------- ---------
T_PK_NEW VISIBLE

Finally, the new index can be renamed.

ALTER INDEX t_pk_new RENAME TO t_pk
/

UKOUG and Microsoft Marathon Webinar

I’m here for the last day of UKOUG Tech17 and it’s been an awesome event.  This is my first time attending this event in Birmingham, and although it’s quite the party town on weekends, it’s a great city to host UKOUG’s trio of events during the week.

I was welcomed by great turnouts at both my presentations and then enjoyed numerous sessions by others, such as Richard Foote, Jonathan Lewis, Roger MacNicol and other speakers one might not get a chance to see at most US Oracle User Group events.

My biggest challenge in my second session was that I was in Hall 1 (aka the big arse room), where I couldn’t see anything past the main lights, so I quickly decided to skip questions during the session and asked folks to come see me afterwards.  Even with microphones halfway up the theater, I was picturing the comedy of me trying to figure out where a question was coming from and what was being said beyond the echo in the huge chamber.

[Photo: kgorman_ukoug]

There were numerous social get-togethers around the conference, but with so many November events and travel, I found that I was a little less social than I would be normally.  I still had plenty of opportunity to catch up with folks and it was lovely to see everyone.

This is my last event with travel for the year, so next week I’m home for Microsoft’s PASS Marathon virtual event, where I and a number of other experts will be sharing our knowledge of Linux for the SQL Server DBA.

If you haven’t had the chance to work with SQL Server 2017 on Linux, it’s a simple process to install Docker (who doesn’t have Docker on their laptop these days??) and add the image for SQL Server 2017 (https://hub.docker.com/r/microsoft/mssql-server-linux/), so seriously consider the option to get familiar with the newest version of SQL Server on the new OS.
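For reference, here’s a minimal sketch of that process once Docker is installed; the container name and the SA password are placeholders you’d replace with your own:

# Pull the SQL Server 2017 on Linux image from Docker Hub
$ docker pull microsoft/mssql-server-linux

# Start a container; ACCEPT_EULA and SA_PASSWORD are required by the image,
# and the SA password has to meet SQL Server's complexity rules.
$ docker run -d --name sql2017 \
    -e 'ACCEPT_EULA=Y' \
    -e 'SA_PASSWORD=<YourStrong!Passw0rd>' \
    -p 1433:1433 \
    microsoft/mssql-server-linux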

 




SQL Server 2017 on Linux- Processes

So while finishing up my slides for the Microsoft Pass Linux Marathon webinar on Dec. 13th, Tim and I started to discuss the apples vs. oranges comparison of SQL Server on Linux process management vs. an Oracle instance.

We discussed the differences one might expect to see using different tools with the two database platforms, and the first question surrounded the information pmap would provide.  We commonly use pmap to map out processes, including child and sub-processes of a particular PID.  As my current images have multiple database platforms on each of them, I quickly spun up a Docker image of SQL Server 2017 to ensure I had a clean image to work from.
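The OS-level output below appears to come from a shell inside that container; you can get one with something along these lines (sql2017 is just an example name for the container):

# Open an interactive root shell inside the running SQL Server container
$ docker exec -it sql2017 bash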

First, what processes are running?

$ ps -ef
root         1     0  0 22:22 ?        00:00:00 /bin/sh -c /opt/mssql/bin/sqlservr
root         7     1  0 22:22 ?        00:00:00 /opt/mssql/bin/sqlservr
root         9     7  3 22:22 ?        00:00:24 /opt/mssql/bin/sqlservr
root       165     0  0 22:24 pts/0    00:00:00 bash
root       300   165  0 22:34 pts/0    00:00:00 ps -ef

Unlike Oracle, there aren’t separate background processes like pmon, the log writer, etc. present for us to identify, as you can see by comparing with the following Oracle example:

$ ps -ef | grep orcl
delphix   2927     1  0 Nov22 ?        00:06:49 ora_vkrm_orcl
delphix   4542     1  0 18:48 ?        00:00:00 ora_w000_orcl
delphix   9331     1  0 Oct18 ?        00:04:24 ora_pmon_orcl
delphix   9333     1  0 Oct18 ?        00:05:21 ora_psp0_orcl
delphix   9335     1  0 Oct18 ?        05:45:03 ora_vktm_orcl
delphix   9339     1  0 Oct18 ?        00:01:01 ora_gen0_orcl
delphix   9341     1  0 Oct18 ?        00:01:44 ora_diag_orcl
delphix   9343     1  0 Oct18 ?        00:01:37 ora_dbrm_orcl
delphix   9345     1  0 Oct18 ?        00:16:50 ora_dia0_orcl
delphix   9347     1  0 Oct18 ?        00:00:57 ora_mman_orcl
delphix   9349     1  0 Oct18 ?        00:03:55 ora_dbw0_orcl
delphix   9351     1  0 Oct18 ?        00:18:27 ora_lgwr_orcl
delphix   9353     1  0 Oct18 ?        00:05:58 ora_ckpt_orcl
delphix   9355     1  0 Oct18 ?        00:01:36 ora_smon_orcl
delphix   9357     1  0 Oct18 ?        00:00:19 ora_reco_orcl
delphix   9359     1  0 Oct18 ?        00:05:22 ora_mmon_orcl
delphix   9361     1  0 Oct18 ?        00:12:19 ora_mmnl_orcl
delphix   9363     1  0 Oct18 ?        00:00:26 ora_d000_orcl
delphix   9365     1  0 Oct18 ?        00:00:24 ora_s000_orcl
delphix   9886     1  0 Oct18 ?        00:58:21 ora_arc0_orcl
delphix   9888     1  0 Oct18 ?        00:58:44 ora_arc1_orcl
delphix   9890     1  0 Oct18 ?        00:00:34 ora_arc2_orcl
delphix   9892     1  0 Oct18 ?        00:58:21 ora_arc3_orcl
delphix   9894     1  0 Oct18 ?        00:01:11 ora_ctwr_orcl
delphix   9896     1  0 Oct18 ?        00:00:26 ora_qmnc_orcl
delphix   9911     1  0 Oct18 ?        00:06:54 ora_cjq0_orcl
delphix  10082     1  0 Oct18 ?        00:00:27 ora_q000_orcl
delphix  10084     1  0 Oct18 ?        00:02:42 ora_q001_orcl
delphix  14181     1  0 Oct18 ?        00:01:17 ora_smco_orcl
delphix  30148     1  0 Nov22 ?        00:02:25 ora_m002_orcl
delphix  30151     1  0 Nov22 ?        00:02:27 ora_j000_orcl
delphix  30456     1  0 Nov22 ?        00:00:22 ora_j001_orcl
delphix  31062     1  0 Nov22 ?        00:02:25 ora_j002_orcl
delphix  31456     1  0 Nov22 ?        00:00:21 ora_j003_orcl

In our Oracle environment, we can see every background process, each with its own PID.  Along with the process monitor (pmon), database writer (dbwr) and log writer (lgwr), we also have archiving (arcx), job processing (j00x), performance monitoring and other background processes.  I didn’t even grep for the Oracle executable, so you can see how quickly we can tell what is running.

In the SQL Server environment, we only have two processes: the parent process is PID 7 and the child is PID 9, with nothing to distinguish what they are actually doing.  If we use the pmap utility to view what the parent and child processes are doing, we see only sqlservr in the mapping information.

$ pmap -x 7

7:   /opt/mssql/bin/sqlservr
Address           Kbytes     RSS   Dirty Mode  Mapping
000000762cbcb000     888     724       0 r-x-- sqlservr
000000762cbcb000       0       0       0 r-x-- sqlservr
000000762cea8000      24      24      24 rw--- sqlservr
000000762cea8000       0       0       0 rw--- sqlservr
000000762ceae000     300      20      20 rw---   [ anon ]
000000762ceae000       0       0       0 rw---   [ anon ] 
0000400000000000       4       4       4 rw---   [ anon ]
...
ffffffffff600000       4       0       0 r-x--   [ anon ]
ffffffffff600000       0       0       0 r-x--   [ anon ]
---------------- ------- ------- ------- 
total kB           80212   13660    8896

The mapping information is filled with heap (anon) entries, along with libraries and bin executables.  Nothing out of the ordinary here, but nothing of detail, either.

If I were to do the same for Oracle, let’s say for the pmon process, you’d see more information, including what the process is doing:

$ pmap -x 9331

9331:   ora_pmon_orcl
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000  189264   12340       0 r-x--  oracle
000000000bed4000    2020     256      36 rw---  oracle
000000000c0cd000     348      60      60 rw---    [ anon ]
000000000c6d5000     584      84      84 rw---    [ anon ]
0000000060000000   12288     356     356 rw-s-    [ shmid=0x50001 ]
0000000060c00000  364544    2824    2824 rw-s-    [ shmid=0x58002 ]
0000000077000000    2048       4       4 rw-s-    [ shmid=0x60003 ]
...

We would see every library being called, along with the memory usage, etc.  The child process, PID 9, showed substantially more memory usage, but we’re left with little to go on as to what each process is responsible for in the database engine:

$ pmap -x 9 

9:   /opt/mssql/bin/sqlservr
Address           Kbytes     RSS   Dirty Mode  Mapping
000000006a400000       4       4       4 r---- system.sfp
000000006a400000       0       0       0 r---- system.sfp
000000006a401000    1308    1304     164 r-x-- system.sfp
...
ffffffffff600000       4       0       0 r-x--   [ anon ]
ffffffffff600000       0       0       0 r-x--   [ anon ]
---------------- ------- ------- ------- 
total kB         3064272  604820  587480

Digging further into this process using top, we can filter for just the child PID 9 and see the memory usage, but still nothing that tells us what it is responsible for.

$ top -H -p 9
top - 22:50:39 up 34 min,  0 users,  load average: 0.00, 0.00, 0.00
Threads: 154 total,   0 running, 154 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  4042116 total,  1722984 free,   678736 used,  1640396 buff/cache
KiB Swap:  1048572 total,  1048572 free,        0 used.  2936276 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND 
26 root      20   0 3017376 591652  15800 S  1.7 14.6   0:27.98 sqlservr                     
22 root      20   0 3017376 591652  15800 S  0.3 14.6   0:01.41 sqlservr                    
14 root      20   0 3017376 591652  15800 S  0.3 14.6   0:00.32 sqlservr                  
9 root       20   0 3017376 591652  15800 S  0.0 14.6   0:00.00 sqlservr                     
10 root      20   0 3017376 591652  15800 S  0.0 14.6   0:00.04 sqlservr                   
11 root      20   0 3017376 591652  15800 S  0.0 14.6   0:00.00 sqlservr                  
12 root      20   0 3017376 591652  15800 S  0.0 14.6   0:00.04 sqlservr                     
13 root      20   0 3017376 591652  15800 S  0.0 14.6   0:00.43 sqlservr      
...

We can verify that there are a number of background processes, but as we discussed at the beginning, there isn’t any type of label letting us know what each of them is responsible for.

Using the PS utility doesn’t provide us much more:

ps -T -p 9

  PID  SPID TTY          TIME CMD
    9     9 ?        00:00:00 sqlservr
    9     1 ?        00:00:00 sqlservr
    9     2 ?        00:00:00 sqlservr
...
    9    12 ?        00:00:00 sqlservr
...
    9    60 ?        00:00:00 sqlservr
    9    61 ?        00:00:00 sqlservr
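If you’d rather get an exact thread count than scroll through the truncated listing, a quick one-liner (nothing special, just standard ps and wc) does it:

# Count the threads (SPIDs) reported for PID 9, excluding the header line
$ ps -T -p 9 --no-headers | wc -l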

It does inform us that there are 56 subprocesses connected to child PID 9 and now I’ve decided there’s only one way to get the details behind what is what.  It’s time to log into the database and look inside:

Connect via SQLCMD:

$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P '

Then I queried the information I needed (going old school) from sysprocesses, as the DMVs weren’t as efficient:

1> select spid, program_name, cmd from MASTER..sysprocesses;
2> go

spid   program_name                         cmd    
------ -----------------------------------------------------
 1                                          XTP_CKPT_AGENT 
 2                                          LOG WRITER      
 3                                          LAZY WRITER     
 4                                          RECOVERY WRITER 
 5                                          SIGNAL HANDLER 
 6                                          LOCK MONITOR    
 7                                          XIO_RETRY_WORKER
 8                                          XIO_LEASE_RENEWA
 9                                          BRKR TASK       
 10                                         TASK MANAGER    
 11                                         TASK MANAGER    
 12                                         XE TIMER        
 13                                         RESOURCE MONITOR
 14                                         XE DISPATCHER   
 15                                         TASK MANAGER    
 16                                         TASK MANAGER    
 17                                         TRACE QUEUE TASK
 18                                         SYSTEM_HEALTH_MO
 19                                         RECEIVE         
 20                                         TASK MANAGER    
 21                                         CHECKPOINT      
 22                                         TASK MANAGER    
 23                                         HADR_AR_MGR_NOTI
 24                                         BRKR EVENT HNDLR
 25                                         BRKR TASK       
 26                                         BRKR TASK       
 27                                         BRKR TASK      
 28                                         TASK MANAGER    
 29                                         TASK MANAGER    
 30                                         TASK MANAGER    
...  
 51      SQLCMD                             SELECT          
 52                                         TASK MANAGER      
 53      SQLServerCEIP                      AWAITING COMMAND
 54                                         TASK MANAGER    
 ...    
 60                                         TASK MANAGER    
 61                                         TASK MANAGER    
(61 rows affected)

If you were curious what I queried when looking at the DMVs, here are the high-level versions:

select thread_address, os_thread_id from sys.dm_os_threads;

select session_id, host_process_id, endpoint_id, program_name,
       client_interface_name, is_user_process, quoted_identifier,
       database_id, authenticating_database_id
from   sys.dm_exec_sessions;
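If you also want to tie the OS thread IDs from ps -T (the SPID column) back to sessions, the task, worker and thread DMVs can be joined for that.  A rough sketch run through sqlcmd (the SA password placeholder is yours to fill in):

# Rough sketch: map OS thread ids (the SPIDs from "ps -T -p 9") to SQL Server sessions
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P '<sa-password>' -Q "
SELECT t.session_id, th.os_thread_id, t.task_state
FROM   sys.dm_os_tasks    t
JOIN   sys.dm_os_workers  w  ON t.worker_address = w.worker_address
JOIN   sys.dm_os_threads  th ON w.thread_address = th.thread_address
ORDER  BY t.session_id;"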

Well, no matter what, that answers some of my questions, but there’s still a lot more to learn…

Achieving a balance

Please think this through carefully.

Cite a single occasion when a woman did not receive shaming, threats of job termination, counter-accusations, death threats, and personal shunning after she finally found herself unable to continue in the abrasive hell created for her, solely because she appealed to (or didn’t appeal to) a self-obsessed self-righteous forever-child?

 

The assertion that women are doing this to “enrich themselves” or “because they want to” is simply ludicrous.  Just look at the facts…

 
  • Who ever benefited financially from bringing harassment charges?  Nobody.
  • Who wins? Nobody.
  • What is gained? Justice, nothing more.
  • Who loses? Everybody, at one time or another.
It has always been easier to just bury it and move on, which has been the time-worn advice always given, and which is why so many of these atrocities have gone unresolved for so long.  Many will remain unresolved forever.  Whether they remain unresolved should be the decision of the victim — not the state, not the employer, and certainly not the perpetrator.
 
The courage required to speak out and live with the consequences is staggering. Those who do so are heroes, nothing less.  These charges are not brought frivolously, but after endless soul-searching.
 
Women simply want to do their jobs just as men are able to, without fear or favor, and with recourse other than quitting or moving on when a man targets them.
 
Rape culture, which blames victims as well as perpetrators, forcing women to adapt rather than forcing men to be human, is the reason so many feel outrage at recent developments.
 
Are you an opponent, or an advocate, of rape culture?  Can you prove so by word or deed, and not just by a knee-jerk reaction to the question?
 
What is happening is long overdue, and hopefully will finally achieve a real and lasting balance.