Quantcast
Channel: A! Help
Viewing all articles
Browse latest Browse all 314

Tracing ASYNC Redo Transport When Data Guard Has Far Sync Instance

$
0
0
One way to find out cause for transport lags in data guard is to trace the redo transport related events. For async transport refer the white paper here. However, the trace event 16421 with level 1 did not produce any trace files when the data guard configuration had a far sync instance which was sending redo in async mode to several standby instances. The data guard configuration is given below.
DGMGRL> show configuration

Configuration - db_dg

Protection Mode: MaxAvailability
Members:
db2 - Primary database
db1 - Physical standby database
db3 - Physical standby database
db4 - Physical standby database (receiving current redo)
fs1 - Far sync instance
db5 - Physical standby database
db6 - Physical standby database
db7 - Physical standby database
db8 - Physical standby database
db9 - Physical standby database
db10 - Physical standby database

Members Not Receiving Redo:
fs2 - Far sync instance (alternate of fs1)

Fast-Start Failover: Disabled

Configuration Status:
SUCCESS (status updated 57 seconds ago)
The only async transport in this configuration was happenign between db3 -> db4 and fs1[ or fs2] -> {db5-db10}. White paper mentioned above tells to add the trace even to all instances, both primary and standby. Applying the 16421 at level 1 to all instances (though some of them are doing SYNC transport) including far sync instances didn't result in any trace files being generated.
After a SR oracle support instructed to apply event 16410 with level 16 together with 16421 with level. It too didn't generate any trace files related to async transprot.
Next it was asked to set log_archive_trace=3360 and add two additional events 10724,level 1 and 10293,level 1 while the above two events also remained in place. This combination too failed to generate any trace files related to async transport.
Following this oracle support asked to increase the level on the event 16421. First to 2 if that didn't produced any result then to 3. As such 16421, level 2 was set on all instances. This produce the trace event similar to what is mentioned on the white paper. Trace files were generated on all the source instances of async transport. In this case two trace files were generated. One in db3 which does async transpport to db4 and another on fs1, the far sync instances which does async transport to db5-d10.
As this produced results didn't try the 16421 with level 3.

The trace file generated has stats for all async destination (denoted by log_archive_dest_n parameters).
krsb_end: Begin stats dump for T-1.S-57745
max number of buffers in use 1
Operation elapsed time (micro seconds) 2990220
File transfer time (micro seconds) 2990215
Network Statistics
LOG_ARCHIVE_DEST_3 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 306754
Average time : OCI REQUEST 51125
LOG_ARCHIVE_DEST_4 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 210057
Average time : OCI REQUEST 35009
LOG_ARCHIVE_DEST_5 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 739457
Average time : OCI REQUEST 123242
LOG_ARCHIVE_DEST_6 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 593044
Average time : OCI REQUEST 98840
LOG_ARCHIVE_DEST_7 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 433191
Average time : OCI REQUEST 72198
LOG_ARCHIVE_DEST_8 : OCI REQUEST
Total count : OCI REQUEST 6
Total time : OCI REQUEST 534170
Average time : OCI REQUEST 89028
All TARGETS : OCI REQUEST
Total count : OCI REQUEST 36
Total time : OCI REQUEST 2816673
Average time : OCI REQUEST 78240
LOG_ARCHIVE_DEST_3 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 207
Average time : NETWORK SEND 51
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 70
Throughput (MB/s) 0.00
LOG_ARCHIVE_DEST_4 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 131
Average time : NETWORK SEND 32
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 26
Throughput (MB/s) 0.00
LOG_ARCHIVE_DEST_5 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 138
Average time : NETWORK SEND 34
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 24
Throughput (MB/s) 0.00
LOG_ARCHIVE_DEST_6 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 182
Average time : NETWORK SEND 45
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 24
Throughput (MB/s) 0.00
LOG_ARCHIVE_DEST_7 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 129
Average time : NETWORK SEND 32
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 23
Throughput (MB/s) 0.00
LOG_ARCHIVE_DEST_8 : NETWORK SEND
Total count : NETWORK SEND 4
Total time : NETWORK SEND 219
Average time : NETWORK SEND 54
Total data buffers queued 2
Total data buffers completed 2
Total bytes written 12288
Total bytes completed synchronously 12288
Average network send size (blocks) 0
Average network send buffers 0.50
Average buffer turnaround time 97
Throughput (MB/s) 0.00
All Targets : NETWORK SEND
Total count : NETWORK SEND 24
Total time : NETWORK SEND 1006
Average time : NETWORK SEND 41
Total data buffers queued 12
Total data buffers completed 12
Total bytes written 73728
Total bytes completed synchronously 73728
Average network send size (blocks) 0
Average network send buffers 0
Throughput (MB/s) 0.02
Total network layer time 2817679
Percentage of time in network 94.23
Disk Statistics
Total count : DISK READ 2
Total time : DISK READ 590
Average time : DISK READ 295
Disk stall blocks 2
Total count : BUFFER RELEASE 1
Total time : BUFFER RELEASE 1
Average time : BUFFER RELEASE 1
Total disk layer time 591
Percentage of time in disk layer 0.02
Data Guard Processing Statistics
Total DG layer time 171945
Percentage of time in DG layer 5.75
Remote Server-Side Network Statistics
LOG_ARCHIVE_DEST_3 : NETWORK GET
Total count : NETWORK GET 4
Total bytes : NETWORK GET 4096
Total time : NETWORK GET 73
Average time : NETWORK GET 18
LOG_ARCHIVE_DEST_4 : NETWORK GET
Total count : NETWORK GET 4
Total time : NETWORK GET 63
Average time : NETWORK GET 15
LOG_ARCHIVE_DEST_5 : NETWORK GET
Total count : NETWORK GET 4
Total bytes : NETWORK GET 4096
Total time : NETWORK GET 61
Average time : NETWORK GET 15
LOG_ARCHIVE_DEST_6 : NETWORK GET
Total count : NETWORK GET 4
Total time : NETWORK GET 122
Average time : NETWORK GET 30
LOG_ARCHIVE_DEST_7 : NETWORK GET
Total count : NETWORK GET 4
Total bytes : NETWORK GET 4096
Total time : NETWORK GET 95
Average time : NETWORK GET 23
LOG_ARCHIVE_DEST_8 : NETWORK GET
Total count : NETWORK GET 4
Total bytes : NETWORK GET 4096
Total time : NETWORK GET 72
Average time : NETWORK GET 18
All TARGETS : NETWORK GET
Total count : NETWORK GET 24
Total bytes : NETWORK GET 16384
Total time : NETWORK GET 486
Average time : NETWORK GET 20
LOG_ARCHIVE_DEST_3 : NETWORK STALL REAP
Total count : NETWORK STALL REAP 1
Total time : NETWORK STALL REAP 1002872
Average time : NETWORK STALL REAP 1002872
LOG_ARCHIVE_DEST_4 : NETWORK STALL REAP
Total count : NETWORK STALL REAP 2
Total bytes : NETWORK STALL REAP 4096
Total time : NETWORK STALL REAP 926514
Average time : NETWORK STALL REAP 463257
LOG_ARCHIVE_DEST_5 : NETWORK STALL REAP
Total count : NETWORK STALL REAP 1
Total time : NETWORK STALL REAP 676984
Average time : NETWORK STALL REAP 676984
LOG_ARCHIVE_DEST_6 : NETWORK STALL REAP
Total count : NETWORK STALL REAP 2
Total bytes : NETWORK STALL REAP 4096
Total time : NETWORK STALL REAP 384819
Average time : NETWORK STALL REAP 192409
LOG_ARCHIVE_DEST_7 : NETWORK STALL REAP
Total count : NETWORK STALL REAP 1
Total time : NETWORK STALL REAP 214786
Average time : NETWORK STALL REAP 214786
All TARGETS : NETWORK STALL REAP
Total count : NETWORK STALL REAP 7
Total bytes : NETWORK STALL REAP 8192
Total time : NETWORK STALL REAP 3205975
Average time : NETWORK STALL REAP 457996
Total server-side network layer time 3206461
Percentage of time in network 107.23
Remote Server-Side Disk Statistics
LOG_ARCHIVE_DEST_3 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 29
Average time : DISK WRITE 29
LOG_ARCHIVE_DEST_4 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 64
Average time : DISK WRITE 64
LOG_ARCHIVE_DEST_5 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 33
Average time : DISK WRITE 33
LOG_ARCHIVE_DEST_6 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 53
Average time : DISK WRITE 53
LOG_ARCHIVE_DEST_7 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 44
Average time : DISK WRITE 44
LOG_ARCHIVE_DEST_8 : DISK WRITE
Total count : DISK WRITE 1
Total time : DISK WRITE 21
Average time : DISK WRITE 21
All TARGETS : DISK WRITE
Total count : DISK WRITE 6
Total time : DISK WRITE 244
Average time : DISK WRITE 40
LOG_ARCHIVE_DEST_3 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 2
Total time : DISK NOSTALL REAP 39
Average time : DISK NOSTALL REAP 19
LOG_ARCHIVE_DEST_4 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 2
Total time : DISK NOSTALL REAP 51
Average time : DISK NOSTALL REAP 25
LOG_ARCHIVE_DEST_5 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 2
Total time : DISK NOSTALL REAP 37
Average time : DISK NOSTALL REAP 18
LOG_ARCHIVE_DEST_6 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 2
Total time : DISK NOSTALL REAP 38
Average time : DISK NOSTALL REAP 19
LOG_ARCHIVE_DEST_7 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 2
Total time : DISK NOSTALL REAP 21
Average time : DISK NOSTALL REAP 10
LOG_ARCHIVE_DEST_8 : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 1
Total time : DISK NOSTALL REAP 4
Average time : DISK NOSTALL REAP 4
All TARGETS : DISK NOSTALL REAP
Total count : DISK NOSTALL REAP 11
Total time : DISK NOSTALL REAP 190
Average time : DISK NOSTALL REAP 17
LOG_ARCHIVE_DEST_3 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 5
Average time : BUFFER GET 5
LOG_ARCHIVE_DEST_4 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 4
Average time : BUFFER GET 4
LOG_ARCHIVE_DEST_5 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 4
Average time : BUFFER GET 4
LOG_ARCHIVE_DEST_6 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 5
Average time : BUFFER GET 5
LOG_ARCHIVE_DEST_7 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 5
Average time : BUFFER GET 5
LOG_ARCHIVE_DEST_8 : BUFFER GET
Total count : BUFFER GET 1
Total time : BUFFER GET 4
Average time : BUFFER GET 4
All TARGETS : BUFFER GET
Total count : BUFFER GET 6
Total time : BUFFER GET 27
Average time : BUFFER GET 4
Total server-side disk layer time 461
Percentage of time in disk layer 0.02
Remote Server-Side Data Guard Processing Statistics
LOG_ARCHIVE_DEST_3 : SLEEP TIME
Total count : SLEEP TIME 1
Total time : SLEEP TIME 10097
Average time : SLEEP TIME 10097
LOG_ARCHIVE_DEST_4 : SLEEP TIME
Total count : SLEEP TIME 1
Total time : SLEEP TIME 10166
Average time : SLEEP TIME 10166
LOG_ARCHIVE_DEST_5 : SLEEP TIME
Total count : SLEEP TIME 1
Total time : SLEEP TIME 10151
Average time : SLEEP TIME 10151
LOG_ARCHIVE_DEST_6 : SLEEP TIME
Total count : SLEEP TIME 1
Total time : SLEEP TIME 10086
Average time : SLEEP TIME 10086
LOG_ARCHIVE_DEST_7 : SLEEP TIME
Total count : SLEEP TIME 1
Total time : SLEEP TIME 10097
Average time : SLEEP TIME 10097
All TARGETS : SLEEP TIME
Total count : SLEEP TIME 5
Total time : SLEEP TIME 50597
Average time : SLEEP TIME 10119
LOG_ARCHIVE_DEST_3 : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 1
Total time : PUBLISH RTA BOUNDARY 6
Average time : PUBLISH RTA BOUNDARY 6
LOG_ARCHIVE_DEST_4 : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 1
Total time : PUBLISH RTA BOUNDARY 8
Average time : PUBLISH RTA BOUNDARY 8
LOG_ARCHIVE_DEST_5 : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 1
Total time : PUBLISH RTA BOUNDARY 5
Average time : PUBLISH RTA BOUNDARY 5
LOG_ARCHIVE_DEST_6 : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 1
Total time : PUBLISH RTA BOUNDARY 5
Average time : PUBLISH RTA BOUNDARY 5
LOG_ARCHIVE_DEST_7 : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 1
Total time : PUBLISH RTA BOUNDARY 4
Average time : PUBLISH RTA BOUNDARY 4
All TARGETS : PUBLISH RTA BOUNDARY
Total count : PUBLISH RTA BOUNDARY 5
Total time : PUBLISH RTA BOUNDARY 28
Average time : PUBLISH RTA BOUNDARY 5
LOG_ARCHIVE_DEST_3 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 1
Average time : VALIDATE BUFFER 1
LOG_ARCHIVE_DEST_4 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 2
Average time : VALIDATE BUFFER 2
LOG_ARCHIVE_DEST_5 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 1
Average time : VALIDATE BUFFER 1
LOG_ARCHIVE_DEST_6 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 2
Average time : VALIDATE BUFFER 2
LOG_ARCHIVE_DEST_7 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 1
Average time : VALIDATE BUFFER 1
LOG_ARCHIVE_DEST_8 : VALIDATE BUFFER
Total count : VALIDATE BUFFER 1
Total time : VALIDATE BUFFER 1
Average time : VALIDATE BUFFER 1
All TARGETS : VALIDATE BUFFER
Total count : VALIDATE BUFFER 6
Total time : VALIDATE BUFFER 8
Average time : VALIDATE BUFFER 1
Total Server-Side DG layer time 18446744073709334909
Percentage of time in DG layer 616903601029120.00
krsb_end: End stats dump


Trace file also gives summary for all targets. For example total time for all the OCI requests
All TARGETS : OCI REQUEST          
Total time : OCI REQUEST 2816673
comes from summing of all the values for each log archive destination's OCI REQUEST total time values
LOG_ARCHIVE_DEST_3 : OCI REQUEST          
Total time : OCI REQUEST 306754
LOG_ARCHIVE_DEST_4 : OCI REQUEST
Total time : OCI REQUEST 210057
LOG_ARCHIVE_DEST_5 : OCI REQUEST
Total time : OCI REQUEST 739457
LOG_ARCHIVE_DEST_6 : OCI REQUEST
Total time : OCI REQUEST 593044
LOG_ARCHIVE_DEST_7 : OCI REQUEST
Total time : OCI REQUEST 433191
LOG_ARCHIVE_DEST_8 : OCI REQUEST
Total time : OCI REQUEST 534170


For granual timing information for disk and DG layer the "Remote Server-Side *" sections of the trace output must be refered. This is something missing on the white paper. There lies the problem with this tracing. The section before the "Remote Server-Side *" (one that has similar output to one that is shown in white paper) has percentage values that add up to 100.
    Percentage of time in network           94.23
Percentage of time in disk layer 0.02
Percentage of time in DG layer 5.75
It is these values (source of the ASYNC transport) as per the white paper that is used to tune the transport related issues. Remote section which seems to relate to the standby instances that receive redo is not considered for tuning.
The odd part is that the percentages after "Remote Server-Side *" doesn't add up to 100.
Remote Server-Side Network Statistics
Percentage of time in network 107.23
Remote Server-Side Disk Statistics
Percentage of time in disk layer 0.02
Remote Server-Side Data Guard Processing Statistics
Percentage of time in DG layer 616903601029120.00
These odd statistics were observed multiple times during tracing. Oracle support acknowledged that this is a bug, but as the original SR was on "how to trace" and as that was resolved was eager to close the SR. As such there was no further investigation on the matter.
On a separate trace test case involving just one primary and one standby the remote section did produce percentages values that made sense.
krsb_end: Begin stats dump for T-1.S-5521

max number of buffers in use 10
Operation elapsed time (micro seconds) 229354206
File transfer time (micro seconds) 229354197

Network Statistics

LOG_ARCHIVE_DEST_2 - OCI REQUEST
Total count - OCI REQUEST 7820
Total time - OCI REQUEST 137552
Average time - OCI REQUEST 17

LOG_ARCHIVE_DEST_2 - NETWORK SEND
Total count - NETWORK SEND 7817
Total time - NETWORK SEND 252474
Average time - NETWORK SEND 32

Total data buffers queued 7816
Total data buffers completed 7816
Total bytes written 10035712
Total bytes completed synchronously 9645056
Average network send size (blocks) 2
Average network send buffers 1.00
Average buffer turnaround time 770
Throughput (MB/s) 0.04

LOG_ARCHIVE_DEST_2 - NETWORK NO-STALL REAP
Total count - NETWORK NO-STALL REAP 36
Total bytes - NETWORK NO-STALL REAP 359424
Total time - NETWORK NO-STALL REAP 370
Average time - NETWORK NO-STALL REAP 10

LOG_ARCHIVE_DEST_2 - NETWORK STALL REAP
Total count - NETWORK STALL REAP 3
Total bytes - NETWORK STALL REAP 31232
Total time - NETWORK STALL REAP 82695
Average time - NETWORK STALL REAP 27565

Total network layer time 473091
Percentage of time in network 0.21

Disk Statistics

Total count - DISK READ 7816
Total time - DISK READ 53465
Average time - DISK READ 6

Log buffer blocks 19600
Total count - BUFFER RELEASE 7815
Total time - BUFFER RELEASE 7972
Average time - BUFFER RELEASE 1

Total disk layer time 61437
Percentage of time in disk layer 0.03

Data Guard Processing Statistics

Total count - SLEEP 7814
Total time - SLEEP 228139616
Average time - SLEEP 29196

Total DG layer time 228819669
Percentage of time in DG layer 99.77

Remote Server-Side Network Statistics

LOG_ARCHIVE_DEST_2 - NETWORK GET
Total count - NETWORK GET 23449
Total bytes - NETWORK GET 8112128
Total time - NETWORK GET 86691
Average time - NETWORK GET 3

LOG_ARCHIVE_DEST_2 - NETWORK STALL REAP
Total count - NETWORK STALL REAP 6450
Total bytes - NETWORK STALL REAP 1923584
Total time - NETWORK STALL REAP 224942932
Average time - NETWORK STALL REAP 34874

Total server-side network layer time 225029623
Percentage of time in network 98.11

Remote Server-Side Disk Statistics

LOG_ARCHIVE_DEST_2 - DISK WRITE
Total count - DISK WRITE 7816
Total time - DISK WRITE 3977689
Average time - DISK WRITE 508

LOG_ARCHIVE_DEST_2 - DISK NOSTALL REAP
Total count - DISK NOSTALL REAP 7816
Total blocks - DISK NOSTALL REAP 19601
Total time - DISK NOSTALL REAP 3088
Average time - DISK NOSTALL REAP 0

LOG_ARCHIVE_DEST_2 - BUFFER GET
Total count - BUFFER GET 7816
Total time - BUFFER GET 3059
Average time - BUFFER GET 0

LOG_ARCHIVE_DEST_2 - BUFFER RELEASE
Total count - BUFFER RELEASE 7816
Total time - BUFFER RELEASE 2001
Average time - BUFFER RELEASE 0

Total server-side disk layer time 3985837
Percentage of time in disk layer 1.74

Remote Server-Side Data Guard Processing Statistics

LOG_ARCHIVE_DEST_2 - PUBLISH RTA BOUNDARY
Total count - PUBLISH RTA BOUNDARY 7816
Total time - PUBLISH RTA BOUNDARY 25116
Average time - PUBLISH RTA BOUNDARY 3

LOG_ARCHIVE_DEST_2 - VALIDATE BUFFER
Total count - VALIDATE BUFFER 7816
Total time - VALIDATE BUFFER 3041
Average time - VALIDATE BUFFER 0

Total Server-Side DG layer time 338737
Percentage of time in DG layer 0.15

krsb_end: End stats dump


In this case the percentages on each layer were
Percentage of time in network           0.21
Percentage of time in disk layer 0.03
Percentage of time in DG layer 99.77
For remote section
Remote Server-Side Network Statistics
Percentage of time in network 98.11
Remote Server-Side Disk Statistics
Percentage of time in disk layer 1.74
Remote Server-Side Data Guard Processing Statistics
Percentage of time in DG layer 0.15
So the statistics related issue appears to occur in tracing when redo is sent to multiple destinations.
In summary if async redo transport to be traced when far sync is present use event 16421 with level 2. Check the values generated by the trace file makes sense.

Viewing all articles
Browse latest Browse all 314

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>