LGWR process priorities on the Solaris platform
Increasing LGWR priority would be justified if:
1) High user waits for “log file sync”
2) LGWR itself is NOT waiting significant time for “log file parallel write”
3) OS statistic “OS Wait-cpu (latency) time” for LGWR’s session is a
significant amount of logwriter’s time (you’ve got to sample this over an
interval, subtract CPU and wait times from the interval and hopefully then you
can compare the remainder to the statistics value. And you need
timed_os_statistics=true or statistics_level=all for that)
..if those conditions aren’t satisfied, then you won’t gain from increasing
LGWRs priority even if your CPUs are 100% utilized.
Increasing Priority of lgwr Process using _high_priority_processes
To Tanel’s and Bob Sneed’s point, RT cpu priority is kernel preemptive. LGWR can block the interrupts being serviced by executing in RT mode in the CPU. LGWR needs those interrupts to be serviced and those interrupts quite probably could be I/O completion interrupts for LGWR to complete commit/log file sync processing. This issue of waiter blocking the interrupts server leads to an incorrect priority anomaly, analogous to a waiter of a resource evicting the holder of a resource off the CPU.
On the other hand, in RAC, LMS processes will wait for LGWR to complete ‘log file sync’ if the block to be transferred is considered “busy”. But, LMS is also running in RT mode. So if the LGWR process is not running in RT mode, then the LMS process can evict LGWR from the CPU. This also leads to incorrect priority anomaly.
If there are enough CPUs, in RAC, I would prefer, to run both LGWR and LMS* processes in RT mode (and VKTM in 11g). Then fence the interrupts to a different processor set so that LGWR/LMS* can not block those interrupts. In this way we should be able to keep interrupts/LMS/LGWR priorities straight. [At the very least, run LGWR in FX 60 in Solaris so that LGWR does not suffer from TS class scheduling latencies.]
It is important to also realize that, default number of LMS processes are quite excessive. For example, one of my client site had 28 LMS processes and all 28 of them running in RT mode! You can imagine how much CPU usage will be used by those LMS processes. In my opinion, in most cases, 3 or 5 LMS processes per instance is good enough to service GC traffic. So, if the number of LMS processes are excessive, then it is probably worthwhile to reduce that and then also increase LGWR priority
Manly Men Only Use Solid State Disk For Redo Logging. LGWR I/O is Simple, But Not LGWR Processing
Without elevated scheduling priority for LGWR, it can often be the case that the processes LGWR wakes up will take the CPU from LGWR. Some platforms (Legacy Unix) support the notion of making processes non-preemptable as well. On those platforms you can make LGWR non-preemptable. Without preemption protection, LGWR can lose CPU before it has exhausted its fair time slice.
Once LGWR loses his CPU it may be quite some time until he gets it back. For instance, if LGWR is preempted in the middle of trying to perform a redo buffer flush, there may be several time slices of execution for other processes before LGWR gets back on CPU.
Improving Performance by Using a Cartesian Join 2