Tag Archives: SLOB

How to get extreme Exadata IOPS with SLOB

From the Exadata X3-2 datasheet, the maximum Flash IOPS is 1.5M, which is pretty impressive.

With SLOB and help from Kevin Closson, I can get around 1.4M maximum IOPS on an X3-2 full rack. This blog shows how to use Kevin Closson's SLOB to push the IOPS on an X3-2 over 1M. To push IOPS as hard as possible, I use all 8 db nodes on the X3-2 to trigger random reads concurrently against the 14 cell nodes. The procedure is to set up 256 SLOB schemas, keep the SLOB tables in the smart flash cache, and warm up the cache with smart scans. Then prepare the SQL*Net connections to the 8 db nodes and make sure the buffer cache is small enough that logical reads cannot hit the buffer cache and must issue physical read requests to the cells. Here are the preparation steps and the final observations.

1. SLOB 2

Get the SLOB 2 source (2013.05.05.slob2.tar) and patch it with SLOB2_patch_2014.01.13.tar.

SLOB 2 — A Significant Update. Links Are Here.
http://kevinclosson.wordpress.com/2013/05/02/slob-2-a-significant-update-links-are-here/

Patch README :
https://my.syncplicity.com/share/ien1nsqok0equbk/README

Patch Source:
https://my.syncplicity.com/share/vba5x0tgn05a3bk/SLOB2_patch_2014.01.13.tar

2. Set up 256 SLOB schemas.

You need to hack setup.sh a little bit to set up more than 128 schemas.
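For reference, a minimal sketch of the invocation, assuming the IOPS tablespace already exists (setup.sh takes the tablespace name and the schema count, the same form used later in these posts):

# create 256 SLOB schemas in the IOPS tablespace
# (setup.sh must be hacked first to allow more than 128 users)
./setup.sh IOPS 256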

NOTIFY: Load Parameters (slob.conf):

LOAD_PARALLEL_DEGREE == 32

...

NOTIFY: Seed table loading procedure has exited.
NOTIFY: Setting up user   1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32
NOTIFY: Waiting for background processes - Sun Jan 26 04:58:02 PST 2014
NOTIFY: Setting up user   33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64
NOTIFY: Waiting for background processes - Sun Jan 26 05:03:13 PST 2014
NOTIFY: Setting up user   65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96
NOTIFY: Waiting for background processes - Sun Jan 26 05:08:13 PST 2014
NOTIFY: Setting up user   97  98  99  100  101  102  103  104  105  106  107  108  109  110  111  112  113  114  115  116  117  118  119  120  121  122  123  124  125  126  127  128
NOTIFY: Waiting for background processes - Sun Jan 26 05:13:06 PST 2014
NOTIFY: Setting up user   129  130  131  132  133  134  135  136  137  138  139  140  141  142  143  144  145  146  147  148  149  150  151  152  153  154  155  156  157  158  159  160
NOTIFY: Waiting for background processes - Sun Jan 26 05:17:59 PST 2014
NOTIFY: Setting up user   161  162  163  164  165  166  167  168  169  170  171  172  173  174  175  176  177  178  179  180  181  182  183  184  185  186  187  188  189  190  191  192
NOTIFY: Waiting for background processes - Sun Jan 26 05:23:35 PST 2014
NOTIFY: Setting up user   193  194  195  196  197  198  199  200  201  202  203  204  205  206  207  208  209  210  211  212  213  214  215  216  217  218  219  220  221  222  223  224
NOTIFY: Waiting for background processes - Sun Jan 26 05:29:02 PST 2014
NOTIFY: Setting up user   225  226  227  228  229  230  231  232  233  234  235  236  237  238  239  240  241  242  243  244  245  246  247  248  249  250  251  252  253  254  255  256
NOTIFY: Waiting for background processes - Sun Jan 26 05:34:54 PST 2014

Connect to the SLOB database as user1/user1 via SQL*Plus and execute @./procedure.
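For example (a minimal sketch, using a local connection from the SLOB directory):

sqlplus user1/user1 <<'EOF'
@./procedure
EOF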

3. Prepare 8 SLOB services and add the 8 entries to tnsnames.ora

srvctl add service -d exa -s slob1 -r exa1 -a exa2
srvctl add service -d exa -s slob2 -r exa2 -a exa1
srvctl add service -d exa -s slob3 -r exa3 -a exa1
srvctl add service -d exa -s slob4 -r exa4 -a exa1
srvctl add service -d exa -s slob5 -r exa5 -a exa1
srvctl add service -d exa -s slob6 -r exa6 -a exa1
srvctl add service -d exa -s slob7 -r exa7 -a exa1
srvctl add service -d exa -s slob8 -r exa8 -a exa1

srvctl start service -d exa -s slob1
srvctl start service -d exa -s slob2
srvctl start service -d exa -s slob3
srvctl start service -d exa -s slob4
srvctl start service -d exa -s slob5
srvctl start service -d exa -s slob6
srvctl start service -d exa -s slob7
srvctl start service -d exa -s slob8

slob1 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db01)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob1)))
slob2 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db02)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob2)))
slob3 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db03)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob3)))
slob4 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db04)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob4)))
slob5 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db05)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob5)))
slob6 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db06)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob6)))
slob7 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db07)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob7)))
slob8 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = demo10db08)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = slob8)))
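A quick sanity check of the service placement and the new aliases (optional, standard tools):

srvctl status service -d exa
tnsping slob1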

4. slob.conf

UPDATE_PCT=0
RUN_TIME=300
WORK_LOOP=0
SCALE=1000000
WORK_UNIT=256
REDO_STRESS=HEAVY
LOAD_PARALLEL_DEGREE=32
SHARED_DATA_MODULUS=0

# Settings for SQL*Net connectivity:
SQLNET_SERVICE_BASE=slob
SQLNET_SERVICE_MAX=8

5. Init parameters

Huge pages are enabled, except that one of the eight nodes does not have enough huge pages to hold the whole SGA. A 128M buffer cache seems to be the smallest size allowed with CPU_COUNT=32.

*.use_large_pages='TRUE'
*.db_cache_size  = 128m

6. Keep the SLOB tables in the smart flash cache and warm them up with 3 smart scans.

declare
  n number;
begin
  -- pin every SLOB table in the cell smart flash cache, then warm it up
  -- with three full scans (smart scans) so the blocks are resident in flash
  for i in 1 .. 256
  loop
    execute immediate 'alter table user' || i || '.cf1 storage (CELL_FLASH_CACHE keep)';
    execute immediate 'select /*+ full(cf1) */ count(*) from user' || i || '.cf1' into n;
    execute immediate 'select /*+ full(cf1) */ count(*) from user' || i || '.cf1' into n;
    execute immediate 'select /*+ full(cf1) */ count(*) from user' || i || '.cf1' into n;
  end loop;
end;
/
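To roughly confirm that the random reads are then served from flash rather than disk during a run, something like this can be checked (a sketch; these statistics exist in v$sysstat on Exadata):

sqlplus / as sysdba <<'EOF'
select name, value from v$sysstat
 where name in ('cell flash cache read hits', 'physical read total IO requests');
EOF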

7. Observations

I've tried 96, 128, 160, 192, 224 and 256 SLOB sessions. With 160 sessions, I get the maximum IOPS, averaging 1.36M. From the screen capture, the peak IOPS is over 1.4M, which is pretty close to the 1.5M stated in the X3-2 datasheet, and each of the 14 cells contributes about 100K IOPS evenly, which is good. From the AWR of the 160-session run, "cell list of blocks physical read" averages 3ms and "cell single block physical read" averages 1.25ms. CPU utilization is up to 90% on the cells, while it is only 37% on the db nodes; it seems that in such an extreme case the cell processing power is not sufficient and becomes the first bottleneck to scaling.
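A minimal sketch of how the different session counts can be driven, assuming SLOB 2's runit.sh interface (one argument, the total session count; the sessions are spread over the slob1..slob8 services by the SQLNET_SERVICE_BASE/SQLNET_SERVICE_MAX settings above):

# one 5-minute run (RUN_TIME=300) per session count
for n in 96 128 160 192 224 256
do
  ./runit.sh $n
done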

Top Timed Events                            DB/Inst: EXA/exa1  Snaps: 994-995

                               Wait                                   Event                      Wait Time                    Summary Avg Wait Time (ms)
       ---------------------------------------------------   ----------------------   ------------------------------   ----------------------------------------
  I#   Class      Event                                             Waits %Timeouts        Total(s) Avg(ms) %DB time        Avg      Min      Max  Std Dev  Cnt
----   ---------- ----------------------------------------   ------------ ---------   ------------- ------- --------   -------- -------- -------- -------- ----
   *   User I/O   cell list of blocks physical read            11,416,579       0.0       35,283.85     3.1    43.04       3.09     2.98     3.12     0.05    8
       User I/O   cell single block physical read              25,295,624       0.0       31,685.75     1.3    38.65       1.25     1.25     1.26     0.00    8
                  DB CPU                                              N/A       N/A       24,707.27     N/A    30.14                                          8
       Cluster    gc cr grant 2-way                               758,555       0.0          154.92     0.2     0.19       0.21     0.20     0.26     0.02    8
       System I/O log file parallel write                           9,559       0.0           46.95     4.9     0.06       4.93     4.46     5.22     0.30    8
       User I/O   read by other session                            11,890       0.0           12.14     1.0     0.01       1.02     0.99     1.07     0.03    8
       System I/O control file sequential read                      8,279       0.0            9.62     1.2     0.01       1.16     1.13     1.18     0.02    8
       Cluster    gc buffer busy acquire                            1,638       0.0            8.86     5.4     0.01       6.77     3.35    11.89     3.36    8
       Other      CSS operation: data query                         1,600       0.0            7.74     4.8     0.01       4.84     3.76     6.00     0.81    8
       Concurrenc latch: cache buffers chains                      49,387       0.0            4.55     0.1     0.01       0.09     0.09     0.10     0.00    8

System Statistics - Per Second               DB/Inst: EXA/exa1  Snaps: 994-995

             Logical     Physical     Physical         Redo        Block         User
  I#         Reads/s      Reads/s     Writes/s   Size (k)/s    Changes/s      Calls/s      Execs/s     Parses/s   Logons/s       Txns/s
---- --------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ---------- ------------
   1      170,727.68    170,406.3          2.7      5,017.9         17.2          1.6        664.6          3.8       0.14          0.0
   2      160,987.67    162,898.8          2.5      4,833.9         22.4          1.4        630.4          5.9       0.13          0.1
   3      163,967.64    163,394.0          2.1      4,825.7         15.0          1.4        638.2          3.6       0.14          0.0
   4      174,957.44    174,551.8          3.1      5,130.8         12.7          1.3        679.2          2.1       0.13          0.0
   5      169,568.64    170,028.9          1.8      5,012.8         10.4          1.4        658.4          2.1       0.14          0.0
   6      172,990.65    172,771.3          3.4      5,082.2         13.6          1.4        671.8          2.3       0.14          0.0
   7      172,572.60    172,256.4          1.8      5,064.6         14.0          1.4        671.6          3.5       0.14          0.0
   8      171,493.42    170,899.7          2.6      5,016.6          9.8          1.4        665.8          2.1       0.14          0.0
 ~~~ ~~~~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~~~
 Sum    1,357,265.74  1,357,207.1         20.0     39,984.6        115.1         11.5      5,280.0         25.3       1.10          0.3
 Avg      169,658.22    169,650.9          2.5      4,998.1         14.4          1.4        660.0          3.2       0.14          0.0
 Std        4,777.34      4,268.8          0.6        111.3          4.0          0.1         17.1          1.3       0.00          0.0

Update on 2014-Feb-08:

With WORK_UNIT=256 and only 160 SLOB sessions, I can push the IOPS over 1.3M. Although the cells are out of CPU (90%), CPU is pretty idle on the db nodes (37%). To go after 1.5M IOPS, I set up 768 SLOB schemas and tried smaller WORK_UNIT values with more SLOB sessions. With WORK_UNIT=8 or WORK_UNIT=16, the db nodes run out of CPU much faster than the storage cells. With WORK_UNIT=64, the CPU on the db nodes and the storage cells is relatively balanced. The maximum IOPS observed with WORK_UNIT=64 is still around 1.4M (still cannot get closer to 1.5M). Here are the graphs with WORK_UNIT=64, showing how IOPS, db node CPU utilization, Average Active Sessions and IO latency scale with the number of SLOB sessions.

--slob.conf
UPDATE_PCT=0
RUN_TIME=300
WORK_LOOP=0
SCALE=500000
WORK_UNIT=64
REDO_STRESS=HEAVY
LOAD_PARALLEL_DEGREE=32
SHARED_DATA_MODULUS=0

SLOB on Exadata X3-2: Random Read IOPS

I tried to benchmark the random read IOPS capacity of a single Exadata X3-2 db node with SLOB. Here are some figures.

1. Parameters

Disable prefetching and avoid multi-block reads by adding the 3 init parameters below.

_db_block_prefetch_limit              0
_db_block_prefetch_quota              0
_db_file_noncontig_mblock_read_count  0
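If you set them via the spfile, something like the following works (a sketch; hidden parameters need double quotes, and the instance has to be restarted to pick them up):

alter system set "_db_block_prefetch_limit"=0 scope=spfile sid='*';
alter system set "_db_block_prefetch_quota"=0 scope=spfile sid='*';
alter system set "_db_file_noncontig_mblock_read_count"=0 scope=spfile sid='*';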

The block size is 8K and db_cache_size is set to 128M (this seems to be the lowest db_cache_size I can set on 11.2.0.3).

cpu_count                     32
db_block_size                 8192
db_cache_size                 134217728

2. Segment size and SQL

Prepare 128 users, each owning a table CF1 of around 8G (with SCALE=1000000 in slob.conf):

setup.sh IOPS 128

The load of random 8K single-block reads is driven by the query below.

SELECT COUNT(C2) FROM CF1 WHERE CUSTID > ( :B1 - :B2 ) AND (CUSTID < :B1 )

3. Observations

The maximum IOPS is around 140K, reached with 96 concurrent processes. The average wait time of 'db file sequential read' increases once the number of concurrent processes goes beyond 64, and CPU utilization drops once it goes beyond 96; it seems the system is overloaded and performance degrades.

Update on 2013-Feb-09:

After the two changes below, I can get 250K IOPS with 64 sessions on a single db node (almost twice the IOPS before the change), with CPU at 90%.

Change 1. Comment out the parameters below to re-enable db block prefetching.

--*._db_block_prefetch_limit = 0
--*._db_block_prefetch_quota = 0
--*._db_file_noncontig_mblock_read_count = 0

Change 2. Update WORK_UNIT to 16.

--slob.conf
UPDATE_PCT=0
RUN_TIME=300
WORK_LOOP=0
SCALE=1000000
WORK_UNIT=16
REDO_STRESS=HEAVY
LOAD_PARALLEL_DEGREE=8
SHARED_DATA_MODULUS=0

The AWR numbers for single block reads and physical reads, with 32/64/96/128 SLOB sessions:

[oracle@demo01db01 awr]$ grep "cell single block physical rea" * | sort -t_ -k1 -n
32_awr.txt:cell single block physical rea    9,257,339       3,778      0   39.3 User I/O
64_awr.txt:cell single block physical rea   15,161,265       8,301      1   43.2 User I/O
96_awr.txt:cell single block physical rea   19,802,040      13,751      1   47.6 User I/O
128_awr.txt:cell single block physical rea   17,269,328      15,964      1   41.4 User I/O
[oracle@demo01db01 awr]$ grep "Physical reads:" * | sort -t_ -k1 -n
32_awr.txt:  Physical reads:          197,068.2       59,670,688.0
64_awr.txt:  Physical reads:          249,996.2        3,594,314.9
96_awr.txt:  Physical reads:          253,909.9        3,669,155.5
128_awr.txt: Physical reads:          255,982.8        3,733,850.4

SLOB – Knowing the Maximum LIO Number

This is a quick blog post on finding the maximum LIO rate of an old server with Kevin Closson's SLOB.

Why care about LIO?

With the adoption of SSD for storage cache and database files, the traditional database bottleneck on the IO subsystem is greatly mitigated. The CPU power of the database server can easily become the new bottleneck. Because Oracle is licensed by the database CPU processing power, i.e. the number of processors, the DBA needs to keep an eye on the CPU resources of the database server. It's good to express the processing power in terms of Oracle logical IO and to learn how many LIOs the database server can handle. SLOB is an amazing tool for learning about system capacity and limits, not only for measuring IOPS and latency, but also the maximum LIO rate.

In this little test, I'll look at the maximum LIO number only. The test server has 8 cores, runs Linux 2.6.9-55, and the Oracle version is 10.2.0.5.6. It is one node of a RAC.

[oracle SLOB]$ cat /proc/cpuinfo | grep -e "processor\|model name"
processor       : 0
model name      : Intel(R) Xeon(R) CPU           E5440  @ 2.83GHz
...
processor       : 7
model name      : Intel(R) Xeon(R) CPU           E5440  @ 2.83GHz

SQL> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     8

Running 8 concurrent read-only sessions (./runit 8 with UPDATE_PCT=0) produces the awr.txt and mpstat.out below. The 8 cores are almost 100 percent busy. From the AWR report, the 8-core server is able to process about 1,845,684 LIOs per second, roughly 230,000 per core. Newer Intel processors are probably faster, but 230,000 per core may give you a crude estimate.

~~~~~~~~~~~                       Begin        End
                             ---------- ----------
               Buffer Cache:     3,072M     3,072M  Std Block Size:         8K
           Shared Pool Size:     1,024M     1,024M      Log Buffer:    14,328K

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              4,439.12              3,884.76
              Logical reads:          1,845,683.68          1,615,193.45
              Block changes:                 12.25                 10.72
             Physical reads:                  0.81                  0.71
            Physical writes:                  1.45                  1.27
                 User calls:                  7.13                  6.24
                     Parses:                  4.84                  4.23
                Hard parses:                  0.03                  0.02
                      Sorts:                  2.96                  2.59
                     Logons:                  0.13                  0.12
                   Executes:              7,161.47              6,267.14
               Transactions:                  1.14


Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
CPU time                                          2,359          98.7
latch: cache buffers chains             629           4      7    0.2 Concurrenc
reliable message                         23           3    149    0.1      Other
PX Deq: reap credit                   1,425           3      2    0.1      Other
db file sequential read                 514           3      6    0.1   User I/O
          -------------------------------------------------------------

[oracle 8.0]$ cat db_stats.out
1|30000|29462|98.207|
3|30000|29675|98.917|
8|30000|29468|98.227|
7|30000|29562|98.54|
2|30000|29353|97.843|
6|30000|29304|97.68|
5|30000|28902|96.34|

[oracle 8.0]$ more mpstat.out
...
04:14:41 PM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
04:14:44 PM  all   99.13    0.00    0.75    0.00    0.00    0.12    0.00   2614.72
04:14:44 PM    0   96.66    0.00    2.68    0.00    0.00    0.67    0.00   1574.25
04:14:44 PM    1   99.33    0.00    1.00    0.00    0.00    0.00    0.00      0.00
04:14:44 PM    2  100.00    0.00    0.33    0.00    0.00    0.00    0.00      0.00
04:14:44 PM    3   99.33    0.00    1.00    0.00    0.00    0.00    0.00     30.77
04:14:44 PM    4  100.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
04:14:44 PM    5  100.33    0.00    0.33    0.00    0.00    0.00    0.00      0.00
04:14:44 PM    6  100.33    0.00    0.33    0.00    0.00    0.00    0.00   1003.68
04:14:44 PM    7  100.00    0.00    0.33    0.00    0.00    0.00    0.00      6.02
...

SLOB: An Interesting IO Testing Tool

Besides ORION, if you want a more realistic test of a storage system's IO capability, Kevin Closson's tool SLOB (Silly Little Oracle Benchmark) can be very useful. See the introduction on his blog (accessing it from China requires getting around the firewall).

http://kevinclosson.wordpress.com/2012/02/06/introducing-slob-the-silly-little-oracle-benchmark/

Below is the download link. Configuration is very simple; just read README-FIRST.

http://oaktable.net/articles/slob-silly-little-oracle-benchmark

With SLOB, different types of tests only require adjusting db_cache_size.
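For example, the idea is roughly this (values are only illustrative): a tiny buffer cache forces SLOB's reads out to storage so you measure physical IO, while a cache big enough to hold the working set keeps everything in memory so you measure logical IO.

-- physical IO test: buffer cache far smaller than the SLOB working set
alter system set db_cache_size = 128m scope=spfile sid='*';

-- logical IO / CPU test: buffer cache large enough to hold the working set
alter system set db_cache_size = 4g scope=spfile sid='*';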