FIO Options Summary
Author: Administrator   Date: 2016-08-08 (Mon) 11:59   Views: 11569

https://wiki.mikejung.biz/Benchmarking#Fio

Fio Test Options and Examples

blocksize

This option sets the block size of the I/O units used during the test. The default value for blocksize is 4k (4KB), and it can be set for both read and write tests. Random workloads typically use the default of 4k, while sequential workloads usually use 1M (1MB). Set this value to whatever your production environment uses so that the test replicates the real-world scenario as closely as possible. If your servers deal with 4K block sizes 99% of the time, there is little point in testing performance with a 1MB block size.

--blocksize=4k (default)

ioengine

By default, Fio runs tests with the sync I/O engine, but you can choose a different one. Many engines are available; on Linux the most common choices are sync and, if the kernel supports it, libaio.

--ioengine=sync (default)

iodepth

The iodepth option defines the number of I/O units Fio keeps in flight against the file during the test. With the default sync ioengine, raising iodepth above the default value of 1 has no effect. Even with an asynchronous engine such as libaio, the OS might restrict the maximum iodepth and ignore the specified value. Because of this, I recommend starting with an iodepth of 1, then raising it to something like 16 and testing again. If you do not see any performance difference, you may not want to specify this option at all, especially if you have set direct to 1. Again, every server / OS is different, so test a few combinations of options before you start recording results.

--iodepth=1 (default)
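
The comparison described above (an iodepth of 1, then 16) could be sketched as a small dry-run helper. This is only an illustration: the function name iodepth_sweep, the job names, and the 60 second runtime are assumptions, not part of the original article. It uses libaio with direct=1, since the sync engine stays at a depth of 1. The helper only prints the commands; pipe its output to sh on a machine with fio installed to actually run them.

```shell
# Print one fio command per queue depth so the runs can be compared.
# iodepth_sweep is a hypothetical helper; it does not run fio itself.
iodepth_sweep() {
    for depth in "$@"; do
        echo "fio --name=qd${depth} --ioengine=libaio --iodepth=${depth}" \
             "--rw=randread --bs=4k --direct=1 --size=512M" \
             "--runtime=60 --group_reporting"
    done
}

# Dry run: show the commands for queue depths 1 and 16.
iodepth_sweep 1 16
```

If the IOPS reported by the two runs are about the same, the higher depth is probably not taking effect on that system.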

direct

This option tells Fio whether to use direct I/O or buffered I/O. The default value is 0, which means that Fio uses buffered I/O for the test. If you set this value to 1, Fio avoids buffered I/O; usually this is similar to O_DIRECT. Buffered I/O will almost always report better performance than non-buffered I/O, especially for read tests or on a server with a very large amount of RAM, so using non-buffered I/O helps to avoid inflated results. If you ever run a test and Fio tells you that an SSD performed 600,000 IOPS, odds are it did not: Fio was reading out of RAM, which will obviously be faster.

--direct=0 (default)
direct=1   does not use the buffer; I/O goes directly to the device

fsync

The fsync option tells Fio how often it should call fsync to flush "dirty data" to disk: a value of N means one fsync after every N writes. By default this value is 0, which means "don't sync". Many applications behave this way and leave it up to Linux to decide when to flush data from memory to disk. If your application or server always flushes every write to disk (metadata and data), include this option and set it to 1. If your application does not flush data to disk after each write, or you aren't too worried about potential data loss, leave this value alone. With fsync set to 1, no write stays buffered, so if you want to see the "worst case" I/O performance of a block device, set fsync to 1 and run a random write test. Results will be much lower than without fsync, but since every single write operation has to be flushed to disk, the disk will be stressed.

--fsync=0 (default)
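
A "worst case" run of the kind described above might look like the sketch below. The wrapper name worst_case_randwrite, the job name, the 512M size, and the 60 second runtime are assumptions chosen for illustration; only the fio options themselves come from the article or from fio's documented option set.

```shell
# Worst-case random write: fsync after every write, one job, queue depth 1.
# worst_case_randwrite is a hypothetical wrapper; extra arguments are passed
# through to fio (for example --directory=/mnt/test, an assumed path).
worst_case_randwrite() {
    fio --name=randwrite-fsync --ioengine=sync --rw=randwrite --bs=4k \
        --direct=0 --fsync=1 --size=512M --numjobs=1 --runtime=60 \
        --group_reporting "$@"
}

# Usage (requires fio to be installed):
#   worst_case_randwrite --directory=/mnt/test
```

Running the same wrapper with --fsync=0 appended should show how much of the difference comes from the flushes alone, since fio generally lets a later option override an earlier one.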

Fio Random Write and Random Read Command Line Examples

Random Write

The command below runs a Fio random write test. It writes a total of 4GB of files (8 jobs x 512MB each = 4GB accessed in total), running 8 processes at once. To avoid excessive buffering, the total data set should be double the amount of RAM: on a server with, say, 8GB of RAM, set size to 2G (8 jobs x 2GB = 16GB), or leave the file size alone and raise the number of jobs until the total reaches 16GB. If you don't have that much space to test with, change "--direct=0" to "--direct=1", which helps to avoid caching / buffering the writes. Buffering might happen in the real world, but if you just want to isolate the block device performance without Linux caching / buffering the data, either use direct=1 or use a data set that is 2 times larger than the amount of RAM, which makes it impossible to cache / buffer all of the writes. With the group_reporting option, FIO combines each job's stats into one aggregate result, so the output is much easier to read. This example uses a queue depth (QD) of 1 and buffered IO; both settings are the FIO defaults, so this is the most basic random write test you can run.

fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=512M --numjobs=8 --runtime=240 --group_reporting
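
The sizing rule above (total data set = 2 x RAM) is simple arithmetic, sketched here as a shell helper. The name per_job_size_mb and its argument order are assumptions made for this example.

```shell
# Per-job --size (in MB) so that all jobs together cover 2x the RAM.
# per_job_size_mb is a hypothetical helper.
#   $1 = RAM in MB, $2 = number of jobs (--numjobs)
per_job_size_mb() {
    ram_mb=$1
    njobs=$2
    echo $(( ram_mb * 2 / njobs ))
}

# 8GB of RAM and 8 jobs -> 2048 MB per job (16GB in total).
per_job_size_mb 8192 8
```

On Linux the RAM figure could be read from /proc/meminfo (MemTotal is reported in kB), and the result passed to fio as --size=2048M.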

For example, I would run the command below on a 1GB SSD LiquidWeb StormVPS to get a quick idea of its random write performance. Here I run the test for only 60 seconds, just to gather results quickly for this example.

fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=256M --numjobs=8 --runtime=60 --group_reporting

Once I run the command above, the output will look like this. The test begins as soon as the last of the 8 files is laid out by FIO.

randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
...
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.0.13
Starting 8 processes
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
Jobs: 5 (f=5): [w_w_www_] [86.7% done] [0K/286.8M/0K /s] [0 /73.5K/0  iops] [eta 00m:02s]

Once the test is complete, FIO prints the test results, which look like the output below. There are a ton of stats here, and at first it's a little overwhelming. When I run a FIO test I always record the full results and store them somewhere. Even if I only care about one or two stats, like iops or 95% clat, I still keep everything in case I need to grab another stat later on. Usually I store the full results in a Google Sheet, in a note on the same cell as the iops. If you only record the iops, what happens when you need to recall the date you ran the test? I placed a * next to the lines that I usually pay attention to.

randwrite: (groupid=0, jobs=8): err= 0: pid=22394: Sun Mar  1 13:13:18 2015
*  write: io=2048.0MB, bw=169426KB/s, iops=42356 , runt= 12378msec
    slat (usec): min=1 , max=771608 , avg=177.53, stdev=5435.09
    clat (usec): min=0 , max=12601 , avg= 1.46, stdev=65.22
     lat (usec): min=2 , max=771614 , avg=180.20, stdev=5436.13
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    0], 10.00th=[    0], 20.00th=[    0],
     | 30.00th=[    0], 40.00th=[    0], 50.00th=[    1], 60.00th=[    1],
*     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    1],
     | 99.00th=[    2], 99.50th=[    3], 99.90th=[   70], 99.95th=[  217],
     | 99.99th=[ 1160]
    bw (KB/s)  : min=  154, max=80272, per=12.29%, avg=20817.29, stdev=16052.89
    lat (usec) : 2=95.46%, 4=4.06%, 10=0.16%, 20=0.13%, 50=0.07%
    lat (usec) : 100=0.04%, 250=0.04%, 500=0.03%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=0.94%, sys=15.12%, ctx=153139, majf=0, minf=201
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=524288/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=2048.0MB, aggrb=169425KB/s, minb=169425KB/s, maxb=169425KB/s, mint=12378msec, maxt=12378msec

Disk stats (read/write):
*  vda: ios=62/444879, merge=28/119022, ticks=168/530425, in_queue=530499, util=89.71%

The key things to look for from these results are:

  • iops=42356 - On the 1GB LiquidWeb SSD VPS, the fio test achieved 42,356 random write IOPS with buffered IO, a QD of 1, and 8 jobs writing to 8 x 256MB files for up to 60 seconds. Almost everyone uses the IOPS stat instead of the BW stat for random reads or writes, since BW is usually used for sequential tests. An I/O operation is a single input or output operation; IOPS is the number of those operations performed in 1 second. The more IOPS the better.
  • clat percentiles (usec) 95.00th=[ 1] - I prefer to look at clat instead of slat because clat is the time between submission and completion, which tells you more than slat, the submission latency alone. I like to use the 95% value, which tells you that 95% of all IO operations completed within this time. It does not count the slowest 5%, but there will always be some slower requests; we just want to find out how fast most requests complete. With an average or max number, it's a lot harder to understand how quickly most requests complete. The values for clat are in microseconds: 1 microsecond (1 us) means that the request took 1/1000000 of a second to complete. Don't mix this up with 1 millisecond, which is 1/1000 of a second and three orders of magnitude slower. This value may not always be accurate, especially if you are testing a VPS / Cloud Server. Any time a hypervisor is involved with IO there will be some wonkiness: since the guest instance can't directly access the block device (at least in most cases), the times may be slightly off compared to testing on bare metal hardware.
  • util=89.71% - Once we know how many IOPS the device can handle and how quickly 95% of the operations complete, the last thing I usually want to know is whether the device was maxed out during the test. In this case the block storage device for my 1GB VPS was only about 90% utilized, which means it still had capacity to serve some requests even while I was running the test. Usually you want to push the device to 100% utilization during the test, or you won't see its true performance potential.
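
If the full output is saved to a file, the three starred values could be pulled out with a small filter like the sketch below. The helper name fio_summary and the output labels are assumptions, and the patterns are tied to the fio-2.0.13 text format shown above; newer fio versions print these lines differently, so the patterns would need adjusting.

```shell
# Pull the three starred values out of fio 2.0.x style text output on stdin.
# fio_summary is a hypothetical helper; the patterns assume the exact line
# layout of the sample output in this article.
fio_summary() {
    sed -n \
        -e 's/.*iops=\([0-9]*\).*/iops=\1/p' \
        -e 's/.*95\.00th=\[ *\([0-9]*\)].*/clat_p95_usec=\1/p' \
        -e 's/.*util=\([0-9.]*\)%.*/util_pct=\1/p'
}

# Example with three lines copied from the results above.
fio_summary <<'EOF'
  write: io=2048.0MB, bw=169426KB/s, iops=42356 , runt= 12378msec
     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    1],
  vda: ios=62/444879, merge=28/119022, ticks=168/530425, in_queue=530499, util=89.71%
EOF
```

For a saved report the call would be something like fio_summary < results.txt, where results.txt is a placeholder file name.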

Random Read

Random read test. This reads a total of 4GB of files (8 jobs x 512MB each), running 8 processes at once. To avoid excessive buffering, the total data set should be double the amount of RAM: on a server with, say, 8GB of RAM, set size to 2G. The group_reporting option combines each job's stats so you get an overall result.

fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=0 --size=512M --numjobs=8 --runtime=240 --group_reporting
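
The earlier advice about keeping the full output of every run could be applied to this random read test roughly as follows. The helper names result_file and run_randread and the file naming scheme are assumptions for illustration; --output is fio's option for writing the report to a file instead of the terminal.

```shell
# Build a result file name that records the job name and a time stamp,
# so the date of the run can be recovered later.
# result_file is a hypothetical helper, not a fio feature.
result_file() {
    echo "fio-$1-$(date +%Y%m%d-%H%M%S).txt"
}

# Run the random read test above and keep the complete report on disk.
# run_randread is a hypothetical wrapper around the command from the article.
run_randread() {
    fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread \
        --bs=4k --direct=0 --size=512M --numjobs=8 --runtime=240 \
        --group_reporting --output="$(result_file randread)"
}

# Usage (requires fio to be installed):
#   run_randread
```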



   

 



 