Fio Test Options and Examples
blocksize
This option determines the block size for the I/O units used during
the test. The default value is 4k (4KB). The option can be set for
both read and write tests. For random workloads the default of 4k is
typically used, while for sequential workloads a value of 1M (1MB) is
more common. Change this value to whatever your production environment
uses so that the test replicates your real-world scenario as closely as
possible. If your servers deal with 4K block sizes 99% of the time,
there is little point in testing performance with a 1MB block size.
--blocksize=4k (default)
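For instance, a sequential write test that mirrors a large-file workload could use a 1M block size. The command below is only a sketch; the size, runtime, and job name are placeholders, so match them to your own environment:
fio --name=seqwrite --rw=write --bs=1M --size=1G --numjobs=1 --runtime=60 --group_reporting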
ioengine
By default, Fio runs tests using the sync I/O engine, but you can
change the engine if you want to. There are many options to choose
from, but on Linux the most common choices are sync, or libaio if the
kernel supports it.
--ioengine=sync (default)
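As a sketch, switching a job to asynchronous I/O on a kernel that supports libaio might look like the command below. The other values are placeholders; direct=1 is paired with libaio here because buffered asynchronous requests tend to complete synchronously anyway:
fio --name=asyncread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=1 --size=1G --runtime=60 --group_reporting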
iodepth
The iodepth option defines the number of I/O units kept in flight
against a file during the test. If you are using the default sync
ioengine, increasing the iodepth beyond the default value of 1 will
have no effect. Even if you change the ioengine to something like
libaio, the OS may restrict the maximum iodepth and ignore the
specified value. Because of this, I recommend starting with an iodepth
of 1, raising it to something like 16, and testing again; if you see no
performance difference, you may not want to specify this option at all,
especially if you have set direct to 1. Again, every server / OS is
different, so test a few combinations of options before you start
recording results.
--iodepth=1 (default)
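A sketch of that comparison (the sizes and runtimes are placeholders) is simply to run the same job at a queue depth of 1 and again at 16, then compare the IOPS:
fio --name=qd1 --ioengine=libaio --iodepth=1 --direct=1 --rw=randread --bs=4k --size=1G --runtime=60 --group_reporting
fio --name=qd16 --ioengine=libaio --iodepth=16 --direct=1 --rw=randread --bs=4k --size=1G --runtime=60 --group_reporting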
direct
This option tells Fio whether it should use direct I/O or buffered
I/O. The default value is 0, which means Fio will use buffered I/O for
the test. If you set this value to 1, Fio bypasses the page cache,
typically by opening the file with O_DIRECT. Buffered I/O will almost
always show better performance than non-buffered I/O, especially for
read tests or on a server with a very large amount of RAM, so using
non-buffered I/O helps avoid inflated results. If you ever run a test
and Fio tells you that an SSD performed 600,000 IOPS, odds are it did
not, and Fio is really reading out of RAM, which will obviously be faster.
--direct=0 (default)
--direct=1 (perform I/O directly against the device without using the buffer cache)
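To rerun a test without the page cache inflating the numbers, a direct I/O variant could look like the sketch below; the size, runtime, and job name are placeholders:
fio --name=randwrite-direct --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=1 --runtime=60 --group_reporting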
fsync
The fsync option tells Fio how often it should call fsync to flush
"dirty data" to disk. By default this value is 0, which means "don't
sync"; many applications behave this way and leave it up to Linux to
decide when to flush data from memory to disk. If your application or
server flushes every write to disk (metadata and data), include this
option and set it to 1. If your application does not flush data to disk
after each write, or you are not too worried about potential data loss,
leave this value alone. Setting fsync to 1 forces every single write to
be flushed to disk, so if you want to see the "worst case" I/O
performance for a block device, set fsync to 1 and run a random write
test. Results will be much lower than without fsync, but since every
single write operation has to be flushed to disk, the disk is truly
stressed.
--fsync=0 (default)
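A sketch of that worst-case random write test (the size and runtime are placeholders) would be:
fio --name=randwrite-fsync --rw=randwrite --bs=4k --fsync=1 --size=256M --numjobs=1 --runtime=60 --group_reporting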
Fio Random Write and Random Read Command Line Examples
Random Write
The command below will run a Fio random write test. It writes a total
of 4GB (8 jobs x 512MB each) using 8 processes at once. If you are
testing a server with, say, 8GB of RAM, you would want the total data
set to be double the RAM to avoid excessive buffering: with 8GB of RAM,
set size to 2G per job, or leave the per-job file size alone and
increase the number of jobs until the data set is large enough. If you
don't have that much space to test with, you can change "--direct=0" to
"--direct=1", which avoids caching / buffering the writes. Buffering
might well happen in the real world, but if you just want to isolate
the block device performance without Linux caching the data, either use
direct=1 or use a data set at least twice the size of RAM so it is
impossible to cache all of the writes. With the group_reporting option,
FIO combines each job's stats into one aggregate result, which makes
the output much easier to read. I'm using a queue depth of 1 and
buffered I/O for this example; both are the FIO defaults, so this is
the most basic random write test you can run.
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=512M --numjobs=8 --runtime=240 --group_reporting
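If the server being tested had 8GB of RAM, the same test sized to twice the RAM as described above would look like this:
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=2G --numjobs=8 --runtime=240 --group_reporting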
For example, I would run the command below on a 1GB SSD LiquidWeb StormVPS
to get a quick idea of its random write performance. Here I am only
running the test for 60 seconds, just to gather the results quickly for
this example.
fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=randwrite --bs=4k --direct=0 --size=256M --numjobs=8 --runtime=60 --group_reporting
Once I run the command above, the output will look like this. The
test begins as soon as the last of the 8 files is laid out by FIO.
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
...
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.0.13
Starting 8 processes
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
randwrite: Laying out IO file(s) (1 file(s) / 256MB)
Jobs: 5 (f=5): [w_w_www_] [86.7% done] [0K/286.8M/0K /s] [0 /73.5K/0 iops] [eta 00m:02s]
Once the test is complete, FIO will output the test results, which
look like the output below. There are a ton of stats here and it's a
little overwhelming at first. When I run a FIO test I always record the
full results and store them somewhere. Even if I only care about one or
two stats, such as IOPS or the 95th percentile clat, I still keep all
of the results in case I need to grab another stat later on. Usually
I'll store the full results in a Google Sheet, in a note in the same
cell as the IOPS. If you only record the IOPS and discard everything
else, what happens if you later need another stat, or the date you ran
the test? I placed a * next to the lines that I usually pay attention to.
randwrite: (groupid=0, jobs=8): err= 0: pid=22394: Sun Mar 1 13:13:18 2015
* write: io=2048.0MB, bw=169426KB/s, iops=42356 , runt= 12378msec
slat (usec): min=1 , max=771608 , avg=177.53, stdev=5435.09
clat (usec): min=0 , max=12601 , avg= 1.46, stdev=65.22
lat (usec): min=2 , max=771614 , avg=180.20, stdev=5436.13
clat percentiles (usec):
| 1.00th=[ 0], 5.00th=[ 0], 10.00th=[ 0], 20.00th=[ 0],
| 30.00th=[ 0], 40.00th=[ 0], 50.00th=[ 1], 60.00th=[ 1],
* | 70.00th=[ 1], 80.00th=[ 1], 90.00th=[ 1], 95.00th=[ 1],
| 99.00th=[ 2], 99.50th=[ 3], 99.90th=[ 70], 99.95th=[ 217],
| 99.99th=[ 1160]
bw (KB/s) : min= 154, max=80272, per=12.29%, avg=20817.29, stdev=16052.89
lat (usec) : 2=95.46%, 4=4.06%, 10=0.16%, 20=0.13%, 50=0.07%
lat (usec) : 100=0.04%, 250=0.04%, 500=0.03%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
cpu : usr=0.94%, sys=15.12%, ctx=153139, majf=0, minf=201
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=524288/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=2048.0MB, aggrb=169425KB/s, minb=169425KB/s, maxb=169425KB/s, mint=12378msec, maxt=12378msec
Disk stats (read/write):
* vda: ios=62/444879, merge=28/119022, ticks=168/530425, in_queue=530499, util=89.71%
The key things to look for from these results are:
- write: iops=42356 - On the 1GB LiquidWeb SSD VPS, the fio test
achieved roughly 42,000 random write IOPS with buffered I/O, a QD of 1,
and 8 jobs writing to 8 x 256MB files for up to 60 seconds; because the
writes were buffered, that figure is likely higher than the device
could sustain with direct or synced I/O. Almost everyone uses the IOPS
stat rather than the BW stat for random reads or writes, since BW is
usually reserved for sequential tests. An IOP is a single input or
output operation; IOPS is the number of those operations performed in
one second. The more IOPS the better.
- clat percentiles (usec) 95.00th=[ 1] - I prefer to
look at clat instead of slat because clat is the time between submission
and completion, which tells you more than slat alone, which only shows
the submission latency. I like to use the 95th percentile value, which
tells you that 95% of all I/O operations completed in under this time.
It does not count the slowest 5%; there will always be some requests
that are slower, and we just want to know how fast most requests
complete. If you use an average or a max number, it's much harder to
understand how quickly most requests complete. The clat values are in
microseconds: 1 microsecond (1 us) means the request took 1/1,000,000
of a second to complete. Don't confuse this with 1 millisecond, which
is 1/1,000 of a second and three orders of magnitude slower. This value
may not always be accurate, especially if you are testing a VPS / cloud
server. Any time a hypervisor is involved in the I/O path there will be
some wonkiness, since the guest instance usually cannot access the
block device directly, so the times may be slightly off compared to
testing on bare metal hardware.
- util=89.71% - Once we know how many IOPS the device can
handle, and how quickly 95% of the operations complete, the last thing I
usually want to know is whether the device was maxed out during the
test. In this case the block storage device for my 1GB VPS was only
about 90% utilized, which means it still had capacity to serve other
requests even while I was running the test. Usually you want to push
the device to 100% utilization during the test or you won't see its
true performance potential.
Random Read
Random read test. This reads a total of 4GB (8 jobs x 512MB each),
running 8 processes at once. If you are testing a server with, say, 8GB
of RAM, you would want the total data set to be double the RAM to avoid
excessive buffering; with 8GB of RAM, set size to 2G per job. Group
reporting will combine each job's stats so you get one overall result.
fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=0 --size=512M --numjobs=8 --runtime=240 --group_reporting
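If the server has too much RAM to make the data set large enough, a direct I/O variant of the same read test (a sketch, not a drop-in replacement for the results above) keeps the page cache out of the picture:
fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=1 --size=512M --numjobs=8 --runtime=240 --group_reporting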