Understanding I/O latency and queueing

29 August 2013 – Gerard Almon

Storage administrators and engineers often ask about response times and I/O rates, and how to tell whether there is a potential storage performance problem. It is important to understand some of the basics. The I/O rate is commonly referred to as IOPS. For any specific disk or storage LUN, the I/O rate is the number of disk transfers, or I/Os, completed against the disk in each one-second interval. The usual next question is: what should a good I/O rate be? This is where queueing, block size and response times come in. The basic maths to remember is that there are only 1000 milliseconds in a second, so in a very simple environment with a SATA disk that can only do a single I/O at a time (queue depth of 1), your maximum I/O rate depends on how long each I/O takes to complete. This is referred to as the latency or response time.

As an example, if each I/O in this case takes 1ms (millisecond) to complete, you can theoretically only do 1000 IOPS (1000ms / 1ms). If the response time is 10ms, your maximum I/O rate would be 100 IOPS. Note that the block size of each I/O does not appear in this calculation.
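The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the queue-depth-1 case described here; the function name `max_iops_serial` is made up for this example and does not come from any tool.

```python
MS_PER_SECOND = 1000  # there are only 1000 milliseconds in a second


def max_iops_serial(latency_ms: float) -> float:
    """Theoretical IOPS ceiling when only one I/O is in flight (queue depth 1)."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return MS_PER_SECOND / latency_ms


print(max_iops_serial(1))   # 1 ms per I/O  -> 1000.0
print(max_iops_serial(10))  # 10 ms per I/O -> 100.0
```

These are upper bounds on paper; a real disk will normally fall short of them.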

Enterprise storage controllers can, however, handle multiple concurrent I/Os to the same device, which is also referred to as the I/O queue to the disk. The operating system and HBA drivers can be configured to send multiple concurrent I/Os to the disk. In this case, if the individual response time for each I/O is 10ms but we can send 16 concurrent I/Os, it is still quite easy to work out the maximum I/O rate (16 x 100 = 1600 IOPS). This means that with a queue depth of 16 you can theoretically do a total of 1600 IOPS to a device, provided the response time stays at 10ms.
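The same calculation extended to a queue depth greater than 1 might look like the sketch below. It assumes, as the example above does, that the per-I/O response time stays constant as concurrency rises, which real devices do not guarantee. The name `max_iops` is illustrative only.

```python
def max_iops(latency_ms: float, queue_depth: int = 1) -> float:
    """Theoretical IOPS ceiling for a given per-I/O latency and queue depth.

    Assumes every one of the queue_depth concurrent I/Os completes in
    latency_ms, i.e. latency does not grow with concurrency.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    if queue_depth < 1:
        raise ValueError("queue_depth must be at least 1")
    return queue_depth * 1000 / latency_ms


print(max_iops(10))                  # queue depth 1  -> 100.0
print(max_iops(10, queue_depth=16))  # queue depth 16 -> 1600.0
```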

Although block size has nothing to do with the calculations above, it is important to note that bigger blocks generally take longer to transfer and will therefore usually increase the response time of the individual I/Os. For instance, if a 2KB block takes 1ms to transfer, a 64KB block could potentially take 5ms because of the extra data payload being transferred.
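To see how block size feeds back into the I/O rate through latency, the two cases above can be compared side by side. This is a rough sketch using the illustrative 1ms and 5ms figures from the paragraph above, not measurements. The data rate (IOPS multiplied by block size) is added here only for comparison, and the function name is hypothetical.

```python
def iops_and_throughput(block_kb: float, latency_ms: float, queue_depth: int = 1):
    """Return (theoretical IOPS, data rate in KB/s) for one block size.

    latency_ms is whatever response time that block size yields on the
    device in question; the values used below are illustrative only.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    iops = queue_depth * 1000 / latency_ms
    return iops, iops * block_kb


for block_kb, latency_ms in [(2, 1), (64, 5)]:
    iops, kb_per_s = iops_and_throughput(block_kb, latency_ms)
    print(f"{block_kb}KB blocks at {latency_ms}ms: {iops:.0f} IOPS, {kb_per_s:.0f} KB/s")
```

With these numbers the larger block size gives fewer IOPS (200 against 1000) but moves more data per second, which is why an I/O rate is hard to judge without knowing the block size behind it.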
