Friday, April 24, 2009

Mastering the Art of Indexing - slides available

At the MySQL Conference & Expo 2009, I did a presentation "Mastering the Art of Indexing" , and now you can get the slides from MySQL Conference site .
I was excited that so many people attended to my session. I hope the session will help you to improve indexing techniques. Feedbacks are welcome.

I was asked from some people at the conference where my blog is. I am mostly publishing MySQL technical information in Japanese, but now I believe it's a good time to start publishing in English. I plan to post mainly benchmarking & performance analysis, high availability solutions, internationalization (which sometimes people have problems), etc.

4 comments:

Baron said...

Thanks. I missed your presentation. I did not know about the I/O scheduler queue size. Very useful.

Baron

peter-zaitsev said...

Yoshinori,

Very nice prepared slides. Good work. Regarding queue size Baron mentions I'm very surprised by results - the drive itself normally can't have TCQ more than few commands (and not everyone can support it) - could something else be effect here ?

Yoshinori Matsunobu said...

Hi Peter,

My understanding is as follows.

I/O requests are handled by the following order:
syscalls(write/pwrite)
-> I/O scheduler
-> Device Driver/Disks(Command Queuing)

I/O scheduler sorts incoming i/o requests by logical block addresses,
then sending them to a device driver.
I/O scheduler does not depend on storage devices so
is helpful for some parts (i.e. minimizing disk seek overheads),
not helpful for other parts (i.e. minimizing disk rotation overheads)

TCQ(and NCQ) can handle very limited number (normally 32-64) of i/o requests
but highly optimized for storage devices.
TCQ can minimize not only disk seeks but also disk rotation overhead.


Based on the above, sorting almost all requests on I/O sheculer then
sending them to TCQ would be nice.

Suppose 100,000 random read i/o requests come.
When i/o scheduler queue size is 128 (default in many cases),
TCQ gets i/o requests by almost random order,
so pretty high disk seek overhead happens
for each (handling 32-64 i/o requests) action.

When i/o scheduler queue size is 100,000,
TCQ gets all i/o requests by sorted order, so
seek overhead can be much smaller.


This explains my benchmarking results.

I/O scheduler queue size settings would not be helpful for InnoDB because
InnoDB internally does sorts i/o requests by storage order.
MyISAM does nothing (highly depending on OS) so this helps.

jametong said...

Excellent posts.

Post a Comment