Friday, May 20, 2011

Proper handling of insert-mostly, select-recently datasets

Some kinds of large tables such as chat messages, blog entries, etc have the following characteristics.

* huge number of records, huge data and index size
* insert and select mostly
* select from only recent data
* select by secondary index (i.e. user_id)
* secondary index entries are inserted in random order

What are optimal ways to handle these tables? The below single large table does not perform well.
CREATE TABLE message (
body VARCHAR(255),
created DATETIME,

The cause of poor performance is a secondary index on user_id. user_id is inserted in random order. Index size grows, and sooner or later it will exceed RAM size. Once index size on user_id exceeds RAM size, inserting into message table causes massive random read disk i/o, which reduces throughput significantly.
The below is a simple insert benchmark. Once random read disk i/o starts happening, throughput drops hugely. "Sequential order" means index entries are inserted sequentially, "Random order" means randomly.

This figure was what I presented at the MySQL Conference and Expo 2009. It's pretty old, but basic principles have not changed.

Using Range Partitioning

How can we make it faster? One of the best approaches in MySQL is using range partitioning, partitioned by date or primary key. This is one of my favorite features in MySQL. By using range partitioning, only the latest partition is actively accessed. Data/indexes in the rest partitions are much less accessed so they don't occupy buffer pool. Each partition size can be small enough to fit in memory, so insert performance does not drop.

CREATE TABLE message (
body VARCHAR(255),
created DATETIME,
) engine=InnoDB
PARTITION BY RANGE(to_days(d1)) (
PARTITION p201103 VALUES LESS THAN (to_days('2011-03-01')),
PARTITION p201104 VALUES LESS THAN (to_days('2011-04-01')),
PARTITION p201105 VALUES LESS THAN (to_days('2011-05-01')),
PARTITION p201106 VALUES LESS THAN (to_days('2011-06-01')),
PARTITION p201107 VALUES LESS THAN (to_days('2011-07-01')),

As long as INSERT statements do inserts order by partition key and other SQL statements fetch only the recent data, no random disk reads will happen. Partitioning itself has some CPU overheads, but it's almost negligible in the real workloads, compared to disk i/o overheads.

(update:) Index and data size on each partition can be measured from information schema.
mysql> SELECT partition_name, index_length, data_length, table_rows FROM 
information_schema.partitions WHERE table_name='message';
| partition_name | index_length | data_length | table_rows |
| p201103 | 15565062144 | 15527313408 | 145146231 |
| p201104 | 15522070528 | 15507390464 | 205873280 |
| p201105 | 9736028160 | 9945743360 | 88653190 |
| p201106 | 32768 | 16384 | 0 |
6 rows in set (0.13 sec)

In MySQL 5.0 or earlier versions where range partitioning is not supported, creating daily/weekly/monthly tables is a good way as a workaround, though applications have to aware of table name differences.

CREATE TABLE message_201103 ..
CREATE TABLE message_201104 ..
CREATE TABLE message_201105 ..

How about Database Sharding?

As you know, database sharding is very common approach for handling huge data. Is sharding good for handling these tables? Probably not. Database sharding is mainly used to reduce slow disk i/o by reducing data size per server. In the above case, inserts can be done in memory regardless of data size(10,000+ inserts/second), so from performance point of view splitting tables is not needed as long as applications can keep up with in-memory insert speed. From database size vs storage size point of view (disk capacity point of view), you'll need to archive or purge older data.

Actually I have seen a couple of times that people use NoSQLs supporting transparent sharding(Auto-Sharding) for these kinds of tables: such as MongoDB, Cassandra. Data size will sooner or later exceed disk size, so using unlimited horizontally scaling database sounds reasonable. But if the database products don't support range partitioning, sharding becomes much less optimal for handling these tables. Suppose you have 3000GB datasets and only recent 30GB data are mostly accessed. With MySQL 5.1+ range partitioning, you can simply manage one large(3000GB) table with weekly/daily partitions (as long as disk space is available). Only the latest partitions (30GB) are actively accessed and the rest partitions (2970GB) are less likely accessed. Single commodity database server can probably handle enough workloads.

On the other hand, if you shard 3000GB database without range partitioning support, you might need 300GB x 10 shards because the whole secondary indexes are accessed. This means you need 10 times more servers.

Transparent sharding is good for application developers, but I believe range partitioning is a mandatory feature for handling insert-mostly, select-recently huge datasets.


Anonymous said...

>> PARTITION p201103 VALUES LESS THAN (to_days('2011-03-01')),

If your data is growing, so you need alter the table from time to time?

Yoshinori Matsunobu said...

You can prepare many partitions beforehand. And ALTER TABLE .. ADD PARTITION does not rebuild existing partitions so can finish quickly.

Casey said...

I have a large table of TEXT (20 million rows) that is accessed by primary key but 90% of reads are recent data (say the last 500K rows).

Is there an advantage to partitioning here? It seems like partitioning by ID range would might conserve buffer pool space although I'm not totally sure how to do that.

Martin Farach-Colton said...


Fortunately the problem you describe has a direct solution: the TokuDB storage engine whose indexing performance stays strong at multi-terabyte+. (TokuDB is a fully MySQL compatible storage engine like InnoDB and MyISAM.)

Your post does a great job explaining the Achilles heel of OldSQL solutions that use B-tree indexing: They can't handle larger-than-memory indexing of non-sequential keys. Just to be clear, larger-than-memory performance collapse is not inherent in SQL databases but in B-tree databases. With TokuDB, you have full SQL support, with transactions and MVCC and the works, and still have great performance on big data.

Once you switch over your table to TokuDB - assuming you want to take it for a spin - it's important to take a look at your indexes. For example, in your case, INDEX(user_id) is implicitly INDEX(month, user_id), where the month is defined by the partitioning. Unfortunately, MySQL doesn't allow you to have a functional component in an index definition. (If it did, that would already work way better than partitioning, since 'month' as the first part of the key would make this index look a lot more sequential.) To get good performance, you'd need a different index than INDEX(user_id).

You didn't say what your queries were like, but assuming that you are doing equality tests on user_id you could define INDEX(user_id,d1). That should work well, and it would pose no indexing problem for TokuDB. Even better would be CLUSTERING INDEX(user_id,d1), because that index will cut out all the random I/O from your queries. No more slow lookups into the primary table. It's sequential I/O all the way.

Do you have a data set and set of queries that are available? We'd love to do a side-by-side comparison.

pachanga said...

Thanks for the post! BTW, what is the best way to measure the size of the index for InnoDB tables?

Yoshinori Matsunobu said...


Information schema has the statistics. I've added an example on the post.

pachanga said...


Thank you, just what I needed! And what do you think of usage of complex primary keys in InnoDB, e.g (player_id, id) ? According to your presentation, sequential primary keys is the best choice for inserts.

Anonymous said...

> From database size vs storage size point of view, you'll need to archive or purge older data.

What does "database size vs storage size" stand for ?

Yoshinori Matsunobu said...


I intended disk capacity.

Yingkuan Liu said...

Hi Yoshinori,

I noticed the index size is bigger than data of this table. Is this table very heavily indexed? Given the table should have majority of data in body column which I believe you don't need to index. It's strange to see such big index size.

also the table size fluctuate a lot in relation to row count. April has 50% more rows but almost same size. May has half rows but size is a lot more than half. But I guess that probably related to your application change and feature change etc.

Anonymous said...




Yoshinori Matsunobu said...


Index size and data size is just an example, so please don't take the numbers seriously. It is pretty common the index size gets bigger than data size if heavily indexed, but you're right data size should be much larger if body column is large, and highly depending on application changes.

Philip Zhong said...


now the MYSQL does not support online partition drop;the index is global index,so if we use the MYSQL partition,we can't use it like oracle partition,can you provide a good idea?

domain hosting said...

I think that, you are expert in database management. I am not that much expert in database but I like database very much. Thanks for sharing this useful and detail information.

Baby ear infection said...

I would like to thank you for the efforts you have made in writing this article. Raw food diet Toenail fungus infection Common digestive disorders Common digestive system diseases and disorders

rs gold said...

Really impressed! Everything is very, very clear, open is a description of the problem. It contains the information.

Runescape Gold
MapleStory Mesos

wow gold said...

Just how much has this told her i Buy Tera Goldwould WOW Gold bring forth during the past? Work concerning the item earlier than you purchasing the point. BoEs may well change well inwards evaluate by server that will help you waiter, to’MT striving Crummy Aion Golden adequate how of import a to discover the industry unity’ray working with. Lay out conseverative with RS Gold the help of everyone of the payments. Wearable’metric ton run off and buying a lot of BoEs that were enough products only usually requires nearly a year to trade first earnings. BoEs ar parlous, high up wages pieces, gum olibanum an excellent make a motion can result in briny network,Inexpensive Aion Amber World Of Warcraft Gold only a tight relocation could make you recede riot golden.

Runescape Gold said...

This really is our first-time my spouse and i click here. I discovered numerous engaging products with your website, especially its debate. In the tons of feedback in your content, I assume I'm not really the only one having all the leisure below! Continue the nice work.

tea length wedding dresses said...

Great article. I merely found your blog along with planned to declare i get genuinely liked looking at your blog blogposts. A reservoir of innovation. Reaped from the most popular designers and the new prominent. You are invited to be impressed with Amber’s selection of lace wedding gowns, by virtue of an incredible amount of self-motivated ideas that sparkling around. Have a digging-out on our span-new designer cheap wedding dresses plus size that are tagged into distinctive genre and style wedding dresses 2012.

Domain in India said...

This is my first time i visit here. I found so many interesting stuff in your blog especially its discussion. From the tons of comments on your articles, I guess I am not the only one having all the enjoyment here! keep up the good work.

ambersbridal said...

Thanks for the informative writing. Would mind updating some good tips about it. I still wait your next place.simple wedding dresses
mermaid wedding dresses
tea length wedding dresses

Runescape Gold said...

Thank you for sharing. Glad to see you.It is really a good post.

ambersbridal said...

A holy church, rings, a bouquet, a 3-floor wedding cake, lace wedding dress
champagne, and the moving promise “ I do”, make up the happiest moment in one’s life. Then it is the wedding.
lace wedding gowns
A gorgeous wedding dress has been inside a girl’s dream since she was five years old. It is widely said that the bride is the most beautiful woman all over the world.
Bridal gowns vary in different parts of the world. For example, cheap wedding dresses plus sizein traditional Chinese culture, the color red is regarded for centuries as the symbol of good luck; while in the Occident, wedding dresses 2012 a full-length white wedding dress is always associated with romance and sanctity. But, with the globalization, wedding dresses tend to be alike in the whole world.

Michael said...

MySql Hosting India

IndoUs Hosting is the leading MySql Hosting companies in India. The company provides dedicated as well as shared hosting services of MySql hosting. If you are looking for guaranteed MySql Hosting India, we can be the best option for you.

Anonymous said...

What was presented in this article has forced me to reconsider and redefine previously held viewpoints. This can only be a good thing. Kimberly@reverse phone lookup

GuildWars2Items said...

Would you take the pen in hand, plunge right in and just do it, struggling to keep up with the twists and turns of the torrents of words that take you where they take you Buy D3 Gold? Would your plans be so extensive that you never even got to the writing Cheap D3 Gold? But if you do decide to use it, what would you do with it? How would you play the game D3 Gold Sale?

GuildWars2Items said...

It's great to be great, but it's greater to be human Sell Runescape Gold, if I could rearrange the alphabet, I'd put buy guild wars 2 gold Y and I together guild wars 2 gold buy, A lifetime of happiness! No man alive could bear it; it would be hell on earth .

GuildWars2Items said...

Experience is a hard teacher because she gives the test first, the lesson afterwards cheap c9 gold, being happy doesn't mean that everything is perfect. It means that you've decided to look beyond the imperfections c9 money, Life is like a hot bath. It feels good while you're in it, but the longer you stay in, the more wrinkled you get WOW Power Leveling.

GuildWars2Items said...

You have a forever friend for life, and forever has no end D3 Gold Sale, and if you find such a friend, you feel happy and complete, because you need not worry Diablo iii Gold, your forever friend holds your hand and tells you that everything is going to be okay Buy Diablo iii Gold.

GuildWars2Items said...

And if you find such a friend, you feel happy and complete, because you need not worry Cheap Diablo iii Gold, Diablo iii Gold Sale your forever friend holds your hand and tells you that everything is going to be okay Diablo 3 Gold.

Maria Arredondo said...

Your future depends on your dreams. So go to sleep Diablo 3 Gold, he time of life is short ; to spend that shortness basely, it would be too long C9 Gold, It's great to be great, but it's greater to be human C9 Gold Sale. People who are serious about the relation are moody as they have devoted a lot that makes them worry about gains and losses.

Maria Arredondo said...

One meets its destiny on the road he takes to avoid it Cheap C9 Gold, If I could rearrange the alphabet, I'd put Y and I together c9 gold, People who are serious about the relation are moody as they have devoted a lot that makes them worry about gains and losses c9 online gold

Maria Arredondo said...

Suppose someone gave you a pen — a sealed, solid-colored pen.You could not see how much ink it had D3 Gold, think Positive Thoughts Every Day Buy D3 Gold, the brightest future will always be based on a forgotten past, you can't go on well in life until you let go of your past failures and heartaches Cheap D3 Gold.

Maria Arredondo said...

You have a forever friend for life, and forever has no end D3 Gold Sale, and if you find such a friend, you feel happy and complete, because you need not worry Diablo iii Gold, your forever friend holds your hand and tells you that everything is going to be okay Buy Diablo iii Gold.

dte said...

The consequences of today are determined by the actions of the past scarlet blade gold. To change your future, alter your decisions today scarlet blade gold, Experience is a hard teacher because she gives the test first, the lesson afterwards scarlet blade gold, but it takes character to keep you there.

buyGuildwars2gold said...

Empty what's full. Scratch where it itches. Don't go around saying the world owes you a living rs gold, The world owes you nothing. It was here first. Being happy doesn't mean that everything is perfect Runescape Gold, It means that you've decided to look beyond the imperfections. Do not pray for tasks equal to your powersrs gold.

Jess said...

They managed to provide a service for people and charge for money. People who do not want to spend time grind levels will need the cheap runescape gold service. You can take the chance to make a profit. You have another way to make money—buying and selling. You can buy items needed by other players and sell to them. They will pay more than the price you buy cheap ffxiv gil.

steve7876 said...

I am quite sure they will learn lots of new stuff here than anybody else. Thanks for sharing this informationSupplements in Australia

sadia sulaman said...

I just wanted to say that I found your site via Goolge and I am glad I did. Keep up the good work and I will make sure to bookmark you for when I have more free time away from the books. Thanks again! | | | | | | | | | |

obat penis said...

Father mother or sister I love what I see with this, is very interesting content that led me curious to go to your website is. our greetings alat bantu pria.

sextoy terbaru said...

I am very happy with what you convey this in your web. And I want to share some specific information, may be received here. previous post and I thank you for her great.
Kami membawa sebuah informasi yang cukup menarik menurut kami, info ini berkaitan dengan apa yang di butuhkan kaum pria remaja maupun yang sudah berkeluarga. info tersebut sebagai berikut. Toko alat pembesar penis berupa vakum dan alat tarik otot.
Selain itu Kami juga menyediakan Obat pembesar penis yang mengutamakan kesehatan, maka dari itu yang kami bawa ini sebuah ramuan Memperbesar penis yang terbuat dari bahan2 herbal alami. klik disini.

panca-samudera said...

The article posted was very informative and useful
thanks for sharing.
jaring futsal , jaring futsal murah , jual jaring futsal , toko jaring futsal , agen jaring futsal , jaring golf , jual jaring golf , jaring golf murah ,
jaring polynet , tangga darurat , jaring truk , jaring cargo , jaring outbound , jual rumput futsal murah , jual rumput sintetis murah ,
tali tambang , cargo net , jaring tanaman , jaring kassa , jaring proyek , jaring bangunan , jaring gedung , jaring pengaman proyek ,
jaring pengaman bangunan , jaring pengaman gedung , jaring peneduh , jaring waring , kasa hijau , tangga darurat , jaring gawang futsal , jual jaring gawang futsal murah , jaring peneduh , jaring truk , tali tambang nylon

lina sexshop said...

Berikut kami akan membahas tentang bagai mana cara memperbesar penis dengan hasil cepat dan aman tanpa operasi
jika anda ingin memperbesar penis dengan hasil cepat,aman tanpa operasi itu mudah anda bisa pilih cara memperbesar penis dengan obat pembesar penis herbal
seperti Pembesar Penis Obat Good Man

jika anda riskan konsumsi obat pembesar penis anda bisa pilih jalan lain seperti metode trapi ( pemakaian luar ) kami juga punya solusinya anda harus punya Alat Pembesar Penis sejenis Pro Extender atau Vimax Extender

Berikut penjelasan kami :

Apa Obat Good Man itu : Obat Good Man
Apa Vimax Extender itu : Vimax Extender Asli
Apa Pro Extender itu : Pro Extender Asli

Sekarang anda bisa tau rahasia cara memperbesar penis dengan aman tanpa operasi.selamat mencoba.

Thank you for the information , I wait for more information and I ask for a return visit our new website
>> Obat Pembesar penis <<

dialogue said...

more kudos

snapchat filters

ajong coy said...

thanks for sharing
jual obat penggugur kandungan
obat telat datang bulan manjur
obat penggugur kandungan
obat telat datang bulan terbaik
jual obat telat datang bulan
obat terlambat datang bulan

ajong coy said...

thanks for sharing
082210743003 / 2BFO841D
jual obat penggugur kandungan
obat telat datang bulan manjur
obat penggugur kandungan
obat telat datang bulan terbaik
jual obat telat datang bulan
obat terlambat datang bulan

perangsang wanita potenzol termanjur


Hosting Safety said...

Thanks for sharing such informative post on web hosting, keep updating.
Shared Hosting India | Best Web Hosting India | Shared Hosting companies India

gubab said...

IPL Live Streaming
IPL Live Stream
IPL 2017 schedule
IPL 2017 Points Table
IPL 2017
IPL 2017 Song
IPL 2017 Teams
IPL 2017 Tickets
KL Rahul Will Not Play In IPL After Virat Kohli
Top Players Injured Before Start Of IPL 2017

kelinci said...

Great post. Thanks
Jual Kelinci
Jual Kelinci
Holland Lops
Holland Lop
Kelinci Pedia

Hosting Safety said...

Hostingsafety is secured web hosting, we will provide best hosting services in India.
Click here to know more

Web Hosting India | Domain Name Registration India | cheap dedicated hosting India | cheap dedicated server India

Jason Borne said...

Avast Support | Avast Help
Norton Support | Norton Help
Gmail Customer Support

Jon Don said...

Office Setup
Office Com Setup
Printer Tech Support


Thanks for the information
I wait for more information and I ask for a return visit our new website Hammer Of Thor Asli
=> Hammer Of Thor Di Bogor
=> Hammer Of Thor Di Medan
=> Hammer Of Thor Di Semarang
=> Hammer Of Thor Di Bandung
=> Hammer Of Thor Di Bandar Lampung
Obat Good Man | Obat pembesar penis | Vimax Asli | Vimax Extender

Post a Comment