Wednesday, August 26, 2009

Accessing MySQL tables from UDF + storage engine API

A couple of weeks ago, at a MySQL study group in Tokyo I presented about various kinds of MySQL hacking techniques, such as debugging with MS Visual Studio or gdb, tracing with DTrace, writing information schema plugins, accessing tables from UDF and storage engine API, and extending MySQL server code itself. This 90-minute session went successful. Nearly 100 people attended, including famous MySQL developers such as Kazuho Oku-san and Kentoku Shiba-san, having productive discussions with attendees and quite a lot of people seemed interested in MySQL hacking. The slides are written in Japanese, but sample codes can be understandable and can be downloaded here.
What audiences were most interested in was accessing MySQL tables directly from Plugins(currently UDFs) and storage engine API. I showed a sample UDF code, based on what Kazuho-san released over one year ago, which is over 10 times faster than SQL or stored procedures for complex operations. Excerpted codes are as follows(skipping error handling).


#define MYSQL_SERVER 1
#include
...
/* initializing tables, fields and indexes */
THD *thd = current_thd;
TABLE_LIST tables;
TABLE *table;
tables.init_one_table("test","message", TL_READ);
simple_open_n_lock_tables(thd, &tables);
table= tables.table;
message_id_fld = get_field(table, "id");
message_user_id_fld = get_field(table, "user_id");
message_key = index_init(table, "user_id", true);

/* accessing table. Equivalent query is
SELECT id FROM message WHERE user_id=? ORDER BY id DESC LIMIT 1; */
uchar* key_buff= (uchar*) thd->alloc(message_key->key_length);
bzero(key_buff, message_key->key_length);
int null_offset= message_user_id_fld->null_bit;
int4store(key_buff + null_offset, user_id);
err= table->file->index_read_last_map(table->record[0], key_buff, 1);
fprintf(stderr, "id=%lld\n", message_id_fld->val_int());

/* closing objects */
table->file->ha_index_end();
close_thread_tables(thd);


By defining MYSQL_SERVER to 1, you can access to MySQL internal structures.
You can build the sample UDF on development environment as follows. MySQL source code is required to build when accessing to internal structures.

g++ -g -DSAFE_MUTEX -DSAFEMALLOC \
-Wall -fno-rtti -fno-exceptions -fPIC -shared \
-I/debug-built-mysql-src/include \
-I/debug-built-mysql-src/regex \
-I/debug-built-mysql-src/sql \
-o udf_sample.so udf_sample.cc


On production environment, skip using -DSAFE_MUTEX -DSAFEMALLOC and include MySQL source built with configurations for production. When using some macros(i.e. current_thd) that behave differently between debug and prod configurations, you need to include appropriate source directly.

g++ -g \
-Wall -fno-rtti -fno-exceptions -fPIC -shared \
-I/prod-built-mysql-src/include \
-I/prod-built-mysql-src/regex \
-I/prod-built-mysql-src/sql \
-o udf_sample.so udf_sample.cc


This snippet skips error handling etc for good legibility, but you need to carefully handle errors, otherwise mysqld crashes. Please check bench_udf.cc, build_udf_debug.sh, and build_udf_prod.shin my sample code.

MySQL is fast, but it still has many expensive CPU operations such as parsing SQL statements, making execution plans every time (MySQL doesn't have optimizer-plan caching mechanism), a lot of memcpy/malloc operations, etc. If you write a UDF which directly accesses tables through storage engine API, these expensive operations can be skipped so it can be much faster than SQL statements in many cases.
The above code looks like using key value store APIs (i.e. TokyoCabinet APIs). Most KVSs are faster than MySQL, and one of the reasons is they don't need to do extra operations such as making execution plans etc. But it is not impossible to use MySQL like fast KVS as above. You can not only get benefits about better performance, but also utilize traditional RDBMS implementations such as B+Tree (range scan), perfect durability, schema awareness. Data is stored in one place, and you can get the same data regardless of accessing methods. For example, you can use UDF for some special parts (i.e. picking up all friends' latest messages, which is difficult for MySQL to fetch quickly), and use normal SQL statements for others. I assume such two-way client interfaces (fast API and traditional SQL) can be a common implementation for next-generation RDBMS. Third, this is currently not common hacks so there is almost no document. For example, handling two or more tables requires additional codes, but this is not documented (I am still fully not understand to safely handle them).
Of course there are many disadvantages about using UDFs and storage engine APIs directly. First, it highly depends on MySQL versions. Storage engine APIs are standardized but client side operations such as calling table open/lock functions are different from MySQL versions. Second, mysqld easily crashes if UDF has bugs.
But there are a lot of possibilities here and this certainly attracts developers. For example, Kazuho-san yesterday published a very interesting UDF: mycached (memcached server udf). It listens memcached protocol, then directly accessing a table through storage engine API (primary key lookup). According to his benchmark this is twice as fast as SELECT with primary key lookup (if records are cached in buffer pool). I assume this approach will work very well as a durable key-value store, for example storing session data, last access timestamp etc.

6 comments:

rpbouman said...

Hi!

hey, very interesting! I wasn't aware you actually could access internals through a UDF - probably due to my general lack of C/C++ knowledge and insight in the code base.

Just curious, can you describe the minimal recipe that allows you to access internals through a UDF? I mean which headers, compilation options etc?

TIA,

Roland

Anonymous said...

Code snippet as presented is likely to crash the server if the UDF is called in a SELECT statement when another table is already opened.
You should check the return of open_and_lock().
But yes, it naturally would be faster. ;)
A way to "get around" the open_and_lock() limitation would be to create a THD object for the purpose. There is a risk of deadlock but those can be broken.

Yoshinori Matsunobu said...

Hi Roland,

I added more detail in my post. Define MYSQL_SERVER to 1, including include/regex/sql MySQL source directries.

Yoshinori Matsunobu said...

Hi atcurtis,

Yes, thanks. My sample code (url is published in my post) checks simple_open_n_lock_tables() errors as below. You're right, mysqld crashes if skipping to check errors.
if (simple_open_n_lock_tables(thd, &tables))
{
close_thread_tables(thd);
DBUG_RETURN(-1);
}

Anonymous said...

cool

Sildenafil Citrate said...

I have a friend that has taught me some of MySQL hacking techniques debugging with MS Visual Studio or gdb, tracing with DTrace and I did not know about this, thank you very much!

Post a Comment