Shard Query



1   Overview
2   More about software
3   Shard Query setup
4   Remote-access test
5   Transparent-sharding test
  Related pages

1. Overview.

This document is intended for Linux developers. If you notice mistakes, write to:

Shard Query is PHP-based software that supports MySQL Proxy-based sharding.

Shard Query offers significant positives: It works with existing MySQL data sets and queries. It works with all MySQL storage engines. It supports multiple approaches to splitting up data. And it seems to be further along than other FOSS MySQL sharding frameworks.

Performance enhancements come from both scale-up and scale-out. Scale-up means parallel processing of records stored by an individual database server. Scale-out means distribution of records, and therefore workload, across multiple database servers.

I've tested Shard Query myself and it seems to work correctly.

In one basic test, I created a virtual MySQL server that consisted of two actual MySQL servers running on different hosts located in different countries. Data was distributed between the two hosts.

This document explains how I set up Shard Query on a Debian (Ubuntu) system and tested the software. Note: All steps were done as superuser except where otherwise noted.

The procedure used may or may not work for newer releases of Debian and/or Ubuntu. However, it's discussed here in the hope that it may be useful.

2. More about software.

Here's a dataflow diagram related Shard Query. This is from the official site:


Here's a performance chart:


The original text for the performance chart read as follows:

Here is a graph of the performance of Shard-Query on a 55GB MySQL database with an increasing number of workers, compared to a single threaded database query as the amount of data examined increases. The exact same MySQL server and settings were used for both.

3. Shard Query setup.

3.1. I started with a Debian (Ubuntu) machine that I'd configured as a development system previously.

For the Debian setup steps that I used, or reasonably compatible steps, click here.

3.2. I installed Gearman and some related components, including the Gearman MySQL UDF, a Gearman PECL extension, and a Gearman PEAR extension. For more information, click here.

3.3. I went to a downloads directory and created a Shard Query source tarball as follows:

svn checkout \ \
mv      shard-query-read-only shard-query
tar zcf shard-query.tar.gz    shard-query
rm -fr  shard-query

3.4. I unpacked the Shard Query source tarball and entered the top-level source directory:

tar zxf shard-query
cd      shard-query

3.5. I copied one of the directory trees provided into place on the system:

mkdir -p    /usr/share/php/Net
cp -a Net/* /usr/share/php/Net/

3.6. I edited the worker script provided and changed the port number used in the line to: 4730

4730 was the port number that gearmand was using on the local system. I obtained the number from the gearmand “usage” screen. I used netstat to confirm that gearmand was actually using the port in question.

3.7. As a smoke test, I ran the worker script using php worker and terminated it after a brief pause. No error messages were printed during the run.

3.8. I created a simple MySQL database to use for the next test. To create the database, I used a shell script named mksqdb01.txt. To view or download the script in question, use this link.

3.9. I created the following simple Shard Query configuration file in the same directory and called it shards.ini:



db, host, password, port, and user here are MySQL database parameters: database name, hostname, database user password, port number, and database user name. The password should be double-quoted.

The number in gearman=localhost:4730 was discussed previously.

Shard Query supports different ways to split up data. In this case, mapper=hash tells the framework to organize things using a column-based hash. column=username indicates that a column named username should be used.

3.10. I used the following command to start a Shard Query “worker” loop:

bash ./run_worker &

3.11. I edited the following Shard Query source file to fix a bug:


The change is shown below:

-    if($this->broadcast_query) {
+    if (isset ($this->broadcast_query) &&
+               $this->broadcast_query) {

3.12. Finally, I put the following MySQL query in a text file named queries.txt:

select * from members;

and executed the following command:

php run_query < queries.txt

Output similar to the following was produced. Note: This is a representative fragment and not the entire output:

    [username] => Boba Fett
    [bdyear] => 1992
    [bdmonth] => 3
    [bdday] => 21

The output confirmed that a MySQL query made to Shard Query had traveled through Gearman and MySQL Proxy to the MySQL database created previously and that correct results had made the return trip successfully.

At this point, basic setup of Shard Query was believed to be completed.

4. Remote-access test.

After Shard Query was set up, I created a MySQL database on a remote server and repeated the final test from section 3 above using the remote database as the target. The steps that I took are summarized below.

4.1. I ran a copy of the script mksqdb01.txt on the remote server. I modified the password used in the file appropriately first. The script created a remote MySQL database.

4.2. I configured the remote server to make access by remote (to it) clients possible. For more information on the procedure used for this step, click here.

4.3. I edited the shards.ini file discussed previously and set the database parameters in the file appropriately for use with the database server set up previously in step 4.1.

4.4. I repeated the final test from section 3. New output similar to the old output was produced. This confirmed that Shard Query was working with remote as well as local MySQL databases.

5. Transparent-sharding test.

The next goal was to use Shard Query to set up and test a virtual MySQL server consisting of two actual MySQL servers; one running on the Debian system set up initially and the other running on the remote system mentioned in part 4.

The steps that I took are summarized below.

This section assumes that the developer is working in the directory that contains run_query, shards.ini, etc. Also that the developer is in superuser mode and the “worker” loop started in a previous section is still running.

5.1. I ran copies of the script mksqdb02.txt on both systems. I modified the password used in each copy of the file appropriately first. The script produced identical empty MySQL databases on the two systems; by creating the databases, if necessary, or by resetting them, if they already existed.

5.2. I edited the shards.ini file discussed previously and replaced its contents with the following text:




The [shard1] block here is intended to specify settings associated with the local MySQL server.

The [shard2] block here is intended to specify settings associated with the remote MySQL server.

I modified the two password settings to reflect the actual MySQL root passwords used on the local and remote servers. I also modified the host setting in the [shard2] block to reflect the actual hostname of the remote MySQL server.

5.3. I executed the following command:

php run_query < mksqdb03.txt

where mksqdb03.txt was a set of MySQL insert commands. To view the file, click here.

This step appeared to report success. I checked the contents of both of the MySQL databases used afterwards and found that records had been distributed between them.

For acknowledgments related to CSS and other code used, click here.

Linked or embedded works (software, books, articles, etc.) are presented under their own licenses.

Other unique content on this page, excluding screenshots, is distributed under the following license: C.C. Attribution NonCommercial ShareAlike 3.0. In some cases, screenshots are distributed under the same license. In other cases, the associated program's license applies.

image Valid XHTML 1.0