MySQL Query Cache with ReadySet

Marcelo Altmann
4 min read · Oct 9, 2023


Caching is a well-known strategy for speeding up data access by saving result sets in memory for later use. When the same data is needed again, the results are served directly from memory.

Caching comes with a maintenance cost: how do you keep caches up to date, or invalidate entries once they become stale?

MySQL had a built-in implementation named Query Cache. However, it suffered from scalability problems related to high concurrency and cache invalidation that were deemed not worth fixing, so the feature was deprecated in MySQL 5.7.20 and completely removed in MySQL 8.0.

There are other options available, such as the ProxySQL Query Cache. As with many other caching strategies, it relies on TTL (time-to-live), which can be inefficient: an entry is only cached for a fixed period of time and is guaranteed to be out of date whenever new writes affect it.
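To illustrate the TTL approach, a ProxySQL caching rule might look like the following sketch (run against ProxySQL's admin interface; the `rule_id` and match pattern here are arbitrary examples, not from the original article):

```sql
-- Cache any SELECT matching this digest for 5 seconds
-- (cache_ttl is expressed in milliseconds).
INSERT INTO mysql_query_rules (rule_id, active, match_digest, cache_ttl)
VALUES (1, 1, '^SELECT COUNT\(\*\) FROM employees\.salaries', 5000);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
```

Until those 5 seconds elapse, readers keep seeing the cached result even if a write has already changed the underlying rows.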

Readyset is a SQL caching engine that sits in front of your existing MySQL (or PostgreSQL) server with the purpose of caching common queries.

How it works

Unlike other caching strategies, Readyset uses a dataflow graph and automatically keeps cache entries up to date as new data arrives at the source database.

Image by Readyset

Readyset takes an initial snapshot of the database and registers itself as a replica to keep receiving new data as it arrives on the source server. It speaks the MySQL and PostgreSQL wire protocols, so you can point your application directly at it. Queries that are not in its list of cached queries are automatically proxied to the upstream database.
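Because Readyset registers as an ordinary replica, you can confirm the connection from the upstream MySQL server itself (a sketch; the exact user and host shown in the output depend on your setup):

```sql
-- On the upstream MySQL server: replica connections appear
-- with the "Binlog Dump" (or "Binlog Dump GTID") command.
SHOW PROCESSLIST;

-- On MySQL 8.0.22+ you can also list registered replicas.
SHOW REPLICAS;
```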

How to configure

For this example, we will be using the MySQL Employees sample database. The easiest way to get ReadySet up and running is by using a Docker container:

1. Create a readyset user on MySQL (GRANT ALL is used here for simplicity; narrow the privileges for production):

mysql> CREATE USER readysetUser@'%' IDENTIFIED BY 'readysetPWD';
mysql> GRANT ALL ON *.* TO readysetUser@'%';

2. Download the Docker Compose file and configure UPSTREAM_DB_URL :

curl -L -o compose.yml "https://readyset.io/quickstart/compose.yml"
# Replace with your connection string: mysql://user:pwd@IP:port
sed -i 's/# UPSTREAM_DB_URL:/UPSTREAM_DB_URL: mysql:\/\/readysetUser:readysetPWD@172.17.0.1:3306/g' compose.yml
docker compose up -d

Note: 172.17.0.1 is the default gateway IP that containers use to reach the host machine running Docker.

3. Check logs and connection

docker logs readyset-cache-1
. . .
2023-10-09T00:13:30.659923Z INFO taking database snapshot: replicators::noria_adapter: Snapshot finished
2023-10-09T00:13:30.663942Z INFO replicators::mysql_connector::connector: Starting binlog replication next_position=binlog.000002:66380408
2023-10-09T00:13:30.664152Z INFO replicators::noria_adapter: MySQL connected
2023-10-09T00:13:30.664159Z INFO replicators::noria_adapter: binlog_position=binlog.000002:66380408
2023-10-09T00:13:32.067140Z INFO Compacting index{table=docker_compose_deployment-employees-salaries-0.db column_family=0}: dataflow_state::persistent_state: Compaction finished table=docker_compose_deployment-employees-salaries-0.db cf=0

mysql -h 127.0.0.1 -u readysetUser -preadysetPWD -P 5433
mysql> SHOW READYSET TABLES;
+----------------------------+-------------+-------------+
| table                      | status      | description |
+----------------------------+-------------+-------------+
| `employees`.`departments`  | Snapshotted |             |
| `employees`.`dept_emp`     | Snapshotted |             |
| `employees`.`dept_manager` | Snapshotted |             |
| `employees`.`employees`    | Snapshotted |             |
| `employees`.`salaries`     | Snapshotted |             |
| `employees`.`titles`       | Snapshotted |             |
+----------------------------+-------------+-------------+
6 rows in set (0,00 sec)

We have all we need to start caching our queries.
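Before moving on, you can also check the overall state of the deployment with SHOW READYSET STATUS (a sketch; the exact rows in the output vary by Readyset version):

```sql
-- Run against the Readyset port (5433 in this setup).
-- Reports snapshot progress and upstream connection health.
SHOW READYSET STATUS;
```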

Create new cache

To create a new cache, we can first run the desired query, check that it is supported via SHOW PROXIED SUPPORTED QUERIES, and then create a cache for it via CREATE CACHE FROM.

mysql -h 127.0.0.1 -u readysetUser -preadysetPWD -P 5433
mysql> SELECT COUNT(*) FROM employees.salaries WHERE salary = 100000;
+----------+
| COUNT(*) |
+----------+
|       13 |
+----------+
1 row in set (19,47 sec)

mysql> SHOW PROXIED SUPPORTED QUERIES;
+--------------------+-------------------------------------------------------------------+--------------------+-------+----------+----------+----------+
| query id           | proxied query                                                     | readyset supported | count | p50 (ms) | p90 (ms) | p99 (ms) |
+--------------------+-------------------------------------------------------------------+--------------------+-------+----------+----------+----------+
| q_774bf1a388e74841 | SELECT count(*) FROM `employees`.`salaries` WHERE (`salary` = $1) | yes                |     0 |    0.000 |    0.000 |    0.000 |
+--------------------+-------------------------------------------------------------------+--------------------+-------+----------+----------+----------+
1 row in set (0,00 sec)

mysql> CREATE CACHE FROM SELECT COUNT(*) FROM employees.salaries WHERE salary = 100000;
Query OK, 0 rows affected (0,00 sec)

mysql> SELECT COUNT(*) FROM employees.salaries WHERE salary = 100000;
+----------+
| count(*) |
+----------+
|       13 |
+----------+
1 row in set (0,00 sec)

As we can see above, our query dropped from 19 seconds to effectively zero. Nice. So what happens if we add a new salary entry with salary = 100000?

mysql> INSERT INTO employees.salaries values (10001, 100000, '2023-10-08', '9999-01-01');
Query OK, 1 row affected (0,02 sec)

mysql> SELECT COUNT(*) FROM employees.salaries WHERE salary = 100000;
+----------+
| count(*) |
+----------+
|       14 |
+----------+
1 row in set (0,00 sec)

Readyset received the new entry through the replication stream and automatically updated the dataflow nodes that serve this cache entry. No more TTL invalidations!
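Caches can also be listed and removed at any time (a sketch; the query id reuses the one shown earlier and will differ in your deployment):

```sql
-- List all caches together with their query ids.
SHOW CACHES;

-- Drop a cache by its id; subsequent executions of that
-- query are proxied to the upstream database again.
DROP CACHE q_774bf1a388e74841;
```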

Conclusion

Readyset is still in its early stages, but the product is really promising.

Support for more MySQL and PostgreSQL query language features is actively being implemented by the team.

Check out their documentation for more details, and join their Slack community if you are interested.

More importantly, give it a try!
