Paginating The Right Way In MySQL | Limit Offset Is Bad

You must have implemented Pagination in MySQL using Limit Offset queries. You might have never faced any performance issues with it. This method works fine with a few thousand rows. But what if you have millions of rows in your dataset? In this article, we will discuss the issue of using Limit Offset queries. Also, we will discuss Paginating the right way in MySQL.

Database Sample

For this article, I am using a sample Employees Data Set provided officially by MySQL. The emp_salaries table I will be referring to in this post contain approximately 20 million rows.

The table is described as follows:

mysql> desc emp_salaries;
+-----------+---------+------+-----+---------+----------------+
| Field     | Type    | Null | Key | Default | Extra          |
+-----------+---------+------+-----+---------+----------------+
| id        | int(11) | NO   | PRI | NULL    | auto_increment |
| emp_no    | int(11) | YES  |     | NULL    |                |
| salary    | int(11) | YES  |     | NULL    |                |
| from_date | date    | YES  |     | NULL    |                |
| to_date   | date    | YES  |     | NULL    |                |
+-----------+---------+------+-----+---------+----------------+

mysql> select count(1) from emp_salaries;
+----------+
| count(1) |
+----------+
| 19908329 |
+----------+

Pagination Using Limit Offset Queries

One way to perform pagination is by using the Limit Offset queries. The limit is the maximum number of rows that will be selected in the query. An offset is the number of rows that will be skipped from the beginning to select the rows.

The following query will select 5 rows from the table and will not skip any of them since the offset is 0.

mysql> SELECT * FROM emp_salaries LIMIT 5 OFFSET 0;
+----+--------+--------+------------+------------+
| id | emp_no | salary | from_date  | to_date    |
+----+--------+--------+------------+------------+
|  1 |  10001 |  60117 | 1986-06-26 | 1987-06-26 |
|  2 |  10001 |  62102 | 1987-06-26 | 1988-06-25 |
|  3 |  10001 |  66074 | 1988-06-25 | 1989-06-25 |
|  4 |  10001 |  66596 | 1989-06-25 | 1990-06-25 |
|  5 |  10001 |  66961 | 1990-06-25 | 1991-06-25 |
+----+--------+--------+------------+------------+

Now, this query will first skip the first 100 rows and then select 5 rows.

mysql> SELECT * FROM emp_salaries LIMIT 5 OFFSET 100;
+-----+--------+--------+------------+------------+
| id  | emp_no | salary | from_date  | to_date    |
+-----+--------+--------+------------+------------+
| 101 |  10009 |  85875 | 1997-02-15 | 1998-02-15 |
| 102 |  10009 |  89324 | 1998-02-15 | 1999-02-15 |
| 103 |  10009 |  90668 | 1999-02-15 | 2000-02-15 |
| 104 |  10009 |  93507 | 2000-02-15 | 2001-02-14 |
| 105 |  10009 |  94443 | 2001-02-14 | 2002-02-14 |
+-----+--------+--------+------------+------------+

Why Is This Not Optimised?

Consider the scenario when there are millions of records in a table. Offset needs to first skip the N number of rows in order to return the desired rows. If the offset is very large, then it will have to skip a very large number of rows to reach the desired row number.

mysql> SELECT * FROM emp_salaries LIMIT 5 OFFSET 100;
+-----+--------+--------+------------+------------+
| id  | emp_no | salary | from_date  | to_date    |
+-----+--------+--------+------------+------------+
| 101 |  10009 |  85875 | 1997-02-15 | 1998-02-15 |
| 102 |  10009 |  89324 | 1998-02-15 | 1999-02-15 |
| 103 |  10009 |  90668 | 1999-02-15 | 2000-02-15 |
| 104 |  10009 |  93507 | 2000-02-15 | 2001-02-14 |
| 105 |  10009 |  94443 | 2001-02-14 | 2002-02-14 |
+-----+--------+--------+------------+------------+
5 rows in set (0.00 sec)

When the offset is small, MySQL does not take much time to return 5 rows.

Now for instance, consider a case when the offset is large (15000000)

mysql> SELECT  * FROM emp_salaries LIMIT 5 OFFSET 15000000;
+----------+--------+--------+------------+------------+
| id       | emp_no | salary | from_date  | to_date    |
+----------+--------+--------+------------+------------+
| 15197466 |  92041 |  56671 | 1998-05-29 | 1999-05-29 |
| 15197467 |  92041 |  59011 | 1999-05-29 | 2000-05-28 |
| 15197468 |  92041 |  63105 | 2000-05-28 | 2001-05-28 |
| 15197469 |  92041 |  64407 | 2001-05-28 | 2002-05-28 |
| 15197470 |  92041 |  68774 | 2002-05-28 | 9999-01-01 |
+----------+--------+--------+------------+------------+
5 rows in set (2.99 sec)

It takes approximately 3 seconds to return 5 rows when the offset is a bit large.

Since every time MySQL has to skip an increasing number of rows, the time to run the query increases. As a result, as the offset becomes larger, our query will become slower and slower.

We can infer from this experiment that the time taken for our select query to run will keep on increasing with the offset value.

Paginating The Right Way

Instead of using Limit Offset queries, we can create batches using a range on the primary key for paginating the right way.

Find the smallest id in the table. In our case, it is 1. Also, the maximum id in our case is 20145287. Select the top 5 rows from the table starting from the Minimum id and keep on increasing the starting id.

For example, suppose we want to select our first batch of 100 rows, we can select all rows from id 1 to 100. For the second batch, we can select rows with id 101 to 200. Since the id is the primary key of the table, MySQL will always use the index to make the selection. This will make our query highly optimised as it does not have to skip through a large number of rows to reach our desired row number.

mysql> SELECT  * FROM emp_salaries where id >=1 limit 5;
+----+--------+--------+------------+------------+
| id | emp_no | salary | from_date  | to_date    |
+----+--------+--------+------------+------------+
|  1 |  10001 |  60117 | 1986-06-26 | 1987-06-26 |
|  2 |  10001 |  62102 | 1987-06-26 | 1988-06-25 |
|  3 |  10001 |  66074 | 1988-06-25 | 1989-06-25 |
|  4 |  10001 |  66596 | 1989-06-25 | 1990-06-25 |
|  5 |  10001 |  66961 | 1990-06-25 | 1991-06-25 |
+----+--------+--------+------------+------------+
5 rows in set (0.00 sec)

Similarly, we can select the batch starting from 15000000 by using the following query:

mysql> SELECT  * FROM emp_salaries where id >15000000 limit 5;
+----------+--------+--------+------------+------------+
| id       | emp_no | salary | from_date  | to_date    |
+----------+--------+--------+------------+------------+
| 15000001 |  71305 |  72396 | 1997-08-16 | 1997-10-09 |
| 15000002 |  71306 |  64990 | 1991-08-20 | 1992-08-19 |
| 15000003 |  71306 |  64913 | 1992-08-19 | 1993-08-19 |
| 15000004 |  71306 |  65467 | 1993-08-19 | 1994-08-19 |
| 15000005 |  71306 |  68429 | 1994-08-19 | 1995-08-19 |
+----------+--------+--------+------------+------------+
5 rows in set (0.00 sec)

Notice that it returns the same data at the same time as the first batch. We can observe that the time taken by the query to run using range queries is constant throughout all the batches.

Performance Analysis

I performed a performance analysis of the two queries and found expected results. I used a limit of 20,00,000 per batch and paginated through approximately 20 million records.

Paginating the right way comparison graph Concatly
Line Graph comparing the performance of both the methods for paginating

The results were expected. The time taken by Limit Offset queries increase linearly with the batch number. Hence, MySQL will take more and more time to skip rows as the offset increases. On the other hand, the time taken by Range Queries is almost constant with the increasing batch number. This is because MySQL uses Primary Key Index on the id field.

Using Range Queries is Paginating The Right Way!

Conclusion

We discussed Paginating the right way in this article. The traditional limit offset queries work fine for a small number of rows. However, when it comes to a large data set, it is always beneficial to write range based queries on the primary key instead of offset.

You can find more articles on MySQL on Concatly! Share this article with your friends and colleagues from the share links below.

Spread the Knowledge

Leave a Reply

Your email address will not be published. Required fields are marked *