How to optimize ORDER BY RANDOM()

Doing a ORDER BY RAND() in MySQL is bad. Very bad. As Tobias Petry details (and Bernard Grymonpon always used to tell at local meetups): Ordering records in a random order involves these operations: Load all rows into memory matching your conditions Assign a random value RANDOM() to each row in the database Sort all …

Doing a ORDER BY RAND() in MySQL is bad. Very bad. As Tobias Petry details (and Bernard Grymonpon always used to tell at local meetups):

Ordering records in a random order involves these operations:

  1. Load all rows into memory matching your conditions
  2. Assign a random value RANDOM() to each row in the database
  3. Sort all the rows according to this random value
  4. Retain only the desired number of records from all sorted records

His solution is to pre-add randomness to each record, in an extra column. For it he uses a the Geometric Datatype POINT type. In Postgres he then uses the following query that orders the records by distance from a new random point.

SELECT * FROM repositories ORDER BY randomness <-> point(0.753,0.294) LIMIT 3;

~

For MySQL you also have a POINT class (ever since MySQL 5.7.6), but don’t really see how that would work there as ST_Distance would need to be called for each row:

SET @randomx = RAND();
SET @randomy = RAND();
SELECT *, ST_X(randomness), ST_Y(randomness), ST_Distance(POINT(@randomx, @randomy), randomness) AS distance FROM repositories ORDER BY distance DESC LIMIT 0,3;

In that scenario I’d simply rely one single float value that contains the pre-randomness …

ALTER TABLE `repositories` ADD `randomness` FLOAT(17,16) UNSIGNED NOT NULL AFTER `randomness`;
ALTER TABLE `repositories` ADD INDEX(`randomness`);
UPDATE `repositories` SET randomness = RAND() WHERE 1; -- Update existing records. New records would have this number pre-generated before inserting

… and then do something like this:

SET @randomnumber = RAND(); -- This number would typically be generated by your PHP code
SELECT * FROM repositories WHERE randomness < @randomnumber ORDER BY randomness DESC 0,3;

Unlike the query using POINT(), this last query is written in such a way that it can leverage the index created on the randomness column ?

~

How to optimize ORDER BY RANDOM()

Via Freek


Print Share Comment Cite Upload Translate
APA
Bramus! | Sciencx (2024-03-29T08:04:18+00:00) » How to optimize ORDER BY RANDOM(). Retrieved from https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/.
MLA
" » How to optimize ORDER BY RANDOM()." Bramus! | Sciencx - Monday June 7, 2021, https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/
HARVARD
Bramus! | Sciencx Monday June 7, 2021 » How to optimize ORDER BY RANDOM()., viewed 2024-03-29T08:04:18+00:00,<https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/>
VANCOUVER
Bramus! | Sciencx - » How to optimize ORDER BY RANDOM(). [Internet]. [Accessed 2024-03-29T08:04:18+00:00]. Available from: https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/
CHICAGO
" » How to optimize ORDER BY RANDOM()." Bramus! | Sciencx - Accessed 2024-03-29T08:04:18+00:00. https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/
IEEE
" » How to optimize ORDER BY RANDOM()." Bramus! | Sciencx [Online]. Available: https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/. [Accessed: 2024-03-29T08:04:18+00:00]
rf:citation
» How to optimize ORDER BY RANDOM() | Bramus! | Sciencx | https://www.scien.cx/2021/06/07/how-to-optimize-order-by-random/ | 2024-03-29T08:04:18+00:00
https://github.com/addpipe/simple-recorderjs-demo