The best PHP MySQL SELECT and UPDATE loop

Testing 3 Types of MySQL SELECT and then UPDATE from an API call loops with PHP for speed, efficiency and ability to scale.

The idea is needing to select a value from the database where it was updated less than 1 week ago, use this value in an API call and then update the database with the API data.

Note the examples below are the same structure as tested but the MySQL queries and API call have been made to appear fictional.

Type 1

One while loop

Positives: One connection needed for the SELECT, only one loop.

Negatives: Cannot be scaled or ran more than once at a time due to the strong possibility of getting duplicates in the loop. The array for the while loop gets built first which locks in the rows another concurrent process could also include these rows. Causes issues if a DELETE is involved.

43.23 seconds for LIMIT 12.

function suLoop($amount, $db_ip = '127.0.0.1')
{
    $db = db_connect($db_ip);
    $db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
    $select = $db->prepare("SELECT `uid` FROM `people` WHERE `updated` < (NOW() - INTERVAL 1 WEEK) ORDER BY RAND() LIMIT ?;");
    $select->execute(array($amount));
    while ($db_row = $select->fetch(PDO::FETCH_ASSOC)) {
        $data = json_decode(file_get_contents("https://website.com/api/users?uid=".$db_row['uid'].""), true);
        $update = $db->prepare("UPDATE `people` SET `score` = :score WHERE `uid` = :uid LIMIT 1;");
        $update->execute(array(':score' => $data['Results']['Score'], ':uid' => $db_row['uid']));
        echo "DONE UID:{$db_row['uid']} SCORE:{$data['Results']['Score']}<br>";
    }
}

Type 2

One by one

Positives: Can be scaled, many concurrent executions.

Negatives: Creates a MySQL connection each loop for the SELECT aspect.

52.28 seconds for LIMIT 12.

function suOneByOne($amount, $db_ip = '127.0.0.1')
{
    $db = db_connect($db_ip);
    for ($i = 1; $i <= $amount; $i++) {
        $select = $db->prepare("SELECT `uid` FROM `people` WHERE `updated` < (NOW() - INTERVAL 1 WEEK) ORDER BY RAND() LIMIT 1;");
        $db_row = $select->fetch($select->execute());
        $data = json_decode(file_get_contents("https://website.com/api/users?uid=".$db_row['uid'].""), true);
        $update = $db->prepare("UPDATE `people` SET `score` = :score WHERE `uid` = :uid LIMIT 1;");
        $update->execute(array(':score' => $data['Results']['Score'], ':uid' => $db_row['uid']));
        echo "DONE UID:{$db_row['uid']} SCORE:{$data['Results']['Score']}<br>";
    }
}

Type 3

Loop on a loop (while)

Positives: Somewhat scaleable, lower chance for conflicts. Faster with less loops and a higher LIMIT.

Negatives: Slight chance for conflicts from concurrent processing.

61.83 seconds for 4 loops of LIMIT 3.

36 seconds for 3 loops of LIMIT 4.

function sudLoopOnLoop($loop_amount, $amounts_per_loop, $db_ip = '127.0.0.1')
{
    $db = db_connect($db_ip);
    $db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
    for ($i = 1; $i <= $loop_amount; $i++) {
        $select = $db->prepare("SELECT `uid` FROM `people` WHERE `updated` < (NOW() - INTERVAL 1 WEEK) ORDER BY RAND() LIMIT ?;");
        $select->execute(array($amounts_per_loop));
        while ($db_row = $select->fetch(PDO::FETCH_ASSOC)) {
            $data = json_decode(file_get_contents("https://website.com/api/users?uid=".$db_row['uid'].""), true);
            $update = $db->prepare("UPDATE `people` SET `score` = :score WHERE `uid` = :uid LIMIT 1;");
            $update->execute(array(':score' => $data['Results']['Score'], ':uid' => $db_row['uid']));
            echo "DONE UID:{$db_row['uid']} SCORE:{$data['Results']['Score']}<br>";
        }
    }
}

 

For doing concurrent tasks type 3 is best with less loops and a higher limit. Although there is still a chance for a row to be select in another running process, this is no issue for the example above but it is ‘inefficient’.

The only downside to type 2 is the need to reconnect to the database and do the select query each loop. In terms of avoiding duplicate processing this is the best type however the consitent connecting and querying could slow down attemps at scaling.