The documentation on mysqli_poll is a bit sparse.
In the example, they create 3 identical arrays containing all the MySQLi connections to check, but if you read the
TL;DR: Your suppositions are correct, but take care in your coding.
mysqli_poll is a thin convenience wrapper around a socket select, and knowing how socket_select works will get you a long way toward understanding this function.
mysqli_poll
is only available when the underlying driver is mysqlnd
, because only MySQL ND provides native, socket-level access to the MySQL server. The important point to keep in mind is that the "socket-level" access is what makes polling possible, and why understanding socket select is crucial in understanding the function and limitations of mysqli_poll
.
To answer your questions:
It sounds to me like $read is an array holding the connections to check, but $error and $reject should be unpopulated vars that will be filled by the function if there are errors. Is that correct?
Yes, but not a complete picture. Notice the signature of mysqli_poll
:
int mysqli_poll ( array &$read , array &$error , array &$reject , int $sec [, int $usec ] )
All three arrays are pass-by-reference, meaning the PHP engine has the ability to modify all three, which it will. In the obvious case, it modifies $error
and $reject
when any of the requested connections from $read
are in an error or connection rejected state.
But PHP will also modify $read
when there is data waiting to be read. This is the key to answering your question:
What happens when the function returns >= 1? How do you know which connections have data ready to be "reaped"? Is $read modified too? i.e. reduced to the set connections that actually have data?
Yes, and that is crucial and not obvious in the docs. $read
is going to get modified to the list of connections that are ready to be read. You'll loop over them and do your business. But an imperative point is glossed over: if $read
is modified, what happens if you put the poll in a loop and try to read from those again? Well, you'll only be reading from a subset, which isn't what you want.
What most examples show when doing a select in PHP is that the source $read
array is copied to a new array before being select'd. In the man page for mysqli_poll
, notice this loop that "resets" the read array just before the call to the mysqli_poll:
foreach ($all_links as $link) {
$links[] = $errors[] = $reject[] = $link;
}
This is perhaps the most important point: each of these arrays passed into mysqli_poll
will get modified when mysqli_poll
finishes: the arrays will be trimmed so that only affected connections are in the result, so you have to reset the arrays each time before you call mysqli_poll
.
Another example is seen in this PHP note on socket_select. Notice how $read = $clients;
before the select?
To your last question:
Lastly, do sec and usec actually do anything? If so, what? I tried setting sec to 0 and usec to 1 (I assume that means 0 seconds + 1 microsecond = 1 microsecond total wait time) but it pauses for more than a second when I run a big query, so it doesn't seem to abort or cause an error when it times out. What does it do?
Yes, it works. These should represent the upper-bound PHP will wait for data to become available on any of the connections in $read
(but read on). It did not work for you because the minimum time is 1 second. When you set 0
for second, even though you had a > 0
microsecond, PHP interpreted that as "wait forever".
As a side note, the unit tests for mysqli_poll might be illuminating.
Update: I wasn't near a computer to test last night. Now that I am, I've got some observations to share.
$ cat mysqli_poll_test
$link = mysqli_connect(...);
$sql = 'SELECT SLEEP(2), 1';
mysqli_query($link, $sql, MYSQLI_ASYNC);
$links = array ($link);
$begin = microtime(true);
$i = 0;
do {
printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
$read = $error = $reject = $links;
mysqli_poll($read, $error, $reject, 1, 500000);
printf(
"finish i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n\n",
$i++, count($read), count($error), count($reject), (microtime(true)-$begin)
);
} while (count($links) !== count($read) + count($error) + count($reject));
$ php mysqli_poll_test
start i=0 @ T+0.000012
finish i=0, count(read, error, reject)=(0, 0, 0) @ T+1.501807
start i=1 @ T+1.501955
finish i=1, count(read, error, reject)=(1, 0, 0) @ T+2.001353
In this test, the long running query is a simple sleep for 2 seconds at the MySQL server. The mysqli_poll
has a timeout of 1.5 seconds. As expected, after 1.5 seconds elapse, the poll returns. Also as expected, there is no data ready to be read, so the do .. while
restarts. After the remaining half-second, the poll returns indicating one link is ready to read. This is expected, because the query takes only 2 seconds to resolve and poll sees that very close to exactly two seconds.
If you change the poll timeout to half a second and re-run:
// changed this from 1 to 0 --------V
mysqli_poll($read, $error, $reject, 0, 500000);
The poll kicks out after half a second, and the loop runs four times, as expected. If you change it to 1 microsecond as in your example, it does kick out after 1 microsecond. And if you change it to 0 seconds and 0 microseconds, it runs as fast as it possibly can.
So, I was definitely wrong when I said 0
meant wait forever.
Let's change our script to have a few more links and try again:
$link0 = mysqli_connect(...);
$link1 = mysqli_connect(...);
$link2 = mysqli_connect(...);
$sql0 = 'SELECT SLEEP(2) AS wait, 1 AS num';
$sql1 = 'SELECT foo FROM';
$sql2 = 'SELECT 2 AS num';
mysqli_query($link0, $sql0, MYSQLI_ASYNC);
mysqli_query($link1, $sql1, MYSQLI_ASYNC);
mysqli_query($link2, $sql2, MYSQLI_ASYNC);
$links = array ($link0, $link1, $link2);
$begin = microtime(true);
$i = 0;
do {
printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
$read = $error = $reject = $links;
$count = mysqli_poll($read, $error, $reject, 1, 500000);
if (0 < $count) {
foreach ($links as $j => $link) {
$result = mysqli_reap_async_query($link);
if (is_object($result)) {
printf("link #%d, row=%s\n", $j, json_encode($result->fetch_assoc()));
mysqli_free_result($result);
} else if (false !== $result) {
printf("link #%d, output=%s\n", $j, $link);
} else {
printf("link #%d, error=%s\n", $j, mysqli_error($link));
}
}
}
printf(
"finish i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n\n",
$i++, count($read), count($error), count($reject), (microtime(true)-$begin)
);
} while (count($links) !== count($read) + count($error) + count($reject));
In this test, I'm expecting two results to resolve immediately: one a syntax error and the other a data row. I'm also expecting this to take 1.5 seconds, since the query sleeping 2 seconds will not resolve until after the timeout expires. That doesn't appear to be the case:
start i=0 @ T+0.000002
link #0, row={"wait":"0","num":"1"}
link #1, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
link #2, row={"num":"2"}
finish i=0, count(read, error, reject)=(1, 0, 0) @ T+2.001756
start i=1 @ T+2.001827
finish i=1, count(read, error, reject)=(0, 0, 3) @ T+3.503024
It waits until the SLEEP(2)
query resolves, violating the assertion that the timeout was the upper-bound for waiting. The reason this happens is the mysqli_reap_async_query
: we're iterating over all links, and each of those is being asked to be reaped. The reaping process waits until the query finishes.
Same as test #2, but this time let's be smart about what we're reaping.
$ cat mysqli_poll.php
<?php
$link0 = mysqli_connect(...);
$link1 = mysqli_connect(...);
$link2 = mysqli_connect(...);
$sql0 = 'SELECT SLEEP(2) AS wait, 1 AS num';
$sql1 = 'SELECT foo FROM';
$sql2 = 'SELECT 2 AS num';
mysqli_query($link0, $sql0, MYSQLI_ASYNC);
mysqli_query($link1, $sql1, MYSQLI_ASYNC);
mysqli_query($link2, $sql2, MYSQLI_ASYNC);
$links = array ($link0, $link1, $link2);
$begin = microtime(true);
$i = 0;
do {
printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
$read = $error = $reject = $links;
$count = mysqli_poll($read, $error, $reject, 1, 500000);
printf(
"check i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n",
$i, count($read), count($error), count($reject), (microtime(true)-$begin)
);
if (0 < $count) {
reap('read', $read);
reap('error', $error);
reap('reject', $reject);
} else {
printf("timeout, no results\n");
}
printf("finish i=%d\n\n", $i++);
} while (count($links) !== count($read) + count($error) + count($reject));
function reap($label, array $links) {
foreach ($links as $link) {
$result = mysqli_reap_async_query($link);
if (is_object($result)) {
printf("%s, row=%s\n", $label, json_encode($result->fetch_assoc()));
mysqli_free_result($result);
} else if (false !== $result) {
printf("%s, output=%s\n", $label, $link);
} else {
printf("%s, error=%s\n", $label, mysqli_error($link));
}
}
}
And now run it.
$ php mysqli_poll.php
start i=0 @ T+0.000003
check i=0, count(read, error, reject)=(1, 0, 0) @ T+0.001007
read, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
finish i=0
start i=1 @ T+0.001256
check i=1, count(read, error, reject)=(1, 0, 1) @ T+0.001327
read, row={"num":"2"}
reject, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
finish i=1
start i=2 @ T+0.001627
check i=2, count(read, error, reject)=(0, 0, 2) @ T+1.503261
timeout, no results
finish i=2
start i=3 @ T+1.503564
check i=3, count(read, error, reject)=(1, 0, 2) @ T+2.001390
read, row={"wait":"0","num":"1"}
reject, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
reject, error=
finish i=3
Much better. Each query resolves in its own good time, with appropriate arrays filled out. The important difference in this example, versus earlier is that we iterate over each of the modified arrays. This runs contrary to some documentation, which shows iterating over all the links themselves.
I've opened documentation bug #70505 on it.