How does mysqli_poll work?

前端 未结 1 659
旧巷少年郎
旧巷少年郎 2021-01-02 16:31

The documentation on mysqli_poll is a bit sparse.

In the example, they create 3 identical arrays containing all the MySQLi connections to check, but if you read the

相关标签:
1条回答
  • 2021-01-02 17:31

    TL;DR: Your suppositions are correct, but take care in your coding.


    mysqli_poll is a thin convenience wrapper around a socket select, and knowing how socket_select works will get you a long way toward understanding this function.

    mysqli_poll is only available when the underlying driver is mysqlnd, because only MySQL ND provides native, socket-level access to the MySQL server. The important point to keep in mind is that the "socket-level" access is what makes polling possible, and why understanding socket select is crucial in understanding the function and limitations of mysqli_poll.

    To answer your questions:

    It sounds to me like $read is an array holding the connections to check, but $error and $reject should be unpopulated vars that will be filled by the function if there are errors. Is that correct?

    Yes, but not a complete picture. Notice the signature of mysqli_poll:

    int mysqli_poll ( array &$read , array &$error , array &$reject , int $sec [, int $usec ] )

    All three arrays are pass-by-reference, meaning the PHP engine has the ability to modify all three, which it will. In the obvious case, it modifies $error and $reject when any of the requested connections from $read are in an error or connection rejected state.

    But PHP will also modify $read when there is data waiting to be read. This is the key to answering your question:

    What happens when the function returns >= 1? How do you know which connections have data ready to be "reaped"? Is $read modified too? i.e. reduced to the set connections that actually have data?

    Yes, and that is crucial and not obvious in the docs. $read is going to get modified to the list of connections that are ready to be read. You'll loop over them and do your business. But an imperative point is glossed over: if $read is modified, what happens if you put the poll in a loop and try to read from those again? Well, you'll only be reading from a subset, which isn't what you want.

    What most examples show when doing a select in PHP is that the source $read array is copied to a new array before being select'd. In the man page for mysqli_poll, notice this loop that "resets" the read array just before the call to the mysqli_poll:

    foreach ($all_links as $link) {
        $links[] = $errors[] = $reject[] = $link;
    }
    

    This is perhaps the most important point: each of these arrays passed into mysqli_poll will get modified when mysqli_poll finishes: the arrays will be trimmed so that only affected connections are in the result, so you have to reset the arrays each time before you call mysqli_poll.

    Another example is seen in this PHP note on socket_select. Notice how $read = $clients; before the select?

    To your last question:

    Lastly, do sec and usec actually do anything? If so, what? I tried setting sec to 0 and usec to 1 (I assume that means 0 seconds + 1 microsecond = 1 microsecond total wait time) but it pauses for more than a second when I run a big query, so it doesn't seem to abort or cause an error when it times out. What does it do?

    Yes, it works. These should represent the upper-bound PHP will wait for data to become available on any of the connections in $read (but read on). It did not work for you because the minimum time is 1 second. When you set 0 for second, even though you had a > 0 microsecond, PHP interpreted that as "wait forever".

    As a side note, the unit tests for mysqli_poll might be illuminating.


    Update: I wasn't near a computer to test last night. Now that I am, I've got some observations to share.

    test 1: long running queries

    $ cat mysqli_poll_test
    $link  = mysqli_connect(...);
    $sql   = 'SELECT SLEEP(2), 1';
    mysqli_query($link, $sql, MYSQLI_ASYNC);
    
    $links = array ($link);
    $begin = microtime(true);
    $i = 0;
    do {
        printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
        $read = $error = $reject = $links;
        mysqli_poll($read, $error, $reject, 1, 500000);
        printf(
            "finish i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n\n",
            $i++, count($read), count($error), count($reject), (microtime(true)-$begin)
        );
    } while (count($links) !== count($read) + count($error) + count($reject));
    
    $ php mysqli_poll_test
    start i=0 @ T+0.000012
    finish i=0, count(read, error, reject)=(0, 0, 0) @ T+1.501807
    
    start i=1 @ T+1.501955
    finish i=1, count(read, error, reject)=(1, 0, 0) @ T+2.001353
    

    In this test, the long running query is a simple sleep for 2 seconds at the MySQL server. The mysqli_poll has a timeout of 1.5 seconds. As expected, after 1.5 seconds elapse, the poll returns. Also as expected, there is no data ready to be read, so the do .. while restarts. After the remaining half-second, the poll returns indicating one link is ready to read. This is expected, because the query takes only 2 seconds to resolve and poll sees that very close to exactly two seconds.

    If you change the poll timeout to half a second and re-run:

    // changed this from 1 to 0 --------V
    mysqli_poll($read, $error, $reject, 0, 500000);
    

    The poll kicks out after half a second, and the loop runs four times, as expected. If you change it to 1 microsecond as in your example, it does kick out after 1 microsecond. And if you change it to 0 seconds and 0 microseconds, it runs as fast as it possibly can.

    So, I was definitely wrong when I said 0 meant wait forever.

    test 2: multiple queries, some error and some long running, with timeouts

    Let's change our script to have a few more links and try again:

    $link0  = mysqli_connect(...);
    $link1  = mysqli_connect(...);
    $link2  = mysqli_connect(...);
    
    $sql0   = 'SELECT SLEEP(2) AS wait, 1 AS num';
    $sql1   = 'SELECT foo FROM';
    $sql2   = 'SELECT 2 AS num';
    
    mysqli_query($link0, $sql0, MYSQLI_ASYNC);
    mysqli_query($link1, $sql1, MYSQLI_ASYNC);
    mysqli_query($link2, $sql2, MYSQLI_ASYNC);
    
    $links  = array ($link0, $link1, $link2);
    $begin = microtime(true);
    $i = 0;
    do {
        printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
        $read = $error = $reject = $links;
        $count = mysqli_poll($read, $error, $reject, 1, 500000);
        if (0 < $count) {
            foreach ($links as $j => $link) {
                $result = mysqli_reap_async_query($link);
                if (is_object($result)) {
                    printf("link #%d, row=%s\n", $j, json_encode($result->fetch_assoc()));
                    mysqli_free_result($result);
                } else if (false !== $result) {
                    printf("link #%d, output=%s\n", $j, $link);
                } else {
                    printf("link #%d, error=%s\n", $j, mysqli_error($link));
                }
            }
        }
        printf(
            "finish i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n\n",
            $i++, count($read), count($error), count($reject), (microtime(true)-$begin)
        );
    } while (count($links) !== count($read) + count($error) + count($reject));
    

    In this test, I'm expecting two results to resolve immediately: one a syntax error and the other a data row. I'm also expecting this to take 1.5 seconds, since the query sleeping 2 seconds will not resolve until after the timeout expires. That doesn't appear to be the case:

    start i=0 @ T+0.000002
    link #0, row={"wait":"0","num":"1"}
    link #1, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
    link #2, row={"num":"2"}
    finish i=0, count(read, error, reject)=(1, 0, 0) @ T+2.001756
    
    start i=1 @ T+2.001827
    finish i=1, count(read, error, reject)=(0, 0, 3) @ T+3.503024
    

    It waits until the SLEEP(2) query resolves, violating the assertion that the timeout was the upper-bound for waiting. The reason this happens is the mysqli_reap_async_query: we're iterating over all links, and each of those is being asked to be reaped. The reaping process waits until the query finishes.

    test 3: long running queries with targeted reaping:

    Same as test #2, but this time let's be smart about what we're reaping.

    $ cat mysqli_poll.php
    <?php
    $link0  = mysqli_connect(...);
    $link1  = mysqli_connect(...);
    $link2  = mysqli_connect(...);
    
    $sql0   = 'SELECT SLEEP(2) AS wait, 1 AS num';
    $sql1   = 'SELECT foo FROM';
    $sql2   = 'SELECT 2 AS num';
    
    mysqli_query($link0, $sql0, MYSQLI_ASYNC);
    mysqli_query($link1, $sql1, MYSQLI_ASYNC);
    mysqli_query($link2, $sql2, MYSQLI_ASYNC);
    
    $links  = array ($link0, $link1, $link2);
    $begin = microtime(true);
    $i = 0;
    do {
        printf("start i=%d @ T+%.f\n", $i, (microtime(true)-$begin));
        $read = $error = $reject = $links;
        $count = mysqli_poll($read, $error, $reject, 1, 500000);
        printf(
            "check i=%d, count(read, error, reject)=(%d, %d, %d) @ T+%f\n",
            $i, count($read), count($error), count($reject), (microtime(true)-$begin)
        );
        if (0 < $count) {
            reap('read', $read);
            reap('error', $error);
            reap('reject', $reject);
        } else {
            printf("timeout, no results\n");
        }
        printf("finish i=%d\n\n", $i++);
    } while (count($links) !== count($read) + count($error) + count($reject));
    
    function reap($label, array $links) {
        foreach ($links as $link) {
            $result = mysqli_reap_async_query($link);
            if (is_object($result)) {
                printf("%s, row=%s\n", $label, json_encode($result->fetch_assoc()));
                mysqli_free_result($result);
            } else if (false !== $result) {
                printf("%s, output=%s\n", $label, $link);
            } else {
                printf("%s, error=%s\n", $label, mysqli_error($link));
            }
        }
    }
    

    And now run it.

    $ php mysqli_poll.php
    start i=0 @ T+0.000003
    check i=0, count(read, error, reject)=(1, 0, 0) @ T+0.001007
    read, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
    finish i=0
    
    start i=1 @ T+0.001256
    check i=1, count(read, error, reject)=(1, 0, 1) @ T+0.001327
    read, row={"num":"2"}
    reject, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
    finish i=1
    
    start i=2 @ T+0.001627
    check i=2, count(read, error, reject)=(0, 0, 2) @ T+1.503261
    timeout, no results
    finish i=2
    
    start i=3 @ T+1.503564
    check i=3, count(read, error, reject)=(1, 0, 2) @ T+2.001390
    read, row={"wait":"0","num":"1"}
    reject, error=You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1
    reject, error=
    finish i=3
    

    Much better. Each query resolves in its own good time, with appropriate arrays filled out. The important difference in this example, versus earlier is that we iterate over each of the modified arrays. This runs contrary to some documentation, which shows iterating over all the links themselves.

    I've opened documentation bug #70505 on it.

    0 讨论(0)
提交回复
热议问题