Results of recommendations using Google Books API are irrelevant

一生所求 2021-02-13 15:07

I am trying to build a book title recommendation system using the Google Books API. Unfortunately, the results I get are far less relevant than those on https://books.google.com. For ex

1 answer
  •  臣服心动
    2021-02-13 15:23

    0. TLDR

    Here's a working fiddle using Google's https://suggestqueries.google.com/complete/search

    Parameters:

    output/client  # "toolbar" => xml, "firefox" => json, "chrome" => jsonp
    ds             #  which site to search in ("bo" for books, "yt" for youtube...)
    q              #  search term: "sher"
    

    Query:

    https://suggestqueries.google.com/complete/search?output=firefox&ds=bo&q=sher

    Results:

    ["sher",["sherlock holmes","sherrilyn kenyon","sherman alexie","sheryl sandberg","sherlock","sherlock holmes short stories","sherlock holmes book","sher o shayari","sherlock holmes novels","sher shah suri"]]
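The result above is plain JSON, so it can be unpacked in a few lines. The following is a minimal sketch that assumes the payload always has the shape shown in the sample (the query first, then an array of suggestions); `parseSuggestions` is a hypothetical helper name, not part of any Google API.

```javascript
// Parse the JSON text returned with output=firefox.
// Assumed shape, taken from the sample result above: [query, [suggestion, ...]].
function parseSuggestions(jsonText) {
    var payload = JSON.parse(jsonText);
    if (!Array.isArray(payload) || !Array.isArray(payload[1])) {
        throw new Error("Unexpected suggestion payload");
    }
    return { query: String(payload[0]), suggestions: payload[1].map(String) };
}

// A shortened copy of the sample result above.
var sample = '["sher",["sherlock holmes","sherrilyn kenyon","sherman alexie"]]';
var parsed = parseSuggestions(sample);
console.log(parsed.query);        // the echoed search term
console.log(parsed.suggestions);  // the list of completions
```

The shape check is there because the endpoint is undocumented, so failing loudly on an unexpected payload seems safer than indexing into it blindly.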
    

    1. Suggestions vs Search Results

    The first thing to realise is that when Google makes suggestions, they are not the results it would show you if you were to hit enter.

    Search Results are relevant if relevant terms are included in your query.

    Suggestions assume that your query is incomplete and therefore compare your query to other queries to guess what the completed version of your query might be.

    When I search "sher" on http://books.google.com the results I see are:

    • The Israeli-Palestinian Peace Negotiations, 1999-2001
    • Beyond Neutrality: Perfectionism and Politics
    • Desert
    • Refuse to Choose!: Use All of Your Interests, Passions, ...

    The reason for this is the author: In the case of the first three, "George Sher" and in the case of the fourth "Barbara Sher". This is desired behaviour because when I search "sher" I don't want "Sherlock" results burying "George Sher".
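The difference can be illustrated with a toy model. This is only a sketch of the idea, not how Google ranks anything: the catalogue, the list of past queries, and the functions `search` and `suggest` are all invented for illustration.

```javascript
// Toy data, invented for illustration.
var books = [
    { title: "Desert", author: "George Sher" },
    { title: "Refuse to Choose!", author: "Barbara Sher" },
    { title: "The Adventures of Sherlock Holmes", author: "Arthur Conan Doyle" }
];
var pastQueries = ["sherlock holmes", "sheryl sandberg", "george sher"];

// Search: the term is taken as complete, so it must match a whole word
// in the title or the author name.
function search(term, catalogue) {
    var wanted = term.toLowerCase();
    return catalogue.filter(function (book) {
        var words = (book.title + " " + book.author).toLowerCase().split(/\W+/);
        return words.indexOf(wanted) !== -1;
    });
}

// Suggest: the input is taken as unfinished, so it is completed
// from other queries that start with it.
function suggest(prefix, queries) {
    var start = prefix.toLowerCase();
    return queries.filter(function (query) {
        return query.indexOf(start) === 0;
    });
}

console.log(search("sher", books));        // the two books by authors named Sher
console.log(suggest("sher", pastQueries)); // queries beginning with "sher"
```

In this model "sher" finds George Sher and Barbara Sher when searched, but completes to "sherlock holmes" when treated as a prefix, which mirrors the behaviour described above.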


    2. The Solution

    Google has a kind of API for its suggestions as well. Some information about it can be found here. More significantly, however, the developer tools let you see precisely what Google is doing.

    Using Developer Tools: Inspect the https://books.google.com page (CTRL+SHIFT+i in Chrome). Go to the network tab and wait until everything is loaded.

    When you begin typing, Google fires requests to the server which you will see populate in the list. When I typed "sher", Google sent this request:

    https://suggestqueries.google.com/complete/search?client=books&ds=bo&q=sher&callback=_callbacks_._1id33zyi5
    

    Look at the variables:

    client   = books
    ds       = bo
    q        = sher
    callback = _callbacks_._1id33zyi5
    
    • client determines the type of result that you receive (XML [toolbar], JSON [firefox], JSONP [chrome])
    • ds limits the search to a specific site (books [bo], youtube [yt] etc.).
    • q is, of course, the query text
    • callback is a parameter used for JSONP (which has some important differences to JSON). Don't worry too much about it because jQuery can handle this for you.
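The parameters listed above can be assembled into a request URL with the standard `URLSearchParams` class, which also takes care of escaping the query text. This is a sketch only: `buildSuggestUrl` and `SUGGEST_ENDPOINT` are names made up here, and the callback parameter is left out on the assumption that a JSONP library adds it.

```javascript
var SUGGEST_ENDPOINT = "https://suggestqueries.google.com/complete/search";

// Build a suggestion URL from the three parameters described above.
function buildSuggestUrl(options) {
    var params = new URLSearchParams();
    params.set("client", options.client); // response format, e.g. "firefox" or "chrome"
    params.set("ds", options.ds);         // site to search in, e.g. "bo" for books
    params.set("q", options.q);           // the (possibly incomplete) query text
    return SUGGEST_ENDPOINT + "?" + params.toString();
}

console.log(buildSuggestUrl({ client: "firefox", ds: "bo", q: "sher" }));
// https://suggestqueries.google.com/complete/search?client=firefox&ds=bo&q=sher
```

Building the query string this way means spaces, ampersands and other special characters in the search term do not break the URL.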

    I pieced together bits of information on these parameters by looking at this request and by reading this and this.

    CORS: Because you are making the request from a domain other than google.com, the browser will block it with an Access-Control-Allow-Origin error. This comes from the same-origin policy, a security measure that stops a page's scripts from reading responses from other sites. To get around it, you will need to use JSONP.
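To make the JSONP idea concrete: the server wraps the JSON in a call to a function you name, and the browser runs that text as a script, which is how jQuery's "jsonp" data type gets the data across origins. The sketch below only simulates that step outside a browser. `receiveJsonp`, the callback name `handleSuggestions`, and the sample body are all assumptions made for illustration, and the callback name must be a plain identifier for this to work.

```javascript
// Simulate what a <script> tag does with a JSONP body: run it as code
// while a function with the agreed callback name is in scope.
function receiveJsonp(body, callbackName) {
    var received;
    var callback = function (data) {
        received = data;
    };
    // new Function stands in for the browser executing the script.
    // Only ever do this with text you fully trust.
    new Function(callbackName, body)(callback);
    return received;
}

// Hypothetical JSONP body: ordinary JSON wrapped in a function call.
var jsonpBody = 'handleSuggestions(["sher",["sherlock holmes","sherlock"]]);';
var jsonpData = receiveJsonp(jsonpBody, "handleSuggestions");
console.log(jsonpData[1]); // the suggestions array from the sample body
```

Because the response is executed rather than read, JSONP requires trusting the server completely, which is one of the important differences from plain JSON.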

    Using jQuery, we needn't worry about the callback so let's change the client parameter to chrome and use this final query:

    https://suggestqueries.google.com/complete/search?client=chrome&ds=bo&q=sher

    Working Example Below: In this example, you may want to take note of the "google:suggestrelevance" key which is an added bonus of using JSONP (Google only returns that information in JSONP data).

    var requestUrl = "https://suggestqueries.google.com/complete/search?client=chrome&ds=bo&q=";
    var xhr;
    
    $(document).on("input", "#query", function () {
        typewatch(function () {
            // Here's the bit that matters
            var queryTerm = $("#query").val();
            $("#indicator").show();
            if (xhr != null) xhr.abort();
            xhr = $.ajax({
                // Escape the term so spaces, "&" and similar characters survive in the URL
                url: requestUrl + encodeURIComponent(queryTerm),
                dataType: "jsonp",
                success: function (response) {
                    $("#indicator").hide();
                    $("#response").html(syntaxHighlight(response));
                }
            });
        }, 500);
    });
    
    
    /*
     *  --------- YOU ONLY NEED WHAT IS ABOVE THIS LINE ---------
     */
    $(document).ready(function () {
        $("#indicator").hide();
    });
    
    // Just for fun, some syntax highlighting...
    // Credit: http://stackoverflow.com/a/7220510/123415
    function syntaxHighlight(json) {
        if (typeof json != 'string') {
            json = JSON.stringify(json, undefined, 2);
        }
        // Escape HTML special characters before wrapping tokens in markup
        json = json.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
        return json.replace(/("(\\u[a-zA-Z0-9]{4}|\\[^u]|[^\\"])*"(\s*:)?|\b(true|false|null)\b|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?)/g, function (match) {
            var cls = 'number';
            if (/^"/.test(match)) {
                if (/:$/.test(match)) {
                    cls = 'key';
                } else {
                    cls = 'string';
                }
            } else if (/true|false/.test(match)) {
                cls = 'boolean';
            } else if (/null/.test(match)) {
                cls = 'null';
            }
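            // Wrap each token in a span whose class selects the colour in the CSS
            return '<span class="' + cls + '">' + match + '</span>';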
        });
    }
    
    // And automatic searching (when you stop typing)
    // Credit: http://stackoverflow.com/a/2219966/123415
    var typewatch = (function () {
        var timer = 0;
        return function (callback, ms) {
            clearTimeout(timer);
            timer = setTimeout(callback, ms);
        };
    })();

    /*
     * CSS, safe to ignore:
     * This is just to make stuff look vaguely decent
     */
    body {
      padding: 10px;
    }
    div * {
        vertical-align: top;
    }
    #indicator {
        display: inline-block;
        background: no-repeat center/100% url('http://galafrica.actstudio.ro/img/busy_indicator.gif');
        width: 17px;
        height: 17px;
        margin: 3px;
    }
    /*
     *
     * CREDIT:
     * http://stackoverflow.com/a/7220510/123415
     */
     pre {
        outline: 1px solid #ccc;
        padding: 5px;
    }
    .string {
        color: green;
    }
    .number {
        color: darkorange;
    }
    .boolean {
        color: blue;
    }
    .null {
        color: red;
    }
    .key {
        color: #008;
    }
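The "google:suggestrelevance" key mentioned above could be used to order the suggestions yourself. The sketch below assumes a payload layout that the answer does not spell out: suggestions at index 1 and a trailing object whose "google:suggestrelevance" array lines up with them one to one. `rankSuggestions` is a hypothetical helper and the scores in the sample are made up.

```javascript
// Pair each suggestion with its relevance score and sort best-first.
// Assumed layout: [query, [suggestions], ..., { "google:suggestrelevance": [scores] }]
function rankSuggestions(payload) {
    var suggestions = payload[1] || [];
    var extras = payload[payload.length - 1];
    var scores = (extras && extras["google:suggestrelevance"]) || [];
    return suggestions
        .map(function (text, i) {
            // A missing score counts as 0, so unscored items sink to the bottom.
            return { text: text, relevance: typeof scores[i] === "number" ? scores[i] : 0 };
        })
        .sort(function (a, b) {
            return b.relevance - a.relevance;
        });
}

// Sample payload with invented scores.
var samplePayload = [
    "sher",
    ["sherlock", "sherlock holmes", "sher shah suri"],
    [],
    [],
    { "google:suggestrelevance": [600, 900, 550] }
];
console.log(rankSuggestions(samplePayload));
```

In the working example, this would be called on `response` inside the `success` handler before rendering.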
    
    
