processingFailure error (400) while retrieving CommentThreads list

孤城傲影 2020-12-07 05:11

I am trying to retrieve all the comments of a video via Python iteration/paging. I am logged in correctly with a developer key.

import googleapiclient.discovery          


        
3 Answers
  • 2020-12-07 05:53

    The issue is that it is not actually possible to retrieve all the comments of every video. See:

    https://issuetracker.google.com/issues/134912604
    We currently don't support paging through the whole stream. So there's no way to retrieve all the 1000+ commentThreads that you have for that video.

  • 2020-12-07 05:54

    According to Google's Python client library sample code and Google's YouTube API sample code, the pagination loop should be written as follows:

    request = yt.commentThreads().list(...)
    while request:
        response = request.execute()
        # your processing code goes here ...
        request = yt.commentThreads().list_next(request, response)
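
To make the loop above concrete, here is a minimal sketch that wraps it in a generator. It assumes `yt` is a service object already built with `googleapiclient.discovery.build("youtube", "v3", developerKey=...)`. The function name `iter_comment_threads` and the default `part` and `max_results` values are illustrative assumptions, not taken from the question.

```python
def iter_comment_threads(yt, video_id, part="snippet,replies", max_results=100):
    """Yield every commentThread item the API returns for video_id, page by page.

    `yt` is assumed to be a YouTube Data API v3 service object, e.g. one created
    with googleapiclient.discovery.build("youtube", "v3", developerKey=...).
    """
    request = yt.commentThreads().list(
        part=part,
        videoId=video_id,
        maxResults=max_results,
        textFormat="plainText",
    )
    # list_next() returns None once the response carries no nextPageToken,
    # which ends the loop.
    while request is not None:
        response = request.execute()
        for item in response.get("items", []):
            yield item
        request = yt.commentThreads().list_next(request, response)
```

It could be used as `for thread in iter_comment_threads(yt, VIDEO_ID): ...`. The generator only yields what the API chooses to return, so it does not work around any server-side limit on how far the stream can be paged.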
    
  • 2020-12-07 06:03

    This is not a solution to your problem. It only shows that querying the endpoint with a GET request succeeds in obtaining the needed page response from the API.

    # comments-wget [-d] VIDEO_ID [PAGE_TOKEN]
    
    $ comments-wget() { 
        local x='eval'
        [ "$1" == '-d' ] && { 
            x='echo'
            shift
        }
    
        local v="$1"
        quote2 -i v
    
        local p="$2"
        quote2 -i p
    
        local O="/tmp/$v-comments%d.json"
        local o
        local k=0
        while :; do
            printf -v o "$O" "$k"
            [ ! -f "$o" ] && break
            (( k++ ))
        done
        quote o
    
        k="$APP_KEY"
        quote2 -i k
        local a="$AGENT"
        quote2 a
    
        local c="\
    wget \
    --debug \
    --verbose \
    --no-check-certificate \
    --output-document=$o \
    --user-agent=$a \
    'https://www.googleapis.com/youtube/v3/commentThreads?key=$k&videoId=$v&part=replies,snippet&order=relevance&maxResults=100&textFormat=plainText&alt=json${p:+&pageToken=$p}'"
    
        $x "$c"
    }
    
    $ PAGE_TOKEN=...
    
    $ AGENT=... APP_KEY=... comments-wget CJ_GCPaKywg "$PAGE_TOKEN"
    Setting --verbose (verbose) to 1
    Setting --check-certificate (checkcertificate) to 0
    Setting --output-document (outputdocument) to /tmp/CJ_GCPaKywg-comments0.json
    Setting --user-agent (useragent) to ...
    DEBUG output created by Wget 1.14 on linux-gnu.
    
    --2019-06-10 17:41:11--  https://www.googleapis.com/youtube/v3/commentThreads?...
    Resolving www.googleapis.com... 172.217.19.106, 216.58.214.202, 216.58.214.234, ...
    Caching www.googleapis.com => 172.217.19.106 216.58.214.202 216.58.214.234 172.217.16.106 172.217.20.10 2a00:1450:400d:808::200a
    Connecting to www.googleapis.com|172.217.19.106|:443... connected.
    Created socket 5.
    Releasing 0x0000000000ae57c0 (new refcount 1).
    
    ---request begin---
    GET /youtube/v3/commentThreads?.../1.1
    User-Agent: ...
    Accept: */*
    Host: www.googleapis.com
    Connection: Keep-Alive
    
    ---request end---
    HTTP request sent, awaiting response... 
    ---response begin---
    HTTP/1.1 200 OK
    Expires: Mon, 10 Jun 2019 14:43:39 GMT
    Date: Mon, 10 Jun 2019 14:43:39 GMT
    Cache-Control: private, max-age=0, must-revalidate, no-transform
    ETag: "XpPGQXPnxQJhLgs6enD_n8JR4Qk/OUAqOrEpA9YYqmVx0wqn9en_OrE"
    Vary: Origin
    Vary: X-Origin
    Content-Type: application/json; charset=UTF-8
    X-Content-Type-Options: nosniff
    X-Frame-Options: SAMEORIGIN
    X-XSS-Protection: 1; mode=block
    Content-Length: 205965
    Server: GSE
    Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"
    
    ---response end---
    200 OK
    Registered socket 5 for persistent reuse.
    Length: 205965 (201K) [application/json]
    Saving to: ‘/tmp/CJ_GCPaKywg-comments0.json’
    
    100%[==========================================>] 205,965      580KB/s   in 0.3s   
    
    2019-06-10 17:41:18 (580 KB/s) - ‘/tmp/CJ_GCPaKywg-comments0.json’ saved [205965/205965]
    

    Note that the shell functions quote and quote2 above are those from youtube-data.sh (they are not really needed). $PAGE_TOKEN is extracted from the body string of the JSON request object posted above.
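
For readers who prefer Python to shell, the same GET request can be sketched with the standard library alone. The helper names `comment_threads_url` and `fetch_comment_threads` are hypothetical. The query parameters mirror those in the wget command above, and `api_key` stands for the value passed as `$APP_KEY`.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def comment_threads_url(api_key, video_id, page_token=None):
    """Build the GET URL used by the wget command, optionally with a page token."""
    params = {
        "key": api_key,
        "videoId": video_id,
        "part": "replies,snippet",
        "order": "relevance",
        "maxResults": 100,
        "textFormat": "plainText",
        "alt": "json",
    }
    if page_token:  # mirrors ${p:+&pageToken=$p}: only added when non-empty
        params["pageToken"] = page_token
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_comment_threads(api_key, video_id, page_token=None):
    """Issue the GET request and return the decoded JSON page (needs network access)."""
    url = comment_threads_url(api_key, video_id, page_token)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Keeping URL construction separate from the network call makes it easy to print the URL and compare it with the one a client library sends.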


    The next question is: why does your Python code use a POST request method? Could that be the cause of your problem?
