processingFailure error (400) while retrieving CommentThreads list


I am trying to retrieve all the comments of a video in Python by iterating over the result pages. I am correctly authenticated with a developer key.

import googleapiclient.discovery


                      


      
      
        
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
感情败类, 2020-12-07 05:53
              
            
            
                                                                       
The issue is that we can't actually retrieve all the comments of every video.


  https://issuetracker.google.com/issues/134912604

  We currently don't support paging through the whole stream. So there's no way to retrieve all the 1000+ commentThreads that you have for that video

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
天涯浪人, 2020-12-07 05:54
              
            
            
                                                                       
According to the sample code for Google's Python client library and for the YouTube API, your pagination loop should look like this:

request = yt.commentThreads().list(...)
while request:
    response = request.execute()
    # your processing code goes here ...
    request = yt.commentThreads().list_next(request, response)
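For illustration, the same loop can be wrapped in a generator. This is only a sketch: the function name `iter_comment_threads` and its `page_size` argument are hypothetical, the `youtube` argument is assumed to be a client object built with `googleapiclient.discovery.build`, and the request parameters are borrowed from the GET query shown elsewhere in this thread. It does not lift any server-side limit on how far the API will page.

```python
def iter_comment_threads(youtube, video_id, page_size=100):
    """Yield each commentThread item, one result page at a time.

    `youtube` is assumed to be a YouTube Data API v3 client object;
    the function name and arguments are illustrative only.
    """
    resource = youtube.commentThreads()
    request = resource.list(
        part="snippet,replies",
        videoId=video_id,
        maxResults=page_size,
        textFormat="plainText",
    )
    while request is not None:
        response = request.execute()
        for item in response.get("items", []):
            yield item
        # list_next() returns None when the response has no nextPageToken,
        # which ends the loop.
        request = resource.list_next(request, response)
```

A caller would then write something like `for thread in iter_comment_threads(yt, VIDEO_ID): ...`, where `yt` is the client from the snippet above and `VIDEO_ID` is a placeholder.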

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
长情又很酷, 2020-12-07 06:03
              
            
            
                                                                       
This is not a solution to your problem. It only shows that querying the endpoint
with a GET request succeeds in obtaining the needed page response from the API.

# comments-wget [-d] VIDEO_ID [PAGE_TOKEN]

$ comments-wget() { 
    local x='eval'
    [ "$1" == '-d' ] && { 
        x='echo'
        shift
    }

    local v="$1"
    quote2 -i v

    local p="$2"
    quote2 -i p

    local O="/tmp/$v-comments%d.json"
    local o
    local k=0
    while :; do
        printf -v o "$O" "$k"
        [ ! -f "$o" ] && break
        (( k++ ))
    done
    quote o

    k="$APP_KEY"
    quote2 -i k
    local a="$AGENT"
    quote2 a

    local c="\
wget \
--debug \
--verbose \
--no-check-certificate \
--output-document=$o \
--user-agent=$a \
'https://www.googleapis.com/youtube/v3/commentThreads?key=$k&videoId=$v&part=replies,snippet&order=relevance&maxResults=100&textFormat=plainText&alt=json${p:+&pageToken=$p}'"

    $x "$c"
}

$ PAGE_TOKEN=...

$ AGENT=... APP_KEY=... comments-wget CJ_GCPaKywg "$PAGE_TOKEN"
Setting --verbose (verbose) to 1
Setting --check-certificate (checkcertificate) to 0
Setting --output-document (outputdocument) to /tmp/CJ_GCPaKywg-comments0.json
Setting --user-agent (useragent) to ...
DEBUG output created by Wget 1.14 on linux-gnu.

--2019-06-10 17:41:11--  https://www.googleapis.com/youtube/v3/commentThreads?...
Resolving www.googleapis.com... 172.217.19.106, 216.58.214.202, 216.58.214.234, ...
Caching www.googleapis.com => 172.217.19.106 216.58.214.202 216.58.214.234 172.217.16.106 172.217.20.10 2a00:1450:400d:808::200a
Connecting to www.googleapis.com|172.217.19.106|:443... connected.
Created socket 5.
Releasing 0x0000000000ae57c0 (new refcount 1).

---request begin---
GET /youtube/v3/commentThreads?.../1.1
User-Agent: ...
Accept: */*
Host: www.googleapis.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 200 OK
Expires: Mon, 10 Jun 2019 14:43:39 GMT
Date: Mon, 10 Jun 2019 14:43:39 GMT
Cache-Control: private, max-age=0, must-revalidate, no-transform
ETag: "XpPGQXPnxQJhLgs6enD_n8JR4Qk/OUAqOrEpA9YYqmVx0wqn9en_OrE"
Vary: Origin
Vary: X-Origin
Content-Type: application/json; charset=UTF-8
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Content-Length: 205965
Server: GSE
Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"

---response end---
200 OK
Registered socket 5 for persistent reuse.
Length: 205965 (201K) [application/json]
Saving to: ‘/tmp/CJ_GCPaKywg-comments0.json’

100%[==========================================>] 205,965      580KB/s   in 0.3s   

2019-06-10 17:41:18 (580 KB/s) - ‘/tmp/CJ_GCPaKywg-comments0.json’ saved [205965/205965]


Note that the shell functions quote and quote2 above are those from youtube-data.sh (they are not really needed). $PAGE_TOKEN is extracted from the body string of the JSON request object posted above.



The next question is: why does your Python code use a POST request method?
Could that be the cause of your problem?
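
To compare against what your Python code sends, the URL that the shell function passes to wget can be rebuilt in Python. This is a rough sketch: the function name `build_comment_threads_url` is hypothetical, and the query parameters are copied from the wget command above. `pageToken` is added only when a token is supplied, mirroring the `${p:+&pageToken=$p}` expansion.

```python
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_comment_threads_url(api_key, video_id, page_token=None):
    """Return the GET URL for one page of comment threads (illustrative)."""
    params = {
        "key": api_key,
        "videoId": video_id,
        "part": "replies,snippet",
        "order": "relevance",
        "maxResults": 100,
        "textFormat": "plainText",
        "alt": "json",
    }
    # Append pageToken only when one is given, as the shell function does.
    if page_token:
        params["pageToken"] = page_token
    return f"{API_URL}?{urlencode(params)}"
```

Opening the resulting URL with `urllib.request.urlopen(url)` and no request body issues a GET, so its response can be compared directly with the one your client library call returns.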
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         