Selenium: retrieve data that loads while scrolling down

无人及你 2020-12-17 04:14

I'm trying to retrieve elements in a page that has an AJAX-load scroll-down functionality à la Twitter. For some reason this isn't working properly. I added some print sta

2 Answers
  • 2020-12-17 04:25

    In my case the problem was the condition of the while loop: it never became false, so the loop ran forever. I fixed it by recording how many items are on the page after each scroll and exiting once that number stops growing:

    import time

    from selenium.webdriver.common.by import By

    def get_items(items):
        # wd is the WebDriver instance created in the question
        item_nb = [0, 1]  # item counts seen so far; seeded so the loop runs at least once

        # exit the loop when scrolling no longer adds new items to the page
        while item_nb[-1] > item_nb[-2]:
            wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(5)  # give the AJAX request time to load the next batch
            items = wd.find_elements(By.CLASS_NAME, 'stream-item')
            item_nb.append(len(items))

        return items
    


  • 2020-12-17 04:36

    Try putting a sleep between scrolling and re-counting the elements, so the AJAX content has time to load:

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    wd = webdriver.Firefox()
    wd.implicitly_wait(3)

    def get_items(items):
        print(len(items))
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Without a pause here, len(items) and the fresh element count are
        # always equal, because the AJAX request has not finished yet.
        time.sleep(5)  # seconds
        while len(wd.find_elements(By.CLASS_NAME, 'stream-item')) > len(items):
            items = wd.find_elements(By.CLASS_NAME, 'stream-item')
            print(items)
            wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(5)  # let the next batch load before the condition is re-checked
        return items

    def test():
        wd.get('http://twitter.com/')  # replaces the question's get_page helper
        get_items(wd.find_elements(By.CLASS_NAME, 'stream-item'))

    Note: the fixed sleep is only there to demonstrate that the approach works. In real code, use Selenium's explicit waits (WebDriverWait) to wait for a specific condition, such as the item count increasing, instead.
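
As a rough sketch of what waiting on a condition could look like, the snippet below polls until the item count grows or a timeout expires, and stops scrolling once a whole timeout passes with nothing new. It is a hand-rolled stand-in for Selenium's explicit waits, written with only the standard library so it can run without a browser. The names `wait_for_growth`, `scroll_until_exhausted`, `count_items` and `max_scrolls` are made up for illustration, and the timeout values are arbitrary assumptions.

```python
import time


def wait_for_growth(count_items, seen, timeout=10.0, poll=0.5):
    """Poll count_items() until it exceeds `seen` and return the new count.

    Returns `seen` unchanged if nothing new shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = count_items()
        if current > seen:
            return current
        if time.monotonic() >= deadline:
            return seen
        time.sleep(poll)


def scroll_until_exhausted(driver, count_items, timeout=10.0, poll=0.5,
                           max_scrolls=100):
    """Scroll to the bottom repeatedly until no new items load.

    `driver` only needs an execute_script() method, which a Selenium
    WebDriver provides. `count_items` is a hypothetical callback that
    returns the number of items currently on the page.
    """
    seen = count_items()
    for _ in range(max_scrolls):  # hard cap so the loop can never run forever
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        grown = wait_for_growth(count_items, seen, timeout, poll)
        if grown == seen:  # timed out with nothing new: assume the feed has ended
            break
        seen = grown
    return seen


# Assumed wiring with the names used in the answers above:
# total = scroll_until_exhausted(
#     wd, lambda: len(wd.find_elements(By.CLASS_NAME, 'stream-item')))
```

With Selenium itself, `WebDriverWait(wd, 10).until(lambda d: len(d.find_elements(By.CLASS_NAME, 'stream-item')) > seen)` would play the role of `wait_for_growth`. It raises `TimeoutException` rather than returning when the count does not grow, so the loop would catch that exception to stop.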
