How to fetch JavaScript contents in Python

清酒与你 2021-01-23 15:46

The data I want to fetch from a website is stored in JavaScript inside the page. How do I extract it with Python?

The code is here: http://pastebin.com/zhdWT5HM

I want to fetch from \

1 answer
  • 2021-01-23 16:10

    BeautifulSoup only helps you locate the desired script tag; it does not parse the JavaScript inside it. From there you have two options: extract the data with a JavaScript parser such as slimit, or use regular expressions:

    import re
    
    from bs4 import BeautifulSoup
    
    page = """
    <script type="text/javascript">
                var logged = true;
                var video_id = 59374;
                var item_type = 'official';
    
                var debug = false;
                var baseUrl = 'http://www.example.com';
                var base_url = 'http://www.example.com/';
                var assetsBaseUrl = 'http://www.example.com/assets';
                var apiBaseUrl = 'http://www.example.com/common';
                var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
    </script><script type="text/javascript" >
    """
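    # Name the parser explicitly so the result does not depend on which parsers are installed.
    soup = BeautifulSoup(page, "html.parser")
    
    # Lazily capture the value that follows the "playerId" key.
    pattern = re.compile(r'"playerId":"(.*?)"', re.MULTILINE | re.DOTALL)
    # Find the first <script> tag whose text matches the pattern.
    script = soup.find("script", string=pattern)
    
    print(pattern.search(script.string).group(1))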
    

    Prints:

    showsPlayer
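
If more than one field is needed, a regular expression per field gets unwieldy. A rough alternative, sketched here under the assumption that the value assigned to playersData is valid JSON (which is not guaranteed for arbitrary JavaScript), is to locate the assignment and hand the literal to the standard json module. The helper name extract_js_json and the script_text sample are hypothetical, made up for illustration; in practice script_text would be the text of the script tag found with BeautifulSoup.

```python
import json
import re

# Hypothetical sample modelled on the page in the answer above, with the
# playersData literal written as valid JSON (an assumption).
script_text = """
var video_id = 59374;
var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
"""


def extract_js_json(source, var_name):
    """Return the JSON value assigned to `var_name`, or None if it is absent."""
    # Match "var <name> =" plus trailing whitespace, so that match.end()
    # points at the first character of the assigned value.
    match = re.search(r'var\s+' + re.escape(var_name) + r'\s*=\s*', source)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (here, the trailing semicolon).
    value, _end = json.JSONDecoder().raw_decode(source, match.end())
    return value


players = extract_js_json(script_text, "playersData")
print(players[0]["playerId"])
print(players[0]["playlist"][0]["itemId"])
```

This only works when the assigned value is also legal JSON; a literal with single-quoted strings or unquoted keys would raise json.JSONDecodeError, and a JavaScript parser would be the safer choice there.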
    