Extract multi-line JavaScript content from a <script> tag using Scrapy

借酒劲吻你 2021-01-13 15:37

I'm trying to extract data from this script tag using Scrapy:



        
3 Answers
  •  北荒 (original poster)
    2021-01-13 15:55

    The following regex seems to be correct:

    r"data\.bundles\[[^\]]*\] = {([^}]*)}"
    

    * in regexes is greedy: it always tries to match as much as possible, so I use [^\]] to make sure that I match the closest ]. I do the same for the {} braces with [^}]. Additionally, because a negated character class also matches newlines, I don't have to worry about . not matching a newline.
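
As a rough illustration of the pattern above, here is a minimal sketch using only Python's `re` module. The script text in `SCRIPT_TEXT` is hypothetical (the markup from the question is not shown here), and `extract_bundles` is a made-up helper name, not part of Scrapy.

```python
import re

# Hypothetical script body, invented for illustration only.
SCRIPT_TEXT = """
var data = {};
data.bundles = {};
data.bundles["first"] = {
    name: "Alpha",
    price: 10
};
data.bundles["second"] = {
    name: "Beta",
    price: 20
};
"""

# The regex from the answer: negated character classes stop at the
# closest ']' and '}' and also match across newlines.
BUNDLE_RE = re.compile(r"data\.bundles\[[^\]]*\] = {([^}]*)}")


def extract_bundles(script_text):
    """Return the stripped body of each `data.bundles[...] = {...}` assignment."""
    return [body.strip() for body in BUNDLE_RE.findall(script_text)]


bundles = extract_bundles(SCRIPT_TEXT)
for body in bundles:
    print(body)
    print("---")
```

In a spider, the script text would typically come from a selector call such as `response.xpath("//script/text()").getall()`, with the exact XPath depending on the page. Note that `[^}]*` stops at the first `}`, so an object containing nested braces would be cut short.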
