Get data from listings on a website into Excel with VBA

星月不相逢 2021-01-16 06:31

I am trying to find a way to get the data from yelp.com

I have a spreadsheet on which there are several keywords and locations. I am looking to extract data from yel

1 answer
  • 2021-01-16 06:51

    If you right-click the page in IE and choose View Source, you can see that the listing data is not part of the document's .Body.innerText property. This is often the case with dynamically served data, so simply reading .Body.innerText is too simple an approach for most web scraping.

    I open the page in Google Chrome and inspect the elements to get an idea of what I'm really looking for and how to find it with a DOM/HTML parser. For early binding and IntelliSense, add a reference to Microsoft HTML Object Library; the code below uses late binding, so it also runs without the reference.

    I think you can get it to return a collection of the <DIV> tags, and then check each one for the class name with an If statement inside the loop.
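
    A minimal sketch of that idea, assuming html is the document already loaded by IE and that the class fragment you pass in (for example "media-block", taken from the markup at the time of writing) still marks a listing. The procedure name ListDivsByClass is hypothetical:

    ```vba
    ' Hypothetical helper: writes the text of every <DIV> whose class name
    ' contains classFragment into column A of the active sheet, one per row.
    Sub ListDivsByClass(ByVal html As Object, ByVal classFragment As String)
        Dim divs As Object 'IHTMLElementCollection
        Dim d As Object    'IHTMLElement
        Dim r As Long
        Set divs = html.getElementsByTagName("DIV")
        For Each d In divs
            ' className holds the element's whole class attribute as one string
            If InStr(1, d.className, classFragment, vbTextCompare) > 0 Then
                Range("A1").Offset(r, 0).Value = d.innerText
                r = r + 1
            End If
        Next d
    End Sub
    ```

    Note that nested <DIV> elements sharing the same class fragment would each be written out, so a narrow fragment is probably safer than a broad one.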

    I made some revisions to my original answer; this should print each record in a new cell:

    Option Explicit
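    ' Declare the Windows Sleep API; PtrSafe is required on VBA7 (Office 2010 and later)
    #If VBA7 Then
        Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #Else
        Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    #End If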
    Sub find()
    'Uses late binding, or add reference to Microsoft HTML Object Library 
    '  and change variable Types to use intellisense
    Dim ie As Object 'InternetExplorer.Application
    Dim html As Object 'HTMLDocument
    Dim Listings As Object 'IHTMLElementCollection
    Dim l As Object 'IHTMLElement
    Dim r As Long
        Set ie = CreateObject("InternetExplorer.Application")
        With ie
            .Visible = False ' Don't show window
            .Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
            'Wait until IE is done loading page
            Do While .Busy Or .readyState <> 4 ' 4 = READYSTATE_COMPLETE
                Application.StatusBar = "Downloading information, Please wait..."
                DoEvents
                Sleep 200
            Loop
            Set html = .Document
        End With
        Set Listings = html.getElementsByTagName("LI") ' ## returns the list
        For Each l In Listings
            '## make sure this list item looks like the listings Div Class:
            '   then, build the string to put in your cell
            If InStr(1, l.innerHTML, "media-block clearfix media-block-large main-attributes") > 0 Then
                Range("A1").Offset(r, 0).Value = l.innerText
                r = r + 1
            End If
        Next
    
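        Application.StatusBar = False ' restore the default status bar text
        ie.Quit ' close the hidden IE instance so it does not keep running
        Set html = Nothing
        Set ie = Nothing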
    End Sub
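
    The URL above is hard-coded. Since the question mentions a sheet of keywords and locations, here is a rough sketch of building one search URL per row. It assumes keywords in column A and locations in column B (a hypothetical layout), reuses only the find_desc and find_loc parameters visible in the URL above, and relies on WorksheetFunction.EncodeURL, which needs Excel 2013 or later. The names BuildSearchUrl, EncodeQuery and PrintSearchUrls are made up for illustration:

    ```vba
    ' Hypothetical helper: builds a search URL from a keyword and a location.
    Function BuildSearchUrl(ByVal keyword As String, ByVal location As String) As String
        BuildSearchUrl = "http://www.yelp.com/search?find_desc=" & EncodeQuery(keyword) & _
                         "&find_loc=" & EncodeQuery(location)
    End Function

    ' Percent-encodes a value and uses + for spaces, as in the URL above.
    Private Function EncodeQuery(ByVal s As String) As String
        EncodeQuery = Replace(Application.WorksheetFunction.EncodeURL(s), "%20", "+")
    End Function

    ' Assumed layout: keywords in column A, locations in column B, starting at row 1.
    Sub PrintSearchUrls()
        Dim r As Long
        r = 1
        Do While Len(Cells(r, 1).Value) > 0
            Debug.Print BuildSearchUrl(CStr(Cells(r, 1).Value), CStr(Cells(r, 2).Value))
            r = r + 1
        Loop
    End Sub
    ```

    Each generated URL could then be passed to .Navigate in place of the hard-coded string.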
    