Amazon Sales Data (with Excel VBA)

礼貌的吻别 2020-12-20 05:08

I'm trying to obtain the result number (in the HTML code) for each keyword I search, using Excel VBA. I have tried narrowing down the criteria by className, id, and data-asin, but t

2 answers
  • 2020-12-20 05:26

    Here is an example that downloads products from Amazon for each search query listed on the sheet Terms and populates the sheet Products with ASINs and descriptions. It uses XHR, so IE isn't needed. The code is as follows:

    Sub Test()
        lngRow = 1
        ' search each term
        For Each strTerm In Sheets("Terms").UsedRange
            lngPage = 1
            Do
                ' HTTP GET request of the search result page
                strUrl = "https://www.amazon.com/s/ref=nb_sb_noss_2?page=" & lngPage & "&keywords=" & EncodeUriComponent(strTerm)
                Set objXHR = CreateObject("MSXML2.XMLHttp")
                objXHR.Open "GET", strUrl, False
                objXHR.Send
                strResp = objXHR.ResponseText
                ' split response to array by items
                arrResp = Split(strResp, "<li id=""result_")
                ' process each item on the page
                For i = 1 To UBound(arrResp)
                    strItem = arrResp(i)
                    ' extract ASIN
                    strTmp = Split(strItem, "data-asin=""")(1)
                    strTmp = Split(strTmp, """")(0)
                    Sheets("Products").Cells(lngRow, 1).NumberFormat = "@"
                    Sheets("Products").Cells(lngRow, 1).Value = strTmp
                    ' extract the product description
                    strTmp = Split("<li id=""result_" & strItem, "</li>")(0) & "</li>"
                    Sheets("Products").Cells(lngRow, 2).Value = GetInnerText(strTmp)
                    ' show current item
                    Sheets("Products").Cells(lngRow, 1).Select
                    ' next row
                    lngRow = lngRow + 1
                Next
                ' adjust sheet
                Sheets("Products").Columns.AutoFit
                Sheets("Products").Rows.AutoFit
                ' next page
                lngPage = lngPage + 1
            Loop Until UBound(arrResp) = 0 ' empty search result
        Next
    End Sub
    
    Function EncodeUriComponent(strText)
        Static objHtmlfile As Object
        If objHtmlfile Is Nothing Then
            Set objHtmlfile = CreateObject("htmlfile")
            objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
        End If
        EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
    End Function
    
    Function GetInnerText(strHtmlContent)
        Dim objHtmlFile, objBody
        Set objHtmlFile = CreateObject("htmlfile")
        objHtmlFile.write strHtmlContent
        Set objBody = objHtmlFile.getElementsByTagName("body")(0)
        GetInnerText = Trim(objBody.innerText)
    End Function
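
    As a possible alternative to the htmlfile-based EncodeUriComponent above, recent Excel versions expose a built-in URL encoder. This is only a sketch: the name EncodeUriComponentAlt is made up for illustration, it assumes Excel 2013 or later on Windows, and its output may differ from JavaScript's encodeURIComponent for some characters.

    Function EncodeUriComponentAlt(strText)
        ' Assumption: Excel 2013+ on Windows, where WorksheetFunction.EncodeURL is available
        ' CStr converts a single-cell Range (or any other value) to a plain string first
        EncodeUriComponentAlt = Application.WorksheetFunction.EncodeURL(CStr(strText))
    End Function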
    

    I placed on the Terms sheet:

    Results on the Products sheet contain 571 items:

    It's not a complete answer, but I hope it helps you.
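
    One caveat with the Split(...)(1) expressions in the code above: if an item does not contain the marker (for instance, if the page markup differs from what the code expects), Split returns a single-element array and index 1 raises a "Subscript out of range" error. A minimal sketch of a more forgiving helper follows; the name ExtractBetween is hypothetical and not part of the original code.

    Function ExtractBetween(strText, strOpen, strClose)
        ' Returns the text between the first strOpen and the next strClose,
        ' or an empty string when either marker is missing.
        Dim lngStart, lngEnd
        ExtractBetween = ""
        lngStart = InStr(1, strText, strOpen, vbBinaryCompare)
        If lngStart = 0 Then Exit Function
        ' move past the opening marker
        lngStart = lngStart + Len(strOpen)
        lngEnd = InStr(lngStart, strText, strClose, vbBinaryCompare)
        If lngEnd = 0 Then Exit Function
        ExtractBetween = Mid(strText, lngStart, lngEnd - lngStart)
    End Function

    With it, the ASIN extraction could be written as strTmp = ExtractBetween(strItem, "data-asin=""", """"), and items for which it returns an empty string could simply be skipped.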

  • 2020-12-20 05:27

    Through trial and error, I finally solved this bloody thing. I just had to take out part of the code which included the "And InStr(TDelement.ID, "result")" and then everything ran smooth as butter.
