Get link and href text from html doc with Nokogiri & Ruby?

广开言路 2020-12-28 11:02

I'm trying to use the nokogiri gem to extract all the URLs on a page, along with their link text, and store each link text and URL in a hash.


2 Answers
  •  囚心锁ツ
    2020-12-28 11:48

    Another way:

    h = doc.css('a[href]').each_with_object({}) { |n, h| h[n.text.strip] = n['href'] }
    # yields {"Foo"=>"#foo", "Bar"=>"#bar"}
    

    If you're worried that the same link text might point to different URLs, you can collect the hrefs in arrays instead:

    h = doc.css('a[href]').each_with_object(Hash.new { |h, k| h[k] = [] }) { |n, h| h[n.text.strip] << n['href'] }
    # yields {"Foo"=>["#foo"], "Bar"=>["#bar"]}
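To see why the array version matters, the sketch below compares the two approaches on duplicate link text. It uses plain `[text, href]` pairs as hypothetical stand-ins for Nokogiri nodes, so it runs without the gem; the names `pairs`, `flat` and `grouped` are illustrative only.

```ruby
# Hypothetical [text, href] pairs standing in for Nokogiri nodes;
# "Foo" appears twice with different targets.
pairs = [['Foo', '#foo'], ['Bar', '#bar'], ['Foo', '#foo-2']]

# Plain hash: a later pair with the same text overwrites the earlier one.
flat = pairs.each_with_object({}) { |(text, href), acc| acc[text] = href }

# Hash with a default block: a missing key is initialised to its own empty
# array on first access, so every href for that text is kept.
grouped = pairs.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(text, href), acc|
  acc[text] << href
end

p flat
p grouped
```

In `flat`, "Foo" keeps only the last href, while `grouped` keeps both in insertion order. The block form `Hash.new { ... }` is used rather than `Hash.new([])` because the latter would share a single array across all keys.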
    
