I am trying to scrape some data from a page with a table-based layout. So, to get some of the data, I need to get something like the 3rd table inside the 2nd table inside the 5th table insid
For nth-of-type, does the following example help?
user> (require '[net.cgrand.enlive-html :as html])
user> (def test-html
        "<html><body><p>first</p><p>second</p><p>third</p></body></html>")
#'user/test-html
user> (html/select (html/html-resource (java.io.StringReader. test-html))
[[:p (html/nth-of-type 2)]])
({:tag :p, :attrs nil, :content ["second"]})
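The same predicate can be chained to walk into nested tables, which is closer to the "3rd table inside 2nd table" case in the question. This is only a sketch: the markup, the ids, and the var names (nested-html, nested-nodes, target) are invented for illustration, and the selector is meant to pick out the table with id inner-3.

```clojure
(require '[net.cgrand.enlive-html :as html])

;; Hypothetical markup: two top-level tables, the second one holding
;; three nested tables in a single cell.
(def nested-html
  (str "<html><body>"
       "<table id='outer-1'><tr><td>x</td></tr></table>"
       "<table id='outer-2'><tr><td>"
       "<table id='inner-1'><tr><td>a</td></tr></table>"
       "<table id='inner-2'><tr><td>b</td></tr></table>"
       "<table id='inner-3'><tr><td>c</td></tr></table>"
       "</td></tr></table>"
       "</body></html>"))

(def nested-nodes
  (html/html-resource (java.io.StringReader. nested-html)))

;; Read left to right: the 2nd table directly under <body>, then any <td>
;; below it, then the 3rd table that is a direct child of that <td>.
(def target
  (html/select nested-nodes
               [:body :> [:table (html/nth-of-type 2)]
                :td :> [:table (html/nth-of-type 3)]]))

;; Print the ids of whatever matched.
(println (map #(get-in % [:attrs :id]) target))
```

The :td step is deliberately a descendant match rather than a chain of :> :tr :> :td, in case the parser normalizes the table structure (for example by adding a tbody). If your real page nests more deeply, each extra level would be another :td :> [:table (html/nth-of-type n)] pair.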
No idea about the second issue. Your approach seems to work with a naive test:
user> (def test-html "<html><body><div><p>in div</p></div><p>not in div</p></body></html>")
#'user/test-html
user> (html/select (html/html-resource (java.io.StringReader. test-html)) [:body :> :p])
({:tag :p, :attrs nil, :content ["not in div"]})
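To see what :> contributes, it may help to run the same kind of document through both a child selector and a plain descendant selector. This is a sketch with made-up markup and var names (sample-html, sample-nodes, direct-children, all-descendants), not your actual page.

```clojure
(require '[net.cgrand.enlive-html :as html])

;; Hypothetical markup: one <p> nested in a <div>, one directly under <body>.
(def sample-html
  "<html><body><div><p>in div</p></div><p>not in div</p></body></html>")

(def sample-nodes
  (html/html-resource (java.io.StringReader. sample-html)))

;; :> restricts the match to direct children of <body>.
(def direct-children
  (html/select sample-nodes [:body :> :p]))

;; Without :>, any <p> below <body> matches, at any depth.
(def all-descendants
  (html/select sample-nodes [:body :p]))

(println (map html/text direct-children))
(println (map html/text all-descendants))
```

If the child selector returns nothing on the real page, one thing worth checking is whether the parsed tree has an extra element between the two steps that the raw HTML does not show.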
Any chance of looking at your actual HTML?
Update: (in response to the comment)
Here's another example where "the second this is not the one nor this or for that matter this skip this one too definitely not this one not this one not this one either not this one, but almost this one certainly not this one inside the
user> (def test-html "