You are looking for Watir, which drives a real browser and lets you perform any action a user could perform on a web page. Selenium is a similar project.
You can even use Watir with a so-called 'headless' browser on a Linux machine.
Watir headless example
Suppose we have this HTML:
<div id="hello">Hello from HTML</div>
and this JavaScript:
document.getElementById('hello').innerHTML = 'Hello from JavaScript';
(Demo: http://jsbin.com/ivihur)
and you want to get the dynamically inserted text. First, you need a Linux box with xvfb and firefox installed. For example, on Ubuntu:
$ apt-get install xvfb firefox
You will also need the watir-webdriver and headless gems, so go ahead and install them as well:
$ gem install watir-webdriver headless
Then you can read the dynamic content from the page with something like this:
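require 'rubygems'
require 'watir-webdriver'
require 'headless'

# Start a virtual X display (Xvfb) so Firefox can run without a screen
headless = Headless.new
headless.start

# Launch the browser and load the page
browser = Watir::Browser.new
browser.goto 'http://jsbin.com/ivihur' # our example

# Fetch the element and print its current (JavaScript-inserted) text
el = browser.element :css => '#hello'
puts el.text

# Shut down the browser and the virtual display
browser.close
headless.destroy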
If everything went right, this will output:
Hello from JavaScript
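If a page sets its text a little later than the example does, reading it right away could still return the old HTML text. A rough way around that is to poll until the value changes. The sketch below is plain Ruby; wait_for is a hypothetical helper name of my own, not something Watir provides, and the usage in the trailing comment is an untested assumption built on the script above.

```ruby
# Hypothetical helper (not part of Watir): call the block repeatedly until it
# returns a truthy value, or raise once `timeout` seconds have passed.
def wait_for(timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Possible use with the browser from the script above (assumed, untested here):
#   text = wait_for do
#     t = browser.element(:css => '#hello').text
#     t unless t == 'Hello from HTML'
#   end
```

The block returns nil while the old text is still there, so the helper keeps polling until the new text shows up or the timeout is hit.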
I know this runs a browser in the background as well, but it's the easiest solution to your problem I could come up with. It takes quite a while to start the browser, but subsequent requests are quite fast. (Running goto and then fetching the dynamic text above multiple times took about 0.5 seconds per request on my Rackspace Cloud Server.)
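To check the per-request cost on your own machine, you could time repeated fetches through one browser session, so the startup cost is paid only once. This is only a sketch: time_fetches is a hypothetical helper name, and it assumes the object you pass in responds to goto and element the way the browser in the script above does.

```ruby
require 'benchmark'

# Hypothetical helper: load `url` and read the text of the element matching
# `css`, `runs` times, through the same browser object. Returns an array with
# the elapsed seconds of each run.
def time_fetches(browser, url, css, runs)
  Array.new(runs) do
    Benchmark.realtime do
      browser.goto url
      browser.element(:css => css).text
    end
  end
end

# Possible use with the setup from the script above (assumed, untested here):
#   browser = Watir::Browser.new
#   timings = time_fetches(browser, 'http://jsbin.com/ivihur', '#hello', 5)
#   timings.each_with_index { |t, i| puts format('run %d: %.2f s', i + 1, t) }
#   browser.close
```

Passing the browser in, rather than creating it inside the helper, is what keeps the slow startup out of the measured time.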
Source: http://watirwebdriver.com/headless/