If anyone finds this in the future; I managed to solve it with Selenium, dont know if its a good/correct solution but it seems to be working.
System.setProperty("webdriver.chrome.driver", "C:\\Users\\PC\\Desktop\\Chromedriver\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.get("http://www.bellator.com/events");
String html = driver.getPageSource();
Document doc = Jsoup.parse(html);
Elements elements = doc.select("ul.s_layouts_lineListAlt > li > a");
for(Element element : elements) {
System.out.println(element.attr("href"));
}
Output:
http://www.bellator.com/events/d306b5/bellator-newcastle-pitbull-vs-scope
http://www.bellator.com/events/ylcu8d/bellator-215-mitrione-vs-kharitonov
http://www.bellator.com/events/yk2djw/bellator-216-mvp-vs-daley
http://www.bellator.com/events/e8rdqs/bellator-217-gallagher-vs-graham
http://www.bellator.com/events/281wxq/bellator-218-sanchez-vs-grimshaw
http://www.bellator.com/events/8lcbdi/bellator-219-koreshkov-vs-larkin
http://www.bellator.com/events/9rqguc/bellator-macdonald-vs-fitch
What you see in your web browser is not what Jsoup sees. Disable JavaScript and refresh page to get what Jsoup gets OR press CTRL+U ("Show source", not "Inspect"!) in your browser to see original HTML document before JavaScript modifications. When you use your browser's debugger it shows final document after modifications so it's not not suitable for your needs.
It seems like whole "UPCOMING EVENTS" section is dynamically loaded by JavaScript. Even more, this section is asynchronously loaded with AJAX. You can use your browsers debugger (Network tab) to see every possible request and response.
I found it but unfortunately all the data you need is returned as JSON so you're going to need another library to parse JSON.
That's not the end of the bad news and this case is more complicated. You could make direct request for the data: http://www.bellator.com/feeds/ent_m152_bellator/V1_1_0/d10a728c-547e-4a6f-b140-7eecb67cff6b but the URL seems random and few of these URLs (one per upcoming event?) are included inside JavaScript code in HTML.
My approach would be to get the URLs of these feeds with something like:
List<String> feedUrls = new ArrayList<>();
//select all the scripts
Elements scripts = document.select("script");
for(Element script: scripts){
if(script.text().contains("http://www.bellator.com/feeds/")){
// here use regexp to get all URLs from script.text() and add them to feedUrls
}
}
for(String feedUrl : feedUrls){
// iterate over feed URLs, download each of them
String json = Jsoup.connect(feedUrl).ignoreContentType(true).get().body().toString();
// here use JSON parsing library to get the data you need
}
ALTERNATIVE approach would be to stop using Jsoup because of its limitations and use Selenium Webdriver as it supports dynamic page modifications by JavaScript so you'd get the HTML of the final result - exactly what you see in web browser and Inspector.