How can I retrieve the "Contact Us" link from the footer of any web page on the World Wide Web, in Java?
E.g. find footer element, or an element with id
But I cannot be 100% sure that the fetched link...
You will NEVER be sure.
For a given random HTML page, you want to find the "Contact Us" link. This kind of work is trivial for a human. It represents a big challenge for a computer.
I can see some options in your case:
Option 1: Crowd sourcing
Check whether the crowdsourcing platform you pick offers an API.
+ Work is done by humans
+ Adapts dynamically to unknown patterns
- Costs money
- Humans are bad at repetitive tasks
Option 2: AI (pattern searching)
Have a look at Weka for instance or Java-ML.
+ Automated task
+ Can perform a repetitive task for a long time
- May take time to build a robust solution
- Risk of false positives or complete misses
Option 3: Use Jsoup
This option is a never-ending task: you will always have to feed Jsoup with new patterns. I suggest setting up a monitoring system that tells you when a website escapes every known pattern.
+ Automated task
+ Can perform a repetitive task for a long time
- Takes time to study, discover and add new patterns
- Risk of false positives or complete misses
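As a rough illustration of Option 3, here is a minimal sketch with Jsoup. The class name `ContactLinkFinder`, the selector list and the "contact" keyword are my own assumptions, not a proven set of patterns; real sites will need more of them. It tries each footer-like selector in turn and returns the first link whose text or `href` mentions "contact".

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

public class ContactLinkFinder {

    // Hypothetical patterns, ordered from most to least specific.
    // This list is the part you have to keep feeding over time.
    private static final List<String> FOOTER_SELECTORS = List.of(
            "footer a[href]",            // HTML5 <footer> element
            "[id*=footer] a[href]",      // e.g. <div id="page-footer">
            "[class*=footer] a[href]");  // e.g. <div class="site-footer">

    // Assumed keyword; extend it for other languages ("kontakt", "contacto", ...).
    private static final Pattern CONTACT =
            Pattern.compile("contact", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the absolute URL of the first footer link that looks like a
     * contact link, or an empty Optional if no known pattern matches.
     */
    public static Optional<String> find(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        for (String selector : FOOTER_SELECTORS) {
            for (Element link : doc.select(selector)) {
                boolean textMatches = CONTACT.matcher(link.text()).find();
                boolean hrefMatches = CONTACT.matcher(link.attr("href")).find();
                if (textMatches || hrefMatches) {
                    // absUrl resolves relative hrefs against baseUri.
                    return Optional.of(link.absUrl("href"));
                }
            }
        }
        return Optional.empty();
    }
}
```

An empty result is the signal the monitoring should pick up: log the URL so that someone can look at the page and add a new pattern. A non-empty result is still only a guess, so false positives remain possible.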
Option 4: A mix of the three above options
You can have the three options working on the websites you target.
+ Reduces the chance of false positives or complete misses
+ More confident final result
- Takes time to study, discover and add new patterns
- Costs money
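One possible way to wire the options together is a simple fallback chain, sketched below. The class `ContactLinkPipeline` and the stage names are hypothetical, and trying the cheap automated stages first with humans last is my assumption about ordering, not the only way to mix them. Each stage takes the raw HTML and returns a link only if it found one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class ContactLinkPipeline {

    // Stages in the order they are tried; LinkedHashMap keeps insertion order.
    private final Map<String, Function<String, Optional<String>>> stages =
            new LinkedHashMap<>();

    /** Registers a stage, e.g. a Jsoup pattern matcher or a trained classifier. */
    public ContactLinkPipeline addStage(String name,
                                        Function<String, Optional<String>> finder) {
        stages.put(name, finder);
        return this;
    }

    /**
     * Runs the stages in order and returns (stage name, link) for the first
     * stage that finds a link, or an empty Optional if every stage misses.
     */
    public Optional<Map.Entry<String, String>> run(String html) {
        for (Map.Entry<String, Function<String, Optional<String>>> stage
                : stages.entrySet()) {
            Optional<String> link = stage.getValue().apply(html);
            if (link.isPresent()) {
                return Optional.of(Map.entry(stage.getKey(), link.get()));
            }
        }
        return Optional.empty();
    }
}
```

Pages for which `run` returns an empty result are the ones worth sending to the crowdsourcing step, which keeps the cost of the human part down. Recording which stage answered also shows how often each option actually helps.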