Are there any noticeable benefits, in terms of performance or anything else, to following semantic HTML?
Thanks
Semantic code uses HTML elements for their given purpose. Well-structured HTML will have semantic meaning for a wide range of users and user agents (browsers without style sheets, text browsers, PDAs, search engines, etc.).
Benefits
The two points mentioned earlier are the basic benefits of using semantic code. If we use globally known tags, others understand our markup without any additional effort, and any software program that relies on those globally known tags will be able to understand our page.
A working example of this is that search engines weigh the importance of keywords according to the elements that contain them. For example, an article title enclosed in one of the headings (h1 and its hierarchy) would get higher importance, and hence visibility, than the same text in spans. Semantic HTML enables effective Search Engine Optimization (SEO).
The semantic data extractor of W3C is a good demonstration of the possibilities of using Semantic HTML and software automation.
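To make the idea of automated extraction concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not how the W3C extractor or any search engine actually works; `OutlineExtractor` and `extract_outline` are hypothetical names, and the sample markup is made up for illustration. It only shows that headings give a program something to latch onto, while spans do not.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineExtractor(HTMLParser):
    """Collects (level, text) pairs for every heading element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html):
    """Return the heading outline of an HTML string."""
    parser = OutlineExtractor()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical sample pages: same visible text, different markup.
semantic = "<h1>Semantic HTML</h1><p>Intro</p><h2>Benefits</h2><p>Text</p>"
spans = '<span class="big">Semantic HTML</span><span>Intro</span>'

print(extract_outline(semantic))  # [(1, 'Semantic HTML'), (2, 'Benefits')]
print(extract_outline(spans))     # []
```

The span-based page yields an empty outline: the title is still visible to a human reader, but nothing in the markup tells a program that it is a title.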
A side effect of excluding presentational information from the semantic markup is that data and its presentation can now be decoupled in implementation. This means that you can change the presentation without touching the data, or apply the same presentation to multiple types of data. This is exactly what technologies like CSS and XHTML together achieve. Of course, semantic HTML is not strictly necessary for this decoupling, but it encourages it, because being semantic means excluding presentational information.
http://www.seoblogger.co.uk/serps/the-benefits-of-using-semantic-code.html
While writing semantically correct mark-up is good for organisation and management of code, and makes separation of style and code easier, I think there's a stronger motivation behind its use.
Semantically correct mark-up increases the likelihood of a machine (search engine / bot / screen-scraper or other type of script) being able to parse your content to assess its purpose.
Microformats are a logical extension to semantic markup; use of microformat standards can allow a more accurate assessment to be made.
There are clear benefits. For instance, take a look at this piece of code: <article>Bla bla bla</article>
Now take a look at the "equivalent" in HTML4: <div class="article">Bla bla bla</div>
For a machine reading the raw text, there is no special difference between <div> and <article>: they are two different strings with no inherent meaning. But because they are different tags, a different meaning can be assigned to each of them, and that is in fact what HTML5 does.
From the point of view of a search engine bot, a <div> tag is just a container, with no further meaning. But if I use <article>, the bot can understand that what is inside is a piece of text with something interesting to say to people.
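As a rough illustration of the bot's point of view, the sketch below (Python, standard library only) pulls out whatever text sits inside <article> elements. `ArticleText` and `article_text` are hypothetical names, and a real crawler is of course far more involved; the point is only that a standard tag can be recognised without knowing anything about the site, while a class name is a private convention.

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collects text that appears inside <article> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <article> elements are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def article_text(html):
    """Return the text chunks found inside <article> elements."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(article_text("<div>Menu</div><article>Bla bla bla</article>"))
# ['Bla bla bla']
print(article_text('<div>Menu</div><div class="article">Bla bla bla</div>'))
# []
```

A tool could be taught to look for `class="article"` too, but it would have to be taught again for every site that picks a different class name.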
You can see it more clearly with another tag like <aside>, where you're saying that what is within it is only tangentially related to the rest of the document. For example, you can put ads within it (the ads can be related to the content, but that tends to be accidental :))
In performance terms, I don't know if there is a big difference, or any difference at all, but performance is not the goal that semantic markup is trying to reach.
semantic HTML and performance
Semantic HTML is not only about using the right tags for the right purposes, which obviously improves SEO, but also about the separation of markup (HTML), style (CSS) and scripts (JS). That separation not only improves maintainability, it also improves download performance, since external CSS/JS files are usually cached. If you clutter the HTML file with raw CSS/JS code and/or use style attributes instead of id or class, you only make the HTML page unnecessarily bigger, and it takes longer to download.
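A quick back-of-the-envelope check of the size argument, as a Python sketch. The markup, the style rule and the row count are all made up for illustration; real savings depend on the page, on compression, and on whether the stylesheet is actually served from cache.

```python
ROWS = 100
# Hypothetical rule, repeated on every element in the inline version.
INLINE_STYLE = "color: #333; font-weight: bold; margin: 0 0 1em"


def page_with_inline_styles(rows=ROWS):
    """Every paragraph carries its own copy of the style."""
    return "".join(f'<p style="{INLINE_STYLE}">Row {i}</p>' for i in range(rows))


def page_with_classes(rows=ROWS):
    """Every paragraph only names a class; the rule lives elsewhere."""
    return "".join(f'<p class="row">Row {i}</p>' for i in range(rows))


# The rule is written once, in an external stylesheet the browser can cache.
STYLESHEET = ".row { " + INLINE_STYLE + " }"

inline_size = len(page_with_inline_styles().encode("utf-8"))
classed_size = len(page_with_classes().encode("utf-8"))
stylesheet_size = len(STYLESHEET.encode("utf-8"))

print("inline styles:", inline_size, "bytes")
print("classes:", classed_size, "bytes, plus", stylesheet_size, "bytes of CSS")
```

Even counting the stylesheet on the first visit, the class-based page is smaller here, and on later visits only the HTML needs to be fetched.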
Additionally, using semantic HTML benefits users of assistive technologies such as screen readers, which may alter the pitch or gender of the reading voice to signify important information, presentational information, or emphasis. For example, if information you really want emphasised is marked up with <em> for emphasis, rather than simply bolded (you can still style an <em> tag to be bold in your CSS), a screen reader can alter the inflection of that particular word to emphasise it.
As well as proper separation of data and formatting making your code more efficient and more readable, using markup properly not only signifies visually that information is of a certain type, but again benefits assistive technology users. For example, if you have a list of information simply marked up as paragraphs, there's no way of signalling to someone who can't see the page that the items are related. If the information is marked up as, say, an unordered list <ul> or an ordered list <ol>, it's visually easier to read because it's clearly indented or has bullet points, and when a screen reader reaches it, it will announce that the forthcoming content is a list.
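The list case can be sketched as a toy "screen reader" in Python (standard library only). This is an assumption-laden illustration, not how real assistive technology works: real screen readers use the browser's accessibility tree, and `ListAnnouncer` and `read_aloud` are hypothetical names. Nested lists are not handled.

```python
from html.parser import HTMLParser


class ListAnnouncer(HTMLParser):
    """Toy reader: announces lists with their item count, reads other text as-is."""

    def __init__(self):
        super().__init__()
        self.spoken = []
        self._items = None     # items of the list currently open, if any
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._items = []
        elif tag == "li" and self._items is not None:
            self._items.append("")
            self._in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False
        elif tag in ("ul", "ol") and self._items is not None:
            self.spoken.append(f"list with {len(self._items)} items")
            self.spoken.extend(item.strip() for item in self._items)
            self.spoken.append("end of list")
            self._items = None
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self._items[-1] += data
        elif data.strip():
            self.spoken.append(data.strip())


def read_aloud(html):
    """Return the lines the toy reader would speak for an HTML string."""
    reader = ListAnnouncer()
    reader.feed(html)
    reader.close()
    return reader.spoken


print(read_aloud("<ul><li>Apples</li><li>Pears</li></ul>"))
# ['list with 2 items', 'Apples', 'Pears', 'end of list']
print(read_aloud("<p>Apples</p><p>Pears</p>"))
# ['Apples', 'Pears']
```

With paragraphs, the listener hears two unrelated sentences; with a list, they are told up front that two related items are coming.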
It's like making use of the codeblock styling here on Stack Overflow: if you use the 'code' formatting to highlight any code in your post, it makes it clearer for everyone to read, and shows that the highlighted text is in fact code. HTML's just the same.
A more predictable outcome across the different devices used to display HTML.
Your energy can be focused on the different quirks/bugs in each of the devices most commonly used for your app.