Wednesday, April 18, 2012

HTML5 Guidelines for Web Developers - Structure and Semantics for Documents


Both the previously mentioned MAMA survey conducted by Opera and Google’s study of Web Authoring Statistics of 2005 (http://code.google.com/webstats) conclude that it was common practice at that time to determine the page structure of web sites with the class or id attribute. Frequently used attribute values were footer, content, menu, title, header, top, main, and nav, and it therefore made sense to factor the current practice into the new HTML5 specification and to create new elements for structuring pages.
The result is a compact set of new structural elements - for example, header, hgroup, article, section, aside, footer, and nav - that facilitate a clear page structure without detours via class or id. To illustrate this, we will use a fictitious and not entirely serious HTML5 blog entry to risk a look ahead to the year 2022 (see Figure 1.1). But please concentrate less on the content of the post and focus instead on the document structure.
 Figure 1.1 The fictitious HTML5 blog

Before analyzing the source code of the HTML5 blog in detail, here are a few important links, for example, to the specification HTML: The Markup Language Reference—subsequently shortened and referred to as markup specification at
Here, Mike Smith, the editor and team contact of W3C HTML WG, lists each element’s definition, any existing limitations, valid attributes or DOM interfaces, plus formatting rules in CSS notation (if to be applied)—a valuable help that we will use repeatedly. The HTML5 specification also contains the new structural elements in the following chapter:
The .html and .css files to go with the HTML5 blog are of course also available online at:
At first glance, you can see four different sections in Figure 1.1: a header, the article, the footer, and a sidebar. All the new structural elements are used in these four sections. In combination with short CSS instructions in the stylesheet blog.css, they determine the page structure and layout.

1.1 Header with “header” and “hgroup”
In the header we encounter the first two new elements: header and hgroup. Figure 1.2 shows the markup and the presentation of the header:

<header>
<img>
<hgroup>
<h1>
<h2>
</hgroup>
</header>

Figure 1.2 The basic structure of the HTML5 blog header 

The term header as used in the markup specification refers to a container for headlines and additional introductory contents or navigational aids. Headers are not only the headers at the top of the page, but can also be used elsewhere in the document. Not allowed are nested headers or a header within an address or footer element.
In our case the headline of the HTML5 blog is defined by header in combination with the logo as an img element and two headings (h1 and h2) surrounded by an hgroup element containing the blog title and a subtitle.
Whereas it was common practice until now to write the h1 and h2 elements directly below one another to indicate title and subtitle, this is no longer allowed in HTML5. We now have to use hgroup for grouping such elements. The overall position of the hgroup element is determined by the topmost heading. Other elements can occur within hgroup, but as a general rule, we usually have a combination of tags from h1 to h6.
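A filled-in version of the skeleton in Figure 1.2 might look like the following sketch. The image file name, the alt text, and the heading texts are placeholders chosen for illustration; they are not taken from the actual blog files.

```html
<header>
  <!-- Logo; file name and alt text are hypothetical -->
  <img src="logo.png" alt="Blog logo">
  <!-- hgroup groups the title (h1) and the subtitle (h2) -->
  <hgroup>
    <h1>Placeholder blog title</h1>
    <h2>Placeholder subtitle</h2>
  </hgroup>
</header>
```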
We can glimpse a small but important detail from the markup specification: The guideline is to format header elements as display: block in CSS, like all other structural elements. This ensures that even browsers that do not know what to do with the new tags can be persuaded to display the element concerned correctly.
We only need a few lines of code to teach Internet Explorer 8 our new header element, for example:

<!--[if lt IE 9]>
<script>
document.createElement("header");
</script>
<style>
header { display: block; }
</style>
<![endif]-->
Of course there is also a detailed JavaScript library on this workaround, and it contains not only header, but also many other new HTML5 elements. Remy Sharp makes it available for Internet Explorer at http://code.google.com/p/html5shim.
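The same workaround can also be extended by hand to several elements. The following is a minimal sketch, assuming the list of element names shown; the list was chosen here to match the structural elements used in this blog and is not the list used by the library mentioned above.

```html
<!--[if lt IE 9]>
<script>
  // Element names chosen for illustration; extend the list as needed.
  var names = ["header", "hgroup", "article", "section",
               "aside", "footer", "nav"];
  for (var i = 0; i < names.length; i++) {
    // Creating each element once lets older Internet Explorer
    // versions apply style rules to the otherwise unknown tag.
    document.createElement(names[i]);
  }
</script>
<style>
  header, hgroup, article, section, aside, footer, nav { display: block; }
</style>
<![endif]-->
```

Because everything sits inside a conditional comment, other browsers treat the whole block as an ordinary comment and ignore it.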
In computing, the term shim describes a compatibility workaround for an application. Often, the term shiv is wrongly used instead. The word shiv was coined by John Resig, the creator of jQuery, in a post of that title (http://ejohn.org/blog/html5-shiv). It remains unknown whether he may in fact have meant shim.
As far as CSS is concerned, the header does not contain anything special. The logo is integrated with float:left, the vertical distance between the two headings h1 and h2 is shortened slightly, and the subtitle is italicized.

1.2 Content with “article”
The article element represents an independent area within a web page, for example, news, blog entries, or similar content. In our case the content of the blog entry consists of such an article element combined with an img element to liven things up, an h2 heading for the headline, a time and address element for the date it was created and the copyright, plus three paragraphs in which you can also see q and cite elements for quotations of the protagonists.
Because a content element is lacking (although it ranked right at the top in the web page analyses by Google and Opera, it did not make it into HTML5 for some reason), our blog entry is embedded in a surrounding div (see Figure 1.3).
So nothing stands in the way of adding further articles:

<div>
<article>
<img>
<h2>
<address>
<time>
</article>
</div>

Figure 1.3 The basic structure of the HTML5 blog content
By definition, the address element contains contact information, which incidentally does not, as is often wrongly assumed, refer only to the postal address, but simply means information about the contact, such as name, company, and position.
For postal addresses, the specification recommends using p. The address element applies to the closest article element; if there is none, it applies to the whole document. The time element behaves in a similar way in relation to its attributes pubdate and datetime, which form the timestamp for our document. You will find details on this in section 2.7.2, The “time” Element.
If article elements are nested within each other, the inner article should in principle have a theme similar to that of the outer article. One example of this kind of nesting would be, in our case, adding a subarticle to our blog with comments on the post concerned.
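As a rough sketch of such a nested structure, the following markup fills in the content skeleton and adds one comment as an inner article. All texts, names, the image file name, and the date are placeholders; the pubdate and datetime attributes are used as described in the text above.

```html
<div>
  <article>
    <!-- Illustration; file name and alt text are hypothetical -->
    <img src="picture.png" alt="Placeholder illustration">
    <h2>Placeholder headline</h2>
    <!-- Contact information for this article -->
    <address>Placeholder author, placeholder company</address>
    <!-- Timestamp: machine-readable value in datetime -->
    <time datetime="2022-04-18" pubdate>April 18, 2022</time>
    <p>Placeholder paragraph with a quotation: <q>Placeholder quote</q>.</p>

    <!-- Inner article: a reader comment on the post above -->
    <article>
      <h3>Placeholder comment title</h3>
      <address>Placeholder commenter</address>
      <p>Placeholder comment text.</p>
    </article>
  </article>
</div>
```

Here the address of the inner article applies only to the comment, while the outer address applies to the blog entry as a whole.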
Regarding styling via CSS, we should mention that article once again requires display: block, that the content width is reduced to 79% via the surrounding div, and that this div also neutralizes the logo’s float: left with clear: left. The italicized author information is the result of the default format of address and is not created via em. The picture is anchored on the left with float: left, the text is justified with text-align: justify, and quotations are integrated using the q element.
One interesting detail is that the quotation marks are not part of the markup but are automatically added by the browser via the CSS pseudo-elements :before and :after in accordance with the style rules for the q element. The syntax in CSS notation once more reflects the markup specification: 
/* Style rule for the q-element: */
q { display: inline; }
q:before { content: '"'; }
q:after { content: '"'; }
 
1.3 Footer with “footer” and “nav”
In the footer of our HTML5 blog, we find two other new structural elements: footer and nav (see Figure 1.4). The former creates the frame, and the latter provides navigation to other areas of the web page. footer contains additional information on the relevant section, such as who wrote it (as address, of course), whether there are other, related pages, what we need to look out for (copyright, disclaimer), and so on.
Unlike the human body, where the head is usually at the top and the foot at the bottom, a footer in a document does not always have to be at the end of the document, but can, for example, also be part of an article element. Not allowed, however, are nested footer elements or a footer within a header or address element.
If you want to create navigation blocks to allow page navigation via jump labels within a document or to external related pages, you can use nav. Just as with footer, nav can appear in other areas of the document as well, as you will see in section 1.4, Sidebar with “aside” and “section”. The only exception is that you cannot have nav within the address element:
<footer>
<p>
<nav>
<h3>
<div>
<a>
</div>
</nav>
</footer>
  Figure 1.4 The basic structure of the HTML5 blog footer
As for CSS, our HTML5 blog’s footer has a few special features. For example, the entire footer is colored in the same light gray as the page background, and only the links are formatted with background-color: white. The copyright in the first p requires float: left, the navigation requires text-align: right, and the h3 heading in the nav block is hidden with display: none. Just why there is an h3 element in there at all will become clear in section 1.5, The Outline Algorithm. To improve the style of the links, they are surrounded by div tags. And of course we have display: block for footer and nav, plus a reduction of the width in the footer element to 79%.
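Put together, the footer markup and the style rules just described could look roughly like the following self-contained sketch. The selectors, link targets, and texts are assumptions made for illustration and are not copied from the blog.css file of the blog; the page and footer background colors are left out.

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Footer sketch</title>
  <style>
    /* Unknown elements need an explicit block display */
    footer, nav { display: block; }
    footer { width: 79%; }
    /* Copyright on the left, navigation aligned to the right */
    footer p { float: left; }
    footer nav { text-align: right; }
    /* The heading stays in the markup but is not displayed */
    footer nav h3 { display: none; }
    footer nav a { background-color: white; }
  </style>
</head>
<body>
  <footer>
    <p>Placeholder copyright notice</p>
    <nav>
      <h3>Navigation</h3>
      <div>
        <!-- Link targets are hypothetical jump labels -->
        <a href="#top">Top</a>
        <a href="#content">Content</a>
      </div>
    </nav>
  </footer>
</body>
</html>
```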
1.4 Sidebar with “aside” and “section”
For areas of a page that are only loosely related to the main content and can therefore be seen as rather separate entities, we can use the aside element. In our example, it creates a classical sidebar on the right with three blocks for Questionnaire, Login, and Quick Links. While the link list is implemented as nav, as is to be expected, the first two blocks are embedded in another new element: section.
The section element contains sections of a document that are thematically connected, for example, chapters of an essay or individual tabs of a page constructed from tabs, typically with a heading. If section is used within footer, it is usually used for appendices, indices, license agreements, or the like. Generally, it makes sense to use section if the content would also belong in a table of contents. In our example, as shown in Figure 1.5, the Questionnaire and the Login are tagged with section, and the links are tagged as nav as mentioned earlier:
<aside>
<h2>
<section>
<h3><p><input>
</section>
<section>
<h3><label><input>
</section>
<nav>
<h3><ul><li><a>
</nav>
</aside>

 Figure 1.5 The basic structure of the HTML5 blog sidebar
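A filled-in version of the sidebar skeleton might look like the following sketch. The block headings Questionnaire, Login, and Quick Links come from the text; everything else (question text, field names, ids, and link targets) is a placeholder assumed for illustration.

```html
<aside>
  <h2>Placeholder sidebar heading</h2>

  <!-- First block: questionnaire -->
  <section>
    <h3>Questionnaire</h3>
    <p>Placeholder question?</p>
    <input type="text" name="answer">
  </section>

  <!-- Second block: login -->
  <section>
    <h3>Login</h3>
    <label for="user">User name</label>
    <input type="text" id="user" name="user">
  </section>

  <!-- Third block: the link list is navigation, so it uses nav -->
  <nav>
    <h3>Quick Links</h3>
    <ul>
      <li><a href="#first">Placeholder link one</a></li>
      <li><a href="#second">Placeholder link two</a></li>
    </ul>
  </nav>
</aside>
```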
