TIP: Always nest your HTML elements properly! Basically, this means you should close things in the reverse order you opened them.
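As a rough illustration of the tip (the sentence text is made up for this sketch), the inner element is closed before the outer one, so the tags close in the reverse of the order in which they were opened:

```html
<!-- Opened: B, then I. Closed: I, then B. -->
<B>This is bold <I>and this is bold italic</I> and this is bold again.</B>
```

The I element sits entirely inside the B element, with no overlap between the two.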
Birds do it, bees don't do it (they just break out in hives), and HTML authors need to do it if they want their pages to have valid syntax. It's called "nesting," and it describes the structural relationship of different elements delimited by opening and closing tags in an HTML document.

What Is Nesting?
The basic principle is that an HTML document consists of a number of elements which are containers, with contents consisting of characters and possibly other elements "nested" within them. There are various syntax rules about what sorts of elements are allowed to contain what sorts of other elements (basically, "block-level" elements like <P> and <BLOCKQUOTE> can contain character-level elements like <EM> and <FONT>, but not the other way around), but the most important rule is that elements must always be contained entirely within other elements, not overlapped. They're like those Russian dolls that contain other dolls, one within another; you can't have a doll that's half-contained in one other doll and half-contained in a third one. It's either all in or all out.

Web developers who don't understand this principle often think of tags in a manner that HTML purists deride as "Tag Soup". In this mindset, the tags are "formatting commands" executed sequentially by the browser, causing particular actions to be performed like "turn on boldface", "turn off italics", etc. Taking this view, one might produce this piece of invalid HTML code:
<B>This is bold <I>and this is also italic</B>, but what is this?</I>
In the "tag soup" view, boldface is "turned off" by the </B> but italicizing is still "turned on". But that's not really how HTML works. The B and I elements here have been set up in an overlapping fashion, which is a violation of HTML nesting rules. So it's anyone's guess how a browser might actually render the text. Any rendering will be based on the browser's error-correction rules rather than on the HTML specs, in which the behavior of such malformed code is undefined. So it might vary widely between different browsers even if all browsers follow the specs. Some versions of Netscape, for instance, treated the </B> tag as if it were </I>, and the </I> tag as if it were </B>, thus closing the elements in their properly nested order. Thus, the final part of the sentence ended up in non-italic bold, probably not what the web author intended. Other browsers may do other things in their attempts to error-correct. Don't rely on them doing what you want; use correct code to begin with!

Of course, one can get even worse than the above example. Somebody posted to an HTML authoring newsgroup that he was having problems with fonts not showing up in the intended colors in some browsers. It turned out his site was full of a long sequence of tags like
<FONT COLOR="#FF0000"> <H2>Header</H2> <FONT COLOR="#000000">
... setting the color to red, then black, and so on, without a single closing FONT tag in the bunch. This clearly derived from a mistaken mindset that the tags were commands to change the color, which could be stuck into the code anywhere the author wants something to come up in a different color than what went before it. Sometimes it may actually work, depending on browser error correction, but you can't count on it. In the case of the site mentioned on the newsgroup, Firefox managed to get the colors correct for a number of headings, then some stack overflowed or something, and the rest of the page was shown with black text regardless of the subsequent font tags.

The correct way to think of FONT tags is that the opening and closing tags of a FONT element delimit the range of text that is supposed to be rendered in that color. Also, since a FONT element is character-level, and headers like H2 are block-level, the latter can't nest within the former. You should have your opening and closing font tags within the H2 element... or, better yet, dispense with the FONT tags altogether and use style classes to suggest header colors. (Unfortunately, the newsgroup poster steadfastly refused to catch a clue on this; I think after some tweaking his site kinda-sorta worked, some of the time, but was still as poorly nested as ever.)

Optional Closing Tags
Certain closing tags are allowed to be omitted. For instance, a paragraph is begun with <P> and ended with </P>, but the closing tag can be left out. The reason is that paragraphs are not allowed to be nested within other paragraphs, so the opening <P> for the next paragraph can be assumed by the browser to also close the preceding paragraph. Any other block-level opening or closing tag can also be assumed to close any paragraph that is in progress when such a tag is reached, since it would violate the nesting rules for the paragraph to continue across such a tag.

Incidentally, in the very earliest implementation of HTML, sometimes referred to as "HTML 1.0" although there was never a formal spec for that version (this didn't come until HTML 2.0), the paragraph tag was an empty "paragraph break" much like <BR> is a line break. There is no such tag as </BR>, since the line break is an "empty tag" that is not a container. The paragraph tag used to be that way, but in HTML 2.0 it was changed to be a container, as it remains now. However, some of the earliest tutorials on HTML development were based on HTML 1.0 and treated <P> as an empty "paragraph break", and enough people learned it that way and taught others the same (long after this was obsolete) that the misuse of <P> as a "paragraph break" instead of a "paragraph container" is still rampant. I even used it that way myself early on, but to force myself to use the tag correctly I eventually began a practice of always using the closing </P> tag, even though this is optional, in order to make sure I maintain the proper mindset of regarding the element as a container with a beginning and an end. You can tell that a site developer has the obsolete attitude about paragraph tags if you find that the <P> tags are always at the end of paragraphs, and there isn't one before the first paragraph.

Other optional closing tags are the
</TR> and </TD> tags at the end of table rows and cells. But you shouldn't omit them anyway, since some browsers have been known to mess up in rendering tables with them absent. And remember that the closing </TABLE> tag for the table is not optional, and some browsers (most notably Netscape) will not show the user any of the table content at all if the closing tag is omitted.

I got into a bit of an online argument once with somebody who advocated that the HTML standards ought to be changed to allow more lenient syntax in the areas of nesting and required closing tags. The problem is that the changes she wanted were basically mutually exclusive; the more lenient you get in allowing bad nesting, the more difficult it is to unambiguously determine the point at which one can infer the end of an element in the absence of an explicit closing tag. And judging from the actual syntax of some of her own web pages, she apparently believed that the closing </TABLE> tag ought to be inferred from the presence of the opening <TABLE> tag of the next table in the page. But that's impossible, since tables are allowed to nest within one another; any new syntax rules that inferred table closers this way would "break" thousands of existing pages that rely on nested tables. Whether the use of nested tables for page layout is a good idea is another subject for (heated) debate, but the fact is that there are many pages out there that do it, and it's not a good idea to introduce a new HTML standard that makes them all stop working; this violates the concept of maximizing compatibility between successive standards versions so that, to the greatest extent possible, old browsers can still view new pages, and new browsers can still view old pages, with the basic content intact even if the latest bells and whistles won't work in such cases.
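To make the ambiguity concrete, here is a small sketch (the cell contents are placeholders invented for illustration) of one table nested inside a cell of another. With every closing tag written out, it is clear where the inner table ends and where the outer row resumes. If a new <TABLE> tag were taken to imply the end of the previous table, markup like this could not be expressed at all:

```html
<TABLE>
  <TR>
    <TD>Outer cell
      <!-- The inner table sits entirely inside this outer cell -->
      <TABLE>
        <TR>
          <TD>Inner cell</TD>
        </TR>
      </TABLE>
    </TD>
    <TD>Second outer cell</TD>
  </TR>
</TABLE>
```

The explicit </TR> and </TD> tags are optional under the rules described above, but writing them makes the containment easy to check by eye.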