TMT: Document Type Declaration

This is an installment of Ten Minute Tech, where I pick a technology-related subject and write a paragraph or two about what I know. I then pick and read a reference article related to the subject and write another paragraph or two about what I learned. This edition's topic is the Document Type Declaration.

Before

In my opinion, Document Type Declarations (DOCTYPEs) are one of the most overlooked items in web development. What I mean is that many developers don't use them at all, or they use them but don't understand why they should. It really is remarkable how much a proper DOCTYPE in your XHTML files can affect the rendering of your pages, especially when it comes to cross-browser coding.

I am not going to go into much more detail about how a proper DOCTYPE can affect web page rendering (volumes could be written on it). I guess I am just really tired of people still complaining about the Box Model problem in Internet Explorer 6; if they used a proper DOCTYPE, IE6 would render in standards mode and they wouldn't have to worry about it.

Now, a DOCTYPE is an instruction (a line of code) that links to a file describing the markup syntax for the rest of the page. Its most practical application is at the top of an XHTML or HTML page, where it tells the browser which tags and attributes the page can use. The file the DOCTYPE links to is written in a language of its own, known as a Document Type Definition (DTD).

For a non-technical example, take the English language. If you were writing a letter to your friend in English and you wrote 'This letter is written in English' at the very top, that line would be equivalent to the DOCTYPE. You are simply stating right away what someone is going to need to understand in order to read the rest of the letter.

The Document Type Definition (DTD) would be the dictionaries, thesauruses and grammar books that someone could use to decipher the message (if they didn't already know the language).

For the computer, a proper DOCTYPE not only declares the language, but also tells the computer where to find the DTD. This would be like you not only writing that the letter is in English, but also listing a bunch of ISBNs so the person can look up the English books.
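
To make the two halves concrete (the language name and the place to find the DTD), here is a minimal Python sketch that pulls a DOCTYPE apart. The declaration is the standard W3C one for XHTML 1.0 Strict; the function name parse_doctype and the regular expression are my own assumptions for illustration, and the pattern only handles the common PUBLIC form with double-quoted identifiers.

```python
import re

# Pattern for the common "PUBLIC" form of a DOCTYPE:
#   <!DOCTYPE root PUBLIC "public identifier" "system identifier">
# This is a simplification; it ignores SYSTEM-only and single-quoted forms.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+PUBLIC\s+'
    r'"(?P<public_id>[^"]*)"\s+"(?P<system_id>[^"]*)"\s*>',
    re.IGNORECASE,
)


def parse_doctype(declaration):
    """Split a PUBLIC DOCTYPE into its parts, or return None if it does not match."""
    match = DOCTYPE_RE.search(declaration)
    if match is None:
        return None
    return match.groupdict()


# The W3C DOCTYPE for XHTML 1.0 Strict.
XHTML_STRICT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)

parts = parse_doctype(XHTML_STRICT)
print(parts["root"])       # the root element the page must start with
print(parts["public_id"])  # the "this letter is written in English" part
print(parts["system_id"])  # the "ISBN" part: where the DTD lives
```

In the letter analogy, the public identifier is the statement of the language and the system identifier is the list of books to look it up in.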

DOCTYPEs and their definitions are typically used to define markup languages (HTML, XHTML, RSS, SOAP, etc.), but I believe they can be extended to almost any type of syntax (if only in theory).
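
As a rough sketch of what defining your own markup language can look like, the Python snippet below parses a tiny XML document whose DOCTYPE carries its DTD inline (between the square brackets) instead of linking to a file. The "letter" vocabulary is made up for this example. One caveat: Python's built-in minidom parser reads the declaration but does not validate the document against the DTD, so this only shows how the pieces fit together.

```python
from xml.dom import minidom

# A made-up "letter" vocabulary. The part between [ and ] is an internal DTD:
# it says a letter holds one greeting followed by one or more paragraphs.
LETTER_XML = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (greeting, paragraph+)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT paragraph (#PCDATA)>
]>
<letter>
  <greeting>Dear friend,</greeting>
  <paragraph>This letter is written in English.</paragraph>
</letter>
"""

# minidom parses the document but does NOT check it against the DTD.
document = minidom.parseString(LETTER_XML)

# The DOCTYPE names the root element that the document is expected to use.
print(document.doctype.name)
print(document.documentElement.tagName)
```

Actually enforcing the rules in the DTD takes a validating parser, which the Python standard library does not include.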

As for the syntax of the DOCTYPE itself, I can never remember it; I always have to look it up.
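
Since the syntax is so hard to remember, one option is to keep the common declarations in a small lookup table. The sketch below does that in Python. The three declarations are the standard W3C ones; the short key names and the with_doctype helper are hypothetical, chosen only for this example.

```python
# A small lookup table of W3C DOCTYPE declarations.
# The keys are hypothetical short names chosen for this sketch.
DOCTYPES = {
    "html4-strict": (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">'
    ),
    "xhtml1-strict": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
    "xhtml1-transitional": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    ),
}


def with_doctype(markup, kind="xhtml1-strict"):
    """Return the markup with the chosen DOCTYPE on its own first line."""
    return DOCTYPES[kind] + "\n" + markup


page = with_doctype("<html><head><title>Hi</title></head><body></body></html>")
print(page.splitlines()[0])  # the DOCTYPE is the very first line of the page
```

The helper puts the declaration on the first line because that is where browsers expect to find it.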

Article: Document Type Declaration

After

The article itself is actually very narrowly written. It focuses on the various HTML and XHTML DOCTYPEs (maybe it should be expanded). It does, however, mention that DOCTYPEs are used in SGML- and XML-based documents.

I think the real potential in DOCTYPEs and DTDs is the idea that they are a practical way to create a computer-readable language that describes other languages (i.e., human languages). I think that this could really be the foundation of something much greater. Just think of a time when we could truly teach a computer how to read and understand a human language. The applications in translation and human-to-computer understanding could be great, and I think we will slowly grow in this direction. I would be interested to know if there are projects and studies out there today that are working toward this very thing.

Some good additional reading is the article on Document Type Definitions. It goes into the nuts and bolts of how these technologies are used today to describe custom markup languages. Unfortunately, this article does not talk about the theory behind it.
http://en.wikipedia.org/wiki/Document_Type_Definition

Here is an article on markup languages in general. It briefly touches on semantic markup: the idea that the code reflects the [human] meaning of the information inside of it.
http://en.wikipedia.org/wiki/Markup_language
