Introduction to XML 


Definition: XML
The Extensible Markup Language (XML) is a language for expressing character-oriented documents with a hierarchical structure.

The document must conform to the syntax of XML. If it does not conform, it is said to have illegal syntax; in XML terminology, it is not well-formed.
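
As a rough illustration of both points, the hierarchy and the syntax requirement, the sketch below uses Python's standard xml.etree.ElementTree parser. The sample documents and the helper names is_well_formed and outline are made up for this example. The parser rejects the second document because its closing book tag is missing.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def outline(element, depth=0):
    """Return one indented line per element, which exposes the hierarchy."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical sample document, not taken from the original text.
GOOD = "<library><book><title>XML basics</title></book></library>"
# The same document with the closing </book> tag left out.
BAD = "<library><book><title>XML basics</title></library>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
print("\n".join(outline(ET.fromstring(GOOD))))
```

The outline of the first document shows library at the top, book one level below it, and title one level below that.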

XML as a markup language

The basic idea behind a markup language is the ability to tag, or mark, certain portions of a document and give them meaning.

Non marked up documents

If a document has no markup, then it is just a collection of characters. No more, no less.  

Good examples of non-marked-up documents are plain ASCII or Unicode text files.  
To a program, the individual words and sentences have no meaning. They are just a pile of characters.

e.g. This is a very large file

A program displaying this sentence can't give any informative clues about the words used. It can only display the text.

Marked up documents

Marking a piece of a document has two purposes:

  1. it identifies that piece (it marks its beginning and end)
  2. it gives meaning to that piece (the tag has a meaning)

Good examples of marked-up documents are HTML files.  

e.g. This is a <STRONG>very</STRONG> large file

Now a program displaying this sentence can give an informative clue or extra information. It knows that the author has put strong emphasis on the adjective "very". The program could, for example, color the adjective red, pop up an exclamation mark when it is first displayed, or make it blink.
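
The following minimal sketch shows how a program might act on that tag. It makes two assumptions that are not in the text above: the sentence is wrapped in a hypothetical p element so that it has a single root and can be read by Python's xml.etree.ElementTree parser, and upper-casing stands in for coloring or blinking, since the result is plain text. The function name render is also invented for this example.

```python
import xml.etree.ElementTree as ET

# The example sentence, wrapped in a hypothetical <p> root element
# (an assumption) so that the parser sees a single root.
MARKED_UP = "<p>This is a <STRONG>very</STRONG> large file</p>"


def render(xml_text: str) -> str:
    """Render the sentence as plain text, upper-casing every STRONG piece."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]          # text before the first tag
    for child in root:
        word = child.text or ""        # the marked piece itself
        # The tag name carries the meaning: STRONG triggers the emphasis.
        parts.append(word.upper() if child.tag == "STRONG" else word)
        parts.append(child.tail or "")  # text that follows the closing tag
    return "".join(parts)


print(render(MARKED_UP))  # This is a VERY large file
```

The start and end tags tell the program where the piece begins and ends, and the tag name tells it what to do with the piece. Without the markup, the program would have no way to single out the word.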

Character oriented

The character-oriented nature of XML is one of the characteristics that make it cross-platform and powerful.

Because an XML document is, technically speaking, an acyclic graph (a tree structure), it would be possible to store it in a binary form. This would, however, make those binary documents platform specific, because byte order (big- or little-endian?) and word length (is an integer 1, 2, 4 or 8 bytes?) are platform specific.
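
The byte order and word length problem can be made concrete with a small sketch using Python's standard struct module. The value 258 is an arbitrary choice for this illustration (its two bytes differ, which makes the byte order visible), and the variable names are invented here.

```python
import struct

value = 258  # 0x0102, chosen so that its two bytes differ

# The same number written as a 16-bit binary integer in both byte orders.
big = struct.pack(">H", value)     # big-endian
little = struct.pack("<H", value)  # little-endian
print(big.hex())     # 0102
print(little.hex())  # 0201

# A reader that assumes the wrong byte order gets a different number.
misread = struct.unpack("<H", big)[0]
print(misread)       # 513

# The word length changes the bytes too: the same value as 32 bits.
wide = struct.pack(">I", value)
print(wide.hex())    # 00000102

# Written as characters, the value is just the digits "2", "5", "8",
# regardless of the byte order or word length of the machine.
as_text = str(value).encode("ascii")
print(as_text)       # b'258'
print(int(as_text.decode("ascii")))  # 258
```

A binary format would have to fix the byte order and word length itself, whereas a sequence of characters sidesteps the question.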


diputree documentation © 2000 dipu