CDATA: When, Where, (Why) and How to Use It in XHTML

References to, and specifications of, CDATA can be seen throughout the W3C Recommendations, especially in XML, and in the Standard Generalized Markup Language (SGML), the markup language from which HTML itself is derived. SGML is a descriptive markup for the structure of a computer document, and HTML, which is itself a structural markup, was originally defined as an application of SGML.

The following definition is one of the best I’ve found for CDATA. It really seems to break it down into a simple explanation.

All text in an XML document will be parsed by the parser.
Only text inside a CDATA section will be ignored by the parser.

Parsed Data

XML parsers normally parse all the text in an XML document.

When an XML element is parsed, the text between the XML tags is also parsed:

<message>This text is also parsed</message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (<first> and <last>):

<name><first>Bill</first><last>Gates</last></name>

The parser will break it up into sub-elements like this:

<name>
<first>Bill</first>
<last>Gates</last>
</name>
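
An aside, not part of the excerpt: the breakdown into sub-elements can be observed with any XML parser. The sketch below assumes Python and its standard-library xml.etree.ElementTree module, neither of which the excerpt mentions; it only illustrates the idea.

```python
import xml.etree.ElementTree as ET

# Parse the <name> element from the text above.
root = ET.fromstring("<name><first>Bill</first><last>Gates</last></name>")

# Each sub-element comes back as its own node with a tag and its text.
children = [(child.tag, child.text) for child in root]
print(children)  # [('first', 'Bill'), ('last', 'Gates')]
```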

Escape Characters

Illegal XML characters have to be replaced by entity references.

If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element. You cannot write something like this:

<message>if salary < 1000 then</message>

To avoid this, you have to replace the "<" character with an entity reference, like this:

<message>if salary &lt; 1000 then</message>
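
An aside, not part of the excerpt: the difference between the two <message> lines can be checked with a parser. This is a minimal sketch assuming Python's standard-library xml.etree.ElementTree; the helper name parses is made up for the illustration.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The raw "<" is rejected; the entity reference is accepted.
raw_ok = parses("<message>if salary < 1000 then</message>")
escaped_ok = parses("<message>if salary &lt; 1000 then</message>")

# After parsing, the entity reference is turned back into the real character.
text = ET.fromstring("<message>if salary &lt; 1000 then</message>").text
print(raw_ok, escaped_ok, text)
```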

There are 5 predefined entity references in XML:

&lt; < less than
&gt; > greater than
&amp; & ampersand
&apos; ' apostrophe
&quot; " quotation mark

Note: Only the characters "<" and "&" are strictly illegal in XML. Apostrophes, quotation marks and greater than signs are legal, but it is a good habit to replace them.
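
An aside, not part of the excerpt: replacing these characters by hand is error-prone, so it is usually left to a library. The sketch below assumes Python's standard-library xml.sax.saxutils module. Its escape() function handles "&", "<" and ">" on its own; the EXTRA mapping (a name chosen for this illustration) adds the apostrophe and the quotation mark.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > by default; these two are passed in explicitly.
EXTRA = {"'": "&apos;", '"': "&quot;"}

raw = "a < b & \"c\" > 'd'"
escaped = escape(raw, EXTRA)
print(escaped)  # a &lt; b &amp; &quot;c&quot; &gt; &apos;d&apos;

# unescape() takes the reverse mapping and restores the original text.
restored = unescape(escaped, {v: k for k, v in EXTRA.items()})
```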

CDATA


Everything inside a CDATA section is ignored by the parser as markup: the text is passed through unchanged as character data.

If your text contains a lot of "<" or "&" characters – as program code often does – the XML element can be defined as a CDATA section.

A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script>
<![CDATA[
// Returns 1 when a is less than b and a is negative, otherwise 0.
function matchwo(a, b)
{
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>

In the example above, everything inside the CDATA section is ignored by the parser.
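
An aside, not part of the excerpt: "ignored" here means the text is not scanned for markup, not that it is thrown away. A minimal sketch, again assuming Python's xml.etree.ElementTree and a shortened one-line script body, shows that the CDATA content arrives as ordinary text, identical to what the entity-escaped form would produce.

```python
import xml.etree.ElementTree as ET

# The same script text written two ways: in a CDATA section, and escaped.
with_cdata = "<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>"
with_entities = "<script>if (a &lt; b &amp;&amp; a &lt; 0) { return 1; }</script>"

cdata_text = ET.fromstring(with_cdata).text
entity_text = ET.fromstring(with_entities).text

print(cdata_text)  # if (a < b && a < 0) { return 1; }
```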

Notes on CDATA sections:

A CDATA section cannot contain the string "]]>". Nested CDATA sections are therefore not allowed.

Also make sure there are no spaces or line breaks inside the "]]>" string.
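
An aside, not part of the excerpt: when the text itself contains "]]>", a common workaround is to end the section after "]]" and start a new one before ">", so the forbidden string never appears inside a single section. The sketch below assumes Python's xml.etree.ElementTree; to_cdata and the sample payload are hypothetical names made up for the illustration.

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting every "]]>" across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# This payload contains "]]>" in the middle of an expression.
payload = "if (a[b[0]]>c) x = 1;"
doc = "<script>" + to_cdata(payload) + "</script>"

# The parser joins the two sections back into the original text.
restored = ET.fromstring(doc).text
print(restored == payload)  # True
```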

The definition above is an excerpt from:
XML CDATA, W3Schools
http://www.w3schools.com/xml/xml_cdata.asp

I also recommend the definition provided in Wikipedia. We must remember that Wikipedia is not reviewed by professional editors either, but it does have a thorough definition; see the citation below for the URL to the resource.

CDATA. (2006, March 11). In Wikipedia, The Free Encyclopedia. Retrieved 10:33, July 29, 2006, from http://en.wikipedia.org/w/index.php?title=CDATA&oldid=43227290.

What you’re reading is a heavily edited version of my original entry on this topic. I want to mention a bit about my feelings on social responsibility, and how this entry made me realize something about what I’m doing with this Web Log.

In keeping with my purpose for WordPressCenter.net, when I came upon a bit of information that I felt was important enough to reference again later, I decided to take a note on it. Since I already had an entry on CDATA, I came back to see what I had written before so that I might update it if necessary. It’s times like this when I really appreciate the usefulness of this Web Log, because I can see precisely where I might have been confused about the particulars of the terms or concepts. I feel I’ve best assimilated knowledge through this process, and although I am very happy to have your patronage here at WordPressCenter.net, I ask that you too seek additional resources for your own studies.

I keep this blog as my own learning tool because I feel the best way to learn is to take in some knowledge, move on to something else, and come back to the older topic again. In some cases the first understanding turns out not to have been accurate. It is through that very recognition, though, that the concept is better assimilated into long-term memory the second time it is studied: a better understanding is established of what the term is NOT, and sometimes that is what is really needed in order to understand what the term really IS.

In any event, I’m happy that you find this journal of notes to be a useful tool in your own learning. I welcome your feedback and your input. Thanks!

