I was having a quick look at some of the old and musty files currently residing on the server and happened across begone .innerHTML – a very hungover Sunday morning attempt at writing a JavaScript parser that converts nasty .innerHTML statements to their swanky DOM equivalents.
An example speaks a thousand words, so, pasting the following HTML into the parser:
<div id="response" style="margin-left:100px; background-color:#ffcccc;" onMouseOver="crashIE();">
<p><u>Hopefully</u>,
<br> this should ease your
<em style="color:#efefef;">innerHTML</em> migration
</p>
</div>
will produce the following JavaScript code that can then be used to replace the original .innerHTML statement:
var tag0 = document.getElementById('replace-me');
while(tag0.firstChild) tag0.removeChild(tag0.firstChild);
var tag1 = document.createElement('div');
tag1.setAttribute('id','response');
tag1.style.marginLeft="100px";
tag1.style.backgroundColor="#ffcccc";
// Warning: event handler may require manual conversion.
tag1.onmouseover = function() { crashIE(); };
var tag2 = document.createElement('p');
// Warning: depreciated tag <u> converted to <span>
// Warning: style added to replicate <u> tag functionality.
var tag3 = document.createElement('span');
tag3.style.textDecoration="underline";
tag3.appendChild(document.createTextNode('Hopefully')); // span + textNode
tag2.appendChild(tag3); // p + span
tag2.appendChild(document.createTextNode(',')); // p + textNode
var tag3 = document.createElement('br');
tag2.appendChild(tag3); // p + br
tag2.appendChild(document.createTextNode(' this should ease your ')); // p + textNode
var tag3 = document.createElement('em');
tag3.style.color="#efefef";
tag3.appendChild(document.createTextNode('innerHTML')); // em + textNode
tag2.appendChild(tag3); // p + em
tag2.appendChild(document.createTextNode(' migration')); // p + textNode
tag1.appendChild(tag2); // div + p
tag0.appendChild(tag1); // your element + div
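The style lines in that output hint at one of the simpler conversions: each declaration in the style attribute becomes an assignment to a camelCased style property. What follows is only a rough sketch of that step, not the parser's actual code; the helper names cssToDomProperty and styleToStatements are made up for illustration.

```javascript
// Hypothetical helper: turn a CSS property name into its DOM style property name.
function cssToDomProperty(name) {
  name = name.trim().toLowerCase();
  // "float" is a reserved word in JavaScript, so the DOM exposes it as cssFloat.
  if (name === 'float') return 'cssFloat';
  return name.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

// Hypothetical helper: emit one assignment per declaration in a style attribute.
// Assumption: values contain no ";" of their own (a data: URL would break this).
function styleToStatements(varName, styleText) {
  return styleText.split(';')
    .map(function (decl) { return decl.trim(); })
    .filter(function (decl) { return decl.indexOf(':') > 0; })
    .map(function (decl) {
      var colon = decl.indexOf(':');
      var prop = cssToDomProperty(decl.slice(0, colon));
      var value = decl.slice(colon + 1).trim();
      return varName + '.style.' + prop + '=' + JSON.stringify(value) + ';';
    });
}

console.log(styleToStatements('tag1', 'margin-left:100px; background-color:#ffcccc;').join('\n'));
```

Run against the style attribute from the example, this should print the same two tag1.style lines shown in the output above.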
Things to note…
Looking under the hood will reveal a rather nasty, hacked-together parser indeed. As mentioned before, I was undoubtedly hungover the Sunday morning in question, and it shows.
My original idea was to write a parser that accepted any flavour of HTML (including invalid or unclosed tags) and attempted to convert it to a pretty XHTML 1.0 equivalent. I scrapped this idea as soon as I realised how hard it actually is to parse invalid HTML using regular expressions alone.
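To give a feel for the problem, here is a deliberately naive sketch, assuming a toy tokenizer that has nothing to do with the one the parser really uses. The names tokenize, findNestingError and VOID_TAGS are invented for this example. A regular expression can split markup into tags and text easily enough, but as soon as a tag is left unclosed or closed out of order, something other than the regex has to decide how to repair it.

```javascript
// Hypothetical tokenizer: splits markup into open tags, close tags and text.
function tokenize(html) {
  var pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)([^>]*)>|([^<]+)/g;
  var tokens = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) {
      tokens.push({ type: 'close', name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: 'open', name: match[2].toLowerCase(), attrs: match[3].trim() });
    } else {
      tokens.push({ type: 'text', value: match[4] });
    }
  }
  return tokens;
}

// Elements that never take a closing tag (a partial list, for illustration only).
var VOID_TAGS = ['br', 'hr', 'img', 'input', 'meta', 'link'];

// Returns a description of the first nesting problem found, or null if none.
function findNestingError(html) {
  var stack = [];
  var tokens = tokenize(html);
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i];
    if (token.type === 'open' && VOID_TAGS.indexOf(token.name) === -1) {
      stack.push(token.name);
    } else if (token.type === 'close') {
      if (stack.length === 0) return 'unexpected </' + token.name + '>';
      var expected = stack.pop();
      if (expected !== token.name) {
        return 'expected </' + expected + '> but found </' + token.name + '>';
      }
    }
  }
  return stack.length ? 'unclosed <' + stack[stack.length - 1] + '>' : null;
}

console.log(findNestingError('<p>one<p>two</p>'));
console.log(findNestingError('<b><i>x</b></i>'));
```

Detecting the breakage is the easy part. Working out what the author meant by the broken markup is where the regex-only approach runs out of road.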
The parser will attempt to automatically convert deprecated HTML tags though; for example, the tag “u” will be converted to a “span” and a style added to the span to replicate the original tag’s functionality.
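One plausible way to drive that conversion is a lookup table from each deprecated tag to its replacement element and styles. This is a sketch under assumptions rather than the parser's real production code: only the u to span mapping is confirmed by the example output above, and the other entries, the DEPRECATED_TAGS table and the createElementStatements function are hypothetical.

```javascript
// Hypothetical mapping. Only <u> -> <span> + underline appears in the example
// output; the strike and center entries are illustrative guesses.
var DEPRECATED_TAGS = {
  u: { tag: 'span', style: { textDecoration: 'underline' } },
  strike: { tag: 'span', style: { textDecoration: 'line-through' } },
  center: { tag: 'div', style: { textAlign: 'center' } }
};

// Hypothetical helper: emit the statements that create an element,
// swapping in a replacement tag plus styles when the tag is deprecated.
function createElementStatements(varName, tagName) {
  var name = tagName.toLowerCase();
  var known = Object.prototype.hasOwnProperty.call(DEPRECATED_TAGS, name);
  if (!known) {
    return ['var ' + varName + " = document.createElement('" + name + "');"];
  }
  var replacement = DEPRECATED_TAGS[name];
  var lines = [
    '// Warning: deprecated tag <' + name + '> converted to <' + replacement.tag + '>',
    'var ' + varName + " = document.createElement('" + replacement.tag + "');"
  ];
  Object.keys(replacement.style).forEach(function (prop) {
    lines.push(varName + '.style.' + prop + '="' + replacement.style[prop] + '";');
  });
  return lines;
}

console.log(createElementStatements('tag3', 'u').join('\n'));
console.log(createElementStatements('tag2', 'p').join('\n'));
```

Keeping the mapping in a table means supporting another deprecated tag is a one-line addition rather than another special case in the production.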
The only things that really need manual intervention every time are inline event handlers (which would require another parser altogether to get right).
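The handler line in the example output suggests the conversion simply lowercases the attribute name and wraps the attribute value in a function. Here is a minimal sketch of that idea, with the caveat that eventHandlerStatements is a hypothetical name and the real production may well differ.

```javascript
// Hypothetical helper: turn an inline handler attribute such as
// onMouseOver="crashIE();" into a property assignment, plus a warning comment.
// Returns null for attributes that do not look like event handlers.
function eventHandlerStatements(varName, attrName, attrValue) {
  var property = attrName.toLowerCase();
  if (!/^on[a-z]+$/.test(property)) return null;
  return [
    '// Warning: event handler may require manual conversion.',
    varName + '.' + property + ' = function() { ' + attrValue.trim() + ' };'
  ];
}

console.log(eventHandlerStatements('tag1', 'onMouseOver', 'crashIE();').join('\n'));
```

The wrapping is purely textual, which is why the warning is there. For instance, an inline handler can refer to an implicit event argument, and the generated function declares no such parameter, so a handler body that uses event would need fixing by hand.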
Demo
View a demo of the begone .innerHTML parser.
Update: Let me just clarify a point! The parser itself was written by Robert D. Cameron and is a fantastic piece of Regular Expression wizardry – it’s the parser productions, written by yours truly, that suck…