Replacing HTML character codes / escaped strings

Posted by chelfers | Posted in HTML, Javascript, Web | Posted on 17-07-2009

I've been coding web pages and applications for quite a few years, in quite a few different languages. When I learn something this basic and fundamental, I often wonder how much of a noob I really am, and then how many other developers are in the same shoes.

The problem I had was converting HTML character codes such as &#39; (apostrophe) to the actual characters they represent. I've done this many times before, but have always been too lazy to find a different method. After searching through various blogs for a solution, I finally came across a beauty.

.replace(/\&\#(\d+);?/g, function (m,n) { return String.fromCharCode(n); } )

A basic explanation of the code goes a little something like this; now this is a story all about how my life got flipped-turned upside down, and I liked to take a minute just sit right there, I'll tell ya how I became the prince of a town called Bel Air *record scratch*, err no wait.

Back on track: this function call does a couple of different things. First, the g flag makes it search the entire string for every instance of our pattern instead of stopping at the first one. Second, each hit gives us two matches (yes, two!): the whole matched text, and the sub-expression inside the parentheses. The pattern starts with the literal "&#"; this is our entry point into the search and should nab us every instance of those dang character codes. Next comes the (\d+) sub-expression, which captures one or more digits; this is really what we are after and will give us the correct end result. The trailing ;? picks up the closing semicolon if there is one.
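To see what the pattern actually grabs, here is a small sketch. The sample string and variable names are made up for illustration, and the backslashes in front of & and # are dropped because neither character is special in a JavaScript regular expression, so the pattern should behave the same.

```javascript
// The pattern from the post, minus the backslashes that were not needed.
const entityPattern = /&#(\d+);?/g;

// Hypothetical sample input: one code with a trailing semicolon, one without.
const sample = "It&#39;s issue &#35 42";

// With the g flag, match() returns every full match in the string.
const allMatches = sample.match(entityPattern);
console.log(allMatches); // [ '&#39;', '&#35' ]

// Without the g flag, replace() stops after the first match.
const firstOnly = sample.replace(/&#(\d+);?/, "?");
console.log(firstOnly); // It?s issue &#35 42
```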

Once our expression has found a good match we will pass it to our callback function.

function (m,n) { return String.fromCharCode(n); }

The first thing you should notice are the m and n parameters being passed in. These are tied directly to the matches and follow the order they were matched in: m is the entire matched text (for example "&#39;"), and n is whatever the (\d+) sub-expression captured (our numerical match, "39"). These could really be any variable names you wish; I stuck with the original example because that's how I rock out.
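If you want to check what lands in m and n for yourself, a quick sketch like this one records the arguments for each match. The sample string and the seen array are just for illustration.

```javascript
// Record what the callback receives for each match.
const seen = [];

"Tom &#38; Jerry&#39;s".replace(/&#(\d+);?/g, function (m, n) {
  seen.push({ m: m, n: n });
  return m; // hand the match back unchanged; we only want to look
});

console.log(seen);
// [ { m: '&#38;', n: '38' }, { m: '&#39;', n: '39' } ]
```

Note that n arrives as a string of digits, not a number.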

On the inside of our function we simply ignore the m parameter as we are only interested in the digits, baby. We pass the n parameter to JavaScript's String.fromCharCode function and BAM, straight up ascii conversion, what, WHAT.
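Putting it all together, the one-liner can be wrapped in a small helper. This is only a sketch: the name decodeNumericEntities is mine, not part of any library, and the explicit Number(n) is optional since String.fromCharCode will convert the string on its own.

```javascript
// Hypothetical helper: swaps decimal character codes like &#39; for the real character.
function decodeNumericEntities(text) {
  return text.replace(/&#(\d+);?/g, function (m, n) {
    // n arrives as a string of digits; convert it explicitly for clarity.
    return String.fromCharCode(Number(n));
  });
}

console.log(decodeNumericEntities("It&#39;s &#35;1")); // It's #1
console.log(decodeNumericEntities("no codes here"));   // no codes here
```

Keep in mind this only handles decimal codes. Hex codes such as &#x27; and named ones such as &amp; pass through untouched, and String.fromCharCode only covers codes up to 65535, so anything larger would need String.fromCodePoint instead.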

As always, hope this helps someone :]