Wednesday 16 February 2011

Removing XmlDocument Whitespace C#

I've recently been working on matching certain API calls with XML data pulled from an XML file for testing purposes. I noticed there was a large amount of white space left in the XML when pulled from the resourced XML file; which is something I didn't want.

I thought setting the XmlDocument.PreserveWhitespace property to false would remove this for me, but it just seems to remove the preceding and trailing white space; making it similar to the string.Trim() method. I needed to use something akin to string.Replace() (this replaces a specific substring with another substring), but more powerful. Here comes the Regex.Replace function to the rescue, which is a bit like string.Replace() on steroids! Regex.Replace() allows replacement of text using regular expressions, something which can make make complex replacements a piece of cake.

Here is the code to replace white space in XML or any XML dialect (such as HTML - or XHTML):

// Remove inner Xml whitespace
Regex regex = new Regex(@">\s*<");
string cleanedXml = regex.Replace(dirtyXml, "><");

No comments:

Post a Comment