Chinh Do

Removing Excess Whitespace from a String

9th March 2012

I was looking for the most efficient way to remove excess whitespace from a string, so I wrote the following benchmark. Can you guess which approach is faster?

// Requires: using System; using System.Diagnostics; using System.Text.RegularExpressions;
const int iterations = 200000;
const string expr = " Hello    world! Why    are so    many spaces?  Testing One   two three    four    five.";

// Remove excess space using Regex
var doRegex = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = Regex.Replace(expr, @"\s{2,}", " ");
    }
});


// Remove excess space using Split/Join
var doSplit = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = String.Join(" ", expr.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
});

// Helper: time an action, print the elapsed milliseconds, and return them
var benchMark = new Func<string, Action, long>((name, a) =>
{
    var sw = Stopwatch.StartNew();
    a();
    sw.Stop();
    Console.WriteLine(name + ": " + sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
});

// Warming up
Console.WriteLine("Warming up.");
doRegex();
doSplit();

// Run benchmark
long regexElapsed = benchMark("Regex", doRegex);
long splitElapsed = benchMark("Split", doSplit);
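
Before comparing timings, it may be worth checking that the two actions produce the same string, because they are not strictly equivalent. The regex replaces only runs of two or more whitespace characters (of any kind), while Split/Join splits on the space character only and also drops leading and trailing spaces. The sketch below is an illustration rather than part of the original benchmark: the helper names `CollapseWithRegex` and `CollapseWithSplit` are hypothetical wrappers around the same two calls, and the sample string is shortened.

```csharp
using System;
using System.Text.RegularExpressions;

static class WhitespaceDemo
{
    // Same pattern as the benchmark: runs of two or more whitespace characters.
    public static string CollapseWithRegex(string s)
    {
        return Regex.Replace(s, @"\s{2,}", " ");
    }

    // Same Split/Join call as the benchmark: splits on the space character only.
    public static string CollapseWithSplit(string s)
    {
        return String.Join(" ", s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }

    static void Main()
    {
        const string expr = " Hello    world!  Testing One   two.";

        // Brackets make leading and trailing spaces visible.
        Console.WriteLine("[" + CollapseWithRegex(expr) + "]"); // [ Hello world! Testing One two.]
        Console.WriteLine("[" + CollapseWithSplit(expr) + "]"); // [Hello world! Testing One two.]
    }
}
```

The regex version keeps the single leading space and the Split version does not. For input whose only whitespace is spaces, passing the regex result through `Trim()` should make the two outputs match, at a small extra cost to the regex side.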

On my PC, the Split method is about 7.5 times faster than Regex.
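
A ratio like this comes from a single timed pass per method, so it can shift with whatever else the machine is doing. One way to reduce that noise, sketched here as an assumption and not as the method used for the number above, is to repeat each measurement and keep the fastest run. `BestOf` is a hypothetical helper name, and the sample string is shortened.

```csharp
using System;
using System.Diagnostics;

static class BenchmarkSketch
{
    // Hypothetical helper: runs the action `runs` times and returns
    // the fastest elapsed time in milliseconds.
    public static long BestOf(int runs, Action action)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        long best = long.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            best = Math.Min(best, sw.ElapsedMilliseconds);
        }
        return best;
    }

    static void Main()
    {
        const string expr = " Hello    world!  Testing One   two.";

        // Same Split/Join loop shape as the benchmark in the article.
        var doSplit = new Action(() =>
        {
            for (int i = 0; i < 200000; i++)
            {
                var newStr = String.Join(" ", expr.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
            }
        });

        Console.WriteLine("Split (best of 5): " + BestOf(5, doSplit) + " ms");
    }
}
```

Taking the minimum rather than the average is a design choice: background activity can only make a run slower, so the fastest run is usually the least disturbed one.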

This entry was posted on Friday, March 9th, 2012 at 9:09 pm and is filed under Dotnet/.NET - C#, Programming.

There are currently 7 responses to “Removing Excess Whitespace from a String”

  1. On March 17th, 2012, Minh Le said:

    I could not understand your code at line 22. In my opinion, it should be something like this:

    var benchMark = new Func((name, a) => {…});

    Could you please explain yours?

    Also, about the result: mine is not as impressive as yours. I got 2098 ms for Regex and 1196 ms for the Split method. What do you think affects the result?

    I have a Core i7-2720QM running at 2.2 GHz and 8 GB of RAM.

    Thanks.

  2. On March 17th, 2012, Chinh Do said:

    Hi Minh: Good catch on line 22… I think that was some type of copy/paste error. I have fixed the article.

    Your i7-2720QM CPU is only maybe 25% slower than my i7-2600K at 3.4 GHz. I was actually running the code inside a VMware virtual machine, but I guess that doesn't slow things down much. Maybe you had other things running on your PC when you ran the code?

    Chinh
