Chinh Do

Removing Excess Whitespace from a String

9th March 2012

I was looking for the most efficient way to remove excess whitespace from a string and wrote the following benchmark. Can you guess which algorithm is faster?

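// Namespaces needed at the top of the file:
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// The statements below are meant to run inside a method such as Main.
const int iterations = 200000;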
const string expr = " Hello    world! Why    are so    many spaces?  Testing One   two three    four    five.";

// Remove excess space using Regex (collapses any run of two or more whitespace characters)
var doRegex = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = Regex.Replace(expr, @"\s{2,}", " ");
    }
});


// Remove excess space using Split/Join (splits on ' ' only, and also drops leading/trailing spaces)
var doSplit = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = String.Join(" ", expr.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
});

// Helper that times an action, prints the elapsed milliseconds, and returns them
var benchMark = new Func<string, Action, long>((name, a) =>
{
    var sw = Stopwatch.StartNew();
    a();
    sw.Stop();
    Console.WriteLine(name + ": " + sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
});

// Warming up
Console.WriteLine("Warming up.");
doRegex();
doSplit();

// Run benchmark
long regexElapsed = benchMark("Regex", doRegex);
long splitElapsed = benchMark("Split", doSplit);

On my PC, the Split method is about 7.5 times faster than Regex.
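
One caveat when reading the numbers: the two calls in the benchmark do not return exactly the same string. The following is a minimal sketch, not part of the original benchmark, that wraps the two calls in small methods and runs them on a short sample. The class name, method names, and sample string are made up for illustration. The third method is a hypothetical variant that passes a null separator array so that Split breaks on any whitespace, not just spaces.

```csharp
using System;
using System.Text.RegularExpressions;

static class WhitespaceDemo
{
    // Same Regex call as in the benchmark: collapses runs of two or more
    // whitespace characters; a single leading or trailing space is kept.
    public static string CollapseWithRegex(string s)
    {
        return Regex.Replace(s, @"\s{2,}", " ");
    }

    // Same Split/Join call as in the benchmark: splits on ' ' only, so tabs
    // are left alone, and leading/trailing spaces disappear.
    public static string CollapseWithSplit(string s)
    {
        return String.Join(" ", s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }

    // Hypothetical variant (not in the article): a null separator array
    // makes Split break on any whitespace character.
    public static string CollapseWithSplitAnyWhitespace(string s)
    {
        return String.Join(" ", s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
    }

    static void Main()
    {
        const string sample = " a   b\t\tc ";
        // Brackets make leading and trailing spaces visible in the output.
        Console.WriteLine("[" + CollapseWithRegex(sample) + "]");
        Console.WriteLine("[" + CollapseWithSplit(sample) + "]");
        Console.WriteLine("[" + CollapseWithSplitAnyWhitespace(sample) + "]");
    }
}
```

On this sample the Regex version keeps the single leading and trailing space, while the Split version drops them but leaves the two tabs untouched. If the outputs need to match, the comparison is only fair once both methods are adjusted to the same rules, for example by trimming the Regex result and splitting on all whitespace.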

This entry was posted on Friday, March 9th, 2012 at 9:09 pm and is filed under Dotnet/.NET - C#, Programming.

Responses to “Removing Excess Whitespace from a String”

  1. On March 17th, 2012, Minh Le said:

    I could not understand your code at line 22. In my opinion, it should be something like this:

    var benchMark = new Func<string, Action, long>((name, a) => {…});

    Could you please explain yours?

    Also, about the results: mine are not as impressive as yours. I got 2098 ms for Regex and 1196 ms for the Split method. What do you think affects the result?

    I have a Core i7-2720QM running at 2.2 GHz and 8 GB of RAM.

    Thanks.

  2. On March 17th, 2012, Chinh Do said:

    Hi Minh: Good catch on line 22… I think that was some type of copy/paste error. I have fixed the article.

    Your i7-2720QM CPU is only maybe 25% slower than my i7-2600K at 3.4 GHz. I was actually running the code inside a VMware virtual machine, but I guess that doesn’t slow things down much. Maybe you had other things running on your PC when you ran the code?

    Chinh
