Removing Excess Whitespace from a String

I was looking for the most efficient way to remove excess whitespace from a string, so I wrote the following benchmark. Can you guess which approach is faster?

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

const int iterations = 200000;
const string expr = " Hello    world! Why    are so    many spaces?  Testing One   two three    four    five.";

// Remove excess space using Regex
var doRegex = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = Regex.Replace(expr, @"\s{2,}", " ");
    }
});

// Remove excess space using Split/Join
var doSplit = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = String.Join(" ", expr.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
});

// Time a single run of the given action and print the elapsed milliseconds
var benchMark = new Func<string, Action, long>((name, a) =>
{
    var sw = Stopwatch.StartNew();
    a();
    sw.Stop();
    Console.WriteLine(name + ": " + sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
});

// Warming up: run each action once untimed so JIT compilation is not measured
Console.WriteLine("Warming up.");
doRegex();
doSplit();

// Run benchmark
long regexElapsed = benchMark("Regex", doRegex);
long splitElapsed = benchMark("Split", doSplit);

On my PC, the Split method is about 7.5 times faster than Regex.
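
One caveat before picking a winner: the two expressions in the benchmark do not produce identical output. The Regex pattern `\s{2,}` collapses any run of two or more whitespace characters (tabs included) but leaves a single leading or trailing space in place, while the Split/Join version splits on the space character only and drops leading and trailing spaces. The sketch below illustrates the difference on a small made-up input. The class name, method names, and sample string are my own for illustration, and `CollapseAllWhitespace` is a hypothetical third variant that was not part of the benchmark.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Regex expression from the benchmark: collapses runs of 2+ whitespace characters.
    public static string CollapseWithRegex(string s)
    {
        return Regex.Replace(s, @"\s{2,}", " ");
    }

    // Split/Join expression from the benchmark: splits on the space character only.
    public static string CollapseWithSplit(string s)
    {
        return String.Join(" ", s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }

    // Hypothetical variant (not benchmarked): a null separator array makes
    // Split treat every white-space character as a delimiter.
    public static string CollapseAllWhitespace(string s)
    {
        return String.Join(" ", s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
    }

    public static void Main()
    {
        // Leading space, double space, two tabs, trailing space.
        const string sample = " a  b\t\tc ";

        Console.WriteLine("[" + CollapseWithRegex(sample) + "]");     // [ a b c ]
        Console.WriteLine("[" + CollapseWithSplit(sample) + "]");     // [a b<tab><tab>c]
        Console.WriteLine("[" + CollapseAllWhitespace(sample) + "]"); // [a b c]
    }
}
```

If the input can contain tabs or newlines, or if leading and trailing spaces matter, it is worth checking which behavior you actually need before choosing on speed alone.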


14 Replies to “Removing Excess Whitespace from a String”

  1. I could not understand your code at line 22. In my opinion, it should be something like this:

    var benchMark = new Func((name, a) => {…});

    Could you please explain yours?

    Also, about the result: mine is not as impressive as yours. I got 2098 for Regex and 1196 for the Split method. What do you think affects the result?

    I have a Core i7-2720QM running at 2.2 GHz and 8 GB of RAM.


  2. Hi Minh: Good catch on line 22… I think that was some type of copy/paste error. I have fixed the article.

    Your i7-2720QM CPU is only maybe 25% slower than my i7-2600K at 3.4 GHz. I was actually running the code inside a VMware virtual machine, but I guess that doesn't slow things down much. Maybe you had some other things running on your PC when you ran the code?


