Removing Excess Whitespace from a String
I was looking for the most efficient way to remove excess whitespace from a string, so I wrote the following benchmark. Can you guess which approach is faster?
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

const int iterations = 200000;
const string expr = " Hello world! Why are so many spaces? Testing One two three four five.";

// Remove excess space using Regex: collapse each run of 2+ whitespace characters into one space
var doRegex = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = Regex.Replace(expr, @"\s{2,}", " ");
    }
});

// Remove excess space using Split/Join: split on spaces, drop empty entries, re-join with one space
var doSplit = new Action(() =>
{
    for (int i = 0; i < iterations; i++)
    {
        var newStr = String.Join(" ", expr.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
});

// Time a single run of the given action, print the result, and return the elapsed milliseconds
var benchMark = new Func<string, Action, long>((name, a) =>
{
    var sw = Stopwatch.StartNew();
    a();
    sw.Stop();
    Console.WriteLine(name + ": " + sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
});

// Warming up: run each action once so JIT compilation is not included in the timings
Console.WriteLine("Warming up.");
doRegex();
doSplit();

// Run benchmark
long regexElapsed = benchMark("Regex", doRegex);
long splitElapsed = benchMark("Split", doSplit);
On my PC, the Split/Join approach is about 7.5 times faster than the Regex approach.
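One thing worth checking before picking a winner is whether the two approaches actually return the same string. They do not always: the regex collapses any run of whitespace (tabs included) but leaves a single space at the start and end, while Split/Join only looks at the space character and drops leading and trailing spaces entirely. The sketch below illustrates this on a made-up input. The `sample` string and the two "normalized" variants are my own assumptions for illustration, not part of the benchmark above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input (not the benchmark string):
// two leading spaces, a double tab in the middle, two trailing spaces.
const string sample = "  Hello   world!\t\tTabs  ";

// Same calls as in the benchmark.
string viaRegex = Regex.Replace(sample, @"\s{2,}", " ");
string viaSplit = string.Join(" ", sample.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

Console.WriteLine("[" + viaRegex + "]");   // prints "[ Hello world! Tabs ]"
Console.WriteLine("[" + viaSplit + "]");   // prints "[Hello world!", two tab characters, "Tabs]"
Console.WriteLine(viaRegex == viaSplit);   // prints "False"

// One possible way to make the two comparable (an assumption, not the original code):
// trim first and collapse every whitespace run in the regex version,
// and split on all whitespace characters by passing a null separator array.
string regexNormalized = Regex.Replace(sample.Trim(), @"\s+", " ");
string splitNormalized = string.Join(" ", sample.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));

Console.WriteLine("[" + regexNormalized + "]");          // prints "[Hello world! Tabs]"
Console.WriteLine(regexNormalized == splitNormalized);   // prints "True"
```

If the input can contain tabs or newlines, or if leading and trailing spaces matter, it is probably fairer to benchmark the normalized variants against each other, since the timing difference may change once both do the same work.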