Detecting Blank Images with C#

Greetings, visitor from the year 2020! You can get the latest optimized, working source code for this, including a version that does not use unsafe code, from my GitHub repo. Thanks for visiting.

Recently I needed a way to find blank images in a large batch. I had tens of thousands of images to work with, so I came up with this C# function to tell me whether an image is blank.

The basic idea behind this function is that blank images will have highly uniform pixel values throughout the whole image. To measure the degree of uniformity (or variability), the function calculates the standard deviation of all pixel values. An image is determined to be blank if the standard deviation falls below a certain threshold.
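
To make the idea concrete, here is a minimal sketch (not the function from this post) that computes the population standard deviation of a small array of made-up pixel values and compares it with a threshold. The class name `UniformityDemo`, its method names, and whatever threshold you pass in are illustrative assumptions.

```csharp
using System;
using System.Linq;

public static class UniformityDemo
{
    // Population standard deviation: square root of the mean squared distance from the mean.
    public static double StdDev(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        double mean = values.Average();
        double sumSquares = values.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(sumSquares / values.Length);
    }

    // Hypothetical helper: "blank" means the values barely vary.
    public static bool LooksBlank(int[] values, double threshold)
    {
        return StdDev(values) < threshold;
    }
}
```

For example, the nearly uniform values `{ 250, 251, 250, 249 }` give a standard deviation of about 0.71, while `{ 0, 255, 0, 255 }` gives 127.5, so any threshold between the two separates them.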

Here’s the code. To compile it, the project that contains this code must have “Allow Unsafe Code” checked.

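// Requires: using System; using System.Drawing; using System.Drawing.Imaging;
public static bool IsBlank(string imageFileName)
{
    // GetStdDev works on red + green + blue (0 to 765), so its result is at most 382.5
    // and the previous threshold of 100000 would flag every image as blank.
    // 6 is only a starting point (a commenter below reports 2 on a 0 to 255 scale); tune it.
    double stdDev = GetStdDev(imageFileName);
    return stdDev < 6;
}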

/// <summary>
/// Get the standard deviation of pixel values.
/// </summary>
/// <param name="imageFileName">Name of the image file.</param>
/// <returns>Standard deviation.</returns>
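public static double GetStdDev(string imageFileName)
{
    double total = 0, totalVariance = 0;
    int count = 0;

    using (Bitmap b = new Bitmap(imageFileName))
    {
        // Lock the pixels as 24bpp RGB so that every pixel is exactly 3 bytes (blue, green, red),
        // whatever the file's own pixel format is. With b.PixelFormat, the 3-byte stepping below
        // is wrong for any other format and can read past the buffer for narrower formats.
        BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height), ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            unsafe
            {
                byte* p = (byte*)(void*)bmData.Scan0;
                // Each row is padded to the stride; skip the padding at the end of every row.
                int nOffset = bmData.Stride - b.Width * 3;
                for (int y = 0; y < b.Height; ++y)
                {
                    for (int x = 0; x < b.Width; ++x)
                    {
                        count++;
                        byte blue = p[0];
                        byte green = p[1];
                        byte red = p[2];

                        // Running estimate: each pixel is compared with the average of the
                        // pixels seen so far, not with the final average of the whole image.
                        int pixelValue = red + green + blue;
                        total += pixelValue;
                        double avg = total / count;
                        double diff = pixelValue - avg;
                        totalVariance += diff * diff;

                        p += 3;
                    }
                    p += nOffset;
                }
            }
        }
        finally
        {
            // Always release the lock, even if the scan throws.
            b.UnlockBits(bmData);
        }
    }

    return Math.Sqrt(totalVariance / count);
}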
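
If unsafe code is not an option, the same scan can be done over a managed byte array. The sketch below assumes the pixel data has already been copied out of the bitmap into a `byte[]` in 24bpp blue, green, red order, for example with `Marshal.Copy` from `bmData.Scan0` for `stride * height` bytes, and that the stride is positive. `ManagedScan` and `StdDevFromBgr24` are names made up for illustration, and this is not the version in the GitHub repo. It also swaps in Welford's single-pass method to compute the exact population standard deviation instead of the running estimate used above, so its numbers will not match `GetStdDev` exactly.

```csharp
using System;

public static class ManagedScan
{
    // Assumption: 'pixels' holds 24bpp rows in B, G, R order, each row 'stride' bytes long
    // (stride >= width * 3; the extra bytes are row padding and are skipped).
    public static double StdDevFromBgr24(byte[] pixels, int width, int height, int stride)
    {
        if (pixels == null) throw new ArgumentNullException(nameof(pixels));
        if (width <= 0 || height <= 0) throw new ArgumentException("Image must not be empty.");
        if (stride < width * 3) throw new ArgumentException("Stride is too small for the width.", nameof(stride));
        if (pixels.Length < (long)stride * (height - 1) + width * 3)
            throw new ArgumentException("Buffer is too small.", nameof(pixels));

        long count = 0;
        double mean = 0, m2 = 0; // Welford accumulators: running mean and sum of squared deviations

        for (int y = 0; y < height; y++)
        {
            int i = y * stride; // start of row y; padding is skipped by jumping to the next row start
            for (int x = 0; x < width; x++, i += 3)
            {
                int pixelValue = pixels[i] + pixels[i + 1] + pixels[i + 2]; // blue + green + red
                count++;
                double delta = pixelValue - mean;
                mean += delta / count;
                m2 += delta * (pixelValue - mean);
            }
        }

        return Math.Sqrt(m2 / count);
    }
}
```

Because the method only touches an array, it needs neither the `unsafe` keyword nor the “Allow Unsafe Code” setting; the price is one extra copy of the pixel data.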

Chinh Do

Chinh Do says:

November 3, 2009 at 9:06 am

Geetesh: My guess is that some of the scanned images contain scanning artifacts that cause the code to think they are not blank. You can try increasing the standard deviation threshold: change the threshold in IsBlank to something bigger.

Cy says:

February 16, 2010 at 3:19 pm

On certain images I still get the pointer error, even with Timex code.

I've added a try/catch with continue inside the loop:
try
{
    count++;

    byte blue = p[0];
    byte green = p[1];
    byte red = p[2];

    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();
    total += pixelValue;
    double avg = total / count;
    totalVariance += Math.Pow(pixelValue - avg, 2);
    stdDev = Math.Sqrt(totalVariance / count);
    p += bytesPerPixel;
}
catch
{
    continue;
}

This seems to have fixed the issue, but I still don't understand why some images throw an error.

Times: What do you mean by changing the hex to dec? All the functions rely on a byte, not a decimal value. Can you repost the decimal version?

Thanks

Stowaway says:

March 29, 2010 at 6:57 pm

There is an error when working with BMPs that have fewer than 8 bits per pixel.

i.e.:
byte bytesPerPixel = (byte)(bitsPerPixel / 8);

If bitsPerPixel < 8, then bytesPerPixel = 0,

and p += bytesPerPixel; becomes p += 0;

I tried a workaround by doing p++ every 8 loops, but that didn't work (for a 1 bpp BMP).
I'm a programming n00b. Any ideas?

Chinh Do says:

March 30, 2010 at 7:29 pm

Hi Stowaway: Sorry, I can't devote time to investigating this right now, but perhaps a workaround would be to convert the 1 BPP bitmaps to the right format? It looks like that can be done fairly easily according to the sample code here: http://www.wischik.com/lu/programmer/1bpp.html
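
As a rough idea of what such a conversion involves, the sketch below expands one row of 1 bit-per-pixel data into one byte per pixel, so that a loop advancing a whole number of bytes per pixel can work again. It assumes the usual packing where the most significant bit of each byte is the leftmost pixel, and that a set bit means white, which really depends on the bitmap's palette. `OneBppRow.Expand` is a hypothetical helper, not code from the linked page.

```csharp
using System;

public static class OneBppRow
{
    // Expands 'width' pixels from a packed 1bpp row into one byte per pixel.
    // Assumptions: MSB-first bit order; bit 1 maps to 255 (white), bit 0 maps to 0 (black).
    public static byte[] Expand(byte[] packedRow, int width)
    {
        if (packedRow == null) throw new ArgumentNullException(nameof(packedRow));
        if (width < 0 || packedRow.Length * 8L < width)
            throw new ArgumentException("Row is too short for the requested width.", nameof(width));

        var result = new byte[width];
        for (int x = 0; x < width; x++)
        {
            // Pixel x lives in byte x / 8, at bit position 7 - (x % 8).
            int bit = (packedRow[x / 8] >> (7 - (x % 8))) & 1;
            result[x] = bit == 1 ? (byte)255 : (byte)0;
        }
        return result;
    }
}
```

For instance, the single packed byte `0xA0` (binary 1010 0000) with a width of 4 expands to the four pixel values 255, 0, 255, 0.
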
Stowaway says:

March 30, 2010 at 11:31 pm

Thanks for your reply, Chinh Do.
I already did that workaround, actually.

I'll keep my eye on this page in case anyone else has a more efficient solution.
:)

Phil says:

May 14, 2010 at 1:43 pm

I'm trying to convert this code to run under VB.NET. I'm having a problem with the following line: byte* p = (byte*)(void*)Scan0;

Has anyone converted this code to VB?

Do I have to use the "Pointer" method, or can the code be written without using a pointer?

Thank you for your help!
Phil

Rashid says:

September 14, 2010 at 5:41 am

Hi Chinh,

I need your suggestion on how to put the code you have given into a project.
I want to do something similar: scan through a folder containing thousands of images and get the file names of the blank images as output.

Pablo says:

September 25, 2010 at 10:44 am

Is it possible to do this without unsafe code? Thanks.

Srinivas says:

October 3, 2011 at 9:58 am

Hi,

I have a scanned document with multiple pages, and every alternate page is blank.
When I convert each page to an image and run the above code, IsBlank still returns "false".

How do I overcome this issue?

Is the tolerance the same for all .tiff files, or does it vary for each page?

For a given multi-page scanned document, how do I find the tolerance value?

Any help is highly appreciated.

Leonardo Pignataro says:

December 23, 2011 at 5:19 pm

I use this algorithm in some of my programs, but with a few changes. One of the changes I made to the algorithm was on this line:

int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();

This is not very wise, because the ToArgb() method returns a 32-bit integer in the form AARRGGBB, so differences between pixels in the red channel count more toward the calculated standard deviation than differences in the green channel, and far more than differences in the blue channel. This makes it difficult to set a threshold for the standard deviation, because the algorithm may return quite different values for images that are near the blank/not-blank point.

A more meaningful value for the pixelValue variable is:

int pixelValue = (red + green + blue) / 3;

or

double pixelValue = (red + green + blue) / 3.0;

This way, differences in the 3 channels carry the same weight in the result. Also, the algorithm's output becomes much easier to interpret, since pixel values now range from 0 to 255. Setting a threshold with this modification to the algorithm has proven much easier. The algorithm usually returns a value close to 1 on blank images and above 8 on non-blank images. I am currently using 2 as the threshold, with a near-perfect accuracy rate.

I know this means we're interpreting the image as if it were in grayscale, but in my experience it makes no difference.
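
A small sketch of the weighting difference described above, with the packed value written out by hand so it runs without System.Drawing. `PixelMath` and its method names are made up for illustration. Note the parentheses in the average: `/` binds tighter than `+`, so without them only the blue channel would be divided.

```csharp
using System;

public static class PixelMath
{
    // Packed 0x00RRGGBB value, the same number Color.FromArgb(0, r, g, b).ToArgb() produces.
    public static int PackRgb(byte red, byte green, byte blue)
    {
        return (red << 16) | (green << 8) | blue;
    }

    // Average of the three channels, always in the range 0 to 255.
    public static double ChannelAverage(byte red, byte green, byte blue)
    {
        return (red + green + blue) / 3.0;
    }
}
```

A one-step change in red moves the packed value by 65536, while the same step in blue moves it by 1; with `ChannelAverage`, a one-step change in any channel moves the result by the same one third.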