Chinh Do

Detecting Blank Images with C#

10th September 2008

Recently I needed a way to find blank images among a large batch of images. I had tens of thousands of images to work with, so I came up with this C# function to tell me whether an image is blank.

The basic idea behind this function is that blank images will have highly uniform pixel values throughout the whole image. To measure the degree of uniformity (or variability), the function calculates the standard deviation of all pixel values. An image is determined to be blank if the standard deviation falls below a certain threshold.

Here’s the code. In order to compile, the project in which this code resides must have “Allow Unsafe Code” checked.

public static bool IsBlank(string imageFileName)
{
    double stdDev = GetStdDev(imageFileName);
    return stdDev < 100000;
}
 
/// <summary>
/// Get the standard deviation of pixel values.
/// </summary>
/// <param name="imageFileName">Name of the image file.</param>
/// <returns>Standard deviation.</returns>
public static double GetStdDev(string imageFileName)
{
    double total = 0, totalVariance = 0;
    int count = 0;
    double stdDev = 0;
 
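    // Lock the bitmap so its raw bytes can be read directly.
    // Note: the loop below steps 3 bytes per pixel, so it assumes a 24bpp image.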
    using (Bitmap b = new Bitmap(imageFileName))
    {
        BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height), ImageLockMode.ReadOnly, b.PixelFormat);
        int stride = bmData.Stride;
        IntPtr Scan0 = bmData.Scan0;
        unsafe
        {
            byte* p = (byte*)(void*)Scan0;
            int nOffset = stride - b.Width * 3;
            for (int y = 0; y < b.Height; ++y)
            {
                for (int x = 0; x < b.Width; ++x)
                {
                    count++;
 
                    byte blue = p[0];                            
                    byte green = p[1];
                    byte red = p[2];
 
                    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();
                    total += pixelValue;
                    double avg = total / count;
                    totalVariance += Math.Pow(pixelValue - avg, 2);
                    stdDev = Math.Sqrt(totalVariance / count);
 
                    p += 3;
                }
                p += nOffset;
            }
        }
 
        b.UnlockBits(bmData);
    }
 
    return stdDev;
}

This entry was posted on Wednesday, September 10th, 2008 at 10:39 pm and is filed under Dotnet/.NET - C#, Programming.

There are currently 42 responses to “Detecting Blank Images with C#”

  1. 1 On September 11th, 2008, Dew Drop - September 11, 2008 | Alvin Ashcraft's Morning Dew said:

    [...] Algorithm to Detect Blank Images (Chinh Do) [...]

  2. 2 On September 14th, 2008, Interesting Finds: 2008.09.10~2008.09.14 - gOODiDEA.NET said:

    [...] Algorithm to Detect Blank Images [...]

  3. 3 On October 21st, 2008, Eugene said:

    Looking forward to more information about this. Thanks for sharing. Eugene

  4. 4 On March 2nd, 2009, Prateek said:

    The above code gives the error “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.” at the line byte blue = p[0];. Please advise how to resolve it.

  5. 5 On March 2nd, 2009, Chinh Do said:

    Prateek: Hmm I am not sure why you are getting that error. Do you get that on a specific image or any image? Chinh

  6. 6 On March 2nd, 2009, Prateek said:

    on every image. I am using VS 2008.

  7. 7 On March 2nd, 2009, Chinh Do said:

    Hi Prateek:

    I just tested the code against this JPG (http://upload.wikimedia.org/wikipedia/en/thumb/2/27/Asinara-Island01.jpg/800px-Asinara-Island01.jpg) file running from Visual Studio 2008 and it worked fine for me.

    Does the code work when you run it against the example JPG above?

    Chinh

  8. 8 On April 24th, 2009, virtro said:

    Thank you very much for sharing this function. But it only works for 24-bit images. When a 1-bit bitmap is checked with this function, it throws the exception mentioned in comment 4 (“Attempted to read or write protected memory. This is often an indication that other memory is corrupt.” at the line byte blue = p[0];).

  9. 9 On April 24th, 2009, Chinh Do said:

    Virtro: Thanks for letting me know. When I have some time I’ll see if I can debug and fix this. Chinh

  10. 10 On April 29th, 2009, virtro said:

    Chinh Do: Could you tell me what the algorithm for detecting blank images is based on?

  11. 11 On April 29th, 2009, Chinh Do said:

    Virtro: You mean the idea behind the algorithm? The general idea is that images are made up of pixels of different values. Blank images would then have pixels that have very similar values. Real images would have pixel values that are spread out all over the spectrum. For example, in a blank/white image, all pixels would have the value #FFFFFF.

    In statistics, Standard Deviation is used to measure the variability or dispersion of a population… so that’s what is used to calculate the similarity or dispersion value of each image. Hope this helps.

  12. 12 On May 3rd, 2009, virtro said:

    Chinh Do:Thanks very much for your help.

  13. 13 On May 26th, 2009, Timex said:

    What is the reason you chose 3 as your constant pointer advance interval? I believe virtro was on to something when he mentioned it worked ok for 24-bit images.

    In my case, I am actually trying to use this algorithm to analyze a bunch of DB BLOBs of diagram images, to see if they are blank or not. For me, they happen to be 8-bit JPEGs. So I had the same error “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”

    So, it looks like you need to [1] smartly use the LCD for the current bitmap format (note that they can be larger than 32bpp, even up to 256bpp) for your p interval or [2] just use a constant 2 since all formats will be multiples of 2.

    HTH

  14. 14 On May 26th, 2009, Timex said:

    CORRECTION:

    I goofed. I overlooked that there actually is a 1bpp format, as I believe virtro mentioned. In that case, you will have to p advance by 1.

    TIP: You can interrogate the pixel format via BitmapData.PixelFormat or Bitmap.PixelFormat. So, I thought of adding some code to choose an LCD based on the actual pixel format for the given image.

    Cheers

  15. 15 On May 26th, 2009, Timex said:

    Here is what I came up with:

    [code]
    /// <summary>
    /// Gets whether or not a given Bitmap is blank.
    /// </summary>
    /// <param name="bitmap">The instance of the Bitmap for this method extension.</param>
    /// <returns>Returns true if the given Bitmap is blank; otherwise returns false.</returns>
    public static bool IsBlank(this Bitmap bitmap) {
        double stdDev = GetStdDev(bitmap);
        int tolerance = 100000;
        return stdDev < tolerance;
    }

    /// <summary>
    /// Gets the bits per pixel (bpp) for the given Bitmap.
    /// </summary>
    /// <param name="bitmap">The instance of the Bitmap for this method extension.</param>
    /// <returns>Returns a byte representing the bpp for the Bitmap.</returns>
    internal static byte GetBitsPerPixel(this Bitmap bitmap) {
        byte bpp = 0x1;

        //return Regex.Match(Regex.Match(bitmap.PixelFormat.ToString(), @"\dbpp").Value, @"\d+").Value;
        switch (bitmap.PixelFormat) {
            case PixelFormat.Format1bppIndexed:
                bpp = 0x1;
                break;
            case PixelFormat.Format4bppIndexed:
                bpp = 0x4;
                break;
            case PixelFormat.Format8bppIndexed:
                bpp = 0x8;
                break;
            case PixelFormat.Format16bppArgb1555:
            case PixelFormat.Format16bppGrayScale:
            case PixelFormat.Format16bppRgb555:
            case PixelFormat.Format16bppRgb565:
                bpp = 0x16;
                break;
            case PixelFormat.Format24bppRgb:
                bpp = 0x24;
                break;
            case PixelFormat.Canonical:
            case PixelFormat.Format32bppArgb:
            case PixelFormat.Format32bppPArgb:
            case PixelFormat.Format32bppRgb:
                bpp = 0x32;
                break;
            case PixelFormat.Format48bppRgb:
                bpp = 0x48;
                break;
            case PixelFormat.Format64bppArgb:
            case PixelFormat.Format64bppPArgb:
                bpp = 0x64;
                break;
        }
        return bpp;
    }

    /// <summary>
    /// Get the standard deviation of pixel values.
    /// </summary>
    /// <param name="bitmap">The instance of the Bitmap for this method extension.</param>
    /// <returns>Returns the standard deviation of pixel population of the Bitmap.</returns>
    public static double GetStdDev(this Bitmap bitmap) {
        double total = 0;
        double totalVariance = 0;
        int count = 0;
        double stdDev = 0;

        // First get all the bytes
        BitmapData bmData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.ReadOnly, bitmap.PixelFormat);
        int stride = bmData.Stride;
        IntPtr Scan0 = bmData.Scan0;

        byte bitsPerPixel = GetBitsPerPixel(bitmap);
        byte bytesPerPixel = (byte)(bitsPerPixel / 8);

        unsafe {
            byte* p = (byte*)(void*)Scan0;
            int nOffset = stride - bitmap.Width * bytesPerPixel;
            for (int y = 0; y < bitmap.Height; ++y) {
                for (int x = 0; x < bitmap.Width; ++x) {
                    count++;

                    byte blue = p[0];
                    byte green = p[1];
                    byte red = p[2];

                    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();
                    total += pixelValue;
                    double avg = total / count;
                    totalVariance += Math.Pow(pixelValue - avg, 2);
                    stdDev = Math.Sqrt(totalVariance / count);
                    p += bytesPerPixel;
                }
                p += nOffset;
            }
        }
        bitmap.UnlockBits(bmData);

        return stdDev;
    }
    [/code]

    Note: I wrote this in C# 3.0, so these are extension methods, that way they appear as helper methods for the Bitmap type, like so:

    [code]
    using (Bitmap bitmap = new Bitmap(@"someimage.bmp")) {
        byte bpp = bitmap.GetBitsPerPixel();
        double stddev = bitmap.GetStdDev();
        bool isBlank = bitmap.IsBlank();
    }
    [/code]

    In any event, I believe this should work for non-indexed bitmaps at least. I’m not sure about true indexed bitmaps.

    Try it out and see if it works.

    Chinh Do: You still get most of the credit, though! :D Much appreciated.

  16. 16 On May 26th, 2009, Timex said:

    Oops, argh! I should have proofed before posting. The switch case statements should have decimal values, not hex. Or, convert them to the correct hex value (e.g. 24 = 0x18 etc.). Apologies.
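
As an alternative to maintaining the switch statement by hand, the bit depth can be read with the framework's static Image.GetPixelFormatSize method. The following is only a minimal sketch: the PixelFormatInfo class and its method names are made up for illustration and are not part of the code posted above.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper class, named here for illustration only.
public static class PixelFormatInfo
{
    // Bits per pixel as reported by the framework, e.g. 24 for Format24bppRgb.
    public static int GetBitsPerPixel(PixelFormat format)
    {
        return Image.GetPixelFormatSize(format);
    }

    // Whole bytes per pixel. This is 0 for 1bpp and 4bpp indexed formats,
    // which a loop that advances one or more bytes per pixel cannot handle.
    public static int GetBytesPerPixel(PixelFormat format)
    {
        return GetBitsPerPixel(format) / 8;
    }
}
```

A caller would probably want to check for a result of 0 from GetBytesPerPixel and treat that image separately, rather than entering the pixel loop.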

  17. 17 On May 26th, 2009, Chinh Do said:

    Timex: Thanks very much for sharing the code to support various formats. Very nice!

  18. 18 On May 26th, 2009, Timex said:

    Chinh Do, no, thank you! I didn’t feel like digging in to try to figure out how to do that kind of thing. You did all the grunt work!

  19. 19 On June 24th, 2009, Roshan said:

    Is there similar code to check black images in C++?

  20. 20 On July 9th, 2009, steve said:

    I believe you have an error in calculating your variance.
    You can compute the variance only after you know your average.
    You should go over the pixels again after you know the average and compute the squared difference.
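
The two-pass calculation described in this comment could be sketched as follows. This is an illustration only: it works on a plain array of values that are assumed to have been extracted from the image already, and the PixelStats class and StdDevTwoPass method are hypothetical names.

```csharp
using System;

// Hypothetical helper class, named here for illustration only.
public static class PixelStats
{
    // Population standard deviation in two passes:
    // first the mean, then the mean squared distance from that mean.
    public static double StdDevTwoPass(double[] values)
    {
        if (values == null || values.Length == 0)
            return 0.0;

        double sum = 0;
        foreach (double v in values)
            sum += v;
        double mean = sum / values.Length;

        double squaredDiffs = 0;
        foreach (double v in values)
        {
            double d = v - mean;
            squaredDiffs += d * d;
        }
        return Math.Sqrt(squaredDiffs / values.Length);
    }
}
```

For a perfectly uniform image every value equals the mean, so the result is 0. For two pixels with values 0 and 255 the mean is 127.5 and the standard deviation is also 127.5.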

  21. 21 On July 9th, 2009, Chinh Do said:

    Roshan: I am not aware of something like this in C++, but that doesn’t mean that it doesn’t exist. I am sure you can translate the code to C++. Any C++ expert out there want to help us out?

    Steve:

    Thanks for your note and I think you are right. My algorithm does not produce a standard deviation number in the textbook definition. I think I used this modified “running” standard deviation algorithm to allow for this optimization: once the “running” standard deviation exceeds a certain threshold, I can short circuit the process and exit the loop. I do remember using this optimization but I guess I took it out at the end to keep the published code simple.

    Chinh
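
One way to keep the early exit mentioned here while still working with an exact variance is Welford's online algorithm, which is a different technique from the running calculation in the post. The sketch below is an assumption-laden illustration: RunningStats and ExceedsStdDev are hypothetical names, and totalCount must equal the number of values in the sequence. It relies on the running sum of squared differences never decreasing, so once that sum alone pushes the final variance past the threshold, the remaining pixels cannot bring it back down.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, named here for illustration only.
public static class RunningStats
{
    // Returns true as soon as the values seen so far prove that the population
    // standard deviation of all totalCount values must exceed the threshold.
    // Returns false if the sequence ends without reaching that point.
    public static bool ExceedsStdDev(IEnumerable<double> values, long totalCount, double threshold)
    {
        long n = 0;
        double mean = 0;
        double m2 = 0; // running sum of squared differences from the mean
        double limit = threshold * threshold * totalCount;

        foreach (double v in values)
        {
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean); // Welford update; this term is never negative
            if (m2 > limit)
                return true; // final variance is at least m2 / totalCount
        }
        return false;
    }
}
```

With a threshold of 2 and four values 0, 255, 0, 255, the loop exits after the second value.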

  22. 22 On October 28th, 2009, Geetesh said:

    Thanks for such a nice piece of code.

    I am new to C# and need your help, Chinh Do or Timex.
    Some of the scanned images (pages that are blank on both sides) are detected as blank, while others are not, even though they are blank.
    Can you please help with it?
    If possible, can you please give me some explanation of the code you or Timex have given?

  23. 23 On November 3rd, 2009, Chinh Do said:

    Geetesh: My guess is that some of the scanned images have scanning artifacts in them that cause the code to think they are not blank. You can try to increase the Standard Deviation threshold. Change the 100000 number to something bigger.

  24. 24 On February 16th, 2010, Cy said:

    Seems on certain images I get the pointer error even with Timex code.

    I’ve added try catch with continue inside:
    try
    {
    count++;

    byte blue = p[0];
    byte green = p[1];
    byte red = p[2];

    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();
    total += pixelValue;
    double avg = total / count;
    totalVariance += Math.Pow(pixelValue - avg, 2);
    stdDev = Math.Sqrt(totalVariance / count);
    p += bytesPerPixel;
    }
    catch
    {
    continue;
    }

    This seems to have fixed the issue, but I still don’t understand why some images will throw an error.

    Timex: What do you mean by changing the hex to dec? All the functions rely on a byte, not a decimal value. Can you repost the decimal version?

    Thanks

  25. 25 On March 29th, 2010, Stowaway said:

    There is an error when working with BMPs that have fewer than 8 bits per pixel.

    i.e.:
    byte bytesPerPixel = (byte)(bitsPerPixel / 8);

    If bitsPerPixel < 8, then bytesPerPixel = 0

    and p += bytesPerPixel; becomes p += 0;

    I tried a workaround by doing p++ every 8 loops, but that didn’t work (for a 1 bpp BMP).
    I’m a programming n00b. Any ideas?

  26. 26 On March 30th, 2010, Chinh Do said:

    Hi Stowaway: Sorry I can’t devote time to investigate this right now, but perhaps a work around can be achieved by converting the 1 BPP bitmaps to the right format? It looks like that can be fairly easily done according to sample code from here: http://www.wischik.com/lu/programmer/1bpp.html
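
The conversion workaround suggested in this reply might look something like the sketch below, which redraws the image onto a new 24bpp bitmap. This is not the code from the linked page; BitmapConversion and To24bppRgb are hypothetical names, and transparent areas of the source would end up on the new bitmap's default black background.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper class, named here for illustration only.
public static class BitmapConversion
{
    // Returns a new 24bpp copy of the source image. The caller disposes it.
    public static Bitmap To24bppRgb(Image source)
    {
        Bitmap copy = new Bitmap(source.Width, source.Height, PixelFormat.Format24bppRgb);
        using (Graphics g = Graphics.FromImage(copy))
        {
            // Draw the whole source into a rectangle of the same pixel size.
            g.DrawImage(source, new Rectangle(0, 0, copy.Width, copy.Height));
        }
        return copy;
    }
}
```

The converted copy then has exactly 3 bytes per pixel, which is what the loop in the original post expects.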

  27. 27 On March 30th, 2010, Stowaway said:

    thanks for your reply chinh do.
    I already did that work around actually..

    I’ll keep my eye on this page in case anyone else has a more efficient solution.
    :)

  28. 28 On May 14th, 2010, Phil said:

    I’m trying to convert this code to run under VB.net. I’m having a problem with the following line. byte* p = (byte*)(void*)Scan0;

    Has anyone converted this code to VB?

    Do I have to use the “Pointer” method or can the code be written without using a Pointer?

    Thank you for your Help!
    Phil

  29. 29 On September 14th, 2010, Rashid said:

    Hi Chinh

    I need your suggestion on how to put the code you have given into a project.
    I want to do something similar: scan through a folder containing thousands of images and get the filename of each blank image as output.
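
A folder scan along these lines could be sketched as below. The BlankImageScanner class and FindBlankImages method are hypothetical names; the blank test is passed in as a delegate so that any IsBlank(string) implementation, such as the one in the post, can be plugged in. Directory.EnumerateFiles requires .NET Framework 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class, named here for illustration only.
public static class BlankImageScanner
{
    // Returns the paths of all files in the folder that the supplied test reports as blank.
    public static List<string> FindBlankImages(string folder, string searchPattern, Func<string, bool> isBlank)
    {
        List<string> blanks = new List<string>();
        foreach (string path in Directory.EnumerateFiles(folder, searchPattern))
        {
            try
            {
                if (isBlank(path))
                    blanks.Add(path);
            }
            catch (Exception ex)
            {
                // A file that cannot be read as an image is reported and skipped.
                Console.Error.WriteLine("Skipping " + path + ": " + ex.Message);
            }
        }
        return blanks;
    }
}
```

A call might then look like FindBlankImages(@"C:\scans", "*.jpg", IsBlank), where the folder path and pattern are placeholders.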

  30. 30 On September 25th, 2010, Pablo said:

    Is it possible to do this without unsafe code? Thanks.
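
One way to avoid the unsafe block is to copy the locked pixel data into a managed array with Marshal.Copy. The sketch below is an illustration under stated assumptions: ManagedPixelReader and ReadBgrBytes are hypothetical names, the method only accepts 24bpp bitmaps (other formats would need converting first), and IntPtr.Add requires .NET Framework 4 or later.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Hypothetical helper class, named here for illustration only.
public static class ManagedPixelReader
{
    // Copies a 24bpp bitmap into a managed array of blue, green, red bytes,
    // dropping the padding that may follow each row in memory.
    public static byte[] ReadBgrBytes(Bitmap bitmap)
    {
        if (bitmap.PixelFormat != PixelFormat.Format24bppRgb)
            throw new ArgumentException("Expected a 24bpp RGB bitmap.", "bitmap");

        int width = bitmap.Width;
        int height = bitmap.Height;
        int rowBytes = width * 3;
        byte[] pixels = new byte[rowBytes * height];

        BitmapData data = bitmap.LockBits(new Rectangle(0, 0, width, height),
            ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            for (int y = 0; y < height; y++)
            {
                // Stride is the distance in bytes between rows, padding included.
                IntPtr row = IntPtr.Add(data.Scan0, y * data.Stride);
                Marshal.Copy(row, pixels, y * rowBytes, rowBytes);
            }
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
        return pixels;
    }
}
```

The returned array can then be processed with ordinary indexing, so the project no longer needs “Allow Unsafe Code” for this part.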

  31. 31 On October 3rd, 2011, Srinivas said:

    Hi,

    I have a scanned document with multiple pages, and every alternate page is blank.
    When I convert each page to an image and run the above code, IsBlank still returns “false”.

    How do I overcome this issue?

    Is the tolerance the same for all .tiff files, or does it vary for each page?

    For a given multi-page scanned document, how do I find the tolerance value?

    Any help is highly appreciated.

  32. 32 On December 23rd, 2011, Leonardo Pignataro said:

    I use this algorithm in some of my programs, but with a few changes. One of the changes I did to the algorithm was on this line:

    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();

    This is not very wise: because the ToArgb() method returns a 32-bit integer in the form AARRGGBB, differences among pixels on the red channel weigh more in the calculated standard deviation than differences on the green channel, and far more than differences on the blue channel. This makes it difficult to set a threshold for the standard deviation, because the algorithm may return quite different values for images which are near the blank/not-blank point.

    A more meaningful value to the pixelValue variable is:

    int pixelValue = (red + green + blue) / 3;

    or

    double pixelValue = (red + green + blue) / 3.0;

    This way, differences in the 3 channels carry the same weight in the result. Also, the algorithm’s output becomes much easier to interpret, since each pixel value now ranges from 0 to 255. Setting a threshold with this modification to the algorithm has proven much easier. The algorithm usually returns a value close to 1 on blank images and above 8 on non-blank images. I am currently using 2 for the threshold, with a near perfect accuracy rate.

    I know this means we’re interpreting the image as if it were in grayscale, but in my experience it makes no difference.

  33. 33 On December 23rd, 2011, Chinh Do said:

    Hi Leonardo: That makes a lot of sense. Thanks for sharing the info. I guess we can even calculate Standard Deviation for each color channel individually and make sure each of them is below the threshold.
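
The per-channel idea could be sketched as follows. This is only an illustration: ChannelStats, StdDevPerChannel and IsBlank are hypothetical names, the input is assumed to be tightly packed blue, green, red bytes with no row padding, and a suitable per-channel threshold would have to be found by experiment.

```csharp
using System;

// Hypothetical helper class, named here for illustration only.
public static class ChannelStats
{
    // Population standard deviation of the blue, green and red channels,
    // returned in that order, for 3-bytes-per-pixel data.
    public static double[] StdDevPerChannel(byte[] bgr)
    {
        double[] result = new double[3];
        int pixelCount = bgr.Length / 3;
        if (pixelCount == 0)
            return result;

        for (int c = 0; c < 3; c++)
        {
            double sum = 0;
            for (int i = c; i < pixelCount * 3; i += 3)
                sum += bgr[i];
            double mean = sum / pixelCount;

            double squaredDiffs = 0;
            for (int i = c; i < pixelCount * 3; i += 3)
            {
                double d = bgr[i] - mean;
                squaredDiffs += d * d;
            }
            result[c] = Math.Sqrt(squaredDiffs / pixelCount);
        }
        return result;
    }

    // Blank only if every channel stays below the threshold.
    public static bool IsBlank(byte[] bgr, double threshold)
    {
        foreach (double s in StdDevPerChannel(bgr))
        {
            if (s >= threshold)
                return false;
        }
        return true;
    }
}
```

Unlike the averaged grayscale value, this version would also flag an image whose variation is confined to a single channel.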

  34. 34 On April 4th, 2012, Jan said:

    I understand that this code works on bitmaps. Does anyone have a hint whether there is something similar for compressed files (JPEG, JPEG2000) without decompressing the image? Thx, Jan

  35. 35 On April 4th, 2012, Chinh Do said:

    Jan: This code should work on JPEGs as well.

  36. 36 On April 4th, 2012, Steve said:

    How awesome is this function? I am pulling hundreds of website snapshots. This will spot the blank ones and I can flag the listing.
    I just have to work out all-black or color screens. The other issue is to spot “navigation cancelled.”
    Thanks for posting !!!!!

  37. 37 On April 4th, 2012, Chinh Do said:

    Thanks for your post Steve. Glad to hear that.

  38. 38 On April 12th, 2012, Adi said:

    Can someone please confirm what the final working code is and how we call it?

  39. 39 On April 12th, 2012, Chinh Do said:

    Adi: Sorry, I have not gotten around to taking everyone’s feedback from this post and creating an updated method. You will have to take my starting code and incorporate the additions/changes from the comments in this post. Chinh

  40. 40 On September 11th, 2012, Detecting blank images (the image expected is a barcode) in c# | Jisku.com - Developers Network said:

    [...] have also checked out this but it has unsafe code. I am looking for a managed C# solution. Sep 11, 2012 No Comments » [...]

