Detecting Blank Images with C#

Greetings, visitor from the year 2020! You can get the latest optimized working source code for this, including a version that does not use unsafe code, from my GitHub repo here. Thanks for visiting.

Recently I needed a way to find blank images among a large batch of images. I had tens of thousands of images to work with, so I came up with this C# function to tell me whether an image is blank.

The basic idea behind this function is that blank images will have highly uniform pixel values throughout the whole image. To measure the degree of uniformity (or variability), the function calculates the standard deviation of all pixel values. An image is determined to be blank if the standard deviation falls below a certain threshold.

Here’s the code. To compile it, the project in which this code resides must have “Allow Unsafe Code” checked.

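// Requires: using System; using System.Drawing; using System.Drawing.Imaging;
public static bool IsBlank(string imageFileName)
{
    double stdDev = GetStdDev(imageFileName);
    // Threshold below which the image counts as blank; tune it for your images.
    // Note: with pixelValue = red + green + blue (0 to 765), GetStdDev never exceeds 765.
    return stdDev < 100000;
}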

/// <summary>
/// Get the standard deviation of pixel values.
/// </summary>
/// <param name="imageFileName">Name of the image file.</param>
/// <returns>Standard deviation.</returns>
public static double GetStdDev(string imageFileName)
{
    double total = 0, totalVariance = 0;
    int count = 0;
    double stdDev = 0;

    // First get all the bytes
    using (Bitmap b = new Bitmap(imageFileName))
    {
        BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height), ImageLockMode.ReadOnly, b.PixelFormat);
        int stride = bmData.Stride;
        IntPtr Scan0 = bmData.Scan0;
        unsafe
        {
            byte* p = (byte*)(void*)Scan0;
            // Assumes 3 bytes per pixel (24bpp); other pixel formats need a different step.
            int nOffset = stride - b.Width * 3;
            for (int y = 0; y < b.Height; ++y)
            {
                for (int x = 0; x < b.Width; ++x)
                {
                    count++;
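                    byte blue = p[0];
                    byte green = p[1];
                    byte red = p[2];

                    int pixelValue = red + green + blue;
                    total += pixelValue;
                    // avg is the average of the pixels seen so far, so this is a
                    // "running" deviation rather than the textbook standard deviation.
                    double avg = total / count;
                    totalVariance += Math.Pow(pixelValue - avg, 2);
                    stdDev = Math.Sqrt(totalVariance / count);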

                    p += 3;
                }
                p += nOffset;
            }
        }

        b.UnlockBits(bmData);
    }

    return stdDev;
}
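
If you cannot enable unsafe code, the same loop can run over a managed byte array. What follows is a minimal sketch and not the version from the GitHub repo: the names BlankImageSketch and RunningDeviation and the parameter list are made up for illustration. It assumes a positive stride and at least 3 bytes per pixel stored in blue, green, red order. The buffer itself could be filled from bmData.Scan0 with Marshal.Copy.

```csharp
using System;

public static class BlankImageSketch
{
    // Same running calculation as GetStdDev, but over a managed byte[]
    // instead of a raw pointer, so no unsafe block is needed.
    // Hypothetical helper: assumes stride > 0 and B, G, R as the first 3 bytes of each pixel.
    public static double RunningDeviation(byte[] pixels, int width, int height, int stride, int bytesPerPixel)
    {
        if (bytesPerPixel < 3)
            throw new ArgumentOutOfRangeException(nameof(bytesPerPixel), "Expected 3 (24bpp) or 4 (32bpp).");

        double total = 0, totalVariance = 0, stdDev = 0;
        int count = 0;

        for (int y = 0; y < height; y++)
        {
            // stride is the full row length in bytes, including any padding at the end.
            int row = y * stride;
            for (int x = 0; x < width; x++)
            {
                int i = row + x * bytesPerPixel;
                int pixelValue = pixels[i] + pixels[i + 1] + pixels[i + 2];

                count++;
                total += pixelValue;
                double avg = total / count;
                totalVariance += Math.Pow(pixelValue - avg, 2);
                stdDev = Math.Sqrt(totalVariance / count);
            }
        }

        return stdDev;
    }
}
```

Indexing each row from y * stride skips the padding bytes without a separate offset, and passing bytesPerPixel lets the same loop handle 24bpp and 32bpp data.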

Chinh Do


Comments

  • What is the reason you chose 3 as your constant pointer advance interval? I believe virtro was on to something when he mentioned it worked ok for 24-bit images.

    In my case, I am actually trying to use this algorithm to analyze a bunch of DB BLOBs of diagram images, to see if they are blank or not. For me, they happen to be 8-bit JPEGs. So I had the same error "Attempted to read or write protected memory. This is often an indication that other memory is corrupt."

    So, it looks like you need to [1] smartly use the LCD for the current bitmap format (note that they can be larger than 32bpp, even up to 256bpp) for your p interval or [2] just use a constant 2 since all formats will be multiples of 2.

    HTH

  • CORRECTION:

    I goofed. I overlooked that there actually is a 1bpp format, as I believe virtro mentioned. In that case, you will have to p advance by 1.

    TIP: You can interrogate the pixel format via BitmapData.PixelFormat or Bitmap.PixelFormat. So, I thought of adding some code to choose an LCD based on the actual pixel format for the given image.

    Cheers

  • Here is what I came up with:

    [code]
    /// <summary>
    /// Gets whether or not a given Bitmap is blank.
    /// </summary>
    /// <param name="bitmap">The instance of the Bitmap for this method extension.</param>
    /// <returns>Returns true if the given Bitmap is blank; otherwise returns false.</returns>
    public static bool IsBlank(this Bitmap bitmap) {
        double stdDev = GetStdDev(bitmap);
        int tolerance = 100000;
        return stdDev < tolerance;
    }

    /// <summary>
    /// Gets the bits per pixel (bpp) for the given <see cref="Bitmap"/>.
    /// </summary>
    /// <param name="bitmap">The instance of the <see cref="Bitmap"/> for this method extension.</param>
    /// <returns>Returns a <see cref="byte"/> representing the bpp for the <see cref="Bitmap"/>.</returns>
    internal static byte GetBitsPerPixel(this Bitmap bitmap) {
        byte bpp = 0x1;

        // NOTE: the literals below should be decimal (24, not 0x24); see the correction that follows this comment.
        //return Regex.Match(Regex.Match(bitmap.PixelFormat.ToString(), @"\dbpp").Value, @"\d+").Value;
        switch (bitmap.PixelFormat) {
            case PixelFormat.Format1bppIndexed:
                bpp = 0x1;
                break;
            case PixelFormat.Format4bppIndexed:
                bpp = 0x4;
                break;
            case PixelFormat.Format8bppIndexed:
                bpp = 0x8;
                break;
            case PixelFormat.Format16bppArgb1555:
            case PixelFormat.Format16bppGrayScale:
            case PixelFormat.Format16bppRgb555:
            case PixelFormat.Format16bppRgb565:
                bpp = 0x16;
                break;
            case PixelFormat.Format24bppRgb:
                bpp = 0x24;
                break;
            case PixelFormat.Canonical:
            case PixelFormat.Format32bppArgb:
            case PixelFormat.Format32bppPArgb:
            case PixelFormat.Format32bppRgb:
                bpp = 0x32;
                break;
            case PixelFormat.Format48bppRgb:
                bpp = 0x48;
                break;
            case PixelFormat.Format64bppArgb:
            case PixelFormat.Format64bppPArgb:
                bpp = 0x64;
                break;
        }
        return bpp;
    }

    /// <summary>
    /// Get the standard deviation of pixel values.
    /// </summary>
    /// <param name="bitmap">The instance of the <see cref="Bitmap"/> for this method extension.</param>
    /// <returns>Returns the standard deviation of pixel population of the Bitmap.</returns>
    public static double GetStdDev(this Bitmap bitmap) {
        double total = 0;
        double totalVariance = 0;
        int count = 0;
        double stdDev = 0;

        // First get all the bytes
        BitmapData bmData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.ReadOnly, bitmap.PixelFormat);
        int stride = bmData.Stride;
        IntPtr Scan0 = bmData.Scan0;

        byte bitsPerPixel = GetBitsPerPixel(bitmap);
        byte bytesPerPixel = (byte)(bitsPerPixel / 8);

        unsafe {
            byte* p = (byte*)(void*)Scan0;
            int nOffset = stride - bitmap.Width * bytesPerPixel;
            for (int y = 0; y < bitmap.Height; ++y) {
                for (int x = 0; x < bitmap.Width; ++x) {
                    count++;

                    byte blue = p[0];
                    byte green = p[1];
                    byte red = p[2];

                    int pixelValue = Color.FromArgb(0, red, green, blue).ToArgb();
                    total += pixelValue;
                    double avg = total / count;
                    totalVariance += Math.Pow(pixelValue - avg, 2);
                    stdDev = Math.Sqrt(totalVariance / count);
                    p += bytesPerPixel;
                }
                p += nOffset;
            }
        }
        bitmap.UnlockBits(bmData);

        return stdDev;
    }
    [/code]

    Note: I wrote this in C# 3.0, so these are extension methods; that way they appear as helper methods on the Bitmap type, like so:

    [code]
    using (Bitmap bitmap = new Bitmap(@"someimage.bmp")) {
        byte bpp = bitmap.GetBitsPerPixel();
        double stddev = bitmap.GetStdDev();
        bool isBlank = bitmap.IsBlank();
    }
    [/code]

    In any event, I believe this should work for non-indexed bitmaps at least. I'm not sure about true indexed bitmaps.

    Try it out and see if it works.

    Chinh Do: You still get most of the credit, though! :D Much appreciated.

  • Oops, argh! I should have proofread before posting. The switch case statements should have decimal values, not hex. Or, convert them to the correct hex value (e.g. 24 = 0x18 etc.). Apologies.

  • Chinh Do, no, thank you! I didn't feel like digging in to try to figure out how to do that kind of thing. You did all the grunt work!

  • I believe you have an error in calculating your variance.
    You can compute the variance only after you know your average.
    You should go over the pixels again after you know the average and compute the squared differences.
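
The two-pass calculation described in this comment can be sketched as follows. This is only an illustration on a plain list of pixel values, not code from the post; the names TwoPassSketch and PopulationStdDev are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassSketch
{
    // Textbook population standard deviation: the mean is computed first,
    // then the squared differences from that final mean are averaged.
    public static double PopulationStdDev(IReadOnlyList<int> values)
    {
        if (values.Count == 0) return 0;

        // Pass 1: the mean of all values.
        double sum = 0;
        foreach (int v in values) sum += v;
        double mean = sum / values.Count;

        // Pass 2: sum of squared differences from the mean.
        double squared = 0;
        foreach (int v in values)
        {
            double diff = v - mean;
            squared += diff * diff;
        }

        return Math.Sqrt(squared / values.Count);
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5, the squared differences sum to 32, and the result is the square root of 32 / 8, which is 2.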

  • Roshan: I am not aware of something like this in C++, but that doesn't mean that it doesn't exist. I am sure you can translate the code to C++. Any C++ expert out there want to help us out?

    Steve:

    Thanks for your note and I think you are right. My algorithm does not produce a standard deviation number in the textbook definition. I think I used this modified "running" standard deviation algorithm to allow for this optimization: once the "running" standard deviation exceeds a certain threshold, I can short circuit the process and exit the loop. I do remember using this optimization but I guess I took it out at the end to keep the published code simple.

    Chinh
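
The short-circuit optimization mentioned in this reply might look roughly like the sketch below. It is a guess at the idea, since the original optimized code was not published here; the names EarlyExitSketch and ExceedsThreshold are hypothetical, and it works on a plain sequence of pixel values.

```csharp
using System;
using System.Collections.Generic;

public static class EarlyExitSketch
{
    // Returns true as soon as the "running" deviation reaches the threshold,
    // without looking at the remaining pixel values.
    public static bool ExceedsThreshold(IEnumerable<int> pixelValues, double threshold)
    {
        double total = 0, totalVariance = 0;
        int count = 0;

        foreach (int pixelValue in pixelValues)
        {
            count++;
            total += pixelValue;
            double avg = total / count; // average of the values seen so far
            double diff = pixelValue - avg;
            totalVariance += diff * diff;

            if (Math.Sqrt(totalVariance / count) >= threshold)
                return true; // not blank: stop scanning
        }

        return false; // the running deviation stayed below the threshold throughout
    }
}
```

The running value can rise above the threshold partway through and settle below it later, so an early exit may classify an image differently from comparing only the final value.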

  • Thanks for such a nice piece of code.

    I am new to C# and need your help, Chinh Do or Timex.
    Some scanned images (pages blank on both sides) are detected as blank, while others are not, even though they are blank.
    Can you please help with it?
    If possible, can you please give me some explanation of the code that you or Timex have given?
