Chinh Do

Splitting a Generic List<T> into Multiple Chunks

15th May 2008

“Chunking” is the technique of breaking a large amount of work into smaller, more manageable parts. Here are a few reasons I can think of why you would want to chunk, especially in a batch process where you have to process a large number of items:

  • Manage/minimize peak memory requirement.
  • After a failure, the process can resume from the chunk that failed, instead of starting all the way from the beginning.
  • Take advantage of multiple processors/cores (by having multiple threads, each processing a small chunk).
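
To make the third point a little more concrete, here is a minimal sketch of handing chunks to multiple threads. It is an illustration only: the class name ParallelChunkSketch and the method SumInChunks are made up for this example, the work being done (summing integers) is a stand-in for real processing, and it assumes .NET 4 or later for Parallel.ForEach.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelChunkSketch
{
    // Hypothetical example: sums a list by handing fixed-size chunks to worker threads.
    public static long SumInChunks(List<int> items, int chunkSize)
    {
        if (items == null) throw new ArgumentNullException("items");
        if (chunkSize <= 0) throw new ArgumentException("chunkSize must be greater than 0.");

        // Cut the list into consecutive ranges of at most chunkSize items.
        var chunks = new List<List<int>>();
        for (int index = 0; index < items.Count; index += chunkSize)
        {
            chunks.Add(items.GetRange(index, Math.Min(chunkSize, items.Count - index)));
        }

        long total = 0;

        // Each chunk is summed on its own; Interlocked.Add merges the partial sums safely.
        Parallel.ForEach(chunks, chunk =>
        {
            long partial = 0;
            foreach (int value in chunk)
            {
                partial += value;
            }
            Interlocked.Add(ref total, partial);
        });

        return total;
    }
}
```

Because each chunk only touches its own items, the only shared state is the running total, which is why a single Interlocked.Add per chunk is enough here.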

Here’s a helper method to quickly split a List<T> into chunks:

/// <summary>
/// Splits a <see cref="List{T}"/> into multiple chunks.
/// </summary>
/// <typeparam name="T">The type of the items in the list.</typeparam>
/// <param name="list">The list to be chunked.</param>
/// <param name="chunkSize">The size of each chunk.</param>
/// <returns>A list of chunks. The last chunk may hold fewer than <paramref name="chunkSize"/> items.</returns>
public static List<List<T>> SplitIntoChunks<T>(List<T> list, int chunkSize)
{
    if (list == null)
    {
        throw new ArgumentNullException("list");
    }

    if (chunkSize <= 0)
    {
        throw new ArgumentException("chunkSize must be greater than 0.");
    }

    List<List<T>> retVal = new List<List<T>>();
    int index = 0;
    while (index < list.Count)
    {
        // Take a full chunk, or whatever is left over for the last chunk.
        int count = list.Count - index > chunkSize ? chunkSize : list.Count - index;

        // GetRange creates a shallow copy, so the original list is left untouched.
        retVal.Add(list.GetRange(index, count));

        index += chunkSize;
    }

    return retVal;
}
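
As a quick illustration of how the helper behaves at the edges, here is a small sketch. The class name SplitDemo and the Describe method are made up for this example, and the helper is repeated in condensed form (using Math.Min instead of the conditional expression) only so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitDemo
{
    // Condensed copy of SplitIntoChunks, repeated so this snippet is self-contained.
    public static List<List<T>> SplitIntoChunks<T>(List<T> list, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentException("chunkSize must be greater than 0.");

        var retVal = new List<List<T>>();
        for (int index = 0; index < list.Count; index += chunkSize)
        {
            retVal.Add(list.GetRange(index, Math.Min(chunkSize, list.Count - index)));
        }
        return retVal;
    }

    // Hypothetical helper: chunks the numbers 1..itemCount and returns the
    // chunk sizes as comma-separated text.
    public static string Describe(int itemCount, int chunkSize)
    {
        List<int> source = Enumerable.Range(1, itemCount).ToList();
        List<List<int>> chunks = SplitIntoChunks(source, chunkSize);
        return string.Join(",", chunks.Select(c => c.Count.ToString()).ToArray());
    }
}
```

With this sketch, Describe(10, 3) returns "3,3,3,1", Describe(9, 3) returns "3,3,3" (no empty trailing chunk when the count divides evenly), Describe(2, 5) returns "2", and Describe(0, 3) returns an empty string because an empty list produces no chunks.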
 

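If you want to be more memory-efficient at the cost of readability, you can move the items out of the big list as you go, so that the big list and the small chunks do not both need to hold everything at once. The trade-off is extra copying, since each RemoveRange(0, count) call shifts the remaining items to the front, and the big list keeps its allocated capacity until it is trimmed or discarded. Here's the second version:
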
/// <summary>
/// Break a <see cref="List{T}"/> into multiple chunks. The <paramref name="list"/> is cleared out and the items are moved
/// into the returned chunks.
/// </summary>
/// <typeparam name="T">The type of the items in the list.</typeparam>
/// <param name="list">The list to be chunked. It is empty when this method returns.</param>
/// <param name="chunkSize">The size of each chunk.</param>
/// <returns>A list of chunks. The last chunk may hold fewer than <paramref name="chunkSize"/> items.</returns>
public static List<List<T>> BreakIntoChunks<T>(List<T> list, int chunkSize)
{
    if (list == null)
    {
        throw new ArgumentNullException("list");
    }

    if (chunkSize <= 0)
    {
        throw new ArgumentException("chunkSize must be greater than 0.");
    }

    List<List<T>> retVal = new List<List<T>>();

    while (list.Count > 0)
    {
        // Take a full chunk, or whatever is left over for the last chunk.
        int count = list.Count > chunkSize ? chunkSize : list.Count;

        // Copy the first items into a new chunk, then remove them from the source.
        retVal.Add(list.GetRange(0, count));
        list.RemoveRange(0, count);
    }

    return retVal;
}
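
Here is a small sketch of what the caller sees after the second version runs. The class name BreakDemo and the BreakAndTrim wrapper are made up for this example, and the helper is repeated in condensed form only so that the snippet compiles on its own. The TrimExcess call is an addition for illustration: List<T> does not shrink its backing array when items are removed, so trimming (or simply dropping the reference to the list) is what actually gives the memory back.

```csharp
using System;
using System.Collections.Generic;

public static class BreakDemo
{
    // Condensed copy of BreakIntoChunks, repeated so this snippet is self-contained.
    public static List<List<T>> BreakIntoChunks<T>(List<T> list, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentException("chunkSize must be greater than 0.");

        var retVal = new List<List<T>>();
        while (list.Count > 0)
        {
            int count = Math.Min(chunkSize, list.Count);
            retVal.Add(list.GetRange(0, count));
            list.RemoveRange(0, count);
        }
        return retVal;
    }

    // Hypothetical wrapper: drains the source list into chunks, then asks the
    // now-empty source to release its unused capacity.
    public static List<List<T>> BreakAndTrim<T>(List<T> list, int chunkSize)
    {
        List<List<T>> chunks = BreakIntoChunks(list, chunkSize);
        list.TrimExcess();
        return chunks;
    }
}
```

For a list of ten items and a chunk size of 4, this returns three chunks of 4, 4 and 2 items, and the source list has a Count of 0 afterwards. Callers that still need the original list should use the first version instead.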

This entry was posted on Thursday, May 15th, 2008 at 11:07 pm and is filed under Dotnet/.NET - C#, Programming.

There are currently 17 responses to “Splitting a Generic List<T> into Multiple Chunks”

  1. On May 16th, 2008, Dew Drop - May 16, 2008 | Alvin Ashcraft's Morning Dew said:

    […] Splitting a Generic List<T> Into Multiple Chunks (Chinh Do) […]

  2. On June 3rd, 2008, Anjo said:

    (list.Count - index) gives the number of remaining elements including the index element. So it doesn’t have to be greater than chunkSize, rather greater than or equal.

    I think this line:

    int count = list.Count - index > chunkSize ? chunkSize : list.Count - index;

    should be:

    int count = list.Count - index >= chunkSize ? chunkSize : list.Count - index;

    Thanks for the post – helped me with a problem I was having.

  3. On June 3rd, 2008, Chinh Do said:

    Hi Anjo: I checked my code again and it does work as expected (I did have a pretty comprehensive unit test for it). However, you have very sharp eyes and your version also works just fine. When list.Count - index == chunkSize, both branches of the conditional expression give you the same value.

    Thanks for the comment. It was a good brain exercise.

    Chinh

  4. On August 24th, 2010, Aruna said:

    great post. thanks a lot

  5. On November 5th, 2012, Mahdi said:

    /// <summary>
    /// Paged ForEach
    /// </summary>
    public static void ForEachChunk<T>(this List<T> source, int count, Action<List<T>> action)
    {
        // Number of pages, rounded up so that a partial last page is included.
        var nPageCount = (int)Math.Ceiling((double)source.Count / (double)count);
        for (int i = 0; i < nPageCount; i++)
        {
            // Skip the pages already handled and take the next one.
            var l = source.Skip(i * count).Take(count).ToList();
            action(l);
        }
    }

  6. On February 26th, 2013, arsham said:

    Great post! Is it possible to also publish your unit tests for these methods, if it is not too much trouble for you?
    Best
    ~A

  7. On February 26th, 2013, Chinh Do said:

    Hi arsham: Unfortunately, I did this a few years ago and now I am not sure where to find the project/unit tests. If I manage to find them I’ll post. Chinh

  8. On October 8th, 2013, Martin Kirk said:

    Here are 2 alternatives that do various things, at various speeds and in various manners:
    … both are extension methods…

    public static class ext
    {
        // Define other methods and classes here
        /// <summary>
        /// Break a <see cref="List{T}"/> into multiple chunks. The <paramref name="list"/> is cleared out and the items are moved
        /// into the returned chunks.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="list">The list to be chunked.</param>
        /// <param name="chunkSize">The size of each chunk.</param>
        /// <param name="remove">Remove elements from input (reduce memory use)</param>
        /// <returns>A list of chunks.</returns>
        public static IEnumerable<List<T>> BreakIntoChunks<T>(this List<T> list, int chunkSize = 10, bool remove = false)
        {
            // Guard clause reconstructed: the original comment text was truncated here.
            if (chunkSize <= 0)
                throw new ArgumentException("chunkSize must be greater than 0.");

            if (chunkSize >= list.Count)
                yield return list;
            else
            {
                if (remove)
                {
                    while (list.Count > 0)
                    {
                        int count = list.Count > chunkSize ? chunkSize : list.Count;
                        var ret = list.GetRange(0, count);
                        list.RemoveRange(0, count);
                        yield return ret;
                    }
                }
                else
                {
                    int count = 0, max = list.Count - (list.Count % chunkSize);
                    do
                    {
                        yield return list.GetRange(count, chunkSize);
                        count += chunkSize;
                    } while (count < max);

                    if (count < list.Count)
                        yield return list.GetRange(count, (list.Count % chunkSize));
                }
            }
        }

        //Quick and dirty…
        public static List<List<T>> Split<T>(this List<T> source, int size)
        {
            // Guard clause reconstructed: the original comment text was truncated here.
            if (size <= 0)
                throw new ArgumentException("size must be greater than 0.");

            return source
                .Select((x, i) => new { Index = i, Value = x })
                .GroupBy(x => x.Index / size)
                .Select(x => x.Select(v => v.Value).ToList())
                .ToList();
        }
    }

  9. On June 6th, 2014, Dmitry Pavlov said:

    I would suggest using this extension method to chunk the source list into sub-lists of the specified chunk size:

    using System.Collections.Generic;
    using System.Linq;

    /// <summary>
    /// Helper methods for the lists.
    /// </summary>
    public static class ListExtensions
    {
        public static List<List<T>> ChunkBy<T>(this List<T> source, int chunkSize)
        {
            return source
                .Select((x, i) => new { Index = i, Value = x })
                .GroupBy(x => x.Index / chunkSize)
                .Select(x => x.Select(v => v.Value).ToList())
                .ToList();
        }
    }

  10. On June 6th, 2014, Chinh Do said:

    Dmitry: Thanks for sharing.

  11. On November 25th, 2014, galas said:

    Nice solution Thanks

  14. On May 20th, 2016, Praseeda VP said:

    Very useful, Thanks..
