The IEnumerable interface is used throughout C#, usually in its generic form IEnumerable<out T>.
An IEnumerable instance is not just an abstraction of a list; it is more like a view on a data
source and can even be an infinite enumeration of values. This article goes into the details of
the interface and discusses the peculiarities it brings.
What is the IEnumerable interface?
The IEnumerable interface has just a single method called GetEnumerator() that returns an
IEnumerator instance. The IEnumerator instance is the object that manages the navigation through
the enumeration and provides a MoveNext() method and a Current property (it also has a Reset()
method, but it is rarely used).
The generic IEnumerable<out T> is covariant (marked by the out keyword on the generic type
parameter). This means that an instance with a more derived type argument can be assigned to a
variable with a less derived type argument:
IEnumerable<string> listOfStrings = new List<string>();
IEnumerable<string> enumerableOfStrings = listOfStrings;
// not possible without the out keyword
IEnumerable<object> enumerableOfObjects = listOfStrings;
The Magic Behind the Foreach Statement
To iterate an enumeration, we can use the foreach statement:
List<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
foreach (int number in numbers)
{
    DoSomething(number);
}
This foreach statement is syntactic sugar. In the background, the compiler generates a call to
GetEnumerator(), a while loop and a finally block that disposes the enumerator again:
List<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
List<int>.Enumerator enumerator = numbers.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        int current = enumerator.Current;
        DoSomething(current);
    }
}
finally
{
    ((IDisposable)enumerator).Dispose();
}
Managing the enumerator manually is possible and sometimes required (see my post about
Enumerable.Index).
The foreach statement also works with objects that do not implement IEnumerable, as long as they
have a public GetEnumerator() method that returns an object with a MoveNext() method and a
Current property, such as an IEnumerator instance. The IEnumerable interface is not required,
but it is recommended for clarity.
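The pattern-based behavior of foreach can be sketched with a small example. The Countdown,
CountdownEnumerator and CountdownDemo types below are hypothetical and exist only for
illustration; none of them implements IEnumerable or IEnumerator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: it does NOT implement IEnumerable, but it exposes a
// public GetEnumerator() method, which is what foreach looks for.
public sealed class Countdown
{
    private readonly int _start;

    public Countdown(int start) => _start = start;

    public CountdownEnumerator GetEnumerator() => new CountdownEnumerator(_start);
}

// The returned type only needs a MoveNext() method and a Current property.
public sealed class CountdownEnumerator
{
    private int _next;
    private int _current;

    public CountdownEnumerator(int start) => _next = start;

    public int Current => _current;

    public bool MoveNext()
    {
        if (_next <= 0)
        {
            return false;
        }

        _current = _next;
        _next--;
        return true;
    }
}

public static class CountdownDemo
{
    // Collects the values that foreach produces for the given start value.
    public static List<int> Collect(int start)
    {
        List<int> result = new List<int>();
        foreach (int value in new Countdown(start))
        {
            result.Add(value);
        }
        return result;
    }
}
```

Calling CountdownDemo.Collect(3) yields the values 3, 2 and 1. Since Countdown is not an
IEnumerable<int>, it cannot be passed to LINQ methods, which is one reason why implementing the
interface is still the better default.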
How to Implement IEnumerable?
Collection types usually implement IEnumerable to allow iterating their items. List<T> returns
a custom IEnumerator implementation. The implementation has a reference to the List<T> instance
and, since the list updates a version number with every change, the enumerator throws an
InvalidOperationException if the list changed while iterating it. Implementing a custom
IEnumerator class is not required for regular use cases. List<T> uses a custom IEnumerator
implementation because it has special requirements and focuses heavily on performance.
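The version check of the List<T> enumerator can be observed with a small sketch. The
VersionCheckDemo class and its method name are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class VersionCheckDemo
{
    // Returns true if modifying the list inside foreach triggered the
    // enumerator's version check.
    public static bool ModifyWhileIterating()
    {
        List<int> numbers = new List<int>() { 1, 2, 3 };
        try
        {
            foreach (int number in numbers)
            {
                if (number == 1)
                {
                    numbers.Add(4); // changes the list while it is being iterated
                }
            }
        }
        catch (InvalidOperationException)
        {
            // thrown by the next MoveNext() call after the modification
            return true;
        }
        return false;
    }
}
```

The exception is not thrown by Add() itself but by the enumerator when foreach asks for the next
item.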
The simpler option is to use the yield keyword. A method that returns either IEnumerable or
IEnumerator can contain the yield keyword:
public static IEnumerable<int> GetEvenNumbers(List<int>? numbers)
{
    if (numbers == null)
    {
        yield break;
    }
    foreach (int number in numbers)
    {
        if (int.IsEvenInteger(number))
        {
            yield return number;
        }
    }
}
The yield break; statement stops the iteration and exits the method (basically like return;).
The yield return statement returns a single value for the iteration. For methods that contain a
yield keyword, an IEnumerator implementation is generated in the background.
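Because yield also works in methods that return IEnumerator, a custom collection can implement
IEnumerable<T> without a hand-written enumerator class. The following is a minimal sketch; the
Playlist type is hypothetical and only serves as an example.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type used only for illustration.
public sealed class Playlist : IEnumerable<string>
{
    private readonly List<string> _titles = new List<string>();

    public void Add(string title) => _titles.Add(title);

    // The compiler generates the IEnumerator<string> implementation
    // for this method because it contains yield return.
    public IEnumerator<string> GetEnumerator()
    {
        foreach (string title in _titles)
        {
            yield return title;
        }
    }

    // The non-generic interface member forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

An instance can then be used with foreach and with LINQ methods, for example
string.Join(", ", playlist).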
The Problem With Deferred Execution
The foreach statement hides a call to GetEnumerator(). Until this method is called and the
enumerator is advanced, the code behind an enumeration is not executed at all, and every
additional enumeration executes it again:
private static T IncreaseCounterAndReturn<T>(T value, ref int counter)
{
    counter++;
    return value;
}
public static void EnumerableCountTest()
{
    int counter = 0;
    List<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
    IEnumerable<int> enumerable = numbers.Select(x => IncreaseCounterAndReturn(x, ref counter));
    Console.WriteLine(counter); // prints 0
    List<int> list1 = enumerable.ToList();
    Console.WriteLine(counter); // prints 5
    List<int> list2 = enumerable.ToList();
    Console.WriteLine(counter); // prints 10
    numbers.Remove(4);
    numbers.Remove(5);
    List<int> list3 = enumerable.ToList();
    Console.WriteLine(counter); // prints 13
}
In the example above, we can see that just calling Select() does not execute the given delegate;
that only happens when a method such as ToList() calls GetEnumerator() and iterates the result.
Calling ToList() multiple times also executes the delegate passed to Select() multiple times.
Depending on the use case, this can lead to performance problems (the same work is done
repeatedly with the same result) or to different results (something could change between the
executions). A common mistake is to check whether the enumeration contains any items with Any()
and, if so, to iterate it again. This does not reduce the work but increases it.
When an IEnumerable instance is used multiple times, it can be stored in a list once and the
list used instead. This allocates memory but avoids executing the enumeration twice. To detect
such cases, the analyzer rule CA1851 (Possible multiple enumerations of IEnumerable collection)
in Microsoft.CodeAnalysis.CSharp.NetAnalyzers can be enabled.
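The Any() mistake and the list-based alternative can be compared in a short sketch. The
MultipleEnumerationDemo class, its ProducedItems counter and the method names are assumptions
made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical counter that records how often the source produces an item.
    public static int ProducedItems;

    public static IEnumerable<int> GetNumbers()
    {
        for (int i = 1; i <= 3; i++)
        {
            ProducedItems++;
            yield return i;
        }
    }

    // Enumerates twice: Any() starts a first enumeration, Sum() a second one.
    public static int SumEnumeratingTwice()
    {
        IEnumerable<int> numbers = GetNumbers();
        return numbers.Any() ? numbers.Sum() : 0;
    }

    // Enumerates once: the items are stored in a list and the list is reused.
    public static int SumEnumeratingOnce()
    {
        List<int> numbers = GetNumbers().ToList();
        return numbers.Count > 0 ? numbers.Sum() : 0;
    }
}
```

Both methods return 6, but the first one produces four items (one for Any() and three for
Sum()), while the second one produces only three.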
Checking Arguments in Extension Methods
Well-written public methods should validate their arguments. This is also the case for methods
with the yield keyword. But since the method body only runs once the enumeration is iterated,
the argument checks are deferred as well.
In the following example, the ArgumentNullException is thrown too late. We get the exception on
the call to ToList() and not where the actual method was called:
public static IEnumerable<T> WhereIsInSearchValues<T>(this IEnumerable<T> items, List<T> searchValues)
{
    ArgumentNullException.ThrowIfNull(items);
    ArgumentNullException.ThrowIfNull(searchValues);
    foreach (T item in items)
    {
        if (searchValues.Contains(item))
        {
            yield return item;
        }
    }
}
public static void ArgumentChecksTest()
{
    List<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
    // no ArgumentNullException
    IEnumerable<int> result = numbers.WhereIsInSearchValues(null!);
    // ArgumentNullException
    List<int> resultList = result.ToList();
}
There is an easy way to improve the WhereIsInSearchValues() method. We can split the argument
checks and the iteration into two separate methods. The iteration method can be named like the
actual method, but with an Iterator suffix:
public static IEnumerable<T> WhereIsInSearchValues<T>(this IEnumerable<T> items, List<T> searchValues)
{
    ArgumentNullException.ThrowIfNull(items);
    ArgumentNullException.ThrowIfNull(searchValues);
    return WhereIsInSearchValuesIterator(items, searchValues);
}
private static IEnumerable<T> WhereIsInSearchValuesIterator<T>(this IEnumerable<T> items, List<T> searchValues)
{
    foreach (T item in items)
    {
        if (searchValues.Contains(item))
        {
            yield return item;
        }
    }
}
With this fix, we get the ArgumentNullException at the expected code line.
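As a variation of the same idea, the iterator part can also be written as a local function
inside the public method. This is only a sketch of an alternative layout, not the approach used
above; the SearchExtensions class name and the Iterator() local function name are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class SearchExtensions
{
    public static IEnumerable<T> WhereIsInSearchValues<T>(this IEnumerable<T> items, List<T> searchValues)
    {
        // Runs immediately, because the outer method itself contains no yield.
        ArgumentNullException.ThrowIfNull(items);
        ArgumentNullException.ThrowIfNull(searchValues);
        return Iterator();

        // Local iterator function; its body is deferred until enumeration.
        IEnumerable<T> Iterator()
        {
            foreach (T item in items)
            {
                if (searchValues.Contains(item))
                {
                    yield return item;
                }
            }
        }
    }
}
```

The behavior for the caller is the same as with two separate methods: the argument checks run
on the call itself, while the filtering still runs only when the result is iterated.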
Infinite Enumerations
It is possible to create infinite enumerations and there are actual use cases for it. The
following GetRandomNumbers() method is an endless loop of random numbers:
public static IEnumerable<int> GetRandomNumbers(int minValue, int maxValue)
{
    Random random = new Random();
    while (true)
    {
        yield return random.Next(minValue, maxValue);
    }
}
While it looks like an endless loop, it actually has an end when it is used the correct way:
foreach (int number in GetRandomNumbers(1, 10).Take(10))
{
    Console.WriteLine(number);
}
In this case, we use the Take() method to take only 10 items and then stop.
Although it is possible, endless enumerations are rarely a useful solution and can easily lead
to unwanted endless loops since most methods won’t handle these cases. Or what do you think
ToList() would do?
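Deferred operators such as Where() can be chained on an endless enumeration as long as something
like Take() limits it before it is materialized. The following is a minimal sketch; the
InfiniteDemo class and its methods are hypothetical, and the counter is a long that is treated
as effectively endless.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class InfiniteDemo
{
    // Hypothetical endless sequence: 0, 1, 2, 3, ...
    public static IEnumerable<long> GetNaturalNumbers()
    {
        long value = 0;
        while (true)
        {
            yield return value;
            value++;
        }
    }

    // Where() is deferred, so only as many numbers are generated as
    // Take() needs to deliver the requested count.
    public static List<long> FirstMultiplesOfSeven(int count)
    {
        return GetNaturalNumbers()
            .Where(n => n % 7 == 0)
            .Take(count)
            .ToList();
    }
}
```

FirstMultiplesOfSeven(3) returns 0, 7 and 14. Swapping the order so that ToList() runs before
Take() would try to materialize the whole sequence and never finish.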
Conclusion
The IEnumerable interface is an important part of C#. Its usage is integrated into the language
with the foreach and yield keywords. Using IEnumerable can lead to unexpected behavior, like
multiple executions. Therefore, it is important to know how foreach and yield actually work.