List<struct>.Contains(struct) - death by GC

My program slowed when it shouldn't have, and I traced it to huge amounts of garbage caused by using the List<struct>.Contains(MyItem) function.  Using the CLR profiler, I found out that all of the garbage is two classes: 1) A bunch of boxed MyItems (it's a struct) and 2) Some other class which looks like the function is using reflection (sorry, I didn't write down the name).

I think List<struct>.Contains(MyStruct) shouldn't be putting any garbage on the heap.  Maybe the CLR team could investigate why this kind of thing would happen and make modifications to the generic classes, or the implementation of generics to eliminate it.

In any case,  I replaced that line with my own implementation (simple for loop), and problem went away.

-Jeremy


Answer this question

List<struct>.Contains(struct) - death by GC

  • GreenStone90

    The problem is that ignoring IEquatable<T>, the only method we have to compare two objects is Object::Equals(Object).  Note that the parameter's type is Object.  Thus, to compare two value types, we must box.  Perhaps in a future world, an all-knowing JIT compiler could inline enough virtual method calls to make this not happen.  But I wouldn't hold your breath. 

    In the mean time, we added IEquatable<T> so we can successfully bind to T::Equals(T).  With a relatively small amount of special logic in the JIT, we can execute that without boxing.  That's what we implemented.

    Brian Grunkemeyer

    MS CLR Base Class Library team


  • mgraul

    The generic List<T> itself is a class, even though the type made generic isn't. Therefore, it gets put on the heap for GC later. Remember that generics box and unbox value types.

    What MSDN tells us about List:

    If a reference type is used for type T of the List class, the behavior of the two classes is identical. However, if a value type is used for type T, you need to consider implementation and boxing issues.

    If a value type is used for type T, the compiler generates an implementation of the List class specifically for that value type. That means a list element of a List object does not have to be boxed before the element can be used, and after about 500 list elements are created the memory saved not boxing list elements is greater than the memory used to generate the class implementation.

    Make certain the value type used for type T implements the IEquatable generic interface. If not, methods such as Contains must call the Object.Equals(Object) method, which boxes the affected list element. If the value type implements the IComparable interface and you own the source code, also implement the IComparable generic interface to prevent the BinarySearch and Sort methods from boxing list elements. If you do not own the source code, pass an IComparer object to the BinarySearch and Sort methods.



  • Claudio Aguilar

    > Make certain the value type used for type T implements the IEquatable generic interface

    Ok, I didn't implement that interface.

    > If a value type is used for type T, the compiler generates an implementation of the List class specifically for that value type.

    My point is that since the compiler is generating a whole new class, it should have had all the information necessary to create a function that didn't box or unbox anything.  Even if I had implemented IEQuatable, the compiler would have called a slow function (through an interface) instead of just doing the compare directly.  Not good for performance critical code.

    My preference would be for generics to be a little more like C++ templates (at least for structs, and the comparison operators).

    -Jeremy



  • List<struct>.Contains(struct) - death by GC