One: Background

1. Tell a story

Does a LINQ query result allocate new memory? If so, is it a deep copy of the elements in the original collection, or just a copy of their references?

I think this is a good question. Most people who are new to C# have wondered about it at some point, and even developers with three or four years of experience may not be clear on the answer. The result is hesitant code, plus a nagging worry about whether memory usage will balloon. In this article I will use WinDbg to help a friend get to the bottom of it.

Two: Search for answers

1. A small case

What does LINQ do with a collection of reference types? To find out, I first set up a sample collection with the following code:


    class Program
    {
        static void Main(string[] args)
        {
            var personList = new List<Person>
            {
                new Person { Name = "jack", Age = 20 },
                new Person { Name = "elen", Age = 25 },
                new Person { Name = "john", Age = 22 }
            };

            // Filter, then materialize the result into a new List<Person>
            var query = personList.Where(m => m.Age > 20).ToList();

            Console.WriteLine($"query.count={query.Count}");
            Console.ReadLine();
        }
    }

    class Person
    {
        public string Name { get; set; }

        public int Age { get; set; }
    }

2. Is it really a deep copy?

With WinDbg this is easy to check. If the query made a deep copy, there would be 5 Person objects on the managed heap after it runs (3 originals plus 2 copies), right? Run !dumpheap -stat -type Person against the managed heap to verify.


0:000> !dumpheap -stat -type Person
Statistics:
              MT    Count    TotalSize Class Name
00007ff7f27c3528        1           64 System.Func`2[[ConsoleApp5.Person, ConsoleApp5],[System.Boolean, System.Private.CoreLib]]
00007ff7f27c2b60        2           64 System.Collections.Generic.List`1[[ConsoleApp5.Person, ConsoleApp5]]
00007ff7f27c9878        1           72 System.Linq.Enumerable+WhereListIterator`1[[ConsoleApp5.Person, ConsoleApp5]]
00007ff7f27c7a10        3          136 ConsoleApp5.Person[]
00007ff7f27c2ad0        3           96 ConsoleApp5.Person


The last line of the output shows ConsoleApp5.Person with Count=3, so no deep copy was made. If you are not convinced, change the Age of a Person in query and see whether the original personList collection reflects the change.


        static void Main(string[] args)
        {
            var personList = new List<Person>
            {
                new Person { Name = "jack", Age = 20 },
                new Person { Name = "elen", Age = 25 },
                new Person { Name = "john", Age = 22 }
            };

            var query = personList.Where(m => m.Age > 20).ToList();

            // query[0] is "elen" (Age=25); change her Age to 100
            query[0].Age = 100;

            // "elen" sits at index 1 in the original list
            Console.WriteLine($"query[0].Age={query[0].Age}, personList[1].Age={personList[1].Age}");

            Console.ReadLine();
        }


Both values come out as 100, which further confirms that there is no deep copy.
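
The same point can be checked without a debugger. The sketch below is not from the original analysis; it uses object.ReferenceEquals to test whether the elements of query are the very same objects as the ones in personList. The names ReferenceCheck and SharesInstances are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class ReferenceCheck
{
    // Returns true when the filtered list is a new list object whose
    // elements are the same Person instances as in the source list.
    public static bool SharesInstances()
    {
        var personList = new List<Person>
        {
            new Person { Name = "jack", Age = 20 },
            new Person { Name = "elen", Age = 25 },
            new Person { Name = "john", Age = 22 }
        };

        var query = personList.Where(m => m.Age > 20).ToList();

        // The list itself is new, but each element is an original object.
        return !ReferenceEquals(query, personList)
            && ReferenceEquals(query[0], personList[1])
            && ReferenceEquals(query[1], personList[2]);
    }

    public static void Main()
    {
        Console.WriteLine($"shares instances: {SharesInstances()}");
    }
}
```

If the reasoning above holds, this reports True: ToList() allocates a new List<Person>, but only the references are copied into it.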

3. Is it really a reference copy?

The most direct way to check whether query holds copied references is to look at the array contents stored on the managed heap. With WinDbg, first find the query variable on the thread stack with !clrstack -l, then dump the List with !do and its backing array with !da.


0:000> !clrstack -l
OS Thread Id: 0x809c (0)
        Child SP               IP Call Site
000000E143D7E9B0 00007ff7f26f18be ConsoleApp5.Program.Main(System.String[]) [E:\net5\ConsoleApp5\ConsoleApp5\Program.cs @ 20]
    LOCALS:
        0x000000E143D7EA38 = 0x00000218266aab70
        0x000000E143D7EA30 = 0x00000218266aad98

0:000> !do 0x00000218266aad98
Name:        System.Collections.Generic.List`1[[ConsoleApp5.Person, ConsoleApp5]]
MethodTable: 00007ff7f27b2b60
EEClass:     00007ff7f27abad0
Size:        32(0x20) bytes
File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.1.9\System.Private.CoreLib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
0000000000000000  4001c35        8              SZARRAY  0 instance 00000218266aadb8 _items
00007ff7f26bb1f0  4001c36       10         System.Int32  1 instance                2 _size
00007ff7f26bb1f0  4001c37       14         System.Int32  1 instance                2 _version
0000000000000000  4001c38        8              SZARRAY  0   static dynamic statics NYI                 s_emptyArray

0:000> !da 00000218266aadb8
Name:        ConsoleApp5.Person[]
MethodTable: 00007ff7f27b7a10
EEClass:     00007ff7f26b6580
Size:        56(0x38) bytes
Array:       Rank 1, Number of elements 4, Type CLASS
Element Methodtable: 00007ff7f27b2ad0
[0] 00000218266aac00
[1] 00000218266aac20
[2] null
[3] null


The first two slots of the array hold memory addresses, and the last two are null. Why are there four slots? Because query is a List, and a List is backed by an array. When the first element is added to an empty List, that array is allocated with four slots:


    public class List<T>
    {
        private void EnsureCapacity(int min)
        {
            if (_items.Length < min)
            {
                // An empty list grows to 4 slots; otherwise the capacity doubles
                int num = (_items.Length == 0) ? 4 : (_items.Length * 2);
                if ((uint)num > 2146435071u)
                {
                    num = 2146435071;
                }
                if (num < min)
                {
                    num = min;
                }
                Capacity = num;
            }
        }
    }
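
The growth rule can also be observed from ordinary code through the public Capacity property. This is a minimal sketch, assuming the runtime in use follows the 4-then-double policy shown in the excerpt above (the policy is an implementation detail, not a documented guarantee); CapacityDemo and CapacitySteps are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds `count` items to an empty list and records every distinct
    // Capacity value observed along the way.
    public static List<int> CapacitySteps(int count)
    {
        var list = new List<int>();
        var steps = new List<int> { list.Capacity };   // capacity before any Add

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != steps[steps.Count - 1])
            {
                steps.Add(list.Capacity);   // the backing array was reallocated
            }
        }

        return steps;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", CapacitySteps(10)));
    }
}
```

A list that ends up with two elements, like query here, never grows past its first allocation, which is why the dump shows four slots with two of them still null.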

To see what the first two array elements, 00000218266aac00 and 00000218266aac20, point to, print them with !do.


0:000> !do 00000218266aac00
Name:        ConsoleApp5.Person
MethodTable: 00007ff7f27b2ad0
EEClass:     00007ff7f27c2a00
Size:        32(0x20) bytes
File:        E:\net5\ConsoleApp5\ConsoleApp5\bin\Debug\netcoreapp3.1\ConsoleApp5.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff7f2771e18  4000001        8        System.String  0 instance 00000218266aab30 <Name>k__BackingField
00007ff7f26bb1f0  4000002       10         System.Int32  1 instance               25 <Age>k__BackingField
0:000> !do 00000218266aac20
Name:        ConsoleApp5.Person
MethodTable: 00007ff7f27b2ad0
EEClass:     00007ff7f27c2a00
Size:        32(0x20) bytes
File:        E:\net5\ConsoleApp5\ConsoleApp5\bin\Debug\netcoreapp3.1\ConsoleApp5.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff7f2771e18  4000001        8        System.String  0 instance 00000218266aab50 <Name>k__BackingField
00007ff7f26bb1f0  4000002       10         System.Int32  1 instance               22 <Age>k__BackingField


At this point I think my friend's question is fully answered. But since we have covered reference types in a collection, what about value types?
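
For contrast, and not part of the original analysis: if independent copies of the matching elements are really what you want, they have to be created explicitly. A minimal sketch, assuming a Select projection is enough for a flat class like Person (ExplicitCopy and its method name are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class ExplicitCopy
{
    // Filters, copies each matching Person into a new object, mutates the
    // first copy, and returns the Age of the corresponding original.
    public static int OriginalAgeAfterMutatingCopy()
    {
        var personList = new List<Person>
        {
            new Person { Name = "jack", Age = 20 },
            new Person { Name = "elen", Age = 25 },
            new Person { Name = "john", Age = 22 }
        };

        var copies = personList
            .Where(m => m.Age > 20)
            .Select(m => new Person { Name = m.Name, Age = m.Age })   // new object per element
            .ToList();

        copies[0].Age = 100;          // changes only the copy of "elen"
        return personList[1].Age;     // the original "elen"
    }

    public static void Main()
    {
        Console.WriteLine($"original Age: {OriginalAgeAfterMutatingCopy()}");
    }
}
```

Here the original keeps its Age of 25, at the cost of one extra Person allocation per matching element.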

Three: How are value types in a collection copied?

1. Verify with WinDbg

With the approach above, this is easy to verify. Start with the test code:


        static void Main(string[] args)
        {
            var list = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

            var query = list.Where(m => m > 5).ToList();

            Console.ReadLine();
        }


Then print the backing array of each list:


// list
0:000> !DumpArray /d 0000019687c8aba8
Name:        System.Int32[]
MethodTable: 00007ff7f279f090
EEClass:     00007ff7f279f010
Size:        88(0x58) bytes
Array:       Rank 1, Number of elements 16, Type Int32
Element Methodtable: 00007ff7f26cb1f0
[0] 0000019687c8abb8
[1] 0000019687c8abbc
[2] 0000019687c8abc0
[3] 0000019687c8abc4
[4] 0000019687c8abc8
[5] 0000019687c8abcc
[6] 0000019687c8abd0
[7] 0000019687c8abd4
[8] 0000019687c8abd8
[9] 0000019687c8abdc
[10] 0000019687c8abe0
[11] 0000019687c8abe4
[12] 0000019687c8abe8
[13] 0000019687c8abec
[14] 0000019687c8abf0
[15] 0000019687c8abf4

// query
0:000> !DumpArray /d 0000019687c8ae68
Name:        System.Int32[]
MethodTable: 00007ff7f279f090
EEClass:     00007ff7f279f010
Size:        56(0x38) bytes
Array:       Rank 1, Number of elements 8, Type Int32
Element Methodtable: 00007ff7f26cb1f0
[0] 0000019687c8ae78
[1] 0000019687c8ae7c
[2] 0000019687c8ae80
[3] 0000019687c8ae84
[4] 0000019687c8ae88
[5] 0000019687c8ae8c
[6] 0000019687c8ae90
[7] 0000019687c8ae94


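Comparing the two array dumps for list and query closely reveals two interesting things:

  • As with reference types, each element is shown as an address. Here, though, consecutive addresses are 4 bytes apart: they are the locations of the int slots inside the array, not references to separate objects.

  • Every slot in the value type array is listed, with no null entries like those in the reference type array.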

Take the address of element 0 in each array and inspect the memory with the dp command:


//list
0:000> dp 0000019687c8abb8
00000196`87c8abb8  00000002`00000001 00000004`00000003
00000196`87c8abc8  00000006`00000005 00000008`00000007
00000196`87c8abd8  0000000a`00000009 00000000`00000000

//query
0:000> dp 0000019687c8ae78
00000196`87c8ae78  00000007`00000006 00000009`00000008
00000196`87c8ae88  00000000`0000000a 00000000`00000000

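
A note on reading the dp output above: dp prints pointer-sized (8-byte) values, so in this 64-bit process each printed value covers two adjacent 4-byte ints. The sketch below is an illustration only, assuming a little-endian machine as in the dump; it reinterprets two neighbouring ints as one 64-bit value and formats it the way WinDbg does. DpPacking and PackAsQword are hypothetical names.

```csharp
using System;

public static class DpPacking
{
    // Reinterprets two adjacent ints as a single 64-bit value, which is
    // how dp presents a pair of neighbouring int slots.
    public static ulong PackAsQword(int first, int second)
    {
        var ints = new[] { first, second };
        var bytes = new byte[sizeof(int) * 2];

        // Copy the raw bytes of both ints, in memory order.
        Buffer.BlockCopy(ints, 0, bytes, 0, bytes.Length);

        return BitConverter.ToUInt64(bytes, 0);
    }

    public static void Main()
    {
        ulong qword = PackAsQword(1, 2);
        ulong high = qword >> 32;
        ulong low = qword & 0xFFFFFFFF;

        // WinDbg style: high 32 bits, backtick, low 32 bits.
        Console.WriteLine($"{high:x8}`{low:x8}");
    }
}
```

On a little-endian machine the first int lands in the low half, so the ints 1 and 2 appear as 00000002`00000001, matching the first value in the list dump.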

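See? The addresses above hold the numeric values themselves: dp prints 8 bytes per value, so each value packs two ints (00000002`00000001 is the ints 1 and 2). The values were copied into query's own array, so this is undoubtedly a deep copy.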
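
The same conclusion can be checked from code. A minimal sketch, not from the original analysis: overwrite an element of query and see whether list changes (ValueCopy and SourceAfterChangingQuery are made-up names).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueCopy
{
    // Overwrites the first element of the filtered list and returns the
    // element of the source list that it was copied from.
    public static int SourceAfterChangingQuery()
    {
        var list = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var query = list.Where(m => m > 5).ToList();

        query[0] = 600;     // query[0] was 6, copied from list[5]
        return list[5];     // the slot in the source list
    }

    public static void Main()
    {
        Console.WriteLine($"list[5]={SourceAfterChangingQuery()}");
    }
}
```

Because each int lives directly in its array slot, writing to query's array cannot touch list's array, and list[5] stays 6.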

Four: Summary

To sum up the analysis above: for a collection of reference types the query result copies the references, and for a collection of value types it copies the values (a deep copy). Things learned by rote are easy to forget; only hands-on verification really makes them stick! 🤭 🤭 🤭

For more in-depth content, see my GitHub: dotnetfly