One: Background
1. Tell a story
In C#, using pointers is generally discouraged, but does that mean pointers are unimportant? The FCL itself uses pointers in many places, such as String, Encoding, and FileStream. Here are some examples:
    private unsafe static bool EqualsHelper(string strA, string strB)
    {
        fixed (char* ptr = &strA.m_firstChar)
        {
            fixed (char* ptr3 = &strB.m_firstChar)
            {
                char* ptr2 = ptr;
                char* ptr4 = ptr3;
                while (num >= 12) {... }
                while (num > 0 && *(int*)ptr2 == *(int*)ptr4) {... }
            }
        }
    }

    public unsafe Mutex(bool initiallyOwned, string name, out bool createdNew, MutexSecurity mutexSecurity)
    {
        byte* ptr = stackalloc byte[(int)checked(unchecked((ulong)(uint)securityDescriptorBinaryForm.Length))];
    }

    private unsafe int ReadFileNative(SafeFileHandle handle, byte[] bytes, out int hr)
    {
        fixed (byte* ptr = bytes)
        {
            num = ((!_isAsync)
                ? Win32Native.ReadFile(handle, ptr + offset, count, out numBytesRead, IntPtr.Zero)
                : Win32Native.ReadFile(handle, ptr + offset, count, IntPtr.Zero, overlapped));
        }
    }
The peaceful world you enjoy exists only because someone else is carrying the weight for you. At the very least, whether or not you understand pointers has a real effect on how well you can study low-level source code. Pointers are fairly abstract and test your spatial imagination, and many programmers still don't fully understand them, partly for lack of a what-you-see-is-what-you-get tool. I hope this article helps you avoid some detours.
Two: Windbg helps you understand
Pointers are abstract, but if you use Windbg to look at the memory layout in real time, it becomes easy to see how they work. Let's start with a few simple pointer concepts.
1. The & and * operators
The & operator gets the memory address of a variable, and the * operator gets the value stored at the address that a pointer variable holds.
    unsafe
    {
        int num = 10;
        int* ptr = &num;
        var num2 = *ptr;
        Console.WriteLine(num2);
    }

    0:000> !clrstack -l
    OS Thread Id: 0x41ec (0)
            Child SP               IP Call Site
    0000005b1efff040 00007ffc766208e2 *** WARNING: Unable to verify checksum for ConsoleApp4.exe
    ConsoleApp4.Program.Main(System.String[]) [C:\dream\Csharp\ConsoleApp1\ConsoleApp4\Program.cs @ 25]
        LOCALS:
            0x0000005b1efff084 = 0x000000000000000a
            0x0000005b1efff078 = 0x0000005b1efff084
            0x0000005b1efff074 = 0x000000000000000a
Look closely at the three key-value pairs in LOCALS.
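<1> int* ptr = &num; => 0x0000005b1efff078 = 0x0000005b1efff084
int* ptr declares a pointer variable. Like any other local, it has its own stack address, 0x0000005b1efff078, and the value stored there is 0x0000005b1efff084, which is the stack address of num.
<2> var num2 = *ptr; => 0x0000005b1efff074 = 0x000000000000000a
*ptr reads the value stored at the address held in ptr, 0x0000005b1efff084, which is 10 (0xa). That value is then copied into num2, which lives at 0x0000005b1efff074.
If that is still unclear, picture it as a chain: the slot at 0x0000005b1efff078 (ptr) holds 0x0000005b1efff084, and the slot at 0x0000005b1efff084 (num) holds 10. This is the most important point to understand.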
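The read direction shown above has a write counterpart that is worth seeing once. The following is a minimal sketch, not taken from the original article: the class name AddressOfDemo and the method Run are made up for illustration, and it assumes the project is compiled with unsafe code enabled (the AllowUnsafeBlocks project setting).

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class AddressOfDemo
{
    // Reads a local through a pointer, then writes through the same pointer.
    public static unsafe (int read, int afterWrite) Run()
    {
        int num = 10;
        int* ptr = &num;   // ptr holds the stack address of num
        int read = *ptr;   // dereference: copies the value 10 out of num
        *ptr = 20;         // write through the pointer: num itself changes
        return (read, num);
    }

    public static void Main()
    {
        var (read, afterWrite) = Run();
        Console.WriteLine($"{read} {afterWrite}");   // prints "10 20"
    }
}
```

Because ptr stores the address of num rather than a copy of its value, assigning to *ptr is the same as assigning to num.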
2. The ** operator
** declares what is also called a double pointer, that is, a pointer to a pointer: it holds the address of a pointer variable. In the code below, ptr2 holds the stack address of ptr, and the Windbg output makes the chain visible.
    unsafe
    {
        int num1 = 10;
        int* ptr = &num1;
        int** ptr2 = &ptr;
        var num2 = **ptr2;
    }

    0:000> !clrstack -l
    ConsoleApp4.Program.Main(System.String[]) [C:\dream\Csharp\ConsoleApp1\ConsoleApp4\Program.cs @ 26]
        LOCALS:
            0x000000305f5fef24 = 0x000000000000000a
            0x000000305f5fef18 = 0x000000305f5fef24
            0x000000305f5fef10 = 0x000000305f5fef18
            0x000000305f5fef0c = 0x000000000000000a
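One way to convince yourself that ptr2 really holds the address of ptr is to change ptr through it. The sketch below is illustrative only: DoublePointerDemo, Run, and the second variable b are assumptions added for the example, and it needs to be built with unsafe code allowed.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class DoublePointerDemo
{
    // Follows a double pointer, then retargets the inner pointer through it.
    public static unsafe (int first, int second) Run()
    {
        int a = 10;
        int b = 99;
        int* ptr = &a;
        int** ptr2 = &ptr;    // ptr2 holds the stack address of ptr

        int first = **ptr2;   // follow ptr2 -> ptr -> a

        *ptr2 = &b;           // overwrite ptr itself through ptr2
        int second = *ptr;    // ptr now points at b

        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = Run();
        Console.WriteLine($"{first} {second}");   // prints "10 99"
    }
}
```

Here *ptr2 is the pointer variable ptr itself and **ptr2 is the int that ptr points to, which mirrors the two-step address chain in the LOCALS listing.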
3. The ++ and -- operators
This kind of pointer arithmetic is typically used with arrays or strings, whose elements all have the same type, as in the following code:
    fixed (int* ptr = new int[3] { 1, 2, 3 }) { }
    fixed (char* ptr2 = "abcd") { }
Initially, ptr points to the first element of the array allocated on the heap, which is the memory address of 1. After one ++ the pointer holds the address of the next integer, 2, and after another ++ it holds the address of 3. Because a pointer declared in a fixed statement is read-only, the code below advances a copy, cptr, rather than ptr itself.
    unsafe
    {
        fixed (int* ptr = new int[3] { 1, 2, 3 })
        {
            int* cptr = ptr;
            Console.WriteLine(((long)cptr++).ToString("x16"));
            Console.WriteLine(((long)cptr++).ToString("x16"));
            Console.WriteLine(((long)cptr++).ToString("x16"));
        }
    }

    0:000> !clrstack -l
        LOCALS:
            0x00000070c15fea50 = 0x000001bcaac82da0
            0x00000070c15fea48 = 0x0000000000000000
            0x00000070c15fea40 = 0x000001bcaac82dac
            0x00000070c15fea38 = 0x000001bcaac82da8
C# is a managed language, and reference types are allocated on the managed heap, so the address of an object on the heap can change: the GC periodically reclaims memory and may move surviving objects while doing so. That is why the compiler requires the fixed statement here. It pins the object so the GC cannot move it while you hold a pointer into it. In this example the pinned elements occupy 0x000001bcaac82da0 through 0x000001bcaac82da8 + 4.
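The 4-byte spacing in the addresses above follows from the element type: each ++ on an int* advances the address by sizeof(int). A small sketch under that reading, with the names PointerStepDemo and Run invented for the example and unsafe code assumed to be enabled at build time:

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class PointerStepDemo
{
    // Walks a pinned int[] with ++ and reports the byte distance per step
    // together with the sum of the elements visited.
    public static unsafe (long stepBytes, int sum) Run()
    {
        int[] data = { 1, 2, 3 };

        fixed (int* ptr = data)      // pin the array so the GC cannot move it
        {
            int* cptr = ptr;         // ptr is read-only, so advance a copy
            int sum = 0;

            for (int i = 0; i < data.Length; i++)
            {
                sum += *cptr;
                cptr++;              // moves forward by sizeof(int) bytes
            }

            // Total bytes travelled divided by the number of steps taken.
            long stepBytes = ((byte*)cptr - (byte*)ptr) / data.Length;
            return (stepBytes, sum);
        }
    }

    public static void Main()
    {
        var (stepBytes, sum) = Run();
        Console.WriteLine($"{stepBytes} {sum}");   // prints "4 6"
    }
}
```

The pointer is only valid inside the fixed block. Once the block ends the array is unpinned, and any saved copy of the pointer should be treated as stale.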
Three: Use two examples to help you understand
As the saying goes, a thousand words are worth less than one working example, so here are two.
1. Use pointers to replace characters in a string
We all know that string has a Replace method, which replaces a specified character with the one you want. But a C# string is immutable: touch it in any way and a new string is generated. This is where pointers are impressive. You can find the memory address of the character to replace and then assign the new character directly to that address. So let's write some code that turns abcgef into abcdef, that is, replaces g with d.
    unsafe
    {
        // Replace 'g' with 'd'
        string s = "abcgef";
        char oldchar = 'g';
        char newchar = 'd';

        Console.WriteLine($"before: {s}");

        var len = s.Length;

        fixed (char* ptr = s)
        {
            // ptr is read-only inside fixed, so walk the string with a copy
            char* cptr = ptr;

            for (int i = 0; i < len; i++)
            {
                if (*cptr == oldchar)
                {
                    *cptr = newchar;
                    break;
                }
                cptr++;
            }
        }

        Console.WriteLine($"after: {s}");
    }

    ----- output ------
    before: abcgef
    after: abcdef
To confirm that no new string was created, use Windbg to see how many string references are on the thread stack. You can capture a dump file when execution reaches the break statement.
Of the 10 entries that LOCALS reports, the last 9 hold values that are all close to the starting address of the string, 0x000001ef1ded2d48, which indicates that no new string has been generated.
2. Pointer versus index traversal speed
We usually traverse an array by index. If we race that against pointer traversal, which do you think is faster? If I tell you that indexing is a wrapper around pointer access, you can probably guess the answer. Let's see how much faster it is.
To make the results more striking, I'm going to traverse 100 million numbers on .NET Framework 4.8 in Release mode.
    // Requires: using System; using System.Diagnostics; using System.Linq;

    static void Main(string[] args)
    {
        var nums = Enumerable.Range(0, 100000000).ToArray();

        // Time pointer traversal 10 times
        for (int i = 0; i < 10; i++)
        {
            var watch = Stopwatch.StartNew();
            Run1(nums);
            watch.Stop();
            Console.WriteLine(watch.ElapsedMilliseconds);
        }

        Console.WriteLine(" -------------- ");

        // Time index traversal 10 times
        for (int i = 0; i < 10; i++)
        {
            var watch = Stopwatch.StartNew();
            Run2(nums);
            watch.Stop();
            Console.WriteLine(watch.ElapsedMilliseconds);
        }

        Console.WriteLine(" Finish! ");
        Console.ReadLine();
    }

    public static void Run1(int[] nums)
    {
        unsafe
        {
            // Address of the last element of the array
            fixed (int* ptr1 = &nums[nums.Length - 1])
            {
                // Address of the first element of the array
                fixed (int* ptr2 = nums)
                {
                    int* sptr = ptr2;
                    int* eptr = ptr1;
                    while (sptr <= eptr)
                    {
                        int num = *sptr;
                        sptr++;
                    }
                }
            }
        }
    }

    public static void Run2(int[] nums)
    {
        for (int i = 0; i < nums.Length; i++)
        {
            int num = nums[i];
        }
    }
In this test, traversing the array with a pointer was almost twice as fast as traversing it by index.
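A variation that may be worth trying when you repeat the measurement: make each loop produce a result that is actually used, so the two traversals can be checked against each other and the work cannot simply be discarded. This is only a sketch under my own assumptions, not the article's benchmark. TraversalDemo, SumByPointer, and SumByIndex are hypothetical names, the array is smaller than the 100 million used above, and unsafe code must be enabled when compiling. The timings it prints will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class TraversalDemo
{
    // Sums the array by walking a pointer from the first element
    // to one past the last element.
    public static unsafe long SumByPointer(int[] nums)
    {
        long total = 0;
        fixed (int* start = nums)
        {
            int* p = start;
            int* end = start + nums.Length;   // one past the last element
            while (p < end)
            {
                total += *p;
                p++;
            }
        }
        return total;
    }

    // Sums the array with ordinary index access.
    public static long SumByIndex(int[] nums)
    {
        long total = 0;
        for (int i = 0; i < nums.Length; i++)
        {
            total += nums[i];
        }
        return total;
    }

    public static void Main()
    {
        int[] nums = Enumerable.Range(0, 1000000).ToArray();

        var watch = Stopwatch.StartNew();
        long byPointer = SumByPointer(nums);
        watch.Stop();
        Console.WriteLine($"pointer: sum={byPointer}, {watch.ElapsedMilliseconds} ms");

        watch = Stopwatch.StartNew();
        long byIndex = SumByIndex(nums);
        watch.Stop();
        Console.WriteLine($"index:   sum={byIndex}, {watch.ElapsedMilliseconds} ms");
    }
}
```

Using a single fixed statement with an end pointer one past the last element also sidesteps the empty-array case, where taking the address of nums[nums.Length - 1] would throw.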
Four: Summary
If you work on framework-level code, don't forget about pointers. Others may advise against using them, yet the underlying framework itself uses them heavily.