Memory leak symptoms

  • Normally, memory consumption is stable. When the problem occurs, however, memory is exhausted within a few hours and the operating system kills the SRS process.

The difficulty of troubleshooting

  • The leak occurs only occasionally, which is what makes it hard to troubleshoot. If it could be reproduced reliably, it could be located easily with mature tools such as Valgrind, memleak, and mtrace. These tools are widely documented on the Internet and easy to use, so they are not described here.

The troubleshooting process

Troubleshooting with tools

  • Without knowing the specific cause, we first tried mature tools such as Valgrind and memleak. They found no obvious memory leak, so we stopped using them.

Analyzing the memory image

Analyzing core files

  • gdb attach {pid}
  • Run the (gdb) gcore command to dump a core file. Note that the process is temporarily paused while GDB is attached and the dump is written.
  • pmap {pid} shows the memory distribution, as shown in the following figure. You can also use cat /proc/{pid}/maps, but it is less intuitive than pmap for viewing the virtual memory distribution.
  • Open the core file with GDB and write the memory contents starting at 0x0000000005c2b000 to a file:
  • gdb -c {core file} {your_application}
  • (gdb)set height 0
  • (gdb)set logging on
  • (gdb) x/612205568a 0x0000000005c2b000
  • About x/612205568a:
    • 612205568 = 4782856 * 1024 / 8, that is, the size of the region in KB (4782856, as reported by pmap) converted to bytes and divided by the 8 bytes that each printed unit occupies.
    • a is one of GDB's output formats (see the description of how the GDB x command examines memory): blog.csdn.net/yasi_xi/art…
    • Finally, analyze the gdb.txt file generated above.
    • The raw output is hard to read, so demangle it with cat gdb.txt | c++filt > demo.txt. The class names of SRS then become visible.
    • Count the objects in demo.txt by type to judge whether any object is leaking. A type with far more instances than the others may be a leak point. Of course, the program may also legitimately hold that many objects, so the result needs rational judgment.
      • cat demo.txt | awk 'BEGIN{stat[""]=0}{stat[$5]++}END{for(i in stat) print i"="stat[i]}'
    • This method did reveal a memory leak: there were a large number of SrsStatisticStream objects in memory. However, code analysis showed that the leaked objects occupy very little memory. They could not exhaust 5 GB in a few hours, so this finding does not match what was observed online.
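
The counting step can also be sketched in Python, which is easier to extend than the awk one-liner (for example, to sort by count). This is a minimal sketch: the function name count_fifth_field and the sample lines are hypothetical, and the sample only approximates what demangled x/a output looks like. Unlike the awk version, lines with fewer than five fields are skipped rather than counted under an empty key.

```python
from collections import Counter


def count_fifth_field(lines):
    """Count occurrences of the 5th whitespace-separated field,
    mirroring awk's stat[$5]++ (short lines are skipped here)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:
            counts[fields[4]] += 1
    return counts


# Hypothetical sample resembling demangled `x/a` output in demo.txt.
sample = [
    "0x5c2b000:\t0x1234560 <vtable for SrsStatisticStream+16>\t0x0",
    "0x5c2b010:\t0x0\t0x21",
    "0x5c2b020:\t0x1234560 <vtable for SrsStatisticStream+16>\t0x0",
    "0x5c2b030:\t0x1234980 <vtable for SrsMessageQueue+16>\t0x0",
]

counts = count_fifth_field(sample)
# Print the most frequent types first.
for name, n in counts.most_common():
    print("%s=%d" % (name, n))
```

On a real dump, the lines would come from open("demo.txt") instead of the in-memory sample.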

Analyzing the symptoms and forming a hypothesis from the business scenario

  • Given a memory leak of this kind, the first thing that comes to mind is a problem caused by a stream.
  • The memory consumption is severe, which suggests that audio and video packets are piling up in memory.
  • Based on the SRS source code and its memory-intensive business scenarios, two areas consume a lot of memory:
    • The GOP cache. However, it clears the cached audio and video packets when a keyframe arrives.
    • The FLV origin pull, which uses a SrsMessageQueue to store audio and video packets. Its shrink logic normally releases the memory. Memory keeps growing only if shrink never runs, and a DTS rollback could (possibly) cause exactly that. DTS rollbacks may occur in third-party streams.
    • Simulating a stalled FLV origin pull together with a DTS rollback did make memory surge, so we guessed this was the problem. Finding the SrsMessageQueue objects in the online memory would further confirm it, so we continued analyzing the core dump from the faulty process.
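
The hypothesis above can be illustrated with a toy model. This is not the SRS implementation: ToyQueue, its fields, and the shrink policy are assumptions made only to show why a queue that shrinks when "last DTS minus first DTS" exceeds a limit stops shrinking once DTS rolls back while nothing is being dequeued.

```python
from collections import deque


class ToyQueue:
    """Toy duration-bounded packet queue (hypothetical, not SRS code)."""

    def __init__(self, max_duration_ms):
        self.max_duration_ms = max_duration_ms
        self.start_dts = None
        self.end_dts = None
        self.packets = deque()

    def enqueue(self, dts):
        self.packets.append(dts)
        if self.start_dts is None:
            self.start_dts = dts
        self.end_dts = dts
        # Shrink only when the buffered duration exceeds the limit.
        # After a DTS rollback the difference is negative, so this
        # condition is never true.
        if self.end_dts - self.start_dts > self.max_duration_ms:
            self.shrink()

    def shrink(self):
        # Assumed policy: keep only the newest packet.
        newest = self.packets[-1]
        self.packets.clear()
        self.packets.append(newest)
        self.start_dts = newest


def feed(queue, dts_values):
    """Enqueue all packets with no consumer (stalled pull); return queue length."""
    for dts in dts_values:
        queue.enqueue(dts)
    return len(queue.packets)


# 250 packets, 40 ms apart, monotonic DTS: the queue stays bounded.
normal = feed(ToyQueue(1000), range(0, 10000, 40))
# Same packets, but preceded by one packet with a much larger DTS (rollback).
rollback = feed(ToyQueue(1000), [100000] + list(range(0, 10000, 40)))
print("monotonic:", normal, "rollback:", rollback)
```

In the rollback case every packet stays in the queue, which matches the memory surge seen in the simulation described above.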

Virtual memory distribution

C++ virtual function table

  • Virtual function tables are how polymorphism is implemented
  • Use a demo to learn about virtual function tables
#include <iostream>
using namespace std;

class A
{
public:
    int i = 20001111; // 0x1313157 in hexadecimal
    virtual void func() {}
    virtual void func2() {}
};

class B : public A
{
public:
    int j = 10001111; // 0x989ad7 in hexadecimal
    void func();
};

void B::func()
{
    cout << "===B===" << endl;
}

int main()
{
    // Print the object sizes, which include the hidden vtable pointer
    cout << sizeof(A) << "," << sizeof(B) << endl;
    B* b = new B();
    b->func();
    return 0;
}
  • Debug the simple program above by setting breakpoints in GDB
    • g++ -g -o demo demo.cpp
    • gdb demo
    • Stop at a breakpoint and print the pointer b
    • The value of b is the this pointer of the B object (a heap address); print the entire B object with the p command
    • The vtable pointer is stored at the address the this pointer refers to, that is, at the very start of the object

Analyzing the core file again

  • With this knowledge, it is easy to parse the heap memory and find SrsMessageQueue objects
  • The core file is large, so how can the objects be found quickly?
    • GDB supports scripting in Python
    • Reference: sourceware.org/gdb/current…
    • The Python script, heap.py, looks like this:
import gdb


class heap(gdb.Command):
    """Scan the heap for words that resolve to an SrsMessageQueue symbol."""

    def __init__(self):
        super(heap, self).__init__("all-heap", gdb.COMMAND_DATA)

    def invoke(self, arg, from_tty):
        # From the pmap output above, the heap starts at 0x0000000005c2b000
        addr = 0x0000000005c2b000
        try:
            while True:
                # Print two 8-byte words as addresses; the field after the
                # first tab is the first word plus its symbol, if it has one
                out = gdb.execute('x/2a 0x%x' % addr, to_string=True)
                name = out.split('\t')[1].strip()
                if 'SrsMessageQueue' in name:
                    print('SrsMessageQueue find 0x%x' % addr)
                    print('SrsMessageQueue find ======%s' % out)
                addr += 16
        except gdb.error:
            # Raised when the scan reaches memory that cannot be read
            print("error")
            return


heap()
  • (gdb) source heap.py
  • (gdb) all-heap
  • The script prints the this pointer of each SrsMessageQueue object it finds
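
The substring match in the script is deliberately loose: any word that resolves to a symbol containing "SrsMessageQueue" is reported, including pointers to its member functions. A stricter filter can parse the symbol and offset out of each output line. The sketch below runs outside GDB; the helper name first_word_symbol and the sample lines are hypothetical, and the assumed line shape (address, tab, value followed by <symbol+offset>) should be checked against real output before relying on it.

```python
import re

# Matches "<symbol+offset>" or "<symbol>" as printed by GDB's `a` format.
SYMBOL_RE = re.compile(r"<([^>+]+)(?:\+(\d+))?>")


def first_word_symbol(line):
    """Return (symbol, offset) for the first word of an `x/2a` output
    line, or None if that word does not resolve to a symbol."""
    parts = line.split("\t")
    if len(parts) < 2:
        return None
    match = SYMBOL_RE.search(parts[1])
    if match is None:
        return None
    return match.group(1), int(match.group(2) or 0)


# Hypothetical sample lines (mangled, demangled, and no symbol).
mangled = first_word_symbol("0x5c2b000:\t0x1234560 <_ZTV15SrsMessageQueue+16>\t0x0")
demangled = first_word_symbol("0x5c2b000:\t0x1234560 <vtable for SrsMessageQueue+16>\t0x0")
plain = first_word_symbol("0x5c2b010:\t0x0\t0x21")
print(mangled, demangled, plain)
```

With this helper, a candidate object could be accepted only when the symbol is the vtable of SrsMessageQueue, which reduces false positives in a multi-gigabyte scan.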

Find the stream ID by following SrsMessageQueue->SrsConnection->SrsConsumer->SrsSource->SrsRequest

Solution

  • The cause: the FLV pull stalls while the source stream's DTS rolls back, so audio and video packets accumulate in memory
  • To solve

Conclusion

  • Tools such as Valgrind, memleak, and mtrace can locate a memory leak easily when it can be reproduced, and they are still worth trying when it cannot.
  • An occasional memory leak still has to be analyzed together with the business scenario to find the cause.

Takeaways

  • A better understanding of the process memory layout and of C++ virtual function tables.

References

  • C.biancheng.net/view/267.ht…
  • Blog.csdn.net/weixin_3156…