Interview series:

Click here to jump directly to the interview experience post column

[1] C++ software development engineer concept manual

[2] Enter a URL (www.baidu.com) from the browser to execute the process

[3] Const Pointers and Pointers to constants

[4] #define,#ifdef,#ifndef,#endif…

[5] The difference between heap and stack (Classic explanation)

[6] Stack, BSS segment, code segment, data segment, RO, RW, ZI, etc

[7] Automatic, static, register, extern, volatile, restrict

Yesterday, after interviewing bytedance toutiao for its test development, I wanted to write an interview on campus recruitment for college graduates to make a summary. No matter which direction you are interviewing for, the main test content is C++ software knowledge, operating systems, computer networks, databases and so on.

Therefore, I sorted out the main content of bytedance interview according to these categories for your reference, and I will sort out and improve it later on.


@TOC


I. Basic concept questions

Part one C++ knowledge

Please refer to another article of mine. C++ software development engineer concept manual

1. The difference between overloading and overloading

Overloading and overwriting are different manifestations of object-oriented polymorphism. Among them, overloading is a manifestation of polymorphism in a class, which means that multiple methods with the same name are defined in a class, and they either have different number of parameters, or different parameter types, or different parameter order. Independent of access modifiers and return value types. When using overloading, note the following:

  1. Overloading is distinguished by different method parameters, such as different parameter numbers, different parameter types, or different parameter sequences.

  2. Overloading is independent of a method’s access modifier, return value type, or exception type thrown.

  3. For inheritance, if the access modifier of a parent class method is private, it cannot be overridden in a subclass. If the subclass also defines a function with the same name, this is just a new method and does not achieve the effect of overloading.

Overwriting is when a subclass function overrides a parent class function. Overrides a method and overrides it to do different things. Note the following when using overrides:

  1. Overriding methods in subclasses must have the same function names and arguments as overriding methods in the parent class.

  2. The return value of an overridden method in a subclass must be the same as the return value of the overridden method in the parent class.

  3. Exceptions thrown by overriding methods in subclasses must be the same as those thrown by overriding methods in the parent class.

  4. An overridden method in a parent class cannot be private, otherwise its subclass simply defines a method and does not override it.

The difference between overwriting and overloading is as follows:

  1. Overrides are vertical relationships between subclasses and superclasses. Overloading is a horizontal relationship between methods in the same class.

  2. Overrides can only be a relationship between a pair of methods, and overloading is a relationship between multiple methods.

  3. Overwriting requires the same parameter list, while overloading requires different parameter list.

  4. In an override relationship, the call method is determined according to the type of the object. The overload relation selects the method body according to the argument list and parameter list at the time of call.

Development:

  1. Overloading: Function overloading means that there can be a set of functions with the same function name and different argument lists in the same scope (namespace).

  2. Override: A reimplementation of a virtual function in a base class in a derived class. That is, the function name and parameters are the same, but the function implementation body is different;

  3. Hiding: refers to a function in a derived class that hides functions of the same name from the base class. Hiding is similar to the other two concepts on the surface and is difficult to distinguish, but the key difference is in the implementation of polymorphism.

Example:

Which of the following is an overloaded myFunc function declaration? A.namespace IBM
{
    int myfunc(int a);
}
namespace SUN
{
    int myfunc(double b);
}
B. namespace IBM
{
    int myfunc(int a);
}
namespace SUN
{
    using IBM::myfunc;
    int myfunc(double b);
}
C. namespace IBM
{
    int myfunc(int a);
    namespace SUN
    {
        int myfunc(double b);
    }
}
D. class A
{
public:
    int myfunc(int a);
}
class SubA: public A
{
public:
    int myfunc(double b);
}
Copy the code

The answer is B, A and C are different namespaces; D is hide, only B is overload!

The following declaration exists:void f (a); // Global function
class A
{
public:
 void f(int);
};

class B: public A
{
public:
 void f(int *);
 static void f(int* *);
 void test(a);
 / /...}; Which function calls are legal in the following B::test implementation:void B::test()
{
 int x = 0;
 int *p = NULL;
 f();    / / (1)
 f(x);   / / (2)
 f(&x);  / / (3)
 f(&p);  / / (4)
};

A.(1) (2) (3) (4) B.(1) (3) (4) C.(2) (3) (4) D.(3) (4)
Copy the code

The answer is D, class member function overloading: a local function with the same name will hide the global declaration, but not overload it. A subclass’s function with the same name will not overload its parent without introducing a superclass namespace. Static member functions can overload non-static member functions.

Add: when template functions and overloaded functions are present in the same program, the C++ compiler evaluates them in the following order:

  • 1. Call overloaded functions
  • 2. If no match is found, the template function is called
  • 3. If no match is found, cast the overloaded function
  • 4. Error

2. C language and C++ differences

  1. Function default Value C89 Standard C language does not support function default values, C++ supports function default values, and must follow the right to left assignment of initial values.

  2. Inline function C89 does not, expands directly at the call point, does not generate symbols, does not open the stack frame back, only in Release. Usually in a header file.

  3. C language does not exist function overloading, C++ according to the function name parameter number parameter type judge overloading, belongs to static polymorphism, must be under the same scope is called overloading.

  4. Const C const C is a read-only variable that cannot be an lvalue. Const is a true constant in C++, but it can degenerate into a C constant that generates the local symbol by default.

  5. The underlying reference is a pointer, which is dereferenced directly, and can be used with const references to an immediate number.

  6. Malloc,free && new,delete see point 4 in this section

  7. Scope There are only two scopes in C: local and global. C++ is: local scope, class scope, namespace scope three kinds.

Extension:

2.1 the namespace

A namespace contains multiple variables and functions so that they will not conflict with any variables or functions outside the namespace.

In many of these instances, there is a statement like this: using namespace STD; , which specifies that the library functions used in the file are defined in the standard namespace STD.

Methods using namespace members (1) use namespace aliases

namespace TV=Television;
TV::func();
Copy the code

(2) Use the using namespace member name

using TV::func();
Copy the code

Func (), a member function of the TV namespace, is used in this scope. If a member of the TV namespace is used in this scope, there is no need to use the namespace qualification.

(3) Use the using namespace member name

In a scope declared with a using namespace, members of a namespace are declared as if they were declared globally. So you don’t have to be namespace qualified. Obviously, this kind of processing is more convenient for writing programs. However, declaring multiple namespaces with a using namespace at the same time is often an error prone situation because two namespaces may have classes and functions with the same name.

(4) Nameless namespaces

Since the namespace has no name and is obviously not referenced in other files, it is only valid within the scope of this file. The fun function cannot be used in other files in this program, which limits the scope of fun to this file. In C, you can declare a function static, which limits the scope of the function to the file. C++ retains the use of static to declare functions and provides nameless namespaces to do so.

(5) Standard namespace STD

All identifiers in standard C++ libraries are defined in a namespace named STD, or functions, classes, objects, and class templates in standard header files such as iostream are defined in the namespace STD. All identifiers defined and declared in STD can be used as global quantities in this file. However, it should be absolutely guaranteed that no identifiers with the same name as members of namespace STD occur in the program.

Because there are so many entities defined in namespace STD, programmers sometimes get confused about which identifiers have already been defined in namespace STD. To reduce errors, some professionals prefer to replace the using namespace declaration with several using namespace member declarations. But most of the C++ libraries in use today were developed years ago, when there was no namespace, and the contents of the library were not in the STD namespace, so STD did not have to be declared in programs.

Reference:

  • 【C++】 a namespace

  • 3. Namespaces and memory management

2.2 Differences between structures and classes

C language structure only has data members, no function members; Structures in C++ can have data members and function members.

By default, data members and member functions are public in a structure and private in a class.

Generally we use structures only when describing data members and classes when we have both data members and member functions.

2.3 Why C must have main function?

C has data and functions. The function part is in the code area, and the data is divided into two types: local and global. The difference is whether they are in the static data area or on the stack. And global and static variables are created before the function is executed.

C language has another stipulation: the global area can not have executable code, executable code must enter the function. However, functions in C are global, which prevents functions from having nested definitions: nested definitions cause functions defined inside functions to become local functions. Therefore, the execution of each function can only be solved by nested function calls. A function needs to be executed first to call a series of other functions, and the first function to be called is main.

3. Difference between Pointers and references

(1) Pointer: pointer is a variable, but this variable stores an address, pointing to a memory location; A reference is essentially the same thing as the original variable, just an alias for the original variable. Such as:

int a=1;int *p=&a;

int a=1;int &b=a;
Copy the code

An integer variable and a pointer variable p are defined. The pointer variable points to the storage unit of A, that is, the value of P is the address of the storage unit of A. A and B are the same thing, occupying the same memory location.

(2) Can have a const pointer, but no const reference;

(3) Pointers can have multiple levels, but references can only be one level (int **p; Int &&a is not legal.

(4) The value of a pointer can be NULL, but the value of a reference cannot be NULL, and the reference must be initialized when it is defined;

(5) The value of a pointer can change after initialization, that is, to point to another storage location, while the reference will not change after initialization.

(6)”sizeof reference “is the sizeof the variable (object) to which it refers, while “sizeof pointer” is the sizeof the pointer itself;

(7) The operation meaning of pointer and reference increment (++) is different;

4. New and malloc

  1. The attributes new and delete are C++ keywords that require compiler support; Malloc and free are library functions that require header support.

  2. Parameter The new operator does not specify the size of the block. The compiler calculates the block size based on the type information. Malloc, on the other hand, needs to explicitly indicate the size of memory required.

  3. The return type new operator returns a pointer to the object type when memory allocation succeeds. The type strictly matches the object and no type conversion is required. Therefore, new is a type-safe operator. Malloc returns void*, which requires a cast to convert the void* pointer to the desired type.

  4. A custom type new calls the operator new function to allocate sufficient memory (usually using malloc at the bottom). It then calls the constructor of the type, initializes the member variable, and finally returns a pointer to the custom type. Delete first calls the destructor and then the operator delete function to free memory (usually implemented using free underneath).

    Malloc /free is a library function that can only dynamically allocate and free memory, but cannot force it to do custom object construction and destruction.

    New can call malloc(), but malloc cannot call new.

  5. Overloading C++ allows overloading of the new/delete operators; malloc does not.

  6. The memory region new does two things: allocate memory and call the constructor of the class; delete is: call the destructor of the class and free memory. Malloc and free just allocate and free memory.

    The new operator dynamically allocates memory for objects from the Free Store, while the malloc function dynamically allocates memory from the heap. Free storage is an abstract concept based on the new operator in C++. Any memory that is allocated through the new operator is called free storage. Heap is a term in the operating system. It is a special piece of memory maintained by the operating system and used for dynamic memory allocation of programs. C language uses MALloc to allocate memory from the heap and free to release allocated corresponding memory. Free storage is not equal to the heap, and as mentioned above, layout new can be out of the heap.

  7. New When memory allocation fails, the BAC_alloc exception is raised. Returns NULL if malloc fails to allocate memory.

  8. Memory leaks Memory leaks can be detected for both new, which indicates which line in which file, and malloc, which does not.

Constructors and destructors

5.1 concept

A constructor is a special member function with the same name as a class that returns no value. It is used to initialize the data members of a class. Whenever an object is created (including dynamically creating the object with new, but not creating a pointer to the object), the compiler automatically calls the constructor. The constructor of a class is usually public. Constructors can be overloaded.

The destructor frees the memory used by an object when it is destroyed.

  • Like constructors, you cannot specify any return type or use void when defining a destructor.
  • Unlike constructors, destructors take no arguments and can have only one destructor per class. The destructor’s function name is preceded by ~.

(1) When a dynamically allocated object is deleted, that is, when the object is deleted using delete, the compiler automatically calls the destructor; (2) when the program ends; (3) when a compiler-generated temporary object is no longer needed.

5.2 Copy Constructor

The copy constructor uses an existing object to initialize a created object of the same class. The copy constructor is declared as the class name (class name & object name);

Four cases of automatically calling the copy constructor: ① Use an object of the class to initialize another object

cat cat1; 
cat cat2(cat1); /* The copy constructor is automatically called to initialize cat2 with cat1 */
Copy the code

Another form of initializing an object of a class from another object

cat cat2=cat1;   // Note that not cat cat1,cat2; cat2=cat1;
Copy the code

The copy constructor is called when the object is passed as a function parameter.

f(cat a){ }      // define the f function as a cat object
	cat b;           // Define object B
	f(b); // Copy constructor is automatically called when f is called
Copy the code

(4) If the return value of the function is an object of class, the copy constructor is called when the function call returns.

cat f(a)          // define a function f that returns an object of class cat
{ cat a; ...return a;
}

cat b;          // Define object B
b=f();       // When f is called, the copy constructor is automatically called
Copy the code

5.3 How Do I Access Non-public Members of a Base Class

A protected segment member can be accessed by its derived class.

(2) Declare a derived class as a friend of the base class to access the private members of the base class

(3) A derived class indirectly uses a private member of the base class using an interface provided by the base class.

6. Virtual functions

Simply put, member functions that are decorated with the virtual keyword are virtual functions. The virtual function’s job, in technical terms, is to achieve Polymorphism, which is to separate the interface from the implementation; To explain in graphic language is to achieve in a common way, but because of individual differences, different strategies.

This part of the detailed reference to an article I wrote before C++ software development engineers concept manual

Part ii Computer network

1. Enter a URL (www.baidu.com) in the browser and perform the entire process

Enter a URL (www.baidu.com) from the browser to execute the process

Summary: Implementation process:

(1) The browser obtains the entered domain name www.baidu.com (2) the browser sends a DNS request to the IP address of www.baidu.com (3) the DNS system obtains the IP address of the Baidu server (4) The browser sends an HTTP request. Request Baidu home page browser (5) to establish a TCP connection with the server (the default port number is 80). (6) The server sends the home page file to the browser through HTTP response. (7) The TCP connection is released.

DNS Lookup Procedure

(1) The browser checks the cache to see if there is a resolved IP address for the domain name. If there is, the resolution process ends.

(2) If not, the browser checks the hosts file of the operating system for the DNS resolution result of the domain name.

(3) If no, the system sends a data packet to the DNS server. The DNS server returns the resolved IP address to the user.

2. Differences between TCP and UDP

The difference between TCP and UDP is analyzed and summarized

UDP TCP
Whether connection There is no connection connection-oriented
reliable Unreliable transmission, not using flow control and congestion control Reliable transmission, using flow control and congestion control
Number of connected objects Supports one-to-one, one-to-many, many-to-one and many-to-many interactive communication It can only be one-to-one communication
transport For a message Word oriented stream
The first overhead The header overhead is small, only 8 bytes Minimum 20 bytes, maximum 60 bytes
Applicable scenario For real-time applications (IP phone calls, video conferencing, live streaming, etc.) Suitable for applications that require reliable transmission, such as file transfer

3. TCP three-way handshake and four-way wave

TCP three times shake hands, four times wave and often meet

4. Meanings of HTTP status codes 100, 200, 300, 400, 500, and 600

  • 1XX (temporary response) represents a status code for a temporary response that requires the requester to continue executing the operation. 100 (Continue) The requester should continue to make the request. The server returns this code to indicate that it has received the first part of the request and is waiting for the rest.

  • 2xx (success) indicates that the request’s status code was successfully processed. 200 (Success) The server has successfully processed the request. Typically, this means that the server has provided the requested web page.

  • 3xx (redirect) indicates that further action is required to complete the request. Typically, these status codes are used for redirects. 300 (Multiple options) The server can perform a variety of operations on a request. The server can select an operation based on the requester (UserAgent) or provide a list of operations for the requester to select.

  • 4XX (Request error) These status codes indicate that the request may be in error, preventing the server from processing it. 400 (Error request) Server does not understand the syntax of the request. The 401 (unauthorized) request requires authentication. The server may return this response for a web page that requires login. 402 This status code is reserved for possible future requirements. 403 (Forbidden) The server rejects the request. 404 (Not found) The server could not find the requested page.

  • 5XX (Server Errors) These status codes indicate that an internal error occurred while the server was trying to process the request. These errors may be server errors rather than request errors. 500 (Server internal error) The server encountered an error and could not complete the request.

  • 600 The source site does not return the response header, only the entity content

5. The difference between HTTP and HTTPS

  • 1. HTTPS requires a ca to apply for a certificate. Generally, there are few free certificates, so some fees are required.

  • 2. HTTP is a hypertext transmission protocol, and information is transmitted in plain text. HTTPS is a secure SSL encryption transmission protocol.

  • 3. HTTP and HTTPS use completely different connections and use different ports, the former 80 and the latter 443.

  • 4. HTTP connections are simple and stateless; HTTPS is a network protocol that uses SSL and HTTP to encrypt transmission and authenticate identity. It is more secure than HTTP.

Part III Operating System (Linux)

1. The difference between processes and threads

  1. A thread can only belong to one process, and a process can have multiple threads, but at least one thread. Threads are dependent on processes.

  2. A process has a separate memory unit during execution, and multiple threads share the process’s memory. Resources are allocated to a process, and all threads of the same process share all resources of that process. Multiple threads in the same process share code segments (code and constants), data segments (global and static variables), and extended segments (heap storage). However, each thread has its own stack segment, also known as the run time, which holds all local and temporary variables.

  3. Process is the smallest unit of resource allocation, thread is the smallest unit of CPU scheduling;

  4. System overhead: Because the system allocates or reclaims resources, such as memory space, I/O devices, etc. to a process when it is created or destroyed. Therefore, the overhead incurred by the operating system will be significantly greater than the overhead incurred when creating or undoing threads. Similarly, process switching involves saving the CPU environment of the entire current process and setting the CPU environment of the new scheduled running process. Thread switching only requires saving and setting the contents of a small number of registers and does not involve memory management. As can be seen, the cost of process switching is also much higher than the cost of thread switching.

  5. Communication: Because multiple threads in the same process have the same address space, it is easier to synchronize and communicate between them. Interprocess communication IPC, threads can directly read and write process data segments (such as global variables) to communicate — with the help of process synchronization and mutual exclusion to ensure data consistency. On some systems, threads can be switched, synchronized, and communicated without the intervention of the operating system kernel

  6. The process programming debugging is simple and reliable, but the creation and destruction cost is high. Threads, on the other hand, have low overhead, fast switching, but relatively complex programming and debugging.

  7. Processes do not interact; Threads The failure of one thread causes the failure of the entire process

  8. Process ADAPTS to multi-core and multi-machine distribution; Threads work with multiple cores

2. Interprocess communication

Interprocess communication mainly includes pipe, named pipe FIFO, MessageQueue MessageQueue, shared storage, Semaphore, Signal and Socket.

Pipes pipes, often referred to as nameless pipes, are the oldest form of UNIX system IPC.

Features:

  • It is half-duplex (that is, data can flow only in one direction) and has fixed read and write ends.
  • It can only be used for communication between related processes (also between parent and sibling processes).
  • It can be thought of as a special file, and it can be read or written using ordinary read and write functions. But it is no ordinary file, does not belong to any other file system, and exists only in memory.

A FIFO, also known as a named pipe, is a file type.

Features:

  • A FIFO can exchange data between unrelated processes, unlike a nameless pipe.
  • The FIFO has pathnames associated with it, and it exists in the file system as a special device file.

(3) Message queue Message queue, is a linked table of messages, stored in the kernel. A message queue is identified by an identifier (queue ID).

Features:

  • Message queues are records-oriented, where messages have a specific format and a specific priority.
  • Message queues are independent of the sending and receiving processes. When a process terminates, the message queue and its contents are not deleted.
  • Message queues can be used to query messages randomly. Messages are not necessarily read in first-in, first-out order, but can also be read by message type.

A semaphore is a counter, unlike the IPC structures already described. Semaphores are used to implement mutual exclusion and synchronization between processes, not to store interprocess communication data.

Features:

  • Semaphores are used for interprocess synchronization, and to pass data between processes requires a combination of shared memory.
  • Semaphore based on the PV operation of the operating system, program operations on the semaphore are atomic operations.
  • Each PV operation on a semaphore is not limited to adding or subtracting 1 to the semaphore value, but can also add or subtract any positive integer.
  • Semaphore groups are supported.

Shared Memory refers to two or more processes sharing a given storage area.

Features:

  • Shared memory is the fastest type of IPC because processes access memory directly.
  • Because multiple processes can operate at the same time, synchronization is required.
  • Semaphores + shared memory are often used together to synchronize access to shared memory.

(5) Socket is also an interprocess communication mechanism, and different from other communication mechanisms, it can be used for process communication between different hosts.

3. Communication between threads

  • Critical section: access to common resources or a piece of code through multi-threaded serialization, fast, suitable for controlling data access;
  • Synchronized/Lock: Uses the mutex mechanism. Only the thread that owns the mutex can access public resources. Because there is only one mutex, it is guaranteed that common resources will not be accessed by multiple threads at the same time
  • Semaphore: Designed to control a finite number of user resources, it allows multiple threads to access the same resource at the same time, but generally limits the maximum number of threads that can access the resource at the same time.
  • Event (signal), Wait/Notify: The Wait/Notify operation is used to keep the synchronization of multiple threads, and it is convenient to compare the priorities of multiple threads

4. Tell me about the Linux virtual address space

4.1 Concept of virtual memory

In order to prevent different processes from running in physical memory at the same time and competing for and trampling on physical memory, virtual memory is adopted.

Virtual memory technology allows different processes to see themselves occupying the current system’s 4G memory while running. All processes share the same physical memory, and each process maps and stores only the virtual memory space it currently needs in physical memory. In fact, when each process is created and loaded, the kernel simply “creates” the layout of virtual memory for the process by initializing the memory-related linked list in the process control table. It does not immediately copy the program data and code (such as.text.data) in the corresponding location of virtual memory into physical memory. The mapping between virtual memory and disk files (called memory mapping) is established, and when the corresponding program is run, the data will be copied through the page missing exception. In addition, when a process is running, memory needs to be dynamically allocated. For example, when malloc is used, only virtual memory is allocated, that is, the corresponding page table entry of this virtual memory is set accordingly. When the process actually accesses this data, a page missing exception occurs.

Request paging system, request segmental system and request segmental page system are all targeted at virtual memory, which realizes information replacement between memory and external memory through request.

4.2 Benefits of Virtual Memory:

  1. Expand address space;
  2. Memory protection: Each process runs in its own virtual memory address space and cannot interfere with each other. Virtual memory also provides write protection to specific memory addresses, preventing code or data from being maliciously tampered with.
  3. Fair memory allocation. With virtual memory, each process is equivalent to having the same amount of virtual memory.
  4. When processes communicate, virtual memory sharing can be adopted.
  5. When different processes use the same code, such as code in a library file, only one copy of the code can be stored in physical memory, and the different processes only need to map their virtual memory to it to save memory
  6. Virtual memory is well suited for use in multi-programming systems, where fragments of many programs are kept in memory at the same time. While a program is waiting for part of it to read into memory, the CPU can be handed over to another process. Multiple processes can be kept in memory, and the system concurrency is improved
  7. When a program needs to allocate contiguous memory space, it only needs to allocate contiguous memory space in virtual memory space, but does not need contiguous space in physical memory, and can take advantage of fragmentation

4.3 Cost of Virtual Memory:

  1. The management of virtual memory requires the establishment of many data structures, which take up additional memory
  2. Translation of virtual address to physical address increases instruction execution time.
  3. Paging in and out of pages requires disk I/O, which can be time consuming
  4. If there is only a portion of data on a page, memory is wasted.

5. What are the conditions under which deadlocks occur and how to resolve them

Deadlock refers to the phenomenon that two or more processes are waiting for each other during execution because they are competing for resources. The four necessary conditions for a deadlock to occur are as follows:

  1. Mutually exclusive conditions: A process does not allow other processes to access the allocated resource. If other processes access the resource, they can only wait until the process that occupies the resource releases the resource.
  2. Request and hold condition: after a process obtains a certain resource, it sends a request for another resource, but the resource may be occupied by another process. In this case, the request is blocked, but the process does not release the occupied resource
  3. Inalienable condition: a resource acquired by a process cannot be taken away until it has been used and can only be released after use
  4. Loop waiting condition: when a process is deadlocked, there must be a loop chain between process and resource

The way to resolve deadlocks is to break one of the above four conditions. The main methods are as follows:

  • Resources are allocated at once, depriving request and hold conditions
  • Deprivable resource: that is, when the process’s new resource is not satisfied, the occupied resource is released, thus breaking the inalienable condition
  • Orderly allocation of resources: The system assigns a serial number to each type of resources, and each process incrementally requests resources by serial number. Releasing resources is the opposite, thus destroying the condition of loop waiting

6. Please talk about structure alignment, byte alignment in the operating system

6.1 reasons:

1) Platform reasons (porting reasons) : Not all hardware platforms can access any data on any address; Some hardware platforms can only fetch certain types of data at certain addresses or throw hardware exceptions.

2) Performance reasons: Data structures (especially stacks) should be aligned on natural boundaries as much as possible. The reason is that in order to access unaligned memory, the processor needs to make two memory accesses; Aligned memory access requires only one access.

6.2 rules

1) Alignment rules for data members: Data members of struct (or union), the first data member is placed at offset 0, and then each data member is aligned according to the value specified by #pragma Pack and the smaller of the length of the data member.

2) Overall alignment rules for structures (or unions) : After the data members have completed their respective alignment, the structures (or unions) themselves are also aligned according to the value specified by #pragma Pack and the smaller of the maximum data member length of the structures (or unions).

3) Struct as members: If a struct has some struct members, the struct members are stored from an integer multiple of their largest internal element size.

6.3 Defining structure alignment

This can be changed by using the precompiled command #pragma pack(n), n=1,2,4,8,16, where n is the specified “alignment factor.”

For example, 6.4

#pragma pack(2) 
struct AA { 
	int a;       // Press 2 for length 4 > 2; Offset is 0; Storage location range [0,3]
	char b;  // Align length 1 < 2 with 1; The offset is 4; Storage location range [4]
	short c;     // Length 2 = 2 align with 2; The offset should be raised to a multiple of 2, 6; Storage location range [6,7]
	char d;  // Align length 1 < 2 with 1; The offset is 7; Storage location range [8]; Nine bytes
}; 
#pragma pack() 
Copy the code

7. Please tell us the way of virtual memory replacement

More common memory replacement algorithms are: FIFO, LRU, LFU, LRU-K, 2Q.

7.1 FIFO (FIFO Algorithm)

Thought: recently visited, the possibility of future visit is relatively large. Implementation: the use of a queue, the new page into the end of the queue, each elimination of the first team page, that is, the first to enter the data, the first to be eliminated. Disadvantages: can not reflect the hot and cold information page

7.2 LFU (Least Frequently accessed elimination algorithm)

Idea: If data has been accessed multiple times in the past, it will be accessed more often in the future.

Implementation: each data block has a reference count, all data blocks are sorted by reference count, and data blocks with the same reference count are sorted by time. Each elimination team tail data block.

Cost: sorting cost.

Disadvantages: Cache bumping.

7.3 LRU (least recently used replacement algorithm)

Idea: If the data has been accessed recently, it has a higher chance of being accessed in the future.

Implementation: Using a stack, new pages or hit pages are moved to the bottom of the stack, replacing cached pages at the top of the stack each time.

Advantages: LRU algorithm for hot data hit ratio is very high.

Defect:

  • 1) cache bump, when cache (1, 2, 3) is full, after data access (0,3,2,1,0,3,2,1…) .
  • 2) Cache pollution, a sudden large number of accidental data access, will store a large amount of cold data in memory.

7.4 LRU-K (LRU-2, LRU-3)

Idea: the longest time has not used K elimination algorithm.

The K in lru-k represents the number of recent uses, so LRU can be considered lru-1. The main purpose of LRU-K is to solve the problem of “cache pollution” of LRU algorithm. Its core idea is to extend the judgment criterion of “recently used once” to “recently used K times”.

Compared to LRU, LRU-K needs to maintain one more queue to record the history of all cached data being accessed. Data is only put into the cache when it has been accessed K times. When data needs to be eliminated, lRU-K will eliminate the data whose KTH access time is the largest from the current time.

Implementation:

  • 1) Data is accessed for the first time and added to the access history list;
  • 2) If the data in the access history list does not reach K access times, it will be eliminated according to certain rules (FIFO, LRU);
  • 3) When the number of times of accessing the data in the history queue reaches K, the data index is deleted from the history queue, the data is moved to the cache queue, and this data is cached, and the cache queue is sorted by time again;
  • 4) Reorder the cache data queue after it is accessed again;
  • 5) When data needs to be eliminated, the data at the end of the cache queue will be eliminated, that is, the data that has been accessed for the last KTH time will be eliminated.

Lru-k aims to solve the problem of “cache pollution” of LRU algorithm. Its core idea is to extend the judgment criterion of “recently used once” to “recently used K times”.

7.5 2 q

Similar to LRU – 2. Use a FIFO queue and an LRU queue.

Implementation:

  • 1) The newly accessed data is inserted into the FIFO queue;
  • 2) If the data has not been accessed again in THE FIFO queue, it will be eliminated according to FIFO rules;
  • 3) If the data is accessed again in the FIFO queue, the data is moved to the LRU queue head;
  • 4) If the data is accessed again in the LRU queue, it is moved to the head of the LRU queue;
  • 5) LRU queues eliminate the data at the end.

Problem: LRU cache contamination drawback: When FIFO capacity is 2, the access load is: ABCABCABC will degrade to FIFO and LRU will not be used.


Part IV Database

In this part of the interview, I asked a lot of questions and I don’t know much, I will not write a detailed summary, I am also referring to these several learning.

Database often meet questions (with answers)

1. What does an index do? And what are its advantages and disadvantages?

An index is a special query table that a database search can use to speed up the retrieval of data. It is very similar to the real life book catalog, you do not need to search the entire book to find the desired data. Indexes can be unique, and creating an index allows you to specify a single column or multiple columns. The disadvantage is that it slows down data entry and increases the size of the database.

2. Four basic properties of ACID

1. Atomicity: Either all or none. 2. Consistency: Only valid data can be written. 3. Isolation: Allows concurrent access by multiple users. 4. Persistence: After a transaction, the results of the transaction must be solidified. That is, once committed, changes to the database are permanent.


Second, online programming questions

Likou and Jianzhi Offer are recommended. The wechat public account of electronic PDF is required to reply “Jianzhi Offer” to obtain the link of web disk.

1. Longest common substring

Given two strings, find the length of the longest common subsequence in which each character can be found in the same order as in the original string.

Sample Input  
abcfbc abfcab  
programming contest 
abcd mnp  
 
Sample Output  
4 
2 
0
Copy the code

Longest Common Subsequence (POJ1458)

Tip: if you click on this link, you can skip to the rest of this question and skip to the extension “Longest Common Subsequence (LCS problem).”

Algorithm idea:

  1. Form two strings in rows and columns to form a two-dimensional matrix.

  2. Compare whether each point in the two-dimensional matrix is equal in the row and column characters, if so, set the value to 1, otherwise set it to 0.

  3. The longest common substring is found by finding the longest diagonal with a value of 1.

For the above two strings, the two-dimensional matrix can be obtained as follows:

As you can see from the figure above, str1 and STR2 have 5 common substrings, but the longest common substring is 5.

In order to further optimize the efficiency of the algorithm, we can calculate the value of a two-dimensional matrix by the way to calculate the length of the current longest common substring, that is, the value of a two-dimensional matrix elements by record[I][j]=1 evolved into record[I][j]=1 +record[I-1][J-1], This avoids subsequent attempts to find the diagonal length. The modified two-dimensional matrix is as follows:

When A [I]! = B [j], dp [I] [j] = 0 when A [I] = = B [j], if I = 0 | | j = = 0, dp [I] [j] = 1 or dp [I] [j] = dp [I - 1]] [j - 1 + 1Copy the code

Implementation source code: Violence method:

string getLCS(string str1, string str2) {
	vector<vector<int> > record(str1.length(), vector<int>(str2.length()));
	int maxLen = 0, maxEnd = 0;
	for(int i=0; i<static_cast<int>(str1.length()); ++i)
		for (int j = 0; j < static_cast<int>(str2.length()); ++j) {
			if (str1[i] == str2[j]) {
				if (i == 0 || j == 0) {
					record[i][j] = 1;
				}
				else {
					record[i][j] = record[i - 1][j - 1] + 1; }}else {
				record[i][j] = 0;
			}
 
 
			if (record[i][j] > maxLen) {
				maxLen = record[i][j];
				maxEnd = i; LCS = str1; // LCS = str1}}return str1.substr(maxEnd - maxLen + 1, maxLen);
}
Copy the code

Dynamic programming method:

public int getLCS(String s, String t) {
        if (s == null || t == null) {
            return 0;
        }
        int result = 0;
        int sLength = s.length();
        int tLength = t.length();
        int[][] dp = new int[sLength][tLength];
        for (int i = 0; i < sLength; i++) {
            for (int k = 0; k < tLength; k++) {
                if (s.charAt(i) == t.charAt(k)) {
                    if (i == 0 || k == 0) {
                        dp[i][k] = 1;
                    } else {
                        dp[i][k] = dp[i - 1][k - 1] + 1;
                    }
                    result = Math.max(dp[i][k], result);
                } else {
                    dp[i][k] = 0; }}}return result;
    }
Copy the code

To simplify the recursion formula:

When A [I]! = B[j], dp[I][j] = 0 otherwise, dp[I][j] = dp[i-1][j-1] + 1 all boil down to a formula, the default value of two-dimensional array is 0Copy the code

Rows and columns have one more row to fit the formula.

Reference: Longest common substring (dynamic programming)

Longest Common Subsequence and Longest Common Substring (Python)

Longest Common Subsequence (LCS problem)

Given two strings A and B of length m and n, find their longest common subsequence and return the length. For example: A = “HelloWorld”; B = “loop”

Then the longest common subsequence of A and B is “loo” and the returned length is 5.

Dynamic programming algorithm is often tested in the interview content, more programming exercises can refer to the following several topics (attached with analysis and answers).

  • Analysis and solution of common dynamic programming problems
  • 47: sword refers to offer the great value of the gift analytic: www.jianshu.com/p/c02a8f14f…
  • Offer 48: the longest substring without repeated characters: blog.csdn.net/HackQ_sxj/a…

2. Randomly generate n positive integers between [0. M]. What is the time complexity?

var nums = new int[100];
var random = new Random();
for (int i = 0; i < 100; i++)
{
    nums[i] = i;
}
for (int i = 0; i < 100; i++)
{
    var r = random.Next(i, 99);
    Swap(ref nums[i], ref nums[r]);
}
Copy the code

This problem analysis reference:

  • Algorithm: How to efficiently generate m non-repeating random numbers in the range of N (m<=n)

  • How to efficiently generate m non-repeating random numbers in n range (m<=n)

3. Database: Find a list of students with scores higher than 80

Analysis: To query the each course is more than 80 points, the student’s name, because a student has many subjects, may all courses more than 80 points, there may be some courses more than 80 points, other course is less than 80 points, may all the courses are less than 80 points, so we have to find out all the more than 80 points in the course of the student’s name, we can reverse thinking, Find the names of the students who scored less than 80 in some courses and all the students who scored less than 80 in all the courses. Then exclude those students and the remaining names are the students who scored more than 80 in all the courses.

-- Query each subject score is greater than90Student name id name course score1The small white Chinese91
2Small white mathematical88
3The little black Chinese79
4The little black mathematical92
5The little flower language99
6The little flower mathematical95
7The little flower English96
Copy the code

Implementation source code:

Create table aa(name varchar(10), kecheng varchar(10), Fengshu int) -- insert into AA values(' zhang3 ',' Chinese ',81) insert into AA values(' zhang3 ',' math ',75 Values (' 四',' 中 文',76) insert into AA values(' 四',' 中 文', 89) insert into AA values(' 四',' 中 文', 89 Select * from AA select * from AA select * from AA select * from AA select * from AA select * from AA select * from AA select * from AA select * from AA select * from AA  name not in (select distinct name from aa where fengshu<=80)Copy the code

This problem analysis reference:

  • Select * from student where score > 90
  • Interview question: Use an SQL statement to find the names of students who scored more than 80 in “each” course

Method 1: (binary code) With 3 mice, you can combine them into 8 states.

The first one was fed [1, 3, 5, 7] four reagents. The second one was fed [2, 3, 6, 7] four reagents. The third one was fed [4, 5, 6, 7] four reagents

  [3 2 1]
1  0 0 1 = 1  # 2, 3 are not dead, 1 is dead, indicating that the first reagent is toxic2, 0, 1, 0 is 2# 1, 3 are not dead, 2 is dead, indicating that the second reagent is toxic3, 0, 1, 1 is 3# 3 is not dead, 1 and 2 are dead, which means the third reagent is toxic4, 1, 0, 0 is 4# 1, 2 are not dead, 3 is dead, indicating that the fourth reagent is toxic5, 1, 0, 1 is 5# 2 is not dead. # 1 and # 3 are dead, indicating that the 5th value reagent is toxic6, 1, 1, 0 is 6# 1 is not dead, and # 2 and # 3 are dead, indicating that the sixth reagent is toxic7, 1, 1, 1 is 7# All three are dead, indicating reagent number 7 is toxic8, 0, 0, 0 is 0# All three are not dead, indicating reagent number 8 is toxic
Copy the code

Method 2 :(binary search)

Dichotomy, divide the reagent into two piles at a time, test it on two mice, and if one dies, determine which pile is toxic. And then continue to divide. So the number of mice is the number of times the reagent can be divided. Eight reagents can be divided three times, so you need three mice.

By the same token, it takes at least 10 drugs out of a thousand to find the one that works. This is a typical binary search algorithm, in general, we use serial dichotomy, if there is no time limit, we can use serial dichotomy to find the poison, the steps are as follows:

(1) First, the reagents were numbered from 1 to 1000. (2) The first mouse was fed with reagents from 1 to 500 mixed and waited for 24 hours. (3) If the mouse died, the second mouse was fed with reagents from 1 to 250 mixed, otherwise, reagents from 501 to 750 were fed. It would take up to 10 mice to find the poison.Copy the code

However, this problem has a time limit, so we need to give the drug to certain mice at the same time, and then find the poison when the mice die. The steps were as follows: (1) the first mouse was 1 to 500 (2) the second mouse was 1 to 250 + 501 to 750 (3) the third mouse was 1 to 125 + 251 to 500 + 501 to 625 + 751+875… In turn, since 2^9 < 1000 < 2^10, it takes 10 mice to find the poison.

== Test 2: == jump step problem

There are n steps, you can jump one step at a time, or you can jump two steps at a time. How many ways can you jump?

First, let’s consider the simplest case. If there is only one step, then obviously there is only one way to jump. If there are two steps, there are two ways to jump: one is divided into two jumps, each jump 1 step; The other is to jump two steps at a time.

Then, let’s talk about the general case. Let’s take the hop at n steps as a function of n, and call it f(n). When n>2, there are two different choices for the first jump: one is to jump only 1 step for the first time, and the number of jumps is equal to the number of jumps for the remaining N-1 steps, namely f(n-1); Another option is to jump 2 steps at a time, where the number of jumps is equal to the number of jumps for the remaining N-2 steps, which is f(n-2). So the total number of different jumps for n steps f(n)= F (n-1)+f(n-2). At this point, we can see that this is actually the Fibonacci sequence.

2 program implementation: C++

class Solution {
public:
    int jumpFloor(int number) {
        if(number <= 0) {return 0;
        }
        else if(number < 3) {return number;
        }
        int first = 1, second = 2, third = 0;
        for(int i = 3; i <= number; i++){
            third = first + second;
            first = second;
            second = third;
        }
        returnthird; }};Copy the code

Python

# -*- coding:utf-8 -*-
class Solution:
    def jumpFloor(self, number):
        # write code here
        if number < 3:
            return number
        first, second, third = 1.2.0
        for i in range(3, number+1):
            third = first + second
            first = second
            second = third
        return third
Copy the code

3 Draw inferences


Follow the wechat public number: Mai Microelectronics RESEARCH and development Club, get more exciting content, first in the personal public number.

Knowledge Planet: the community aims to prepare for fall/spring college entrance exams (including brush exams), interview and internal promotion opportunities, study routes, knowledge questions bank, etc.