This post is fairly long because it summarizes many topics I have been working with for a long time.

Py2 vs Py3

  • print becomes a function; in Python 2 it is a statement

  • There is no separate unicode type anymore; str is Unicode by default

  • In Python 3, the division operator / returns a floating point number

  • The long type is gone; int has arbitrary precision

  • xrange no longer exists; range replaces it

  • Function and variable names may contain non-ASCII characters such as Chinese

  • Extended unpacking and * unpacking

  • Keyword-only arguments: parameters after a bare * must be passed as name=value

  • raise from

  • iteritems() was removed; use items()

  • yield from delegates to sub-generators

  • asyncio and async/await: native coroutines support asynchronous programming

  • New modules: enum, mock, ipaddress, concurrent.futures, asyncio, urllib, selectors

    • Members of different enumeration classes cannot be compared
    • Only equality comparisons can be made between members of the same enumeration class
    • Use of enumeration classes (automatic numbering starts at 1 by default)
    • To avoid duplicate values in an enumeration class, decorate the class with @unique
# Considerations for enumeration
from enum import Enum

class COLOR(Enum):
    YELLOW = 1
    # YELLOW = 2  # reusing a member name raises an error
    GREEN = 1  # same value, so GREEN is an alias for YELLOW
    BLACK = 3
    RED = 4

print(COLOR.GREEN)  # output: COLOR.YELLOW (the alias resolves to YELLOW)
for i in COLOR:  # iterating over COLOR does not yield the alias GREEN
    print(i)
# output: COLOR.YELLOW\nCOLOR.BLACK\nCOLOR.RED
for i in COLOR.__members__.items():  # __members__ includes aliases
    print(i)
# output: ('YELLOW', <COLOR.YELLOW: 1>)\n('GREEN', <COLOR.YELLOW: 1>)\n('BLACK', <COLOR.BLACK: 3>)\n('RED', <COLOR.RED: 4>)
for i in COLOR.__members__:
    print(i)
# output: YELLOW\nGREEN\nBLACK\nRED

# Enumeration conversion
# For database access it is better to store enumeration values than member-name strings
# and to convert them back to enumeration members in your code
a = 1
print(COLOR(a))  # output: COLOR.YELLOW
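The two remaining points from the list above (numbering from 1 and @unique) can be illustrated with a minimal sketch. The Weekday and Status enums are hypothetical names invented for this example.

```python
from enum import Enum, unique

# Functional API: members are numbered automatically, starting at 1
Weekday = Enum("Weekday", "MON TUE WED")  # hypothetical example enum

duplicate_error = None
try:
    @unique
    class Status(Enum):  # hypothetical example enum
        ACTIVE = 1
        ENABLED = 1  # duplicate value: @unique rejects the alias
except ValueError as exc:
    duplicate_error = str(exc)

print(Weekday.MON.value)
print(duplicate_error)
```

Without @unique the second member would silently become an alias, as GREEN does in the COLOR example.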

Py2/3 conversion tools

  • six module: compatibility layer for Python 2 and Python 3
  • 2to3 tool: converts Python 2 syntax to Python 3
  • __future__: use features of a later version

Commonly used libraries

  • collections, which you must know

    Segmentfault.com/a/119000001…

  • Python sorting and the heapq module

    Segmentfault.com/a/119000001…

  • Very useful methods of the itertools module

    Segmentfault.com/a/119000001…

Less common but important libraries

  • dis (bytecode analysis)

  • inspect (generator state)

  • cProfile (performance analysis)

  • bisect (maintaining sorted lists)

  • fnmatch

    • fnmatch(string, "*.txt")  # case-insensitive on Windows
    • Whether fnmatch is case-sensitive depends on the operating system
    • fnmatchcase is always case-sensitive
  • timeit (code execution time)

    def isLen(strString):
        # Prefer the ternary expression; it is faster
        return True if len(strString) > 6 else False

    def isLen1(strString):
        # Note the positions of False and True: the boolean result is used as the index
        return [False, True][len(strString) > 6]

    import timeit
    print(timeit.timeit('isLen1("5fsdfsdfsaf")', setup="from __main__ import isLen1"))

    print(timeit.timeit('isLen("5fsdfsdfsaf")', setup="from __main__ import isLen"))
  • contextlib

    • @contextlib.contextmanager turns a generator function into a context manager
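As a minimal sketch of that decorator: the code before yield runs on entry, the yielded value is bound by "as", and the finally block runs on exit. The managed function and the events list are hypothetical names used only for illustration.

```python
import contextlib

events = []  # records the order of execution for illustration

@contextlib.contextmanager
def managed(name):  # hypothetical resource name
    events.append(f"enter {name}")      # runs when the with block is entered
    try:
        yield name.upper()              # value bound by "as"
    finally:
        events.append(f"exit {name}")   # runs on exit, even after an error

with managed("db") as resource:
    events.append(f"using {resource}")

print(events)
```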
  • types (type objects for all types defined by the standard interpreter; can also mark a generator function as asynchronous)

    import types
    types.coroutine  # decorator that makes a generator function awaitable
  • html (escaping and unescaping HTML)
    import html
    html.escape("<h1>I'm Jim</h1>")  # output: '&lt;h1&gt;I&#x27;m Jim&lt;/h1&gt;'
    html.unescape('&lt;h1&gt;I&#x27;m Jim&lt;/h1&gt;')  # output: "<h1>I'm Jim</h1>"
  • mock (resolve test dependencies)
  • concurrent.futures (create process pools and thread pools)
from concurrent.futures import ThreadPoolExecutor, as_completed, wait

pool = ThreadPoolExecutor()
task = pool.submit(function, argument)  # does not block; returns a Future immediately
task.done()     # check whether the task is completed
task.result()   # blocks until the task finishes and returns its return value
task.cancel()   # cancel a task that has not started; returns True on success, otherwise False
task.add_done_callback(callback)  # register a callback that runs when the task completes
task.running()  # whether the task is currently executing; task is a Future object

for data in pool.map(function, argument_list):  # results come back in the order of the arguments
    print(data)  # the result of each completed task

as_completed(task_list)  # yields tasks one by one as they complete
wait(task_list, return_when=condition)  # blocks the main thread until the condition is met (three conditions are available)
  • selectors (wraps select; used for I/O multiplexing)
  • asyncio
future = asyncio.ensure_future(coroutine)  # equivalent to future = loop.create_task(coroutine)
future.add_done_callback(callback)  # add a callback that runs on completion
loop.run_until_complete(future)
future.result()  # view the return value
asyncio.wait(tasks)  # accepts an iterable of tasks
asyncio.gather(*tasks1, *tasks2)  # same results as wait, but gather can cancel in bulk
# A thread can have only one loop
# With loop.run_forever(), call loop.stop() before closing, otherwise an error is raised
# loop.run_forever() can also run non-coroutine callbacks
# Call loop.close() last, in a finally block
asyncio.all_tasks()  # get all tasks (asyncio.Task.all_tasks() in older versions), then iterate and call cancel() on each
# Use functools.partial(function, argument) to bind arguments for callbacks
loop.call_soon(function, argument)
loop.call_soon_threadsafe(function, argument)  # thread-safe variant
loop.call_later(delay, function, argument)
# In the same code block call_soon runs first; multiple call_later callbacks run in ascending order of delay
# If you must run blocking code, wrap it with loop.run_in_executor(executor, function, argument),
# which runs it in a thread; put the results in a task list and run them with wait(task_list)

# Implementing HTTP with asyncio
reader, writer = await asyncio.open_connection(host, port)
writer.write(request)  # send the request
lines = []
async for data in reader:
    data = data.decode("utf-8")
    lines.append(data)
# lines now stores the HTML
# as_completed(tasks) returns an iterable of coroutines that complete one by one

# Coroutine lock
async with Lock():
    ...

Python advanced

  • Interprocess communication:

    • Manager (built-in data structures shared in memory between multiple processes)
from multiprocessing import Manager, Process

def add_data(p_dict, key, value):
    p_dict[key] = value

if __name__ == "__main__":
    progress_dict = Manager().dict()

    first_progress = Process(target=add_data, args=(progress_dict, "bobby1", 22))
    second_progress = Process(target=add_data, args=(progress_dict, "bobby2", 23))

    first_progress.start()
    second_progress.start()
    first_progress.join()
    second_progress.join()

    print(progress_dict)
    • Pipe (only between two processes)
from multiprocessing import Pipe, Process

# Pipe performs better than Queue
def producer(pipe):
    pipe.send("bobby")

def consumer(pipe):
    print(pipe.recv())

if __name__ == "__main__":
    receive_pipe, send_pipe = Pipe()
    # A Pipe can only be used between two processes
    my_producer = Process(target=producer, args=(send_pipe,))
    my_consumer = Process(target=consumer, args=(receive_pipe,))

    my_producer.start()
    my_consumer.start()
    my_producer.join()
    my_consumer.join()
    • Queue (cannot be used with process pools; use Manager().Queue() for communication between pool workers)
import time
from multiprocessing import Queue, Process

def producer(queue):
    queue.put("a")
    time.sleep(2)

def consumer(queue):
    time.sleep(2)
    data = queue.get()
    print(data)

if __name__ == "__main__":
    queue = Queue(10)
    my_producer = Process(target=producer, args=(queue,))
    my_consumer = Process(target=consumer, args=(queue,))
    my_producer.start()
    my_consumer.start()
    my_producer.join()
    my_consumer.join()
    • Process pool
import time
from multiprocessing import Manager, Pool

def producer(queue):
    queue.put("a")
    time.sleep(2)

def consumer(queue):
    time.sleep(2)
    data = queue.get()
    print(data)

if __name__ == "__main__":
    queue = Manager().Queue(10)  # a plain multiprocessing.Queue cannot be passed to pool workers
    pool = Pool(2)

    pool.apply_async(producer, args=(queue,))
    pool.apply_async(consumer, args=(queue,))

    pool.close()
    pool.join()
  • Several commonly used members of the sys module

    • argv: list of command-line arguments; the first is the path of the program itself
    • path: the module search path
    • modules.keys(): names of all modules that have been imported
    • exit(0): exits the program
  • Simplifying a in s or b in s or c in s

    • Use any(); note that all() returns True for an empty iterable
    # Method 1
    True in [i in s for i in [a, b, c]]
    # Method 2
    any(i in s for i in [a, b, c])
    # Method 3: a non-empty result list is truthy
    list(filter(lambda x: x in s, [a, b, c]))
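A small sketch of how any() and all() behave, including the empty-iterable edge case mentioned above. The string and the candidate substrings are made up for illustration.

```python
s = "hello world"
candidates = ["xyz", "wor", "abc"]  # hypothetical substrings to look for

found = any(c in s for c in candidates)      # stops at the first match
all_found = all(c in s for c in candidates)  # stops at the first miss

# Edge cases: an empty iterable
empty_any = any([])
empty_all = all([])

print(found, all_found, empty_any, empty_all)
```

Because all([]) is True, code that means "every item matched and there was at least one item" needs an explicit emptiness check.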
  • Set operations

    • {1, 2}.issubset({1, 2, 3})
    • {1, 2, 3}.issuperset({1, 2})
    • {1, 2}.isdisjoint({3, 4})  # check whether the intersection of two sets is empty
  • Matching Chinese characters in code

    • The regular expression [\u4e00-\u9fa5] matches the range of common Chinese characters
  • View the default encoding

    import sys
    sys.getdefaultencoding()    # sys.setdefaultencoding() sets it, but only exists in Python 2
  • __getattr__ vs __getattribute__
class A(dict):
    def __getattr__(self, value):  # called only when the accessed attribute does not exist
        return 2
    def __getattribute__(self, item):  # intercepts every attribute access
        return item
  • Class variables are not stored in the instance's __dict__, only in the class's __dict__

  • globals/locals

    • globals() holds all names and values defined in the current module
    • locals() holds all names and values defined in the current scope
  • Python name resolution mechanism (LEGB)

    • Local scope (Local)
    • Enclosing scope (Enclosing): the local scope of the function that encloses the current one
    • Global/module scope (Global)
    • Built-in scope (Built-in)
  • Split the numbers 1 to 100 into groups of three

    print([[x for x in range(1, 101)][i:i+3] for i in range(0, 100, 3)])
  • What is a metaclass?

    • When you create a class, pass metaclass=<your metaclass>. A metaclass must inherit from type rather than object, because type is itself the metaclass
type.__bases__  # (<class 'object'>,)
object.__bases__    # ()
type(object)    # <class 'type'>
    class Yuan(type):
        def __new__(cls, name, bases, attr, **kwargs):
            # Delegate to type.__new__ so that the new class really has Yuan as its metaclass
            return super().__new__(cls, name, bases, attr, **kwargs)
    class MyClass(metaclass=Yuan):
        pass
  • What is duck typing (i.e., polymorphism)?

    • By default Python does not check the type of an argument; the code runs as long as the object supports the operations used
  • Deep copy and shallow copy

    • A deep copy copies the content; a shallow copy copies references (and increases the reference count)
    • The copy module implements deep copy
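A minimal sketch of the difference, using a made-up nested list: the shallow copy shares the inner list with the original, while the deep copy does not.

```python
import copy

original = [1, [2, 3]]
shallow = copy.copy(original)      # new outer list, same inner list object
deep = copy.deepcopy(original)     # inner list is copied as well

original[1].append(4)              # mutate the shared inner list

print(shallow)
print(deep)
```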
  • Unit testing

    • Ordinary test classes inherit from TestCase in the unittest module
    • pytest for quick tests (test functions start with test_, test files start with test_, test classes start with Test and cannot have an __init__ method)
    • coverage: measures test coverage
    import unittest

    class MyTest(unittest.TestCase):
        def setUp(self):  # runs before each test case
            print('Start of test of this method')

        def tearDown(self):  # runs after each test case
            print('End of test of this method')

        @classmethod
        def setUpClass(cls):  # must use the @classmethod decorator; runs once before all tests
            print('Start testing')

        @classmethod
        def tearDownClass(cls):  # must use the @classmethod decorator; runs once after all tests
            print('End of test')

        def test_a_run(self):
            self.assertEqual(1, 1)  # test case
  • The GIL is released based on the number of bytecode instructions executed or on a time slice, and it is also released when a thread performs an IO operation

  • What is a monkey patch?

    • Replacing functions or attributes at runtime, for example swapping blocking calls for non-blocking ones
  • What is introspection?

    • The ability to determine the type of an object at runtime: id, type, isinstance
  • Does Python pass arguments by value or by reference?

    • Neither. Python passes arguments by sharing (the function receives a reference to the same object), and default argument values are evaluated only once
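Both points can be shown with a short sketch. All function names here are hypothetical and exist only for this illustration.

```python
def append_item(item, bucket=[]):  # the default list is created once, at definition time
    bucket.append(item)
    return bucket

def append_item_safe(item, bucket=None):  # common workaround: create the list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

def rebind(values):
    values = [0]        # rebinds the local name only; the caller is unaffected

def mutate(values):
    values.append(0)    # mutates the shared object; the caller sees the change

first = append_item(1)
second = append_item(2)   # same list object as first

data = [1]
rebind(data)
after_rebind = list(data)
mutate(data)
print(second, after_rebind, data)
```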
  • Difference between else and finally in try-except-else-finally

    • else runs only when no exception occurred; finally runs whether or not an exception occurred
    • A single except clause can catch several exceptions at once, but to handle different exceptions differently we usually catch them in separate except clauses
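The execution order can be traced with a small sketch. The run function is a hypothetical helper that records which clauses executed.

```python
def run(value):
    steps = []
    try:
        result = 10 / value
    except (ZeroDivisionError, TypeError) as exc:  # one clause catching two exception types
        steps.append(f"except {type(exc).__name__}")
    else:
        steps.append(f"else {result}")   # only when the try body raised nothing
    finally:
        steps.append("finally")          # always runs
    return steps

print(run(2))
print(run(0))
print(run("a"))
```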
  • GIL (global interpreter lock)

    • Only one thread can execute Python bytecode at a time. This is a feature of CPython; some other interpreters, such as Jython, do not have it
    • CPU intensive: multiple processes + process pool
    • IO intensive: multithreading/coroutines
  • What is Cython

    • A tool that compiles Python code into C code
  • Generators and iterators

    • An iterable only needs to implement the __iter__ method

      • Objects that implement both the __next__ and __iter__ methods are iterators
    • Generators are created with generator expressions or with functions that use yield (a generator is a special kind of iterator)
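A minimal sketch of the two approaches, using a hypothetical countdown as the example: a hand-written iterator class and the equivalent generator function.

```python
from collections.abc import Iterable, Iterator

class Countdown:
    """Hand-written iterator: implements both __iter__ and __next__."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

def countdown(start):
    """Generator function: the iterator protocol is provided automatically."""
    while start > 0:
        yield start
        start -= 1

print(list(Countdown(3)), list(countdown(3)))
```

A list is iterable but not an iterator: it has __iter__ but no __next__.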

  • What is a coroutine

    • A lighter-weight way to multitask than threads

    • Implementations

      • yield
      • async/await
  • Underlying structure of dict

    • A hash table is used as the underlying structure to support fast lookups
    • The average lookup time complexity of a hash table is O(1)
    • The CPython interpreter uses open addressing with probing to resolve hash collisions
  • Hash table expansion and collision resolution

    • Chaining

    • Probing (open addressing): used by Python

      • Expansion: entries are copied one by one into a larger table
      • Collision resolution: probe for another free slot
    # Monkey patch example with gevent
    from gevent import monkey
    monkey.patch_all()  # replaces all blocking functions in the code; you can also specify which ones to patch
  • Determine whether a function is a generator or a coroutine
    co_flags = func.__code__.co_flags

    # Check whether it is a coroutine function (0x80 CO_COROUTINE | 0x100 CO_ITERABLE_COROUTINE)
    if co_flags & 0x180:
        return func

    # Check whether it is a generator function (0x20 CO_GENERATOR)
    if co_flags & 0x20:
        return func
  • Fibonacci problems and their variants
# A frog can jump up one step or two at a time. Find out how many ways the frog can jump up n steps.
# How many ways are there to cover a large 2*n rectangle with n small 2*1 rectangles?
# Method 1:
fib = lambda n: n if n <= 2 else fib(n - 1) + fib(n - 2)
# Method 2:
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return b

# A frog can jump up one step, two steps, ... or even n steps at a time. Find out how many ways the frog can jump up n steps.
fib = lambda n: n if n < 2 else 2 * fib(n - 1)
  • Get environment variables set on the computer
    import os
    os.getenv(env_name, None)  # get the environment variable; returns None if it does not exist
  • Garbage collection mechanism

    • Reference counting
    • Mark and sweep
    • Generational collection
    # View the generational collection thresholds
    import gc
    gc.get_threshold()  # output: (700, 10, 10)
  • True and False are equal to 1 and 0 in code and can be used directly in arithmetic with numbers. float('inf') denotes infinity

  • C10M/C10K

    • C10M: an 8-core CPU with 64 GB of memory maintains 10 million concurrent connections on a 10 Gbps network
    • C10K: a 1 GHz CPU with 2 GB of memory on a 1 Gbps network serves 10,000 clients over FTP
  • The difference between yield from and yield:

    • yield from must be followed by an iterable; yield has no such constraint
    • GeneratorExit is raised inside a generator when it is closed
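A short sketch of both points. The function names and the closed list are hypothetical and used only for illustration.

```python
def with_yield(chunks):
    for chunk in chunks:
        yield chunk          # yields each chunk object as-is

def with_yield_from(chunks):
    for chunk in chunks:
        yield from chunk     # delegates to the chunk, which must be iterable

closed = []

def watcher():
    try:
        while True:
            yield
    except GeneratorExit:    # raised at the paused yield when close() is called
        closed.append(True)
        raise

gen = watcher()
next(gen)     # advance to the first yield
gen.close()   # triggers GeneratorExit inside the generator

print(list(with_yield([[1, 2], [3]])), list(with_yield_from([[1, 2], [3]])), closed)
```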
  • Several uses of the single underscore

    • As a prefix in a name, it marks the variable as private by convention
    • During unpacking, it receives values that are to be discarded
    • In interactive mode, it holds the result of the last expression
    • It can separate digits in numeric literals (111_222_333)
  • If a loop exits through break, its else clause is not executed
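A minimal sketch of a loop with an else clause. The find function is a hypothetical helper.

```python
def find(items, target):
    for index, item in enumerate(items):
        if item == target:
            outcome = f"found at {index}"
            break                    # skips the else clause
    else:
        outcome = "not found"        # runs only if the loop finished without break
    return outcome

print(find([1, 2, 3], 2))
print(find([1, 2, 3], 9))
```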

  • Convert from base 10 to base 2

    def conver_bin(num):
        if num == 0:
            return "0"  # return a string, like the non-zero case
        re = []
        while num:
            num, rem = divmod(num, 2)
            re.append(str(rem))
        return "".join(reversed(re))
    conver_bin(10)
  • Given list1 = ['A', 'B', 'C', 'D'], how do you create new lists named after its elements: A = [], B = [], C = [], D = []?
    list1 = ['A', 'B', 'C', 'D']

    # Method 1
    for i in list1:
        globals()[i] = []   # can be used to implement reflection in Python

    # Method 2
    for i in list1:
        exec(f'{i} = []')   # exec executes statements given as a string
  • memoryview and bytearray
    # bytearray is mutable, bytes is immutable, and slicing a memoryview does not copy the data into a new object
    a = b'aaaaaa'
    ma = memoryview(a)
    ma.readonly  # True: a read-only memoryview
    mb = ma[:2]  # no new bytes object is created

    a = bytearray(b'aaaaaa')
    ma = memoryview(a)
    ma.readonly  # False: a writable memoryview
    mb = ma[:2]      # no new bytearray is created
    mb[:2] = b'bb'   # a change to mb is a change to ma
  • Ellipsis type
# A literal ... in code is the Ellipsis object; a list that contains itself is also printed with [...]
L = [1, 2, 3]
L.append(L)
print(L)    # output: [1, 2, 3, [...]]
  • Lazy evaluation
    class lazy(object):
        def __init__(self, func):
            self.func = func

        def __get__(self, instance, cls):
            val = self.func(instance)    # equivalent to calling area(c), where c is the Circle object below
            setattr(instance, self.func.__name__, val)  # cache the value on the instance
            return val

    class Circle(object):
        def __init__(self, radius):
            self.radius = radius

        @lazy
        def area(self):
            print('evaluate')
            return 3.14 * self.radius ** 2
  • Walk a folder: given a folder, collect the paths of all the files in it (recursively)
import os

all_files = []
def getAllFiles(directory_path):
    for sChild in os.listdir(directory_path):
        sChildPath = os.path.join(directory_path, sChild)
        if os.path.isdir(sChildPath):
            getAllFiles(sChildPath)
        else:
            all_files.append(sChildPath)
    return all_files
  • File storage, file name processing
# secure_filename converts a string to a safe filename
from werkzeug.utils import secure_filename
secure_filename("My cool movie.mov")  # output: My_cool_movie.mov
secure_filename("../../../etc/passwd")  # output: etc_passwd
secure_filename(u'i contain cool \xfcml\xe4uts.txt')  # output: i_contain_cool_umlauts.txt
  • Date formatting
from datetime import datetime

datetime.now().strftime("%Y-%m-%d")

import time
# time.strftime formats a struct_time such as localtime(); the float returned by time.time() cannot be formatted directly
time.strftime("%Y-%m-%d", time.localtime())
  • The strange behavior of += on a tuple element
# This raises an error, but the list inside the tuple still changes, because the id of t[1] does not change
t = (1, [2, 3])
t[1] += [4, 5]
# Calling append or extend on t[1] raises no error and succeeds
  • __missing__, which you should know
class Mydict(dict):
    def __missing__(self, key):  # value returned when Mydict is indexed with [] on a key that does not exist
        return key
  • + and +=
# + cannot concatenate a list with a tuple, while += can (it calls __iadd__, which works like extend(), so a tuple can be added); + also creates a new object
# Immutable objects have no __iadd__ method, so += falls back to __add__; tuples can therefore be combined with each other using +=
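A minimal sketch of these two behaviors. The variable names are made up for illustration.

```python
items = [1, 2]
alias = items
items += (3, 4)          # list.__iadd__ extends in place, so alias sees the change

try:
    items + (5,)         # list + tuple is not supported
    plus_error = None
except TypeError as exc:
    plus_error = type(exc).__name__

point = (1, 2)
point_alias = point
point += (3,)            # tuple has no __iadd__: a new tuple is built and the name is rebound

print(alias, plus_error, point, point_alias)
```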
  • How do I turn every element of an iterable into a key of a dictionary?
dict.fromkeys(['jim', 'han'], 21)  # output: {'jim': 21, 'han': 21}
  • Wireshark packet capture software

Network knowledge

  • What is HTTPS?

    • The secure version of HTTP: it requires a CA certificate, encrypts data, uses port 443, and the same website ranks higher in SEO when served over HTTPS
  • Common response status codes

    204 No Content           // The request was processed successfully; no entity body is returned
    206 Partial Content      // A GET range request was processed successfully
    303 See Other            // Temporary redirect; the client is expected to use GET
    304 Not Modified         // The requested resource can be served from the cache
    307 Temporary Redirect   // Temporary redirect; POST is not changed to GET
    401 Unauthorized         // Authentication failed
    403 Forbidden            // The resource request is rejected
    400 Bad Request          // Request parameters are incorrect
    201 Created              // Added or changed successfully
    503 Service Unavailable  // Server maintenance or overload
  • Idempotency and safety of HTTP request methods
  • WSGI
    # environ: a dict object that contains all the information about the HTTP request
    # start_response: a function that starts the HTTP response
    def application(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        return [b'<h1>Hello, web!</h1>']  # the body must be an iterable of bytes
  • RPC

  • CDN

  • Secure Sockets Layer (SSL) and its successor Transport Layer Security (TLS) are security protocols that provide security and data integrity for network communications.

  • SSH is short for Secure Shell and formulated by the Network Working Group of the IETF. SSH is a security protocol based on the application layer. SSH is a reliable protocol that provides security for remote login sessions and other network services. The SSH protocol can effectively prevent information leakage during remote management. SSH began as a program on UNIX systems and quickly expanded to other operating platforms. SSH, when used correctly, makes up for network vulnerabilities. SSH clients are applicable to multiple platforms. Almost all UNIX platforms — including HP-UX, Linux, AIX, Solaris, Digital UNIX, Irix, and others — can run SSH.

  • TCP/IP

    • TCP: connection-oriented/reliable/byte-stream based

    • UDP: connectionless/unreliable/datagram oriented

    • Three-way handshake and four-way teardown

      • Three-way handshake (SYN/SYN+ACK/ACK)
      • Four-way teardown (FIN/ACK/FIN/ACK)
    • Why does connecting take three steps while closing takes four?

      • After receiving a SYN packet from the client, the server can send SYN+ACK in a single packet: the ACK acknowledges the request and the SYN synchronizes. When the server receives a FIN packet, however, it may not be able to close the socket immediately, so it can only reply with an ACK to tell the client "I received your FIN". The server can send its own FIN only after all of its pending data has been sent, so the ACK and the FIN cannot be sent together. That is why closing takes four steps.
    • Why does the TIME_WAIT state last 2MSL before returning to CLOSED?

      • Logically, once all four packets are sent we could enter the CLOSED state directly, but we must assume that the network is unreliable and the last ACK may be lost. The TIME_WAIT state therefore exists so that an ACK that may have been lost can be resent.
  • XSS/CSRF

    • HttpOnly prevents JavaScript from reading and manipulating cookies, which effectively mitigates cookie theft through XSS

MySQL

  • How index structures evolved

    • Linear structure -> binary search -> hash -> binary search tree -> balanced binary tree -> multiway search tree -> balanced multiway search tree (B-tree)
  • Mysql Interview Summary basics

    Segmentfault.com/a/119000001…

  • Mysql Interview Summary advanced section

    Segmentfault.com/a/119000001…

  • Simple Mysql

    Ningning. Today / 2017/02/13 /…

  • When emptying a table, InnoDB deletes rows one by one, while MyISAM rebuilds the whole table

  • TEXT/BLOB columns cannot have default values, and no case conversion is performed when querying them

  • When does an index fail

    • A LIKE fuzzy query that starts with %

    • Implicit type conversion occurs

    • The leftmost prefix rule is not satisfied

      • For a multi-column index, if the first column is not used in the condition, the index is not used
    • Failure scenarios:

      • Avoid != or <> in WHERE clauses, otherwise the engine abandons the index and performs a full table scan
      • Avoid joining conditions with OR in the WHERE clause. This causes the engine to abandon indexes and perform a full table scan, even if the conditions are indexed
      • If the column type is a string, the value in the condition must be quoted, otherwise the index will not be used
      • Avoid applying functions to columns in the WHERE clause, which causes the engine to abandon indexes and perform a full table scan
        For example: select id from t where substring(name, 1, 3) = 'abc' (names that start with abc)
        should be changed to: select id from t where name like 'abc%'
        For example: select id from t where datediff(day, createdate, '2005-11-30') = 0 (rows created on '2005-11-30')
        should be changed to a range condition on createdate
      • Do not apply functions, arithmetic, or other expressions to the left of the "=" in the WHERE clause, or the system may not be able to use the index properly
      • Avoid expressions on columns in the WHERE clause as much as possible, since they can cause the engine to abandon indexes and perform a full table scan
        For example: select id from t where num/2 = 100
        should be changed to: select id from t where num = 100*2;
      • Columns with few distinct values, such as SET and ENUM columns, are not suitable for indexing. (ENUM columns can hold NULL, and trailing spaces are stripped from values automatically. SET is similar to ENUM, but can hold at most 64 members.)

      • If MySQL estimates that a full table scan is faster than an index, then the index is not used

  • What is a clustered index

    • B+Tree leaf nodes hold either the row data or pointers to it
    • MyISAM stores index and data separately and uses non-clustered indexes
    • InnoDB data files are themselves index files, and the primary key index is a clustered index

Summary of Redis commands

  • Why is it so fast?

    • Memory-based and written in C

    • Uses an I/O multiplexing model with non-blocking IO

    • Uses a single thread, which avoids switching between threads

      • Because Redis operates in memory, the CPU is not its bottleneck. The bottleneck is most likely the size of the machine's memory or the network bandwidth. Since a single thread is easy to implement and the CPU is not a bottleneck, a single-threaded design makes sense (multithreading brings a lot of trouble, after all).
    • Simple data structures

    • Redis built its own VM mechanism to reduce the time spent calling system functions

  • Advantages

    • High performance: Redis can perform 110,000 reads/s and 81,000 writes/s
    • Rich data types
    • Atomicity: every Redis operation is atomic, and Redis also supports executing several operations together atomically
    • Rich features: Redis also supports publish/subscribe, notifications, key expiration, and more
  • What are Redis transactions?

    • A mechanism that packages multiple requests and executes the commands in sequence in one go
    • Transactions are implemented with multi, exec, watch and related commands
    • In Python with redis-py: pipeline = conn.pipeline(transaction=True)
  • Persistence modes

    • RDB (snapshot)

      • save (synchronous; guarantees data consistency)
      • bgsave (asynchronous; the default on shutdown when AOF is not enabled)
    • AOF (append-only log)

  • How to implement queues

    • lpush
    • rpop
  • Commonly used data types (Bitmaps, HyperLogLogs, range queries, etc. are used less often)

    • String: counters

      • Integer or SDS (Simple Dynamic String)
    • List: a user's following list, a user's follower list

      • ziplist or doubly linked list (a ziplist is a contiguous block of memory in which each entry stores its length in its header)
    • Hash:

    • Set: a user's followers

      • intset or hashtable
    • Zset (sorted set): real-time leaderboards

      • skiplist
  • Differences from Memcached

    • Memcached can only store plain string key-value pairs
    • Memcached users can only APPEND data to the end of an existing string and treat that string as a list. When deleting such elements, Memcached hides them with a blacklist so that they are not read, updated, or deleted
    • Both Redis and Memcached store data in memory and are in-memory databases, but Memcached can also be used to cache other things, such as images and videos
    • Virtual memory: when physical memory runs out, Redis can swap values that have not been used for a long time to disk
    • Data safety: when Memcached goes down, the data is gone; Redis can save data to disk periodically (persistence)
    • Different application scenarios: Redis can be used as a NoSQL database, a message queue, a data stack, and a data cache; Memcached is suited to caching SQL statements, data sets, temporary user data, deferred query data, sessions, and so on
  • Implementing distributed locks with Redis

    • Use setnx to acquire the lock; a timeout can be added with expire
    • The value of the lock can be a random UUID or a specific name
    • When releasing the lock, compare the stored UUID with your own; if they match, delete the key to release the lock
  • Common problems

    • Cache avalanche

      • A large amount of cached data expires within a short period, so a large number of requests hit the database
    • Cache penetration

      • The requested data exists neither in the cache nor in the database
    • Cache warming

      • When the project starts, load commonly used data into the cache
    • Cache update

      • When data expires, the cached data is updated
    • Cache degradation

      • When traffic surges, when a service has problems (such as slow or missing responses), or when non-core services affect the performance of the core flow, the service must remain available, even in a degraded form. The system can degrade automatically based on key data, or be degraded manually through configuration switches
  • Consistent hashing algorithm

    • Keeps most keys mapped to the same node when nodes are added to or removed from a cluster
  • Implement a Redis-based distributed lock that accepts a timeout parameter

    • setnx
  • Virtual memory

  • Memory jitter
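
The lock steps above (set only if the key is absent, attach a timeout, release only when the stored UUID matches) can be sketched as follows. This is a minimal sketch under stated assumptions, not a production lock: `InMemoryRedis` is a hypothetical stand-in that mimics just the calls used here so the example runs without a server, and `acquire_lock` and `release_lock` are illustrative names. Instead of separate setnx and expire calls, it uses a single `set` with `nx=True` and `ex=...`, so the lock can never exist without a timeout. With redis-py, `redis.Redis.set(name, value, nx=True, ex=seconds)` takes the same arguments, but `get` returns bytes unless `decode_responses=True` is set.

```python
import time
import uuid


class InMemoryRedis:
    """Hypothetical stand-in that mimics the few Redis calls used below."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def _purge(self, key):
        # Drop the key if its timeout has passed
        item = self._data.get(key)
        if item and item[1] is not None and item[1] <= time.monotonic():
            del self._data[key]

    def set(self, key, value, nx=False, ex=None):
        self._purge(key)
        if nx and key in self._data:
            return None  # key already exists, so NX refuses to overwrite
        expiry = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expiry)
        return True

    def get(self, key):
        self._purge(key)
        item = self._data.get(key)
        return item[0] if item else None

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def acquire_lock(client, name, timeout=10):
    """Try once to take the lock; return the owner token, or None."""
    token = str(uuid.uuid4())
    if client.set(name, token, nx=True, ex=timeout):
        return token
    return None


def release_lock(client, name, token):
    """Delete the lock only if the stored token is still ours."""
    # Assumption: get + delete is not atomic here; a real deployment
    # would run this comparison in a server-side script.
    if client.get(name) == token:
        client.delete(name)
        return True
    return False


client = InMemoryRedis()
first = acquire_lock(client, "lock:order", timeout=5)
second = acquire_lock(client, "lock:order", timeout=5)  # lock is already held
print(first is not None, second)
print(release_lock(client, "lock:order", "not-the-owner"))
print(release_lock(client, "lock:order", first))
```

The timeout guards against a crashed holder keeping the lock forever, and the token check stops one client from releasing a lock that has since expired and been taken by another.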

Linux

  • The five Unix I/O models

    • Blocking I/O

    • Non-blocking I/O

    • I/O multiplexing (available in Python through the selectors module)

      • select

        • Suitable when concurrency is low and most connections are active
      • poll

        • Not much of an improvement over select
      • epoll

        • Suitable when the number of connections is large but few of them are active
    • Signal-driven I/O

    • Asynchronous I/O (Gevent/Asyncio implement asynchronous operation)

  • A command manual better than man

    • tldr: a manual with command examples
  • Difference between kill -9 and kill -15

    • -15 (SIGTERM): the outcome is not guaranteed; the program may stop immediately, stop after releasing its resources, or keep running
    • -9 (SIGKILL): because of the uncertainty of -15, use -9 to kill the process immediately
  • Paging mechanism (a memory management scheme that separates logical addresses from physical addresses):

    • Lets the operating system manage memory efficiently and reduces fragmentation
    • The logical address space of a program is divided into fixed-size pages
    • Physical memory is divided into frames of the same size
    • The page table maps logical addresses to physical addresses
  • Segmentation mechanism

    • Exists to satisfy logical requirements of the code
    • Data sharing/data protection/dynamic linking
    • Memory inside each segment is contiguous, while the segments themselves are allocated discretely
  • How to check CPU and memory usage?

    • top
    • free: check available memory and troubleshoot memory leaks
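
To make the I/O multiplexing item above concrete, here is a minimal sketch using Python's standard `selectors` module. It assumes two local socket pairs as stand-ins for client connections (the names `a_recv`, `b_recv` and the labels "a" and "b" are made up for the example). One thread watches both sockets and only touches the one that actually has data.

```python
import selectors
import socket

# DefaultSelector picks the best mechanism the platform offers
# (epoll, kqueue, poll or select).
sel = selectors.DefaultSelector()

# Two connected socket pairs stand in for two client connections.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()
for name, sock in (("a", a_recv), ("b", b_recv)):
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, data=name)

# Only connection "b" becomes active.
b_send.sendall(b"ping")

received = {}
# select() blocks until at least one registered socket is readable
for key, events in sel.select(timeout=1):
    received[key.data] = key.fileobj.recv(1024)

print(received)

sel.close()
for sock in (a_send, a_recv, b_send, b_recv):
    sock.close()
```

The idle connection costs nothing beyond its registration, which is why this model suits servers with many connections and few active ones.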

Design patterns

The singleton pattern

    # Method 1: class decorator
    def Single(cls):
        instances = {}  # one cached instance per decorated class
        def get_instance(*args, **kwargs):
            if cls not in instances:
                instances[cls] = cls(*args, **kwargs)
            return instances[cls]
        return get_instance

    @Single
    class B:
        pass

    # Method 2: create one instance, then delete the class
    class Single:
        def __init__(self):
            print("Singleton pattern implementation method two...")

    single = Single()
    del Single  # use the `single` instance every time from now on

    # Method 3 (the most common method): override __new__
    class Single:
        def __new__(cls, *args, **kwargs):
            if not hasattr(cls, '_instance'):
                # object.__new__ accepts no extra arguments
                cls._instance = super().__new__(cls)
            return cls._instance

The factory pattern

    class Dog:
        def __init__(self):
            print("Wang Wang Wang")
    class Cat:
        def __init__(self):
            print("Miao Miao Miao")


    def fac(animal):
        if animal.lower() == "dog":
            return Dog()
        if animal.lower() == "cat":
            return Cat()
        print("Sorry, it has to be: dog,cat.")

The builder pattern

    class Computer:
        def __init__(self, serial_number):
            self.serial_number = serial_number
            self.memory = None  # in GB
            self.hdd = None     # in GB
            self.gpu = None

        def __str__(self):
            info = (f'Memory: {self.memory}GB',
                    f'Hard Disk: {self.hdd}GB',
                    f'Graphics Card: {self.gpu}')
            return '\n'.join(info)

    class ComputerBuilder:
        def __init__(self):
            self.computer = Computer('Jim1996')

        # Each configure_* method returns self so that calls can be chained
        def configure_memory(self, amount):
            self.computer.memory = amount
            return self

        def configure_hdd(self, amount):
            self.computer.hdd = amount
            return self

        def configure_gpu(self, gpu_model):
            self.computer.gpu = gpu_model
            return self

    class HardwareEngineer:
        def __init__(self):
            self.builder = None

        def construct_computer(self, memory, hdd, gpu):
            self.builder = ComputerBuilder()
            self.builder.configure_memory(memory).configure_hdd(hdd).configure_gpu(gpu)

        @property
        def computer(self):
            return self.builder.computer

Data structures and algorithms

Built-in data structures and algorithms

Python implements a variety of data structures

Quick sort

    def quick_sort(_list):
        if len(_list) < 2:
            return _list
        pivot_index = 0
        pivot = _list[pivot_index]
        # Partition the remaining elements around the pivot;
        # duplicates of the pivot go to the left side
        left_list = [i for i in _list[pivot_index + 1:] if i <= pivot]
        right_list = [i for i in _list[pivot_index + 1:] if i > pivot]
        return quick_sort(left_list) + [pivot] + quick_sort(right_list)

Selection sort

    def select_sort(seq):
        n = len(seq)
        for i in range(n - 1):
            min_idx = i  # index of the smallest element found in seq[i:]
            for j in range(i + 1, n):
                if seq[j] < seq[min_idx]:
                    min_idx = j
            if min_idx != i:
                seq[i], seq[min_idx] = seq[min_idx], seq[i]

Insertion sort

    def insertion_sort(_list):
        n = len(_list)
        for i in range(1, n):
            value = _list[i]
            pos = i
            # Shift larger elements one slot to the right
            while pos > 0 and value < _list[pos - 1]:
                _list[pos] = _list[pos - 1]
                pos -= 1
            _list[pos] = value
        return _list

Merge sort

    def merge_sorted_list(_list1, _list2):   # merge two sorted lists
        len_a, len_b = len(_list1), len(_list2)
        a = b = 0
        sort = []
        while len_a > a and len_b > b:
            if _list1[a] > _list2[b]:
                sort.append(_list2[b])
                b += 1
            else:
                sort.append(_list1[a])
                a += 1
        # Append whatever is left in either list
        if len_a > a:
            sort.extend(_list1[a:])
        if len_b > b:
            sort.extend(_list2[b:])
        return sort

    def merge_sort(_list):
        if len(_list) < 2:
            return _list
        else:
            mid = len(_list) // 2
            left = merge_sort(_list[:mid])
            right = merge_sort(_list[mid:])
            return merge_sorted_list(left, right)

Heapq module for heap sorting

    from heapq import nsmallest
    def heap_sort(_list):
        return nsmallest(len(_list),_list)
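
Calling nsmallest with the full length produces the whole list in sorted order, but it hides the heap operations themselves. A minimal sketch that makes them explicit, assuming a helper name `heap_sort_explicit` that is not in the original:

```python
import heapq


def heap_sort_explicit(items):
    heap = list(items)   # copy so the caller's list is left untouched
    heapq.heapify(heap)  # build a min-heap in O(n)
    # Popping n times yields the elements in ascending order
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort_explicit([5, 2, 9, 1, 5, 6]))
```

Each pop costs O(log n), so the whole sort is O(n log n).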

The stack

    from collections import deque
    class Stack:
        def __init__(self):
            self.s = deque()
        def peek(self):
            # Pop the top element and push it straight back
            p = self.pop()
            self.push(p)
            return p
        def push(self, el):
            self.s.append(el)
        def pop(self):
            return self.s.pop()

The queue

    from collections import deque
    class Queue:
        def __init__(self):
            self.s = deque()
        def push(self, el):
            self.s.append(el)
        def pop(self):
            # Remove from the left end so elements leave in FIFO order
            return self.s.popleft()
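
A related interview question from the list further down asks how to implement a queue with two stacks. One possible sketch, using plain Python lists as the stacks (the class name `TwoStackQueue` and its attribute names are made up for this example):

```python
class TwoStackQueue:
    def __init__(self):
        self._inbox = []   # stack that receives pushed elements
        self._outbox = []  # stack that serves elements in FIFO order

    def push(self, el):
        self._inbox.append(el)

    def pop(self):
        if not self._outbox:
            # Emptying the inbox into the outbox reverses the order,
            # so the oldest element ends up on top
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("pop from an empty queue")
        return self._outbox.pop()


q = TwoStackQueue()
for n in (1, 2, 3):
    q.push(n)
print(q.pop(), q.pop())
q.push(4)
print(q.pop(), q.pop())
```

Each element is moved between the stacks at most once, so push and pop are O(1) amortized.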

Binary search

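    def binary_search(_list, num, low=0, high=None):
        # Return the index of num in the sorted _list, or -1 if it is absent
        if high is None:
            high = len(_list) - 1
        if low > high:
            return -1
        mid = (low + high) // 2
        if num > _list[mid]:
            return binary_search(_list, num, mid + 1, high)
        elif num < _list[mid]:
            return binary_search(_list, num, low, mid - 1)
        else:
            return mid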

Interview enhancement questions:

About database optimization and design

  • How to implement a queue with two stacks

  • Reverse a linked list

  • Merge two sorted linked lists

  • Delete a linked list node

  • Invert a binary tree

  • Design a short URL service? (base-62 implementation)

  • Design a feed stream?

  • Why is it better to use auto-incrementing integers as primary keys in MySQL? Is it OK to use a UUID? Why?

    • InnoDB is most efficient when rows are written in the same order as the leaf nodes of the B+ tree index, so an auto-increment ID primary key gives the best storage and query performance.
    • InnoDB's primary index stores rows sorted by primary key. Because UUIDs are unordered, inserts land at random positions and put heavy I/O pressure on InnoDB, so a UUID is a poor physical primary key. It can serve as the logical primary key while the physical primary key remains an auto-increment ID. For global uniqueness, keep the UUID in an indexed column and use it to join other tables or as a foreign key
  • How do we generate auto-increment database IDs in a distributed system?

    • Use Redis
  • Implement a Redis-based distributed lock that accepts a timeout parameter

    • setnx
    • setnx + expire
  • What happens if a single Redis node goes down? Are there other industry solutions for implementing distributed locks?

    • Use a consistent hashing algorithm
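
Consistent hashing comes up twice in these notes, so here is a rough sketch of the idea. It is an illustration under stated assumptions rather than a reference implementation: the class name `HashRing`, the node names, the use of SHA-256 as the hash, and 100 virtual nodes per server are all choices made for this example. Nodes and keys are hashed onto the same ring, and a key belongs to the first node position found clockwise from it.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node
        self._ring = {}           # hash position -> node name
        self._positions = []      # sorted hash positions
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            self._ring[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            del self._ring[pos]
            self._positions.remove(pos)

    def get_node(self, key):
        if not self._positions:
            raise LookupError("the ring has no nodes")
        # First position clockwise from the key's hash, wrapping around
        idx = bisect.bisect(self._positions, self._hash(key)) % len(self._positions)
        return self._ring[self._positions[idx]]


ring = HashRing(["redis-a", "redis-b", "redis-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.remove_node("redis-c")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the removed node change owner
print(all(before[k] == "redis-c" for k in moved))
```

With plain modulo hashing, removing one of three nodes would remap most keys. On the ring, keys owned by the surviving nodes stay where they are.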

Caching algorithm

  • LRU (least recently used): evicts the object that has gone unused for the longest time
  • LFU (least frequently used): evicts the data that is used least often. The idea is that data rarely used in the recent period is unlikely to be used in the near future
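
A small LRU cache can be sketched with `collections.OrderedDict`, which remembers insertion order and can move a key to the end. The class name `LRUCache` and its `get`/`put` methods are illustrative choices for this example. For memoizing function calls, the standard library already ships `functools.lru_cache`.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.put("c", 3)  # the cache is full, so "b" is evicted
print(cache.get("b"), cache.get("a"), cache.get("c"))
```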

Direction of server performance optimization

  • Use appropriate data structures and algorithms

  • Database

    • Index optimization

    • Slow query elimination

      • Enable the slow query log (slow_query_log_file) and inspect it
      • Troubleshoot index problems with EXPLAIN
      • Adjust the data or modify the indexes
    • Batch operations to reduce I/O

    • Use NoSQL, such as Redis

  • Network I/O

    • Batch operations
    • pipeline
  • Caching

    • Redis
  • Asynchronous processing

    • Use asyncio for asynchronous operations
    • Use Celery to reduce I/O blocking
  • Concurrency

    • Multithreading
    • Gevent
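
As a rough illustration of the asynchronous item, the sketch below runs three simulated I/O calls concurrently with asyncio. The `fetch` coroutine and its delays are invented for the example, and `asyncio.sleep` stands in for a real network or database call.

```python
import asyncio
import time


async def fetch(name, delay):
    # asyncio.sleep stands in for a network or database call
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather runs the three coroutines concurrently on one thread
    results = await asyncio.gather(
        fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2)
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to the longest single delay rather than the sum of all three.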