Python code hot update implementation

Working implementations of Python hot updates already exist, such as IPython's autoreload.py and PyDev's pydevd_reload.py.

But if you had to implement it yourself, where would you start?

What is hot update

In simple terms, hot updating means that a running process loads modified program code without restarting, and then behaves as expected with the new code. In practice, hot updates have two main uses:

During development: improve efficiency by making code changes take effect immediately and avoiding frequent restarts
During operation and maintenance: fix urgent bugs without taking the server offline

For bug fixes, hot updates are less necessary if the server does not maintain state, but if the server holds complex state, hot updates can be a more appropriate option.

Hot update essentials

Python code is organized in modules, and code hot updates are module hot updates.

The built-in Python function reload (importlib.reload in Python 3) reloads a module, but calling reload alone does not solve the hot update problem. The key point of hot updating is that objects which already exist must execute the updated code. This is what most of the code in autoreload.py and pydevd_reload.py deals with.
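To make the limitation of a plain reload concrete, here is a minimal sketch for Python 3. The module name hot_update_demo and the temporary directory are made up for the demonstration. A reference taken before the reload keeps running the old code, while a fresh lookup through the module sees the new code.

```python
import importlib
import os
import sys
import tempfile

# Skip .pyc files so the second version of the source is always recompiled.
sys.dont_write_bytecode = True

# Hypothetical throwaway module, written to a temporary directory.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, 'hot_update_demo.py')
sys.path.insert(0, module_dir)

with open(module_path, 'w') as source_file:
  source_file.write("def greet():\n  return 'old'\n")

importlib.invalidate_caches()
hot_update_demo = importlib.import_module('hot_update_demo')

# A reference created before the reload, like a callback stored elsewhere.
saved_greet = hot_update_demo.greet

with open(module_path, 'w') as source_file:
  source_file.write("def greet():\n  return 'new version'\n")

importlib.reload(hot_update_demo)

print(hot_update_demo.greet())  # looked up again on the module: new code
print(saved_greet())            # the old function object: still old code
```

The reload rebinds the name greet in the module to a new function object, but it cannot reach references that were handed out earlier. Closing that gap is the job of the update functions developed in the rest of this article.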

Breaking down the update operation

Updating ordinary functions

In the whole hot update logic, updating functions is the most important part, because functions are the units that execute concrete logic. Following the implementations mentioned above, a function update can be defined as follows:

def update_function(old_func, new_func):
  old_func.__doc__ = new_func.__doc__
  old_func.__dict__ = new_func.__dict__
  old_func.__defaults__ = new_func.__defaults__
  old_func.__code__ = new_func.__code__

The function above can be verified with a simple test:

import unittest


def old_foo():
  return 'old_foo'


def new_foo():
  return 'new_foo'


class ReloadTest(unittest.TestCase):
  def test_update_function(self):
    self.assertEqual('old_foo', old_foo())
    update_function(old_foo, new_foo)
    self.assertEqual('new_foo', old_foo())

Updating decorated functions

The current implementation passes the test case above and is consistent with _update_function in pydevd_reload.py. However, pydevd_reload.py itself states:

Functions and methods using decorators (other than classmethod and staticmethod) are not handled correctly.

That is, this implementation does not support decorators. To confirm, extend the test cases:

def decorator(func):
  def _(*args, **kwargs):
    return func(*args, **kwargs)
  return _


@decorator
def old_foo_with_decorator():
  return 'old_foo'


@decorator
def new_foo_with_decorator():
  return 'new_foo'


class ReloadTest(unittest.TestCase):
  def test_update_function_with_decorator1(self):
    self.assertEqual('old_foo', old_foo_with_decorator())
    update_function(old_foo_with_decorator, new_foo_with_decorator)
    self.assertEqual('new_foo', old_foo_with_decorator())

  def test_update_function_with_decorator2(self):
    self.assertEqual('old_foo', old_foo())
    update_function(old_foo, new_foo_with_decorator)
    self.assertEqual('new_foo', old_foo())

Both cases fail. In the first case, update_function has no effect: both functions are wrapped by the same decorator, so only the identical wrapper code is swapped while the wrapped function stays unchanged. This can be fixed by recursing into the closure, modifying update_function as follows:

import types


def both_instance_of(first, second, klass):
  return isinstance(first, klass) and isinstance(second, klass)


def update_function(old_func, new_func):
  old_func.__code__ = new_func.__code__
  old_func.__defaults__ = new_func.__defaults__
  old_func.__doc__ = new_func.__doc__
  old_func.__dict__ = new_func.__dict__
  if not old_func.__closure__ or not new_func.__closure__:
    return
  for old_cell, new_cell in zip(old_func.__closure__, new_func.__closure__):
    if not both_instance_of(old_cell.cell_contents, new_cell.cell_contents, types.FunctionType):
      continue
    update_function(old_cell.cell_contents, new_cell.cell_contents)

The wrapped function is reachable through the free variables of the wrapper that the decorator returns, so it can be updated by recursively walking the function's closure cells.
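The relationship between a wrapper and the function it wraps can be inspected directly. The short sketch below uses made-up names (decorator, wrapper, foo) and shows that the wrapper's code object lists func as a free variable, and that the matching closure cell holds the original function.

```python
import types


def decorator(func):
  def wrapper(*args, **kwargs):
    return func(*args, **kwargs)
  return wrapper


@decorator
def foo():
  return 'foo'


# The name `foo` is now bound to `wrapper`; the original function lives
# in the wrapper's closure.
free_variable_names = foo.__code__.co_freevars
wrapped = foo.__closure__[0].cell_contents

print(free_variable_names)  # ('func',)
print(wrapped.__name__)     # foo
print(isinstance(wrapped, types.FunctionType))  # True
```

Each entry in co_freevars corresponds by position to a cell in __closure__, which is why update_function can pair up old and new cells with zip.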

The second case raises the following exception:

ValueError: _() requires a code object with .. free vars, not ..
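The error can be reproduced in isolation. This is a small sketch with hypothetical function names, in which a function without a closure is assigned the code object of a function that has one free variable.

```python
def plain():
  return 'plain'


def make_closure():
  captured = 'captured'

  def inner():
    return captured
  return inner


closure_func = make_closure()

error_message = None
try:
  # `plain` has no closure cells, but the new code object expects one.
  plain.__code__ = closure_func.__code__
except ValueError as exc:
  error_message = str(exc)

print(error_message)
print(plain())  # the failed assignment leaves the function unchanged
```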

This exception is raised because Python enforces a check in the func_set_code function of funcobject.c, which requires the new code object to have as many free variables as the function's closure has cells. Getting around it is probably impossible without modifying the Python source code. So update_function needs a small adjustment: in this situation it neither updates the function nor raises an exception,

def update_function(old_func, new_func):
  if not both_instance_of(old_func, new_func, types.FunctionType):
    return
  if len(old_func.__code__.co_freevars) != len(new_func.__code__.co_freevars):
    return
  old_func.__code__ = new_func.__code__
  old_func.__defaults__ = new_func.__defaults__
  old_func.__doc__ = new_func.__doc__
  old_func.__dict__ = new_func.__dict__
  if not old_func.__closure__ or not new_func.__closure__:
    return
  for old_cell, new_cell in zip(old_func.__closure__, new_func.__closure__):
    if not both_instance_of(old_cell.cell_contents, new_cell.cell_contents, types.FunctionType):
      continue
    update_function(old_cell.cell_contents, new_cell.cell_contents)
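As a quick check of the final version, the following self-contained sketch repeats update_function and runs both decorator cases under Python 3. The test function names are invented for the example.

```python
import types


def both_instance_of(first, second, klass):
  return isinstance(first, klass) and isinstance(second, klass)


def update_function(old_func, new_func):
  if not both_instance_of(old_func, new_func, types.FunctionType):
    return
  # Skip silently when the closure shapes differ; assigning __code__
  # would raise ValueError in that case.
  if len(old_func.__code__.co_freevars) != len(new_func.__code__.co_freevars):
    return
  old_func.__code__ = new_func.__code__
  old_func.__defaults__ = new_func.__defaults__
  old_func.__doc__ = new_func.__doc__
  old_func.__dict__ = new_func.__dict__
  if not old_func.__closure__ or not new_func.__closure__:
    return
  # Recurse into functions captured by the closure (decorated functions).
  for old_cell, new_cell in zip(old_func.__closure__, new_func.__closure__):
    if both_instance_of(old_cell.cell_contents, new_cell.cell_contents,
                        types.FunctionType):
      update_function(old_cell.cell_contents, new_cell.cell_contents)


def decorator(func):
  def wrapper(*args, **kwargs):
    return func(*args, **kwargs)
  return wrapper


@decorator
def old_decorated():
  return 'old'


@decorator
def new_decorated():
  return 'new'


def old_plain():
  return 'old'


# Case 1: both sides decorated, so the wrapped function is updated.
update_function(old_decorated, new_decorated)
print(old_decorated())  # new

# Case 2: plain versus decorated, so the update is skipped without error.
update_function(old_plain, new_decorated)
print(old_plain())  # old
```

The second case shows the trade-off of the guard: the call no longer raises, but the function is also not updated, so adding or removing a decorator still requires a restart.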

Updating classes

With function updates handled, class updates can be implemented. They cover ordinary methods, class methods, static methods, and properties, and also need to deal with attributes that are added or removed. The version below handles additions only.

def update_class(old_class, new_class):
  for name, new_attr in new_class.__dict__.items():
    if name not in old_class.__dict__:
      setattr(old_class, name, new_attr)
    else:
      old_attr = old_class.__dict__[name]
      if both_instance_of(old_attr, new_attr, types.FunctionType):
        update_function(old_attr, new_attr)
      elif both_instance_of(old_attr, new_attr, staticmethod):
        update_function(old_attr.__func__, new_attr.__func__)
      elif both_instance_of(old_attr, new_attr, classmethod):
        update_function(old_attr.__func__, new_attr.__func__)
      elif both_instance_of(old_attr, new_attr, property):
        update_function(old_attr.fdel, new_attr.fdel)
        update_function(old_attr.fget, new_attr.fget)
        update_function(old_attr.fset, new_attr.fset)
      elif both_instance_of(old_attr, new_attr, (type, types.ClassType)):
        update_class(old_attr, new_attr)

However, changes to __slots__ or __metaclass__ cannot be applied correctly. Note also that types.ClassType exists only in Python 2, where it matches old-style classes.
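The sketch below adapts update_class to Python 3 and shows that an instance created before the update picks up the new behavior. It is an illustration under assumptions, not the author's exact code: types.ClassType is dropped, update_function is reduced to a version without closure handling, and the Account classes are invented for the example.

```python
import types


def both_instance_of(first, second, klass):
  return isinstance(first, klass) and isinstance(second, klass)


def update_function(old_func, new_func):
  # Simplified stand-in: no closure recursion.
  if not both_instance_of(old_func, new_func, types.FunctionType):
    return
  if len(old_func.__code__.co_freevars) != len(new_func.__code__.co_freevars):
    return
  old_func.__code__ = new_func.__code__
  old_func.__defaults__ = new_func.__defaults__


def update_class(old_class, new_class):
  for name, new_attr in list(new_class.__dict__.items()):
    if name not in old_class.__dict__:
      setattr(old_class, name, new_attr)
      continue
    old_attr = old_class.__dict__[name]
    if both_instance_of(old_attr, new_attr, types.FunctionType):
      update_function(old_attr, new_attr)
    elif (both_instance_of(old_attr, new_attr, staticmethod) or
          both_instance_of(old_attr, new_attr, classmethod)):
      update_function(old_attr.__func__, new_attr.__func__)
    elif both_instance_of(old_attr, new_attr, property):
      for accessor in ('fget', 'fset', 'fdel'):
        update_function(getattr(old_attr, accessor),
                        getattr(new_attr, accessor))
    elif both_instance_of(old_attr, new_attr, type):
      update_class(old_attr, new_attr)


class Account:
  def __init__(self, balance):
    self.balance = balance

  def describe(self):
    return 'balance: %d' % self.balance

  @staticmethod
  def currency():
    return 'USD'

  @property
  def is_empty(self):
    return self.balance == 0


class AccountV2:
  def __init__(self, balance):
    self.balance = balance

  def describe(self):
    return 'current balance: %d' % self.balance

  @staticmethod
  def currency():
    return 'EUR'

  @property
  def is_empty(self):
    return self.balance <= 0

  def deposit(self, amount):
    self.balance += amount


account = Account(-5)            # created before the update
update_class(Account, AccountV2)

print(account.describe())        # current balance: -5
print(Account.currency())        # EUR
print(account.is_empty)          # True
account.deposit(10)              # method added by the update
print(account.balance)           # 5
```

The instance keeps its identity and its class. Only the code behind the class attributes changes, which is what allows existing state to survive the update.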

Updating modules

Updating a module is similar to updating a class. Only the types that normally appear directly in a module are handled.

def update_module(old_module, new_module):
  for name, new_val in new_module.__dict__.items():
    if name not in old_module.__dict__:
      setattr(old_module, name, new_val)
    else:
      old_val = old_module.__dict__[name]
      if both_instance_of(old_val, new_val, types.FunctionType):
        update_function(old_val, new_val)
      elif both_instance_of(old_val, new_val, (type, types.ClassType)):
        update_class(old_val, new_val)
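The module-level update can be tried out without touching the file system. In this sketch, which assumes Python 3, two in-memory modules are built with types.ModuleType and exec. The helper make_module, the module name demo, and the reduced update_function are assumptions made for the example, and only functions are handled.

```python
import types


def update_function(old_func, new_func):
  # Simplified stand-in: swaps code and defaults only.
  old_func.__code__ = new_func.__code__
  old_func.__defaults__ = new_func.__defaults__


def update_module(old_module, new_module):
  for name, new_val in list(vars(new_module).items()):
    if name not in vars(old_module):
      setattr(old_module, name, new_val)
      continue
    old_val = vars(old_module)[name]
    if (isinstance(old_val, types.FunctionType) and
        isinstance(new_val, types.FunctionType)):
      update_function(old_val, new_val)


def make_module(name, source):
  # Hypothetical helper: build a module object from source text.
  module = types.ModuleType(name)
  exec(source, module.__dict__)
  return module


old_module = make_module('demo', "def greet():\n  return 'old'\n")
new_module = make_module(
    'demo',
    "def greet():\n  return 'new'\n\ndef farewell():\n  return 'bye'\n")

saved_greet = old_module.greet   # reference handed out before the update
update_module(old_module, new_module)

print(saved_greet())             # new
print(old_module.farewell())     # bye
```

One caveat of copying new names with setattr: a newly added function such as farewell still uses the globals of new_module, not those of old_module, so it would not see module-level state that lives only in the old module.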

Define the callback interface

After all this analysis, it is clear that hot updating not only has many restrictions but also leaves some problems unhandled:

Data attributes defined on a module or class are not updated
Attributes newly added to instances are not handled
No custom action can be performed when an update occurs

Therefore, some fixed hooks need to be provided so that upper-level logic can step into the update process and do whatever specific handling the actual requirements call for.

A module update callback can be invoked at the end of update_module, using an agreed function name:

def update_module(old_module, new_module):
  ...
  if hasattr(old_module, '_on_reload_module'):
    old_module._on_reload_module()

The class update callback is invoked at the end of update_class:

def update_class(old_class, new_class):
  ...
  if hasattr(old_class, '_on_reload_class'):
    old_class._on_reload_class()

This is like the __reload_update__ hook in pydevd_reload.py, and it is now easy to see why such a hook needs to be defined.

If an update adds attribute definitions to __init__, __init__ is not executed again for existing objects, so those attributes are never created on them. Code that is meant to be hot updated should avoid this situation. If it cannot be avoided, there are two options. One is to access the new attribute through getattr with a default wherever it is used, which is only a temporary workaround. The other is to find all previously created objects during the update and set the initial value of the new attribute on each of them. Instances of a class can be found through the gc module:

def update_class(old_class, new_class):
  ...
  if hasattr(old_class, '_on_reload_instance'):
    for x in gc.get_referrers(old_class):
      if isinstance(x, old_class):
        x._on_reload_instance()
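A runnable sketch of the instance hook follows. The Player class, its score attribute, and the helper notify_instances are invented for the example, and the hot update itself is simulated by attaching the hook to the class by hand.

```python
import gc


class Player:
  def __init__(self, name):
    self.name = name


# Instances created before the (simulated) hot update.
existing_players = [Player('alice'), Player('bob')]


def _on_reload_instance(self):
  # Backfill an attribute that the updated __init__ would have created.
  if not hasattr(self, 'score'):
    self.score = 0


# Simulate the state after update_class has installed the new hook.
Player._on_reload_instance = _on_reload_instance


def notify_instances(cls):
  """Call the reload hook on every live instance of cls."""
  if not hasattr(cls, '_on_reload_instance'):
    return
  for referrer in gc.get_referrers(cls):
    if isinstance(referrer, cls):
      referrer._on_reload_instance()


notify_instances(Player)
print([player.score for player in existing_players])
```

gc.get_referrers returns every tracked object that refers to the class, including dictionaries and tuples, so the isinstance filter is what narrows the result down to instances. The scan walks all objects known to the garbage collector, which can be slow in a large process, so it is probably best reserved for the update itself.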

Conclusion

Once the functions above are implemented, the logic for performing a hot update is in place: find the modules that need updating and call update_module on them. What remains to be written is how to find those modules. That part is a little harder to write up, so I will summarize it later.

When looking into hot updates, search widely, and when you find an implementation, read the code and try to understand it. It may not all make sense at first, but implementing it step by step as in this article gives a better understanding of the problem and makes it easier to see why existing implementations are written the way they are.

References

IPython
PyDev
Python Data Model
Class method differences in Python: bound, unbound and static
In Python, how do you change an instantiated object after a reload
Easy, Automated Reloading in Python