Revised on January 30th. This article was originally meant to be a reprint, but I later found some logical errors in the original, so I consulted several other articles and updated it. Links to the reference articles are listed at the end of the article for interested readers.

Starting Django

When you start a Django project, whether from the command line or by clicking Run in PyCharm, you are actually executing the runserver command. runserver is the web server that ships with Django and is mainly used for development and debugging. In production, the nginx + uWSGI pattern is generally used.

Either way, starting the project does two things:

  • Create an instance of the WSGIServer class to accept the user’s request.
  • When a user’s HTTP request arrives, a WSGIHandler is assigned to the user to handle the user’s request and response. This Handler is the core of processing the entire request.

WSGI

WSGI stands for Web Server Gateway Interface. WSGI is not a server, nor an API you call from your program, nor a piece of code. It is an interface specification that describes how a web server communicates with a web application. When a client sends a request, the web server (usually something like Nginx or Apache) handles it first and then hands it to a web application such as Django. WSGI is the intermediary between the two: it connects the web server to the web framework (Django).

# This code comes from Core Python Programming
def simple_wsgi_app(environ, start_response):
    # Django uses the same parameter names
    status = '200 OK'
    # Headers are a list of (name, value) tuples
    headers = [('Content-Type', 'text/plain')]
    # Initialize the response; this must be called before returning
    start_response(status, headers)
    # Return an iterable of byte strings
    return [b'hello world!']
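To make the contract concrete, the sketch below calls a WSGI application by hand, playing the role of the server. The names hello_app, fake_start_response, and captured are made up for illustration, and the environ dictionary is reduced to the two keys the example needs; a real server builds a much fuller one.

```python
# A minimal sketch of the WSGI contract with a hand-written "server" side.
# hello_app, fake_start_response and captured are illustrative names only.

def hello_app(environ, start_response):
    # The application receives the request environment and a callback.
    body = ('hello from ' + environ['PATH_INFO']).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    # The return value must be an iterable of byte strings.
    return [body]

captured = {}

def fake_start_response(status, headers, exc_info=None):
    # A real server would begin writing the HTTP response here.
    captured['status'] = status
    captured['headers'] = headers

# Only the keys this app reads; a real environ has many more.
environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/demo'}
result = b''.join(hello_app(environ, fake_start_response))
```

The application never touches a socket: it only talks to the server through environ and start_response, which is why the same application can run under different WSGI servers.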

In Django, the same logic is implemented by the WSGIHandler class, which we'll focus on next. If you are interested in WSGI and uWSGI, see the article "WSGI & uWSGI" in the references at the end.

Basic Concepts of Middleware

As the name implies, middleware sits between the web server and the web application, and it can add extra functionality. When we create a Django project (for example via PyCharm), the necessary middleware is set up for us automatically.

MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

Middleware either preprocesses the data coming from the user before passing it to the application, or makes final adjustments to the result before the application's response payload is returned to the user. Put simply, in Django the middleware helps prepare the request object, so the application can read data from it directly. It also helps add headers, status codes, and so on to the response.
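As a rough illustration of both roles, here is a sketch of an old-style middleware class that runs without Django. FakeRequest, FakeResponse, and TimingMiddleware are hypothetical stand-ins invented for this example; only the process_request and process_response method names come from Django.

```python
# A sketch of an old-style (MIDDLEWARE_CLASSES) middleware, without Django.
# FakeRequest, FakeResponse and TimingMiddleware are illustrative stand-ins.
import time


class FakeRequest(object):
    def __init__(self, path):
        self.path = path


class FakeResponse(dict):
    """Headers are stored as dict items, loosely like HttpResponse."""
    status_code = 200


class TimingMiddleware(object):
    def process_request(self, request):
        # Preprocess: attach data to the request for later stages.
        request.started_at = time.time()
        return None  # None means "keep going"

    def process_response(self, request, response):
        # Postprocess: adjust the response before it leaves.
        elapsed = time.time() - request.started_at
        response['X-Elapsed-Seconds'] = '%.3f' % elapsed
        return response


mw = TimingMiddleware()
request = FakeRequest('/demo')
request_result = mw.process_request(request)
response = mw.process_response(request, FakeResponse())
```

The request-side method enriches the request and returns None, while the response-side method must always return a response object.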

The data flow

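When Django receives a request, it initializes a WSGIHandler. You can trace this starting from the wsgi.py file in your project, which leads you to this class:

from django.core.handlers import base

class WSGIHandler(base.BaseHandler):
    def __call__(self, environ, start_response):
        # Build the request, run it through the middleware and the view,
        # then call start_response and return the response body
        pass
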

This class follows the rules for a WSGI application. Its __call__ method takes two parameters: environ, which contains the server-side environment variables, and start_response, a callable used to begin the response. It returns an iterable. This handler controls the entire process from request to response.

A more complete picture found online breaks the flow down into roughly the following steps:

  1. The user requests a page from the browser.
  2. The request arrives at the Request Middlewares, which can preprocess the request or return a response directly.
  3. The URLConf finds the View via the urls.py file and the requested URL.
  4. The View Middlewares are called; they can also do something with the request or return a response directly.
  5. The function in the View is called.
  6. Methods in the View can optionally access the underlying data through Models.
  7. All model-to-DB interactions are done through the Manager.
  8. If needed, the View can use a Context.
  9. The Context is passed to the Template to generate the page:
     a. The Template uses Filters and Tags to render the output.
     b. The output is returned to the View.
     c. The HttpResponse is sent to the Response Middlewares.
     d. Any Response Middleware can either enrich the response or return a completely different response, which is sent to the browser and presented to the user.

Order and methods of middleware classes

Django's middleware classes contain at least one of the following four methods: process_request, process_view, process_exception, and process_response. In load_middleware, WSGIHandler adds these methods to four lists: _request_middleware, _view_middleware, _response_middleware, and _exception_middleware. Not every middleware has all four methods; if a method does not exist, it is simply skipped during loading.

for middleware_path in settings.MIDDLEWARE_CLASSES:
    ...
    if hasattr(mw_instance, 'process_request'):
        request_middleware.append(mw_instance.process_request)
    if hasattr(mw_instance, 'process_view'):
        self._view_middleware.append(mw_instance.process_view)
    if hasattr(mw_instance, 'process_template_response'):
        self._template_response_middleware.insert(0, mw_instance.process_template_response)
    if hasattr(mw_instance, 'process_response'):
        self._response_middleware.insert(0, mw_instance.process_response)
    if hasattr(mw_instance, 'process_exception'):
        self._exception_middleware.insert(0, mw_instance.process_exception)

process_request and process_response are loaded in opposite orders. In the loop, process_request is appended to the end of its list, while process_response is inserted at the front.
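The effect of appending versus inserting at index 0 can be seen in a few lines of plain Python. The middleware names A, B, and C are placeholders chosen for this sketch, not real Django classes.

```python
# Sketch of how append vs. insert(0) produces opposite orders.
# The middleware names A, B, C are illustrative placeholders.
configured = ['A', 'B', 'C']      # order as listed in the settings

request_order = []
response_order = []
for name in configured:
    request_order.append(name)        # process_request: same order
    response_order.insert(0, name)    # process_response: reversed

print(request_order)   # ['A', 'B', 'C']
print(response_order)  # ['C', 'B', 'A']
```

So the middleware listed first sees the request first and the response last, like layers of an onion.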

process_request

Let's look at a few middleware classes as examples.

from django.conf import settings
from django.core.exceptions import PermissionDenied

# pseudocode
class CommonMiddleware(object):
    def process_request(self, request):

        # Check for denied User-Agents
        if 'HTTP_USER_AGENT' in request.META:
            for user_agent_regex in settings.DISALLOWED_USER_AGENTS:
                if user_agent_regex.search(request.META['HTTP_USER_AGENT']):
                    raise PermissionDenied('Forbidden user agent')
        host = request.get_host()

        if settings.PREPEND_WWW and host and not host.startswith('www.'):
            host = 'www.' + host
        pass  # remaining URL handling omitted in this excerpt

CommonMiddleware's process_request checks whether the user agent is allowed and normalizes the URL, for example by prepending www. or appending a trailing /.

class SessionMiddleware(object):
    def process_request(self, request):
        session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME)
        request.session = self.SessionStore(session_key)

SessionMiddleware's process_request reads the session_key from the cookies, uses it to create a session store, and attaches that store to the request as request.session.

class AuthenticationMiddleware(MiddlewareMixin):
    def process_request(self, request):
        assert hasattr(request, 'session'), (
              "The Django authentication middleware requires session middleware "
              "to be installed. Edit your MIDDLEWARE%s setting to insert "
              "'django.contrib.sessions.middleware.SessionMiddleware' before "
              "'django.contrib.auth.middleware.AuthenticationMiddleware'."
        ) % ("_CLASSES" if settings.MIDDLEWARE is None else "")
        request.user = SimpleLazyObject(lambda: get_user(request))

As mentioned earlier, middleware is loaded in a fixed order, and process_request methods run in the order listed in the settings. AuthenticationMiddleware's process_request therefore runs after the session middleware: it relies on request.session to look up the user and stores the result (lazily) in request.user.

process_request should return either None or an HttpResponse object. When None is returned, the WSGI handler continues with the next process_request method. When an HttpResponse is returned, the handler skips the rest, goes straight to the _response_middleware list, and responds.
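The short-circuit rule can be sketched without Django. In the sketch below, allow, block, never_reached, and handle are hypothetical names, and plain strings stand in for HttpResponse objects; it illustrates the control flow only, not Django's actual handler code.

```python
# Sketch of the process_request short-circuit, without Django.
# allow, block, never_reached and handle are hypothetical names;
# strings stand in for HttpResponse objects.
calls = []


def allow(request):
    calls.append('allow')
    return None                # continue to the next middleware


def block(request):
    calls.append('block')
    return '403 Forbidden'     # a response: stop here


def never_reached(request):
    calls.append('never_reached')
    return None


def handle(request, request_middleware, view):
    response = None
    for method in request_middleware:
        response = method(request)
        if response is not None:
            break              # skip the remaining request middleware
    if response is None:
        response = view(request)   # only runs if nobody short-circuited
    return response


result = handle({}, [allow, block, never_reached], lambda r: '200 OK')
```

Here the view is never called, because block returned a response before the loop finished.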

Parsing the URL

Once the _request_middleware list has been traversed, we have a processed request object (with the request.session and request.user attributes added). If the request middleware returned None, Django goes on to parse the requested URL. The settings contain a ROOT_URLCONF entry that points to a urls.py file, from which a URLconf is generated; this is essentially a mapping table between URLs and view functions. Django matches the URL patterns in order, and the resolver returns the first matching view. If nothing matches, an exception is raised.
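First-match resolution can be sketched with the standard re module. Everything here (urlpatterns as a list of tuples, the resolve function, the two views) is a simplified stand-in written for this example and is not Django's resolver; only the idea of trying patterns in order is taken from the text above.

```python
import re

# Sketch of first-match URL resolution. urlpatterns, resolve and the
# view functions are simplified stand-ins, not Django's resolver.


def article_list(request):
    return 'list'


def article_detail(request, pk):
    return 'detail %s' % pk


urlpatterns = [
    (re.compile(r'^articles/$'), article_list),
    (re.compile(r'^articles/(?P<pk>\d+)/$'), article_detail),
]


class Resolver404(Exception):
    """Raised when no pattern matches (named after Django's exception)."""


def resolve(path):
    for pattern, view in urlpatterns:
        match = pattern.match(path)
        if match:
            # Return the view plus the captured keyword arguments.
            return view, match.groupdict()
    raise Resolver404(path)


view, kwargs = resolve('articles/42/')
```

The returned pair is what the later stages need: the view callable and the arguments extracted from the URL.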

process_view

Matching the URL yields the view function and its arguments. Before calling the view function, Django runs the process_view methods in _view_middleware. I went through each of the default middleware classes and found that only the CSRF middleware has this method.

# pseudocode
class CsrfViewMiddleware(object):

    def process_view(self, request, callback, callback_args, callback_kwargs):

        if getattr(request, 'csrf_processing_done', False):
            return None

        try:
            csrf_token = _sanitize_token(
                request.COOKIES[settings.CSRF_COOKIE_NAME])
            # Use same token next time
            request.META['CSRF_COOKIE'] = csrf_token
        except KeyError:
            csrf_token = None
        if getattr(callback, 'csrf_exempt', False):
            return None
        pass


This method checks whether a CSRF cookie exists. In the excerpt above, a missing cookie leaves csrf_token as None, and views marked csrf_exempt return None immediately; in the full implementation, a request that fails the check is rejected. Like the request middleware, the view middleware must return None or an HttpResponse. If an HttpResponse is returned, the handler goes straight to the _response_middleware list and returns that response.

Executing view logic

The view function needs to satisfy:

  1. It is a function-based view (FBV) or a class-based view (CBV).
  2. The first argument it accepts must be the request, and it must return a response object.

If the view function throws an exception, the handler loops through the _exception_middleware list. As soon as one process_exception method returns a response (or raises an exception itself), the remaining process_exception methods are not executed.
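The exception path can be sketched in the same Django-free style. The names broken_view, log_only, render_error, skipped, and run_view are hypothetical, and the loop assumes the "first response wins" behaviour described above rather than reproducing Django's source.

```python
# Sketch of exception-middleware handling, without Django.
# broken_view, log_only, render_error, skipped and run_view are hypothetical.
seen = []


def broken_view(request):
    raise ValueError('boom')


def log_only(request, exception):
    seen.append('log_only')
    return None                      # did not handle it; keep looking


def render_error(request, exception):
    seen.append('render_error')
    return 'error page: %s' % exception


def skipped(request, exception):
    seen.append('skipped')
    return None


def run_view(request, view, exception_middleware):
    try:
        return view(request)
    except Exception as exc:
        for method in exception_middleware:
            response = method(request, exc)
            if response is not None:
                return response      # first response wins
        raise                        # nobody handled it


result = run_view({}, broken_view, [log_only, render_error, skipped])
```

If no process_exception method returns a response, the original exception is re-raised.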

process_response

At this stage we have an HttpResponse object, returned either by process_view or by the view function. Now we loop through the response middleware. This is the middleware's last chance to adjust the data. Here's an example:

class XFrameOptionsMiddleware(object):

    def process_response(self, request, response):
        # Don't set it if it's already in the response
        if response.get('X-Frame-Options') is not None:
            return response

        # Don't set it if they used @xframe_options_exempt
        if getattr(response, 'xframe_options_exempt', False):
            return response

        response['X-Frame-Options'] = self.get_xframe_options_value(request,
                                                                    response)
        return response

XFrameOptionsMiddleware adds the X-Frame-Options header to the response to prevent the site from being embedded in frames and hijacked (clickjacking).

class CsrfViewMiddleware(object):
    def process_response(self, request, response):
        if getattr(response, 'csrf_processing_done', False):
            return response

        if not request.META.get("CSRF_COOKIE_USED", False):
            return response

        # Set the CSRF cookie even if it's already set, so we renew
        # the expiry timer.
        response.set_cookie(settings.CSRF_COOKIE_NAME,
                            request.META["CSRF_COOKIE"],
                            max_age=settings.CSRF_COOKIE_AGE,
                            domain=settings.CSRF_COOKIE_DOMAIN,
                            path=settings.CSRF_COOKIE_PATH,
                            secure=settings.CSRF_COOKIE_SECURE,
                            httponly=settings.CSRF_COOKIE_HTTPONLY
                            )
        # Content varies with the CSRF cookie, so set the Vary header.
        patch_vary_headers(response, ('Cookie',))
        response.csrf_processing_done = True
        return response

CsrfViewMiddleware sets the CSRF cookie on the response.

The final step

Once the response middleware has run, the handler calls the start_response callable passed in by the WSGI server to initialize the response, and then returns the response body.
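This last step can be sketched as a tiny handler. MiniResponse and MiniHandler are simplified stand-ins invented for this example, and get_response here skips all middleware and view processing; the point is only the order of calling start_response and returning the body.

```python
# Sketch of the final WSGI step. MiniResponse and MiniHandler are
# simplified stand-ins for HttpResponse and WSGIHandler.


class MiniResponse(object):
    def __init__(self, body, status='200 OK'):
        self.body = body.encode('utf-8')
        self.status = status
        self.headers = [('Content-Type', 'text/plain; charset=utf-8')]


class MiniHandler(object):
    def get_response(self, environ):
        # Middleware and view processing would happen here.
        return MiniResponse('hello ' + environ.get('PATH_INFO', '/'))

    def __call__(self, environ, start_response):
        response = self.get_response(environ)
        # Hand status and headers to the server before returning the body.
        start_response(response.status, response.headers)
        return [response.body]


sent = []
body = MiniHandler()({'PATH_INFO': '/demo'},
                     lambda status, headers: sent.append((status, headers)))
```

The server-side callable receives the status line and headers first, and the iterable returned by the handler supplies the body.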

Conclusion

This article covered:

  1. When Django starts, it starts a WSGIServer and generates a handler for each user request.
  2. The WSGI protocol, and how the WSGIHandler class controls the request-to-response process, along with the basic flow of that process.
  3. Middleware concepts, and at which steps each of the process_request, process_response, process_view, and process_exception methods plays a role.
  4. Middleware is executed in a defined order: request and view middleware run in the order listed, while response and exception middleware run in reverse order. This ordering is set up by WSGIHandler when it loads its lists.

Reference articles:

  1. Django middleware tutorial notes
  2. WSGI & uWSGI
  3. What does Django do from request to response
  4. What does Django go through from request to return