Pytest is a unit testing framework for Python. It is easy to use and is adopted by many well-known projects. Requests, Python's well-known HTTP client library, is also famously easy to use and is a top-10 open source Python project. We've covered both projects before, but this article focuses on how the Requests project uses pytest for unit testing, with the following three goals:

  1. Gain a good command of pytest
  2. Learn how to unit test a real project
  3. Dig into the implementation details of Requests

This article is divided into the following parts:

  • Requests project unit test status
  • How are simple utility classes tested
  • How the Requests API is tested
  • Underlying API testing

Requests project unit test status

The unit test code for Requests lives in the tests directory and is configured via pytest.ini. In addition to pytest itself, you also need to install:

library description
httpbin an HTTP service implemented with Flask; it echoes request details back to the client, which makes HTTP behavior easy to test
pytest-httpbin a pytest plugin that wraps httpbin as a local fixture
pytest-mock a pytest plugin that provides mocks
pytest-cov a pytest plugin that provides coverage reports

The major versions of these dependencies are declared in the requirements-dev file; pinned versions are recorded in the Pipfile for Pipenv.

Test cases are run through make targets defined in the Makefile; make ci runs all the unit tests:

$ make ci
pytest tests --junitxml=report.xml
=========================== test session starts ===========================
platform linux - Python 3.6.8, pytest-3.10.1, py-1.10.0, pluggy-0.13.1
rootdir: /home/work6/project/requests, inifile: pytest.ini
plugins: mock-2.0.0, httpbin-1.0.0, cov-2.9.0
collected 552 items

tests/test_help.py ...                                              [  0%]
tests/test_hooks.py ...                                             [  1%]
tests/test_lowlevel.py ...............                              [  3%]
tests/test_packages.py ...                                          [  4%]
tests/test_requests.py .....................                        [ 39%]
(interleaved httpbin request logs and a WSGI traceback omitted)
x.........................................................          [ 56%]
tests/test_structures.py ....................                       [ 59%]
tests/test_testserver.py ......s....                                [ 61%]
tests/test_utils.py ..s.......................................ssss  [ 98%]
ssssss.....                                                         [100%]

--- generated xml file: /home/work6/project/requests/report.xml ---
============ 539 passed, 12 skipped, 1 xfailed in 64.16 seconds ===========

You can see that Requests passed 539 test cases in about a minute, which is decent. Use make coverage to view unit test coverage:

$ make coverage
----------- coverage: platform linux, python 3.6.8-final-0 -----------
Name                          Stmts   Miss  Cover
-------------------------------------------------
requests/__init__.py             71     71     0%
requests/__version__.py          10     10     0%
requests/_internal_utils.py      16      5    69%
requests/adapters.py            222     67    70%
requests/api.py                  20     13    35%
requests/auth.py                174     54    69%
requests/certs.py                 4      4     0%
requests/compat.py               47     47     0%
requests/cookies.py             238    115    52%
requests/exceptions.py           35     29    17%
requests/help.py                 63     19    70%
requests/hooks.py                15      4    73%
requests/models.py              455    119    74%
requests/packages.py             16     16     0%
requests/sessions.py            283     67    76%
requests/status_codes.py         15     15     0%
requests/structures.py           40     19    52%
requests/utils.py               465    170    63%
-------------------------------------------------
TOTAL                          2189    844    61%
Coverage XML written to file coverage.xml

The results show 61% overall coverage for the Requests project, with coverage for each module clearly visible.

Unit test coverage is measured by lines of code: Stmts is the number of executable lines in the module, and Miss is the number of lines never executed. If you generate an HTML report, you can also locate the specific uncovered lines; PyCharm's coverage tool has a similar feature.

The files and test modules under tests are as follows:

file description
compat Python 2 / Python 3 compatibility shims
conftest pytest configuration
test_help, test_packages, test_hooks, test_structures simple test modules
utils.py helper functions for the tests
test_utils tests for requests.utils
test_requests tests for the Requests API
testserver/server mock server
test_testserver tests for the mock server
test_lowlevel low-level network tests built on the mock server

How are simple utility classes tested

test_help implementation analysis

Start with the simplest module, test_help; each test module is named after the module it tests. First look at the module under test, help.py, which consists of two functions, info and _implementation:

import idna

def _implementation():
    ...
    
def info():
    ...
    system_ssl = ssl.OPENSSL_VERSION_NUMBER
    system_ssl_info = {
        'version': '%x' % system_ssl if system_ssl is not None else ''
    }
    idna_info = {
        'version': getattr(idna, '__version__', ''),
    }
    ...
    return {
        'platform': platform_info,
        'implementation': implementation_info,
        'system_ssl': system_ssl_info,
        'using_pyopenssl': pyopenssl is not None,
        'pyOpenSSL': pyopenssl_info,
        'urllib3': urllib3_info,
        'chardet': chardet_info,
        'cryptography': cryptography_info,
        'idna': idna_info,
        'requests': {
            'version': requests_version,
        },
    }

info reports information about the system environment, and _implementation is its internal helper, marked private by the leading underscore. Now look at the test module, test_help:

from requests.help import info

def test_system_ssl():
    """Verify we're actually setting system_ssl when it should be available."""
    assert info()['system_ssl']['version'] != ''

class VersionedPackage(object):
    def __init__(self, version):
        self.__version__ = version

def test_idna_without_version_attribute(mocker):
    """Older versions of IDNA don't provide a __version__ attribute, verify
    that if we have such a package, we don't blow up.
    """
    mocker.patch('requests.help.idna', new=None)
    assert info()['idna'] == {'version': ''}

def test_idna_with_version_attribute(mocker):
    """Verify we're actually setting idna version when it should be available."""
    mocker.patch('requests.help.idna', new=VersionedPackage('2.6'))
    assert info()['idna'] == {'version': '2.6'}

As the import at the top shows, only the info function is tested, which is easy to understand: once info passes, the internal _implementation function is naturally covered as well. This is unit testing tip #1:

  1. Test only public interfaces

Both test_idna_without_version_attribute and test_idna_with_version_attribute take a mocker parameter, a fixture provided by pytest-mock and injected automatically. It is used to mock the idna module:

mocker.patch('requests.help.idna', new=None)
mocker.patch('requests.help.idna', new=VersionedPackage('2.6'))

The patch targets requests.help.idna even though help.py imports the top-level idna module. This works because idna is aliased under the requests.packages namespace:

for package in ('urllib3', 'idna', 'chardet'):
    locals()[package] = __import__(package)
    # This traversal is apparently necessary such that the identities are
    # preserved (requests.packages.urllib3.* is urllib3.*)
    for mod in list(sys.modules):
        if mod == package or mod.startswith(package + '.'):
            sys.modules['requests.packages.' + mod] = sys.modules[mod]

With mocker, idna's __version__ attribute can be controlled, so the idna entry in info()'s result becomes predictable. Hence tip #2:

  2. Use mocks to aid unit testing
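pytest-mock is a thin wrapper over the standard library's unittest.mock, so the same technique can be shown without a pytest run. A minimal sketch: fake_idna below is an illustrative stand-in, not the real module, and idna_version paraphrases how info() reads the version defensively.

```python
import types
from unittest import mock

# Hypothetical stand-in for the idna module (illustrative, not the real one).
fake_idna = types.SimpleNamespace(__version__='2.6')

def idna_version(module):
    # Defensive read, mirroring how info() builds idna_info.
    if module is None:
        return ''
    return getattr(module, '__version__', '')

assert idna_version(fake_idna) == '2.6'

# mock.patch.object temporarily replaces the attribute, like mocker.patch.
with mock.patch.object(fake_idna, '__version__', '3.0'):
    assert idna_version(fake_idna) == '3.0'

# On exit the original value is restored automatically.
assert idna_version(fake_idna) == '2.6'
```

The restore-on-exit behavior is exactly what pytest-mock automates: its mocker fixture undoes all patches at the end of each test.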

test_hooks implementation analysis

Let's continue with how the hooks module is tested:

from requests import hooks

def hook(value):
    return value[1:]

@pytest.mark.parametrize(
    'hooks_list, result', (
        (hook, 'ata'),
        ([hook, lambda x: None, hook], 'ta'),
    )
)
def test_hooks(hooks_list, result):
    assert hooks.dispatch_hook('response', {'response': hooks_list}, 'Data') == result

def test_default_hooks():
    assert hooks.default_hooks() == {'response': []}

The two interfaces of the hooks module, default_hooks and dispatch_hook, are tested. default_hooks is a pure function with no arguments, the easiest kind to test: simply check that the return value is as expected. dispatch_hook is a bit more complicated and involves calling the hook functions:

def dispatch_hook(key, hooks, hook_data, **kwargs):
    """Dispatches a hook dictionary on a given piece of data."""
    hooks = hooks or {}
    hooks = hooks.get(key)
    if hooks:
        if hasattr(hooks, '__call__'):
            hooks = [hooks]
        for hook in hooks:
            _hook_data = hook(hook_data, **kwargs)
            if _hook_data is not None:
                hook_data = _hook_data
    return hook_data

pytest.mark.parametrize provides two sets of parameters. The first set is simple: hook is a function that drops the first character of its argument, and 'ata' is the expected result, since test_hooks dispatches the string 'Data'. The first argument of the second set is more involved: a list with a hook at each end and, in the middle, an anonymous function that returns nothing, covering the bypass branch of if _hook_data is not None:. Execution proceeds as follows:

  • The first hook drops the first character of 'Data', leaving 'ata'
  • The anonymous function returns None, so the result is unchanged: 'ata'
  • The second hook drops the first character again, leaving 'ta'
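The pipeline can be exercised standalone with a simplified restatement of dispatch_hook; this is a sketch of the logic shown above, not the library code itself.

```python
def dispatch_hook(key, hooks, hook_data, **kwargs):
    # Simplified restatement of requests.hooks.dispatch_hook.
    hooks = (hooks or {}).get(key, [])
    if callable(hooks):
        hooks = [hooks]        # a single callable becomes a one-item pipeline
    for hook in hooks:
        result = hook(hook_data, **kwargs)
        if result is not None:  # None means "leave the data unchanged"
            hook_data = result
    return hook_data

drop_first = lambda value: value[1:]

# Single hook: 'Data' -> 'ata'
assert dispatch_hook('response', {'response': drop_first}, 'Data') == 'ata'

# Pipeline: 'Data' -> 'ata' -> (None, skipped) -> 'ta'
hooks_list = [drop_first, lambda x: None, drop_first]
assert dispatch_hook('response', {'response': hooks_list}, 'Data') == 'ta'
```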

Working through the test shows that dispatch_hook has a clever design: it chains all the hooks in a pipeline, which differs from a typical event mechanism. However, the branch where no hooks are registered is never tested, which is not rigorous and violates tip #3:

  3. Cover as many branches of the target function as possible

test_structures implementation analysis

The LookupDict tests look like this:

class TestLookupDict:

    @pytest.fixture(autouse=True)
    def setup(self):
        """LookupDict instance with "bad_gateway" attribute."""
        self.lookup_dict = LookupDict('test')
        self.lookup_dict.bad_gateway = 502

    def test_repr(self):
        assert repr(self.lookup_dict) == "<lookup 'test'>"

    get_item_parameters = pytest.mark.parametrize(
        'key, value', (
            ('bad_gateway', 502),
            ('not_a_key', None)
        )
    )

    @get_item_parameters
    def test_getitem(self, key, value):
        assert self.lookup_dict[key] == value

    @get_item_parameters
    def test_get(self, key, value):
        assert self.lookup_dict.get(key) == value

The setup method, combined with @pytest.fixture(autouse=True), initializes a lookup_dict object for every test case, while the pytest.mark.parametrize mark is reused across different test cases. That gives us a fourth tip:

  4. Use fixtures to reuse the object under test, and pytest.mark.parametrize to reuse test parameters

test_getitem and test_get cover LookupDict's __getitem__ and get methods:

class LookupDict(dict):
    ...
    def __getitem__(self, key):
        # We allow fall-through here, so values default to None
        return self.__dict__.get(key, None)

    def get(self, key, default=None):
        return self.__dict__.get(key, default)
  • get lets values be fetched with the get method, with an optional default
  • __getitem__ lets values be fetched with [] indexing, falling through to None for missing keys
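Since LookupDict is tiny, it can be restated in full and exercised directly; the __init__ and __repr__ parts below are paraphrased, so treat this as a sketch rather than the library source.

```python
class LookupDict(dict):
    """Sketch of requests.structures.LookupDict."""

    def __init__(self, name=None):
        self.name = name
        super().__init__()

    def __repr__(self):
        return '<lookup %r>' % (self.name,)

    def __getitem__(self, key):
        # Fall through to None instead of raising KeyError.
        return self.__dict__.get(key, None)

    def get(self, key, default=None):
        return self.__dict__.get(key, default)

codes = LookupDict('test')
codes.bad_gateway = 502          # values are stored as attributes

assert repr(codes) == "<lookup 'test'>"
assert codes['bad_gateway'] == 502
assert codes['not_a_key'] is None        # no KeyError, unlike a plain dict
assert codes.get('not_a_key', -1) == -1  # explicit default is honored
```

This is the structure behind requests.status_codes: attribute assignment populates the lookup, and missing keys degrade gracefully to None.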

CaseInsensitiveDict is tested in both test_structures, where the tests are mostly basic, and test_requests, where they are business-oriented; comparing the two shows the difference:

# In test_structures.py
    def test_repr(self):
        assert repr(self.case_insensitive_dict) == "{'Accept': 'application/json'}"

    def test_copy(self):
        copy = self.case_insensitive_dict.copy()
        assert copy is not self.case_insensitive_dict
        assert copy == self.case_insensitive_dict

# In test_requests.py
class TestCaseInsensitiveDict:

    def test_delitem(self):
        cid = CaseInsensitiveDict()
        cid['Spam'] = 'someval'
        del cid['sPam']
        assert 'spam' not in cid
        assert len(cid) == 0

    def test_contains(self):
        cid = CaseInsensitiveDict()
        cid['Spam'] = 'someval'
        assert 'Spam' in cid
        assert 'spam' in cid
        assert 'SPAM' in cid
        assert 'sPam' in cid
        assert 'notspam' not in cid

From this test organization, tip #5 follows naturally:

  5. The same object can be unit tested at different levels

This technique is also applied to test_lowlevel and test_requests later.

utils.py

utils.py builds override_environ, a generator-based context manager (courtesy of @contextlib.contextmanager and the yield keyword) that temporarily overrides environment variables:

import contextlib
import os

@contextlib.contextmanager
def override_environ(**kwargs):
    save_env = dict(os.environ)
    for key, value in kwargs.items():
        if value is None:
            del os.environ[key]
        else:
            os.environ[key] = value
    try:
        yield
    finally:
        os.environ.clear()
        os.environ.update(save_env)

Here is an example of how it is used:

# test_requests.py
kwargs = {var: proxy}

# Fake environment variables via the context manager
with override_environ(**kwargs):
    proxies = session.rebuild_proxies(prep, {})

# sessions.py
def rebuild_proxies(self, prepared_request, proxies):
    ...
    bypass_proxy = should_bypass_proxies(url, no_proxy=no_proxy)

# utils.py
def should_bypass_proxies(url, no_proxy):
    ...
    get_proxy = lambda k: os.environ.get(k) or os.environ.get(k.upper())
    ...
  6. Where environment variables are involved, simulate different environments with a context manager
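The save-and-restore behavior is easiest to see in a self-contained run. The definition is repeated here so the snippet stands alone, and the DEMO_PROXY variable name is purely illustrative.

```python
import contextlib
import os

@contextlib.contextmanager
def override_environ(**kwargs):
    save_env = dict(os.environ)
    for key, value in kwargs.items():
        if value is None:
            del os.environ[key]
        else:
            os.environ[key] = value
    try:
        yield
    finally:
        os.environ.clear()
        os.environ.update(save_env)

os.environ['DEMO_PROXY'] = 'http://old-proxy:3128'

with override_environ(DEMO_PROXY='http://new-proxy:8080'):
    # Inside the block the variable is overridden...
    assert os.environ['DEMO_PROXY'] == 'http://new-proxy:8080'
# ...and fully restored afterwards, even if the block raised.
assert os.environ['DEMO_PROXY'] == 'http://old-proxy:3128'

# Passing None removes a variable for the duration of the block.
with override_environ(DEMO_PROXY=None):
    assert 'DEMO_PROXY' not in os.environ
assert os.environ['DEMO_PROXY'] == 'http://old-proxy:3128'
```

Restoring in a finally clause is the key design choice: a failing test cannot leak a mutated environment into the tests that run after it.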

test_utils test cases

test_utils has many test cases, so we select a few for analysis, starting with to_key_val_list:

def to_key_val_list(value):
    if value is None:
        return None

    if isinstance(value, (str, bytes, bool)):
        raise ValueError('cannot encode objects that are not 2-tuples')

    if isinstance(value, Mapping):
        value = value.items()

    return list(value)

Corresponding test case TestToKeyValList:

class TestToKeyValList:

    @pytest.mark.parametrize(
        'value, expected', (
            ([('key', 'val')], [('key', 'val')]),
            ((('key', 'val'), ), [('key', 'val')]),
            ({'key': 'val'}, [('key', 'val')]),
            (None, None)
        ))
    def test_valid(self, value, expected):
        assert to_key_val_list(value) == expected

    def test_invalid(self):
        with pytest.raises(ValueError):
            to_key_val_list('string')

test_invalid shows how an expected exception is asserted, with pytest.raises used as a context manager. That is tip #7:

  7. Use pytest.raises to catch expected exceptions
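Because to_key_val_list is pure, the cases above can be replayed directly. This sketch mirrors the implementation shown, with a plain try/except standing in for pytest.raises so it runs without a test runner.

```python
from collections.abc import Mapping

def to_key_val_list(value):
    # Mirrors the requests.utils.to_key_val_list shown above.
    if value is None:
        return None
    if isinstance(value, (str, bytes, bool)):
        raise ValueError('cannot encode objects that are not 2-tuples')
    if isinstance(value, Mapping):
        value = value.items()
    return list(value)

# The valid cases from TestToKeyValList:
assert to_key_val_list([('key', 'val')]) == [('key', 'val')]
assert to_key_val_list((('key', 'val'),)) == [('key', 'val')]
assert to_key_val_list({'key': 'val'}) == [('key', 'val')]
assert to_key_val_list(None) is None

# The invalid case: a bare string raises, as pytest.raises would assert.
try:
    to_key_val_list('string')
    raised = False
except ValueError as exc:
    raised = 'not 2-tuples' in str(exc)
assert raised
```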

TestSuperLen introduces several techniques for IO simulation tests:

class TestSuperLen:

    @pytest.mark.parametrize(
        'stream, value', (
            (StringIO.StringIO, 'Test'),
            (BytesIO, b'Test'),
            pytest.param(cStringIO, 'Test',
                         marks=pytest.mark.skipif('cStringIO is None')),
        ))
    def test_io_streams(self, stream, value):
        """Ensures that we properly deal with different kinds of IO streams."""
        assert super_len(stream()) == 0
        assert super_len(stream(value)) == 4

    def test_super_len_correctly_calculates_len_of_partially_read_file(self):
        """Ensure that we handle partially consumed file like objects."""
        s = StringIO.StringIO()
        s.write('foobarbogus')
        assert super_len(s) == 0
    
    @pytest.mark.parametrize(
        'mode, warnings_num', (
            ('r', 1),
            ('rb', 0),
        ))
    def test_file(self, tmpdir, mode, warnings_num, recwarn):
        file_obj = tmpdir.join('test.txt')
        file_obj.write('Test')
        with file_obj.open(mode) as fd:
            assert super_len(fd) == 4
        assert len(recwarn) == warnings_num

    def test_super_len_with_tell(self):
        foo = StringIO.StringIO('12345')
        assert super_len(foo) == 5
        foo.read(2)
        assert super_len(foo) == 3

    def test_super_len_with_fileno(self):
        with open(__file__, 'rb') as f:
            length = super_len(f)
            file_data = f.read()
        assert length == len(file_data)
  • StringIO simulates IO operations, allowing all kinds of IO tests to be constructed. BytesIO (and cStringIO on Python 2) work as well, but unit tests generally don't care about performance, and StringIO is simple enough.
  • pytest provides the tmpdir fixture for testing real file reads and writes
  • A file can be tested read-only via __file__, which refers to the current source file and has no side effects
  8. Use IO simulation in unit tests
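The length logic super_len relies on is easy to reproduce with in-memory streams. remaining_len below is a simplified stand-in, not the real super_len, but it satisfies the same assertions as the tests above.

```python
from io import BytesIO, StringIO

def remaining_len(stream):
    # Simplified take on super_len: total length minus current position,
    # i.e. only the unread part of the stream counts.
    pos = stream.tell()
    stream.seek(0, 2)     # seek to the end to find the total length
    end = stream.tell()
    stream.seek(pos)      # restore the original position
    return end - pos

# Fresh streams report their full length (cf. test_io_streams).
assert remaining_len(StringIO('Test')) == 4
assert remaining_len(BytesIO(b'Test')) == 4

# A written-but-unrewound stream has nothing left to read
# (cf. test_super_len_correctly_calculates_len_of_partially_read_file).
s = StringIO()
s.write('foobarbogus')
assert remaining_len(s) == 0

# Partial reads shrink the remaining length (cf. test_super_len_with_tell).
s = StringIO('12345')
s.read(2)
assert remaining_len(s) == 3
```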

How the Requests API is tested

Testing Requests itself needs httpbin, which runs a local HTTP service, and pytest-httpbin, which wraps it as a pytest plugin. Test cases receive an httpbin fixture that builds URLs pointing at the local service.

class function
TestRequests Requests business tests
TestCaseInsensitiveDict case-insensitive dictionary tests
TestMorselToCookieExpires cookie expiry tests
TestMorselToCookieMaxAge cookie Max-Age tests
TestTimeout response timeout tests
TestPreparingURLs URL preparation tests
... miscellaneous test cases

Let's be honest: this test module is huge, about 2,500 lines. It reads as scattered cases for various behaviors, and I have not fully worked out their organizing logic, so I will pick out some interesting ones, starting with the timeout tests:

TARPIT = 'http://10.255.255.1'

class TestTimeout:

    def test_stream_timeout(self, httpbin):
        try:
            requests.get(httpbin('delay/10'), timeout=2.0)
        except requests.exceptions.Timeout as e:
            assert 'Read timed out' in e.args[0].args[0]

    @pytest.mark.parametrize(
        'timeout', (
            (0.1, None),
            Urllib3Timeout(connect=0.1, read=None)
        ))
    def test_connect_timeout(self, timeout):
        try:
            requests.get(TARPIT, timeout=timeout)
            pytest.fail('The connect() request should time out.')
        except ConnectTimeout as e:
            assert isinstance(e, ConnectionError)
            assert isinstance(e, Timeout)

test_stream_timeout uses httpbin to create an endpoint that delays its response by 10s, then sets a 2s timeout on the request, expecting a read-timeout error. test_connect_timeout catches the connection-timeout error by accessing an unroutable address.
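At bottom, a read timeout is plain socket behavior. It can be demonstrated without any network at all by using a socketpair whose peer never writes; this is a minimal sketch of the mechanism, not how the Requests tests do it.

```python
import socket

# A connected pair of sockets where the peer never sends anything,
# so recv() on the other end must time out.
a, b = socket.socketpair()
a.settimeout(0.2)  # seconds, analogous to requests' read timeout

try:
    a.recv(1)          # blocks until the 0.2s timeout expires
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    a.close()
    b.close()

assert timed_out
```

This is why TestTimeout needs httpbin's delay endpoint for the read timeout and an unroutable TARPIT address for the connect timeout: the two failures happen at different stages of the socket lifecycle.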

TestRequests contains the business-flow tests for Requests, of which there are at least two kinds:

class TestRequests:

    def test_basic_building(self):
        req = requests.Request()
        req.url = 'http://kennethreitz.org/'
        req.data = {'life': '42'}

        pr = req.prepare()
        assert pr.url == req.url
        assert pr.body == 'life=42'

    def test_path_is_not_double_encoded(self):
        request = requests.Request('GET', "http://0.0.0.0/get/test case").prepare()
        assert request.path_url == '/get/test%20case'

    def test_HTTP_200_OK_GET_ALTERNATIVE(self, httpbin):
        r = requests.Request('GET', httpbin('get'))
        s = requests.Session()
        s.proxies = getproxies()

        r = s.send(r.prepare())
        assert r.status_code == 200

    def test_set_cookie_on_301(self, httpbin):
        s = requests.session()
        url = httpbin('cookies/set?foo=bar')
        s.get(url)
        assert s.cookies['foo'] == 'bar'
  • To validate URL handling, only prepare the request. No request is actually sent, so the test runs fast with no network traffic
  • Where response data is required, build real request-response traffic with httpbin
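The path-encoding assertion rests on standard percent-encoding, which the standard library can confirm directly without any Requests machinery:

```python
from urllib.parse import quote, urlsplit

# The raw path contains a space, as in test_path_is_not_double_encoded.
path = urlsplit('http://0.0.0.0/get/test case').path
assert quote(path) == '/get/test%20case'

# Quoting an already-quoted path would double-encode the percent sign,
# which is exactly the bug the test guards against.
assert quote('/get/test%20case') == '/get/test%2520case'
```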

Underlying API testing

testserver builds a simple thread-based TCP server with __enter__ and __exit__ methods, so it can also be used as a context manager.

class TestTestServer:

    def test_basic(self):
        """messages are sent and received properly"""
        question = b"success?"
        answer = b"yeah, success"

        def handler(sock):
            text = sock.recv(1000)
            assert text == question
            sock.sendall(answer)

        with Server(handler) as (host, port):
            sock = socket.socket()
            sock.connect((host, port))
            sock.sendall(question)
            text = sock.recv(1000)
            assert text == answer
            sock.close()

    def test_text_response(self):
        """the text_response_server sends the given text"""
        server = Server.text_response_server(
            "HTTP/1.1 200 OK\r\n" +
            "Content-Length: 6\r\n" +
            "\r\nroflol"
        )

        with server as (host, port):
            r = requests.get('http://{}:{}'.format(host, port))

            assert r.status_code == 200
            assert r.text == u'roflol'
            assert r.headers['Content-Length'] == '6'
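The Server helper boils down to a TCP socket serviced by a background thread. A minimal self-contained sketch of the same idea (the names here are illustrative, not the real testserver API):

```python
import socket
import threading

def run_echo_server(handler):
    # One-shot TCP server on an ephemeral port, like tests/testserver.
    srv = socket.socket()
    srv.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
    srv.listen(1)
    host, port = srv.getsockname()

    def serve():
        conn, _ = srv.accept()
        handler(conn)
        conn.close()
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return host, port

def echo_handler(conn):
    data = conn.recv(1000)
    conn.sendall(b'yeah, ' + data)

host, port = run_echo_server(echo_handler)

client = socket.socket()
client.settimeout(5)
client.connect((host, port))
client.sendall(b'success?')
reply = client.recv(1000)
client.close()

assert reply == b'yeah, success?'
```

Binding to port 0 is the same trick the real test server uses: every run gets a fresh port, so tests never collide with each other or with services already running on the machine.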

The test_basic method checks the Server end to end, ensuring both sides can send and receive correctly. First the client sock sends a question; then the server's handler asserts that the received data is the question and sends back an answer; finally the client confirms that it received the answer. The test_text_response method performs a minimal (deliberately incomplete) HTTP-level test: given a request that follows the HTTP specification, Server.text_response_server replays its canned response. Here is a test case verifying that the URL fragment, which browsers handle locally, is never transmitted over the network:

def test_fragment_not_sent_with_request():
    """Verify that the fragment portion of a URI isn't sent to the server."""
    def response_handler(sock):
        req = consume_socket_content(sock, timeout=0.5)
        sock.send(
            b'HTTP/1.1 200 OK\r\n'
            b'Content-Length: ' + bytes(len(req)) + b'\r\n'
            b'\r\n' + req
        )

    close_server = threading.Event()
    server = Server(response_handler, wait_to_close_event=close_server)

    with server as (host, port):
        url = 'http://{}:{}/path/to/thing/#view=edit&token=hunter2'.format(host, port)
        r = requests.get(url)
        raw_request = r.content

        assert r.status_code == 200
        headers, body = raw_request.split(b'\r\n\r\n', 1)
        status_line, headers = headers.split(b'\r\n', 1)
        assert status_line == b'GET /path/to/thing/ HTTP/1.1'
        for frag in (b'view', b'edit', b'token', b'hunter2'):
            assert frag not in headers
            assert frag not in body

    close_server.set()

The request URL carries the fragment #view=edit&token=hunter2, and the mock server echoes the raw request back in the response body, so the test can inspect exactly what went over the wire: the status line must be GET /path/to/thing/ HTTP/1.1, and none of the fragment keywords may appear in the response headers or body.
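Splitting off the fragment is ordinary URL handling; the standard library makes the expected client-side behavior explicit:

```python
from urllib.parse import urldefrag

url = 'http://example.com/path/to/thing/#view=edit&token=hunter2'
base, fragment = urldefrag(url)

# Only `base` belongs in the request line; the fragment is client-side state.
assert base == 'http://example.com/path/to/thing/'
assert fragment == 'view=edit&token=hunter2'
```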

Combining the two levels of testing for Requests led to tip 9:

  9. Build mock services to support tests

Summary

To summarize, the following nine tips emerged from the unit test practice for Requests:

  1. Test only public interfaces
  2. Use mocks to aid unit testing
  3. Cover as many branches of the target function as possible
  4. Use fixtures to reuse the object under test, and pytest.mark.parametrize to reuse test parameters
  5. The same object can be unit tested at different levels
  6. Where environment variables are involved, simulate different environments with a context manager
  7. Use pytest.raises to catch expected exceptions
  8. Use IO simulation in unit tests
  9. Build mock services to support tests

References

  • docs.python-requests.org/en/master/
  • httpbin.org