Requirements
To be honest, studying language details in isolation is boring and easy to forget. This study grew out of a real business requirement, and language details are easier to understand and remember when they are tied to a concrete scenario.
We keep a dictionary of player sessions to track player connections. The pseudocode looks something like this:
player_session_dict["1000"] = "playerA"
assert player_session_dict["1000"] == "playerA"
In general, a plain {} works fine, but some protocols use an int ID, and then the session lookup fails:
assert player_session_dict[1000] == "playerA" # error
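The failure comes from how a plain dict compares keys: 1000 and "1000" never compare equal, so they are two unrelated keys. A minimal sketch of the problem, reusing the player_session_dict name from the pseudocode above:

```python
# A plain dict treats 1000 and "1000" as two unrelated keys.
player_session_dict = {"1000": "playerA"}

print(1000 == "1000")                  # False: an int never equals a str
print("1000" in player_session_dict)   # True
print(1000 in player_session_dict)     # False

try:
    player_session_dict[1000]
except KeyError as exc:
    print("KeyError:", exc)            # KeyError: 1000

# One workaround is to normalize the key by hand at every call site.
print(player_session_dict[str(1000)])  # playerA
```

Normalizing by hand works, but it is easy to forget at one call site, which is the motivation for pushing the conversion into the data structure itself.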
To solve this problem, we want to implement a data structure for storing sessions in which either an int or a str key retrieves the same session:
player_session_dict = SomeDict()
player_session_dict[1000] = "playerA"
assert player_session_dict.get(1000) == "playerA"
assert player_session_dict.get("1000") == "playerA"
This article covers the following topics:
- object vs dict
- Four ways to access properties
- Object’s __dict__ property
- Use __slots__ to restrict objects
- Implement SomeDict class
- Conclusion
- Performance tips
object vs dict
Start with Python objects and dictionaries and write the following test cases:
>>> class A:
...     pass
...
>>> a = A()
>>> class B(object):
...     pass
...
>>> b = B()
>>> class C(dict):
...     pass
...
>>> c = C()
In Python 2, class A is the old-style way of writing a class and class B is the new-style way; in Python 3 the two are equivalent. I am used to writing class B, which feels more explicit.
Check the types of the a, b, and c objects:
>>> isinstance(a, object)
True
>>> isinstance(a, dict)
False
>>> isinstance(b, object)
True
>>> isinstance(b, dict)
False
>>> isinstance(c, object)
True
>>> isinstance(c, dict)
True
dict inherits from object, which is common sense. This can also be verified from the builtins.py stub:
class dict(object):
    ...
So far, our data structure SomeDict can be derived from either object or dict.
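The inheritance chain can also be checked at runtime instead of reading the stub. A small sketch using only built-ins; the class C mirrors the one defined earlier:

```python
# Every class ultimately derives from object, including the built-in dict.
print(issubclass(dict, object))   # True
print(dict.__mro__)               # (<class 'dict'>, <class 'object'>)

class C(dict):
    pass

# A subclass of dict places dict between itself and object.
print([cls.__name__ for cls in C.__mro__])  # ['C', 'dict', 'object']
```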
{} and dict() create dictionaries with no difference:
>>> a = {}
>>> b = dict()
>>> type(a)
<class 'dict'>
>>> type(b)
<class 'dict'>
>>> a == b
True
>>> a is b
False
The same goes for memory usage:
>>> import sys
>>>
>>> sys.getsizeof(a)
280
>>> b = dict()
>>> sys.getsizeof(b)
280
Four ways to access properties
There are roughly four ways to get attributes or values from an object in Python:
- . (dot): accesses an attribute of an object. AttributeError is raised if the attribute does not exist.
- [] (square brackets): gets a list/map value by index or key. KeyError is raised if the dictionary key does not exist.
- get: the get method, same as square brackets for dictionaries, except that a missing key returns None instead of raising.
- in: determines whether the key exists.
Ordinary objects can get attributes using the dot:
>>> class A(object):
...     pass
...
>>> a = A()
>>> a.name = "aa"
>>> a.name
'aa'
[] and the get method cannot be used to obtain attributes of an ordinary object:
>>> a["name"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'A' object is not subscriptable
>>> a.get("name")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute 'get'
>>> a
<__main__.A object at 0x10de4fad0>
Normal dictionaries can use [] to get values:
>>> b = {}
>>> b["name"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'name'
>>> b["name"] = "bb"
>>> b["name"]
'bb'
If the key does not exist, a KeyError is raised. The get method is safer, since it returns None for a nonexistent key:
>>> b.get("age")
>>> print(b.get("age"))
None
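get also accepts an optional second argument that is returned in place of None when the key is missing. A short sketch, reusing the dictionary b from above:

```python
b = {"name": "bb"}

# The second argument to get() is returned when the key is missing.
print(b.get("age"))        # None
print(b.get("age", 0))     # 0
print(b.get("name", "?"))  # bb

# get() never inserts the key; the dictionary is unchanged.
print(b)                   # {'name': 'bb'}
```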
in can also be used to check whether a dictionary key exists first, and to fetch the value only after confirming it:
>>> if "age" in b:
...     print("age in b")
...     print(b["age"])
...
Normal dictionaries cannot use the dot to get values:
>>> b.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'name'
What if a class is derived from a dictionary? First look at the code:
>>> class C(dict):
...     pass
...
>>>
>>> c = C()
>>> c.name = "cc"
>>> c.name
'cc'
>>> c["age"] = 10
>>> c["age"]
10
A C object can use the dot to get attributes and can also use [] to get values, so it has both object and dict features. Note, however, that the two cannot be mixed and matched:
>>> c["name"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'name'
>>>
>>> c.age
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'C' object has no attribute 'age'
get and in behave the same way as []:
>>> c.get("name")
>>> c.get("age")
10
>>> "name" in c
False
>>> "age" in c
True
Jumping to the definition of the get method shows the following:
class Mapping(_Collection[_KT], Generic[_KT, _VT_co]):
    # TODO: We wish the key type could also be covariant, but that doesn't work,
    # see discussion in https://github.com/python/typing/pull/273.
    @abstractmethod
    def __getitem__(self, k: _KT) -> _VT_co:
        ...
    # Mixin methods
    @overload
    def get(self, k: _KT) -> Optional[_VT_co]: ...
    @overload
    def get(self, k: _KT, default: Union[_VT_co, _T]) -> Union[_VT_co, _T]: ...
    def items(self) -> AbstractSet[Tuple[_KT, _VT_co]]: ...
    def keys(self) -> AbstractSet[_KT]: ...
    def values(self) -> ValuesView[_VT_co]: ...
    def __contains__(self, o: object) -> bool: ...
The in operator corresponds to the __contains__ method. Mapping also declares __getitem__, so [] can be used as well.
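The "mixin methods" idea can be seen with collections.abc.Mapping from the standard library: a class that supplies only __getitem__, __iter__, and __len__ gets get and __contains__ for free. A minimal sketch; the Sessions class is hypothetical and exists only for illustration:

```python
from collections.abc import Mapping

class Sessions(Mapping):
    """Hypothetical read-only mapping used only for illustration."""

    def __init__(self, data):
        self._data = dict(data)

    # The three abstract methods Mapping requires:
    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

s = Sessions({"1000": "playerA"})
print(s["1000"])      # playerA  ([] calls __getitem__)
print(s.get("1000"))  # playerA  (mixin method built on __getitem__)
print(s.get("2000"))  # None
print("1000" in s)    # True     (mixin __contains__)
print("2000" in s)    # False
```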
Object’s __dict__ property
To fully understand the differences in the use of C objects above, you need to understand the implementation of object, mainly __dict__. First look at the code:
>>> class C(dict):
...     pass
...
>>> c = C()
>>> c.name = "cc"
>>>
>>> c["age"] = 10
>>>
>>> c
{'age': 10}
>>>
>>> c.__dict__
{'name': 'cc'}
You can see that a C object is backed by two separate dictionaries. One is the dictionary part, which comes from dict and is accessed with []; the other is __dict__, inherited from object, which is accessed with the dot.
__dict__ is an implicit attribute of custom objects, such as the a object above:
>>> a.__dict__
{'name': 'aa'}
Even class A has one:
>>> A.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'A' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None})
Here’s what the official documentation explains:
Custom classes: Each class has a separate namespace implemented by a dictionary object. Class attribute references are translated to lookups in this dictionary, for example C.x is translated to C.__dict__["x"].
Class instances: Each class instance has a separate namespace implemented by a dictionary object, in which attribute references are looked up first. When an attribute is not found there but exists in the instance's class, the search continues with the class attributes. Special attributes: __dict__ is the attribute dictionary; __class__ is the instance's class.
Mappings/dictionaries: These objects represent a collection of objects indexed by an arbitrary index set. The entry with index k can be selected from mapping a with the subscript notation a[k]; this can be used in expressions and as the target of assignments or del statements. The built-in function len() returns the number of entries in a mapping.
The namespace here can be understood as a scope, for example:
name = "aaa"
def func():
    name = "bbb"
    pass
Here name can have different values in different scopes: aaa in the global scope and bbb inside func. Likewise, for two instances c1 and c2 of class C, the same attribute name name lives in different __dict__ namespaces.
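The lookup order described in the documentation excerpts can be observed directly. A small sketch, assuming a class C with a class attribute kind (a name made up for this example):

```python
class C(object):
    kind = "session"          # stored in the class namespace, C.__dict__

c1 = C()
c2 = C()
c1.name = "first"
c2.name = "second"

# Each instance has its own __dict__.
print(c1.__dict__)            # {'name': 'first'}
print(c2.__dict__)            # {'name': 'second'}

# kind is not in the instance namespace, so lookup falls back to the class.
print("kind" in c1.__dict__)  # False
print(c1.kind)                # session

# Assigning on the instance shadows the class attribute for c1 only.
c1.kind = "custom"
print(c1.kind, c2.kind, C.kind)  # custom session session
```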
Use __slots__ to restrict objects
For ordinary objects, we can define and use:
>>> class D(object):
...     def __init__(self, name):
...         self.name = name
...
>>>
>>> d = D("dd")
>>> d.name
'dd'
>>> d.__dict__
{'name': 'dd'}
Defining name and then using name is very natural. But you can also assign and use the age attribute dynamically like this:
>>> d.age = 10
>>> d.age
10
>>> d.__dict__
{'name': 'dd', 'age': 10}
Code written this way is difficult to maintain later. We can use __slots__ to restrict objects:
>>> class E(object):
...     __slots__ = ("name",)
...     def __init__(self, name):
...         self.name = name
...
>>> e = E("ee")
>>> e.name
'ee'
>>> e.age = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'E' object has no attribute 'age'
E defines a single slot called name, so only the name attribute can be used; any other attribute raises an error. With __slots__, the object’s __dict__ is also optimized away:
>>> e.__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'E' object has no attribute '__dict__'
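One detail worth keeping in mind is that the restriction only holds for classes that declare __slots__ themselves. A sketch under that assumption; the subclass F is hypothetical and added only to show the contrast:

```python
class E(object):
    __slots__ = ("name",)     # trailing comma makes this a one-element tuple

    def __init__(self, name):
        self.name = name

class F(E):
    pass                      # no __slots__ here, so F instances get a __dict__

e = E("ee")
f = F("ff")

print(hasattr(e, "__dict__"))   # False
print(hasattr(f, "__dict__"))   # True

f.age = 10                      # allowed again on the subclass
print(f.__dict__)               # {'age': 10}

try:
    e.age = 10
except AttributeError:
    print("E rejects attributes outside __slots__")
```

The name attribute of f still lives in the inherited slot, which is why only age shows up in f.__dict__.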
Implement SomeDict class
Now that we know the basics, we can implement a SomeDict that meets our needs. It should behave like {} so that existing business code is not affected.
class SomeDict(dict):
    """A dict that converts every key to str, so 1000 and "1000" are the same key."""

    def __getitem__(self, item):
        # Called for d[key]
        return super(SomeDict, self).__getitem__(str(item))

    def __setitem__(self, key, value):
        # Called for d[key] = value
        super(SomeDict, self).__setitem__(str(key), value)

    def __delitem__(self, key):
        # Called for del d[key]
        super(SomeDict, self).__delitem__(str(key))

    def get(self, item, default=None):
        # Called for d.get(key); keeps dict's optional default argument
        return super(SomeDict, self).get(str(item), default)

    def __contains__(self, item):
        # Called for key in d
        return super(SomeDict, self).__contains__(str(item))
Here are the test cases:
def test_some_dict():
    session_clients = SomeDict()
    session_clients["1000"] = "1000"
    assert session_clients["1000"] == "1000"
    assert session_clients[1000] == "1000"
    assert session_clients.get("1000") == "1000"
    assert session_clients.get(1000) == "1000"
    assert session_clients.get("non_key") is None
    try:
        session_clients["non_key"]
    except KeyError:
        pass
    assert "1000" in session_clients
    assert 1000 in session_clients
    assert "non_key" not in session_clients
    assert 10001 not in session_clients
    del session_clients[1000]
    assert 1000 not in session_clients
    print("success")
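One caveat when subclassing dict directly: in CPython, the dict constructor and update() do not go through an overridden __setitem__, so keys passed in that way are stored unconverted. The sketch below shows this and one possible alternative built on collections.UserDict, which routes those writes through __setitem__. StrKeyDict is a hypothetical name, and the SomeDict here is cut down to __setitem__ for brevity:

```python
from collections import UserDict

class SomeDict(dict):
    # Same idea as above, reduced to __setitem__ for brevity.
    def __setitem__(self, key, value):
        super().__setitem__(str(key), value)

# dict's own constructor and update() do not call the overridden __setitem__.
raw = SomeDict({1000: "playerA"})
raw.update({2000: "playerB"})
print(list(raw))                  # [1000, 2000]  (keys were not converted)

class StrKeyDict(UserDict):
    """Hypothetical alternative: UserDict routes writes through __setitem__."""

    def __getitem__(self, key):
        return super().__getitem__(str(key))

    def __setitem__(self, key, value):
        super().__setitem__(str(key), value)

    def __delitem__(self, key):
        super().__delitem__(str(key))

    def __contains__(self, key):
        return super().__contains__(str(key))

sessions = StrKeyDict({1000: "playerA"})
sessions.update({2000: "playerB"})
print(list(sessions))             # ['1000', '2000']
print(sessions.get(1000))         # playerA
print(2000 in sessions, "2000" in sessions)  # True True
```

If sessions are only ever written with [] assignment, the dict subclass is enough; the UserDict variant matters once data arrives through the constructor or update().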
Conclusion
We dug into the details of how Python implements objects and dictionaries, compared the use of . and [], and implemented a dictionary that only uses strings as keys. A brief summary:
- For custom objects, use . to get attribute values
- For dictionary objects, use [] to get values
- For dictionary objects, get and in offer a friendlier lookup (no exceptions)
Performance tips
Here are two more performance tuning tips for dictionaries.
{} vs dict() performance comparison
Use timeit to test the speed of creating objects with {} and dict():
$ python3 -m timeit 'x={}'
20000000 loops, best of 5: 18.1 nsec per loop
$ python3 -m timeit 'x=dict()'
5000000 loops, best of 5: 93.6 nsec per loop
You’ll find that using the {} syntax is much faster. Let’s write the following test case:
a = {}
b = dict()
Here is the bytecode the test case compiles to:
  1           0 BUILD_MAP                0
              2 STORE_NAME               0 (a)

  3           4 LOAD_NAME                1 (dict)
              6 CALL_FUNCTION            0
              8 STORE_NAME               2 (b)
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
As you can see, the former is a single BUILD_MAP instruction, while the latter has to look up the name dict and then call it, so the former is faster.
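The bytecode can be inspected with the standard dis module. Exact opcode names and offsets vary between Python versions (the call opcode in particular has been renamed over time), so this sketch only checks for the instructions discussed here; the opnames helper is made up for the example:

```python
import dis

def opnames(source):
    """Return the opcode names the compiler emits for a source string."""
    return [ins.opname for ins in dis.get_instructions(source)]

literal_ops = opnames("a = {}")
call_ops = opnames("b = dict()")

# The literal builds the dict with a single dedicated opcode.
print("BUILD_MAP" in literal_ops)            # True
# dict() needs a name lookup plus a call opcode, whose exact name
# (CALL_FUNCTION, CALL, ...) differs between Python versions.
print("LOAD_NAME" in call_ops)               # True
print(any("CALL" in op for op in call_ops))  # True
```

Calling dis.dis("a = {}") prints the full listing for the interpreter you are running.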
__slots__ performance comparison
Timeit can also be used to test __slots__:
# python3 -m timeit -s 'class A(object):pass' -- "A()"
# 5000000 loops, best of 5: 67.2 nsec per loop
# python3 -m timeit -s 'class A(object): __slots__ = ("x",)' -- "A()"
# 5000000 loops, best of 5: 63.1 nsec per loop
The numbers show that using __slots__ also makes object creation slightly faster (63.1 vs 67.2 nsec per loop in this run).
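The same measurement can be run from a script with the timeit module, which makes it easier to repeat on your own machine. A rough sketch; the class names Plain and Slotted are made up, and the timings will differ from the figures above:

```python
import timeit

class Plain(object):
    pass

class Slotted(object):
    __slots__ = ("x",)

# repeat() returns one total time per run; the minimum is the least noisy.
plain = min(timeit.repeat(Plain, number=100_000, repeat=5))
slotted = min(timeit.repeat(Slotted, number=100_000, repeat=5))

print(f"Plain():   {plain / 100_000 * 1e9:.1f} nsec per call")
print(f"Slotted(): {slotted / 100_000 * 1e9:.1f} nsec per call")
```

Because the difference is only a few nanoseconds, it is worth running this several times before drawing conclusions.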
References
- Docs.python.org/3/reference…
- Docs.python.org/3/reference…
- Mp.weixin.qq.com/s/d5Y4Ghqa_…