Python | Collections are so easy to use!

Source: South Branch to Warm North Branch to Cold MA

https://blog.csdn.net/mall_lucy/article/details/108822795

[Introduction]: collections is a Python module of specialized container types that provide alternatives to the general-purpose built-in containers dict, list, set, and tuple. To give everyone a clearer picture, this article summarizes the main features of collections in detail. Let's learn together!

The collections module implements specialized containers that serve as alternatives to Python's standard built-in containers dict, list, set, and tuple. It provides:

Counter: A dict subclass for counting hashable objects.

defaultdict: A dict subclass that calls a factory function to supply default values for missing keys.

OrderedDict: A dict subclass that remembers the order in which entries were added.

namedtuple: A factory function for creating tuple subclasses with named fields.

deque: A list-like container with fast appends and pops at both ends.

ChainMap: A dict-like class that groups multiple mappings into a single view.

Counter

Counter is a subclass of dict, mainly used to count how many times each object occurs.

>>> import collections
>>> # Count the number of occurrences of each character
... collections.Counter('hello world')
Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
>>> # Count the number of occurrences of each word
... collections.Counter('hello world hello lucy'.split())
Counter({'hello': 2, 'world': 1, 'lucy': 1})

Common methods:

elements(): returns an iterator over the elements, repeating each one as many times as its count. Elements with a count less than 1 are ignored.
most_common([n]): returns a list of the n most common elements and their counts, ordered from most to least common. If n is omitted, all elements are returned.
subtract([iterable-or-mapping]): subtracts counts using the elements of an iterable or another mapping (or Counter). Both the inputs and the resulting counts can be zero or negative.
update([iterable-or-mapping]): counts the elements of an iterable, or adds the counts from another mapping (or Counter).
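
The four methods above can be sketched in a few lines. This is a minimal illustration, and the sample sentence and variable names are made up for the example:

```python
from collections import Counter

# Word counts for a small made-up sentence: hello=3, world=2, lucy=1
words = Counter('hello world hello lucy hello world'.split())

# most_common(n): the n highest counts, most frequent first
top_two = words.most_common(2)

# elements(): each element repeated as many times as its count
expanded = sorted(words.elements())

# update(): adds counts from another iterable, in place
words.update(['lucy', 'lucy'])      # 'lucy' goes from 1 to 3

# subtract(): removes counts in place; results may be zero or negative
words.subtract({'world': 5})        # 'world' goes from 2 to -3

# elements() skips entries whose count is less than 1, so 'world' disappears
remaining = sorted(words.elements())

print(top_two)
print(words['world'])
print(remaining)
```

Note that update() and subtract() modify the Counter in place and return None, so their results are read back from the Counter itself.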

>>> c = collections.Counter('hello world hello lucy'.split())
>>> c
Counter({'hello': 2, 'world': 1, 'lucy': 1})
>>> # Get the count for a given object; the get method also works
... c['hello']
2
>>> # View the elements
... list(c.elements())
['hello', 'hello', 'world', 'lucy']
>>> c1 = collections.Counter('hello world'.split())
>>> c2 = collections.Counter('hello lucy'.split())
>>> c1
Counter({'hello': 1, 'world': 1})
>>> c2
Counter({'hello': 1, 'lucy': 1})
>>> # Add counts with +, which returns a new Counter (c1.update(c2) adds in place instead)
... c1 + c2
Counter({'hello': 2, 'world': 1, 'lucy': 1})
>>> # Subtract counts with -, which keeps only positive counts (c1.subtract(c2) works in place and keeps zero or negative counts)
... c1 - c2
Counter({'world': 1})
>>> # Clear all counts
... c.clear()
>>> c
Counter()

defaultdict

defaultdict returns a new dictionary-like object. It is a subclass of the built-in dict class; when a missing key is accessed, it calls default_factory to create the value for that key.

class collections.defaultdict([default_factory[, ...]])

>>> d = collections.defaultdict()
>>> d
defaultdict(None, {})
>>> e = collections.defaultdict(str)
>>> e
defaultdict(<class 'str'>, {})

Example

A typical use of defaultdict is to pass one of the built-in types (such as str, int, list, or dict) as the default factory. Called without arguments, each of these types returns an empty value of that type.

>>> e = collections.defaultdict(str)
>>> e
defaultdict(<class 'str'>, {})
>>> e['hello']
''
>>> e
defaultdict(<class 'str'>, {'hello': ''})
>>> # A regular dictionary raises an error when a missing key is accessed
... e1 = {}
>>> e1['hello']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'hello'

Use int as default_factory

>>> fruit = collections.defaultdict(int)
>>> fruit['apple'] = 2
>>> fruit
defaultdict(<class 'int'>, {'apple': 2})
>>> fruit['banana']  # A missing key returns 0
0
>>> fruit
defaultdict(<class 'int'>, {'apple': 2, 'banana': 0})
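
Because a missing key starts at 0, defaultdict(int) works well as a simple tally. The sketch below is one way to do this; count_letters is a hypothetical helper written for this illustration, not part of the collections module:

```python
from collections import defaultdict

def count_letters(text):
    """Count each non-space character in text using defaultdict(int)."""
    counts = defaultdict(int)
    for ch in text:
        if ch != ' ':
            counts[ch] += 1   # a missing key is created with int() == 0 first
    return counts

letter_counts = count_letters('hello world')

print(letter_counts['l'])       # 3
print('z' in letter_counts)     # False: a membership test does not create the key
print(letter_counts.get('z'))   # None: get() does not call the factory either
```

Only indexing with square brackets triggers the factory, so in and get() are the safe choices when a lookup should not add an entry.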

Use list as default_factory

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = collections.defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d
defaultdict(<class 'list'>, {'yellow': [1, 3], 'blue': [2, 4], 'red': [1]})
>>> d.items()
dict_items([('yellow', [1, 3]), ('blue', [2, 4]), ('red', [1])])
>>> sorted(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Use dict as default_factory

>>> nums = collections.defaultdict(dict)
>>> nums[1] = {'one': 1}
>>> nums
defaultdict(<class 'dict'>, {1: {'one': 1}})
>>> nums[2]
{}
>>> nums
defaultdict(<class 'dict'>, {1: {'one': 1}, 2: {}})

Use set as default_factory

>>> types = collections.defaultdict(set)
>>> types['Cell phone'].add('Huawei')
>>> types['Cell phone'].add('Xiaomi')
>>> types['monitor'].add('AOC')
>>> types
defaultdict(<class 'set'>, {'Cell phone': {'Huawei', 'Xiaomi'}, 'monitor': {'AOC'}})

OrderedDict

Before Python 3.7, the order of keys in a built-in dict was not guaranteed to follow the order in which they were added. (Since Python 3.7, dict preserves insertion order as part of the language.)

The collections.OrderedDict class provides dictionary objects that preserve the order in which keys are added.

>>> o = collections.OrderedDict()
>>> o['k1'] = 'v1'
>>> o['k3'] = 'v3'
>>> o['k2'] = 'v2'
>>> o
OrderedDict([('k1', 'v1'), ('k3', 'v3'), ('k2', 'v2')])

If you assign a new value to an existing key, the key keeps its original position and only the value is overwritten.

>>> o['k1'] = 666
>>> o
OrderedDict([('k1', 666), ('k3', 'v3'), ('k2', 'v2')])
>>> dict(o)
{'k1': 666, 'k3': 'v3', 'k2': 'v2'}
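
Beyond keeping insertion order, OrderedDict offers a few order-aware operations. A minimal sketch of three of them follows, using illustrative keys and values:

```python
from collections import OrderedDict

a = OrderedDict([('k1', 'v1'), ('k2', 'v2')])
b = OrderedDict([('k2', 'v2'), ('k1', 'v1')])

# Two OrderedDicts compare equal only if their order matches as well
equal_as_ordered = (a == b)             # False

# Plain dicts ignore order when comparing
equal_as_dict = (dict(a) == dict(b))    # True

# move_to_end() moves an existing key to the right end
a.move_to_end('k1')
order_after_move = list(a)              # ['k2', 'k1']

# popitem(last=False) removes and returns the leftmost entry
first = a.popitem(last=False)           # ('k2', 'v2')

print(equal_as_ordered, equal_as_dict, order_after_move, first)
```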

namedtuple

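There are three ways to define a named tuple. In each case the first argument is the name of the new tuple type (Person1, Person2, and Person3 below); the field names can be given as a list of strings, a comma-separated string, or a space-separated string.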

>>> P1 = collections.namedtuple('Person1', ['name', 'age', 'height'])
>>> P2 = collections.namedtuple('Person2', 'name,age,height')
>>> P3 = collections.namedtuple('Person3', 'name age height')

Instantiate named tuples

>>> lucy = P1('lucy', 23, 180)
>>> lucy
Person1(name='lucy', age=23, height=180)
>>> jack = P2('jack', 20, 190)
>>> jack
Person2(name='jack', age=20, height=190)
>>> lucy.name  # Access a field as instance.attribute
'lucy'
>>> lucy.age
23
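
A named tuple is still a tuple, so it supports indexing and unpacking, and it comes with a few helpers whose names start with an underscore. The sketch below uses Person as an illustrative type name:

```python
from collections import namedtuple

# Person is an illustrative type name with the same fields as above
Person = namedtuple('Person', 'name age height')
lucy = Person('lucy', 23, 180)

# Fields can be read by index or by unpacking, like any tuple
first_field = lucy[0]
name, age, height = lucy

# Instances are immutable; _replace() returns a modified copy
older_lucy = lucy._replace(age=24)

# _asdict() converts the instance to a mapping keyed by field name
as_dict = lucy._asdict()

# _fields lists the field names in order
fields = Person._fields

print(first_field, older_lucy, as_dict, fields)
```

Assigning to a field directly (for example lucy.age = 24) raises AttributeError, which is why _replace() exists.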

deque

collections.deque([iterable[, maxlen]]) returns a new double-ended queue object, initialized from left to right (using the append() method) with data from iterable. If iterable is not specified, the new deque is empty.

Deques support thread-safe appends and pops from either end, each with approximately O(1) complexity.

Although list objects support similar operations, they are optimized for fast fixed-length operations: pop(0) and insert(0, v) cost O(n) on a list because every remaining element has to be shifted. A deque avoids this cost.

If maxlen is not specified or is None, a deque can grow to any length. Otherwise, the deque is bounded to the specified maximum length. Once a bounded deque is full, adding new items discards the same number of items from the opposite end.

Supported methods:

append(x): Add x to the right end.
appendleft(x): Add x to the left end.
clear(): Clear all elements, the length becomes 0.
copy(): Create a shallow copy.
count(x): Count the number of elements in the queue equal to x.
extend(iterable): Add elements in iterable to the right side of the queue.
extendleft(iterable): Add the elements of iterable to the left side of the queue. Note: because the elements are added on the left one at a time, they end up in reverse order.
index(x[,start[,stop]]): Return the position of x in the queue (at or after index start and before index stop). Returns the first match; raises ValueError if x is not found.
insert(i,x): Insert x at position i. Note: if the insertion would cause a bounded deque to grow beyond maxlen, an IndexError is raised.
pop(): Remove and return the rightmost element. Raises IndexError if the queue is empty.
popleft(): Remove and return the leftmost element. Raises IndexError if the queue is empty.
remove(value): Remove the first occurrence of value. Raises ValueError if it is not found.
reverse(): Reverse the elements of the deque in place. Returns None.
maxlen: The maximum length of the queue, or None if there is no limit.
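
A short sketch of a few of the methods listed above, with attention to the error cases. The values are arbitrary and chosen only for illustration:

```python
from collections import deque

d = deque('abc')

# extendleft() pushes items one at a time, so their order is reversed
d.extendleft('xyz')
after_extendleft = list(d)      # ['z', 'y', 'x', 'a', 'b', 'c']

# index() returns the position of the first match
position = d.index('a')         # 3

# remove() raises ValueError when the value is absent
try:
    d.remove('q')
    remove_failed = False
except ValueError:
    remove_failed = True

# insert() into a full bounded deque raises IndexError
full = deque([1, 2, 3], maxlen=3)
try:
    full.insert(1, 99)
    insert_failed = False
except IndexError:
    insert_failed = True

# append() on a full bounded deque discards from the opposite end instead
full.append(4)

print(after_extendleft, position, remove_failed, insert_failed, full)
```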

>>> d = collections.deque(maxlen=10)
>>> d
deque([], maxlen=10)
>>> d.extend('python')
>>> [i.upper() for i in d]
['P', 'Y', 'T', 'H', 'O', 'N']
>>> d.append('e')
>>> d.appendleft('f')
>>> d.appendleft('g')
>>> d.appendleft('h')
>>> d
deque(['h', 'g', 'f', 'p', 'y', 't', 'h', 'o', 'n', 'e'], maxlen=10)
>>> d.appendleft('i')
>>> d
deque(['i', 'h', 'g', 'f', 'p', 'y', 't', 'h', 'o', 'n'], maxlen=10)
>>> d.append('m')
>>> d
deque(['h', 'g', 'f', 'p', 'y', 't', 'h', 'o', 'n', 'm'], maxlen=10)

ChainMap

The background is this: we have multiple dictionaries or mappings and want to combine them into a single mapping. One option is to merge them with update. The problem is that this creates a new, separate data structure, so later changes to the original dictionaries are not reflected in the merged result. If you want lookups that stay in sync with the originals, you can use ChainMap.
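
The difference can be seen in a small sketch that builds both a merged copy and a ChainMap from the same two dictionaries, then changes the sources afterwards. The dictionary contents are made up for this illustration:

```python
from collections import ChainMap

d1 = {'apple': 1, 'banana': 2}
d2 = {'orange': 2}

# Merging with update() copies the entries into a new, independent dict
merged = {}
merged.update(d1)
merged.update(d2)

# ChainMap keeps references to the original dicts instead of copying them
chained = ChainMap(d1, d2)

# Change the source dictionaries after the merge
d2['orange'] = 10
d1['pear'] = 5

print(merged['orange'])     # 2: the copy is stale
print(chained['orange'])    # 10: the view reflects the change
print('pear' in merged)     # False
print('pear' in chained)    # True
```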

ChainMap can combine two or more dictionaries, and a lookup searches them from front to back. A simple example:

>>> d1 = {'apple': 1, 'banana': 2}
>>> d2 = {'orange': 2, 'apple': 3, 'pike': 1}
>>> combined1 = collections.ChainMap(d1, d2)
>>> combined2 = collections.ChainMap(d2, d1)
>>> combined1
ChainMap({'apple': 1, 'banana': 2}, {'orange': 2, 'apple': 3, 'pike': 1})
>>> combined2
ChainMap({'orange': 2, 'apple': 3, 'pike': 1}, {'apple': 1, 'banana': 2})
>>> for k, v in combined1.items():
...     print(k, v)
...
orange 2
apple 1
pike 1
banana 2
>>> for k, v in combined2.items():
...     print(k, v)
...
apple 3
banana 2
orange 2
pike 1

One point to note: modifying a ChainMap only ever changes the first dictionary. If the key does not exist in the first dictionary, it is added there, even when a later dictionary already contains that key.

>>> d1 = {'apple': 1, 'banana': 2}
>>> d2 = {'orange': 2, 'apple': 3, 'pike': 1}
>>> c = collections.ChainMap(d1, d2)
>>> c
ChainMap({'apple': 1, 'banana': 2}, {'orange': 2, 'apple': 3, 'pike': 1})
>>> c['apple']
1
>>> c['apple'] = 2
>>> c
ChainMap({'apple': 2, 'banana': 2}, {'orange': 2, 'apple': 3, 'pike': 1})
>>> c['pike']
1
>>> c['pike'] = 3
>>> c
ChainMap({'apple': 2, 'banana': 2, 'pike': 3}, {'orange': 2, 'apple': 3, 'pike': 1})

As the examples above suggest, ChainMap stores the dictionaries it is given in a list (the maps attribute). Additions, updates, and deletions only affect the first dictionary, while a lookup searches each dictionary in turn. The new_child() method essentially puts a new dictionary (by default {}) in front of the first element of the list, and parents returns a new ChainMap containing every dictionary except the first.

>>> a = collections.ChainMap()
>>> a['x'] = 1
>>> a
ChainMap({'x': 1})
>>> b = a.new_child()
>>> b
ChainMap({}, {'x': 1})
>>> b['x'] = 2
>>> b
ChainMap({'x': 2}, {'x': 1})
>>> b['y'] = 3
>>> b
ChainMap({'x': 2, 'y': 3}, {'x': 1})
>>> a
ChainMap({'x': 1})
>>> c = a.new_child()
>>> c
ChainMap({}, {'x': 1})
>>> c['x'] = 1
>>> c['y'] = 1
>>> c
ChainMap({'x': 1, 'y': 1}, {'x': 1})
>>> d = c.parents
>>> d
ChainMap({'x': 1})
>>> d is a
False
>>> d == a
True

>>> a = {'x': 1, 'z': 3}
>>> b = {'y': 2, 'z': 4}
>>> c = collections.ChainMap(a, b)
>>> c
ChainMap({'x': 1, 'z': 3}, {'y': 2, 'z': 4})
>>> c.maps
[{'x': 1, 'z': 3}, {'y': 2, 'z': 4}]
>>> c.parents
ChainMap({'y': 2, 'z': 4})
>>> c.parents.maps
[{'y': 2, 'z': 4}]
>>> c.parents.parents
ChainMap({})
>>> c.parents.parents.parents
ChainMap({})
