Python collections module
Python’s collections module is often overlooked by Python programmers, but it contains a few useful container data types that come in handy in some special scenarios and use-cases.
Let’s dig in!
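As of writing (Wed 6 Nov 2019, Python v3.8), the collections module has a total of 9 container data types: namedtuple(), deque, ChainMap, Counter, OrderedDict, defaultdict, UserDict, UserList, and UserString.
We’ll now see what each one has to offer and which use-case it satisfies.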
namedtuple() is actually a factory function that creates a tuple subclass with named fields.
Python’s flavour of OOP has no real notion of private scope, properties, or variables (we can enforce it to some extent using meta-classes, __setattr__, etc., but true immutability is still missing). This is where namedtuple comes in: it gives us a succinct way to create immutable attributes.
We can access the elements either by numeric index, like a regular tuple, or by field name, as an attribute. It can be useful where we have a requirement to make class attributes immutable.
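A minimal sketch of the two access styles and the immutability described above. The Point type and its fields are made up for illustration.

```python
from collections import namedtuple

# Point is a hypothetical example type, not something defined by the module.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1, y=2)

by_index = p[0]  # tuple-style positional access
by_name = p.y    # attribute-style access by field name

# Fields cannot be reassigned; trying to do so raises AttributeError.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace() builds a new instance and leaves the original untouched.
moved = p._replace(x=10)
```

Note that only the fields themselves are frozen: if a field holds a mutable object such as a list, that object can still be changed in place.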
deque is the implementation of “double-ended” queues in Python and is pronounced “deck”. It supports append and pop from either end in approximately O(1) time. If the maxlen argument is not specified or is None, a deque may grow to an arbitrary length. Once a bounded deque is full, inserting a new element discards an element from the opposite end.
Regarding its use-case, deques fit anywhere we need a queue or a stack, or want to keep only the most recent N items.
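Here is a small sketch of both behaviours: operations at either end, and a bounded deque that drops old items. The values are arbitrary.

```python
from collections import deque

d = deque([1, 2, 3])
d.append(4)          # add on the right
d.appendleft(0)      # add on the left
right = d.pop()      # remove from the right
left = d.popleft()   # remove from the left

# A bounded deque keeps only the most recent maxlen items.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)  # once full, the oldest item falls off the left end
```

After the loop, recent holds only the last three values that were appended.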
Sometimes we need to override global values/config from a local config, such as the MySQL host for production and dev environments. In such situations we can use ChainMap, which encapsulates multiple mappings as one view. If a key is present in multiple mappings, the value from the first mapping passed is used.
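The override scenario above could look roughly like this. The setting names and host values are hypothetical placeholders.

```python
from collections import ChainMap

# Hypothetical settings, purely for illustration.
defaults = {"mysql_host": "prod-db.example.com", "mysql_port": 3306}
local = {"mysql_host": "localhost"}

# Mappings are searched left to right, so local overrides defaults.
config = ChainMap(local, defaults)

host = config["mysql_host"]  # found in local first
port = config["mysql_port"]  # falls through to defaults

# Writes only ever touch the first mapping in the chain.
config["mysql_port"] = 3307
```

Because ChainMap is a view, the underlying dicts are not copied; later changes to local or defaults are visible through config.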
Counter is a subclass of dict used to store counts of hashable objects. It comes in handy when you have to keep track of the number of occurrences of each element in an iterable (list, etc.). For missing/non-existent keys it returns 0 instead of raising KeyError.
Example use-case: find the numbers which appear more than once in a list.
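One way to sketch that use-case, with a made-up list of numbers:

```python
from collections import Counter

numbers = [1, 2, 2, 3, 3, 3, 4]  # sample data for illustration

counts = Counter(numbers)

# Keep every number whose count is greater than one.
duplicates = [n for n, c in counts.items() if c > 1]

# Looking up a key that was never counted gives 0, not KeyError.
missing = counts[99]

# most_common(1) returns the single highest-count (element, count) pair.
top = counts.most_common(1)
```

The lookup of a missing key does not add that key to the counter.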
Fundamentally, a dict doesn’t need a definite order, since we access its elements by key rather than by position/index. Still, there are use-cases where one needs to preserve the order of insertion, and OrderedDict exists to satisfy them. Since Python 3.7 the built-in dict also preserves insertion order, but OrderedDict still adds order-aware features such as move_to_end() and order-sensitive equality.
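A short sketch of the order-aware operations OrderedDict offers. The keys and values are arbitrary.

```python
from collections import OrderedDict

od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3

# Move an existing key to the end of the ordering.
od.move_to_end("first")
order = list(od)

# popitem() can remove from either end of the ordering.
newest = od.popitem(last=True)
oldest = od.popitem(last=False)

# Two OrderedDicts with the same items in a different order are not equal.
same = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

Two plain dicts with the same items compare equal regardless of order, which is one practical difference between the two types.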
Almost all of us have faced a KeyError while accessing non-existent keys in a dict. Fortunately, defaultdict saves us from this notorious error and returns a default value of our choosing instead. defaultdict accepts a function/factory as its first argument; it is called with no arguments to produce the default value for a missing key.
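As a rough illustration of the factory idea, here are two common factories, list and int, applied to made-up data:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # sample data

# list() is called for each missing key, so we can append right away.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# int() returns 0, which makes a handy starting point for counters.
visits = defaultdict(int)
visits["home"] += 1
visits["home"] += 1
```

Keep in mind that merely reading a missing key with square brackets inserts it with the default value; use .get() or the in operator to check for a key without creating it.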
UserList, UserDict, and UserString
These data types are wrappers around the standard list, dict, and str objects, respectively. You can use them as base classes for your own objects so that you can extend the default behaviours and add new ones.
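A minimal sketch of extending default behaviour with UserDict. LowerCaseDict is a hypothetical class invented for this example: it normalises string keys to lower case on write and on read.

```python
from collections import UserDict


class LowerCaseDict(UserDict):
    """Hypothetical dict that stores and looks up string keys in lower case."""

    @staticmethod
    def _normalise(key):
        return key.lower() if isinstance(key, str) else key

    def __setitem__(self, key, value):
        super().__setitem__(self._normalise(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._normalise(key))


# The constructor routes initial items through __setitem__ as well.
headers = LowerCaseDict({"Content-Type": "text/html"})
headers["ACCEPT"] = "*/*"

content_type = headers["content-TYPE"]
stored_keys = sorted(headers.data)  # .data is the wrapped plain dict
```

The wrapped dict is available as the data attribute, which is what makes these classes easier to extend than subclassing dict directly, where some built-in methods bypass overridden ones. This sketch leaves membership tests (the in operator) un-normalised.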
The collections module in Python is not very extensive or fancy, and it doesn’t intend to satisfy every use case, but it comes in very handy for particular use-cases where a developer would otherwise have to write a custom, tricky workaround.
Problems like immutable object attributes, keeping track of occurrences of elements in a list, maintaining order in dicts, and popping from both ends of a list are very common, and therefore Python has standard modules to handle them.
You can learn more about the collections module in the official docs, as each of these types has some cool methods to manipulate data which aren’t mentioned in this article.