Pythonic
Python is “simple and correct”!
- Walrus operator
:=,a := 10will perform assignment and also return10. case [str(name), _, _, (float(lat), float(lon))]:==is usually what you want.isis mostly used for comparison withNoneor other sentinels, it can’t be overloaded so it’s slightly faster than__eq__- Weak references | Fluent Python, several classes can be used to reference objects without actually being counted, useful for caches.
- Prefer EAFP (easy to ask forgiveness than permission) over LBYL (look before you leap), with
try-exceptblocks. This improves safety in multi-threaded programs.
String
- Single quote by default.
'%s/%s' % ('hello', 'world), the%formating operator understands tuple.'\N{NAME OF UNICODE CHAR}'str.maketransandstr.translatelocale.strxformfor transforming Unicode text to locale-aware form for comparison. Alternative, usepyucaand the more powerfulpyicu.unicodedatamodule for metadata about unicode characters.- RegEx built from
strandbytesare different - Use
reprlib.repr()to get concise and readable representation for objects
Data Models
l[i, ...]could meanl[i, :, :, :]__missing__method may or may not be called, depending the class we’re subclassingdictis now ordered by default, butOrderedDictis still optimized for orderingtypes.MutableMappingProxy- Consider extending from
UserDictinstead ofdictdue to their different implementations dict.setdefaultis helpful for updating mutable mappingsdict.keysanddict.valesreturn a view, which is similar to asetmemoryviewcreates a shared view without copying bytes__post_init__method for data classes- Inherit from
TypedDictto have annotations for fields in a dictionary. Unlike named tuple or data classes, it has no runtime effect, can’t set defaults.
Object Oriented
- Mixin classes can be used to provide additional functionalities to a class when inherited.
- Class attributes are stored in a shared dictionary, hence adding attributes after instantiation compromises performance.
- In data classes, fields without type hints are considered class attributes, unless it’s typed as
ClassVarorInitVar - Use
__match_args__method to implement objects matching with positional attributes - Name mangling:
__attrbecomes_ClassName__attr - A method calls
super()will activate the classes in__mro__, the corresponding method may need to also callsuper()to cooperate - Extend from
UserDict,UserList,UserStr, since the built-in types are implemented in C and they don’t call magical methods in the expected way. - Try to avoid inheritance unless the benefit is clear. Static duck typing is better.
- Operator overloading
- For
a + b,a.__add__is invoked first, if doesn’t work, thenb.__radd__ - Return
NotImplementedif not supported
- For
- When accessing attributes from an instance, it first searches in
obj.__class__for properties, if it doesn’t exist, falls back to a key inobj.__ditc. then toobj.__getattr__. This can be bypassed by directly accessingobj.__dict__. - Properties are class attributes. Can be created with
@properties, with corresponding decorators for getters and setters. Can also be created with property factory functions. property(fget=None, fset=None, fdel=None, doc=None)__init__is initializer,__new__is the real constructor.- Using descriptors as getter/setter to validate properties.
__set__,__get__methods. - Every function is a non-overriding descriptor.
Cls.func.__get__(obj)returns a method bound toobj,Cls.func.__get__(None, Cls)returns a function without binding.
Metaprogramming
type(name, bases, dict)can be used to construct classes in run time.typeis a meta class that build other classes.__init_subclass__can be used to customize some behaviors of subclasses when they’re created. Note that__slots__can’t be tweaked here since the classes have been constructed by__new__— meta class is needed for this.typeis an instance oftype,typeis a subclass ofobject,objectis an instance oftype.- Every class is an instance of
type(maybe indirectly), but only meta classes are subclass oftype. Meta classes likeabc.ABCMetainherits fromtypethe power to construct classes. MetaKlass.__new__constructs new classes.
Utils
sortedis based on__lt__methodfunc.__code__to view the code, andfunc.__code__.co_varnamesandfunc.__code__.co_freevarsto view the var names.func.__closure__to view the closure. Content can be viewed withfunc.__closure__[0].cell_contents- Given a
str, we can havestr.fmtto use it as a formatting string. inspectionlib, to help inspect objects.local.getpreferredencodinglist(zip(*a))can be used to transpose a 2D listitertools.chainto chain generators together
Functions
- Position-only argument, tuple-captured arguments, keyword-only arguments… e.g.
func(a, b, /, *d, e, **f) operatorlibrary, which contains operators that can be used together withreduce, or to be used to avoid creatinglambdafunctions.itemgetter,attrgetter, andmethodcallercan be used to create a function that accesses certain item/attr, or call a certain method. e.g.key=itemgetter(2).- Use
functool.partialandpartialmethodto create a callable from a callable with arguments bound to certain values. - Optional arg,
Optional[str]is a shorthand ofUnion[str, None], orstr | None tuple[int, ...]for unlimited tuple- Use
abc.Mappinginstead ofdictin param type hints. Parameters should be typed with generics, returns should be with concretes. @functools.wraps(func)decorator.@cachecan take up all memory,@lru_cachewithmaxsizeis safer.maxsizeshould be a power of2.typeddefaults toFalse
@singledispatch
def htmlize(obj: object) -> str:
content = html.escape(repr(obj))
return f'<pre>{content}</pre>'
@htmlize.register
def _(text: str) -> str:
content = html.escape(text).replace('\n', '<br/>\n')
return f'<p>{content}</p>'
# stack and pass types if needed
@htmlize.register(decimal.Decimal)
@htmlize.register(float)
def _(x) -> str:
...locals()can be used to access all local vars.
Types
Duck Typing
Prefer duck typing when possible, then goose typing, then nominal typing (
isinstance) if really needed.
- Type
xis consistent withy, ifxsupports all operations ofy.Anyis a special case. TypeVarT = TypeVar('T', int, float), orT = TypeVar('T', bound=Hashable), other options are also supportedTypeVarcan be replaced withtypekeyword in 3.12. But for generic classes they still implicitly inherit fromGeneric- Use
varianceargument to specify
# define a protocol
from typing import Protocol, Any
class SupportsLessThan(Protocol):
def __lt__(self, other: Any) -> bool: ...
LT = TypeVar('LT', bound=SupportsLessThan)typing.TYPE_CHECKINGis aboolthat isFalsein runtime andTruein checking__annotations__can be used to read type hints from data classes. The annotations are read as string, and evaluated when called withinspect.get_annotations()- Goose typing by sub-classing the ABC, or define classes first, then register them as virtual subclass of an ABC.
- Duck typing by EAFP (easier to ask for forgiveness than for permission), e.g. to assert
ocan is compatible withlist, simply calllist(o). __subclasshook__can be used to modify the behavior of subclass/instance. For example, any object with__len__will be considered aSequence, even without subclassing or registering.- However, runtime checks don’t always work well, static protocol typing is still helpful.
@runtime_checkableto make protocols checkable at runtime.- It’s better to make protocols “single method”, as in
SupportsComplex. - Typing for numbers are tricky, since
number.Numberhas no methods in ABC - Python interpreter goes a long way for iterators and sequences to work, even when methods are partially implemented.
- Use
@overloadto define multiple function signatures.
Decorators
pytest.mark.parametrizefor testing. e.g.@mark.parametrize('a, b', [...])@classmethodreceives class itself as first arg,@staticmethod receives nothing@typing.final- cannot be overridden,@typing.override- explicitly tell type checker that this function overrides super class
Iterators and Generators
re.finditerperforms lazy evaluation (unlikefindall), useful for building lazy evaluated iterators.- An object is
Iterableas long as it have__getitem__(doesn’t have to have__next__) - To build iterator, also try using
yield,yield from, and generator expressions. RaiseStopIterationexception if done. GeneratorandCoroutinetypes are functionally equivalent, but classic coroutine with generator function should useGenerator- Iterator produces items, generator consumes items.
- Pattern: send a sentinel value to generator to terminate it; raise
StopIterationinside generator to terminate it. - When
gen.close()is invoked, aStopIterationexception will be raised in theyieldstatement.
Control Flow
Context Manager
ctx.__enter__andctx.__exit__will be called.ctx.__exit__will also be called with exceptions for handling.gen.__exit__returns a truthy value to indicate that exceptions have been handled, otherwise they will be propagated.- Multiple contexts since Python 3.10:
with ( Ctx1() as ctx1, Ctx2() as ctx2 ): @contextlib.contextmanagerto make a function a context. Anything beforeyieldwill be__enter__, the yielded object will be given toas, anything after will be__exit__. It’s assumed that the context handles the exceptions.- Functions decorated with
@contextmanagercan be used to decorate other functions.
else Block
elsecan be used withforandwhile, which will be executed if the loop is terminated bybreak.elsecan be used withexcept, which will be executed if thetryblock didn’t raise any exceptions.
Modules and Packages
- Modules are also first-class objects, can be found in
sys.modules. - Avoid doing things in the module’s body except binding attributes.
- Special methods like
__getattr__at module body will work. imoprt builtinsor the__builtins__attribute can be used to monkey patch builtins.- If the first statement is a string literal, it’s bound to
__doc__ from mod import *binds all attributes except those starting with_. Module can define a__all__for a list of names that should be bound.- Importing is implemented in
builtin.__import__, it first searches insys.modules .pthfile for Python path config. When importing,sys.pathis looked up, the final path for moduleMwould beM/__init__.py- Use
python -m testnameto see if a name exists in stdlib. importlib.reload(M)can reload a module, but existing bindings to attributes won’t be reloaded (henceimportis preferred overfromwhen importing a module)- Hooks can be added for module importing.
P/__init__.py,P’s module body, must be present forPto be recognized as a package (except for namespace packages). IfP.__all__is not set,from P import *will only import names in__init__.py- Relative import:
from . import Xwill look forXin current package, more dots can go up in hierarchy.