My journey with Python type annotations

Published: October 8, 2023 at 19:32
Updated: October 9, 2023 at 11:43

Python is the language that I use the most. I had heard long ago that Python supports type hinting, but it was learning TypeScript (TS) that got me interested in Python's type hint system.

The Typing System in Python

PEP 484 introduced type hints, a.k.a. type annotations for Python.

What is a type?

According to PEP 483, a particular type can be defined in four ways:

  • By explicitly listing all values. E.g., True and False form the type bool.
  • By specifying functions which can be used with variables of a type. E.g. all objects that have a __len__ method form the type Sized. Both [1, 2, 3] and 'abc' belong to this type, since one can call len on them.
  • By a simple class definition.
  • There are also more complex types. E.g., one can define the type FancyList as all lists containing only instances of int, str or their subclasses. The value [1, 'abc', UserID(42)] has this type.
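The first two definitions can be approximated at runtime. The sketch below is only an illustration: it uses isinstance checks, which are a runtime stand-in for what a static type checker reasons about, and the variable names are made up for this example.

```python
from collections.abc import Sized

# A type defined by listing its values: bool consists of True and False.
bool_values = {True, False}

# A type defined by the functions it supports: anything with __len__ is Sized.
list_is_sized = isinstance([1, 2, 3], Sized)
str_is_sized = isinstance('abc', Sized)
int_is_sized = isinstance(42, Sized)  # int has no __len__

print(list_is_sized, str_is_sized, int_is_sized)  # True True False
```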

In Python, types are implemented using classes. Classes in Python are a dynamic, runtime concept: they are themselves instances of the metaclass type.
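A quick way to see that classes are runtime objects is to inspect them directly. In this sketch, UserID borrows its name from the PEP 483 example and Dynamic is a hypothetical class created on the fly.

```python
class UserID(int):
    """A plain class; the name is borrowed from the PEP 483 example."""

# A class is an object created at runtime, and its type is the metaclass `type`.
print(isinstance(UserID, type))  # True
print(type(UserID) is type)      # True

# Classes can even be built dynamically with the three-argument form of type().
Dynamic = type('Dynamic', (object,), {'answer': 42})
print(Dynamic().answer)  # 42
```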

What is a subtype?

Consider the following scenario: if first_var has type first_type, and second_var has type second_type, is it safe to assign first_var = second_var?

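PEP 483 gives a strong criterion for when it should be safe:

  • Every value from second_type is also in the set of values of first_type.
  • Every function from first_type is also in the set of functions of second_type.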

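To understand this, think of the relationship between real numbers and integers, or between animals and dogs: every integer is also a real number, and every dog is also an animal, so integers are a subtype of real numbers and dogs are a subtype of animals.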
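The animal and dog analogy can be sketched in code. The class and function names here are hypothetical, and the comment about the rejected assignment describes what a static type checker is expected to report, not something Python enforces at runtime.

```python
class Animal:
    def eat(self) -> str:
        return 'eating'

class Dog(Animal):
    def bark(self) -> str:
        return 'woof'

def feed(animal: Animal) -> str:
    return animal.eat()

# Safe: every Dog is an Animal, and Dog supports everything Animal does.
pet: Animal = Dog()
result = feed(Dog())
print(result)  # eating

# Unsafe, so a type checker should reject it: not every Animal can bark.
# dog: Dog = Animal()
```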

Nominal subtyping and structural subtyping

Gradual typing

Python adopts gradual typing, which means that we can annotate only part of a program. To make this possible, a new type Any is introduced, together with the following definition of the is-consistent-with relationship:

  • A type t1 is consistent with a type t2 if t1 is a subtype of t2. (But not the other way around.)
  • Any is consistent with every type. (But Any is not a subtype of every type.)
  • Every type is consistent with Any. (But every type is not a subtype of Any.)

Any can be considered a type that has all values and all methods. Type checkers treat unannotated functions and variables as having type Any (Pylance does it differently, see Pylance and Pyright).
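A minimal sketch of gradual typing, mixing an unannotated function with an annotated one. The function names are hypothetical, and the "accepted" comments describe what a checker following the rules above is expected to do.

```python
from typing import Any

def legacy(data):            # unannotated: treated as Any by type checkers
    return data

def double(x: int) -> int:   # fully annotated
    return x * 2

value: Any = legacy(21)
doubled = double(value)  # accepted: Any is consistent with int
boxed: Any = doubled     # accepted: int is consistent with Any
print(doubled)  # 42
```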

Syntax

Type annotations are written with a single colon (:) and, for the return value of a function, an arrow (->).

Annotating a function

The syntax for annotating a function was introduced in PEP 3107, and PEP 484 added semantics to it.

funcdef                   ::=  [decorators] "def" funcname [type_params] "(" [parameter_list] ")"
                               ["->" expression] ":" suite
decorators                ::=  decorator+
decorator                 ::=  "@" assignment_expression NEWLINE
parameter_list            ::=  defparameter ("," defparameter)* "," "/" ["," [parameter_list_no_posonly]]
                                 | parameter_list_no_posonly
parameter_list_no_posonly ::=  defparameter ("," defparameter)* ["," [parameter_list_starargs]]
                               | parameter_list_starargs
parameter_list_starargs   ::=  "*" [parameter] ("," defparameter)* ["," ["**" parameter [","]]]
                               | "**" parameter [","]
parameter                 ::=  identifier [":" expression]
defparameter              ::=  parameter ["=" expression]
funcname                  ::=  identifier

Note that lambda functions do not support annotations.

Example:

def greeting(name: str) -> str:
    return 'Hello ' + name
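As a small follow-up sketch, the annotations of greeting can be inspected at runtime. Since a lambda cannot carry annotations itself, one possible workaround is to annotate the variable that holds it. The name shout is hypothetical.

```python
from typing import Callable

def greeting(name: str) -> str:
    return 'Hello ' + name

# A lambda cannot be annotated, but the variable holding it can.
shout: Callable[[str], str] = lambda name: name.upper()

print(greeting.__annotations__)  # {'name': <class 'str'>, 'return': <class 'str'>}
print(shout('hello'))            # HELLO
```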

Annotating variables

PEP 484 introduced type annotations with a focus mainly on functions. The syntax for annotating variables was introduced in PEP 526, which now has Final status.

A type annotation can be added to an assignment statement or to a single expression. It tells a third-party type checker the intended type of the annotation target.

From a type checker's point of view, the following three statements are equivalent:

var = value # type: annotation
var: annotation; var = value
var: annotation = value

Annotations for local variables are not evaluated; annotations for variables at the module or class level, however, are evaluated.
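The local-variable case can be checked with a short sketch. NotDefinedAnywhere is deliberately an undefined name, and local_scope and Config are hypothetical. The example only demonstrates that a local annotation is neither evaluated nor stored, while a class-level annotation is recorded.

```python
def local_scope() -> int:
    # `NotDefinedAnywhere` does not exist, yet this line does not raise,
    # because annotations of local variables are not evaluated.
    count: NotDefinedAnywhere = 1
    return count

class Config:
    retries: int = 3

local_result = local_scope()
print(local_result)                 # 1
print(local_scope.__annotations__)  # {'return': <class 'int'>}
print(Config.__annotations__)       # {'retries': <class 'int'>}
```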

For more information please see PEP 526.

The typing module

Python 3.5 introduced a new module, typing, to support type hints. We can think of it as a library for annotating variables and functions: it provides the fundamental building blocks for constructing types (including generic types).
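To give a feel for these building blocks, here is a minimal sketch of a generic container built with TypeVar, Generic and Optional. The Stack class is hypothetical and exists only for illustration. The list[T] annotation assumes Python 3.9 or later.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar('T')  # placeholder for the element type

class Stack(Generic[T]):
    """Hypothetical generic container, used only for illustration."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> Optional[T]:
        # Returns None instead of raising when the stack is empty.
        return self._items.pop() if self._items else None

stack = Stack[int]()
stack.push(1)
stack.push(2)
top = stack.pop()
print(top)  # 2
```

A type checker would be expected to flag stack.push('three') here, although nothing stops that call at runtime.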

Comparison with TypeScript

Python's typing system differs from that of TS. In TS, scripts must be transpiled into JavaScript before they can be executed in a runtime. Python needs no such step.

In fact, the typing system is tied to the runtime. This is implied by the description of the typing module:

This module provides runtime support for type hints.

Therefore, although the typing system is mainly used by static type checkers, it does affect the runtime. For example, the Python interpreter records the annotations in a class definition and saves them in a dict called __annotations__:

class A:
    a: int = 3
    b: str = 'abc'

# This prints the annotations dict of `A`:
# {'a': <class 'int'>, 'b': <class 'str'>}
print(A.__annotations__)

In addition, the annotation expressions themselves are evaluated by the Python interpreter.

typing.cast Utility function

According to PEP 484, typing.cast tells the type checker to treat the return value of the call as having the specified type:

import typing

# No Error, as type checkers will be forced to treat the 
# right hand side as of type `int`.
a: int = typing.cast(int, '123')

This can be useful when type checkers incorrectly infer the type of a variable.

At runtime, this function simply returns its second argument unchanged; only type checkers treat the result as having the specified type.
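The runtime behavior is easy to confirm with a short sketch. The variable names are hypothetical. Note that cast performs no conversion, so an actual int still requires an explicit int() call.

```python
import typing

raw = '123'
a = typing.cast(int, raw)

print(type(a))   # <class 'str'>
print(a is raw)  # True

# cast() does not convert anything; a real conversion must be explicit.
number = int(raw)
print(number + 1)  # 124
```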

Pylance and Pyright

Pylance is the default Python language server in VS Code. It is built on top of Pyright, an open-source static type checker for Python (Pylance itself is not open-source, though). It offers not only code completion, code navigation, etc., but also type checking, which is the part relevant to this blog post.

In Pyright, an implicit Any is regarded as the Unknown type, and only an explicit Any is treated as Any, which is different from mypy. The Pyright documentation says that Unknown is a special form of Any, used to warn developers that some types are only partially known. This can be illustrated by the following code snippet:

import typing

# Pylance treats the `name` parameter as of type Any
def func_with_explicit_any(name: typing.Any):
    print("Hello,", str(name))

# Pylance treats the `name` parameter as of type Unknown
def func_with_implicit_any(name):
    print("Hello,", str(name))

When Pylance encounters a variable of type Unknown, there is one thing that should be borne in mind. First, recall how an explicit Any behaves:

def func(inp: typing.Any):
    # No Error, as `Any` is compatible with every type.
    tmp1: int = inp
    # No Error, as every type is compatible with `Any`.
    tmp2: typing.Any = tmp1
    return 0

However, in Pylance, the type Unknown does not have these properties. The behavior is a little strange at first glance:

def func_with_unknown_argument(inp):
    # Because `inp` is not annotated, 
    # Pylance treats it as of type `Unknown`.
    # When we try to assign an `Unknown` type variable to other 
    # types, no error will be reported,
    # BUT Pylance will treat the type of `tmp` as of
    # `Unknown | int` instead.
    tmp: int = inp
    return 0

Therefore, type narrowing for Unknown is different from that for Any in Pylance. Perhaps the purpose of this behavior is to "propagate the Unknown type to better inform developers of its existence"?

According to my (limited) experiments, however, we can narrow an Unknown to a specific type when only the type parameters of a generic type are Unknown:

def func(l: list) -> int:
    # the type of `l` is `list[Unknown]`,
    # but we are able to assign it to a `list[int]` variable
    tmp: list[int] = l
    return len(tmp)