pydantic

pypi license

Current Version: 0.2.1

Data validation and settings management using python 3.6 type hinting.

Define how data should be in pure, canonical python; validate it with pydantic.

PEP 484 introduced type hinting into python 3.5, PEP 526 extended that with syntax for variable annotation in python 3.6.

pydantic uses those annotations to validate that untrusted data takes the form you want.

Example:

from datetime import datetime
from typing import List
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None
    friends: List[int] = []

external_data = {'id': '123', 'signup_ts': '2017-06-01 12:22', 'friends': [1, '2', b'3']}
user = User(**external_data)
print(user)
# > User id=123 name='John Doe' signup_ts=datetime.datetime(2017, 6, 1, 12, 22) friends=[1, 2, 3]
print(user.id)
# > 123

(This script is complete, it should run “as is”)

What’s going on here:

  • id is of type int; the annotation only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible, otherwise an exception would be raised.
  • name is inferred as a string from the default, it is not required as it has a default.
  • signup_ts is a datetime field which is not required (None if it’s not supplied), pydantic will process either a unix timestamp int (e.g. 1496498400) or a string representing the date & time.
  • friends uses python’s typing system, it is required to be a list of integers, as with id integer-like objects will be converted to integers.

If validation fails pydantic with raise an error with a breakdown of what was wrong:

from pydantic import ValidationError
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    print(e.json())

"""
{
  "friends": [
    {
      "error_msg": "invalid literal for int() with base 10: 'not number'",
      "error_type": "ValueError",
      "index": 2,
      "track": "int"
    }
  ],
  "id": {
    "error_msg": "field required",
    "error_type": "Missing",
    "index": null,
    "track": null
  },
  "signup_ts": {
    "error_msg": "Invalid datetime format",
    "error_type": "ValueError",
    "index": null,
    "track": "datetime"
  }
}
"""

Rationale

So pydantic uses some cool new language feature, but why should I actually go an use it?

no brainfuck
no new schema definition micro-language to learn. If you know python (and perhaps skim read the type hinting docs) you know how to use pydantic.
plays nicely with your IDE/linter/brain
because pydantic data structures are just instances of classes you define; auto-completion, linting, mypy and your intuition should all work properly with your validated data.
dual use
pydantic’s BaseSettings class allows it to be used in both a “validate this request data” context and “load my system settings” context. The main difference being that system settings can can have defaults changed by environment variables and more complex objects like DSNs and python objects often required.
fast
In benchmarks pydantic is around twice as fast as trafaret. Other comparisons to cerberus, marshmallow, DRF, jsonmodels etc. to come.
validate complex structures
use of recursive pydantic models and typing’s List and Dict etc. allow complex data schemas to be clearly and easily defined.
extendible
pydantic allows custom data types to be defined or you can extend validation with the clean_* methods on a model.

Install

Just:

pip install pydantic

pydantic has no dependencies except python 3.6+. If you’ve got python 3.6 and pip installed - you’re good to go.

Usage

PEP 484 Types

pydantic uses typing types to define more complex objects.

from typing import Dict, List, Optional, Union, Set

from pydantic import BaseModel


class Model(BaseModel):
    simple_list: list = None
    list_of_ints: List[int] = None

    simple_dict: dict = None
    dict_str_float: Dict[str, float] = None

    simple_set: set = None
    set_bytes: Set[bytes] = None

    str_or_bytes: Union[str, bytes] = None
    none_or_str: Optional[str] = None

    compound: Dict[Union[str, bytes], List[Set[int]]] = None

print(Model(simple_list=['1', '2', '3']).simple_list)  # > ['1', '2', '3']
print(Model(list_of_ints=['1', '2', '3']).list_of_ints)  # > [1, 2, 3]

print(Model(simple_dict={'a': 1, b'b': 2}).simple_dict)  # > {'a': 1, b'b': 2}
print(Model(dict_str_float={'a': 1, b'b': 2}).dict_str_float)  # > {'a': 1.0, 'b': 2.0}

(This script is complete, it should run “as is”)

Choices

pydantic uses python’s standard enum classes to define choices.

from enum import Enum, IntEnum

from pydantic import BaseModel


class FruitEnum(str, Enum):
    pear = 'pear'
    banana = 'banana'


class ToolEnum(IntEnum):
    spanner = 1
    wrench = 2


class CookingModel(BaseModel):
    fruit: FruitEnum = FruitEnum.pear
    tool: ToolEnum = ToolEnum.spanner


print(CookingModel())
# > CookingModel fruit=<FruitEnum.pear: 'pear'> tool=<ToolEnum.spanner: 1>
print(CookingModel(tool=2, fruit='banana'))
# > CookingModel fruit=<FruitEnum.banana: 'banana'> tool=<ToolEnum.wrench: 2>
print(CookingModel(fruit='other'))
# will raise a validation error

(This script is complete, it should run “as is”)

Recursive Models

More complex hierarchical data structures can be defined using models as types in annotations themselves.

The ellipsis ... just means “Required” same as annotation only declarations above.

from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    count: int = ...
    size: float = None

class Bar(BaseModel):
    apple = 'x'
    banana = 'y'

class Spam(BaseModel):
    foo: Foo = ...
    bars: List[Bar] = ...


m = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(m)
# > Spam foo=<Foo count=4 size=None> bars=[<Bar apple='x1' banana='y'>, <Bar apple='x2' banana='y'>]
print(m.values())
# {'foo': {'count': 4, 'size': None}, 'bars': [{'apple': 'x1', 'banana': 'y'}, {'apple': 'x2', 'banana': 'y'}]}

(This script is complete, it should run “as is”)

Error Handling

from typing import List

from pydantic import BaseModel, ValidationError


class Location(BaseModel):
    lat = 0.1
    lng = 10.1

class Model(BaseModel):
    list_of_ints: List[int] = None
    a_float: float = None
    is_required: float = ...
    recursive_model: Location = None

try:
    Model(list_of_ints=['1', 2, 'bad'], a_float='not a float', recursive_model={'lat': 4.2, 'lng': 'New York'})
except ValidationError as e:
    print(e)

"""
4 errors validating input
list_of_ints:
  invalid literal for int() with base 10: 'bad' (error_type=ValueError track=int index=2)
a_float:
  could not convert string to float: 'not a float' (error_type=ValueError track=float)
is_required:
  field required (error_type=Missing)
recursive_model:
  1 error validating input (error_type=ValidationError track=Location)
    lng:
      could not convert string to float: 'New York' (error_type=ValueError track=float
"""

try:
    Model(list_of_ints=1, a_float=None, recursive_model=[1, 2, 3])
except ValidationError as e:
    print(e.json())

"""
{
  "is_required": {
    "error_msg": "field required",
    "error_type": "Missing",
    "index": null,
    "track": null
  },
  "list_of_ints": {
    "error_msg": "'int' object is not iterable",
    "error_type": "TypeError",
    "index": null,
    "track": null
  },
  "recursive_model": {
    "error_msg": "cannot convert dictionary update sequence element #0 to a sequence",
    "error_type": "TypeError",
    "index": null,
    "track": "Location"
  }
}
"""

(This script is complete, it should run “as is”)

Exotic Types

pydantic comes with a number of utilities for parsing or validating common objects.

from pathlib import Path

from pydantic import DSN, BaseModel, EmailStr, NameEmail, PyObject, conint, constr, PositiveInt, NegativeInt


class Model(BaseModel):
    cos_function: PyObject = None
    path_to_something: Path = None

    short_str: constr(min_length=2, max_length=10) = None
    regex_str: constr(regex='apple (pie|tart|sandwich)') = None

    big_int: conint(gt=1000, lt=1024) = None
    pos_int: PositiveInt = None
    neg_int: NegativeInt = None

    email_address: EmailStr = None
    email_and_name: NameEmail = None

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None

m = Model(
    cos_function='math.cos',
    path_to_something='/home',
    short_str='foo',
    regex_str='apple pie',
    big_int=1001,
    pos_int=1,
    neg_int=-1,
    email_address='Samuel Colvin <s@muelcolvin.com >',
    email_and_name='Samuel Colvin <s@muelcolvin.com >',
)
print(m.values())
"""
{
    'cos_function': <built-in function cos>,
    'path_to_something': PosixPath('/home'),
    'short_str': 'foo', 'regex_str': 'apple pie',
    'big_int': 1001,
    'pos_int': 1,
    'neg_int': -1,
    'email_address': 's@muelcolvin.com',
    'email_and_name': <NameEmail("Samuel Colvin <s@muelcolvin.com>")>,
    ...
    'dsn': 'postgres://postgres@localhost:5432/foobar'
}
"""

(This script is complete, it should run “as is”)

Model Config

Behaviour of pydantic can be controlled via the Config class on a model.

Here default for config parameter are shown together with their meaning.

from pydantic import BaseModel


class UserModel(BaseModel):
    id: int = ...

    class Config:
        min_anystr_length = 0  # min length for str & byte types
        max_anystr_length = 2 ** 16  # max length for str & byte types
        min_number_size = -2 ** 64  # min size for numbers
        max_number_size = 2 ** 64  # max size for numbers
        raise_exception = True  # whether or not to raise an exception if the data is invalid
        validate_all = False  # whether or not to validate field defaults
        ignore_extra = True  # whether to ignore any extra values in input data
        allow_extra = False  # whether or not too allow (and include on the model) any extra values in input data
        fields = None  # extra information on each field, currently just "alias is allowed"

Settings

One of pydantic’s most useful applications is to define default settings, allow them to be overridden by environment variables or keyword arguments (e.g. in unit tests).

This usage example comes last as it uses numerous concepts described above.

from pydantic import DSN, BaseSettings, PyObject


class Settings(BaseSettings):
    redis_host = 'localhost'
    redis_port = 6379
    redis_database = 0
    redis_password: str = None

    auth_key: str = ...

    invoicing_cls: PyObject = 'path.to.Invoice'

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None

    class Config:
        env_prefix = 'MY_PREFIX_'  # defaults to 'APP_'
        fields = {
            'auth_key': {
                'alias': 'my_api_key'
            }
        }

Here redis_port could be modified via export MY_PREFIX_REDIS_PORT=6380 or auth_key by export my_api_key=6380.

Usage with mypy

Pydantic works with mypy provided you use the “annotation only” version of required variables:

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel, NoneStr

class Model(BaseModel):
    age: int
    first_name = 'John'
    last_name: NoneStr = None
    signup_ts: Optional[datetime] = None
    list_of_ints: List[int]

m = Model(age=42, list_of_ints=[1, '2', b'3'])
print(m.age)
# > 42

Model()
# will raise a validation error for age and list_of_ints

This script is complete, it should run “as is”. You can also run it through mypy with:

mypy --ignore-missing-imports --follow-imports=skip --strict-optional pydantic_mypy_test.py

Strict Optional

For your code to pass with --strict-optional you need to to use Optional[] or an alias of Optional[] for all fields with None default, this is standard with mypy.

Pydantic provides a few useful optional or union types:

  • NoneStr aka. Optional[str]
  • NoneBytes aka. Optional[bytes]
  • StrBytes aka. Union[str, bytes]
  • NoneStrBytes aka. Optional[StrBytes]

If these aren’t sufficient you can of course define your own.

Required Fields and mypy

The ellipsis notation ... will not work with mypy, you need to use annotation only fields as in the example above.

Warning

Be aware that using annotation only fields will alter the order of your fields in metadata and errors: annotation only fields will always come last, but still in the order they were defined.

To get round this you can use the Required (via from pydantic import Required) field as an alias for ellipses or annotation only.

History

v0.2.1 (2017-06-07)

  • pypi and travis together messed up the deploy of v0.2 this should fix it

v0.2.0 (2017-06-07)

  • breaking change: values() on a model is now a method not a property, takes include and exclude arguments
  • allow annotation only fields to support mypy
  • add pretty to_string(pretty=True) method for models

v0.1.0 (2017-06-03)

  • add docs
  • add history