A closer Look at Fields

Fields are the basic building blocks of a Schema. Even though lima fields follow only the most basic protocol, they are rather powerful.

How a Field gets its Data

The PersonSchema from the last chapter contains three field objects named first_name, last_name and date_of_birth. These get their data from a person object’s attributes of the same name. But what if those attributes were named differently?

Data from arbitrary Object Attributes

Let’s say our model doesn’t have an attribute date_of_birth but an attribute birthday instead.

To get the data for our date_of_birth field from the model’s birthday attribute, we tell the field which attribute to read by supplying its name via the attr argument:

import datetime
from lima import Schema, fields

class Person:
    def __init__(self, first_name, last_name, birthday):
        self.first_name = first_name
        self.last_name = last_name
        self.birthday = birthday

person = Person('Ernest', 'Hemingway', datetime.date(1899, 7, 21))

class PersonSchema(Schema):
    first_name = fields.String()
    last_name = fields.String()
    date_of_birth = fields.Date(attr='birthday')

schema = PersonSchema()
schema.dump(person)
# {'date_of_birth': '1899-07-21',
#  'first_name': 'Ernest',
#  'last_name': 'Hemingway'}

Data from an Object’s Items

If an object has its data stored in items instead of attributes, we can tell the field about this by supplying the key argument instead of the attr argument:

import datetime
from lima import Schema, fields

person_dict = {
    'first_name': 'Ernest',
    'last_name': 'Hemingway',
    'birthday': datetime.date(1899, 7, 21),
}

class PersonDictSchema(Schema):
    last_name = fields.String(key='last_name')
    date_of_birth = fields.Date(key='birthday')

schema = PersonDictSchema()
schema.dump(person_dict)
# {'date_of_birth': '1899-07-21',
#  'last_name': 'Hemingway'}

Note

It’s currently not possible to provide None as key. Use a getter (see below) if you need to do this.

Data derived by different Means

What if we can’t get the information we need from a single attribute or key? Here getters come in handy.

A getter in this context is a callable that takes an object (in our case: a person object) and returns the value we’re interested in. We tell a field about the getter via the get parameter:

def sort_name_getter(obj):
    return '{}, {}'.format(obj.last_name, obj.first_name)

class PersonSchema(Schema):
    first_name = fields.String()
    last_name = fields.String()
    sort_name = fields.String(get=sort_name_getter)
    date_of_birth = fields.Date(attr='birthday')

schema = PersonSchema()
schema.dump(person)
# {'date_of_birth': '1899-07-21',
#  'first_name': 'Ernest',
#  'last_name': 'Hemingway',
#  'sort_name': 'Hemingway, Ernest'}

Note

For getters, lambda expressions come in handy. sort_name could just as well have been defined like this:

sort_name = fields.String(
    get=lambda obj: '{}, {}'.format(obj.last_name, obj.first_name)
)
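Since a getter is just a callable that takes the object, anything with that shape qualifies, including operator.attrgetter from the standard library. The following sketch deliberately doesn't import lima so that it runs on its own; Address, city_getter and initials_getter are made-up names for illustration.

```python
from operator import attrgetter

# Hypothetical stand-in models, for illustration only.
class Address:
    def __init__(self, city):
        self.city = city

class Person:
    def __init__(self, first_name, last_name, address):
        self.first_name = first_name
        self.last_name = last_name
        self.address = address

person = Person('Ernest', 'Hemingway', Address('Oak Park'))

# attrgetter with a dotted name returns a callable that follows
# obj.address.city, so it has the shape of a getter.
city_getter = attrgetter('address.city')

def initials_getter(obj):
    # Derive a value from more than one attribute.
    return '{}. {}.'.format(obj.first_name[0], obj.last_name[0])

print(city_getter(person))      # Oak Park
print(initials_getter(person))  # E. H.
```

Either callable could then be handed to a field via the get parameter, just like sort_name_getter above.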

Constant Field Values

Sometimes a field’s data is always the same. For example, if a schema provides a field for type information, this field will most likely always have the same value.

To reflect this, we could provide a getter that always returns the same value (here, for example, the string 'https://schema.org/Person'). But lima provides a better way to achieve the same result: Just provide the val parameter to a field’s constructor:

class TypedPersonSchema(Schema):
    _type = fields.String(val='https://schema.org/Person')
    givenName = fields.String(attr='first_name')
    familyName = fields.String(attr='last_name')
    birthDate = fields.Date(attr='birthday')

schema = TypedPersonSchema()
schema.dump(person)
# {'_type': 'https://schema.org/Person',
#  'birthDate': '1899-07-21',
#  'familyName': 'Hemingway',
#  'givenName': 'Ernest'}

Note

It’s currently not possible to provide None as a constant value using val. Use a getter if you need to do this.

On Field Parameters

attr, key, get and val are mutually exclusive. See lima.fields.Field for more information on this topic.
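To make the rule more tangible, here is a toy model of how a field's data source could be chosen from these parameters. This is not lima's implementation: make_getter is a hypothetical helper, and raising ValueError on conflicting arguments is an assumption made for this sketch. See lima.fields.Field for the real behaviour.

```python
from types import SimpleNamespace

def make_getter(name, attr=None, key=None, get=None, val=None):
    """Toy model: return a callable that produces a field's data."""
    given = [arg for arg in (attr, key, get, val) if arg is not None]
    if len(given) > 1:
        # assumption: conflicting sources are rejected up front
        raise ValueError('attr, key, get and val are mutually exclusive')
    if val is not None:
        return lambda obj: val          # constant value
    if get is not None:
        return get                      # user-supplied getter
    if key is not None:
        return lambda obj: obj[key]     # item lookup
    # fall back to the attribute named by attr, or to the field name
    attr_name = attr if attr is not None else name
    return lambda obj: getattr(obj, attr_name)

person = SimpleNamespace(first_name='Ernest', birthday='1899-07-21')

print(make_getter('first_name')(person))                      # Ernest
print(make_getter('date_of_birth', attr='birthday')(person))  # 1899-07-21
print(make_getter('_type', val='Person')(person))             # Person
```

In this toy model None stands for "not supplied", which would be one plausible reason why None can't be used as a key or as a constant value.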

How a Field presents its Data

If a field has a static method (or instance method) pack(), this method is used to present a field’s data. (Otherwise the field’s data is just passed through on marshalling. Some of the more basic built-in fields behave that way.)

So by implementing a pack() static method (or instance method), we can support marshalling of any data type we want:

from collections import namedtuple
from lima import fields, Schema

# a new data type
GeoPoint = namedtuple('GeoPoint', ['lat', 'long'])

# a field class for the new data type
class GeoPointField(fields.Field):
    @staticmethod
    def pack(val):
        ns = 'N' if val.lat >= 0 else 'S'
        ew = 'E' if val.long >= 0 else 'W'
        # use absolute values so the hemisphere letter carries the sign
        return '{}° {}, {}° {}'.format(abs(val.lat), ns, abs(val.long), ew)

# a model using the new data type
class Treasure:
    def __init__(self, name, location):
        self.name = name
        self.location = location

# a schema for that model
class TreasureSchema(Schema):
    name = fields.String()
    location = GeoPointField()

treasure = Treasure('The Amber Room', GeoPoint(lat=59.7161, long=30.3956))
schema = TreasureSchema()
schema.dump(treasure)
# {'location': '59.7161° N, 30.3956° E', 'name': 'The Amber Room'}

Or we can change how already supported data types are marshalled:

class FancyDate(fields.Date):
    @staticmethod
    def pack(val):
        return val.strftime('%A, the %d. of %B %Y')

class FancyPersonSchema(Schema):
    first_name = fields.String()
    last_name = fields.String()
    date_of_birth = FancyDate(attr='birthday')

schema = FancyPersonSchema()
schema.dump(person)
# {'date_of_birth': 'Friday, the 21. of July 1899',
#  'first_name': 'Ernest',
#  'last_name': 'Hemingway'}
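One detail of the format string above: %d zero-pads the day, so single-digit days come out as '02'. The sketch below shows a possible variant that builds the day number from val.day instead. pack_fancy is a hypothetical name, the function is shown standalone (without lima), and the weekday and month names assume Python's default locale.

```python
import datetime

def pack_fancy(val):
    # Take the day number from val.day so it is not zero-padded.
    return '{}, the {}. of {}'.format(
        val.strftime('%A'), val.day, val.strftime('%B %Y'))

day = datetime.date(1961, 7, 2)
print(day.strftime('%A, the %d. of %B %Y'))  # Sunday, the 02. of July 1961
print(pack_fancy(day))                       # Sunday, the 2. of July 1961
```

The body of pack_fancy could serve as the body of FancyDate's pack() static method if the unpadded form is preferred.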

Warning

Make sure the result of your pack() methods is JSON serializable (or at least in a format accepted by the serializer of your target format).

Also, don’t try to override an existing instance method with a static method. Have a look at the source if in doubt (in lima itself, currently only lima.fields.Embed and lima.fields.Reference implement pack() as instance methods).
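A quick way to check the first point is to run a pack function's result through json.dumps. The sketch below uses only the standard library; pack_date_raw, pack_date_iso and is_json_serializable are hypothetical helpers for illustration and not part of lima.

```python
import datetime
import json

def pack_date_raw(val):
    # Passes the date object through unchanged: not JSON serializable.
    return val

def pack_date_iso(val):
    # Converts to an ISO 8601 string, which json can encode.
    return val.isoformat()

def is_json_serializable(value):
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True

birthday = datetime.date(1899, 7, 21)
print(is_json_serializable(pack_date_raw(birthday)))  # False
print(is_json_serializable(pack_date_iso(birthday)))  # True
print(json.dumps({'date_of_birth': pack_date_iso(birthday)}))
# {"date_of_birth": "1899-07-21"}
```

A check like this fits well into a unit test for a custom field class.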

Data Validation

In short: There is none.

lima is opinionated in this regard. It assumes you have control over the data you want to serialize and have already validated it before putting it in your database.

But this doesn’t mean it can’t be done. You’ll just have to do it yourself. The pack() method would be the place for this:

import re

class ValidEmailField(fields.String):
    @staticmethod
    def pack(val):
        if not re.match(r'[^@]+@[^@]+\.[^@]+', val):
            raise ValueError('Not an email address: {!r}'.format(val))
        return val
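To see how such a check behaves, here is the same logic as a plain function, so that it runs without lima. pack_email is a hypothetical name, and the addresses are made up.

```python
import re

def pack_email(val):
    # Same check as ValidEmailField.pack above, as a plain function.
    if not re.match(r'[^@]+@[^@]+\.[^@]+', val):
        raise ValueError('Not an email address: {!r}'.format(val))
    return val

print(pack_email('ernest@example.com'))  # ernest@example.com

try:
    pack_email('not-an-address')
except ValueError as exc:
    print(exc)  # Not an email address: 'not-an-address'
```

Keep in mind that re.match only anchors at the start of the string, so a value like 'a@b.c@d' passes this check. re.fullmatch would be the stricter choice if that matters for your data.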

Note

If you need full-featured validation of your existing data at marshalling time, have a look at marshmallow.

Fields Recap

  • You now know how a field determines where its data comes from (from lowest to highest precedence: field name < attr < getter < constant field value).
  • You know how a field presents its data (pack() method).
  • You know how to support your own data types (subclass lima.fields.Field and implement pack()).
  • And you know how to change the marshalling of already supported data types (subclass the appropriate field class and override pack()).
  • Also, you’re able to implement data validation should the need arise (implement/override pack()).