Pyson

Pyson converts between dicts/lists and python3 objects.

The basic version determines the types for (de)serialization from the __init__ function in the involved classes. Polymorphism is supported, so derived classes can be (de)serialized from a superclass.

You can control the (de)serialization mapping using jackson-styled annotations.

Why not @dataclass

The @dataclass decorator from python offers serialization through the asdict function. However, the resulting dict does not include the information needed to properly support polymorphism, and there is no reverse function (e.g. from-dict).

For from-dict functionality, you need to use a third party library, like dataclasses_fromdict, dacite or mashumaro.

But even then polymorphic classes can not be deserialized, because the serialized form just does not contain the information required to do this.
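The missing type information can be seen with the standard library alone. The sketch below uses only dataclasses; the class Wolf is a hypothetical addition for illustration.

```python
from dataclasses import dataclass, asdict

@dataclass
class Animal:
    name: str

@dataclass
class Bear(Animal):
    pass

@dataclass
class Wolf(Animal):  # hypothetical second subclass, for illustration only
    pass

bear_dict = asdict(Bear("bruno"))
wolf_dict = asdict(Wolf("bruno"))

print(bear_dict)               # {'name': 'bruno'}
print(bear_dict == wolf_dict)  # True: nothing records which subclass was serialized
```

Given only such a dict and the target class Animal, a deserializer has no way to tell whether a Bear or a Wolf was meant.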

Installation

For pyson you need python 3.8, 3.9 or 3.10.

pip install https://tracinsy.ewi.tudelft.nl/pubtrac/Utilities/export/451/pyson/dist/pyson-1.1.4.tar.gz

or from your setup.py

    install_requires=[ "pyson@https://tracinsy.ewi.tudelft.nl/pubtrac/Utilities/export/451/pyson/dist/pyson-1.1.4.tar.gz"],

NOTICE you may want to use the latest version. Check pyson/dist for the latest.

WARNING: there exists another python library named "pyson", which is an entirely different project.

Basic Mechanism

The basic mechanism for pyson is to look at the __init__ function of the class under serialization. The arguments in the __init__ function are matched to the json fields. The arguments in the __init__ must be fully typed, and these types are used to determine how to interpret the json content. The __init__ function must not be overloaded for this.

You can use typing.Any to indicate that you expect any PRIMITIVE type (int, str, dict, etc). For security reasons, pyson always requires an explicit class for polymorphism and never deserializes just any class; only primitives can be handled this way.

After deserializing the arguments, the default constructor __init__ is used to create the object. This enforces the proper construction, including all argument checks etc.
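The idea of reading the types from __init__ can be sketched with the standard library. This is a simplified illustration, not pyson's actual code: the helper names init_parameter_types and build are hypothetical, and only primitive parameter types are handled.

```python
from typing import get_type_hints

class Props:
    def __init__(self, age: int, name: str):
        if age < 0:
            raise ValueError("age must be >=0, got " + str(age))
        self._age = age
        self._name = name

def init_parameter_types(clazz: type) -> dict:
    """Return {parameter name: annotated type} for clazz.__init__ (hypothetical helper)."""
    hints = get_type_hints(clazz.__init__)
    hints.pop("return", None)
    return hints

def build(clazz: type, data: dict):
    """Check each json field against the annotated type, then call __init__."""
    kwargs = {}
    for name, expected in init_parameter_types(clazz).items():
        value = data[name]
        if not isinstance(value, expected):
            raise ValueError(name + ": expected " + expected.__name__ + ", got " + repr(value))
        kwargs[name] = value
    # Going through the constructor keeps all argument checks active.
    return clazz(**kwargs)

types = init_parameter_types(Props)
props = build(Props, {"age": 1, "name": "bruno"})
```

Because the object is always created through __init__, a json value such as {"age": -1, "name": "bruno"} is rejected by the same check that guards normal construction.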

Deserialization

Deserialization of a json object proceeds as follows:

  • If the target class has @JsonSubTypes:
    • Check what the subclasses are and what their IDs are.
    • Determine the actual class contained in the json.
    • "Unwrap" the json, so that we have the remaining json to deserialize the actual class.
  • If the target class has no @JsonSubTypes, the actual class is the requested target class.
  • Now that the actual class is known:
    • The parameter names and parameter classes are taken from the __init__ function of the actual class.
    • For each parameter, the json value for that parameter is deserialized recursively, using the parameter class as target class.
    • The constructor of the actual class is called, using the parsed json for each parameter. Missing parameters are allowed if the constructor has default values for them.
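The steps above can be sketched in plain python. This is an illustration under stated assumptions, not pyson's implementation: the SUBTYPES registry stands in for @JsonSubTypes, the function parse here is a hypothetical stand-alone function, and only WRAPPER_OBJECT with Id.NAME is covered.

```python
from typing import get_type_hints

class Props:
    def __init__(self, age: int, name: str):
        self.age, self.name = age, name

class Animal:
    pass

class Bear(Animal):
    def __init__(self, props: Props):
        self.props = props

# Hypothetical registry standing in for @JsonSubTypes: base class -> {id: subclass}
SUBTYPES = {Animal: {"Bear": Bear}}
PRIMITIVES = (int, float, bool, str, type(None))

def parse(data, target: type):
    if target in PRIMITIVES:
        if not isinstance(data, target):
            raise ValueError("expected " + target.__name__ + ", got " + repr(data))
        return data
    actual = target
    if target in SUBTYPES:
        # WRAPPER_OBJECT layout: {"<class id>": {...contents...}}
        if not isinstance(data, dict) or len(data) != 1:
            raise ValueError("expected a dict with a single class id as key")
        class_id, data = next(iter(data.items()))  # "unwrap" the json
        if class_id not in SUBTYPES[target]:
            raise ValueError("unknown class id " + class_id)
        actual = SUBTYPES[target][class_id]
    hints = get_type_hints(actual.__init__)
    hints.pop("return", None)
    # Parameters missing from the json are left to the constructor defaults.
    kwargs = {name: parse(data[name], ptype)
              for name, ptype in hints.items() if name in data}
    return actual(**kwargs)

bear = parse({"Bear": {"props": {"age": 1, "name": "bruno"}}}, Animal)
```

The caller asks for an Animal and receives a Bear, because the class id in the wrapper selects the subclass before the __init__ parameters are inspected.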

Serialization

Serialization of an object is much more straightforward.

  • Create a json dict, with as keys the arguments of the __init__ function of the object, and as values the serialized values returned by the corresponding getters (also considering @JsonGetter)
  • If the object is an instance of a class with @JsonSubTypes, add/wrap the json with class info according to the @JsonTypeInfo
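A minimal sketch of these two steps follows. It is not pyson's code: the function to_json and the POLYMORPHIC_BASES tuple are hypothetical, and the getter naming "get" + argument name is an assumption taken from the examples on this page.

```python
from typing import get_type_hints

class Props:
    def __init__(self, age: int, name: str):
        self._age, self._name = age, name
    def getage(self) -> int:
        return self._age
    def getname(self) -> str:
        return self._name

class Animal:  # stands in for a class carrying @JsonSubTypes and @JsonTypeInfo
    pass

class Bear(Animal):
    def __init__(self, props: Props):
        self._props = props
    def getprops(self) -> Props:
        return self._props

PRIMITIVES = (int, float, bool, str, type(None))
POLYMORPHIC_BASES = (Animal,)  # hypothetical replacement for the annotations

def to_json(obj):
    if isinstance(obj, PRIMITIVES):
        return obj
    # Step 1: one json field per __init__ argument, value taken from the getter.
    names = [n for n in get_type_hints(type(obj).__init__) if n != "return"]
    contents = {n: to_json(getattr(obj, "get" + n)()) for n in names}
    # Step 2: wrap with the class name (WRAPPER_OBJECT style) if polymorphic.
    if isinstance(obj, POLYMORPHIC_BASES):
        return {type(obj).__name__: contents}
    return contents

result = to_json(Bear(Props(1, "bruno")))
print(result)  # {'Bear': {'props': {'age': 1, 'name': 'bruno'}}}
```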

Primitive types

Primitive types are types that are built into Python and do not have the __init__ function as we expect, for instance float, str. Therefore these need to be intercepted and handled through special constructors and serializers.

The following types are recognised and handled separately:

NoneType, int, float, bool, str, complex, range, bytes, bytearray, datetime, URI, UUID, Decimal, dict, list, Exceptions (see below)

The dict and list are treated even more specially, as we also support the List[TYPE] and Dict[KEYTYPE, VALUETYPE] types.

Using Decimal may cause rounding errors even before the values get to Pyson, because json converts them to float, and floats cannot represent all decimals accurately. For example, in python str(0.24745759773026107) == '0.24745759773026108'. Overall, a number string "N" in your json text will be converted as Decimal(str(float("N"))). When in doubt, check for your number "N" that str(float("N"))=="N"; if not, you have a rounding issue.
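The check can be written as a small helper. The function names survives_float and as_parsed are hypothetical; the conversion is the one described above.

```python
from decimal import Decimal

def survives_float(number_text: str) -> bool:
    """True if the number text is unchanged after a round trip through float."""
    return str(float(number_text)) == number_text

def as_parsed(number_text: str) -> Decimal:
    """The Decimal that results when the number passes through float first."""
    return Decimal(str(float(number_text)))

print(survives_float("0.5"))                    # True
print(survives_float("1.0000000000000000001"))  # False
print(as_parsed("1.0000000000000000001"))       # 1.0
```

The check is strict about formatting as well as value: "0.10" also fails it, because float prints that number as "0.1".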

You can use Union[X, NoneType] where NoneType=type(None), or Optional[X], if the object can also be None. So for example you can use reservationBid:Union[Bid,type(None)] = None. Do not use the mypy.NoneType, which is an entirely different class. We do not support Union[X,Y] where neither X nor Y is NoneType.
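The two spellings are the same type as far as the typing module is concerned, as this short check shows. The classes Bid and Offer and their constructor arguments are hypothetical placeholders.

```python
from typing import Optional, Union, get_args

class Bid:  # hypothetical class, for illustration
    def __init__(self, price: int):
        self._price = price

class Offer:  # hypothetical class with an optional constructor argument
    def __init__(self, bid: Bid, reservationBid: Optional[Bid] = None):
        self._bid = bid
        self._reservationBid = reservationBid

same = Optional[Bid] == Union[Bid, type(None)]
args = get_args(Optional[Bid])

print(same)                    # True
print(args[1] is type(None))   # True: the second member is type(None)
```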

Dict Keys

In JSON, keys of dictionaries/maps are restricted to strings. In Python however, dicts usually contain general objects as keys.

To work around this JSON restriction, pyson recognises from your specified type (e.g. Dict[MyObject, str]) whether the key is an object. In that case, a conversion is done between the python object (MyObject) and a json string, recursively using the standard pyson parse and toJson functions.
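The key conversion can be pictured as follows. This is only a sketch of the idea: the class Point and the helpers key_to_text and text_to_key are hypothetical, and the exact string that pyson produces for a key is not specified here (json.dumps is assumed).

```python
import json

class Point:  # hypothetical key class
    def __init__(self, x: int, y: int):
        self._x, self._y = x, y
    def getx(self) -> int:
        return self._x
    def gety(self) -> int:
        return self._y
    def __eq__(self, other):
        return isinstance(other, Point) and (self._x, self._y) == (other._x, other._y)
    def __hash__(self):
        return hash((self._x, self._y))

def key_to_text(key: Point) -> str:
    """Serialize the key object, then encode the result as a json string."""
    return json.dumps({"x": key.getx(), "y": key.gety()})

def text_to_key(text: str) -> Point:
    """Decode the json string, then build the key object through its constructor."""
    return Point(**json.loads(text))

table = {Point(1, 2): "home"}
as_json = {key_to_text(k): v for k, v in table.items()}
restored = {text_to_key(k): v for k, v in as_json.items()}

print(as_json)  # {'{"x": 1, "y": 2}': 'home'}
```

Key classes used this way need __eq__ and __hash__, as any python dict key does.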

Exceptions

Exceptions are handled as follows.

Serialization of an exception gives {"message":"...", "cause": EX, "stackTrace":[]} where EX is either None or another serialized exception. This format tries to maximize compatibility with the jackson way of serializing exceptions. The "stackTrace" field is always empty because python does not include the stacktrace with exceptions; it's there only for compatibility of the format with java. Note that the type of the exception is not included (as in jackson).

Deserialization takes a dict with at least a "message" field. The constructor of the provided class is called with this message, so make sure that your exception constructor takes the message as the constructor argument. If the dict contains a cause, the cause is also parsed recursively, as general Exception, and attribute __cause__ of the exception is filled with the result.
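The format can be illustrated with two small functions. They are a sketch of the described behaviour, not pyson's code; the names exception_to_json and exception_from_json are hypothetical, and str(ex) is assumed as the message.

```python
from typing import Optional

def exception_to_json(ex: Optional[BaseException]):
    """Serialize an exception and, recursively, its __cause__."""
    if ex is None:
        return None
    return {"message": str(ex),
            "cause": exception_to_json(ex.__cause__),
            "stackTrace": []}  # always empty, kept for format compatibility

def exception_from_json(data: dict, clazz: type = Exception) -> BaseException:
    """Call clazz(message); parse a cause, if any, as a general Exception."""
    ex = clazz(data["message"])
    cause = data.get("cause")
    if cause is not None:
        ex.__cause__ = exception_from_json(cause, Exception)
    return ex

outer = ValueError("outer")
outer.__cause__ = RuntimeError("inner")

data = exception_to_json(outer)
restored = exception_from_json(data, ValueError)
print(data)
```

Since the type is not part of the json, the cause comes back as a plain Exception even though it was a RuntimeError before serialization.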

Decorators

The following annotations are available

@JsonGetter

Added to a function definition in a class. Takes "value" as argument, containing the name of a json field (which is also used as argument in the constructor).

Indicates that that function is to be used to get the value of the given json field.

@JsonSubTypes

Added to a class definition. Takes a list of strings as argument. Each string is the full class path to another class. The full class path is needed because importing the other class to make its short name available would introduce a circular dependency.

Indicates that that other class is a sub-class of the annotated class. The other class can be used for deserialization as well.

If this is used, the @JsonTypeInfo must also be provided.

@JsonValue

Indicates that the value returned by the annotated getter method is to be used as the single value to serialize for the instance, instead of the usual method of collecting the properties of the object. Note that in python you can only create what you would call "java class fields" in the constructor, like this

class X:
  def __init__(self,..):
    self._x = initialvalue

Therefore the jackson alternative of annotating a field with @JsonValue makes no sense in python.

@JsonTypeInfo

Added to a class definition. Contains "Id" and "As" values.

Indicates that the class name/id should be included in the json code. This is especially useful in combination with @JsonSubTypes.

The "Id" value can have the value NONE, NAME, CLASS. We recommend to use NAME.

Id value   Meaning
NONE       Do not include class id at all. Do not use this with @JsonTypeInfo. Only included because it was available in Jackson.
NAME       Use the name of the class to refer to the class. All classes referred to must have different names (no two identical names with different classpaths).
CLASS      Use the full.class.path to refer to the class.

NAME should be used; the others are not properly implemented. We recommend NAME anyway, because it is shorter, gives more readable json, and gives you the flexibility to move the actual classes around if needed without breaking compatibility with existing json files.

The "As" value indicates *how* the class name is included in/extracted from the json.

As value         Meaning
PROPERTY         The json dict is extended with a type parameter containing the class Id
WRAPPER_OBJECT   A dict is created with the class Id as key, and as value the actual class contents
WRAPPER_ARRAY    An array is used for storing the contents of the class

At this moment WRAPPER_OBJECT should be used; the others are not properly implemented.
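To make the difference between the "As" values concrete, the sketch below builds the three shapes for the same contents. Only the WRAPPER_OBJECT shape is confirmed by the examples on this page. The key name "type" for PROPERTY and the two-element array for WRAPPER_ARRAY are assumptions modelled on Jackson, and add_class_info is a hypothetical helper.

```python
def add_class_info(class_id: str, contents: dict, how: str):
    """Attach class_id to the serialized contents in the given "As" style."""
    if how == "WRAPPER_OBJECT":
        return {class_id: contents}
    if how == "PROPERTY":
        # Assumption: the extra key is called "type".
        return {"type": class_id, **contents}
    if how == "WRAPPER_ARRAY":
        # Assumption: Jackson-style array [class id, contents].
        return [class_id, contents]
    raise ValueError("unknown As value: " + how)

contents = {"props": {"age": 1, "name": "bruno"}}

wrapped = add_class_info("Bear", contents, "WRAPPER_OBJECT")
as_property = add_class_info("Bear", contents, "PROPERTY")
as_array = add_class_info("Bear", contents, "WRAPPER_ARRAY")

print(wrapped)  # {'Bear': {'props': {'age': 1, 'name': 'bruno'}}}
```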

@JsonDeserialize

This annotation allows a custom hand-coded deserializer to be used. The argument is a full.class.path of a class implementing Deserializer. For example, @JsonDeserialize("geniusweb.issuevalue.ValueDeserializer.ValueDeserializer")

The Deserializer requires one implemented method: deserialize(self, data:object, clas: object). It takes the json object and the expected class, and then returns the deserialized object.

@JsonSerialize

This annotation allows a custom hand-coded serializer to be used. The argument is a full.class.path of the class implementing Serializer.

For example, @JsonSerialize("__main__.ValueSerializer")

The Serializer requires one implemented method: def serialize(self, obj:object)-> object

It takes the python object, and returns a json object (dict, list, etc) containing the serialized object.

@JsonSerialize and @JsonDeserialize will usually come in pairs, as a custom serialization will need a custom deserialization.

Inheritance of annotations

The usual inheritance mechanism of python also applies to the decorators. If you want to disable an inherited @JsonDeserialize, use @JsonDeserialize(None). Similarly for @JsonSerialize.
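Why inheritance and the None override work can be seen with a decorator that stores its argument as a class attribute. This is a conceptual sketch: the decorator json_deserialize, the attribute name _deserializer and the three classes are hypothetical, and pyson's real attribute names may differ.

```python
def json_deserialize(deserializer):
    """Hypothetical stand-in for @JsonDeserialize: stores its argument on the class."""
    def decorate(clazz):
        clazz._deserializer = deserializer  # hypothetical attribute name
        return clazz
    return decorate

@json_deserialize("__main__.ValueDeserializer")
class Value:
    pass

class NumberValue(Value):  # no decorator: the attribute is inherited
    pass

@json_deserialize(None)    # explicit None hides the inherited setting
class PlainValue(Value):
    pass

print(NumberValue._deserializer)  # __main__.ValueDeserializer
print(PlainValue._deserializer)   # None
```

The override only affects PlainValue and its subclasses; Value and NumberValue keep the custom deserializer.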

Examples

See pyson/test/ObjectMapperTest.py for many examples.

A simple example, deserializing a dict with objects

from pyson.ObjectMapper import ObjectMapper
from pyson.JsonTypeInfo import JsonTypeInfo
from pyson.JsonTypeInfo import Id,As
from typing import Dict

@JsonTypeInfo(use=Id.NAME, include=As.WRAPPER_OBJECT)
class Simple:
    def __init__(self, a:int):
        self._a=a
    def geta(self)->int:
        return self._a
    def __eq__(self, other):
        return isinstance(other, self.__class__) and \
            self._a==other._a
    def __str__(self):
        return "Simple:"+str(self._a)

pyson=ObjectMapper()
objson = { 'a':{"Simple":{'a':1}},'c':{"Simple":{'a':3}}}
obj=pyson.parse(objson, Dict[str,Simple])
obj['a'].geta()

A complex example showing many things at once

from pyson.ObjectMapper import ObjectMapper
from pyson.JsonSubTypes import JsonSubTypes
from pyson.JsonTypeInfo import JsonTypeInfo
from pyson.JsonTypeInfo import Id,As
from typing import Dict,List,Set
import json

class Props:
    '''
    compound class with properties, used for testing 
    '''
    def __init__(self, age:int, name:str):
        if age<0:
            raise ValueError("age must be >=0, got "+str(age))
        self._age=age
        self._name=name
    def __str__(self):
        return self._name+","+str(self._age)
    def getage(self):
        return self._age
    def getname(self):
        return self._name
    def __eq__(self, other):
        return isinstance(other, self.__class__) and \
            self._name==other._name and self._age==other._age


@JsonSubTypes(["__main__.Bear"])
@JsonTypeInfo(use=Id.NAME, include=As.WRAPPER_OBJECT)
class Animal:
    pass
    
    
class Bear(Animal):
    def __init__(self, props:Props):
        self._props=props
            
    def __str__(self):
        return "Bear["+str(self._props)+"]"
        
    def getprops(self):
        return self._props
    def __eq__(self, other):
        return isinstance(other, self.__class__) and \
            self._props==other._props

pyson=ObjectMapper()


obj=Bear(Props(1,'bruno'))
res=pyson.toJson(obj)
print("result:"+str(res))
bson={'Bear': {'props': {'age': 1, 'name': 'bruno'}}}
res=pyson.parse(bson, Animal)
print("Deserialized an Animal! -->"+str(res))

NOTICE: our code allows you to use objects as keys, as python does allow this. However json requires strings as keys.

And an example using @JsonGetter

from pyson.ObjectMapper import ObjectMapper
from pyson.JsonGetter import JsonGetter
class Getter:
    def __init__(self, a:int):
        self._a=a
    @JsonGetter("a")
    def getValue(self):
        return self._a

getter=Getter(17)
pyson=ObjectMapper()
pyson.toJson(getter)

An example using custom (de)serializer comes from DeserializerTest.py:

class ValueDeserializer(Deserializer):
    def __hash__(self):
        return hash(self.geta())
    def deserialize(self, data:object, clas: object)-> object:
        if type(data)!=str:
            raise ValueError("Expected str starting with '$', got  "+str(data))
        return Simple(int(data[1:]))

class ValueSerializer(Serializer):
    def serialize(self, obj:object)-> object:
        if not isinstance(obj, Simple):
            raise ValueError("Expected Simple object")
        return "$" + str(obj.geta())


@JsonDeserialize("__main__.ValueDeserializer")
@JsonSerialize("__main__.ValueSerializer")
class Simple:
    def __init__(self, a:int):
        self._a=a 
    def geta(self)->int:
        return self._a
    def __eq__(self, other):
        return isinstance(other, self.__class__) and \
            self._a==other._a
    def __repr__(self):
        return "Simple:"+str(self._a)
    def __hash__(self):
        return hash(self._a)

pyson=ObjectMapper()
pyson.toJson(Simple(12))
pyson.parse("$12", Simple)

Add/Extend annotations at runtime

You can also add/extend existing annotations at runtime. You just need to update the class and function attributes. For now, check the source code for details.

Download sources

svn co https://tracinsy.ewi.tudelft.nl/pub/svn/Utilities/pyson

Last modified on 11/29/23 14:22:42