Pyson
Pyson converts between dicts/lists and python3 objects. Note that this differs from jackson (a similar library for java), where conversion is done between strings and java objects.
The basic version determines the types for (de)serialization from the __init__ function in the involved classes. Polymorphism is supported, so derived classes can be (de)serialized from a superclass.
You can control the (de)serialization mapping using jackson-styled annotations.
Existing libraries
Existing libraries do not offer the features we need.
The @dataclass annotation from python offers serialization through the asdict method.
However, it does not support polymorphism (the serialized form lacks the information needed for it), and it does not offer the reverse method (from-dict).
For from-dict functionality, you need to use a third party library, like dataclasses_fromdict, dacite or mashumaro.
But even then polymorphic classes cannot be deserialized, because the serialized form simply does not contain the required class information.
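The limitation can be seen with the standard library alone. The sketch below uses hypothetical classes Animal, Bear and Cat (not part of pyson) and only shows that the asdict output carries no class information.

```python
from dataclasses import dataclass, asdict


@dataclass
class Animal:
    name: str


@dataclass
class Bear(Animal):
    pass


@dataclass
class Cat(Animal):
    pass


# Both subclasses serialize to the same dict, so the class cannot be recovered.
bear_dict = asdict(Bear("bruno"))
cat_dict = asdict(Cat("bruno"))
print(bear_dict)
print(bear_dict == cat_dict)
```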
Installation
For pyson you need python 3.8, 3.9 or 3.10.
pip install https://tracinsy.ewi.tudelft.nl/pubtrac/Utilities/export/451/pyson/dist/pyson-1.1.4.tar.gz
or from your setup.py
install_requires=[ "pyson@https://tracinsy.ewi.tudelft.nl/pubtrac/Utilities/export/451/pyson/dist/pyson-1.1.4.tar.gz"],
NOTICE you may want to use the latest version. Check pyson/dist for the latest.
WARNING: there exists another python library named "pyson", which is an entirely different project.
Basic Mechanism
The basic mechanism for pyson is to look at the __init__ function of the class under serialization.
The arguments of the __init__ function are matched to the json fields.
These arguments must be fully typed, and the types are used to determine how to interpret the json content.
The __init__ function must not be overloaded for this.
You can use typing.Any to indicate you expect any PRIMITIVE type (int, str, dict, etc).
For security reasons, pyson always requires an explicit class for polymorphism and never deserializes just any class; only primitives can be handled this way.
After deserializing the arguments, the constructor __init__ is used to create the object. This enforces proper construction, including all argument checks.
If your class has a constructor __init__(self, X) and you add @JsonValue to the getX method in your class, then your class will (de)serialize as if it is just an X, without wrappers.
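To illustrate how typed __init__ arguments can drive the mapping, here is a minimal sketch using only the standard library. The helper init_parameters and the class Props are hypothetical names for this illustration; pyson's internal code may work differently.

```python
import inspect
from typing import get_type_hints


class Props:
    def __init__(self, age: int, name: str = "unknown"):
        self._age = age
        self._name = name


def init_parameters(clas):
    """Return {parameter name: (type, has_default)} for clas.__init__."""
    hints = get_type_hints(clas.__init__)
    result = {}
    for pname, param in inspect.signature(clas.__init__).parameters.items():
        if pname == "self":
            continue
        if pname not in hints:
            # Mirrors the rule that all arguments must be fully typed.
            raise TypeError(f"parameter {pname} of {clas.__name__} is not typed")
        has_default = param.default is not inspect.Parameter.empty
        result[pname] = (hints[pname], has_default)
    return result


params = init_parameters(Props)
print(params)
```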
Deserialization
For deserialization of a json object, the steps are:
- If the target class has @JsonSubTypes:
  - check what the subclasses are and what their IDs are,
  - determine the actual class contained in the json,
  - "unwrap" the json, so that the remaining json can be deserialized as the actual class.
- If it has no @JsonSubTypes, the actual class is the requested target class.
- Now that the actual class is known:
  - take the parameter names and param-classes from the __init__ function of the actual class,
  - for each of the parameters, recursively deserialize the json value for that parameter, using the param-class as target class,
  - call the constructor of the actual class, using the parsed json for each parameter. Missing parameters are allowed if the constructor has a default value for that parameter.
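The steps above can be sketched with the standard library. This is a toy illustration under stated assumptions, not pyson's implementation: the function parse, the subtypes registry (standing in for @JsonSubTypes with Id.NAME and As.WRAPPER_OBJECT) and the example classes are all hypothetical.

```python
import inspect
from typing import get_type_hints

PRIMITIVES = (int, float, bool, str)


def parse(data, target, subtypes):
    """Toy parser; subtypes maps a base class to {class name: subclass}."""
    if target in PRIMITIVES:
        if not isinstance(data, target):
            raise ValueError(f"expected {target.__name__}, got {data!r}")
        return data
    actual = target
    if target in subtypes:
        # Wrapper object: a dict with exactly one key, the class name.
        if len(data) != 1:
            raise ValueError("expected a wrapper dict with exactly one key")
        name, data = next(iter(data.items()))
        actual = subtypes[target][name]  # only registered classes are accepted
    hints = get_type_hints(actual.__init__)
    kwargs = {}
    for pname, param in inspect.signature(actual.__init__).parameters.items():
        if pname == "self":
            continue
        if pname in data:
            # Recurse, using the declared parameter type as target class.
            kwargs[pname] = parse(data[pname], hints[pname], subtypes)
        elif param.default is inspect.Parameter.empty:
            raise ValueError(f"missing value for parameter {pname}")
    return actual(**kwargs)


class Props:
    def __init__(self, age: int, name: str = "unknown"):
        self.age = age
        self.name = name


class Animal:
    pass


class Bear(Animal):
    def __init__(self, props: Props):
        self.props = props


registry = {Animal: {"Bear": Bear}}
bear = parse({"Bear": {"props": {"age": 1}}}, Animal, registry)
print(type(bear).__name__, bear.props.age, bear.props.name)
```

The missing "name" field is accepted here because the constructor of Props provides a default for it.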
Serialization
Serialization (conversion to a json object) of a python object is much more straightforward:
- If there is a @JsonSerialize annotation, use that.
- If there is a @JsonValue annotated getter, call the getter and serialize the result.
- Otherwise, create a json dict with as keys the arguments of the __init__ function of the object, and as values the serialized values returned by the matching getters (also considering @JsonGetter).
- If the object is an instance of a class with @JsonSubTypes, add/wrap the json with class info according to the @JsonTypeInfo.
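A matching toy sketch of the serialization direction, again with the standard library only. The function to_json, the wrapped_bases argument (standing in for classes annotated with @JsonSubTypes and As.WRAPPER_OBJECT) and the "get" + parameter name getter convention are assumptions made for this illustration.

```python
import inspect


def to_json(obj, wrapped_bases=()):
    """Toy serializer; instances of wrapped_bases get a class-name wrapper."""
    if obj is None or isinstance(obj, (int, float, bool, str)):
        return obj
    if isinstance(obj, list):
        return [to_json(item, wrapped_bases) for item in obj]
    result = {}
    for pname in inspect.signature(type(obj).__init__).parameters:
        if pname == "self":
            continue
        getter = getattr(obj, "get" + pname)  # assumed getter naming
        result[pname] = to_json(getter(), wrapped_bases)
    if isinstance(obj, tuple(wrapped_bases)):
        # Wrap the dict in an outer dict keyed by the short class name.
        result = {type(obj).__name__: result}
    return result


class Props:
    def __init__(self, age: int, name: str):
        self._age = age
        self._name = name

    def getage(self):
        return self._age

    def getname(self):
        return self._name


class Animal:
    pass


class Bear(Animal):
    def __init__(self, props: Props):
        self._props = props

    def getprops(self):
        return self._props


res = to_json(Bear(Props(1, "bruno")), (Animal,))
print(res)
```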
Primitive types
Primitive types are types that are built into Python and do not have the __init__ function as we expect, for instance float and str. Therefore these need to be intercepted and handled through special constructors and serializers.
The following types are recognised and handled separately:
NoneType, int, float, bool, str, complex, range, bytes, bytearray, datetime, URI, UUID, Decimal, dict, list, Exceptions (see below)
The dict and list types get additional special treatment, as we also support the List[TYPE] and Dict[KEYTYPE, VALUETYPE] types.
Using Decimal may cause rounding errors even before the values get to Pyson, because json parsing converts them to float, and floats cannot represent all decimals accurately.
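The rounding issue can be reproduced with the standard json and decimal modules. This sketch does not involve pyson; it only shows what happens to a decimal literal when the json text is parsed, and that the standard json.loads accepts a parse_float argument to avoid the float step.

```python
import json
from decimal import Decimal

text = '{"price": 0.1}'

# Default parsing goes through float, which cannot represent 0.1 exactly.
as_float = json.loads(text)["price"]
float_is_exact = Decimal(as_float) == Decimal("0.1")
print(float_is_exact)

# Float arithmetic shows the same kind of inaccuracy.
sum_is_exact = (0.1 + 0.2 == 0.3)
print(sum_is_exact)

# The json module can hand the literal text to Decimal instead.
as_decimal = json.loads(text, parse_float=Decimal)["price"]
print(as_decimal == Decimal("0.1"))
```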
You can use Union[X, NoneType] (where NoneType=type(None)) or Optional[X] if the object can also be None. For example, you can use reservationBid:Union[Bid,type(None)] = None. Do not use the mypy.NoneType, which is an entirely different class. We do not support Union[X,Y] where neither X nor Y is NoneType.
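The equivalence of the two notations, and a way to tell supported from unsupported unions, can be checked with the typing module. The helper unwrap_optional and the class Bid are hypothetical names for this sketch; pyson's own check may differ.

```python
from typing import Optional, Union, get_args, get_origin

NoneType = type(None)


class Bid:
    pass


# Optional[X] is just another spelling of Union[X, NoneType].
same = Optional[Bid] == Union[Bid, NoneType]
print(same)


def unwrap_optional(tp):
    """Return (inner type, allows_none); reject unions of two real types."""
    if get_origin(tp) is Union:
        args = get_args(tp)
        inner = [a for a in args if a is not NoneType]
        if len(args) != 2 or len(inner) != 1:
            raise TypeError(f"unsupported union: {tp}")
        return inner[0], True
    return tp, False


print(unwrap_optional(Optional[Bid]))
print(unwrap_optional(int))
```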
Dict Keys
In JSON, keys of dictionaries/maps are restricted to strings. In Python however, dicts usually contain general objects as keys.
To work around this JSON restriction, pyson recognises from your specified type (eg Dict[MyObject, int]) whether the key is an object. In that case, a conversion is done between the python object (MyObject) and a json string, recursively using the standard pyson parse and toJson functions.
If MyObject is using @JsonValue to directly map to a str, then that string is directly used for the mapping to and from json.
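The idea of encoding an object key as a json string can be sketched as follows. The class Point and the helpers key_to_str and str_to_key are hypothetical, and the exact string format that pyson produces for keys is not specified here; the sketch only shows the round trip.

```python
import json


class Point:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


def key_to_str(key: Point) -> str:
    # Serialize the key object to a json value, then encode it as text.
    return json.dumps({"x": key.x, "y": key.y})


def str_to_key(text: str) -> Point:
    # Decode the text back to a json value, then rebuild the object.
    return Point(**json.loads(text))


scores = {Point(1, 2): 10}
encoded = {key_to_str(k): v for k, v in scores.items()}
decoded = {str_to_key(k): v for k, v in encoded.items()}
print(encoded)
print(decoded == scores)
```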
Exceptions
Exceptions are handled as follows:
Serialization of an exception gives {"message":"...", "cause": EX, "stackTrace":[]} where EX is either None or another serialized exception. This format tries to maximize compatibility with the jackson way of serializing exceptions. The "stackTrace" field is always empty because pyson does not serialize the python traceback; the field is there only for compatibility of the format with java. Note that the type of the exception is not included (as in jackson).
Deserialization takes a dict with at least a "message" field. The constructor of the provided class is called with this message, so make sure that your exception constructor takes the message as its argument. If the dict contains a cause, the cause is also parsed recursively, as a general Exception, and the __cause__ attribute of the exception is set to the result.
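The format described above can be mimicked in a few lines of plain python. The functions exception_to_json and exception_from_json are hypothetical stand-ins for this illustration, not pyson functions.

```python
from typing import Optional


def exception_to_json(exc: Optional[BaseException]):
    """Build the {"message", "cause", "stackTrace"} dict for an exception."""
    if exc is None:
        return None
    message = str(exc.args[0]) if exc.args else ""
    return {
        "message": message,
        "cause": exception_to_json(exc.__cause__),
        "stackTrace": [],  # always empty, kept for format compatibility
    }


def exception_from_json(data, clas=Exception):
    """Rebuild an exception of class clas; causes become plain Exceptions."""
    exc = clas(data["message"])
    cause = data.get("cause")
    if cause is not None:
        exc.__cause__ = exception_from_json(cause, Exception)
    return exc


try:
    try:
        raise ValueError("bad value")
    except ValueError as inner:
        raise RuntimeError("outer failure") from inner
except RuntimeError as outer:
    data = exception_to_json(outer)

print(data)
restored = exception_from_json(data, RuntimeError)
print(repr(restored), repr(restored.__cause__))
```

Note that the cause comes back as a general Exception, because the serialized form does not record the exception type.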
Decorators
The following annotations are available
@JsonGetter
Added to a function definition in a class. Takes "value" as argument, containing the name of a json field (which is also used as argument in the constructor).
Indicates that that function is to be used to get the value of the given json field.
@JsonSubTypes
Added to a class definition. Takes a list of strings as argument. Each string is the full class path to another class. The full class path is needed because importing the subclass to make its short name available would introduce a circular dependency.
Indicates that that other class is a sub-class of the annotated class. The other class can be used for deserialization as well.
If this is used, @JsonTypeInfo must also be provided. Without @JsonTypeInfo the default applies, which means the type info is not attached to the serialized format, causing deserialization to fail.
Example: @JsonSubTypes(['A','B']) attached to class C indicates that classes A and B are subclasses of C.
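Referring to a class by its full path as a string allows the class to be looked up only when it is needed, after all modules have been loaded, which avoids the circular import. A minimal sketch of such a lookup with importlib follows; the helper load_class is a hypothetical name and pyson's own resolution code may differ.

```python
import importlib


def load_class(path: str):
    """Resolve a 'package.module.ClassName' string to the class object."""
    module_name, _, class_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard library class is used here so that the sketch runs anywhere.
clas = load_class("decimal.Decimal")
print(clas)
```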
@JsonValue
Indicates that the value of the annotated getter method is to be used as the single value to serialize for the instance, instead of the usual method of collecting properties of value. Note that in python you can only create what you would call "java class fields" in the constructor, like this
    class X:
        def __init__(self,..):
            self._x = initialvalue
Therefore the jackson alternative of annotating a field with @JsonValue makes no sense in python.
@JsonTypeInfo
Added to a class definition. Contains "Id" and "As" values.
Indicates that the class name/id should be included in the json code. This is especially useful in combination with @JsonSubTypes.
The "Id" value can have the value NONE, NAME or CLASS. We recommend using NAME.
Id value | meaning |
NONE | Do not include the class id at all. Do not use this with @JsonSubTypes. Only included because it was available in Jackson |
NAME | Use the name of the class to refer to the class. All classes referred to must have different names (no two identical names with different class paths). |
CLASS | Use the full.class.path to refer to the class. |
The "As" value indicates *how* the class name is included in/extracted from the json.
As value | meaning |
PROPERTY | The json dict is extended with a type parameter containing the class Id |
WRAPPER_OBJECT | A dict is created with the class Id as key, and as value the actual class contents |
WRAPPER_ARRAY | An array is used for storing the contents of the class |
At this moment WRAPPER_OBJECT should be used; the others are not properly implemented.
example:
@JsonTypeInfo(use=Id.NAME, include=As.WRAPPER_OBJECT)
added to class C indicates that all fields of C will be wrapped inside a wrapper object (a json dict) whose key is the (short) NAME of the class. The deserializer uses that key to select the class for deserialization of the json.
@JsonDeserialize
This annotation allows a custom hand-coded deserializer to be used. The argument is a full.class.path of a class implementing Deserializer.
For example, @JsonDeserialize("geniusweb.issuevalue.ValueDeserializer.ValueDeserializer")
The Deserializer requires one implemented method: deserialize(self, data:object, clas: object). It takes the json object and the expected class, and returns the deserialized object.
@JsonSerialize
This annotation allows a custom hand-coded serializer to be used. The argument is a full.class.path of the class implementing Serializer.
For example, @JsonSerialize("__main__.ValueSerializer")
The Serializer requires one implemented method:
def serialize(self, obj:object)-> object
It takes the python object, and returns a json object (dict, list, etc) containing the serialized object.
@JsonSerialize and @JsonDeserialize will usually come in pairs, as a custom serialization will need a custom deserialization.
Inheritance of annotations
The usual inheritance mechanism of python applies also to the decorators.
If you want to disable an inherited @JsonDeserialize, use @JsonDeserialize(None). The same holds for @JsonSerialize.
Examples
See pyson/test/ObjectMapperTest.py for many examples.
A simple example, deserializing a dict with objects
    from typing import Dict

    from pyson.ObjectMapper import ObjectMapper
    from pyson.JsonTypeInfo import JsonTypeInfo
    from pyson.JsonTypeInfo import Id, As

    @JsonTypeInfo(use=Id.NAME, include=As.WRAPPER_OBJECT)
    class Simple:
        def __init__(self, a: int):
            self._a = a

        def geta(self) -> int:
            return self._a

        def __eq__(self, other):
            return isinstance(other, self.__class__) and self._a == other._a

        def __str__(self):
            return "Simple:" + str(self._a)

    pyson = ObjectMapper()
    # Each value is wrapped in a dict keyed by the class name (WRAPPER_OBJECT).
    objson = {'a': {"Simple": {'a': 1}}, 'c': {"Simple": {'a': 3}}}
    obj = pyson.parse(objson, Dict[str, Simple])
    obj['a'].geta()
A complex example showing many things at once
    from pyson.ObjectMapper import ObjectMapper
    from pyson.JsonSubTypes import JsonSubTypes
    from pyson.JsonTypeInfo import JsonTypeInfo
    from pyson.JsonTypeInfo import Id, As

    class Props:
        '''
        compound class with properties, used for testing
        '''
        def __init__(self, age: int, name: str):
            if age < 0:
                raise ValueError("age must be >=0, got " + str(age))
            self._age = age
            self._name = name

        def __str__(self):
            return self._name + "," + str(self._age)

        def getage(self):
            return self._age

        def getname(self):
            return self._name

        def __eq__(self, other):
            return isinstance(other, self.__class__) and \
                self._name == other._name and self._age == other._age

    # Subclasses are listed by full class path to avoid circular imports.
    @JsonSubTypes(["__main__.Bear"])
    @JsonTypeInfo(use=Id.NAME, include=As.WRAPPER_OBJECT)
    class Animal:
        pass

    class Bear(Animal):
        def __init__(self, props: Props):
            self._props = props

        def __str__(self):
            return "Bear[" + str(self._props) + "]"

        def getprops(self):
            return self._props

        def __eq__(self, other):
            return isinstance(other, self.__class__) and \
                self._props == other._props

    pyson = ObjectMapper()
    obj = Bear(Props(1, 'bruno'))
    res = pyson.toJson(obj)
    print("result:" + str(res))

    bson = {'Bear': {'props': {'age': 1, 'name': 'bruno'}}}
    res = pyson.parse(bson, Animal)
    print("Deserialized an Animal! -->" + str(res))
NOTICE: our code allows you to use objects as keys, as python does allow this. However json requires strings as keys.
And an example using @JsonGetter
    from pyson.ObjectMapper import ObjectMapper
    from pyson.JsonGetter import JsonGetter

    class Getter:
        def __init__(self, a: int):
            self._a = a

        # The getter name does not match the argument name, so map it explicitly.
        @JsonGetter("a")
        def getValue(self):
            return self._a

    getter = Getter(17)
    pyson = ObjectMapper()
    pyson.toJson(getter)
An example using custom (de)serializer comes from DeserializerTest.py:
    # The import paths below assume pyson's one-class-per-module layout.
    from pyson.Deserializer import Deserializer
    from pyson.Serializer import Serializer
    from pyson.JsonDeserialize import JsonDeserialize
    from pyson.JsonSerialize import JsonSerialize
    from pyson.ObjectMapper import ObjectMapper

    class ValueDeserializer(Deserializer):
        def deserialize(self, data: object, clas: object) -> object:
            if type(data) != str:
                raise ValueError("Expected str starting with '$', got " + str(data))
            return Simple(int(data[1:]))

    class ValueSerializer(Serializer):
        def serialize(self, obj: object) -> object:
            if not isinstance(obj, Simple):
                raise ValueError("Expected Simple object")
            return "$" + str(obj.geta())

    # Both annotations take the full class path as a string.
    @JsonDeserialize("__main__.ValueDeserializer")
    @JsonSerialize("__main__.ValueSerializer")
    class Simple:
        def __init__(self, a: int):
            self._a = a

        def geta(self) -> int:
            return self._a

        def __eq__(self, other):
            return isinstance(other, self.__class__) and self._a == other._a

        def __repr__(self):
            return "Simple:" + str(self._a)

        def __hash__(self):
            return hash(self._a)

    pyson = ObjectMapper()
    pyson.toJson(Simple(12))
    pyson.parse("$12", Simple)
Add/Extend annotations at runtime
You can also add/extend existing annotations at runtime. You just need to update the class and function attributes. For now, check the source code for details.
Download sources
svn co https://tracinsy.ewi.tudelft.nl/pub/svn/Utilities/pyson