Serialization/De-serialization Support for the SQLAlchemy Declarative ORM
The library works as a drop-in extension - change one line of existing code, and it should just work. Furthermore, it has been extensively tested on Python 2.7, 3.4, 3.5, 3.6, 3.7, and 3.8 using SQLAlchemy 0.9 and higher.
- Why SQLAthanor?
- Hello, World and Basic Usage
- Questions and Issues
Odds are you’ve used SQLAlchemy before. And if you haven’t, why on earth not? It is hands down the best relational database toolkit and ORM available for Python, and has helped me quickly write code for many APIs, software platforms, and data science projects. Just look at some of these great features.
As its name suggests, SQLAlchemy focuses on the problem of connecting your Python code to an underlying relational (SQL) database. That’s a super hard problem, especially when you consider the complexity of abstraction, different SQL databases, different SQL dialects, performance optimization, etc. It ain’t easy, and the SQLAlchemy team has spent years building one of the most elegant solutions out there.
But as hard as Pythonically communicating with a database is, in the real world with microservices, serverless architectures, RESTful APIs and the like we often need to do more with the data than read or write from/to our database. In almost all of the projects I’ve worked on over the last two decades, I’ve had to:
Python objects (pickled or not) are great, but they’re rarely the best way of transmitting data over the wire, or communicating data between independent applications. Which is where formats like JSON, CSV, and YAML come in.
When writing many Python APIs, I found myself writing methods to convert my SQLAlchemy records (technically, model instances) into JSON, or creating new SQLAlchemy records based on data I received in JSON. After writing similar methods many times over, I figured a better approach would be to write the serialization/de-serialization code just once, and then re-use it across all of my various projects.
Which is how SQLAthanor came about.
- Easy to adopt: Just tweak your existing SQLAlchemy `import` statements and you’re good to go.
- With one method call, convert SQLAlchemy model instances to:
- With one method call, create or update SQLAlchemy model instances from:
- Decide which serialization formats you want to support for which models.
- Decide which columns/attributes you want to include in their serialized form (and pick different columns for different formats, too).
- Default validation for de-serialized data for every SQLAlchemy data type.
- Customize the validation used when de-serializing particular columns to match your needs.
- Works with Declarative Reflection and the Automap Extension.
- Programmatically generate Declarative Base Models from serialized data or Pydantic models.
- Programmatically create SQLAlchemy Table objects from serialized data or Pydantic models.
Since serialization and de-serialization are common problems, there are a variety of alternative ways to serialize and de-serialize your SQLAlchemy models. Obviously, I’m biased in favor of SQLAthanor. But it might be helpful to compare SQLAthanor to some commonly-used alternatives:
As the examples provided above show, importing SQLAthanor is very straightforward, and you can include it in an existing codebase quickly and easily. In fact, your code should work just as before. Only now it will include new functionality to support serialization and de-serialization.
The table below shows how SQLAlchemy classes and functions map to their SQLAthanor replacements:
| SQLAlchemy Component | SQLAthanor Analog |
|----------------------|-------------------|
| `from sqlalchemy.ext.declarative import declarative_base` | `from sqlathanor import declarative_base` |
| `from sqlalchemy.ext.declarative import as_declarative` | `from sqlathanor import as_declarative` |
| `from sqlalchemy import Column` | `from sqlathanor import Column` |
| `from sqlalchemy import relationship` | `from sqlathanor import relationship` |
| `from sqlalchemy.ext.automap import automap_base` | `from sqlathanor.automap import automap_base` |
Now that you have imported SQLAthanor, you can just declare your models the way you normally would, even using the exact same syntax.
But now when you define your model, you can also configure serialization and de-serialization for each attribute using two approaches:
- The Meta Configuration approach lets you define a single `__serialization__` attribute on your model that configures serialization/de-serialization for all of your model’s columns, hybrid properties, association proxies, and properties.
- The Declarative Configuration approach lets you supply additional arguments to your attribute definitions that control whether and how they are serialized, de-serialized, or validated.
explicit is better than implicit
—PEP 20 - The Zen of Python
By default, all columns, relationships, association proxies, and hybrid properties will not be serialized. In order for a column, relationship, proxy, or hybrid property to be serializable to a given format or de-serializable from a given format, you will need to explicitly enable serialization/deserialization.
Both the Meta and the Declarative configuration approaches use the same API for configuring serialization and de-serialization. While there are a lot of details, in general, the configuration arguments work as described below.
If you give these options a single value, it will either enable (`True`) or disable (`False`) both serialization and de-serialization.
But you can also supply a `tuple` with two values: the first controls whether the attribute supports the format when inbound (de-serialization), and the second controls whether it supports the format when outbound (serialization).

For example, with `supports_json = (True, False)`, a `password` attribute will be expected / supported when de-serializing the object (inbound), but will not be included when serializing the object (outbound).
`on_serialize` indicates the function or functions used to prepare an attribute for serialization. This can either be a single function (applied to all serialization formats) or a `dict` where each key corresponds to a format and its value is the function to use when serializing to that format. If `on_serialize` is left as `None`, then SQLAthanor will apply a default `on_serialize` function based on the attribute’s data type.
`on_deserialize` indicates the function or functions used to validate or pre-process an attribute when de-serializing. This can either be a single function (applied to all formats) or a `dict` where each key corresponds to a format and its value is the function to use when de-serializing from that format. If `on_deserialize` is left as `None`, then SQLAthanor will apply a default `on_deserialize` function based on the attribute’s data type.
So now let’s say you have a model instance and want to serialize it. It’s super easy:
That’s it! Of course, the serialization methods all support a variety of other (optional!) options to fine-tune their behavior (CSV formatting, relationship nesting, etc.).
Now let’s say you receive a `User` object in serialized form and want to create a proper Python `User` object. That’s easy, too:
That’s it! Of course, all the de-serialization functions support additional options to fine-tune their behavior as needed.
We welcome contributions and pull requests! For more information, please see the Contributor Guide. And thanks to all those who’ve already contributed:
Detailed information about our test suite and how to run tests locally can be found in our Testing Reference.