Serializing computations
Loman can serialize computations to a JSON file for later inspection or post-mortem debugging. This is useful when a scheduled job should capture its inputs, intermediates, and results so they can be examined if something goes wrong.
>>> import math
>>> from loman import Computation
>>> comp = Computation()
>>> comp.add_node('x', value=4.0)
>>> def area(x):
...     return math.pi * x ** 2
>>> comp.add_node('area', area)
>>> comp.compute_all()
>>> comp.to_dict()
{'x': 4.0, 'area': 50.26548245743669}
To save and reload the computation:
>>> comp.write_json('comp.json')
>>> comp2 = Computation.read_json('comp.json')
>>> comp2.v.area
50.26548245743669
The output is a plain JSON text file, so it is human-readable and can be inspected with any text editor.
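Because the file is plain JSON, it can also be summarized with the standard library alone. The sketch below assumes the layout described in the JSON format reference (a top-level nodes list whose entries carry key and state). The helper summarize_nodes and the sample document are illustrative, not part of Loman.

```python
import json

def summarize_nodes(doc):
    """Return {node key: state name} for a parsed computation document."""
    return {node["key"]: node["state"] for node in doc.get("nodes", [])}

# Hypothetical sample mirroring the documented layout. A real file would be
# parsed with json.load() on an open 'comp.json' instead.
sample_text = json.dumps({
    "version": 1,
    "nodes": [
        {"key": "x", "state": "UPTODATE", "value": 4.0, "has_value": True},
        {"key": "area", "state": "UPTODATE", "value": 50.26548245743669, "has_value": True},
    ],
    "edges": [
        {"src": "x", "dst": "area", "param_type": "kwd", "param": "x"},
    ],
})

summary = summarize_nodes(json.loads(sample_text))
print(summary)
```

A summary like this is a quick first look at a failed job before loading the full computation with read_json.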
Excluding nodes from serialization
Sometimes a node holds a value that should not (or cannot) be saved — for example a database connection, a licensed dataset, or an object that does not support JSON serialization. Pass serialize=False when adding the node:
>>> import sqlalchemy as sa
>>> comp = Computation()
>>> comp.add_node('engine', value=sa.create_engine('sqlite://'), serialize=False)
>>> comp.add_node('result', value=42)
>>> comp.write_json('comp.json')
>>> comp2 = Computation.read_json('comp.json')
>>> comp2.state('engine')
<States.UNINITIALIZED: 1>
>>> comp2.v.result
42
The excluded node is preserved in the file with UNINITIALIZED state and no value; all other nodes round-trip normally.
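Lambdas are not serializable by default
A lambda cannot be serialized by default: its name is always <lambda>, so there is no module-level name from which it could be re-imported when the file is read. Writing a computation that contains one raises SerializationError: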
>>> from loman import Computation, ComputationSerializer, SerializationError
>>> comp = Computation()
>>> comp.add_node('a', value=1)
>>> comp.add_node('b', lambda a: a + 1)
>>> comp.compute_all()
>>> import io
>>> try:
...     comp.write_json(io.StringIO())
... except SerializationError as e:
...     print(e)
Cannot serialize lambda function on node NodeKey(parts=('b',)). Use a module-level importable function, serialize=False, or ComputationSerializer(use_dill_for_functions=True).
Replace the lambda with a named function defined at module level:
>>> def increment(a):
...     return a + 1
>>> comp.add_node('b', increment)
>>> comp.compute_all()
>>> comp.write_json('comp.json') # now succeeds
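To spot problem functions before writing a file, a check along the following lines may help. It is only a heuristic sketch based on standard Python attributes (__qualname__ and __module__); is_importable_by_name is a hypothetical helper, and Loman's own check may differ.

```python
import json

def is_importable_by_name(func):
    """Heuristic: True if func looks like a module-level, named callable."""
    qualname = getattr(func, "__qualname__", "")
    if "<lambda>" in qualname:
        return False  # lambdas have no name that could be imported later
    if "<locals>" in qualname:
        return False  # defined inside another function (closure or nested def)
    return bool(getattr(func, "__module__", None))

def make_adder(n):
    # Returns a closure that captures the local variable n.
    def add(a):
        return a + n
    return add

checks = {
    "named": is_importable_by_name(json.dumps),       # module-level function
    "lambda": is_importable_by_name(lambda a: a + 1),
    "closure": is_importable_by_name(make_adder(1)),
}
print(checks)
```

Only the first case passes; the other two correspond to the callables that need either a refactor to a named function or the dill option.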
Using dill to serialize lambdas and closures
When refactoring to named functions is impractical, pass use_dill_for_functions=True to ComputationSerializer. This serializes any callable — including lambdas and closures that capture local variables — as a base64-encoded dill blob inside the JSON:
>>> s = ComputationSerializer(use_dill_for_functions=True)
>>> comp = Computation()
>>> comp.add_node('a', value=3)
>>> comp.add_node('b', lambda a: a * 2)
>>> comp.compute_all()
>>> buf = io.StringIO()
>>> comp.write_json(buf, serializer=s)
>>> _ = buf.seek(0)
>>> comp2 = Computation.read_json(buf, serializer=s)
>>> comp2.v.b
6
>>> comp2.insert('a', 10)
>>> comp2.compute_all()
>>> comp2.v.b
20
The same serializer instance must be passed to both write_json and read_json.
Warning
The dill blob embedded in the JSON is not portable across Python versions and shares the same stability caveats as the deprecated write_dill. Prefer named functions when long-term compatibility matters.
File objects and strings
Both write_json and read_json accept either a file path (string) or any text-mode file-like object:
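>>> import io
>>> comp = Computation()
>>> comp.add_node('a', value=1)
>>> comp.add_node('b', increment)
>>> comp.compute_all()
>>> buf = io.StringIO()
>>> comp.write_json(buf)
>>> _ = buf.seek(0)
>>> comp3 = Computation.read_json(buf)
>>> comp3.v.b
2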
PINNED nodes
Pinned nodes round-trip correctly — their PINNED state and value are preserved:
>>> comp = Computation()
>>> comp.add_node('a', value=10)
>>> comp.pin('a')
>>> comp.write_json('comp.json')
>>> comp2 = Computation.read_json('comp.json')
>>> comp2.state('a')
<States.PINNED: 6>
>>> comp2.v.a
10
ERROR nodes
If a node is in ERROR state, its exception type, message, and traceback are preserved as strings so they can be read back for post-mortem inspection even without the original exception class:
>>> def bad_func():
...     raise ValueError("something went wrong")
>>> comp = Computation()
>>> comp.add_node('result', bad_func)
>>> comp.compute_all()
>>> comp.state('result')
<States.ERROR: 5>
>>> comp.write_json('comp.json')
>>> comp2 = Computation.read_json('comp.json')
>>> comp2.state('result')
<States.ERROR: 5>
>>> comp2['result'].value.exception
Exception('something went wrong')
Custom serialization for user-defined types
For types that are not handled by the default serializer, pass a custom ComputationSerializer instance with additional transformers registered:
>>> from loman import Computation, ComputationSerializer
>>> from loman.serialization import CustomTransformer, Transformer
>>> class Point:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
>>> point_transformer = CustomTransformer(
...     Point,
...     to_dict=lambda v: {'__point__': True, 'x': v.x, 'y': v.y},
...     from_dict=lambda d: Point(d['x'], d['y']),
... )
>>> s = ComputationSerializer()
>>> s._t.register(point_transformer)
>>> comp = Computation()
>>> comp.add_node('origin', value=Point(0, 0))
>>> buf = io.StringIO()
>>> comp.write_json(buf, serializer=s)
>>> _ = buf.seek(0)
>>> comp2 = Computation.read_json(buf, serializer=s)
>>> comp2.v.origin.x
0
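The to_dict / from_dict pair follows a common tagged-dictionary pattern. As a rough analogy only, the same round trip can be written with the standard json module's default and object_hook parameters. This sketch does not use Loman's transformer machinery; encode_point and decode_point are hypothetical names.

```python
import json

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def encode_point(obj):
    """Called by json.dumps for objects it cannot serialize natively."""
    if isinstance(obj, Point):
        return {"__point__": True, "x": obj.x, "y": obj.y}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

def decode_point(d):
    """Called by json.loads for every decoded JSON object."""
    if d.get("__point__"):
        return Point(d["x"], d["y"])
    return d  # leave untagged dictionaries unchanged

text = json.dumps({"origin": Point(0, 0)}, default=encode_point)
restored = json.loads(text, object_hook=decode_point)
print(type(restored["origin"]).__name__, restored["origin"].x)
```

The tag key is what lets the decoder tell an encoded Point apart from an ordinary dictionary that happens to have x and y keys.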
Pandas support
DataFrames and Series are serialized automatically:
>>> import pandas as pd
>>> comp = Computation()
>>> comp.add_node('df', value=pd.DataFrame({'a': [1, 2], 'b': [3, 4]}))
>>> buf = io.StringIO()
>>> comp.write_json(buf)
>>> _ = buf.seek(0)
>>> comp2 = Computation.read_json(buf)
>>> comp2.v.df.shape
(2, 2)
JSON format reference
The file is a single JSON object with three top-level keys:
{
"version": 1,
"nodes": [ ... ],
"edges": [ ... ]
}
Node object
Each entry in nodes has:
| Field | Type | Description |
|---|---|---|
| key | string | Node name. Hierarchical keys use / as separator. |
| state | string or null | States enum name: "UPTODATE", "STALE", "UNINITIALIZED", "ERROR", "PINNED", … |
| value | any | Encoded value (see below), or null when absent. |
| has_value | bool | true when value should be restored; false when the node has no value. |
| func | object or null | Encoded callable (see below), or null. |
| serialize | bool | Whether the node carries the __serialize__ tag. |
| tags | list[string] | Non-system user tags. |
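The has_value flag matters because null in the value field is ambiguous on its own: it could be a stored None or a node with no value at all. A minimal sketch of how a reader might tell the two apart, using hypothetical node records and a hypothetical stored_value helper:

```python
_NO_VALUE = object()  # sentinel: distinguishes "no value" from a stored None

def stored_value(node):
    """Return the node's encoded value, or _NO_VALUE if none was saved."""
    return node["value"] if node["has_value"] else _NO_VALUE

# Hypothetical node records following the field table above.
explicit_none = {"key": "a", "state": "UPTODATE", "value": None, "has_value": True}
excluded = {"key": "engine", "state": "UNINITIALIZED", "value": None, "has_value": False}

results = (stored_value(explicit_none), stored_value(excluded) is _NO_VALUE)
print(results)
```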
Edge object
Each entry in edges has:
| Field | Type | Description |
|---|---|---|
| src | string | Source node key. |
| dst | string | Destination node key. |
| param_type | "arg", "kwd", or null | How the value is passed to the function. |
| param | int, string, or null | Positional index for "arg", parameter name for "kwd". |
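To illustrate how param_type and param fit together, the sketch below rebuilds the positional and keyword arguments for one node from a list of edge records. The helper gather_call_arguments and the sample data are hypothetical. Edges with a null param_type are simply skipped, which is an assumption rather than documented behavior.

```python
def gather_call_arguments(dst, edges, values):
    """Build (args, kwargs) for the function on node `dst` from edge records."""
    positional = {}
    kwargs = {}
    for edge in edges:
        if edge["dst"] != dst:
            continue
        if edge["param_type"] == "arg":
            positional[edge["param"]] = values[edge["src"]]  # param is an index
        elif edge["param_type"] == "kwd":
            kwargs[edge["param"]] = values[edge["src"]]      # param is a name
        # param_type None: no parameter binding recorded; skipped (assumption)
    args = [positional[i] for i in sorted(positional)]
    return args, kwargs

# Hypothetical edges and upstream values.
edges = [
    {"src": "x", "dst": "area", "param_type": "kwd", "param": "x"},
    {"src": "scale", "dst": "area", "param_type": "arg", "param": 0},
    {"src": "x", "dst": "other", "param_type": "kwd", "param": "x"},
]
values = {"x": 4.0, "scale": 2.0}

args, kwargs = gather_call_arguments("area", edges, values)
print(args, kwargs)
```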
Value encoding
Plain Python scalars (int, float, str, bool, None) are stored as-is.
Compound types use a tagged object with a "type" discriminator:
NumPy array
{
"type": "ndarray",
"shape": [3],
"dtype": "<f8",
"data": [1.0, 2.0, 3.0]
}
Pandas DataFrame (split orientation, column dtypes preserved)
{
"type": "dataframe",
"columns": ["x", "y"],
"index": [0, 1],
"data": [[1, 3.0], [2, 4.0]],
"dtypes": {"x": "int64", "y": "float64"}
}
ERROR node value (exception preserved as strings for post-mortem)
{
"__loman_error__": true,
"exception_type": "ValueError",
"exception_str": "something went wrong",
"traceback": "Traceback (most recent call last):\n ..."
}
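A record of this shape can be produced from a caught exception with the standard traceback module. The sketch below only illustrates the field layout; encode_error is a hypothetical helper, not Loman's internal implementation.

```python
import traceback

def encode_error(exc):
    """Capture an exception as plain strings, following the layout above."""
    return {
        "__loman_error__": True,
        "exception_type": type(exc).__name__,
        "exception_str": str(exc),
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }

try:
    raise ValueError("something went wrong")
except ValueError as e:
    encoded = encode_error(e)

print(encoded["exception_type"], "-", encoded["exception_str"])
```

Storing only strings means the record can be read back even where the original exception class is not importable.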
Function encoding
Importable module-level function (default)
{
"type": "func_ref",
"module": "mypackage.calcs",
"qualname": "compute_result"
}
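A func_ref record carries enough information to look the function up again with importlib. The following is a sketch of that lookup under the assumption that qualname is a dotted attribute path inside the module; resolve_func_ref is a hypothetical helper, and the example resolves a standard-library function rather than mypackage.calcs.

```python
import importlib

def resolve_func_ref(ref):
    """Import ref['module'] and walk ref['qualname'] to find the callable."""
    if ref.get("type") != "func_ref":
        raise ValueError(f"Not a func_ref record: {ref!r}")
    obj = importlib.import_module(ref["module"])
    for part in ref["qualname"].split("."):
        obj = getattr(obj, part)  # handles nested names such as Class.method
    return obj

sqrt = resolve_func_ref({"type": "func_ref", "module": "math", "qualname": "sqrt"})
print(sqrt(9.0))
```

This also shows why the function must be importable in the reading process: if the module or name has moved, the lookup fails.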
Lambda or closure (only when use_dill_for_functions=True)
{
"type": "dill_func",
"blob": "gASVyQAAAAAAAACMCmRpbGwuX2RpbGyU..."
}
The blob field is a base64-encoded dill byte string. It is not portable across Python versions.
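Base64 is what allows arbitrary bytes to sit inside a JSON string. The sketch below shows that wrapping step only, with the standard pickle module standing in for dill and a plain dictionary standing in for the function, so it demonstrates the encoding and not Loman's actual blob contents.

```python
import base64
import json
import pickle

# Stand-in payload: pickle replaces dill here, and a dict replaces the callable.
payload = pickle.dumps({"a": 3})

# Bytes -> base64 ASCII text -> JSON string field.
record = {"type": "dill_func", "blob": base64.b64encode(payload).decode("ascii")}
text = json.dumps(record)

# JSON string field -> bytes -> original object.
raw = base64.b64decode(json.loads(text)["blob"])
decoded = pickle.loads(raw)
print(decoded)
```

As with pickle, loading a dill blob can execute arbitrary code, so files containing such blobs should only be read from trusted sources.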
Note
The JSON serialization format is not intended for long-term storage. It is designed for short-term inspection and post-mortem debugging. The format may change between releases.