Support for ZODB object serialization.
ZODB serializes objects using a custom format based on Python pickles. When an object is unserialized, it can be loaded as either a ghost or a real object. A ghost is a persistent object of the appropriate type but without any state. The first time a ghost is accessed, the persistence machinery traps access and loads the actual state. A ghost allows many persistent objects to be loaded while minimizing the memory consumption of referenced but otherwise unused objects.
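The ghost idea can be sketched in a few lines of plain Python. This is only a toy model, not ZODB's persistence machinery: the class ``LazyRecord``, the callback ``fetch_state``, and the list ``load_calls`` are hypothetical names invented for the illustration, and the trap here is an ordinary ``__getattr__`` hook.

```python
class LazyRecord:
    """Toy stand-in for a ghost: it holds no state until an attribute is read."""

    def __init__(self, oid, load_state):
        # Write straight into __dict__ so the bookkeeping is always found
        # by normal attribute lookup.
        self.__dict__["_oid"] = oid
        self.__dict__["_load_state"] = load_state
        self.__dict__["_is_ghost"] = True

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, which is the
        # case for every state attribute while the object is still a ghost.
        if self.__dict__["_is_ghost"]:
            self.__dict__["_is_ghost"] = False
            self.__dict__.update(self._load_state(self._oid))
            return getattr(self, name)
        raise AttributeError(name)


load_calls = []


def fetch_state(oid):
    # Hypothetical storage lookup; it records each call so the laziness is visible.
    load_calls.append(oid)
    return {"title": "hello"}


record = LazyRecord(b"\x00" * 7 + b"\x01", fetch_state)
calls_before = list(load_calls)   # nothing has been loaded yet
first = record.title              # first access triggers the load
second = record.title             # later accesses use the loaded state
```

Creating ``record`` costs almost nothing; the state is fetched once, on the first attribute access.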

Pickle format
-------------

ZODB stores serialized objects using a custom format based on pickle. Each serialized object has two parts: the class description and the object state. The class description must provide enough information to call the class's ``__new__`` and create an empty object. Once the object exists as a ghost, its state is passed to ``__setstate__``.
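The two-part layout can be imitated with the standard ``pickle`` module. The sketch below is a simplified illustration rather than ZODB's own writer or reader: ``dump_record`` and ``load_record`` are hypothetical helpers, and ``random.Random`` is used only because it is a standard-library class that defines both ``__getstate__`` and ``__setstate__``.

```python
import io
import pickle
import random


def dump_record(obj):
    """Write the class description and the state as two consecutive pickles."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=3)
    pickler.dump(type(obj))            # part 1: class description
    pickler.dump(obj.__getstate__())   # part 2: object state
    return buf.getvalue()


def load_record(record):
    """Rebuild an object: __new__ first, then __setstate__ with the saved state."""
    unpickler = pickle.Unpickler(io.BytesIO(record))
    cls = unpickler.load()              # read the class description
    obj = cls.__new__(cls)              # create the object without the saved state
    obj.__setstate__(unpickler.load())  # read the state and apply it
    return obj


original = random.Random(42)
copy = load_record(dump_record(original))
```

In this sketch the state is applied immediately. A ghost would stop after the ``__new__`` call and defer the second ``load()`` until the object is first accessed.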
The class description can be in a variety of formats, in part to provide backwards compatibility with earlier versions of Zope. The four current formats for class description are:
The second of these options is used if the object has a ``__getnewargs__()`` method. It is intended to support objects like persistent classes that have custom C layouts that are determined by arguments to ``__new__()``. The third and fourth (#3 and #7) apply to instances of a persistent class (which means the class itself is persistent, not that it is a subclass of ``Persistent``).
The type object is usually stored using the standard pickle mechanism, which involves the pickle GLOBAL opcode (giving the type's module and name as strings). The type may itself be a persistent object, in which case a persistent reference (see below) is used.
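The GLOBAL opcode can be seen by disassembling the pickle of a class with the standard ``pickletools`` module. This small sketch uses ``fractions.Fraction`` as an arbitrary importable class and pickle protocol 3; both choices are assumptions made for the illustration (protocol 4 and later emit ``STACK_GLOBAL`` instead).

```python
import pickle
import pickletools
from fractions import Fraction

# Pickling a class stores a reference to it, not its code.
data = pickle.dumps(Fraction, protocol=3)

# Collect (opcode name, argument) pairs from the pickle stream.
ops = [(opcode.name, arg) for opcode, arg, pos in pickletools.genops(data)]

# The GLOBAL argument carries the module and the class name as strings.
global_args = [arg for name, arg in ops if name == "GLOBAL"]
```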
It's unclear what "usually" means in the paragraph above on how the type object is stored. There are two useful places to concentrate confusion about exactly which formats exist:
Earlier versions of Zope supported several other kinds of class descriptions. The current serialization code reads these descriptions, but does not write them. The three earlier formats are:
Formats 4 and 6 are used only if the class defines a ``__getinitargs__()`` method, but we really can't tell them apart from formats 7 and 2 (respectively). Formats 5 and 6 are used if the class does not have a ``__module__`` attribute (I'm not sure when this applies, but I think it occurs for some but not all ZClasses).

Persistent references
---------------------

When one persistent object pickle refers to another persistent object, the database uses a persistent reference.
ZODB persistent references take one of the following forms:

``oid``
    A simple object reference.

``(oid, class meta data)``
    A persistent object reference.

``[reference_type, args]``
    An extended reference.

Extension references come in a number of subforms, based on the reference types.
The following reference types are defined:

``w``
    Persistent weak reference. The arguments consist of an oid.

The following are planned for the future:

``n``
    Multi-database simple object reference. The arguments consist
    of a database name and an object id.

``m``
    Multi-database persistent object reference. The arguments consist
    of a database name, an object id, and class meta data.

The following legacy format is also supported:

``[oid]``
    A persistent weak reference.

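The reference forms listed above can be told apart by their Python type and length. The function below is a minimal sketch of such a dispatch, not ZODB's actual loader: ``classify_reference`` is a hypothetical name, and it assumes that oids are ``bytes`` and that the ``args`` of an extended reference are a tuple.

```python
def classify_reference(ref):
    """Return (kind, oid) for one of the reference forms described above."""
    if isinstance(ref, bytes):
        # oid alone: a simple object reference
        return ("simple", ref)
    if isinstance(ref, tuple):
        # (oid, class meta data): a persistent object reference
        oid, _class_meta = ref
        return ("persistent", oid)
    if isinstance(ref, list):
        if len(ref) == 1:
            # [oid]: the legacy weak reference form
            return ("weak-legacy", ref[0])
        reference_type, args = ref
        if reference_type == "w":
            # args: (oid,)
            return ("weak", args[0])
        if reference_type == "n":
            # args: (database name, oid)
            return ("multi-database simple", args[1])
        if reference_type == "m":
            # args: (database name, oid, class meta data)
            return ("multi-database persistent", args[1])
    raise ValueError("unrecognized persistent reference: %r" % (ref,))


oid = b"\x00" * 7 + b"\x2a"
kinds = [
    classify_reference(oid)[0],
    classify_reference((oid, ("mymodule", "MyClass")))[0],
    classify_reference(["w", (oid,)])[0],
    classify_reference([oid])[0],
]
```

The ``("mymodule", "MyClass")`` pair is only placeholder class meta data; the sketch never inspects it.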
Because the persistent object reference forms include class information, it is not possible to change the class of a persistent object for which this form is used. If a transaction changed the class of an object, a new record with new class metadata would be written but all the old references would still use the old class. (It is possible that we could deal with this limitation in the future.)
An object id is used alone when a class requires arguments to its ``__new__`` method, which is signalled by the class having a ``__getnewargs__`` attribute.
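The choice between the two common reference forms can be sketched as a single test on the class. This is an illustration of the rule stated above, not the code ZODB uses when writing references; ``make_reference``, ``Plain``, and ``NeedsArgs`` are hypothetical names.

```python
def make_reference(oid, obj):
    """Pick the reference form for obj according to the rule above."""
    cls = type(obj)
    if hasattr(cls, "__getnewargs__"):
        # The class needs arguments for __new__, so a ghost cannot be
        # created from the reference alone: store only the oid.
        return oid
    # Otherwise include the class, so a ghost can be created without
    # loading the object's record.
    return (oid, cls)


class Plain:
    pass


class NeedsArgs:
    def __new__(cls, size=0):
        inst = super().__new__(cls)
        inst.size = size
        return inst

    def __getnewargs__(self):
        return (self.size,)


oid = b"\x00" * 8
plain_ref = make_reference(oid, Plain())
args_ref = make_reference(oid, NeedsArgs(4))
```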
A number of legacy forms are defined: