STON (for Smalltalk Object Notation) is a lightweight, text-based, and human-readable data-interchange format. STON was developed by Sven Van Caekenberghe. STON can be used to serialize domain-level objects, either for persistence or for network transport. As its name suggests, it is based on JSON (see also Chapter NeoJSON). It adds symbols as a primitive value, and class tags for object values and references. Implementations for Pharo Smalltalk, Squeak and Gemstone Smalltalk are available.
Some of these differences are due to the fact that JSON knows only about lists and maps, which means that there is no concept of object types or classes. As a result it is not easy to encode arbitrary objects, and some of the possible solutions are quite verbose. For example, the type or class is encoded as a property, and/or an indirection is added to encode the object's contents. To address this, STON extends JSON by adding symbols as a primitive value, and 'class' tags for object values and references, as we will see next.
STON offers three main features:
In its current form, STON cannot serialize a number of objects that are more system or implementation oriented than domain oriented, such as Blocks and classes. STON is also less efficient than a binary encoding such as Fuel.
A reference implementation for STON was written in Pharo and works in versions 1.3, 1.4, 2.0, 3.0 and 4.0. The project contains a full complement of unit tests.
STON is hosted on SmalltalkHub. To load STON, execute the following code snippet:
You can also add the following repository to your package browser:
We now show how to serialize and materialize objects, starting with a simple rectangle and then continuing with more complex objects.
To generate a STON representation for an object, STON provides two messages, one of which is toStringPretty:. The first message generates a compact version and the second, toStringPretty:, displays the serialized version in a more readable way. For example:
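A minimal sketch of both variants, meant for a Pharo playground with STON loaded. It assumes that the compact counterpart of toStringPretty: is the facade message toString:, and it uses the simple rectangle announced above.

```smalltalk
"Playground sketch. Assumption: the STON facade answers toString: (compact)
 and toStringPretty: (indented)."
| rectangle compact pretty |
rectangle := Rectangle origin: 10@10 corner: 100@50.

"Compact form: a single line without extra whitespace."
compact := STON toString: rectangle.

"Pretty form: the same content, spread over indented lines."
pretty := STON toStringPretty: rectangle.

{ compact. pretty }
```

Both strings describe the same object, so either one can later be read back.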
What is shown above follows the default representation scheme for objects. Each class can define its own custom representation, as discussed in section 3.
Once you have the textual representation of an object, you can obtain the encoded objects by parsing it.
Alternatively, you can also use the STON facade as follows
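As a hedged illustration of materialization, the sketch below assumes the facade message fromString: and reads back the rectangle in its default representation.

```smalltalk
"Playground sketch. Assumption: STON fromString: parses a STON string
 and answers the object it describes."
| ston rectangle |
ston := 'Rectangle{#origin:Point[10,10],#corner:Point[100,50]}'.
rectangle := STON fromString: ston.

"The result is a regular Rectangle again."
rectangle = (Rectangle origin: 10@10 corner: 100@50)
```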
This example shows how more complex data structures are represented in STON.
Maps are represented by curly braces { and }, with keys and values separated by a colon : and items separated by a comma ,. Lists are delimited by square brackets [ and ] and their items are separated by a comma. Class tags are represented by ClassName [ ... ] or ClassName { ... }.
Next is an example of what pretty printed STON for a simple object looks like. Even without further explanation, the semantics should be clear.
Here is a more complex example: a ZnResponse object. It is the result of serializing the result of the following HTTP request (using Zinc, see Chapters Zinc Client and Zinc Server). It also shows that curly braces are for dictionaries and square brackets are for lists.
Note that when encoding regular objects, STON uses Symbols as keys. For Dictionaries, you can use Symbols, Strings and Numbers as keys.
We will now go into detail on how the notation encodes Smalltalk values. Values are either a primitive value or an object value. Note that the undefined object nil and a reference to an already encountered object are considered values as well.
The kinds of values which are considered as primitives are numbers, strings, symbols, booleans and nil. We talk about each of these next, and we show an example of their encoding.
Numbers are either integers or floats.
Strings are enclosed using single quotes and backslash is used as the escape character. A general Unicode escape mechanism using four hexadecimal digits can be used to encode any character. Some unreadable characters have their own escape code, like in JSON. STON conventionally encodes all non-printable non-ASCII characters.
Symbols are preceded by a #. Symbols consisting of a limited character set (letters, numbers, a dot, underscore, dash or forward slash) are written literally. Symbols containing characters outside this limited set are encoded like strings, enclosed in single quotes.
Booleans consist of the constants true and false.
The undefined object is represented by the constant nil.
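The primitive encodings described above can be inspected in a playground. This is only a sketch: it assumes the facade message STON toString:, and the sample values are arbitrary.

```smalltalk
"Playground sketch. Assumption: STON toString: answers the STON text of any value."
| samples |
samples := #( 42 1.5 'hello' #foo #'foo bar' true nil ).

"Answers one string per sample: a number, a float, a single-quoted string,
 a literal symbol, a quoted symbol (it contains a space), a boolean and nil."
samples collect: [ :each | STON toString: each ]
```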
Values that are not primitives can be three kinds of objects. The first kind is a collection of values: lists or maps, the second kind is a non-collection object, and the last kind is a reference to another value.
Like in JSON, STON uses two primitive composition mechanisms: lists and maps. Lists consist of an ordered collection of arbitrary objects. Maps consist of an unordered collection of key-value pairs. Keys can be strings, symbols or numbers, and values are arbitrary objects.
Lists are delimited by [ and ]. Items are separated by a comma ,.
For example, the following expression is a list with two numbers, -40 and -15.
The serialization of an array is represented by a list.
Lists are also used to represent values of certain object instance variables, as discussed in section 3.
Maps are delimited by { and }. Keys and values are separated by a colon : and items are separated by a comma ,. Dictionaries are serialized as maps, for example as below:
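A small sketch of both composition mechanisms, assuming the facade messages toString: and fromString:. The dictionary holds a single, arbitrary entry so that the output does not depend on key order.

```smalltalk
"Playground sketch. Assumption: STON toString: and STON fromString: exist on the facade."
| list map |
"An Array is written as a list."
list := STON toString: #( -40 -15 ).

"A Dictionary is written as a map."
map := STON toString: (Dictionary new at: #answer put: 42; yourself).

"Reading the strings back answers an Array and a Dictionary again."
{ list. map. STON fromString: list. STON fromString: map }
```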
An object in STON has a class tag and a representation. A class tag starts with an alphabetic uppercase letter and contains alphanumeric characters only. A representation is either a list or a map.
The next example shows an instance of the class
This is a generic way to encode arbitrary objects. Non-collection classes are encoded using a map of their instance variables: instance variable name (a symbol) mapped to instance variable value. Collection classes are encoded using a list of their values.
For the list-like collection subclass Array, the class tag is optional, given a list representation. The following pairs are thus equivalent:
Also, for the map-like collection subclass Dictionary, the class tag is optional, given a map representation:
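The equivalence of tagged and untagged forms can be checked with a short sketch. It assumes the facade message fromString:; the literals are arbitrary.

```smalltalk
"Playground sketch. Assumption: STON fromString: parses a STON string."
| plainList taggedList plainMap taggedMap |
plainList := STON fromString: '[1,2,3]'.
taggedList := STON fromString: 'Array[1,2,3]'.
plainMap := STON fromString: '{#a:1}'.
taggedMap := STON fromString: 'Dictionary{#a:1}'.

"Each pair should describe equal objects."
{ plainList = taggedList. plainMap = taggedMap }
```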
To support shared objects and cycles in the object graph, STON adds the concept of references to JSON. Each object value encountered during a depth-first traversal of the graph is numbered from 1 up. If an object is encountered again, only a reference to its number is recorded. References consist of the @ sign followed by a positive integer. When the data is materialized, references are resolved after reconstructing the object graph.
Here is an OrderedCollection that shares a Point object three times:
A two-element Array that refers to itself in its second element will look like this:
Note that strings are not treated as objects and are consequently never shared.
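Both situations can be reproduced with the following sketch, which assumes the facade messages toString: and fromString:. The numbering follows the depth-first rule: the outer collection is object 1 and the first Point is object 2.

```smalltalk
"Playground sketch. Assumption: STON toString: and STON fromString: exist on the facade."
| point shared cyclic copy |
point := 1@2.
shared := OrderedCollection new.
shared add: point; add: point; add: point.
"The Point is written once, then referenced as @2."
STON toString: shared.

cyclic := Array new: 2.
cyclic at: 1 put: #foo; at: 2 put: cyclic.
"The Array refers to itself, so its second element is written as @1."
STON toString: cyclic.

"After materialization the sharing is restored: same object, not a copy."
copy := STON fromString: (STON toString: shared).
copy first == copy last
```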
In the current reference implementation in Pharo, a number of classes received a special, custom representation, often chosen for compactness and readability. We give a list of them here and then discuss how to implement such a custom representation.
Time is represented as a one-element array with an ISO-style string
Date is represented as a one-element array with an ISO-style string
TimeStamp is represented as a one-element array with an ISO-style string
Point is represented as a two-element array with the x and y values
ByteArray is represented as a one-element array with a hex string
Character is represented as a one-element array with a one-element string
Associations are represented as a pair separated by a colon :. Nesting is also possible, as in #foo : 1 : 2.
Note that this custom representation does not change the way maps (either for dictionaries or for arbitrary objects) work. In practice, this means that there are now two closely related expressions:
In the first case you get an Array of explicit Associations, in the second case you get a Dictionary (which uses Associations internally).
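The difference between the two forms can be sketched as follows, assuming the facade messages toString: and fromString:. The keys and values are arbitrary.

```smalltalk
"Playground sketch. Assumption: STON toString: and STON fromString: exist on the facade."
| pairs dictionary |
"A single association is written as key:value; nested ones chain to the right."
STON toString: #foo -> 1.
STON toString: #foo -> (1 -> 2).

"Square brackets: an Array holding explicit Associations."
pairs := STON fromString: '[#foo:1,#bar:2]'.

"Curly braces: a Dictionary."
dictionary := STON fromString: '{#foo:1,#bar:2}'.

{ pairs class. pairs first. dictionary class. dictionary at: #bar }
```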
The choice between the default STON mapping and a custom representation is up to you and your application. In the generic mapping, instance variable names (as symbols) and their values become keys and values in a map. This is flexible: it won't break when instance variables are added or removed. It is, however, more verbose and exposes all internals of an object, including ephemeral ones. Custom representations are most useful to increase the readability of small, simple objects.
The key methods are the instance method stonOn: and the class or instance method fromSton:. The former produces a STON representation of the object and the latter creates a new object from a STON representation. If fromSton: is implemented on the instance side, STON will first create an instance of the object before calling fromSton:, e.g. as in Point. If implemented on the class side, the creation of the instance is the responsibility of the fromSton: method.
During encoding, classes can output a one-line representation of themselves by sending one of two messages, the second being #writeObject:streamShortList:, to an instance of STONWriter. The first argument of the message should be self; the second argument is a single element for the first message, or a collection of elements for the second.
Examples of this are below:
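The following sketch shows what such a pair of methods could look like for a hypothetical class Celsius, an Object subclass with a single instance variable degrees. Neither the class nor its variable come from STON. The sketch also assumes that the single-element message is #writeObject:listSingleton: and that STONReader understands parseListSingleton.

```smalltalk
"Hypothetical class: Object subclass: #Celsius, instance variable 'degrees'.
 Assumptions: STONWriter>>writeObject:listSingleton: and
 STONReader>>parseListSingleton are available."

Celsius >> stonOn: stonWriter
    "Write a one-line representation with a single element, e.g. Celsius[21.5]."
    stonWriter writeObject: self listSingleton: degrees

Celsius >> fromSton: stonReader
    "Instance side: STON has already created the receiver, so only fill it in."
    degrees := stonReader parseListSingleton
```

With these two methods in place, reading Celsius[21.5] and writing the result again should give back the same text.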
An instance of STONWriter also understands further messages, among them #writeObject:streamMap:, which generate a multi-line representation. Classes can also use another external name by overriding the corresponding method.
STON offers a way to control which instance variables get written and the order in which they get written.
This can be done by overriding Object class>>#stonAllInstVarNames to return an array of symbols. Each symbol is the name of a variable and the order of the symbols determines the write order. A related hook that returns true causes instance variables to be written out even when they are nil (the default is to omit them).
Lastly, postprocessing on instance variables for resolving references is realized by the Object>>stonProcessSubObjects: method. If custom postprocessing is required, this method should be overridden.
This section lists some code examples on how to use the current implementation and its API.
STON acts as a class facade API to read/write to/from streams/strings while hiding the actual parser or writer classes. It is a central access point, but it is very thin: using the reader or writer directly is perfectly fine, and offers some more options as well.
Parsing is the simplest operation: use either the string-based method or the fromStream: method, like this:
Invoking the reader (parser) directly goes like this:
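A sketch of both routes, facade and reader. It assumes that STON reader answers a fresh reader that is given its input with on: and parses one value with next.

```smalltalk
"Playground sketch. Assumptions: STON fromStream: on the facade, and a reader
 obtained with STON reader, configured with on: and read with next."
| viaFacade viaReader |
viaFacade := STON fromStream: 'Point[1,2]' readStream.
viaReader := (STON reader on: 'Point[1,2]' readStream) next.

"Both routes answer an equal Point."
viaFacade = viaReader
```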
Writing has two variants: the regular compact representation or the pretty printed one.
Each variant has its own method; the pretty printing one is put:onStreamPretty:, like this:
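The two writing variants can be compared with the sketch below. It assumes that the compact counterpart of put:onStreamPretty: is put:onStream:; the array of points is an arbitrary sample.

```smalltalk
"Playground sketch. Assumption: the facade answers put:onStream: (compact)
 next to put:onStreamPretty:."
| points compact pretty |
points := { 1@2. 3@4 }.

compact := String streamContents: [ :stream |
    STON put: points onStream: stream ].

pretty := String streamContents: [ :stream |
    STON put: points onStreamPretty: stream ].

{ compact. pretty }
```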
Like JSON, STON does not allow comments of any kind in its format.
However, STON offers the possibility to handle comments using a special helper stream. The following snippets illustrate two ways to use this stream:
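A hedged sketch of both usages. It assumes that the helper stream is the class STONCStyleCommentsSkipStream, which wraps a read stream and skips C-style comments; the commented input is made up for illustration.

```smalltalk
"Playground sketch. Assumption: the helper class is STONCStyleCommentsSkipStream
 and is created with on: around a read stream."
| input viaFacade viaReader |
input := 'Point[ /* this is x */ 1, /* this is y */ 2 ] // trailing comment'.

"1. Through the facade."
viaFacade := STON fromStream: (STONCStyleCommentsSkipStream on: input readStream).

"2. Through an explicit reader."
viaReader := (STON reader on: (STONCStyleCommentsSkipStream on: input readStream)) next.

{ viaFacade. viaReader }
```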
This helper class is usable in other contexts too, for example with NeoJSON. The advantage is that it does not change the STON (or JSON) syntax itself; it just adds some functionality on top.
The writer can be created explicitly as follows:
When created, the reference policy of the writer can be set.
The default for STON is to track object references and generate references when needed.
Other options are to signal an error on shared references by sending the writer referencePolicy: #error, or to ignore them (#ignore), with the risk of going into an infinite loop. An example of the error reference policy is below:
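The error policy can be tried with the following sketch. It assumes that STON writer answers a writer configured with on:, and that the signalled exception is a STONWriterError; the shared Point is an arbitrary sample.

```smalltalk
"Playground sketch. Assumptions: a writer is obtained with STON writer and on:,
 and a shared reference under the #error policy signals STONWriterError."
| point shared outcome |
point := 1@2.
shared := { point. point. point }.

outcome := [ String streamContents: [ :stream |
        (STON writer on: stream)
            referencePolicy: #error;
            nextPut: shared ] ]
    on: STONWriterError
    do: [ :error | #sharedReferenceDetected ].

outcome
```

The handler answers a marker symbol instead of a string, which makes it easy to see that the error path was taken.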
The current STON implementation has a very large degree of JSON compatibility.
Valid JSON input is almost always valid STON. The only exceptions are the string delimiters (single quotes for STON, double quotes for JSON) and the undefined value (nil for STON, null for JSON). The STON parser accepts both variants for full compatibility.
The STON writer has a jsonMode option so that generated output conforms to standard JSON. That means the use of double quotes as string delimiters, null instead of nil, and the treatment of symbols as strings. When using JSON mode, the reference policy should be set to #error or #ignore for full JSON compatibility. Also, as JSON does not understand non-primitive values outside of arrays or dictionaries, it is necessary to convert data structures to an Array or Dictionary first. Attempting to write non-primitive instances that are not arrays or dictionaries will signal an error.
Next is an example of how to use the STON writer to generate JSON output.
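A minimal sketch of JSON generation. It assumes that the option is set with jsonMode: true on a writer obtained through STON writer; the dictionary content is made up.

```smalltalk
"Playground sketch. Assumption: jsonMode: and referencePolicy: are sent to a
 writer obtained with STON writer and on:."
| data json |
data := Dictionary new.
data at: #name put: 'Bill'.

json := String streamContents: [ :stream |
    (STON writer on: stream)
        jsonMode: true;
        referencePolicy: #error;
        nextPut: data ].

"The symbol key and the string value both come out double-quoted."
json
```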
STON also supports converting CR, LF, or CRLF characters inside strings and symbols to one chosen canonical newline. On the reading side, this is controlled by the messages STONReader>>convertNewLines: aBoolean and STONReader>>newLine: aCharacter. When convertNewLines: is set to true, any CR, LF or CRLF read unescaped inside strings or symbols is converted to the newline convention chosen with newLine:. The default is false, which performs no conversion.
In the following example, any CR, LF or CRLF seen while reading Strings will all be converted to the same EOL, CRLF.
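A sketch of the reading side. It assumes that the newline convention is passed to newLine: as a string such as String crlf; the input is a quoted STON string with a raw LF inside.

```smalltalk
"Playground sketch. Assumption: newLine: accepts a String such as String crlf."
| input text |
"A STON string literal containing an unescaped LF between two words."
input := '''one' , String lf , 'two'''.

text := (STON reader on: input readStream)
    newLine: String crlf;
    convertNewLines: true;
    next.

"The LF should have been replaced by CRLF."
text = ('one' , String crlf , 'two')
```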
STONWriter>>keepNewLines: aBoolean works as follows: if true, any CR, LF or CRLF inside strings or symbols is not escaped but is instead converted to the newline convention chosen with newLine:. The default is false, where CR, LF or CRLF are escaped unchanged.
Any CR, LF or CRLF inside any String will no longer be written as \r\n but all as CRLF, a normal EOL.
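The writing side can be sketched in the same way, assuming that STONWriter also understands newLine: with a string argument; the sample text is arbitrary.

```smalltalk
"Playground sketch. Assumption: STONWriter understands newLine: with a String
 argument, next to keepNewLines:."
| text ston |
text := 'one' , String lf , 'two'.

ston := String streamContents: [ :stream |
    (STON writer on: stream)
        newLine: String crlf;
        keepNewLines: true;
        nextPut: text ].

"The LF is written as a raw CRLF inside the quotes instead of an escape."
ston
```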
STON is a practical and simple text-based object serializer based on JSON (see also Chapter NeoJSON). We have shown how to use it, how values are encoded and how to define a custom representation for a given class.