Data Structure Saving and Loading
Attention
This section is outdated and needs to be rewritten!
This section describes the data structure saving and loading support and infrastructure. We use the toml++ <https://github.com/marzer/tomlplusplus> library to assist with object saving and reloading. Objects opt into this infrastructure by providing special save() and static load() member functions. The expected signatures and semantics of these functions are described in this section, with an API reference at the bottom.
Context Free Saving and Loading
Many classes to be saved are simple enough that they may be stored entirely inside a TOML table. We call these classes “context free” because their saving format does not depend on the directory in which an object is being saved. The example below demonstrates a simple class implementing context free loading and saving.
class ContextFreeSaveable {
// Members
private:
int64_t a_;
int64_t b_;
public:
ContextFreeSaveable(int64_t a, int64_t b)
: a_{a}
, b_{b} {}
friend bool
operator==(const ContextFreeSaveable&, const ContextFreeSaveable&) = default;
// The version number used for saving and loading.
// This can be used to detect and reload older versions of the data structure in a
// backwards compatible way.
static constexpr svs::lib::Version save_version = svs::lib::Version{0, 0, 1};
// Serialized objects need a schema as well, which is essentially a unique name
// associated with the serialized TOML table.
//
// The combination of schema and version gives speculative loading code some guarantee
// as to the expected contents and types of a table.
static constexpr std::string_view serialization_schema = "example_context_free";
// Save the object.
svs::lib::SaveTable save() const;
// Load the object.
static ContextFreeSaveable load(const svs::lib::ContextFreeLoadTable&);
};
There are several things to note.
First, each class is expected to supply a named schema and version information (in the form of an svs::lib::Version) along with its serialized form.
This enables classes to evolve while maintaining backwards compatibility with previously saved versions.
Furthermore, the combination of schema and version enables reasoning about reloaded toml::table files, providing mechanisms like auto-loading and object detection.
Note
Once SVS matures, it is expected that classes will not make backwards-incompatible changes to their saved format without incrementing the major version of the library!
Making a breaking change to a class’ saved format will also break all classes that transitively use this class.
Next, the object returned from the save() method is a svs::lib::SaveTable, which in practice is a thin wrapper around a toml::table.
The table should contain the relevant data required to reconstruct the object upon loading.
Library facilities will take care of storing the version information.
Note
The toml::table class stores entries as key-value pairs.
Keys beginning with two underscores “__” are reserved by the SVS saving infrastructure.
Outside of that, classes are free to use whatever names they like.
Finally, loading is expected to take a load table (svs::lib::ContextFreeLoadTable in the example above), which is also a thin wrapper around a toml::table.
The table given to load will match that given by save, potentially with the addition of some reserved names (see the note above).
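To make the reserved-key rule concrete, here is a minimal standalone sketch. It is not the SVS implementation: the `ReservedKeyTable` type and the key names `__schema__` and `__version__` are assumptions made for illustration, and rejecting reserved keys with an exception is just one possible policy.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for a versioned table (not an SVS or toml++ type).
class ReservedKeyTable {
  public:
    ReservedKeyTable(std::string_view schema, std::string_view version) {
        // Reserved entries are written by the infrastructure itself.
        entries_["__schema__"] = std::string(schema);
        entries_["__version__"] = std::string(version);
    }

    // User-facing insertion rejects keys in the reserved "__" namespace.
    void insert(std::string_view key, std::string_view value) {
        if (is_reserved(key)) {
            throw std::invalid_argument("keys beginning with \"__\" are reserved");
        }
        entries_[std::string(key)] = std::string(value);
    }

    const std::string& at(std::string_view key) const {
        return entries_.at(std::string(key));
    }

    // A key is reserved when it starts with two underscores.
    static bool is_reserved(std::string_view key) {
        return key.size() >= 2 && key[0] == '_' && key[1] == '_';
    }

  private:
    std::map<std::string, std::string> entries_;
};
```

With this shape, the table handed to a loader contains the user entries plus the reserved ones, which matches the description above.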
Implementing Save and Load
The implementation of save is given below.
// The `svs::lib::SaveTable` is a pair consisting of a `toml::table` and a version
// number.
svs::lib::SaveTable ContextFreeSaveable::save() const {
return svs::lib::SaveTable(
serialization_schema, save_version, {{"a", svs::lib::save(a_)}, SVS_LIST_SAVE_(b)}
);
}
There is not much to it.
Each member of ContextFreeSaveable is stored as a key-value pair in the svs::lib::SaveTable.
The schema and version are passed as the first two arguments to the constructor of svs::lib::SaveTable, and the entries are passed as a std::initializer_list of key-value pairs.
Keys are string-like and values should be obtained through calls to svs::lib::save().
The example shows two equivalent ways of calling svs::lib::save().
The first is a direct invocation that specifies the key name (“a”) and passes the member a_ to svs::lib::save().
The other uses the convenience macro SVS_LIST_SAVE_ to automatically derive the key based on the member name.
Note
When using the SVS_LIST_SAVE_ and SVS_LOAD_MEMBER_AT_ helper macros that end in underscores, a trailing underscore will be automatically appended to the target variable name.
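The trailing-underscore convention can be illustrated with the preprocessor's stringification (`#`) and token-pasting (`##`) operators. The sketch below uses a hypothetical macro `TOY_LIST_SAVE_` and a hypothetical `ToyEntry` type; it does not reproduce the actual definition of the SVS macros.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using ToyEntry = std::pair<std::string, std::int64_t>;

// Hypothetical macro: `#name` turns the argument into the key string, while
// `name##_` pastes a trailing underscore to form the member identifier.
#define TOY_LIST_SAVE_(name) ToyEntry{#name, name##_}

struct ToyPoint {
    std::int64_t a_;
    std::int64_t b_;

    std::vector<ToyEntry> save() const {
        // Explicit key for `a_`, macro-derived key for `b_`.
        return {ToyEntry{"a", a_}, TOY_LIST_SAVE_(b)};
    }
};
```

Both entries end up keyed by the member name without its underscore, so the two spellings are interchangeable.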
Loading is also straightforward.
// Loading takes the items produced by `save()` and should yield an instance of the
// associated class.
ContextFreeSaveable ContextFreeSaveable::load(const svs::lib::ContextFreeLoadTable& table) {
// Perform a version check.
// This class is only compatible with one version.
//
// This check is also not needed as it is performed automatically by the loading
// infrastructure.
if (table.version() != save_version) {
throw std::runtime_error("Version Mismatch!");
}
// Retrieve the saved values from the table.
return ContextFreeSaveable(
svs::lib::load_at<int64_t>(table, "a"), SVS_LOAD_MEMBER_AT_(table, b)
);
}
The svs::lib::load_at function is used to extract the element from the table at a specific key.
Alternatively, the macro SVS_LOAD_MEMBER_AT_ can be used to automatically determine the type of the object to load.
The explicit version check in the example is redundant because one happens behind the scenes.
To expand, if a class does not define a static method bool check_load_compatibility(std::string_view schema, svs::lib::Version version), the loading infrastructure will check the loaded schema and version against ContextFreeSaveable::serialization_schema and ContextFreeSaveable::save_version respectively.
Later, we will show how to customize this compatibility check.
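The fallback behaviour described above can be sketched with the C++17 detection idiom. Everything here is hypothetical and standalone: `ToyVersion`, `has_custom_check`, and `is_compatible` are illustrative names, and the default is assumed to be an exact match on schema and version.

```cpp
#include <string_view>
#include <tuple>
#include <type_traits>
#include <utility>

using ToyVersion = std::tuple<int, int, int>;

// Detection idiom: true when T declares a static `check_load_compatibility`.
template <typename T, typename = void>
struct has_custom_check : std::false_type {};

template <typename T>
struct has_custom_check<
    T,
    std::void_t<decltype(T::check_load_compatibility(
        std::declval<std::string_view>(), std::declval<ToyVersion>()
    ))>> : std::true_type {};

template <typename T>
bool is_compatible(std::string_view schema, const ToyVersion& version) {
    if constexpr (has_custom_check<T>::value) {
        // The class supplied its own rule, so defer to it.
        return T::check_load_compatibility(schema, version);
    } else {
        // Assumed default: schema and version must match exactly.
        return schema == T::serialization_schema && version == T::save_version;
    }
}

struct Strict {
    static constexpr std::string_view serialization_schema = "strict";
    static constexpr ToyVersion save_version{0, 0, 1};
};

struct Lenient {
    static constexpr std::string_view serialization_schema = "lenient";
    static constexpr ToyVersion save_version{0, 0, 1};
    // Accept the current version and anything older.
    static bool check_load_compatibility(std::string_view schema, ToyVersion version) {
        return schema == serialization_schema && version <= save_version;
    }
};
```

`Strict` is accepted only at `{0, 0, 1}`, whereas `Lenient` also accepts `{0, 0, 0}` because its own check is used.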
Using Save and Load
Saving and restoring an object to/from disk is easy.
// Construct an object, save it to disk, and reload it.
auto context_free = ContextFreeSaveable(10, 20);
auto saved = svs::lib::save(context_free);
auto context_free_reloaded =
svs::lib::load<ContextFreeSaveable>(svs::lib::node_view(saved));
// Check that saving and reloading was successful
if (context_free != context_free_reloaded) {
throw ANNEXCEPTION("Context free reloading in-memory failed!");
}
// We also get saving and reloading from disk for free.
svs::lib::save_to_disk(context_free, dir);
context_free_reloaded = svs::lib::load_from_disk<ContextFreeSaveable>(dir);
if (context_free != context_free_reloaded) {
throw ANNEXCEPTION("Context free reloading to-disk failed!");
}
The example above shows constructing a ContextFreeSaveable, saving it to a serialized form using svs::lib::save, and reloading it with svs::lib::load.
It further shows that we can save and reload the data structure to disk using svs::lib::save_to_disk and svs::lib::load_from_disk respectively.
Note that a directory is required instead of a simple .toml file because in the next section, we will discuss contextual saving, which may require multiple files.
By storing the saved object in a directory, we maintain the same API.
One advantage of context free saving is that we can save an entire object inside a TOML table. This allows usage like the following example, which can be used to construct more advanced object saving in testing and benchmarking pipelines.
// Construct an object, save it to a table and reload.
auto context_free = ContextFreeSaveable(10, 20);
auto table = svs::lib::save_to_table(context_free);
auto context_free_reloaded =
svs::lib::load<ContextFreeSaveable>(svs::lib::node_view(table));
if (context_free != context_free_reloaded) {
throw ANNEXCEPTION("Context free reloading failed!");
}
Contextual Saving and Loading
Context free saving and loading is great for small key-value-like data structures. However, larger data structures like datasets and graphs can carry significant binary state unsuitable for storage in a TOML file. Instead, it is preferable to store this state in one or more auxiliary binary files that are rediscovered from the TOML configuration when loading. These data structures are “contextual” because they require run-time context in the form of the directory being processed in addition to the TOML format.
The example class definition below shows a class that implements contextual saving and loading.
The motivation for contextual loading is the existence of the std::vector<float> data_ member.
If the size of this vector is large, saving it in a TOML file is space and time inefficient.
class Saveable {
private:
// We have a member that is also a saveable object.
ContextFreeSaveable member_;
// The `data_` member may be arbitrarily long and is thus not necessarily suitable
// for storage in a `toml::table`.
std::vector<float> data_;
public:
Saveable(ContextFreeSaveable member, std::vector<float> data)
: member_{member}
, data_{std::move(data)} {}
friend bool operator==(const Saveable&, const Saveable&) = default;
static constexpr svs::lib::Version save_version = svs::lib::Version{0, 0, 1};
static constexpr std::string_view serialization_schema = "example_saveable";
// Customized compatibility check.
static bool
check_load_compatibility(std::string_view schema, svs::lib::Version version) {
// Backwards compatible with version `v0.0.0`.
return schema == serialization_schema && version <= save_version;
}
// Contextual saving.
svs::lib::SaveTable save(const svs::lib::SaveContext& ctx) const;
// Contextual loading.
static Saveable load(const svs::lib::LoadTable& table);
};
Objects implementing contextual saving and loading also have “save” and “load” methods.
However, this time saving requires a svs::lib::SaveContext, and loading takes a svs::lib::LoadTable, which gives access to the loading context (svs::lib::LoadContext).
The svs::lib::SaveContext class provides a way of obtaining the saving directory as well as facilities to generate unique filenames to avoid name clashes.
The svs::lib::LoadContext provides the working directory when loading.
Together, these classes facilitate the generation of saved objects in a relocatable manner.
Additionally, this example shows the definition of a check_load_compatibility method.
This provides a way for the class to declare its compatibility with older serialization versions and will be called if provided.
Implementing Contextual Saving and Loading
The code snippet below shows the implementation of the contextual save method.
svs::lib::SaveTable Saveable::save(const svs::lib::SaveContext& ctx) const {
// Generate a unique name for the file where we will save the associated binary
// data.
//
// This filename will be unique in the directory generated for saving this object.
auto fullpath = ctx.generate_name("data", "bin");
{
// Open the file and store the contents of the vector into that file in a dense
// binary form.
auto ostream = svs::lib::open_write(fullpath);
svs::lib::write_binary(ostream, data_);
}
// Generate a table to save the object.
auto table = svs::lib::SaveTable(serialization_schema, save_version);
// Use `save` to save the sub-object into a sub-table.
// Even though `ContextFreeSaveable` is context free, we can still pass the
// context variable if desired.
//
// The library infrastructure will call the correct member function.
SVS_INSERT_SAVE_(table, member, ctx);
// Also store the size of the vector we're going to save.
// Since integers of type `size_t` are not natively saveable in a
// `toml::table`, we use the overload set `svs::lib::save` to safely convert it.
table.insert("data_size", svs::lib::save(data_.size()));
// Store only the relative portion of the path to make the saved object
// relocatable.
//
// Again, we need to use `svs::lib::save` to convert `std::filesystem::path`
// to a string-like type for the `toml::table`.
table.insert("data_file", svs::lib::save(fullpath.filename()));
return table;
}
To save the data_ member, the generate_name() method is used to generate a unique file name in the saving directory.
The contents of the vector are then saved directly to this file.
We need to find this file when reloading the data structure.
However, the variable fullpath returned by generate_name() is an absolute path.
When we store this filepath in the TOML table, we need to ensure that we only store the final filename.
When reloading, the full path will be recreated using the svs::lib::LoadContext.
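The relocatability argument can be checked with the standard library alone. The sketch below is an assumption-laden stand-in rather than SVS code: `toy_save`, `toy_load`, and the fixed file name `data_0.bin` are hypothetical, and only `std::filesystem` and `std::fstream` are used.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Write `data` as raw floats into `dir` and return only the file name, which is
// what a relocatable configuration table would store.
inline std::string toy_save(const fs::path& dir, const std::vector<float>& data) {
    fs::create_directories(dir);
    fs::path fullpath = dir / "data_0.bin"; // hypothetical generated name
    std::ofstream out(fullpath, std::ios::binary);
    out.write(
        reinterpret_cast<const char*>(data.data()),
        static_cast<std::streamsize>(data.size() * sizeof(float))
    );
    return fullpath.filename().string();
}

// Rebuild the full path from the directory being loaded and the stored name.
inline std::vector<float>
toy_load(const fs::path& dir, const std::string& stored_name, std::size_t size) {
    std::vector<float> data(size);
    std::ifstream in(dir / stored_name, std::ios::binary);
    in.read(
        reinterpret_cast<char*>(data.data()),
        static_cast<std::streamsize>(size * sizeof(float))
    );
    if (!in) {
        throw std::runtime_error("could not read the expected number of bytes");
    }
    return data;
}
```

Because only the file name is stored, the directory can be renamed or moved after saving and the data is still found when the new directory is passed to `toy_load`.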
This example also demonstrates another important concept: recursive saving.
The Saveable class has a ContextFreeSaveable member.
To save the member, svs::lib::save is used (here through the SVS_INSERT_SAVE_ macro), which will perform all the necessary steps to save that member class and return its generated TOML table, which can then be nested inside the Saveable’s TOML table.
Reloading is similar.
Saveable Saveable::load(const svs::lib::LoadTable& table) {
// Obtain the file path and the size of the stored vector.
auto full_path = table.resolve_at("data_file");
// Provide compatibility with older versions where `old_data_size` was used instead
// of `data_size`.
size_t data_size = 0;
if (table.version() == svs::lib::Version(0, 0, 0)) {
data_size = svs::lib::load_at<size_t>(table, "old_data_size");
} else {
data_size = svs::lib::load_at<size_t>(table, "data_size");
}
// Allocate a sufficiently sized vector and read the binary data into it.
auto data = std::vector<float>(data_size);
{
auto istream = svs::lib::open_read(full_path);
svs::lib::read_binary(istream, data);
}
// Finish constructing the object by recursively loading the `member_` subobject.
return Saveable(SVS_LOAD_MEMBER_AT_(table, member), std::move(data));
}
Here we see the directory obtained from the load context combined with the file name stored in the TOML table to recreate the full filepath for the saved binary data.
The function svs::lib::load() is used to load the saveable subobject.
End to end saving is shown below.
// Initialize the data vector.
auto data = std::vector<float>(100);
std::iota(data.begin(), data.end(), 10);
// Construct, save, and reload.
auto context_required = Saveable(ContextFreeSaveable(20, 30), std::move(data));
svs::lib::save_to_disk(context_required, dir);
auto context_required_reloaded = svs::lib::load_from_disk<Saveable>(dir);
// Ensure that the reloaded object equals the original.
if (context_required != context_required_reloaded) {
throw ANNEXCEPTION("Context required reloading failed!");
}
General Guidelines
Prefer context-free loading and saving if possible. It is more flexible and allows for more uses than contextual saving.
Use svs::lib::save() and svs::lib::load() to save and reload saveable sub-objects.
STL Support
The saving and loading infrastructure has support for several built-in types, including std::vector.
The example below demonstrates the use of std::vector.
auto data = std::vector<Saveable>(
{Saveable(ContextFreeSaveable(10, 20), {1, 2, 3}),
Saveable(ContextFreeSaveable(30, 40), {4, 5, 6})}
);
svs::lib::save_to_disk(data, dir);
auto reloaded = svs::lib::load_from_disk<std::vector<Saveable>>(dir);
if (reloaded != data) {
throw ANNEXCEPTION("Reloading vector failed!");
}
The list of built-in types is:
Type Class | Notes |
Integers | Will error if the conversion from TOML’s |
Booleans | |
 | Lossy conversion allowed for |
 | |
 | Can optionally take an allocator as the first non-context argument. Loading is contextual if |
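The “Integers” row refers to a conversion that can fail. TOML integers are 64-bit signed values, so loading into a narrower or unsigned type needs a range check. The sketch below shows one way such a check might look; `checked_from_i64` is a hypothetical helper, not part of SVS.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Convert a stored 64-bit signed integer to `T`, throwing if the value does not
// fit. Hypothetical helper for illustration only.
template <typename T>
T checked_from_i64(std::int64_t value) {
    static_assert(std::is_integral_v<T>, "T must be an integer type");
    if constexpr (std::is_unsigned_v<T>) {
        // Negative values never fit; otherwise compare in the unsigned domain.
        if (value < 0 ||
            static_cast<std::uint64_t>(value) >
                static_cast<std::uint64_t>(std::numeric_limits<T>::max())) {
            throw std::range_error("lossy integer conversion");
        }
    } else {
        if (value < static_cast<std::int64_t>(std::numeric_limits<T>::min()) ||
            value > static_cast<std::int64_t>(std::numeric_limits<T>::max())) {
            throw std::range_error("lossy integer conversion");
        }
    }
    return static_cast<T>(value);
}
```

For instance, loading the stored value 256 as a `std::uint8_t`, or -1 as a `std::size_t`, throws instead of silently wrapping.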
Advanced Features
Load Helpers
Until now, it has been assumed that the class to be loaded implements a static load method.
However, this is not always the most convenient or least verbose option.
All of the load methods described so far can instead take an instance of a class as the first argument.
As long as this object has an appropriate load() method as described above, it can be used.
In fact, the return type is not constrained, so this “load helper” may be used to create any other class.
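A load helper can be pictured as any object with a suitable `load` member. The following standalone sketch uses hypothetical names (`FlatTable`, `load_with`, `SumLoader`) and a `std::map` in place of a real table; it only illustrates that the helper's return type is unrelated to the helper's own type.

```cpp
#include <cstdint>
#include <map>
#include <string>

using FlatTable = std::map<std::string, std::int64_t>;

// Generic entry point: call `load` on whatever helper object is supplied.
template <typename Helper>
auto load_with(const Helper& helper, const FlatTable& table) {
    return helper.load(table);
}

// A helper whose `load` builds a plain integer rather than an instance of the
// helper's own type. The helper can also carry state of its own.
struct SumLoader {
    std::int64_t offset;

    std::int64_t load(const FlatTable& table) const {
        return table.at("a") + table.at("b") + offset;
    }
};
```

Calling `load_with(SumLoader{5}, table)` on a table holding `a = 10` and `b = 20` yields the integer 35, not a `SumLoader`.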
Load Argument Forwarding
For all loading methods, an arbitrary number of trailing arguments can be appended to any call.
These arguments will be forwarded to the final load() method.
This allows run-time context that does not need to be saved, or that can change from run to run (for example, an allocator), to be supplied at load time.
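Argument forwarding of this kind is typically written with a variadic template and `std::forward`. The sketch below is a simplified illustration under that assumption; `SavedSize`, `Buffer`, and `load_forwarding` are hypothetical names and do not mirror the SVS signatures.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stored state: only the element count is "saved" in this toy example.
struct SavedSize {
    std::size_t size;
};

struct Buffer {
    std::vector<float> values;

    // `fill` is run-time context that is not part of the saved state.
    static Buffer load(const SavedSize& saved, float fill) {
        return Buffer{std::vector<float>(saved.size, fill)};
    }
};

// Forward any trailing arguments to the final `load` call unchanged.
template <typename T, typename... Args>
T load_forwarding(const SavedSize& saved, Args&&... args) {
    return T::load(saved, std::forward<Args>(args)...);
}
```

Here `load_forwarding<Buffer>(SavedSize{3}, 1.5f)` produces a buffer of three elements, each set to the fill value supplied at load time.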
Load and Save Override
It is occasionally useful to use a lambda to implement ad-hoc loading and saving of some sub-components of a larger class.
This can be done by passing the lambda to the svs::lib::SaveOverride and svs::lib::LoadOverride classes respectively and passing these to the various saving and loading methods.
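The idea of wrapping a lambda so that it can stand in for a loader can be sketched as follows. `ToyLoadOverride`, `Fields`, and `load_fields` are hypothetical names for this illustration; the actual interface of svs::lib::LoadOverride is not reproduced here.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using Fields = std::map<std::string, std::int64_t>;

// Hypothetical wrapper that turns any callable into an object exposing `load`.
template <typename F>
class ToyLoadOverride {
  public:
    explicit ToyLoadOverride(F f)
        : f_{std::move(f)} {}

    // Delegate to the wrapped callable.
    auto load(const Fields& fields) const { return f_(fields); }

  private:
    F f_;
};

// Loading code only requires a `load` member, so the wrapper fits in directly.
template <typename Loader>
auto load_fields(const Loader& loader, const Fields& fields) {
    return loader.load(fields);
}
```

A lambda computing, say, `width * height` from the table can then be passed as `ToyLoadOverride(lambda)` wherever a loader object is expected.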
Power-User Functionality
Saving and loading plumbing for a class T passes through the svs::lib::Saver and svs::lib::Loader proxy classes.
If T implements member save and load methods, then the default definition for these proxy classes will “do the right thing” and call those methods.
Alternatively, classes may choose to explicitly specialize these proxy classes.
See the documentation on those classes for details.
API Reference
-
class SaveContext
Public Functions
-
inline explicit SaveContext(std::filesystem::path directory, const Version &version = CURRENT_SAVE_VERSION)
Construct a new SaveContext in the current directory.
- Parameters:
directory – The directory where the data structure will be saved.
version – The saving version (leave at the default).
-
inline const std::filesystem::path &get_directory() const
Return the current directory where intermediate files will be saved.
-
inline std::filesystem::path generate_name(std::string_view prefix, std::string_view extension = "svs") const
Generate a unique filename in the saving directory.
Note that the returned std::filesystem::path is an absolute path to the saving directory and as such, should not be stored directly in any configuration table in order for the resulting saved object to be relocatable. Instead, use the .filename() member function to obtain a relative path to the saving directory.
- Parameters:
prefix – An identifiable prefix for the file.
extension – The desired file extension.
-
class LoadContext
Context used when loading aggregate objects.
Public Functions
-
inline const std::filesystem::path &get_directory() const
Return the current directory where intermediate files will be saved.
-
inline std::filesystem::path resolve(const std::filesystem::path &relative) const
Return the given relative path as a full path in the loading directory.
-
inline const Version &version() const
Return the current global loading version scheme.
Saving and loading should prefer to implement their own versioning instead of relying on the global version.
-
class SaveTable
Versioned table used when saving classes.
Public Functions
-
inline explicit SaveTable(std::string_view schema, const Version &version)
Construct an empty table with the given version.
-
inline explicit SaveTable(std::string_view key, const Version &version, std::initializer_list<toml::impl::table_init_pair> init)
Construct a table using an initializer list of key-value pairs.
Generally, values of the key-value pairs should be the return values from further calls to svs::lib::save().
-
template<typename T>
inline void insert(std::string_view key, T &&value)
Insert a new value into the table with the provided key.
The argument value should generally be obtained directly from a call to svs::lib::save().
-
inline bool contains(std::string_view key) const
Checks if the container contains an element with the specified key.