Data Structure Saving and Loading

Attention

This section is outdated and needs to be rewritten!

This section describes the data structure saving and loading support and infrastructure. We use the toml++ <https://github.com/marzer/tomlplusplus> library to assist with object saving and reloading. Objects opt into this infrastructure by providing special save() and static load() member functions. The expected signatures and semantics of these functions are described in this section, with an API reference at the bottom.

Context Free Saving and Loading

Many classes to be saved are simple enough that they may be stored entirely inside a TOML table. We call these classes “context free” because their saving format does not depend on the directory in which an object is being saved. The example below demonstrates a simple class implementing context free loading and saving.

class ContextFreeSaveable {
    // Members
  private:
    int64_t a_;
    int64_t b_;

  public:
    ContextFreeSaveable(int64_t a, int64_t b)
        : a_{a}
        , b_{b} {}

    friend bool
    operator==(const ContextFreeSaveable&, const ContextFreeSaveable&) = default;

    // The version number used for saving and loading.
    // This can be used to detect and reload older versions of the data structure in a
    // backwards compatible way.
    static constexpr svs::lib::Version save_version = svs::lib::Version{0, 0, 1};

    // Serialized objects need a schema as well, which is essentially a unique name
    // associated with the serialized TOML table.
    //
    // The combination of schema and version gives speculative loading code some
    // guarantee as to the expected contents and types of a table.
    static constexpr std::string_view serialization_schema = "example_context_free";

    // Save the object.
    svs::lib::SaveTable save() const;
    // Load the object.
    static ContextFreeSaveable load(const svs::lib::ContextFreeLoadTable&);
};

There are several things to note. First, each class is expected to supply a named schema and a version (in the form of an svs::lib::Version) along with its serialized form. This enables classes to evolve while maintaining backwards compatibility with previously saved versions. Furthermore, the combination of schema and version enables reasoning about reloaded toml::table files, providing mechanisms like auto-loading and object detection.

Note

Once SVS matures, classes are expected not to make backwards-incompatible changes to their saved formats without incrementing the major version of the library!

Making a breaking change to a class's saved format will also break all classes that transitively use this class.

Next, the object returned from the save() method is a svs::lib::SaveTable, which in practice is a thin wrapper around a toml::table. The table should contain the relevant data required to reconstruct the object upon loading. Library facilities will take care of storing the version information.

Note

The toml::table class stores entries as key-value pairs. Keys beginning with two underscores “__” are reserved by the SVS saving infrastructure. Outside of that, classes are free to use whatever names they like.

Finally, loading is expected to take an svs::lib::ContextFreeLoadTable, which is also a thin wrapper around a toml::table. The table given to load will match that given by save, potentially with the addition of some reserved names (see the note above).

Implementing Save and Load

The implementation of save is given below.

// The `svs::lib::SaveTable` is a pair consisting of a `toml::table` and a version
// number.
svs::lib::SaveTable ContextFreeSaveable::save() const {
    return svs::lib::SaveTable(
        serialization_schema, save_version, {{"a", svs::lib::save(a_)}, SVS_LIST_SAVE_(b)}
    );
}

There is not much to it. Each member of ContextFreeSaveable is stored as a key-value pair in the svs::lib::SaveTable. The schema and version are passed as the first two arguments to the constructor of svs::lib::SaveTable, and the entries are passed as a std::initializer_list of key-value pairs. Keys are string-like and values should be obtained through calls to svs::lib::save(). The example shows two equivalent ways of calling svs::lib::save(). The first is a direct invocation that specifies the key name (“a”) and passes the member a_ to svs::lib::save(). The other uses the convenience macro SVS_LIST_SAVE_ to automatically derive the key from the member name.

Note

When using the SVS_LIST_SAVE_ and SVS_LOAD_MEMBER_AT_ helper macros that end in underscores, a trailing underscore will be automatically appended to the target variable name.
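The trailing-underscore convention can be pictured with ordinary preprocessor stringizing and token pasting. The macro below is a hypothetical illustration named `EXAMPLE_LIST_SAVE_`; it is not the SVS definition, and the `Point` class and its `entry()` method exist only for this sketch.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical macro (NOT the SVS definition): the key is the spelled name and the
// value is the member with a trailing underscore appended via token pasting.
#define EXAMPLE_LIST_SAVE_(name) std::pair<std::string, std::int64_t>(#name, name##_)

namespace sketch {

class Point {
    std::int64_t x_;

  public:
    explicit Point(std::int64_t x)
        : x_{x} {}

    // Expands to std::pair<std::string, std::int64_t>("x", x_).
    std::pair<std::string, std::int64_t> entry() const { return EXAMPLE_LIST_SAVE_(x); }
};

} // namespace sketch
```

Writing `EXAMPLE_LIST_SAVE_(x)` therefore names the key "x" while reading the member `x_`, which is the behavior the note describes for the underscore-suffixed helpers.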

Loading is also straightforward.

// Loading takes the items produced by `save()` and should yield an instance of the
// associated class.
ContextFreeSaveable ContextFreeSaveable::load(const svs::lib::ContextFreeLoadTable& table) {
    // Perform a version check.
    // This class is only compatible with one version.
    //
    // This check is also not needed as it is performed automatically by the loading
    // infrastructure.
    if (table.version() != save_version) {
        throw std::runtime_error("Version Mismatch!");
    }

    // Retrieve the saved values from the table.
    return ContextFreeSaveable(
        svs::lib::load_at<int64_t>(table, "a"), SVS_LOAD_MEMBER_AT_(table, b)
    );
}

The svs::lib::load_at method is used to extract the element from the table at a specific key. Alternatively, the macro SVS_LOAD_MEMBER_AT_ can be used to automatically determine the type of the object to load.

Although the example performs an explicit version check, it is redundant because one also happens behind the scenes. Specifically, if a class does not define a static method bool check_load_compatibility(std::string_view schema, svs::lib::Version version), the loading infrastructure will check the loaded schema and version against ContextFreeSaveable::serialization_schema and ContextFreeSaveable::save_version respectively. Later, we will show how to customize this compatibility check.

Using Save and Load

Saving and restoring an object to/from disk is easy.

    // Construct an object, save it to disk, and reload it.
    auto context_free = ContextFreeSaveable(10, 20);
    auto saved = svs::lib::save(context_free);
    auto context_free_reloaded =
        svs::lib::load<ContextFreeSaveable>(svs::lib::node_view(saved));

    // Check that saving and reloading was successful
    if (context_free != context_free_reloaded) {
        throw ANNEXCEPTION("Context free reloading in-memory failed!");
    }

    // We also get saving and reloading from disk for free.
    svs::lib::save_to_disk(context_free, dir);
    context_free_reloaded = svs::lib::load_from_disk<ContextFreeSaveable>(dir);

    if (context_free != context_free_reloaded) {
        throw ANNEXCEPTION("Context free reloading to-disk failed!");
    }

The example above shows constructing a ContextFreeSaveable, saving it to a serialized form using svs::lib::save, and reloading it with svs::lib::load. It further shows that we can save and reload the data structure to disk using svs::lib::save_to_disk and svs::lib::load_from_disk respectively. Note that a directory is required instead of a simple .toml file because in the next section, we will discuss contextual saving, which may require multiple files. By storing the saved object in a directory, we maintain the same API.

One advantage of context free saving is that we can save an entire object inside a TOML table. This allows usage like the following example, which can be used to construct more advanced object saving in testing and benchmarking pipelines.

    // Construct an object, save it to a table and reload.
    auto context_free = ContextFreeSaveable(10, 20);
    auto table = svs::lib::save_to_table(context_free);
    auto context_free_reloaded =
        svs::lib::load<ContextFreeSaveable>(svs::lib::node_view(table));

    if (context_free != context_free_reloaded) {
        throw ANNEXCEPTION("Context free reloading failed!");
    }

Contextual Saving and Loading

Context free saving and loading is great for small key-value-like data structures. However, larger data structures like datasets and graphs can carry significant binary state unsuitable for storage in a TOML file. Instead, it is preferable to store this state in one or more auxiliary binary files that are rediscovered from the TOML configuration when loading. These data structures are “contextual” because they require run-time context in the form of the directory being processed in addition to the TOML format.

The example class definition below shows a class that implements contextual saving and loading. The motivation for contextual loading is the existence of the std::vector<float> data_ member. If the size of this vector is large, saving it in a TOML file is space and time inefficient.

class Saveable {
  private:
    // We have a member that is also a saveable object.
    ContextFreeSaveable member_;
    // The `data_` member may be arbitrarily long and is thus not necessarily suitable
    // for storage in a `toml::table`.
    std::vector<float> data_;

  public:
    Saveable(ContextFreeSaveable member, std::vector<float> data)
        : member_{member}
        , data_{std::move(data)} {}

    friend bool operator==(const Saveable&, const Saveable&) = default;

    static constexpr svs::lib::Version save_version = svs::lib::Version{0, 0, 1};
    static constexpr std::string_view serialization_schema = "example_saveable";

    // Customized compatibility check.
    static bool
    check_load_compatibility(std::string_view schema, svs::lib::Version version) {
        // Backwards compatible with version `v0.0.0`.
        return schema == serialization_schema && version <= save_version;
    }

    // Contextual saving.
    svs::lib::SaveTable save(const svs::lib::SaveContext& ctx) const;
    // Contextual loading.
    static Saveable load(const svs::lib::LoadTable& table);
};

Objects implementing contextual saving and loading also have save() and load() methods, but with different signatures. The save() method takes an svs::lib::SaveContext, which provides a way of obtaining the saving directory as well as facilities to generate unique filenames to avoid name clashes. The load() method takes an svs::lib::LoadTable instead of an svs::lib::ContextFreeLoadTable, which gives it access to the svs::lib::LoadContext and therefore to the working directory when loading. Together, these classes facilitate the generation of saved objects in a relocatable manner.

Additionally, this example shows the definition of a check_load_compatibility method. This provides a way for the class to declare its compatibility with older serialization versions and will be called if provided.

Implementing Contextual Saving and Loading

The code snippet below shows the implementation of the contextual save method.

svs::lib::SaveTable Saveable::save(const svs::lib::SaveContext& ctx) const {
    // Generate a unique name for the file where we will save the associated binary
    // data.
    //
    // This filename will be unique in the directory generated for saving this object.
    auto fullpath = ctx.generate_name("data", "bin");

    {
        // Open the file and store the contents of the vector into that file in a dense
        // binary form.
        auto ostream = svs::lib::open_write(fullpath);
        svs::lib::write_binary(ostream, data_);
    }

    // Generate a table to save the object.
    auto table = svs::lib::SaveTable(serialization_schema, save_version);

    // Use `save` to save the sub-object into a sub-table.
    // Even though `ContextFreeSaveable` is context free, we can still pass the
    // context variable if desired.
    //
    // The library infrastructure will call the correct member function.
    SVS_INSERT_SAVE_(table, member, ctx);

    // Also store the size of the vector we're going to save.
    // Since integers of type `size_t` are not natively saveable in a
    // `toml::table`, we use the overload set `svs::lib::save` to safely convert it.
    table.insert("data_size", svs::lib::save(data_.size()));

    // Store only the relative portion of the path to make the saved object
    // relocatable.
    //
    // Again, we need to use `svs::lib::save` to convert `std::filesystem::path`
    // to a string-like type for the `toml::table`.
    table.insert("data_file", svs::lib::save(fullpath.filename()));
    return table;
}

To save the data_ member, the generate_name() method is used to generate a unique file name in the saving directory. The contents of the vector are then saved directly to this file. We need to find this file when reloading the data structure. However, the variable fullpath returned by generate_name() is an absolute path. When we store this filepath in the TOML table, we need to ensure that we only store the final filename. When reloading, the full path will be recreated using the svs::lib::LoadContext.
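The relocation idea relies only on standard `std::filesystem::path` operations. In the sketch below, `to_stored_name` and `resolve_in` are hypothetical free functions that stand in for the saving and loading steps; they are not SVS functions.

```cpp
#include <filesystem>

namespace sketch {

// Keep only the final path component so that the stored value does not depend on
// where the object happened to be saved.
inline std::filesystem::path to_stored_name(const std::filesystem::path& fullpath) {
    return fullpath.filename();
}

// Recreate a full path from the directory being loaded and the stored name.
inline std::filesystem::path resolve_in(
    const std::filesystem::path& load_directory, const std::filesystem::path& stored_name
) {
    return load_directory / stored_name;
}

} // namespace sketch
```

If the saved directory is later moved, the stored name is unchanged and `resolve_in` simply receives the new directory, so the binary file is still found.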

This example also demonstrates another important concept: recursive saving. The Saveable class has a ContextFreeSaveable member. To save the member, svs::lib::save is used, which will perform all the necessary steps to save that member class and return its generated TOML table, which can then be nested inside the Saveable’s TOML table.

Reloading is similar.

Saveable Saveable::load(const svs::lib::LoadTable& table) {
    // Obtain the file path and the size of the stored vector.
    auto full_path = table.resolve_at("data_file");

    // Provide compatibility with older versions where `old_data_size` was used instead
    // of `data_size`.
    size_t data_size = 0;
    if (table.version() == svs::lib::Version(0, 0, 0)) {
        data_size = svs::lib::load_at<size_t>(table, "old_data_size");
    } else {
        data_size = svs::lib::load_at<size_t>(table, "data_size");
    }

    // Allocate a sufficiently sized vector and read the binary data into it.
    auto data = std::vector<float>(data_size);
    {
        auto istream = svs::lib::open_read(full_path);
        svs::lib::read_binary(istream, data);
    }

    // Finish constructing the object by recursively loading the `member_` subobject.
    return Saveable(SVS_LOAD_MEMBER_AT_(table, member), std::move(data));
}

Here we see the directory obtained from the load context combined with the file name stored in the TOML table to recreate the full filepath for the saved binary data. The function svs::lib::load() is used to load the saveable subobject.
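The reason the element count is stored in the TOML table becomes clear when the binary file is written as raw bytes with no header. The following sketch uses only the standard library; `write_floats` and `read_floats` are hypothetical helpers, not the `svs::lib::write_binary` and `svs::lib::read_binary` functions used in the example.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <vector>

namespace sketch {

// Write the vector's elements as raw bytes, with no size header.
inline void
write_floats(const std::filesystem::path& path, const std::vector<float>& data) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        throw std::runtime_error("cannot open file for writing");
    }
    out.write(
        reinterpret_cast<const char*>(data.data()),
        static_cast<std::streamsize>(data.size() * sizeof(float))
    );
}

// Read exactly `count` floats. The count must come from elsewhere because the file
// itself does not record it.
inline std::vector<float>
read_floats(const std::filesystem::path& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open file for reading");
    }
    auto data = std::vector<float>(count);
    const auto bytes = static_cast<std::streamsize>(count * sizeof(float));
    in.read(reinterpret_cast<char*>(data.data()), bytes);
    if (in.gcount() != bytes) {
        throw std::runtime_error("file is shorter than expected");
    }
    return data;
}

} // namespace sketch
```

Writing a vector with `write_floats` and reading it back with `read_floats(path, original.size())` yields an equal vector, provided both steps run on machines with the same floating-point representation and byte order.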

End to end saving is shown below.

    // Initialize the data vector.
    auto data = std::vector<float>(100);
    std::iota(data.begin(), data.end(), 10);

    // Construct, save, and reload.
    auto context_required = Saveable(ContextFreeSaveable(20, 30), std::move(data));
    svs::lib::save_to_disk(context_required, dir);
    auto context_required_reloaded = svs::lib::load_from_disk<Saveable>(dir);

    // Ensure that the reloaded object equals the original.
    if (context_required != context_required_reloaded) {
        throw ANNEXCEPTION("Context required reloading failed!");
    }

General Guidelines

  • Prefer context-free loading and saving if possible. It is more flexible and allows for more uses than contextual saving.

  • Use svs::lib::save() and svs::lib::load() to save and reload saveable sub-objects.

STL Support

The saving and loading infrastructure has support for several built-in types, including std::vector. The example below demonstrates the use of std::vector.

    auto data = std::vector<Saveable>(
        {Saveable(ContextFreeSaveable(10, 20), {1, 2, 3}),
         Saveable(ContextFreeSaveable(30, 40), {4, 5, 6})}
    );

    svs::lib::save_to_disk(data, dir);
    auto reloaded = svs::lib::load_from_disk<std::vector<Saveable>>(dir);
    if (reloaded != data) {
        throw ANNEXCEPTION("Reloading vector failed!");
    }

The list of built-in types is:

STL Support

Type Class                            Notes
------------------------------------  ------------------------------------------------------------
Integers                              Will error if the conversion from TOML’s int64_t type is lossy.
Booleans
float, double                         Lossy conversion allowed for float to support literals like “1.2”.
std::string, std::filesystem::path
std::vector<T, Alloc>                 Can optionally take an allocator as the first non-context argument. Loading is contextual if T is contextual.

Advanced Features

Load Helpers

Until now, it has been assumed that the class to be loaded implements a static load method. However, this is not always the most convenient or least verbose approach. All of the load methods described so far can instead take an instance of a class as the first argument. As long as this object has an appropriate load() method as described above, it can be used. In fact, the return type is not constrained, so this “load helper” may be used to create an object of any other class.
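The load-helper pattern can be sketched independently of SVS. In the code below, `Table` is a hypothetical stand-in for a loaded table, `load` is a hypothetical entry point, and `ScaledPairLoader` is an invented helper; none of these are SVS names.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a loaded table: string keys mapped to integers.
using Table = std::map<std::string, std::int64_t>;

// Entry point taking a helper instance. Any object with a `load(const Table&)`
// member works, and the result type is whatever that member returns.
template <typename Helper> auto load(const Helper& helper, const Table& table) {
    return helper.load(table);
}

// A helper carrying its own state that builds a std::pair, a type that has no
// static load() of its own.
struct ScaledPairLoader {
    std::int64_t scale;

    std::pair<std::int64_t, std::int64_t> load(const Table& table) const {
        return {scale * table.at("a"), scale * table.at("b")};
    }
};

} // namespace sketch
```

Calling `sketch::load(sketch::ScaledPairLoader{2}, table)` returns a pair, showing that the helper's type and the produced type need not be related.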

Load Argument Forwarding

For all loading methods, an arbitrary number of trailing arguments can be appended to any call. These arguments will be forwarded to the final load() method. This allows run-time context that does not need to be saved or that can change from run to run (for example, an allocator) to be supplied at load time.
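Argument forwarding of this kind is typically written with a variadic template and `std::forward`. The sketch below assumes a hypothetical `Table` and `Buffer` type and is not the SVS implementation; it only shows how a trailing allocator argument could reach a class's load() method.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the information recovered from a saved table.
struct Table {
    std::size_t size;
};

// Forward any trailing arguments, unchanged, to T::load.
template <typename T, typename... Args> T load(const Table& table, Args&&... args) {
    return T::load(table, std::forward<Args>(args)...);
}

// A class whose load() accepts an allocator that is supplied at load time rather
// than being stored in the saved table.
template <typename Alloc = std::allocator<float>> struct Buffer {
    std::vector<float, Alloc> data;

    static Buffer load(const Table& table, const Alloc& alloc = Alloc()) {
        return Buffer{std::vector<float, Alloc>(table.size, alloc)};
    }
};

} // namespace sketch
```

Both `sketch::load<sketch::Buffer<>>(table)` and `sketch::load<sketch::Buffer<>>(table, std::allocator<float>())` compile, since the trailing argument is optional on the receiving side.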

Load and Save Override

It is occasionally useful to use a lambda to implement ad-hoc loading and saving of some sub-components of a larger class. This can be done by passing the lambda to the svs::lib::SaveOverride and svs::lib::LoadOverride classes respectively and passing these to the various saving and loading methods.
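One way to picture an override is as a small adapter that gives a lambda the load() member expected by the loading entry points. The `LoadOverride` class template below is a hypothetical sketch sharing only its name with the SVS class; its constructor, members, and the `Table` type are assumptions made for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a loaded table.
using Table = std::map<std::string, std::int64_t>;

// Hypothetical adapter (not the SVS class): wraps a callable so that it exposes
// the load(const Table&) member that the entry point below expects.
template <typename F> class LoadOverride {
    F f_;

  public:
    explicit LoadOverride(F f)
        : f_{std::move(f)} {}

    auto load(const Table& table) const { return f_(table); }
};

// Entry point that accepts any object with a load(const Table&) member.
template <typename Loader> auto load(const Loader& loader, const Table& table) {
    return loader.load(table);
}

} // namespace sketch
```

A call such as `sketch::load(sketch::LoadOverride([](const sketch::Table& t) { return t.at("a") + t.at("b"); }), table)` then runs the lambda in place of a class-provided load().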

Power-User Functionality

Saving and loading plumbing for a class T passes through svs::lib::Saver and svs::lib::Loader proxy classes. If T implements member save and load methods, then the default definition for these proxy classes will “do the right thing” and call those methods. Alternatively, classes may choose to explicitly specialize these classes.
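The proxy-class idea is a standard C++ customization pattern: a primary template that forwards to member functions, plus explicit specializations for types that cannot be modified. The `Saver` template, `Table` alias, and `Point` struct below are hypothetical and only share a name with the SVS proxies; the actual member names and signatures may differ.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a saved table.
using Table = std::map<std::string, std::int64_t>;

// Default proxy: forward to the class's own save() member.
template <typename T> struct Saver {
    static Table save(const T& x) { return x.save(); }
};

// Explicit specialization for a type that cannot be given member functions.
template <> struct Saver<std::pair<std::int64_t, std::int64_t>> {
    static Table save(const std::pair<std::int64_t, std::int64_t>& x) {
        return Table{{"first", x.first}, {"second", x.second}};
    }
};

// All saving goes through the proxy, so both kinds of type share one entry point.
template <typename T> Table save(const T& x) { return Saver<T>::save(x); }

// A class that opts in through a member function.
struct Point {
    std::int64_t x;

    Table save() const { return Table{{"x", x}}; }
};

} // namespace sketch
```

Here `sketch::save` works for `Point` through the default proxy and for `std::pair` through the specialization, without either type knowing about the other.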

See the documentation on those classes for details.

API Reference

class SaveContext

Public Functions

inline explicit SaveContext(std::filesystem::path directory, const Version &version = CURRENT_SAVE_VERSION)

Construct a new SaveContext in the given directory.

Parameters:
  • directory – The directory where the data structure will be saved.

  • version – The saving version (leave at the default).

inline const std::filesystem::path &get_directory() const

Return the current directory where intermediate files will be saved.

inline std::filesystem::path generate_name(std::string_view prefix, std::string_view extension = "svs") const

Generate a unique filename in the saving directory.

Note that the returned std::filesystem::path is an absolute path into the saving directory and, as such, should not be stored directly in any configuration table if the resulting saved object is to be relocatable.

Instead, use the .filename() member function to obtain a path relative to the saving directory.

Parameters:
  • prefix – An identifiable prefix for the file.

  • extension – The desired file extension.

class LoadContext

Context used when loading aggregate objects.

Public Functions

inline const std::filesystem::path &get_directory() const

Return the current directory from which intermediate files will be loaded.

inline std::filesystem::path resolve(const std::filesystem::path &relative) const

Return the given relative path as a full path in the loading directory.

inline const Version &version() const

Return the current global loading version scheme.

Saving and loading should prefer to implement their own versioning instead of relying on the global version.

class SaveTable

Versioned table used when saving classes.

Public Functions

inline explicit SaveTable(std::string_view schema, const Version &version)

Construct an empty table with the given schema and version.

inline explicit SaveTable(std::string_view key, const Version &version, std::initializer_list<toml::impl::table_init_pair> init)

Construct a table using an initializer list of key-value pairs.

Generally, values of the key-value pairs should be the return values from further calls to svs::lib::save().

template<typename T>
inline void insert(std::string_view key, T &&value)

Insert a new value into the table with the provided key.

The argument value should generally be obtained directly from a call to svs::lib::save().

inline bool contains(std::string_view key) const

Checks if the container contains an element with the specified key.