Distance Computation
Functor are used to implement and tailor distance computations in a number of ways.
The library uses functors (i.e., structs) for a number of reasons summarised below.
In general, the function svs::distance::compute()
is used to compute the similarity between queries on the left and dataset elements on the right.
Often, distance computation needs to perform some pre-processing per query inorder to execute efficiently. Using structs allows for this behavior.
If structs are empty, than the extra overhead introduced (e.g., space overhead) by them is negligible compared to function references or pointers.
Overloading operations for structs is easier than function references.
Using functors allows for fancier options like type-erasure to reduce generated code if desired.
In particular, bullet point (1) deserves a little explanation. An efficient implementation for similarity metric such as cosine-similarity may involve either computing the norm of the query (externally supplied vector) or normalizing the vector. Both cases require some local scratch space for either storing the computed norm or the normalized query (to maintain const-correctness of the externally supplied query). Using a distance functor allows this scratch space to exist. A single struct can then be re-used to process multiple queries in series or copied to process queries in parallel.
On the other hand, not all distance implementations (such as L-p distances or inner-product) may not require such state. This leads to the idea of argument fixing.
This is a test compute()
of cross referencing.
Argument Fixing
To support functors both requiring and not needing query preprocessing, the function svs::distance::maybe_fix_argument()
should be called on a query a
before any distance computations requiring a
as the left-hand argument.
Note that the right-hand argument b
can vary at will.
A distance function MyFunctor
may opt-in to argument fixing by defining the method fix_argument
as follows:
struct MyStruct {
// Boiler-plate
template<typename QueryType>
void fix_argument(const QueryType& query) { /*impl*/ }
};
The possible values of QueryType
should be restricted as appropriate.
API Documentation
-
template<typename T>
using compare_t = typename T::compare Obtain the type of the comparator used for the distance functor
T
.
-
template<typename F, typename A>
void maybe_fix_argument(F &f, A a) Perform an argument fixing of query
a
for the distance functorf
. This provides a hook mechanism allowing distance functors to preprocess the query before performing any distance computations if desired.This should be called before calling
svs::distance::compute
whereb
is allowed to vary buta
is not.Modifications to
a
require a new call todistance::maybe_fix_argument
.If distance functors wish to opt-in to argument fixing, the must define a member function
fix_argument(A)
which will be automatically called bydistance::maybe_fix_argument
.If distance functors do not define such a member function, then
distance::maybe_fix_argument
becomes a no-op unlessf
defines the static member variablein which case, missing overloads tostatic constexpr bool must_fix_argument = true;
f.fix_argument(A)
become compiler errors.
-
template<typename A, typename B, HasOneArgCompute<B> F>
float compute(F &f, A, B b) Perform a distance computation with the distance functor
f
, querya
and elementb
.Functors
f
can opt-in to this method in one of three ways:Direct overloading of
distance::compute
. This can be used to extend the behavior of existing functors without modification.If
f
does not implement argument fixing for this combination of types, thenf
may implement a member functionf.compute(a, b)
which will be called automatically.if
f
does implement argument fixing for this type combination, thenf
may implement a single argument member functionf.compute(b)
which will be called.
Argument fixing is implied if
f.fix_argument(a)
is valid. In order to apply to all distance functors,distance::maybe_fix_argument
should be called each time the value ofa
changes.
-
template<typename T>
compare_t<T> comparator(const T &x) Return a comparison functgor for distance functors
T
. For example, distance functors that wish to minimize distances may returnstd::less<>
while those those wishing to maximize may returnstd::greater<>
.Custom types can provide support for
distance::comparator
in one of three ways:Direct overloading of
distance::comparator
.If the desired comparison functor is default constructible, providing a type alias
T::compare
.Providing a member function
x.compare()
returning the functor.
When
T
defines both a member function and a type alias, the member function takes precedence.- Parameters:
x – The distance functor to obtain the comparison functor for.
- Returns:
A comparison functor
f
with an overloadedoperator()
such thatis valid.f(float, float) -> bool
-
template<typename T>
constexpr bool implicitly_broadcastable() Return whether distance functors of type
T
are implicitly broadcastable.To be implicitly broadcastable, must not require argument fixing. In other words, it must be safe to call
distance::compute
with varying left and right-hand arguments without needing to callmaybe_fix_argument
.To opt into implicit broadcasting, classes should implement a constexpr static boolean member
implicit_broadcast = true
.
-
template<typename T>
class BroadcastDistance - #include <svs/concepts/distance.h>
Efficiently create copies of the distance functor
f
of typeT
to allow batch distance computations involving multiple queries.Be default, this is done by invoking the copy constructor the necessary number of times. However, if explicit copies of the functor
f
are not needed for correctness (i.e.,distance::implicitly_broadcastable<T>() == true
, than these copies will not be made resulting in a slightly more efficient implementation.Invariants:
size()
cannot be zero. This is because distance functions requiring broadcasting are often stateful and may have some shared (usually read-only) state.If the underlying storage for this class ever becomes empty, we lose that storage.