By Jelle Hellings.
Posted at 20 October 2024, 6:51.
Last updated at 22 October 2024, 1:21.
Consider the set
of
values. In this article, we solve the following combinatorial
problem: given a value
,
,
find two functions to_position
and
from_position
such that
to_position
maps each subset
holding exactly
values
()
to a unique index in the range
with
the total number of distinct subsets of
holding exactly
values; and
from_position
maps indices in the range
back to the subset
holding exactly
with from_position(to_position(T)) == T
.
The construction of highly-efficient versions of both
to_position
and from_position
boils down to
solving a combinatoric puzzle.
First, in SectionΒ 1, we will provide a high-level
sketch of my use-case for the functions to_position
and
from_position
. Next, in SectionΒ 2, we detail
the necessary mathematical background to solve our combinatoric puzzle.
Finally, in SectionΒ 3 we present our solution.
Consider a costly function complex_task
that takes as
input
optional values of which exactly
,
,
need to be provided. Based on only which
input values are provided (e.g., the first, third, and fifth value),
function complex_task
will first compute a helper
object helper
that will detail how to deal with these
specific
input values. Next, function complex_task
will use
helper
together with the provided input values to compute
the result of the function. In the following code fragment, we sketch
the function complex_task
:
auto complex_task(std::span<std::optional<value_t>, N> values)
{
/* Check which @{M} values are present in @{values}. After this loop,
* @{indices[i]} is true if the @{i}-th input in @{values} is available. */
std::bitset<N> indices = {};
for (size_t i = 0; i < N; ++i) {
indices.set(i, values[i].has_value());
}
assert(indices.count() == M);
/* Compute a helper object to deal with the provided input values. */
auto helper = compute_helper(indices);
/* Solve the actual problem using this helper. */
return solve_problem(values, helper);
}
In this function, we represent the values that are provided by a
bitset indices
in which the
-th
bit is set if the
-th
input value is provided. As one can see in the sketch, the helper object
helper
is computed using only the information in
the bitset indices
that details which input values
in values
are present.
The function complex_task
will be used very frequently
(thousands of times per second). Hence, we are interested in optimizing
this function. Inspection of the function shows that the computation
performed by compute_helper
, the computation of the helper
object helper
, is rather taxing. Luckily,
helper
is only dependent on which input values are
providedβas represented by indices
βand not on the
content of these input values. Hence, to optimize
complex_task
, we can pre-compute all possible helper
objects once and use these pre-computed values.
As an initial attempt toward optimizing complex_task
, we
can try something like the following:
using mapping_t = std::unordered_map<std::bitset<N>, helper_t>;
const mapping_t pre_computed_helpers = compute_helpers();
auto complex_task(std::span<std::optional<value_t>, N> values)
{
/* Check which @{M} values are present in @{values}. After this loop,
* @{indices[i]} is true if the @{i}-th input in @{values} is available. */
std::bitset<N> indices = {};
for (size_t i = 0; i < N; ++i) {
indices.set(i, values[i].has_value());
}
assert(indices.count() == M);
/* Solve the actual problem using this helper. */
return solve_problem(values, pre_computed_helpers.at(indices));
}
In the above, compute_helpers
is a function that
computes a mapping between possible values for indices
and
the corresponding helper objects. The following sketch illustrates how
we can implement compute_helpers
if we can
enumerate all possible
-bit
bitsets.
auto compute_helpers()
{
mapping_t result;
for (/* each possible @{N}-bit bitset indices */) {
if (indices.count() == M) {
result.emplace(indices, compute_helper(indices));
}
}
return result;
}
To complete compute_helpers
, we will provide an initial
approach toward enumerating over all
-bit
bitsets. In C++, bitsets can be constructed out of
unsigned long long
integers. Assume, for example, that
unsigned long long
is a 32-bit value. In this case, each
unsigned long long
value is a sequence
of 32 bits that represents the value
.
For example, the value
has the binary representation
and, indeed,
.
In this representation, all
-bit
bitsets,
,
are represented by the values
.
In addition, using the binary representation of unsigned values to our
advantage, we can compute
by setting the bit
using a bit-shift 1ull << N
. Hence, for
sufficiently-small values of
,
we can complete compute_helpers
as follows:
auto compute_helpers()
{
static_assert(N < std::numeric_limits<unsigned long long>::digits,
"N not sufficiently small!");
mapping_t result;
for (unsigned long long v = 0; v < (1ull << N); ++v) {
std::bitset<N> indices(v);
if (indices.count() == M) {
result.emplace(indices, compute_helper(indices));
}
}
return result;
}
We note that the above program is not necessary very fast:
compute_helpers
will check
distinct values. Hence, the worst-case runtime complexity of
compute_helpers
is at-least
.
For example, with
,
compute_helpers
will already check
distinct values v
, which is more than a billion values!
Assuming
is not too large, the complexity of compute_helpers
is not
necessarily a huge issue, however: we only need to call
compute_helpers
once. Still, later in this article
we will develop a much more efficient version of
compute_helpers
.
The optimization outlined above will reduces the cost of
complex_task
significantly. We can do better, however:
std::unordered_map
requires a considerable amount of
storage overhead to store our helper objects (namely the entire hash
table structure used by std::map
internally); requires
multiple memory operations to find a specific helper object (namely to
look up the hash bucket and to search in that bucket); and requires
dynamic memory allocation, thereby ruling out the capability to
pre-compute pre_computed_helpers
at compile time via a
constexpr function. Similar drawbacks apply if we would switch
to std::map
(a binary search tree).
Instead of using standard dictionaries such as std::map
or std::unordered_map
with their inherent drawbacks, we are
looking for a perfect solution. Assume
pre_computed_helpers
will end up holding
key-value pairs. A perfect representation of
pre_computed_helpers
should aim for the following three
properties:
Store the minimal amount of data we can potentially store:
exactly
helper objects. To do so, we want to represent
pre_computed_helpers
by an array holding exactly
helper objects (without storing any other data).
Perform the minimal number of memory operations to look up a
helper object for a given bitset indices
. To do so, we want
a function that can turn bitset indices
into the position
p
such that pre_computed_helpers[p]
is the
helper obect corresponding to indices
.
A highly-efficient function to construct
pre_computed_helpers
that only considers
distinct bitsets (instead of the
bitsets considered by compute_helpers
above). In addition,
pre_computed_helpers
is preferably constructed at compile
time if the helper objects support compile time construction.
For the second and third property, we will use the functions
to_position
and from_position
we described in
the introduction. The construction of these functions requires an
elementary background in combinatorics, which we present next.
Consider a set of distinct values. First, we consider the number of distinct sequences one can create out of set . When creating a single such sequence (permutation), we have options for the first value in the sequence. After choosing the first value, we have options left for the second value in the sequence. Likewise, after choosing the second value, we have options left. We continue until we end up with a single option, which becomes the last value in our sequence. Hence, we can make distinct sequences of length out of the set . These distinct sequences are typically called the permutations of . As an example, consider the four values . We have a total of permutations:
Next, we consider the number distinct sequences of length , , we can create out of . The construction of such a sequence of length is similar to the construction of a permutation: we repeatedly choose one of the values from until the sequence has values, after which we still have values left in . Hence, we can make distinct sequences of length out of the set . These distinct sequences are typically called the -permutations of . As an example, consider the four values . We have a total of two-permutations of : Note that is equivalent to the first terms of : the remaining terms of are . Hence,
Lastly, we consider the number of distinct subsets we can create out of set holding values, . In sets, we do not consider the order of values. Hence, the number of subsets of values is not equal to the number of -permutations .
Now consider an -permutation of set . This permutation holds a set of values. As is a set of values, we can construct permutations of . Each of these permutations holds the same values as and, hence, represents the same set of values. Hence, we can make distinct subsets out of set . These distinct subsets are typically called the -combinations of . As an example, consider the four values . We have a total of two-combinations of :
Our functions to_position
and from_position
will map between
-bit
bitsets indices
and positions in
with
the number of distinct values for indices
. Note that each
of these
-bit
bitsets represents an
-combination
of the values
.
Hence, we have
and our functions will have the following declarations:
/*
* Return a unique position for the @{M}-combination of values 0, ..., @{N} - 1
* represented by @{indices}. Requires @{indices.count() == M}.
*/
constexpr std::size_t to_position(std::bitset<N> indices);
/*
* Return the unique @{M}-combination of values 0, ..., @{N} - 1 represented by
* the position @{pos}. Requires @{0 <= pos < x} with @{x} the number of
* @{M}-combinations of a set of @{N} values.
*/
constexpr std::bitset<N> from_position(std::size_t pos);
First, we need to decide which position we wish to assign to
a given bitset indices
. We will simply assign an ordering
on all possible bitsets and assign positions with respect to this order.
In specific, we will use the following ordering: a bitset
comes-before bitset
if
and the largest value in
is smaller than the largest value in
.
For example,
comes-before
(because
is smaller than
)
and
comes-before
(because
is smaller than
).
to_position
First, we look at the function to_position
. We start
with a recursive version to_position_rec
with the
following declaration:
/*
* Assume that @{m <= n <= N} and @{m <= M}. Return a unique position for the
* m-combination of values 0, ..., @{n} - 1 represented by @{indices}. Requires
* @{indices.count() == m}.
*/
std::size_t to_position_rec(std::bitset<N> indices, unsigned n, unsigned m);
One calls this recursive function initially with and . The base case for this recursive function is the case . In this case, : there is only one -combination out of a set of values. Hence, in the base case, we can assign the position .
Next, we consider the recursive cases. In the recursive case, we
only look at whether the value
is represented by the bitset indices
. We distinguish two
cases:
The value
is in indices
.
As is the largest value we are considering, the position of any bitset that holds value must come after the position of all bitsets that do not hold value . The bitsets that do not hold value represent -combinations over the set of values. Hence, we have a total of such bitsets.
As such, we start the position of all bitsets that hold
value
at
.
Next, to further determine the position of the bitset represented by
indices
, we have to determine a suitable position
relative to
for the remaining
values in indices
. These remaining
values are taken from the set
of
values. Hence, the remaining
values in indices
represent an
()-combination
of the values
.
As such, we can use a recursive call
to_position_rec(rem, n - 1, m - 1)
, with rem
the bitset representing the remaining
values in indices
, to compute the suitable position that we
will use relative to
.
The value
is not in indices
.
As argued in the previous case, the bitsets that do not hold value
represent
-combinations
over the set
of
values and we have a total of
such bitsets. As the previous case ensures that all bitsets that do
hold value
start at position
,
we can simply compute a position in this case by interpreting
indices
as an
-combination
over the set
.
As such, we can use a recursive call
to_position_rec(indices, n - 1, m)
to compute a suitable
position for this bitset.
The above recursive construction leads to the following implementation:
/*
* Assume that @{m <= n <= N} and @{m <= M}. Return a unique position for the
* @{m}-combination of values 0, ..., @{n} - 1 represented by @{indices}.
* Requires @{indices.count() == m}.
*/
constexpr std::size_t to_position_rec(std::bitset<N> indices,
unsigned n, unsigned m)
{
assert(m <= n && n <= N && m <= M && M <= N);
assert(indices.count() == m);
if (m == n) {
/* Base case. */
return 0;
}
else {
/* Recursive cases. */
if (indices.test(n - 1)) {
auto rem = indices;
rem.set(n - 1, false);
auto p_base = combinations(n - 1, m);
return p_base + to_position_rec(rem, n - 1, m - 1);
}
else {
return to_position_rec(indices, n - 1, m);
}
}
}
The above function requires a function combinations
to
compute
.
Based on the definition given in SectionΒ 2, we
derive
/*
* Compute the number of subsets of size @{m} of a set of @{n} values, according
* to the definition of @{m}-combinations.
*
* This version only works for small values of @{n} and @{m}, as otherwise the
* computation will produce values that do not fit in the type @{std::size_t}.
* Later we will look at a version of this function that can better deal with
* large inputs.
*/
constexpr std::size_t combinations(unsigned n, unsigned m)
{
assert(m <= n);
const auto max_v = std::numeric_limits<std::size_t>::max();
std::size_t numerator = 1;
std::size_t denominator = 1;
while (m >= 1) {
assert(numerator <= (max_v / n));
numerator *= n--;
denominator *= m--;
}
return numerator / denominator;
}
We can simplify the above function to_position_rec
in
two ways. First, we notice that there is no need to construct
rem
: we inspect each value only once. Second, we can easily
replace the recursion by a simple loop. By doing so, we obtain:
constexpr std::size_t to_position(std::bitset<N> indices)
{
assert(M <= N);
assert(indices.count() == M);
auto n = N;
auto m = M;
std::size_t pos = 0;
while (m != n) {
if (indices.test(n - 1)) {
pos += combinations(n - 1, m);
--n;
--m;
}
else {
--n;
}
}
return pos;
}
The complexity of this function is easy to determine: we inspect all
bits once and we call combinations
up-to-
times. The complexity of the call combinations(n - 1, m)
is
entirely determined by the parameter m
: the function
combinations
performs m
steps in its main
loop. The
up-to-
calls to combinations
use values
.
Hence, the total complexity of this version of to_position
is
.
We can further reduce the cost of to_position
to
by eliminating the repeated calls to combinations
. We can
do so by maintaining the value
during the algorithm. Assume we have a variable n_choose_m
holding the value of
.
We need to update n_choose_m
in two cases:
The value
is in indices
.
In this case, the if-case in the loop decreases the value of both and by one. Hence, we need to compute from . We have Hence, we have
In addition, in this case we need the value . We have Hence, we have and we conclude
The value
is not in indices
.
In this case, the else-case in the loop decreases the value of by one. Hence, we need to compute from . In the previous case, we already showed that:
We apply the above strategy to maintain
to to_position
and we obtain:
constexpr std::size_t to_position(std::bitset<N> indices)
{
assert(M <= N);
assert(indices.count() == M);
auto n = N;
auto m = M;
std::size_t pos = 0;
std::size_t n_choose_m = combinations(n, m);
while (m != n) {
auto nd_choose_m = (n_choose_m * (n - m)) / n;
auto nd_choose_md = (n_choose_m * m) / n;
if (indices.test(n - 1)) {
pos += nd_choose_m;
n_choose_m = nd_choose_md;
--n;
--m;
}
else {
n_choose_m = nd_choose_m;
--n;
}
}
return pos;
}
This improved and final version of to_position
has a
complexity of only
.
from_position
Next, we look at the function from_position
. The design
of this function mirrors the design of to_position
. In
specific, given a position pos
, we will reconstruct the
bitset indices
by trying to add values starting at the
largest value
and ending with
.
Consider we try to add the maximum value
.
We again distinguish two cases:
The value
must be added to indices
.
By the construction detailed the previous section, we must have
.
The remaining position
is the position of the
()-combination
over the set
of all
values less-than
that must be added to indices
.
The value
must not be added to indices
.
By the construction detailed the previous section, we must have
.
The position pos
is the position of the
-combination
over the set
of all
values less-than
that must be added to indices
.
We apply the above strategy and we obtain:
constexpr std::bitset<N> from_position(std::size_t pos)
{
assert(pos < combinations(N, M));
auto n = N;
auto m = M;
std::bitset<N> indices;
std::size_t n_choose_m = combinations(n, m);
while (m != 0) {
auto nd_choose_m = (n_choose_m * (n - m)) / n;
auto nd_choose_md = (n_choose_m * m) / n;
if (nd_choose_m <= pos) {
pos -= nd_choose_m;
indices.set(n - 1, true);
n_choose_m = nd_choose_md;
--n;
--m;
}
else {
n_choose_m = nd_choose_m;
--n;
}
}
return indices;
}
This only and final version of from_position
has a
complexity of only
.
compute_helpers
The original compute_helpers
of SectionΒ 1 inspects all
subsets of
for a worst-case running time of at-least
.
For any non-trivial value of
,
this will take unreasonably long. Luckily, we can use the tools we just
developed to implement compute_helpers
more
efficiently.
We observe that we must compute
helper objects. These helper objects will correspond with positions
.
For each of these positions, we can compute the corresponding bitset
using from_position
. We obtain:
constexpr auto compute_helpers()
{
constexpr auto x = combinations(N, M);
std::array<helper_t, x> result;
for (auto pos = 0; pos < x; ++pos) {
result[pos] = compute_helper(from_position(pos));
}
return result;
}
The complexity of this version of compute_helpers
is
which, for most values of
,
is significantly better than
.
The function combinations
of SectionΒ 3.1 is very limited: the computation of
numerator
and denominator
will quickly
overflow and not fit in a fixed-size unsigned integer. Indeed,
upon completion the value denominator
will hold
.
For
,
we already have
,
which would not fit in a 32-bit unsigned integer, and for
,
we have
,
which would not fit in a 64-bit unsigned integer. The value
numerator
is even larger than the value
denominator
. Still, there are plenty of cases in which
is a small value, while the function of SectionΒ 3.1 will fail to compute it. For example,
,
which easily fits in a 32-bit unsigned integer, but cannot be computed
by combinations
even if std::size_t
is a
64-bit unsigned integer type.
To see how we can improve combinations
, we consider the
computation of
.
By the definition of
,
we have
Assume we are computing the numerator
and denominator
of this fraction using 32-bit unsigned integers. Hence, the maximum
value
and
can hold is
.
With this maximum, we can only compute the first six terms of
,
as we have
.
After computing the first six terms of
and
,
we have
and
and we have
We can simplify the above fraction by reducing the values and . Remember, we can simplify a any fraction by removing common terms from their numerator and denominator. For example, we can simplify the fraction to by dividing both the numerator and the denominator , which are both dividable by two, by two. Likewise, we can further simplify by dividing both the numerator and the denominator by three and seven. We cannot simplify further, however, as the and do not have a common divisor.
We can apply the same simplification to the computation of
:
we simplify the term
by dividing both
and
by a large common divisor. Note that in this case
and
are guaranteed to have a large common divisor: as both
and
are the multiplication of six consecutive integers, at-least
three of these consecutive integers are divisible by two, two
are divisible by three, and one is divisible by five.
Hence, we can at-least use the common divisor
.
This is not the largest common divisor of
and
,
however: the largest possible divisor is known as the greatest
common divisor, which can be computed using Euclidβs Algorithm (or
any of the other well-known greatest common divisor algorithms).
Conveniently, C++ already provides an algorithm to compute the greatest
common divisor via std::gcd
.
As it turns out, the greatest common divisor of and is . Hence, after dividing both and by , we can continue the computation with a few more terms without overflowing either or .
Note that the computation of involves the multiplication of terms. We can reduce this to only ten terms using the definition of . Let . We have and, by the definition of , we have Hence, , the computation of which only involves ten terms.
Applying the above two ideas will lead to the following improved
variant of combinations
:
/*
* Compute the number of subsets of size @{m} of a set of n values, according to
* the definition of @{m}-combinations. Return @{0} when the combination cannot
* be represented by a fixed-sized value.
*/
constexpr std::size_t combinations(unsigned n, unsigned m)
{
/* Internally, we use the largest unsigned integer type availabe to reduce
* the number of overflows of @{numerator}. */
const auto max_v = std::numeric_limits<std::uintmax_t>::max();
std::uintmax_t numerator = 1;
std::uintmax_t denominator = 1;
/* Minimize the number of terms we will compute. */
m = std::min(m, n - m);
while (m >= 1) {
/* Check whether multiplying would overflow @{numerator}. */
if ((max_v / n) < numerator) {
/* Multiplying the @{numerator} by @{n} would overflow. We remove
* the common divisor @{cd} from @{numerator} and @{denominator}. */
auto cd = std::gcd(numerator, denominator);
numerator /= cd;
denominator /= cd;
/* Check whether removing the common divisor prevents overflow. */
if ((max_v / n) < numerator) {
return 0;
}
}
numerator *= n--;
denominator *= m--;
}
auto result = numerator / denominator;
/* Check whether the result fits in the output type. */
const auto max_size_v = std::numeric_limits<std::size_t>::max();
if (max_size_v < result) {
return 0;
}
else {
return result;
}
}
We note that the resulting function combinations
still
cannot compute all possible values
:
these values can easily grow very large. For example,
does not fit in a 32-bit unsigned integer and
does not fit in a 64-bit unsigned integer. To be able to compute and
represent
for arbitrary large values of
and
,
one will require a custom arbitrary large integer
representation.
Below we provide the full solution in a single program. In this full
solution, we have made a single change: we made the maximum value for
a template parameter and we made the values
and
function arguments. By doing so, to_position
and
from_position
can be used with arbitrary values of
and
.
To facilitate the usage of this solution, I release all code on this
page under the 2-Clause BSD License:
/*
* Copyright (c) 2024 Jelle Hellings
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
/* First, the functions combinations, to_position, and from_position. */
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <bitset>
#include <limits>
#include <numeric>
/* By default, we only support up-to-64-bit bitsets. */
constexpr std::size_t default_max_N = 64;
/*
* Compute the number of subsets of size @{m} of a set of @{n} values, according
* to the definition of @{m}-combinations. Return @{0} when the combination
* cannot be represented by a fixed-sized value.
*/
constexpr std::size_t combinations(unsigned n, unsigned m)
{
/* Internally, we use the largest unsigned integer type availabe to reduce
* the number of overflows of @{numerator}. */
const auto max_v = std::numeric_limits<std::uintmax_t>::max();
std::uintmax_t numerator = 1;
std::uintmax_t denominator = 1;
/* Minimize the number of terms we will compute. */
m = std::min(m, n - m);
while (m >= 1) {
/* Check whether multiplying would overflow @{numerator}. */
if ((max_v / n) < numerator) {
/* Multiplying the @{numerator} by @{n} would overflow. We remove
* the common divisor @{cd} from @{numerator} and @{denominator}. */
auto cd = std::gcd(numerator, denominator);
numerator /= cd;
denominator /= cd;
/* Check whether removing the common divisor prevents overflow. */
if ((max_v / n) < numerator) {
return 0;
}
}
numerator *= n--;
denominator *= m--;
}
auto result = numerator / denominator;
/* Check whether the result fits in the output type. */
const auto max_size_v = std::numeric_limits<std::size_t>::max();
if (max_size_v < result) {
return 0;
}
else {
return result;
}
}
/*
* Return a unique position for the @{m}-combination of values 0, ..., @{n} - 1
* represented by @{indices}. Requires @{indices.count() == m}.
*/
template<std::size_t MaxN = default_max_N>
constexpr std::size_t to_position(std::bitset<MaxN> indices,
unsigned n, unsigned m)
{
assert(n <= MaxN);
assert(m <= n);
assert(indices.count() == m);
std::size_t pos = 0;
std::size_t n_choose_m = combinations(n, m);
assert(n_choose_m != 0);
while (m != n) {
auto nd_choose_m = (n_choose_m * (n - m)) / n;
auto nd_choose_md = (n_choose_m * m) / n;
if (indices.test(n - 1)) {
pos += nd_choose_m;
n_choose_m = nd_choose_md;
--n;
--m;
}
else {
n_choose_m = nd_choose_m;
--n;
}
}
return pos;
}
/*
* Return the unique @{m}-combination of values 0, ..., @{n} - 1 represented by
* the position @{pos}. Requires @{0 <= pos < x} with x the number of
* @{m}-combinations of a set of @{n} values.
*/
template<std::size_t MaxN = default_max_N>
constexpr std::bitset<MaxN> from_position(std::size_t pos,
unsigned n, unsigned m)
{
assert(n <= MaxN);
assert(pos < combinations(n, m));
std::bitset<MaxN> indices;
std::size_t n_choose_m = combinations(n, m);
assert(n_choose_m != 0);
while (m != 0) {
auto nd_choose_m = (n_choose_m * (n - m)) / n;
auto nd_choose_md = (n_choose_m * m) / n;
if (nd_choose_m <= pos) {
pos -= nd_choose_m;
indices.set(n - 1, true);
n_choose_m = nd_choose_md;
--n;
--m;
}
else {
n_choose_m = nd_choose_m;
--n;
}
}
return indices;
}
/* Next, the the demonstration application involving complex_task. */
#include <array>
#include <optional>
#include <span>
#include <string>
/* A dummy value type, just as an example. */
using value_t = std::string;
/* A dummy helper type, just as an example. This helper can store an internal
* representation of a bitset. */
using helper_t = unsigned long long;
/*
* A dummy function to construct a helper for indices @{indices}.
*/
template<std::size_t N>
constexpr helper_t compute_helper(std::bitset<N> indices)
{
/* Return an internal representation of the indices @{indices}. */
return indices.to_ullong();
}
/*
* The dummy problem being solved.
*/
template<std::size_t N>
auto solve_problem(std::span<std::optional<value_t>, N> values,
const helper_t& helper)
{
/* Simply return the helper, which represent the bitset @{indices}. */
return helper;
}
/*
* Pre-compute the helpers used by @{complex_task}. Note that here the template
* parameters @{N} and @{M} are required as the size of the resulting array
* is a function of these parameters.
*/
template<std::size_t N, std::size_t M>
constexpr auto compute_helpers()
{
constexpr auto x = combinations(N, M);
std::array<helper_t, x> result;
for (auto pos = 0; pos < x; ++pos) {
result[pos] = compute_helper(from_position(pos, N, M));
}
return result;
}
/* The pre-computed helpers used by @{complex_task}. */
template<std::size_t N, std::size_t M>
constexpr auto pre_computed_helpers = compute_helpers<N, M>();
/*
* The complex task that takes @{N} optional values, of which @{M} must be
* provided.
*/
template<std::size_t N, std::size_t M>
requires (N != std::dynamic_extent)
auto complex_task(std::span<std::optional<value_t>, N> values)
{
/* Check which @{M} values are present in @{values}. After this loop,
* @{indices[i]} is true if the @{i}-th input in @{values} is available. */
std::bitset<N> indices = {};
for (size_t i = 0; i < N; ++i) {
indices.set(i, values[i].has_value());
}
assert(indices.count() == M);
/* Compute a helper object to deal with the provided input values. */
auto helper = pre_computed_helpers<N, M>[to_position(indices, N, M)];
/* Solve the actual problem using this helper. */
return solve_problem(values, helper);
}
/* Finally, a main entry point demonstrating this library. */
#include <iostream>
int main(int argc, char* argv[])
{
/* First, lets illustrate that the library works. */
const std::size_t N = 7;
const std::size_t M = 3;
auto x = combinations(N, M);
/* We enumerate each M-combination of 0, 1, ..., 6. */
std::cout << "There are " << x <<
" 3-combination of 0, 1, ..., 6."
" These combinations are:" << std::endl;
for (auto pos = 0; pos < x; ++pos) {
auto indices = from_position(pos, N, M);
std::cout << " * the " << pos << "-th combination is: ";
auto delim = "{";
for (auto index = 0; index < N; ++index) {
if (indices.test(index)) {
std::cout << delim << index;
delim = ", ";
}
}
std::cout << "}, ";
std::cout << " (position: "
<< to_position(indices, N, M)
<< ")" << std::endl;
}
std::cout << std::endl;
/* Now lets see if @{complex_task} works. */
std::array<std::optional<value_t>, N> vs = {
std::nullopt,
"second",
std::nullopt,
"third",
"fourth",
std::nullopt,
std::nullopt
};
/* The dummy solve_problem simply returns a representation of the bitset
* representing the values in the above array @{vs}. Hence, the below should
* print the 7-bit value @{0011010}. */
std::cout << std::bitset<N>{complex_task<N, M>(vs)} << std::endl;
}
Join the discussion