Fun with small bitsets
April 24, 2024
Small sets of small integers can often be packed into single integers by using the bits to mark presence or non-presence of elements. This can often be useful for compactly storing changing sets which can be processed simply when needed.
A place I’ve found this useful is in the register allocator/code generator of my compiler project.
The compiler (at time of writing) uses a simple linear scan register allocator. It doesn’t matter too much how this works, but in its current state, the allocator doesn’t take into account all of the requirements that might be needed in practice. For example
- x86_64 instructions don’t typically accept two memory operands. If one operand is a memory location, the other operand is typically a direct register. If the allocator spills both values in an operation to the stack, it’s useful to find a scratch register to be able to compute intermediate values. Ideally this doesn’t spill an existing value, so we can avoid a save/restore.
- The system V x86_64 ABI doesn’t guarantee the callee preserve all registers, so the caller needs to do the work of saving values which might be overwritten by a procedure call. There is no value in saving and restoring registers we’re not actually using, and no value in saving restoring registers which the callee must preserve, so ideally we’d avoid doing this.
As a simple improvement over the most pessimistic approach (assume we’re evicting a live register, or save and restore all the possible registers), I have my values in the code-generator tagged with the live set of registers at the beginning of the values lifetime.
x86_64 has 16 integer registers, so the set of “in use” integer registers can be stored in a single uint16_t
.
We can define callee preserved registers, and caller preserved registers as masks:
#define CALLEE_PRESERVED ((1U << RBP) | (1U << RBX) | (1U << R12) \
| (1U << R13) | (1U << R14) | (1U << R15))
#define CALLEE_VOLATILE (~CALLEE_PRESERVED)
This means if we need a scratch register for a value, and we know the live set live
encodes the currently used registers, we can query live & CALLEE_PRESERVED
for a scratch register,
int scratch = -1;
uint16_t available = live & CALLEE_PRESERVED;
if (available)
{
scratch = pick(available); /* Pick an item from the set */
}
else
{
scratch = temp_spill_register();
}
/* work with scratch happens here */
if (!((1 << scratch) & available))
{
restore(scratch);
}
Similarly, at the site of a procedure call, a return value is created, and this can be tagged with the live registers at the site of the call. This lets us evict precisely the registers which are in use at the call site, and which the callee might trample.
uint16_t preserve = live & CALLEE_VOLATILE;
for (int i = 0; i < 16; ++i)
{
if ((1 << i) & preserve)
{
evict(i);
}
}
/* Do call here */
for (int i = 15; i >= 0; --i)
{
if ((1 << i) & preserve)
{
restore(i);
}
}
While I aim to make the register allocator more sophisticated with time (allowing values to move, for example), this is still a useful way to capture and encode constraints across sets of registers, and I expect it can be used in other problems.