constexpr All the Things! (but gently)


This content originally appeared on Level Up Coding - Medium and was authored by Aviv Avitan

The art of transmuting metals into gold is notoriously well known. But what secrets lie inside C++ runtime transmutation?⁰

Photo by Kevin Ku on Unsplash

Introduction

C++11 brought C++ into the modern age. Those were very confusing times: “rvalue” references could be declared as named variables, decltype and auto seemed (deceptively) to be the same thing, and execution at compile time seemed like a magical feature.

constexpr is a modifier that marks functions and variables as candidates for compile-time evaluation. For functions, similar to inline, it is just a hint: compile-time evaluation is not guaranteed for every call.

The rules are simple: write a constexpr-friendly function, and any call to it whose arguments are constant expressions can be replaced in the binary with the computed result, at compile time.
Otherwise, it is called just like any other normal function would be, at runtime.
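As a minimal sketch of that rule (the names square, atCompileTime, and atRunTime are mine, not from the article), the very same function lands in either world depending on the call site:

```cpp
// A constexpr function may run at either time; it is decided per call site.
constexpr int square(int x) { return x * x; }

// Constant-expression argument: forced compile-time evaluation,
// the constant 36 is baked directly into the binary.
constexpr int atCompileTime = square(6);
static_assert(atCompileTime == 36, "evaluated by the compiler");

// Runtime argument: the very same function is called normally.
int atRunTime(int n) { return square(n); }
```

The static_assert only compiles because square(6) is evaluated during compilation; square(n) with a runtime n is an ordinary call.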

OK… Is That Really New?

Well, this concept is not entirely new. Even in C, compilers commonly performed some form of constant propagation to optimize functions and eliminate calls or redundant code. But there are some subtle differences:

  • A constexpr declaration is explicit, made by the developer; optimizations are not. Detecting whether some piece of code was actually optimized is not always easy.
  • Constant propagation occurs only when compiler optimizations are enabled, but constexpr will “optimize” code even without any optimization flags.
  • constexpr execution actually evaluates the code; constant propagation does not.
  • Even if every call to a function is optimized away, a non-constexpr function remains in the assembly and increases binary size.

(Note: needless to say, constant propagation is just a small part of a compiler's optimization arsenal.)

But What About Template Metaprogramming?

Beyond constant propagation, template metaprogramming (TMP) is another method for deferring runtime computations to compile time. TMP has its good practices, but for the average C++ developer it is not an intuitive tool. It is easy to abuse, and in any case, writing a single piece of code that runs both at compile time and at runtime is impossible with TMP alone.

Unlike with TMP, once a function is declared constexpr, the evaluation time depends on the call site. And it is easier to debug: just remove the constexpr modifier and recompile¹.

So far it sounds pretty amazing.

Nothing could go wrong, right? (Image from makememe)

The Test Case Scenario

My experiments with constexpr began after doing an exercise known as “Happy Number”. In this exercise, a developer has to determine whether an input number is happy or not. They need to take a number, break it into its digits, and sum their squares, e.g., 123 becomes 1² + 2² + 3² = 14. This process is repeated on the result until it either converges to 1 (happy) or diverges into an endless loop (not happy).
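A plain runtime sketch of the exercise might look like this (the names digitSquareSum and isHappy are my own; the article's original listing is not reproduced in this copy):

```cpp
#include <set>

// One step: sum of squared digits, e.g. 123 -> 1*1 + 2*2 + 3*3 = 14.
unsigned digitSquareSum(unsigned n) {
    unsigned sum = 0;
    for (; n != 0; n /= 10) sum += (n % 10) * (n % 10);
    return sum;
}

// A number is happy if iterating digitSquareSum reaches 1; a std::set of
// previously seen values detects the endless loop of unhappy numbers.
bool isHappy(unsigned n) {
    std::set<unsigned> seen;
    while (n != 1 && seen.insert(n).second)
        n = digitSquareSum(n);
    return n == 1;
}
```

For example, 19 → 82 → 68 → 100 → 1 is happy, while 2 falls into the well-known 4 → 16 → 37 → … → 20 → 4 cycle and is not.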

This algorithm tests the happiness of every number from 1 up to the maximum possible sum of squared digits of a 4-byte unsigned integer (at most 9² × 10 = 810, since a 32-bit value has at most 10 digits). Once this range is accounted for, the happiness of any input number can be returned in O(1).

Read the last paragraph again: it really stands out that preprocessing can lead to an O(1) solution.

Compile and Run Time Worlds … Unite!

At this point, any real developer would use the initialized cache (isHappyArray) as… a cache… for subsequent calls.
I skipped that part.
I assure you, dear readers, that I am a real developer, but I wanted to make it a constexpr and never worry about it at runtime again.
Why not? My code is compiled together with my testing module, which is full of constant expressions. Therefore, once compilation finishes, the program is in fact validated.

But this code has a problem: set<int> cannot be used in a constexpr context. The set container has no constexpr constructor/destructor to support allocating and deallocating memory at compile time.

Let’s start with the good news: unlike with constant propagation, the compiler notifies you whether your code is capable of running at compile time.

The bad news: very few elements in the standard library support constexpr context. (more bad news: allocated compile-time memory won’t propagate to runtime)

In this attempt, we observe that the principle of “write once, execute anywhere” is far from perfect. Even today, running complex code that uses smart data structures might not be possible at all. This is partially fixed: the standard now allows allocating variable-sized C-arrays with the new operator in a constexpr context (supported in, e.g., GCC 10). Yet most variable-sized containers are still not there.

Let Us Compromise

In our case, though, we can sacrifice some readability and efficiency to gain the constexpr benefits. The set<int> can be replaced with a std::array of size 15.
Finding 15 was a simple matter of trial-and-error binary search.
However, the fact that I had to find the maximal number of iterations for each happy-number candidate at all is a downside of not having variable-sized containers.

After the modification, the set is replaced with a fixed-size array:
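A hedged sketch of the modification described above, assuming the article's 15-slot bound (requires C++17; the names are mine, not the original listing):

```cpp
#include <array>
#include <cstddef>

// Sum of squared digits; constexpr-friendly, no allocation needed.
constexpr unsigned digitSquareSum(unsigned n) {
    unsigned sum = 0;
    for (; n != 0; n /= 10) sum += (n % 10) * (n % 10);
    return sum;
}

// std::set replaced with a fixed-size std::array: per the article, 15
// slots are enough to hold every value seen before the sequence either
// reaches 1 or revisits an earlier value (the unhappy cycle).
constexpr bool isHappy(unsigned n) {
    std::array<unsigned, 15> seen{};
    for (std::size_t i = 0; i < seen.size(); ++i) {
        if (n == 1) return true;
        for (std::size_t j = 0; j < i; ++j)     // linear scan replaces set lookup
            if (seen[j] == n) return false;     // cycle detected: not happy
        seen[i] = n;
        n = digitSquareSum(n);
    }
    return n == 1;
}

static_assert(isHappy(19));    // evaluated by the constant evaluator
static_assert(!isHappy(2));
```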

This algorithm is now less efficient: finding a previously handled number got more expensive. Instead of querying a set, we now traverse a small array. Luckily, the traversal is bounded by a constant 15 iterations at most.
At runtime it is still relatively fast, and it can still be evaluated during compilation!

Unfortunately, that’s exactly where our new concerns are: compilation time.

Image by Author

Nothing Left to an Undefined Chance

Before we understand the issues with compile time, we need to address an open issue.

We still need to understand how C++ executes code during compilation. The compiler doesn't fork into child processes at any phase of compilation, and that wouldn't even be an option to begin with: consider what would happen under cross-compilation, where the binary being produced cannot run on the build machine at all.

The conclusion is that the compiler itself doesn't evaluate the constexpr calls: that job belongs to a different module called the constant evaluator².

Any code evaluated at compile time must be guaranteed to produce the same result as it would at runtime. I cannot tell whether it is possible to align the expected behavior of both modules across the subset of the C++ specification the compiler implements. Even if it is possible, the specification does not cover the behavior of UB (undefined behavior).
By definition, compiler implementers may do as they see fit in those cases. The same compiler may even behave differently on UB in different compilation environments (flags, platform). To me, aligning the constant evaluator with every flavor of UB seems just too tricky to promise.

This, at least, is the reasoning I can offer for why the standard dictates that a constant expression must not contain UB:

(5.7) — an operation that would have undefined behavior … [Note: including, for example, signed integer overflow (7.2), certain pointer arithmetic (7.6.6), division by zero (7.6.5), or certain shift operations (7.6.7) ]
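A small sketch of how this plays out in practice: the constant evaluator accepts a well-defined call, but must reject one whose evaluation would overflow a signed integer (the function name addOne is my own):

```cpp
#include <limits>

// Signed overflow is UB at runtime; inside a constant expression the
// constant evaluator must detect it and reject the program instead.
constexpr int addOne(int x) { return x + 1; }

constexpr int ok = addOne(41);   // fine: 42, computed at compile time
static_assert(ok == 42);

// If uncommented, the next line fails to compile on conforming compilers:
// the evaluator flags the overflow rather than silently wrapping the way
// many runtime builds do.
// constexpr int bad = addOne(std::numeric_limits<int>::max());
```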

With that hopefully cleared up, the next consequence of running in the constant evaluator is inevitable: compile-time evaluation is slower than runtime execution (just as language interpreters are much slower than compiled code).

But by what degree³?

The Time and Space-Time Tradeoff Theorem

Back to our example: I fancied a program that could be compiled and tested in the same step, so that a compiled program really is a validated program.

The program takes an integer and returns whether it is happy or not. To test our code, we need to check as many numbers as possible against ground-truth results.

It is just a question of quantity: how many tests can we delegate to compile time?

To answer this question, the build times (in seconds) of the program were measured as a function of the number of inputs processed at compile time.

Compare those build times with the runtime execution over 5000 numbers (plus writing the results to standard output): under 500 ms.

Compile-time evaluation can be very, very time-consuming. The takeaway is that the decision to defer code to compilation should be based on:

  1. The number of CPU cycles wasted in that code.
  2. How frequently this piece of code is called.
  3. The whole program's lifespan. As with memory leaks, it doesn't pay to be efficient in short-lived programs.

Note that longer compilation times are not free.
They benefit runtime, but they may tax your organization in other aspects: resources, longer testing cycles, an overall poorer developer experience,

and increased rate of coffee consumption in the office kitchen (credits xkcd)

Now that the downsides of constexpr are clear, we can state what the good practices of constexpr are (in this author's opinion).

So, What Is It Good For?

  1. Applying “smart” constant propagation without enabling code optimizations, and eliminating heavy constant calculations from runtime (nothing new)
  2. Optimizing code that benefits from memoization
  3. Serving as a basic quality gate: you may not be able to migrate all your module tests to compile time, but you should consider including very basic sanity tests if the extra cost is negligible⁴

What can be better?

Allowing Constexpr Complex Initialization

Recall that in the “HappyNumber” scenario, the “cached” results array was recreated from scratch for each input number. This was done to demonstrate the speed difference between compile-time and runtime computation. Unfortunately, there is no simple way to reuse this cache across calls at compile time.

You might expect such code to look much like what we had before, only with isHappyArray declared static and the isHappyArray[MAX_HAPPY_RESULT] cell used as a simple initialization flag.
But such code cannot be compiled: a constexpr context disallows the use of static variables. Moreover, a constexpr variable is actually const, and after its declaration it cannot be modified.
The only possible initialization is pre-calculating the cache and hardcoding it into a variable. Not a very friendly solution.
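For what it's worth, one way to approximate that "pre-calculate into a variable" workaround in C++17 is to build the entire table inside a constexpr function and bind the result to a constexpr variable. The names and the 810/15 bounds below are assumptions based on the description above, not the article's actual code:

```cpp
#include <array>

// Assumed bound: the digit-square sum of any 32-bit value is at most
// 9*9*10 = 810, so a table over 0..810 covers every reduced input.
constexpr unsigned MAX_HAPPY_RESULT = 810;

constexpr unsigned digitSquareSum(unsigned n) {
    unsigned sum = 0;
    for (; n != 0; n /= 10) sum += (n % 10) * (n % 10);
    return sum;
}

// 15 iterations (the article's empirical bound) either settle on 1
// (happy: 1 maps to itself) or stay trapped in the unhappy cycle.
constexpr bool isHappySlow(unsigned n) {
    for (int i = 0; i < 15; ++i) n = digitSquareSum(n);
    return n == 1;
}

// The whole cache, computed once by the constant evaluator and baked
// into the binary's data segment; no runtime initialization needed.
constexpr auto buildTable() {
    std::array<bool, MAX_HAPPY_RESULT + 1> t{};
    for (unsigned n = 1; n <= MAX_HAPPY_RESULT; ++n) t[n] = isHappySlow(n);
    return t;
}
constexpr auto isHappyTable = buildTable();

// Runtime query: one reduction step, then an O(1) table lookup.
bool isHappy(unsigned n) { return isHappyTable[digitSquareSum(n)]; }
```

This sidesteps the static-variable restriction entirely, at the cost of hardwiring the table's size and iteration bound into the code.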

Developer Control of Memory Propagation From Compile-Time to Runtime

Even if reusing the cache (as suggested in the previous section) were possible at compile time, the runtime would still need to run the computation once to create its own cache. The runtime cannot use anything other than primitives from compile time. Allowing developers to be explicit about what should be stored in the data segment at compile time could be a very handy tool (though it may lead to bloated compilation units).

Photo by Timothy Meinberg on Unsplash

Thank you very much for reading. Feel free to leave your comments and share constexpr experiences!

Resources / Footnotes

[0] The title is based on the famous Ben Deane & Jason Turner talk at CppCon 2017. Another geeky side note and spoiler: FMA's law of equivalent exchange does not apply to runtime-to-compile-time transmutation, as this article demonstrates.

[1] Well, to be honest, this is more of a tooling issue.

[2] Borrowed terminology from “Design and evolution of constexpr in C++”
[https://pvs-studio.com/en/blog/posts/cpp/0909/].
Recommended material for further reading.

[3] This may not be true: interpreters can be relatively efficient when executing problems that benefit from memoization. In the given HappyNumber example, no such optimization is possible.
In any case, well-written runtime code that employs memoization will be several times faster.

[4] I have worked on products whose final application is made by linking hundreds of static libraries together. In those cases, the overall amount of work needed to be done to test a feature, something really simple, might take hours instead of seconds. If the static libraries are small enough compile-time tests may even replace most unit tests.
Secondly, consider a case of adding a very simple functionality (e.g., preventing overflow or enforcing max/min values) that can be only tested in an edge case that hardly reproduces. Instead of forcing the runtime into replicating the behavior to some extent, you may just code your function dependencies in a few lines of code.





