C++ modules with Clang

Jan 9, 2021 3:46:35 PM

Modules in C++ were described in this post, but there were no practical examples on how to compile the code contained in that article. This article will explain how to build C++ modules using Clang.

Clang was chosen because its support for C++20 modules comes by default and is mature enough at the time of writing this. Also, it works in the major platforms without requiring extra packages: Apple supports it since 2008. In 2019, Microsoft's added support for Clang in Visual Studio.

Note: GNU C++ Compiler would also be a candidate due to its popularity, but as of January/2020 it requires downloading a branch and non-Linux users would require extra effort to setup the environment.

Modules in Clang pre-dates the C++ standard due to its support for modern Objective-C. That language was used by Apple before they replaced it with Swift and it shares features with C++'s pre-compiled header.

In Clang's implementation of Objective-C, modules can be defined with a module map file. That uses a special syntax similar to what is used in older configuration files, like Apache's httpd. This is an example of how it would be possible to define a hierarchy of modules:

module M1 {
  header "M1.h"
  export *
}

This was intuitive for Objective-C developers because they can consume that map with a single import statement:

@import M1;
int main(int argc, char ** argv) {
  symbol_from_m1();
}

Note: Objective-C code looks similar to C++ in that they share a common requirement of being compatible with C. Those similarities are also present when dealing with Objective-C++ - which is meant to be compatible with C++.

Compilation of that unit requires extra flags to enable module support and to use the explicit module map. Assuming the map is called "M1.modulemap" and the source, "M1Main.m", then the command to compile it in an object form is:

clang M1Main.m \
  -fmodules -fmodule-map-file=M1.modulemap \
  -o M1Main.o -c

Note: The "clang" driver uses the same syntax and similar options as GCC: "-o" defines the output file name and "-c" makes the compiler stop before the link phase.

Module maps are widely used in Apple's version of Clang to reduce the effort of dealing with Objective-C frameworks. But, due to the architecture of Clang, the same map can be used in C++ by changing a single character in the source file:

import M1;
int main(int argc, char ** argv) {
  symbol_from_m1();
}

The compilation of that C++ file depends on the Clang version. Newer versions - such as 11 - are able to be as simple as Objective-C:

clang M1Main.cpp -std=c++2a \
  -fmodules -fmodule-map-file=M1.modulemap \
  -o M1Main.o -c

Note: "c++2a" was the codename of the standard before it was completed. Newer versions of Clang accept "c++20" as well. The old naming is used here for compatibility.

Older versions may not able to implicitly convert C++ headers in modules maps. This impacts automatic translation of legacy headers but the module infrastructure of Clang also work without explicit module maps. For a practical example, given a file named "M2.cpp" containing this module:

export module M2;
export void F2() {}

The source file can be built into a binary object like a normal C++ compilation unit:

clang M2.cpp -std=c++2a -o M2.o -c

Note: Flags for module map were dropped because the "-std=c++2a" implies modules and this example does not use module maps.

The result only contains binary code and a lookup table - effectively mapping symbols to code. At this stage, information about the module is lost. But that can be kept by slightly changing the command line:

clang M2.cpp -std=c++2a -o M2.pcm -c \
  -Xclang -emit-module-interface

This instructs Clang to emit the module interface as part of the object file. The resulting file is not standard anymore: only Clang can use the resulting PCM file.

The "-Xclang" pass the next argument to the Clang CC1 driver. This is the same as if calling CC1 directly:

clang -cc1 M2.cpp -std=c++2a -o M2.pcm -emit-module-interface

The lack of "-c" is intentional. The regular driver (without "-cc1") requires it to stop the compilation before the link phase. That holds true even when passing the extra "-emit-module-interface". Meanwhile, the CC1 driver understands the flag also implies stopping earlier.

Note: The equivalent of "-c" for CC1 is "-emit-obj"

Unfortunately, this step is not enough when the module is imported in another unit like this:

import M2;
int main() {
  F2();
}

Assuming this file was named as "M2Main.cpp", the following command will emit an error about "M2" not being found:

clang M2Main.cpp -std=c++2a -c -o M2Main.o

It is a known limitation of the C++20 standard: there is no rule about how to implicitly find modules. But it is possible to explicitly inform Clang where modules can be found.

The simplest solution would be to inform which PCM file contains the M2 module:

clang M2Main.cpp -std=c++2a -c -o M2Main.o \
  -fmodule-file=M2=M2.pcm

The "M2=M2.pcm" syntax maps the module "M2" to a file named "M2.pcm" inside the current directory. This also accepts relative and absolute paths and the file name itself can be different. This means a hypothetical "M2=./pcm/m2-mod.pcm" example would work, if "m2.cpp" is compiled to "pcm/m2-mod.pcm".

Apart from listing all dependencies via the command line, Clang also allows searching for pre-built modules in a specific path. Since previous examples built the PCM in the current directory, then the invocation with a pre-build path is:

clang M2Main.cpp -std=c++2a -c -o M2Main.o \
  -fprebuilt-module-path=.

This requires all modules to have their files named after their module name, but it is enough to allow compilation of module consumers without listing all dependencies manually.

Given both "M2.pcm" and "M2Main.o" were successfully compiled, then a final executable can be linked:

clang M2.pcm M2.o -o exe

A subtle aspect is how Clang can use PCMs as a replacement of object files. This works because that file contains both interface and implementation of a module. The internal representation is described in LLVM's website, but it can be described as the code converted into an abstract syntax tree (AST).

This approach of linking from the PCM brings optimisations similar to other known techniques.

It can merge code the same way as compilation of multiple C++ units in a single command-line - but the pre-compilation steps can be parallelised, which is not possible with a single compiler invocation.

It also allows Clang to perform compiler-time optimisations, which are the same used by regular C++ compilation, without relying on link-time optimisations (LTO), which requires a whole new category of algorithms.

Benefits so far may seem theoretical, but a simple benchmark can be done to gather some data. Assuming a module M3:

module;
#include <iostream>
export module M3;
export void F3() {
  std::cerr << "ok\n";
}

M3 includes <iostream>, which is a header known for its long processing time. Compilation is done with this command:

clang M3.cpp -std=c++2a -c -o M3.pcm \
  -O3 -Xclang -emit-module-interface

"-O3" was introduced to impose extra optimisation penalties - this brings the example closer to a real case scenario. When measured with the "time" utility, this command takes around 0.6 seconds.

Now, given a consumer like this:

import M3;
int main() {
  F3();
}

And its compilation with this command:

clang M3C.cpp -std=c++2a -c -o M3C.o \
  -O3 -fprebuilt-module-path=.

In the same machine, that takes 0.1s to run. This clearly shows "M3C" was not impacted by the cost of "iostream". Nonetheless, the compiled output inlined F3 inside the main function. Surprisingly, this link command works:

clang++ M3C.o -o exe

Note: clang++ was used since we require the standard C++ library from <iostream>.

This cannot be taken as a rule because the M3 example was completely inlined inside M3C. In a real project, some symbols will not be inlined. Therefore, this was the link command used for the benchmark:

clang++ -O3 M3.pcm M3C.o -o exe

That takes around 0.15s to finish. For comparison, the PCM can be turned into an object before being linked:

clang -O3 M3.pcm -c -o M3.o

This requires 0.1s. Linkage of that object file is as expected:

clang++ -O3 M3.o M3C.o -o exe

Taking 0.1s to complete. This means the second approach takes 0.2s. It is 50 milliseconds longer - which is a number small enough to be negligible in in this test. Nonetheless, this difference may be larger for a different scenario.

Just for completeness, the C++ unit can be converted from source to object, like mentioned before:

clang -O3 M3.cpp -c -o M3.o

But, as expected, this will take 0.6s - the same amount of time to compile from the source to PCM.

In conclusion, this article showed how to use C++ modules with Clang, and data for this simple test shows the main benefit of them: the pre-compilation of <iostream> inside M3 only impacts M3 itself. Consumers of it are only positively impacted: M3C takes 6x less to compile, when compared to M3, and it can inline usages of "std::cerr" with minimal penalty.

C++, C++20, Clang

C++ modules with Clang

Read On

Explaining C++20 Modules