I belatedly came across the news regarding warp, a C and C++ preprocessor written by Walter Bright in a joint project with Facebook. This has been released under the Boost license and is available at https://github.com/facebook/warp.
The Facebook blog makes the following statement with regard to improved efficiency over the GCC preprocessor:
"Replacing gcc’s preprocessor with warp has led to significant improvements of our end-to-end build times (including linking). Depending on a variety of circumstances, we measured debug build speed improvements ranging from 10% all the way to 40%, all in complex projects with massive codebases and many dependencies."
The AMA on Reddit and the comments on Hacker News have some interesting commentary on the performance claims. As always with performance measurement and comparison, this is a difficult field to navigate.
I am reminded of a recent bottleneck I experienced in the build process of a medium-sized project using GCC for an ARM target. The investigation identified the major culprit as the preprocessing stage; specifically, the Search Path for header files was very long. A long Search Path is not something I would have immediately suspected as the bottleneck, but the other lines of investigation all pointed strongly to it.
The makefile for the project was auto-generated by a proprietary tool, and the Search Path with it. For whatever reason the tool naively included every directory under the project root, which also pulled in all the CVS metadata folders! (The use of CVS is a different discussion.)
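To illustrate why the length of the Search Path matters, here is a minimal sketch of an in-order header lookup. It is a simplified model and not GCC's actual implementation (a real preprocessor may cache lookups); the function names `resolve_include` and `count_probes` and the `unused_N` directory names are hypothetical. The idea is only that every directory listed ahead of the one holding the header costs a failed filesystem probe, and that cost is paid again for each `#include`.

```python
import os
import tempfile


def resolve_include(header, search_path):
    """Look for `header` in each directory in order.

    Returns (path or None, number of directories probed).
    Simplified model: one filesystem probe per directory tried.
    """
    probes = 0
    for directory in search_path:
        probes += 1
        candidate = os.path.join(directory, header)
        if os.path.isfile(candidate):
            return candidate, probes
    return None, probes


def count_probes(extra_dirs):
    """Build a search path with `extra_dirs` irrelevant directories
    ahead of the one that really holds example.h, then resolve it."""
    with tempfile.TemporaryDirectory() as root:
        search_path = []
        for i in range(extra_dirs):
            unused = os.path.join(root, f"unused_{i}")  # hypothetical name
            os.mkdir(unused)
            search_path.append(unused)
        real = os.path.join(root, "include")
        os.mkdir(real)
        with open(os.path.join(real, "example.h"), "w") as handle:
            handle.write("int example(void);\n")
        search_path.append(real)
        _, probes = resolve_include("example.h", search_path)
        return probes


if __name__ == "__main__":
    for extra in (0, 10, 500):
        print(extra, "extra directories ->", count_probes(extra), "probes")
```

In this model the number of probes grows linearly with the number of irrelevant directories, regardless of how simple the included file is.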
Assumptions about performance are meaningless without quantitative values to back them up, so I went about gathering data for three configurations:
1. The complete auto-generated makefile
2. A semi-tailored makefile with all CVS metadata folders removed from the Search Path: a low-hanging-fruit check that the hypothesis had merit before moving on to 3
3. A tailored makefile with only the required directories in the Search Path: more time-consuming to produce, but on a real project the Search Path has a high fixed cost and a low marginal cost
A makefile is no good without something to compile, so I chose the simplest file combination available: an example.h/example.c pair that contained a function declaration and a simple 8-line function respectively. Between stages 1 and 3 of the testing the build time improved by a factor of 100, which was rather unexpected. When I expanded the testing to the project proper the improvement dropped, but using the tailored makefile was still at least an order of magnitude faster.
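For anyone wanting to repeat a comparison like this, the following is a rough sketch of a timing harness, not the setup I actually used. It assumes a compiler driver that accepts GCC-style `-E` (preprocess only) and `-I` (add a directory to the Search Path) options; the default name `gcc` is an assumption, and a cross compiler for an ARM target would be named differently. It times only the preprocessing stage, not a full makefile build, and the helper names are hypothetical.

```python
import os
import shutil
import subprocess
import time


def preprocess_command(source, include_dirs, compiler="gcc"):
    """Build a preprocess-only command with one -I option per directory."""
    command = [compiler, "-E"]
    for directory in include_dirs:
        command += ["-I", directory]
    # Discard the preprocessed output; only the elapsed time is of interest.
    command += [source, "-o", os.devnull]
    return command


def time_preprocess(source, include_dirs, compiler="gcc", repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs,
    or None if the compiler is not installed."""
    if shutil.which(compiler) is None:
        return None
    command = preprocess_command(source, include_dirs, compiler)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `time_preprocess("example.c", dirs)` once with the full directory list and once with the tailored list gives two figures to compare. Taking the best of several runs reduces noise from filesystem caching, although the first cold run may be the more representative figure for a clean build.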
I’m curious as to the benefit of auto-generating makefiles in the manner that the tool does. I am aware of various approaches, such as CMake, that bring additional benefits, but this one did not. Personally I’ve not found the maintenance of a makefile too onerous; as some would say, there is only one Makefile in existence.
It would be interesting to perform the same tests with the various preprocessors: warp, Clang and a more up-to-date version of GCC. I’m especially curious whether they suffer the same apparent slowdown from a long Search Path, which to me seems independent of the file complexity.