[CLN-list] Linking against cln fails when built with link-time-optimization
Alexey Sheplyakov
asheplyakov at yandex.ru
Sat Sep 26 21:26:22 CEST 2020
Hello,
22.09.2020, 03:53, "Atri Bhattacharya" <badshah400 at opensuse.org>:
> To follow up on this, we found that the problem was us building the cln
> library with "-fvisibility-inlines-hidden" in combination with LTO.
> Turning off the "-fvisibility-inlines-hidden" flag has fixed this issue
> (even with LTO enabled).
Not exactly. The problem is *us* breaking the standard (ISO C++98) for performance reasons.
Consider include/cln/rational.h
19
20 extern const cl_I denominator (const cl_RA& r);
21
and src/rational/cl_RA.h
149 inline const cl_I denominator (const cl_RA& r)
150 {
151 if (integerp(r))
152 return 1;
153 else
154 return TheRatio(r)->denominator;
155 }
Notice that the same function is declared non-inline in the public header, however
it's implemented as inline. That's because we don't want to expose too many
implementation details, and yet we want to inline that function in CLN itself.
With '-fvisibility-inlines-hidden' the compiler can assume that those inline versions
are not used outside of the CLN. Thus if LTO is enabled the compiler can find out all
call sites, inline the function eliminate the out-of-the-line copy. As a result
`cln::denominator` won't be exported from libcln.so
ISO C++ 98 standard demands (in 7.1.2.4) that "If a function with external linkage
is declared inline in one translation unit, it shall be declared inline in all translation
units in which it appears; no diagnostic is required." So our approach (having
an inline and non-inline function with the same name and argument(s)) violates
the standard and has caused quite a number of linking failures in the past.
The issue has been partially addressed by commit ce250e91fb8d16bc6b35be0add1896fc64f31ec1.
However there are still lots of non-conforming code, in particular cln::denominator.
In general the need to inline a function in some places and not in others came
from three situations:
1) Some functions, like cl_SF_zerop or cl_FF_zerop, are just a single
machine instruction when inlined. Putting their definitions into a public
header would expose too many internals of the library, leading violation
of abstraction and increased compilation times. Still, it would be nice
to use the inline version of these functions in the library itself.
2) Some functions, like cl_{SF,FF,DF,LF}_idecode, are usually only
invoked through a dispatcher cl_F_idecode that does nothing but dispatch
the call to the right function. Here inlining is used, regardless of
the size of the inlined functions, because it removes one function call
from the chain of function calls. A compiler cannot know that this
caller is the main caller for the 4 inlined functions (unless LTO is used)
3) Similarly, cl_I_from_NDS would be a bottleneck if not inlined: every
creation of a new cl_I goes through this function. A compiler cannot
know a priori the bottlenecks (unless LTO is used)
To solve the problem properly we should either
1) Stop playing tricks and use the LTO. The bug report being discussed kind of proves that
LTO actually works these days.
2) Introduce `denominator_inline` and `denominator` and use the former in CLN itself
(similarly to ce250e91fb8d16bc6b35be0add1896fc64f31ec1).
Best regards,
Alexey
More information about the CLN-list
mailing list