Code-Analysis of the day: the potential of __builtin_expect()
Richard B. Kreckel
kreckel at thep.physik.uni-mainz.de
Wed Sep 19 19:22:30 CEST 2001
Just discovered the Linux kernel folks are starting to use this fancy new
feature of GCC-3.0 (and also the triple-cursed GCC-2.96) at some places.
It lets us give the compiler a hint which branch is expected to be taken.
I found that it indeed does increase performance, even on Intel where it
cannot do much more than reordering the code. The simplest example is
ex's dtor:
inline ex::~ex()
{
if (--bp->refcount==0)
delete bp;
}
which, using GCC-3.0.1, compiles to (CXXFLAGS="-O2"):
00000000 <GiNaC::ex::~ex()>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 8b 10 mov (%eax),%edx
b: 8b 42 10 mov 0x10(%edx),%eax
e: 48 dec %eax
f: 85 c0 test %eax,%eax
11: 89 42 10 mov %eax,0x10(%edx)
14: 74 0a je 20 <GiNaC::ex::~ex()+0x20>
16: 89 ec mov %ebp,%esp
18: 5d pop %ebp
19: c3 ret
1a: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
20: 85 d2 test %edx,%edx
22: 74 f2 je 16 <GiNaC::ex::~ex()+0x16>
24: 8b 02 mov (%edx),%eax
26: 83 ec 0c sub $0xc,%esp
29: 52 push %edx
2a: ff 50 0c call *0xc(%eax)
2d: 83 c4 10 add $0x10,%esp
30: eb e4 jmp 16 <GiNaC::ex::~ex()+0x16>
Using quite expressive macros(*) lifted from the kernel, we can try to
pessimise the code by saying that deletion is going to be likely (it
isn't):
inline ex::~ex()
{
if (likely(--bp->refcount==0))
delete bp;
}
Under the same compilerflags as above we now get the slightly shorter
section:
00000000 <GiNaC::ex::~ex()>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 8b 10 mov (%eax),%edx
b: 8b 42 10 mov 0x10(%edx),%eax
e: 48 dec %eax
f: 85 c0 test %eax,%eax
11: 89 42 10 mov %eax,0x10(%edx)
14: 75 10 jne 26 <GiNaC::ex::~ex()+0x26>
16: 85 d2 test %edx,%edx
18: 74 0c je 26 <GiNaC::ex::~ex()+0x26>
1a: 8b 02 mov (%edx),%eax
1c: 83 ec 0c sub $0xc,%esp
1f: 52 push %edx
20: ff 50 0c call *0xc(%eax)
23: 83 c4 10 add $0x10,%esp
26: 89 ec mov %ebp,%esp
28: 5d pop %ebp
29: c3 ret
However, as expected, this runs slower (approx 5%!) than the version
without the misleading compiler hint. It's quite interesting how the
blocks were reordered in this example. A good compiler hint would be:
inline ex::~ex()
{
if (unlikely(--bp->refcount==0))
delete bp;
}
This, however, compiles to the same code as the one without any hint.
Interesting. I tried to insert similar hints at several places where it
appears clear that I should know better than the compiler what is likely
going to happen next. It always turned out to produce the same code.
Why is the compiler is so good at making the right decision? I found
that I always was introducing unlikely(), never likely(). Is a branch
implicitly assumed as not taken???
Anyways, this stuff seems to be a bit premature. I just record it here in
case somebody wants to experiment with this stuff.
Regards
-richy.
(*) The exact definition was lifted from kernel-2.4.10-pre11:
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect((x), 0)
--
Richard B. Kreckel
<Richard.Kreckel at Uni-Mainz.DE>
<http://wwwthep.physik.uni-mainz.de/~kreckel/>
More information about the GiNaC-devel
mailing list