Macros are evil (tm)
Richard B. Kreckel
kreckel at thep.physik.uni-mainz.de
Fri Jun 15 17:22:40 CEST 2001
Hi all,
This is gonna be long. If you just want to know what you soon need to
do to your source files and are not interested in why, please jump to
the end of this email.
The usual way of implementing polymorphic methods accepting ex
arguments in GiNaC involves checking the arguments for their types.
This results in switch-like statements of the form
ex mul::somemethod(const ex & other)
{
if (is_ex_of_type(add)) {
// do this...
} else if (is_ex_of_type(mul)) {
// do that...
} else {
// do something else...
}
}
Okay, language lawyers (*) usually construe those switch statements as
bad. But since C++ does not have generic multiple dispatch the usual
answer is that they offer some variation of home-grown double
dispatch, maybe by bloating the add and mul classes with overloaded
methods like this:
class basic {
// ...
virtual ex somemethod(const add & other);
virtual ex somemethod(const mul & other);
ex somemethod(const ex & other);
};
// same for add and mul
mul::somemethod(const ex & other)
{
return ex.bp.somemethod(*this);
}
So the proper implementation is called by two subsequent lookups in
the virtual function table. Nothing prohibits us from doing it this
way in GiNaC but for the objects we are dealing with in a CAS so far
we have always chosen the switch way of implementing and found this
much more accessible and readable. Just consider the various print
functions that recently were changed to accept an object of type
print_context or derived to format the output properly. The different
ways need to be implemented somewhere and why not deal with all of
them in foo::print().
What is this guy ranting about?!? Who cares?
Cool, if you don't care. The only thing that really bothered me so
far is the use of macros at this place. The definition of
is_of_type(obj, type) by a macro
1) lives outside the namespace and may collide some time,
2) accepts all kind of funny arguments with barely and possibility for
compile-time checking,
3) does not allow overloading, so there is an is_of_type and another
is_ex_of_type with the same semantics and
4) is generally crap with respect to readabilty and makes steam come
out of the ears of language lawyers.
A better approach would be to use a template here. We could write
is_a<numeric>(foo) where foo is either something derived from basic or
an ex and it will produce the expected outcome. There will of course
also be an is_exactly_a<tensor>(bar) matching only tensors and not
classes derived from tensor. We can implement it in exactly the same
way as the macros were implmented, for instance like this:
template <class T> inline bool is_a(const basic & obj)
{
return dynamic_cast<const T *>(&obj)!=0;
}
template <class T> inline bool is_a(const ex & obj)
{
return is_a<T>(*obj.bp);
}
The only cause for concern is
template <class T> inline bool is_exactly_a(const class basic & obj)
{
const T foo; return foo.tinfo()==obj.tinfo();
}
because it has to allocate a temporary. But this is not a big deal,
since we are allowed to specify template specializations, for instance
in file add.h we would write down
template<> inline bool is_exactly_a<add>(const basic & obj)
{
return obj.tinfo()==TINFO_add;
}
giving us all the performance back. We all know that "An Inline
Function is As Fast As a Macro" (an actual section title in GCC's
info page). So we should do it. Now.
Surprise! The inliner in GCC-2.95 does some very poor job at flow
analysis inside if statements when inlined functions return some
boolean (or integer, no matter). Consider this code:
struct ABC {
virtual ~ABC() {}
};
struct A : public ABC {};
template <class T> inline bool is_a<T>(const ABC & obj)
{
return (dynamic_cast<const T *>(&obj)!=0);
}
#define is_of_type(OBJ,TYPE) \
(dynamic_cast<const TYPE *>(&OBJ)!=0)
#ifdef USE_MACRO
int test(const ABC & e)
{
if (is_of_type(e,A))
return 1;
return 0;
}
#else
int test(const ABC & e)
{
if (is_a<A>(e))
return 1;
return 0;
}
#endif
The compiler generates in the case where USE_MACRO is defined at
preprocessing level:
00000000 <test(ABC const &)>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 85 c0 test %eax,%eax
b: 74 28 je 35 <test(ABC const &)+0x35>
d: 83 c4 f8 add $0xfffffff8,%esp
10: 50 push %eax
11: 68 00 00 00 00 push $0x0
12: R_386_32 ABC type_info function
16: 8b 10 mov (%eax),%edx
18: 03 02 add (%edx),%eax
1a: 50 push %eax
1b: 6a 01 push $0x1
1d: 68 00 00 00 00 push $0x0
1e: R_386_32 A type_info function
22: ff 72 04 pushl 0x4(%edx)
25: e8 fc ff ff ff call 26 <test(ABC const &)+0x26>
26: R_386_PC32 __dynamic_cast
2a: 85 c0 test %eax,%eax
2c: 74 07 je 35 <test(ABC const &)+0x35>
2e: b8 01 00 00 00 mov $0x1,%eax
33: eb 02 jmp 37 <test(ABC const &)+0x37>
35: 31 c0 xor %eax,%eax
37: c9 leave
38: c3 ret
while in the inline case it generates this entirely contorted code:
00000000 <test(ABC const &)>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 85 c0 test %eax,%eax
b: 74 24 je 31 <test(ABC const &)+0x31>
d: 83 c4 f8 add $0xfffffff8,%esp
10: 50 push %eax
11: 68 00 00 00 00 push $0x0
12: R_386_32 ABC type_info function
16: 8b 10 mov (%eax),%edx
18: 03 02 add (%edx),%eax
1a: 50 push %eax
1b: 6a 01 push $0x1
1d: 68 00 00 00 00 push $0x0
1e: R_386_32 A type_info function
22: ff 72 04 pushl 0x4(%edx)
25: e8 fc ff ff ff call 26 <test(ABC const &)+0x26>
26: R_386_PC32 __dynamic_cast
2a: 85 c0 test %eax,%eax
2c: 0f 95 c0 setne %al
2f: eb 02 jmp 33 <test(ABC const &)+0x33>
31: b0 00 mov $0x0,%al
33: 84 c0 test %al,%al
35: 75 09 jne 40 <test(ABC const &)+0x40>
37: 31 c0 xor %eax,%eax
39: eb 0a jmp 45 <test(ABC const &)+0x45>
3b: 90 nop
3c: 8d 74 26 00 lea 0x0(%esi,1),%esi
40: b8 01 00 00 00 mov $0x1,%eax
45: c9 leave
46: c3 ret
This turned out to be the performance hammer of about 25% that I saw
when I first tried substituting the macros by templates. And it also
turns out that the new GCC-3.0 produces better and roughly equivalent
code in both cases, the templated one being even slightly superior as
far as I can see.
So, this is what we'll do: (Remember that GCC-3.0 is going to be
released today.) In all cases the old macros will be supplemented by
the template functions and specializations for is_exactly_a<>() for
all critical cases. Inside the library we'll stick with the macros
for some while in the time critical parts until GCC-3.0 catches on.
(BTW: GCC-3.0 produces code that is roughly 10%-30% faster at the
GiNaC benchmarks. Rejoice and upgrade!) Eventually these macros will
be entirely phased out.
I just finished checking the changes in to CVS. In my applications I
used a Perl script to convert from the macros to the new inline
template functions which is supplied herewith WITHOUT ANY WARRANTY
WHATSOEVER. It actually works for converting the GiNaC library but
you should of course make a backup first. Once GiNaC 0.9.1 rolls out
(or if you are running from CVS) please apply this converter to your
source files.
----------8<------------------8<---------------------8<-----------------
#! /usr/bin/perl -w
my $file;
my $tmpfile;
if ($file = $ARGV[0]) {
print STDERR "replacing in file $file\n";
$tmpfile = "tmp${file}tmp";
} else {
print STDERR "*** no file given\n";
exit;
}
open (CPPFILE, $file) or die "Can't open source file: $!\n";
open (TMPFILE, "> $tmpfile") or die "Can't open temporary file: $!\n";
while ($_ = <CPPFILE>) {
# is_exactly_of_type(foo,bar) -> is_exactly_a<bar>(foo)
s/is_exactly_of_type\(([\*\.a-zA-Z_0-9\(\)\[\]\+\-\>]+)[, ]+([a-zA-Z_0-9]+)\)/is_exactly_a\<$2\>\($1\)/g;
# is_ex_exactly_of_type(foo,bar) -> is_exactly_a<bar>(foo)
s/is_ex_exactly_of_type\(([\*\.a-zA-Z_0-9\(\)\[\]\+\-\>]+)[, ]+([a-zA-Z_0-9]+)\)/is_exactly_a\<$2\>\($1\)/g;
# is_of_type(foo,bar) -> is_a<bar>(foo)
s/is_of_type\(([\*\.a-zA-Z_0-9\(\)\[\]\+\-\>]+)[, ]+([a-zA-Z_0-9]+)\)/is_a\<$2\>\($1\)/g;
# is_ex_of_type(foo,bar) -> is_a<bar>(foo)
s/is_ex_of_type\(([\*\.a-zA-Z_0-9\(\)\[\]\+\-\>]+)[, ]+([a-zA-Z_0-9]+)\)/is_a\<$2\>\($1\)/g;
print TMPFILE "$_";
}
close CPPFILE;
close TMPFILE;
`mv $tmpfile $file`;
---------->8------------------>8--------------------->8-----------------
Regards
-richy.
(*) "The first thing we do, let's kill all the language lawyers."
Henry VI, part II, taken from TC++PL-3, chapt 2.
--
Richard Kreckel
<Richard.Kreckel at Uni-Mainz.DE>
<http://wwwthep.physik.uni-mainz.de/~kreckel/>
More information about the GiNaC-devel
mailing list