[GiNaC-devel] GiNaC is not a C compiler [Was: Optimized C-style output]

Wed Mar 21 12:46:43 CET 2007

On Wed, Mar 21, 2007 at 11:25:21AM +0100, Stefan Weinzierl wrote:

> I think you miss the point. A typical situation were the generation of C
> code is useful, is the following:
> You use GiNaC to compute symbolically  a rather large expression which
> depends on a handful symbols.
> You then want to evaluate this expression numerically a large number of
> times for different values of the variables/symbols ( a typical example is
> Monte Carlo integration).

> There are two large quantities in the game: The size of the expression
> and the number of evaluations.

Yes, if one about to calculate the expression ~ 10^N times, compiling
it might give considerable speedup.

> I believe an efficient way to do this is to write the expression as C code
> into a file, compile and reload the compiled function into memory.
> 
> There is certainly an overhead involved for the time needed to generate
> the C code and the compilation, but this occurs only once and is usually
> outweighted by the speed improvement of a single evaluation.

That depends.

> To give an example I took an expression,

If it is not top secret, could you please post it here? Otherwise it
is difficult to draw any conclusions. It is easy to construct an
expression such that optimizer

1) won't changed them at all;
2) will die due to exhausted memory (or my patience, or both);
3) will make them even worse.

> which I want to evaluate 10^6 times. A single evaluation with evalf()
> takes 0.55 s.

Well, things are not that simple. One can do partial evaluation with
GiNaC (which is difficult in plain C/C++). For example, simple-minded
code would do

double x[N];
double y[N];

ex e = // some huge expression depending on x and y.
double ed[N*N];
for (size_t i=0; i<=N-1; i++)
	for(size_t j=0; j<=N-1; j++)
		ed[i+j*(N-1)] = e.subs(lst(x==x[i], y==y[j])).evalf().to_double();

Smarter code[r] would do

for (size_t i=0; i<=N-1; i++) {
	ex tmp = e.subs(x==x[i]).evalf();
	for (size_t j=0; j<=N-1; j++)
		ed[i+j*(N-1)] = tmp.subs(y==y[j]).evalf().to_double();
}

Such a technique is a viable alternative to dumping expression as
a C code and compiling it. I admit it is almost useless when evaluation
points are random (as in Monte Carlo).

> If I use the plain print_csrc routine build into GiNaC, the code is
> generated in less than a second, compilation takes 74s, but the time for a
> single evaluation improves to 3.8 10^-3 s.
> 
> If I take the routine for the generation of optimized code, code
> generation takes 49s (yes, code optimization has a price, but you pay only
> once),

And the price grows faster-then-exponentially with the size of the
expression...

> compilation takes 7 s, and the time for a single evaluation is now 
> 3.2 10^-4 s.
> 
> So the total cost for one million evaluations is
>  evalf() :       550000 s
>  standard csrc:    3875 s
>  optimized:         376 s
> 
> Therefore I don't understand why you propose evalf(), the difference in
> cost are orders of magnitude.

Because the problem was (is?) not stated accurately enough. Anyway, I
do agree that "optimizer" improves _some_ expressions.

> A few more comments: I agree that GiNaC is not a compiler, but a nice
> feature is that it is based on C++ and should therefore allow the
> combination of symbolic and numerical code.

Yes, it allows that. One can use any C/C++ (and with some labour
F*****N) libraries _without_ dumping expressions into C/C++/whatever
code.

> Generating optimized C code is a feature I miss in GiNaC.

To some extent GiNaC was desinged to _avoid_ code generation:

"This border line between analytical and numerical methods, however,
has quite often remained a painful obstacle in the successful
application of computer algebra systems (CASs). Usually, some
results are obtained with the help of a CAS and these results
are integrated later into some other program. This is not only
restricted to numerical results, where one frequently translates the
output of some CAS to C or even FORTRAN. It is questionable whether
the distinction between one language for writing down symbolical
algebra, one for obtaining numerical results and maybe a third one
for integrating everything in a graphical user interface has any
reason other than an historical one." [cs.sc/0004015]

However, GiNaC does support code generation, and even got that evil
GiNaC::excompiler thing...

> Finally, one should make the attempt to generate "the" optimal code,
> anything which performs better than the standard csrc output is
> already an improvment.

It performs better for _particular kind of expressions you happen to 
work with_. I doubt it will improve something in general case. That
was point of my previous post (mostly, I don't like the particular
implementation you've suggested, but that's another story).

That said, I've got a similar "optimizer" too. :) Basically, it collect()s
certain type of (sub)expressions and runs normal() on coefficients. (BTW
your code would look simpler if you use collect()). It works very good
for some kinds of expressions, but in general case it just sucks horribly.

Best regards,
 Alexei

-- 
All science is either physics or stamp collecting.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 827 bytes
Desc: Digital signature
URL: <http://www.ginac.de/pipermail/ginac-devel/attachments/20070321/32b717f6/attachment.sig>