[GiNaC-devel] GiNaC is not a C compiler [Was: Optimized C-style output]
Sheplyakov Alexei
varg at theor.jinr.ru
Wed Mar 21 09:06:18 CET 2007
Hello,
On Tue, Mar 20, 2007 at 01:40:22PM +0100, Stefan Weinzierl wrote:
>
> recently I was generating C-style output from GiNaC expressions and ran
> into the following problems:
>
> - the compiled code was inefficient, as common subexpressions are
> calculated over and over again,
Fetching something from memory is quite an expensive operation on modern
(and not so modern) hardware, so calculating simple subexpressions over
and over again *can* be much faster. For instance, consider common
subexpressions of the form (x+y), (x+2*y), ... (x+1000000*y). Evaluation
of such an subexpression takes just few cycles, so it would be really
inefficient (in terms of both calculation time and memory) to use 10^6
variables to store their values.
On the other hand, there are definitely a lot of cases when it is more
efficient to calculate the subexpression(s) only once. Deciding if
subexpression(s) are complicated or not is a very complicated problem.
GiNaC does not even try to solve it (GiNaC is Not A C compiler).
> - the compiler didn't really appreciate single lines with about one
> million or more characters.
If the generated code size is _that_ big, direct calculation with .evalf()
might be much faster. In fact, I often do the inverse thing: use GiNaC for
numeric evaluation of lenghthy (>= 100Mb) expressions. No C/C++/whatever
compiler is able to process them...
> On the mailing list I found that the issue of long lines was already
> discussed in July 2003, but it seems that this discussion did not lead to
> additional functionality of GiNaC.
> To overcome these problems, I wrote a routine which generates optimized
> C code. This routine
>
> - substitutes every subexpression, which occurs more than once with a
> temporary variable.
First of all, scanning for common subexpressions is _very_ expansive, as
any pattern matching is. Secondly, generated code is not really optimal
(as I explained before).
> - splits long expressions into several lines.
IMHO this is good idea. However I don't like your implementation at
all. (That said, I used to pipe GiNaC output through `fold -w 78')
>
> As this could be of use to other people as well, it could be included into
> GiNaC. The source files are attached to this mail.
>
> An example for this routine would be the following code fragment
>
> symbol x("x"), y("y");
> ex f = pow(sin(x+y),2)+sin(x+y)+log(sin(x+y));
> optimized_C_code(std::cout, std::string("double"), std::string("tmp"), f);
>
> which prints
>
> double tmp1 = sin(y+x);
> double tmp0 = tmp1+(tmp1*tmp1)+log(tmp1);
>
> to standard output.
>
Could you please give less trivial examples?
I don't like the idea at all (that is just IMHO). But the implementation
itself is far from perfect.
> /** @file code_generation.h
> *
> * Interface to code_generation
> *
> */
>
> #ifndef __GINAC_CODE_GENERATION_H__
> #define __GINAC_CODE_GENERATION_H__
>
> #include <iostream>
> #include <string>
>
> #include "ex.h"
>
> namespace GiNaC {
It would be nice to have several functions for each task -- e.g. one
which transforms the expression to the "optimal" form, one which
converts it into exmap (e.g. splits it), one which prints it...
And the function for "optimizing" should probably take a hint from
user (he might knew about some sort of subexpressions in advance).
E.g.
const ex heuristic_collect(const ex& e, exmap& subexprs);
> void optimized_C_code(std::ostream & os, const std::string & T, const std::string & temp_var_base_name, ex expr);
const ex& expr?
>
> } // namespace GiNaC
>
> #endif // ndef __GINAC_CODE_GENERATION_H__
>
Content-Description: cpp file
>
>
> /** @file code_generation.cc
> *
> * Implementation of code_generation
> *
> */
>
> #include <cstddef>
> #include <vector>
> #include <sstream>
>
> #include "ginac.h"
> #include "code_generation.h"
>
> namespace GiNaC {
>
> // anonymous namespace for helper functions
> namespace {
>
> /**
> *
> * Helper function, which converts an integer into a string
> *
> */
> std::string itos(int arg)
> {
> std::ostringstream buffer;
> buffer << arg;
> return buffer.str();
> }
>
> /**
> *
> * Checks if an expression can potentially be substituted.
> *
> * At present, expressions which are considered for substitution are of the
> * following types:
> * add, mul, power or function.
> *
> */
> bool check_substitution_type(ex expr)
^^
You'd better use const ex& here
> {
> if ( is_a<add>(expr)
> || is_a<mul>(expr)
> || is_a<power>(expr)
> || is_a<function>(expr)
> ) return true;
>
> return false;
> }
>
> /**
> *
> * This function writes a C statement of the form
> * T base_namexxx = expr
> * to the stream.
> *
> * If the expression contains a subexpression with a large number of operands, the output is split
> * into smaller subexpressions.
The number of operands is not really good as code complexity metrics.
> *
> */
> void write_line(std::ostream & os, const std::string & T, const std::string & base_name,
> std::vector<symbol> & tv, int tv_i, ex expr)
I think this should be splitted into two separate functions -- one
which transforms expression and one which prints it.
> {
> size_t MAX_NOPS = 1024;
>
> bool flag_long_expression = false;
>
> // -------------------------------------------
> // check for long add and mul
> lst subexpr_lst;
> for (const_preorder_iterator it = expr.preorder_begin(); it != expr.preorder_end(); it++)
> {
> if ( is_a<add>(*it) || is_a<mul>(*it) )
> {
> if ( (*it).nops() > MAX_NOPS )
> {
> subexpr_lst.append(*it);
> flag_long_expression = true;
> }
> }
> }
>
> // -------------------------------------------
> if ( flag_long_expression )
> {
> exmap subs_map;
>
> for ( lst::const_reverse_iterator k = subexpr_lst.rbegin(); k != subexpr_lst.rend(); k++ )
> {
> // two new symbols
> int i1 = tv.size();
^^^^^^
This ought to be std::size_t ...
> tv.push_back( symbol( base_name+itos(i1) ) );
> int i2 = tv.size();
^^^^^^
and this one too.
> tv.push_back( symbol( base_name+itos(i2) ) );
>
> ex expr_1, expr_2;
>
> if ( is_a<add>(*k) )
> {
> expr_1 = 0;
> const_iterator j_it = k->begin();
> for (size_t j=0; j<MAX_NOPS; j++)
> {
> expr_1 += *j_it;
> j_it++;
> }
> expr_2 = (*k) - expr_1;
>
> subs_map[*k] = tv[i1]+tv[i2];
> }
> else if ( is_a<mul>(*k) )
> {
> expr_1 = 1;
> const_iterator j_it = k->begin();
> for (size_t j=0; j<MAX_NOPS; j++)
> {
> expr_1 *= *j_it;
> j_it++;
> }
> expr_2 = (*k) / expr_1;
>
> subs_map[*k] = tv[i1]*tv[i2];
> }
>
> // print expr_2 and expr_1
> write_line(os, T, base_name, tv, i2, expr_2);
> write_line(os, T, base_name, tv, i1, expr_1);
>
> } // lst::const_iterator k
>
> expr = expr.subs( subs_map, subs_options::no_pattern );
>
> } // flag_long_expression
>
>
> // -------------------------------------------
> // print one line
>
> os << " " << T << " " << tv[tv_i] << " = ";
>
> if ( T == std::string("double") )
> {
> expr.print(print_csrc_double(os));
> }
> else if ( T == std::string("float") )
> {
> expr.print(print_csrc_float(os));
> }
> else if ( T == std::string("cl_N") )
> {
> expr.print(print_csrc_cl_N(os));
> }
> else // default
> {
> expr.print(print_csrc(os));
> }
>
> os << ";" << std::endl;
> }
>
> /**
> *
> * This function substitutes all subexpressions which occur more than once in the
> * expression tree.
> *
> * The function proceeds iteratively and substitutes first subexpressions, which are
> * "closer" to the root of the expression tree.
> * If these subexpressions in turn contain subsubexpressions, which occur multiple times,
> * the latter are substituted in subsequent iterations.
> *
> */
I doubt this will result in better code.
And I think this function should be splitted into two separate functions
-- one which transforms expression and one which prints it.
> void substitute_subexpressions(std::ostream & os, const std::string & T, const std::string & base_name, ex expr)
> {
> bool flag_try_substitution = true;
> std::vector<symbol> tv;
>
> lst expr_lst;
> ex ex_expr_lst;
> lst replies;
>
> // initialize
> expr_lst.append(expr);
> ex_expr_lst = expr_lst;
> tv.push_back( symbol( base_name+itos(tv.size()) ) );
>
... so you don't need to loop here, you can call (recursively)
this function.
> while ( flag_try_substitution )
> {
> // expand tree into a long list
> lst subexpr_lst;
You'd better use exset (e.g., std::set<ex, ex_is_less>)...
> for (const_preorder_iterator it = ex_expr_lst.preorder_begin(); it != ex_expr_lst.preorder_end(); it++)
> {
> if ( check_substitution_type(*it) ) subexpr_lst.append(*it);
> }
> subexpr_lst.sort();
>
> // find dublicate entries
> replies.remove_all();
... so you won't get any duplicates at all.
> ex subexpr = subexpr_lst.op(0);
> int counter = 1;
>
> lst::const_iterator it_start = subexpr_lst.begin();
> it_start++;
> for ( lst::const_iterator it = it_start; it != subexpr_lst.end(); it++ )
> {
> if ( subexpr.is_equal(*it) )
> {
> counter++;
>
> if ( counter == 2 )
> {
> bool flag_subs = true;
>
> size_t k = 0;
> for ( lst::const_iterator k_it = replies.begin(); k_it != replies.end(); k_it++ )
> {
> // subexpression of something already in replies
> if ( (*k_it).has(subexpr) )
> {
> flag_subs = false;
> }
> // replies contains a subsubexpression of subexpr
> // note that subexpr can be the mother of more than one child
> else if ( subexpr.has(*k_it) )
> {
> replies.let_op(k) = subexpr;
> flag_subs = false;
> }
>
> // update k
> k++;
> } // const_iterator k_it
>
> // new substitution
> if ( flag_subs )
> {
> replies.append(subexpr);
> }
> } // counter == 2
>
> } // equal
> else // not equal
> {
> // new subexpression
> subexpr = *it;
> counter = 1;
> } // subexpr not equal
> } // lst::const_iterator it
>
> // sort and remove double entries
> replies.sort();
> replies.unique();
>
> // substitute
> if ( replies.nops() > 0 )
> {
> exmap subs_map;
>
> for ( lst::const_iterator k = replies.begin(); k != replies.end(); k++ )
> {
> int i = tv.size();
> tv.push_back( symbol( base_name+itos(i) ) );
> subs_map[*k] = tv[i];
> }
> ex_expr_lst = ex_expr_lst.subs( subs_map, subs_options::no_pattern );
>
> expr_lst = ex_to<lst>(ex_expr_lst);
> for ( lst::const_iterator k = replies.begin(); k != replies.end(); k++ )
> {
> expr_lst.append(*k);
> }
> ex_expr_lst = expr_lst;
> }
> else
> {
> flag_try_substitution = false;
> }
>
> } // flag_try_substitution
>
>
> // -----------------------------------------------------------
> // output
>
> int tv_i = tv.size();
> for ( lst::const_reverse_iterator k = expr_lst.rbegin(); k != expr_lst.rend(); k++ )
> {
> tv_i--;
> write_line(os, T, base_name, tv, tv_i, *k);
> }
> }
>
>
> } // end of anonymous namespace
>
> /**
> *
> * A routine which generates optimized C code.
> *
> * expr is the expression which should be translated to C code.
> * os is the stream to which the output is written.
> * T is a string giving the data type in C of the expression, usually something like "double" or "float".
> * Intermediate variables obtain their names from base_name, to which a serial number is attached, e.g.
> * if base_name=string("t"), the intermediate variables are labelled t0, t1, t2, etc..
> * The routine generates a C code fragment, such that after execution of this code, t0 contains the value
> * of the original expression expr.
> *
> * The code is generated in two steps:
> * - First all sub-expressions, which occur more than once are replaced by intermediate variables.
No, it does not... Example: x+y+cos(x+y) does not get modified at all.
> * - In a second step, expressions with a large number of operands are split into smaller parts.
> * This step does not lead to optimizations, but allows large expressions to be processed by the
> * compiler.
I don't think 1K is a "large number". And evalf() is much faster for
numeric evaluation of large expressions anyway...
> The second part is done by the function "write_line".
> *
> */
> void optimized_C_code(std::ostream & os, const std::string & T, const std::string & base_name, ex expr)
> {
> substitute_subexpressions(os, T, base_name, expr);
> }
>
> } // namespace GiNaC
Best regards,
Alexei
--
All science is either physics or stamp collecting.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 827 bytes
Desc: Digital signature
URL: <http://www.ginac.de/pipermail/ginac-devel/attachments/20070321/e3fcf46f/attachment.sig>
More information about the GiNaC-devel
mailing list