[GiNaC-devel] GiNaC is not a C compiler [Was: Optimized C-style output]

Wed Mar 21 09:06:18 CET 2007

Hello,

On Tue, Mar 20, 2007 at 01:40:22PM +0100, Stefan Weinzierl wrote:
> 
> recently I was generating C-style output from GiNaC expressions and ran
> into the following problems:
> 
> - the compiled code was inefficient, as common subexpressions are
>   calculated over and over again,

Fetching something from memory is quite an expensive operation on modern
(and not so modern) hardware, so calculating simple subexpressions over
and over again *can* be much faster. For instance, consider common 
subexpressions of the form (x+y), (x+2*y), ... (x+1000000*y). Evaluation
of such an subexpression takes just few cycles, so it would be really
inefficient (in terms of both calculation time and memory) to use 10^6
variables to store their values. 

On the other hand, there are definitely a lot of cases when it is more
efficient to calculate the subexpression(s) only once. Deciding if
subexpression(s) are complicated or not is a very complicated problem.
GiNaC does not even try to solve it (GiNaC is Not A C compiler).

> - the compiler didn't really appreciate single lines with about one
>   million or more characters.

If the generated code size is _that_ big, direct calculation with .evalf()
might be much faster. In fact, I often do the inverse thing: use GiNaC for
numeric evaluation of lenghthy (>= 100Mb) expressions. No C/C++/whatever
compiler is able to process them...

> On the mailing list I found that the issue of long lines was already
> discussed in July 2003, but it seems that this discussion did not lead to
> additional functionality of GiNaC.
> To overcome these problems, I wrote a routine which generates optimized
> C code. This routine
> 
> - substitutes every subexpression, which occurs more than once with a
>   temporary variable.

First of all, scanning for common subexpressions is _very_ expansive, as
any pattern matching is. Secondly, generated code is not really optimal
(as I explained before).

> - splits long expressions into several lines.

IMHO this is good idea. However I don't like your implementation at
all. (That said, I used to pipe GiNaC output through `fold -w 78')

> 
> As this could be of use to other people as well, it could be included into
> GiNaC. The source files are attached to this mail.
> 
> An example for this routine would be the following code fragment
> 
>   symbol x("x"), y("y");
>   ex f = pow(sin(x+y),2)+sin(x+y)+log(sin(x+y));
>   optimized_C_code(std::cout, std::string("double"), std::string("tmp"), f);
> 
> which prints
> 
>  double tmp1 = sin(y+x);
>  double tmp0 = tmp1+(tmp1*tmp1)+log(tmp1);
> 
> to standard output.
> 

Could you please give less trivial examples?

I don't like the idea at all (that is just IMHO). But the implementation
itself is far from perfect.

> /** @file code_generation.h
>  *
>  * Interface to code_generation
>  *
>  */
> 
> #ifndef __GINAC_CODE_GENERATION_H__
> #define __GINAC_CODE_GENERATION_H__
> 
> #include <iostream>
> #include <string>
> 
> #include "ex.h"
> 
> namespace GiNaC {

It would be nice to have several functions for each task -- e.g. one
which transforms the expression to the "optimal" form, one which
converts it into exmap (e.g. splits it), one which prints it...
And the function for "optimizing" should probably take a hint from 
user (he might knew about some sort of subexpressions in advance).
E.g.

const ex heuristic_collect(const ex& e, exmap& subexprs);

>   void optimized_C_code(std::ostream & os, const std::string & T, const std::string & temp_var_base_name, ex expr);

const ex& expr?

> 
> } // namespace GiNaC
> 
> #endif // ndef __GINAC_CODE_GENERATION_H__
> 

Content-Description: cpp file
> 
> 
> /** @file code_generation.cc
>  *
>  * Implementation of code_generation 
>  *
>  */
> 
> #include <cstddef>
> #include <vector>
> #include <sstream>
> 
> #include "ginac.h"
> #include "code_generation.h"
> 
> namespace GiNaC {
> 
> // anonymous namespace for helper functions
> namespace {
> 
>   /**
>    *
>    * Helper function, which converts an integer into a string
>    *
>    */
>   std::string itos(int arg)
>   {
>     std::ostringstream buffer;
>     buffer << arg; 
>     return buffer.str(); 
>   }
> 
>   /**
>    *
>    * Checks if an expression can potentially be substituted.
>    *
>    * At present, expressions which are considered for substitution are of the 
>    * following types:
>    * add, mul, power or function.
>    *
>    */
>   bool check_substitution_type(ex expr)
                                 ^^
You'd better use const ex& here

>   {
>     if ( is_a<add>(expr) 
> 	 || is_a<mul>(expr) 
> 	 || is_a<power>(expr) 
> 	 || is_a<function>(expr) 
> 	 ) return true;
> 
>     return false;
>   }
> 
>   /**
>    *
>    * This function writes a C statement of the form
>    * T base_namexxx = expr
>    * to the stream.
>    *
>    * If the expression contains a subexpression with a large number of operands, the output is split 
>    * into smaller subexpressions.

The number of operands is not really good as code complexity metrics.

>    *
>    */
>   void write_line(std::ostream & os, const std::string & T, const std::string & base_name, 
> 		  std::vector<symbol> & tv, int tv_i, ex expr)

I think this should be splitted into two separate functions -- one
which transforms expression and one which prints it.

>   {
>     size_t MAX_NOPS = 1024;
> 
>     bool flag_long_expression = false;
> 
>     // -------------------------------------------
>     // check for long add and mul
>     lst subexpr_lst;
>     for (const_preorder_iterator it = expr.preorder_begin(); it != expr.preorder_end(); it++)
>       {
> 	if ( is_a<add>(*it) || is_a<mul>(*it) ) 
> 	  {
> 	    if ( (*it).nops() > MAX_NOPS )
> 	      {
> 		subexpr_lst.append(*it);
> 		flag_long_expression = true;
> 	      }
> 	  }
>       }
> 
>     // -------------------------------------------
>     if ( flag_long_expression )
>       {
> 	exmap subs_map;
> 
> 	for ( lst::const_reverse_iterator k = subexpr_lst.rbegin(); k != subexpr_lst.rend(); k++ )
> 	  {
> 	    // two new symbols
> 	    int i1 = tv.size();
        ^^^^^^
This ought to be std::size_t ...

> 	    tv.push_back( symbol( base_name+itos(i1) ) );
> 	    int i2 = tv.size();
        ^^^^^^
and this one too.

> 	    tv.push_back( symbol( base_name+itos(i2) ) );
> 
> 	    ex expr_1, expr_2;
> 
> 	    if ( is_a<add>(*k) )
> 	      {
> 		expr_1 = 0;
> 		const_iterator j_it = k->begin();
> 		for (size_t j=0; j<MAX_NOPS; j++)
> 		  {
> 		    expr_1 += *j_it;
> 		    j_it++;
> 		  }
> 		expr_2 = (*k) - expr_1;
>  
> 		subs_map[*k] = tv[i1]+tv[i2];
> 	      }
> 	    else if ( is_a<mul>(*k) )
> 	      {
> 		expr_1 = 1;
> 		const_iterator j_it = k->begin();
> 		for (size_t j=0; j<MAX_NOPS; j++)
> 		  {
> 		    expr_1 *= *j_it;
> 		    j_it++;
> 		  }
> 		expr_2 = (*k) / expr_1;
>  
> 		subs_map[*k] = tv[i1]*tv[i2];
> 	      }
> 
> 	    // print expr_2 and expr_1
> 	    write_line(os, T, base_name, tv, i2, expr_2);
> 	    write_line(os, T, base_name, tv, i1, expr_1);
> 
> 	  } // lst::const_iterator k
> 
> 	expr = expr.subs( subs_map, subs_options::no_pattern );
> 
>       } // flag_long_expression
> 
> 
>     // -------------------------------------------
>     // print one line
> 
>     os << " " << T << " " << tv[tv_i] << " = ";
> 
>     if ( T == std::string("double") )
>       {
> 	expr.print(print_csrc_double(os));
>       }
>     else if ( T == std::string("float") )
>       {
> 	expr.print(print_csrc_float(os));
>       }
>     else if ( T == std::string("cl_N") )
>       {
> 	expr.print(print_csrc_cl_N(os));
>       }
>     else // default
>       {
> 	expr.print(print_csrc(os));
>       }
> 
>     os << ";" << std::endl;
>   }
> 
>   /**
>    *
>    * This function substitutes all subexpressions which occur more than once in the
>    * expression tree.
>    *
>    * The function proceeds iteratively and substitutes first subexpressions, which are
>    * "closer" to the root of the expression tree.
>    * If these subexpressions in turn contain subsubexpressions, which occur multiple times,
>    * the latter are substituted in subsequent iterations.
>    *
>    */
I doubt this will result in better code.

And I think this function should be splitted into two separate functions
-- one which transforms expression and one which prints it.

>   void substitute_subexpressions(std::ostream & os, const std::string & T, const std::string & base_name, ex expr)
>   {
>     bool flag_try_substitution = true;
>     std::vector<symbol> tv;
> 
>     lst expr_lst;
>     ex ex_expr_lst;
>     lst replies;
> 
>     // initialize
>     expr_lst.append(expr);
>     ex_expr_lst = expr_lst;
>     tv.push_back( symbol( base_name+itos(tv.size()) ) );
> 

... so you don't need to loop here, you can call (recursively)
this function.

>     while ( flag_try_substitution )
>       {
> 	// expand tree into a long list
> 	lst subexpr_lst;

    You'd better use exset (e.g., std::set<ex, ex_is_less>)...

> 	for (const_preorder_iterator it = ex_expr_lst.preorder_begin(); it != ex_expr_lst.preorder_end(); it++)
> 	  {
> 	    if ( check_substitution_type(*it) ) subexpr_lst.append(*it);
> 	  }
> 	subexpr_lst.sort();
> 
> 	// find dublicate entries
> 	replies.remove_all();

    ... so you won't get any duplicates at all.

> 	ex subexpr = subexpr_lst.op(0);
> 	int counter = 1;
> 
> 	lst::const_iterator it_start = subexpr_lst.begin();
> 	it_start++;
> 	for ( lst::const_iterator it = it_start; it != subexpr_lst.end(); it++ )
> 	  {
> 	    if ( subexpr.is_equal(*it) ) 
> 	      {
> 		counter++;
> 
> 		if ( counter == 2 )
> 		  {
> 		    bool flag_subs = true;
> 
> 		    size_t k = 0;
> 		    for ( lst::const_iterator k_it = replies.begin(); k_it != replies.end(); k_it++ )
> 		      {
> 			// subexpression of something already in replies
> 			if ( (*k_it).has(subexpr) ) 
> 			  {
> 			    flag_subs = false;
> 			  }
> 			// replies contains a subsubexpression of subexpr
> 			// note that subexpr can be the mother of more than one child
> 			else if ( subexpr.has(*k_it) )
> 			  {
> 			    replies.let_op(k) = subexpr;
> 			    flag_subs = false;
> 			  }
> 
> 			// update k
> 			k++;
> 		      } // const_iterator k_it
> 
> 		    // new substitution
> 		    if ( flag_subs )
> 		      {
> 			replies.append(subexpr);
> 		      }
> 		  } // counter == 2
> 
> 	      } // equal
> 	    else // not equal
> 	      {
> 		// new subexpression
> 		subexpr = *it;
> 		counter = 1;
> 	      } // subexpr not equal
> 	  } // lst::const_iterator it
> 
> 	// sort and remove double entries
> 	replies.sort();
> 	replies.unique();
> 
> 	// substitute
> 	if ( replies.nops() > 0 )
> 	  {
> 	    exmap subs_map;
> 
> 	    for ( lst::const_iterator k = replies.begin(); k != replies.end(); k++ )
> 	      {
> 		int i = tv.size();
> 		tv.push_back( symbol( base_name+itos(i) ) );
> 		subs_map[*k] = tv[i];
> 	      }
> 	    ex_expr_lst = ex_expr_lst.subs( subs_map, subs_options::no_pattern );
> 
> 	    expr_lst = ex_to<lst>(ex_expr_lst);
> 	    for ( lst::const_iterator k = replies.begin(); k != replies.end(); k++ )
> 	      {
> 		expr_lst.append(*k);
> 	      }
> 	    ex_expr_lst = expr_lst;
> 	  }
> 	else
> 	  {
> 	    flag_try_substitution = false;
> 	  }
> 
>       } // flag_try_substitution
> 
> 
>     // -----------------------------------------------------------
>     // output
> 
>     int tv_i = tv.size();
>     for ( lst::const_reverse_iterator k = expr_lst.rbegin(); k != expr_lst.rend(); k++ )
>       {
> 	tv_i--;
> 	write_line(os, T, base_name, tv, tv_i, *k);
>       }
>   }
> 
> 
> } // end of anonymous namespace
> 
>   /**
>    *
>    * A routine which generates optimized C code.
>    *
>    * expr is the expression which should be translated to C code.
>    * os is the stream to which the output is written.
>    * T is a string giving the data type in C of the expression, usually something like "double" or "float".
>    * Intermediate variables obtain their names from base_name, to which a serial number is attached, e.g.
>    * if base_name=string("t"), the intermediate variables are labelled t0, t1, t2, etc..
>    * The routine generates a C code fragment, such that after execution of this code, t0 contains the value
>    * of the original expression expr.
>    *
>    * The code is generated in two steps:
>    * - First all sub-expressions, which occur more than once are replaced by intermediate variables.

No, it does not... Example: x+y+cos(x+y) does not get modified at all.

>    * - In a second step, expressions with a large number of operands are split into smaller parts.
>    *   This step does not lead to optimizations, but allows large expressions to be processed by the 
>    *   compiler.

I don't think 1K is a "large number". And evalf() is much faster for
numeric evaluation of large expressions anyway...

>        The second part is done by the function "write_line".
>    *
>    */
>   void optimized_C_code(std::ostream & os, const std::string & T, const std::string & base_name, ex expr)
>   {
>     substitute_subexpressions(os, T, base_name, expr);
>   }
> 
> } // namespace GiNaC

Best regards,
 Alexei

-- 
All science is either physics or stamp collecting.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 827 bytes
Desc: Digital signature
URL: <http://www.ginac.de/pipermail/ginac-devel/attachments/20070321/e3fcf46f/attachment.sig>