RFC: print() double dispatch

Thu Jul 17 21:06:32 CEST 2003

Hi!

GiNaC objects can be printed in a variety of different formats. Originally,
there were separate virtual methods for each output type (with different
parameter lists):

  basic::print(...)
  basic::printtree(...)
  basic::printcsrc(...)
  etc.

This made adding new output types (LaTeX output was planned at that time)
cumbersome, as the corresponding method would have to be implemented for
every single class, even if the output wasn't different from that of any of
the existing output types.

In GiNaC 0.8.0, this was radically changed by creating a hierarchy of
"print_context" classes, corresponding to the output types, and reducing
the print methods to a single one that accepts a print_context object as
an argument:

  basic::print(const print_context &, ...)

The implementations of print() for each class then perform a type switch on
the actual (dynamic) type of the passed print_context. This design later
also proved useful for adding stream manipulators ("cout << latex" etc.).
It still has problems, though:

 - It's not possible for GiNaC users to add new output types that work with
   existing classes without modifying and recompiling GiNaC (e.g. you could
   implement a print_fortran class but you need to change power::print() to
   have powers printed in the proper way).
 - There is no proper overload resolution performed with respect to the
   print_context type. You can't easily make a class have a different
   csrc output (with all other output types being the same) than its parent
   without manually adding fallback code. This is already a problem with
   print_tree which produces output that is radically different from any
   other print_context type (see also utils.h/DEFAULT_PRINT).
 - There are some design issues with type switches, the most severe being
   that the order of tests matters when dealing with class hierarchies
   (e.g. an "if (is_a<print_csrc_cl_N>(c))" must come before an
   "if (is_a<print_csrc>(c))", otherwise it won't work as expected).

A swift design analysis yields that what's really needed here is double
dispatch, the ability to dispatch methods depending on the dynamic type of
two objects (the algebraic object and the print_context object) which C++,
unfortunately, doesn't provide. So this behavior has to be emulated.

It is tempting to use visitors here, especially the kind that is already
implemented in GiNaC 1.2. For every print_context type, there would be a
corresponding visitor class (they even have the proper overload resolution
already) but this setup would make adding new algebraic classes to GiNaC
impossible, as you couldn't add the required "visit(const myclass &)"
function to an existing visitor (sudden crazy thought: might it be possible
to apply the visitor scheme to the visitors themselves to achieve that?
Probably just results in an unmaintainable mess of NxM classes...).

Anyway, I have a working test implementation of a possible double dispatcher
for print(). First, the good news:
 - It appears to work perfectly.
 - It's dynamic: you can change print methods at runtime.
 - It's backwards compatible: basic::print() remains a virtual function
   with the same arguments, so derived classes can overload it as before.
   They won't get any of the benefits of the new dispatcher then, though.
 - In any case, unless you write your own symbolic classes you won't notice
   any difference.

The bad news:
 - It's slower than the current print(). It performs a few virtual function
   calls and vector lookups, but if the types match it still dispatches in
   constant time. I doubt that the efficiency of print() is of much importance,
   though.
 - Specifying print methods for classes is a little awkward (see below).

Print methods are specified in a way that resembles the specification of
eval() etc. methods for functions (and designed in a way to allow possible
future extension with more per-class attributes):

  class foo : public basic {
    GINAC_DECLARE_REGISTERED_CLASS(foo, basic)
  public:
    void do_print(const print_context & c, unsigned level) const;
    void do_print_csrc(const print_csrc & c, unsigned level) const;
  };

  GINAC_IMPLEMENT_REGISTERED_CLASS_OPT(foo, basic,
    print_func(&foo::do_print).print_func(&foo::do_print_csrc))

  class bar : public foo {
    GINAC_DECLARE_REGISTERED_CLASS(bar, foo)
  public:
    // methods can also be static members, and even ordinary global functions
    static void do_print_csrc(const bar & obj, const print_csrc & c, unsigned level) const;
  };

  GINAC_IMPLEMENT_REGISTERED_CLASS_OPT(bar, foo,
    print_func(&bar::do_print_csrc))

Now, when you call bar::print() with a print_csrc or derived type, it calls
bar::do_print_csrc(), with any other print_context object it calls
foo::do_print().

There is also a global function set_print_func() that allows changing the
methods at run-time. So to get proper Fortran output for powers, you can do

  void print_power_fortran(const power & p, const print_fortran & c, unsigned level)
  { ... }

  set_print_func(print_power_fortran);

Another example for a possible application would be to change the LaTeX
output of indexed objects.

How it works:
 - Each print_context type gets a unique ID number, similar to the tinfo_key
   of the algebraic classes, but assigned automatically.
 - There are convenience macros GINAC_DECLARE_PRINT_CONTEXT and
   GINAC_IMPLEMENT_PRINT_CONTEXT that work similar to the corresponding
   macros for algebraic classes. Their main purpose is to handle said ID
   number.
 - registered_class_info is augmented by a vector that contains (possibly
   NULL) pointers (actually somewhat generalized functors) to the methods for
   each output type. The print_context ID is used as the vector index.
   Thus, each class gets its own dispatch table for print().
 - basic::print() performs the actual method lookup using the dynamic type
   of the algebraic object (from where it obtains the method table) and
   the ID number of the passed print_context object. If no method for that
   combination was registered, it repeats the process with the parent
   print_context class. If still no method is found, it repeats the process
   with the method vector of the parent algebraic class (stopping at the
   basic class itself). Overload resolution is thus first performed on the
   type of the algebraic object, then on the type of the print_context
   object. This matches the current behavior of print().

The current test implementation still has one drawback, though: when
specifying a method, the types of the two objects for which this method
will be registered are automatically (via some template magic) inferred from
the argument types of the supplied function. That is, a
"do_print(const lst &, const print_csrc &)" can only be used for the
lst/print_csrc combination of types. If you want to, for example, use the
same generic "my_print(const basic &, const print_myformat &)" function for
a set of different algebraic types, you have to write trampolines like
"my_print(const foo &, const print_myformat &)" etc. On the positive side,
though, you don't have to explicitly specify the classes via template
parameters when registering methods as you have to with most other
implementations of double dispatch in C++ (e.g. you would have to write
"set_print_func<power, print_fortran>(...)",
"print_func<bar, print_csrc>(&bar::do_print_csrc)" etc.).

I guess the implementation can be changed to allow the explicit specification
of the classes for which a method is to be registered, but then they would
always need to be specified. Which way is better? I don't know...

Planned future work:
 - The default output format gets its own context type (print_dflt?), rather
   than the base class print_context. This will solve the print_tree problem.
 - Extend this scheme to GiNaC functions, making it possible to, e.g. have
   abs(x) printed as |x| in LaTeX mode, fabs(x) in C++ mode, etc.

Questions? Comments? Suggestions? Do we need this at all, or is the current
print() implementation good enough?

Bye,
Christian

-- 
  / Physics is an algorithm
\/ http://www.uni-mainz.de/~bauec002/