[GiNaC-list] Term ordering and compiling C++ code
jros
jros at unavarra.es
Mon May 24 20:34:55 CEST 2010
Doug wrote:
kreckel wrote:
...and second, sharing is currently not pursued
aggressively! Rather, it is exploited whenever it is
trivial to do so. (The product rule of differentiation
is an example where exactly the same terms pop up again
and again so exploiting sharing comes at no cost.)
I think the benefit would still be large, even if coverage isn't
100%. See below.
I fully agree with this, at least when it is done to a level that does
not affect overall performance, possibly there is room for improvement
in this direction.
Well, I think that if the size of generated code is so
prohibitively large and compiler CSE doesn't help you
may be better off writing your own algorithm collecting
similar terms in an associative array. You could then
artificially introduce temporaries, in order to help the
compiler. This would boil down to a more aggressive
variant of GiNaC's subexpression sharing. What do you
think?
This is exactly what I have done. As mentioned, I will post
that solution here in the next few days. This works well, but
is very slow, as it takes something that is sub-linear in the
size of the expression tree and makes it O(n^2).
If GiNaC expression tree nodes had a unique identifier, this
approach could achieve sub-linear performance. Specifically, it
would be linear in the size of the unique subexpressions, not
the size of the actual expression. (If you have a lot of
sharing, your expression can be large, but the number of unique
expressions can be small.)
I fully agree with that, and I think this should be possible, probably
simple, and won't hurt anybody.
Even if GiNaC doesn't identify every single instance of a shared
sub-expression, the benefits could still be large. In fact, I
tried an experiment to test this. I treated ex::gethash() as a
unique identifier, even though it isn't. The code I generated
was wrong, of course, but it was very fast to generate it, and
very fast to compile and execute. Thus, whatever number of
unique expressions were not actually identified as such was
small enough that this approach would still be quite useful.
I don't know how much would impact in performance the increase of
expression sharing. I'm not aware of the copy on write implementation,
but I think that restricting the effect of copy to the non-shareable
parts, can be probably the single piece needed to increase performance.
The implementation should be straightforward, and the computational
overhead, probably is not that big. I think this performance can be there,
without hurting anybody just providing a mechanism to runtime activate
it. Memory savings could be Huge!!!.
May be it is enough to adopt a programing style to construct expresions
that would increase sharing without further changes within GiNaC,
or any other mechanism is available to increase sharing??.
Also, FYI, when using the hash value as a unique identifier, I
only got a few collisions out of thousands of sub-expressions.
That is, I only found a few expressions with the same hash even
though they were different expressions.
Is this correct behavior?. Shouldn't hash values be unique?.
Thanks,
Doug
We are looking forward to your code. I would be nice I you could
describe your algorithm, when you finish, in a little more detail.
Are you exporting a single expression or a set of them?.
Thank you,
Javier
More information about the GiNaC-list
mailing list