[GiNaC-list] [PATCH] fix bug/feature causing non-deterministic output

Alexei Sheplyakov alexei.sheplyakov at gmail.com
Thu May 29 07:31:02 CEST 2014


Hello,

On Sun, May 25, 2014 at 12:30:54AM +0200, Valery Yundin wrote:

> I happen to know what kind of expressions one can get in high energy
> physics quite well. Determinism has nothing to do with humans reading
> output, but with reproducibility of results.

The output of GiNaC is reproducible: the same calculation gives equal
expressions, it's just the term order which might differ.

> Reproducibility is an
> important property of any computational program, even Monte-Carlo
> simulations are deterministic. Being able to compare two 10G files of
> expressions by simple diff or md5sum/etc is rather convenient.

( echo 'expand(('; cat file1; echo ') - ('; cat file2 '));' ) | ginsh

> You are right that it isn't doing what was intended and any class
> which doesn't overload calchash will use the same seed, which is a
> mistake. It is of course arguable whether different classes must use
> different seed, because the only drawback of a hash collision for
> different types is one extra typeid call in basic::compare.

I'm terribly sorry, but everything which makes the hash function worse
(slower or produces more collisions) is not acceptable.

> I'm attaching a revised version of a patch, it adds virtual functions
> to every derived class of basic, which effectively cache the class
> name hashes in local static variables.

Unfortunately this version is even worse than the previous one.

> --- a/ginac/basic.h
> +++ b/ginac/basic.h
> @@ -226,4 +226,5 @@ protected:
>  
>	virtual unsigned calchash() const;
> +	virtual unsigned gethashseed() const;

1) Now everyone have to copy-paste gethashseed() into their classes.

> @@ -782,5 +781,5 @@ return_type_t basic::return_type_tinfo() const
>  unsigned basic::calchash() const
>  {
> -	unsigned v = make_hash_seed(typeid(*this));
> +	unsigned v = gethashseed();
>  	for (size_t i=0; i<nops(); i++) {
>  		v = rotate_left(v);
> @@ -797,4 +796,11 @@ unsigned basic::calchash() const

2) Previously the hash seed was computed by a simple (inline) function.
   Now the calculation involves a call of the virtual method (which is
   *much* slower).

> It is rather small (if noticeable) price to pay for greater convenience
> for users.

If you don't mind making your computations 10% slower you can just
unconditionally

#define GINAC_HASH_USE_MANGLED_NAME 1

in hash_seed.h (no, this change is not going to be applied to
the official version of GiNaC).

> Yeah, it always puzzled me why GiNaC doesn't have a sorted_print
> function but suggests users to write their own.

None of developers happened to need such a function.

Best regards,
	Alexei



More information about the GiNaC-list mailing list