From kreckel at thep.physik.uni-mainz.de  Mon Mar 18 22:31:39 2002
From: kreckel at thep.physik.uni-mainz.de (Richard B. Kreckel)
Date: Mon, 18 Mar 2002 22:31:39 +0100 (CET)
Subject: And GiNaC-1.0.7 shall be made legion...
Message-ID:

Its numbers shall be increased a thousand thousand fold.  The din of a
million keyboards like unto a great storm shall cover the earth, and the
followers of Mammon shall tremble.

        adapted from The Book of Mozilla, 3:31 (Red Letter Edition)


From bagnara at cs.unipr.it  Fri Mar 22 19:27:06 2002
From: bagnara at cs.unipr.it (Roberto Bagnara)
Date: Fri, 22 Mar 2002 19:27:06 +0100
Subject: Annoying warning (II attempt)
Message-ID: <3C9B777A.3050901@cs.unipr.it>

Dear all,

there is a warning about an unused parameter due to line 84 in basic.h
(GiNaC 1.0.7):

    virtual void destroy(bool call_parent) { }

Having seen the warning two million times I thought it was better to
report it ;-)  Actually, I already reported this problem on February 2,
but it seems I was not very successful.  This time I enclose a patch
fixing the problem.
All the best

    Roberto

--
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagnara at cs.unipr.it

diff -ru GiNaC-1.0.7.orig/ginac/basic.h GiNaC-1.0.7/ginac/basic.h
--- GiNaC-1.0.7.orig/ginac/basic.h	Thu Jan 24 23:28:20 2002
+++ GiNaC-1.0.7/ginac/basic.h	Fri Mar 22 16:35:07 2002
@@ -81,7 +81,7 @@
 		tinfo_key = other.tinfo_key;
 	}
 	/** For use by dtor and assignment operator. */
-	virtual void destroy(bool call_parent) { }
+	virtual void destroy(bool /* call_parent */) { }
 
 	// other ctors
 	/** ctor with specified tinfo_key */


From bagnara at cs.unipr.it  Fri Mar 22 19:44:28 2002
From: bagnara at cs.unipr.it (Roberto Bagnara)
Date: Fri, 22 Mar 2002 19:44:28 +0100
Subject: Problem with `is_ex_the_function'
Message-ID: <3C9B7B8C.5040709@cs.unipr.it>

Dear all,

there is a bug in the macro `is_ex_the_function' defined in
ginac/function.h.  As it is now, the macro only works if the file is
included in a context where using declarations are in effect, since a
couple of GiNaC:: qualifications are missing.  I enclose a patch fixing
the problem.
All the best

    Roberto

--
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagnara at cs.unipr.it

diff -ru GiNaC-1.0.7.orig/ginac/function.h GiNaC-1.0.7/ginac/function.h
--- GiNaC-1.0.7.orig/ginac/function.h	Sun Jan 27 18:39:38 2002
+++ GiNaC-1.0.7/ginac/function.h	Fri Mar 22 16:18:43 2002
@@ -416,7 +416,7 @@
 }
 
 #define is_ex_the_function(OBJ, FUNCNAME) \
-	(is_exactly_a(OBJ) && ex_to(OBJ).get_serial() == function_index_##FUNCNAME)
+	(GiNaC::is_exactly_a(OBJ) && GiNaC::ex_to(OBJ).get_serial() == function_index_##FUNCNAME)
 
 } // namespace GiNaC


From pearu at cens.ioc.ee  Sat Mar 23 14:40:51 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Sat, 23 Mar 2002 15:40:51 +0200 (EET)
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
Message-ID:

Hi!

I have tried to run Richard Fateman's multiplication benchmark also with
GiNaC (1.0.2, 1.0.6), but unsuccessfully, because the expand() method
starts to eat large amounts of memory, so that I had to kill the process.

Could this indicate a possible memory leak in GiNaC?  Any other ideas?
Here is my test program:

    #include <ginac/ginac.h>
    using namespace GiNaC;
    int main()
    {
      symbol x("x"), y("y"), z("z");
      ex r = (ex(power(1 + x + y + z, 20))).expand();
      ex r1 = r + 1;
      ex m = (r*r1).expand();
      return 0;
    }

Regards,
	Pearu

---------- Forwarded message ----------
Date: Fri, 22 Mar 2002 16:01:01 -0800
From: Richard Fateman
Newsgroups: sci.math.symbolic
Subject: update on fast mult benchmark

I've added some more programs.  It looks like this now.

  MockMMA                 5.8 seconds (including 2.1 in GC)
  Macsyma (in ACL 6.1)    6.9 seconds (including 2.0 in GC)
  Hashing                 8.6 seconds (including 2.2 in GC)
  Fermat                  9.2 seconds
  Macsyma (maxima5.5)    32.0 seconds (? GC not reported, second trial)
  Macsyma (commercial)   37.5 seconds (including 29.78 in GC)
  Macsyma (maxima5.5)    53.5 seconds (? GC not reported, first trial)
  Maple7                 50.0 seconds (? GC, it was slower on 2nd trial)
  Mathematica 4.1        82.3 seconds (? it was slower on 2nd trial)
  MuPad 1.4.2 Pro       118.6 seconds

for multiplying the expanded version of (1+x+y+z)^20 by itself.

details on http://www.cs.berkeley.edu/~fateman/papers/fastmult.pdf

comments invited.
RJF


From kreckel at thep.physik.uni-mainz.de  Sun Mar 24 15:37:46 2002
From: kreckel at thep.physik.uni-mainz.de (Richard B. Kreckel)
Date: Sun, 24 Mar 2002 15:37:46 +0100 (CET)
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
In-Reply-To:
Message-ID:

Hi,

On Sat, 23 Mar 2002, Pearu Peterson wrote:
> I have tried to run Richard Fateman's multiplication benchmark also with
> GiNaC (1.0.2, 1.0.6), but unsuccessfully, because the expand() method
> starts to eat large amounts of memory, so that I had to kill the process.
>
> Could this indicate a possible memory leak in GiNaC?  Any other ideas?
> Here is my test program:
>
>     #include <ginac/ginac.h>
>     using namespace GiNaC;
>     int main()
>     {
>       symbol x("x"), y("y"), z("z");
>       ex r = (ex(power(1 + x + y + z, 20))).expand();
>       ex r1 = r + 1;
>       ex m = (r*r1).expand();
>       return 0;
>     }

Incidentally, we have had this problem before.  See the message and the
following one.  That was solved back then, but the test by Richard Fateman
shows clearly that there are other problematic cases.

What happens?  The above test runs when you give it one GigaByte.  And
there is no memory leak, because you can run it infinitely many times.
If we look at GiNaC's memory footprint of the final result, it becomes
obvious that its representation is quite competitive, too.

The culprit is mul::expand(), where two expanded polynomials (class add),
one with n1 terms and another with n2 terms, are first written out by
brute force in an n1*n2 vector.  This is then returned through an add
constructor which sorts it (calling expairseq::canonize()) and then
compactifies it (calling expairseq::combine_same_terms_sorted_seq()).

For two large input polynomials this is clearly bogus, because it does
not allow for early cancellations.  It was done this way because the
complexity of sorting is asymptotically ideal (n*log(n) comparisons).
This seems to have been a short-sighted decision.

I can think of a gazillion ways how to repair this and will try to
provide a fix really soon now.  It needs some careful balancing, however.
(Incidentally, I already had some trial-code in my private tree which
performs much better.)
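To make the brute-force route just described (and one obvious alternative)
concrete, here is a minimal sketch in plain C++.  It is not GiNaC code; the
names and the toy representation (a monomial as a vector of exponents, a
polynomial as a list of monomial/coefficient pairs) are made up for
illustration.  The first routine writes out all n1*n2 products, sorts them,
and then merges equal monomials, so its peak memory is proportional to
n1*n2; the second combines equal terms as soon as they are produced, so its
working set stays proportional to the size of the (partial) result.

    // Minimal sketch, not GiNaC code.
    #include <algorithm>
    #include <cstdint>
    #include <map>
    #include <utility>
    #include <vector>

    using Expo = std::vector<int>;                // exponents {a,b,c} for x^a y^b z^c
    using Term = std::pair<Expo, std::int64_t>;   // (monomial, coefficient)
    using Poly = std::vector<Term>;               // a sum of terms

    // Strategy 1: emit all n1*n2 products into one big vector, sort it,
    // then merge adjacent equal monomials (the write-out/canonize/combine
    // sequence described above).
    Poly mul_sort_combine(const Poly& a, const Poly& b)
    {
        Poly prod;
        prod.reserve(a.size() * b.size());
        for (const Term& ta : a)
            for (const Term& tb : b) {
                Expo e = ta.first;
                for (std::size_t i = 0; i < e.size(); ++i)
                    e[i] += tb.first[i];
                prod.emplace_back(std::move(e), ta.second * tb.second);
            }
        std::sort(prod.begin(), prod.end(),
                  [](const Term& l, const Term& r) { return l.first < r.first; });
        Poly res;
        for (const Term& t : prod) {
            if (!res.empty() && res.back().first == t.first)
                res.back().second += t.second;    // combine equal monomials
            else
                res.push_back(t);
        }
        return res;
    }

    // Strategy 2: combine equal terms as soon as they are generated, so the
    // intermediate storage never exceeds the size of the result.
    Poly mul_combine_early(const Poly& a, const Poly& b)
    {
        std::map<Expo, std::int64_t> acc;
        for (const Term& ta : a)
            for (const Term& tb : b) {
                Expo e = ta.first;
                for (std::size_t i = 0; i < e.size(); ++i)
                    e[i] += tb.first[i];
                acc[e] += ta.second * tb.second;  // early combination happens here
            }
        return Poly(acc.begin(), acc.end());
    }

Keying the accumulator on a hash of the monomial instead of an ordered map
gives the kind of term collecting that is discussed further down in this
thread, at the price of an unordered intermediate result.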
To me it appears that this test per se does not tell us very much about
representation.  Besides questions of representation, the actual
implementation of expand() might differ between the systems compared by
Richard Fateman.  In particular, I am amazed to see the difference between
Maple, Mathematica and MuPAD being so tiny.  This contradicts my own
experience with handling polynomials in these three systems.  The pattern
t(MuPAD) > t(Maple) > t(Mma) looks reasonable, but the difference is
usually *much* bigger.

Geez, and we thought polynomial expansion was trivial...

Regards
  -richy.

PS: Richard: 1234567890 < (2^31)-1, on page 1, third paragraph.
--
Richard B. Kreckel


From fateman at cs.berkeley.edu  Sun Mar 24 18:01:57 2002
From: fateman at cs.berkeley.edu (Richard Fateman)
Date: Sun, 24 Mar 2002 09:01:57 -0800
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
References:
Message-ID: <3C9E0685.2010206@cs.berkeley.edu>

Richard B. Kreckel wrote:
> Hi,
>
> On Sat, 23 Mar 2002, Pearu Peterson wrote:
>
>> I have tried to run Richard Fateman's multiplication benchmark also with
>> GiNaC (1.0.2, 1.0.6) <<>>

> The culprit is mul::expand(), where two expanded polynomials (class add),
> one with n1 terms and another with n2 terms, are first written out by
> brute force in an n1*n2 vector.  This is then returned through an add
> constructor which sorts it (calling expairseq::canonize()) and then
> compactifies it (calling expairseq::combine_same_terms_sorted_seq()).

This seems to me to be a design that should be revisited.  For the
univariate case it is particularly bad, since multiplying polynomials
of degree u and v gives you only degree u+v, and you will have
allocated (u+1)*(v+1) cells.

> For two large input polynomials this is clearly bogus, because it does
> not allow for early cancellations.  It was done this way because the
> complexity of sorting is asymptotically ideal (n*log(n) comparisons).
> This seems to have been a short-sighted decision.

The sorting cost is going to be dominated by the coefficient operations
most likely, even if it is n^2.  Note that Maple doesn't bother to
sort at all.  I think that if you test the others against sorting of
Maple's results, Maple will look far worse.

> I can think of a gazillion ways how to repair this and will try to
> provide a fix really soon now.  It needs some careful balancing, however.
> (Incidentally, I already had some trial-code in my private tree which
> performs much better.)
>
> To me it appears that this test per se does not tell us very much about
> representation.  Besides questions of representation, the actual
> implementation of expand() might differ between the systems compared by
> Richard Fateman.

There are not too many ways of multiplying sparse polynomials without
using u*v multiplies and about u*v - size(answer) adds.  For dense
polynomials in a finite field, FFT can be used.  My guess is that
an FFT approach would not be faster on such a small ( :) ) problem
as this one.  But a careful implementation might win.  In order
to represent the inputs and the outputs, the FFT would probably
have to be done quite a few times, since the bignum coefficients
would need to be built up by Chinese Remainder.

> In particular, I am amazed to see the difference between
> Maple, Mathematica and MuPAD being so tiny.  This contradicts my own
> experience with handling polynomials in these three systems.  The pattern
> t(MuPAD) > t(Maple) > t(Mma) looks reasonable, but the difference is
> usually *much* bigger.
>
> Geez, and we thought polynomial expansion was trivial...

Incidentally, you should be able to do w^2 much faster than w*w.

RJF

> Regards
>   -richy.
>
> PS: Richard: 1234567890 < (2^31)-1, on page 1, third paragraph.

Thanks.  I changed it.  That number is large enough so that in my
Lisp it is not a single-word "fixnum" ..  some bits are used for
type tags.  I don't know anything about GiNaC internals, but
such design decisions involving a unified treatment of arbitrary
size integers can affect time, programmer convenience, data size
etc. in a very complicated way.

When this problem works on GiNaC I'd like to know its speed!
RJF
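The univariate point made above is easiest to see in the dense case.  The
following sketch (not code from any of the systems discussed; it assumes
non-empty coefficient vectors) stores the product of polynomials of degree
u and v in exactly u+v+1 slots, even though the schoolbook loop still
performs (u+1)*(v+1) coefficient multiplications:

    #include <cstdint>
    #include <vector>

    // Dense univariate multiplication: c[k] is the coefficient of x^k.
    // a has degree u = a.size()-1, b has degree v = b.size()-1, and the
    // result needs only u+v+1 coefficients.
    std::vector<std::int64_t> mul_dense(const std::vector<std::int64_t>& a,
                                        const std::vector<std::int64_t>& b)
    {
        std::vector<std::int64_t> c(a.size() + b.size() - 1, 0);
        for (std::size_t i = 0; i < a.size(); ++i)
            for (std::size_t j = 0; j < b.size(); ++j)
                c[i + j] += a[i] * b[j];
        return c;
    }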
From kreckel at thep.physik.uni-mainz.de  Sun Mar 24 18:56:38 2002
From: kreckel at thep.physik.uni-mainz.de (Richard B. Kreckel)
Date: Sun, 24 Mar 2002 18:56:38 +0100 (CET)
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
In-Reply-To: <3C9E0685.2010206@cs.berkeley.edu>
Message-ID:

Hi,

On Sun, 24 Mar 2002, Richard Fateman wrote:
> > For two large input polynomials this is clearly bogus, because it does
> > not allow for early cancellations.  It was done this way because the
> > complexity of sorting is asymptotically ideal (n*log(n) comparisons).
> > This seems to have been a short-sighted decision.
>
> The sorting cost is going to be dominated by the coefficient operations
> most likely, even if it is n^2.  Note that Maple doesn't bother to
> sort at all.  I think that if you test the others against sorting of
> Maple's results, Maple will look far worse.

Sure Maple bothers to sort!  The order may not be apparent, but they do
sort their expressions.  Hasn't that been the consensus when the
discussion last came up on sci.math.symbolic?

By the way: This observation of yours that Maple gets slower on the
2nd trial, isn't this exactly the problem Carl DeVore was describing in
sci.math.symbolic recently?  It seems to get even slower after more
trials.  Is this a new "feature" or do older releases of Maple also suffer
from this?

[...]

> There are not too many ways of multiplying sparse polynomials without
> using u*v multiplies and about u*v - size(answer) adds.

I was referring more to implementation details than to fundamental
algorithmic things.  Hence the gazillions.  :-)

> For dense
> polynomials in a finite field, FFT can be used.  My guess is that
> an FFT approach would not be faster on such a small ( :) ) problem
> as this one.  But a careful implementation might win.  In order
> to represent the inputs and the outputs, the FFT would probably
> have to be done quite a few times, since the bignum coefficients
> would need to be built up by Chinese Remainder.

This is exactly what I was thinking when Dan Bernstein recommended the
FFT-based method for univariate polynomials the last time I met him.
I don't know if anything came out of his Zmult project, though...

[...]

> > Geez, and we thought polynomial expansion was trivial...
>
> Incidentally, you should be able to do w^2 much faster than w*w.

Sure, in the sense that (w*w).expand() is much faster than
(w.expand()*w.expand()).expand().  Oh, and w*w equals pow(w,2) trivially
due to evaluation.

> > PS: Richard: 1234567890 < (2^31)-1, on page 1, third paragraph.
>
> Thanks.  I changed it.  That number is large enough so that in my
> Lisp it is not a single-word "fixnum" ..  some bits are used for
> type tags.  I don't know anything about GiNaC internals, but
> such design decisions involving a unified treatment of arbitrary
> size integers can affect time, programmer convenience, data size
> etc. in a very complicated way.

Very true.  CLN does exactly this: it distinguishes between fast fixnums
and slow bignums in a transparent way using type tags.  Must be because
another Lisp-fan wrote it...  :-)

Regards
  -richy.
--
Richard B. Kreckel
From fateman at cs.berkeley.edu  Sun Mar 24 21:18:04 2002
From: fateman at cs.berkeley.edu (Richard Fateman)
Date: Sun, 24 Mar 2002 12:18:04 -0800
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
References:
Message-ID: <3C9E347C.4080706@cs.berkeley.edu>

Richard B. Kreckel wrote:
> Sure Maple bothers to sort!  The order may not be apparent, but they do
> sort their expressions.  Hasn't that been the consensus when the
> discussion last came up on sci.math.symbolic?

I don't call what they do sorting, since they do not impose an external
order on expressions: it is merely collecting of terms.  This can be done
by a hash table in (probably) O(n) time.  An answer which does not allow
you to tell the degree of a polynomial without looking through all the
terms is less useful.  A test which requires division would probably
show this.

> By the way: This observation of yours that Maple gets slower on the
> 2nd trial, isn't this exactly the problem Carl DeVore was describing in
> sci.math.symbolic recently?  It seems to get even slower after more
> trials.  Is this a new "feature" or do older releases of Maple also suffer
> from this?

I have no old copies "alive".

> [...]
>
>> There are not too many ways of multiplying sparse polynomials without
>> using u*v multiplies and about u*v - size(answer) adds.
>
> I was referring more to implementation details than to fundamental
> algorithmic things.  Hence the gazillions.  :-)
>
>> For dense
>> polynomials in a finite field, FFT can be used.  My guess is that
>> an FFT approach would not be faster on such a small ( :) ) problem
>> as this one.  But a careful implementation might win.  In order
>> to represent the inputs and the outputs, the FFT would probably
>> have to be done quite a few times, since the bignum coefficients
>> would need to be built up by Chinese Remainder.
>
> This is exactly what I was thinking when Dan Bernstein recommended the
> FFT-based method for univariate polynomials the last time I met him.
> I don't know if anything came out of his Zmult project, though...

It would be possible to encode the question and the answer as a huge
multiplication of 2 integers.  The loss in encoding and decoding would
be either very wasteful or very tricky.

RJF
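The "encode the question and the answer as a huge multiplication of 2
integers" idea is Kronecker substitution: evaluate both polynomials at a
sufficiently large base, multiply the two resulting integers, and read the
coefficients of the product back off as base-B digits.  Here is a toy
sketch with made-up names, machine-size integers and small non-negative
coefficients; a serious version would use a bignum library (CLN, GMP) and
a base derived from a bound on the product's coefficients, which is
exactly where the "very wasteful or very tricky" part comes in (and
negative coefficients need extra care).

    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Toy Kronecker substitution for univariate polynomials with small
    // non-negative coefficients.  BASE must exceed every coefficient of the
    // product, otherwise neighbouring "digit" slots overlap.
    constexpr std::uint64_t BASE = 10000;

    std::uint64_t encode(const std::vector<std::uint64_t>& p)
    {
        std::uint64_t n = 0;
        for (std::size_t i = p.size(); i-- > 0; )
            n = n * BASE + p[i];              // Horner evaluation at x = BASE
        return n;
    }

    std::vector<std::uint64_t> decode(std::uint64_t n, std::size_t nterms)
    {
        std::vector<std::uint64_t> p(nterms);
        for (std::size_t i = 0; i < nterms; ++i) {
            p[i] = n % BASE;                  // read the coefficients back off
            n /= BASE;
        }
        return p;
    }

    int main()
    {
        // (1 + 2x + 3x^2) * (4 + 5x) = 4 + 13x + 22x^2 + 15x^3
        const std::vector<std::uint64_t> p{1, 2, 3}, q{4, 5};
        const std::vector<std::uint64_t> r =
            decode(encode(p) * encode(q), p.size() + q.size() - 1);
        assert((r == std::vector<std::uint64_t>{4, 13, 22, 15}));
        return 0;
    }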
From kreckel at thep.physik.uni-mainz.de  Thu Mar 28 15:31:26 2002
From: kreckel at thep.physik.uni-mainz.de (Richard B. Kreckel)
Date: Thu, 28 Mar 2002 15:31:26 +0100 (CET)
Subject: Possible memory leak? (was: update on fast mult benchmark (fwd))
In-Reply-To: <3C9E0685.2010206@cs.berkeley.edu>
Message-ID:

Hi,

On Sun, 24 Mar 2002, Richard Fateman wrote:
> This seems to me to be a design that should be revisited.  For the
> univariate case it is particularly bad, since multiplying polynomials
> of degree u and v gives you only degree u+v, and you will have
> allocated (u+1)*(v+1) cells.

Sure.  Done now.

> When this problem works on GiNaC I'd like to know its speed!

Here are timings on a P-III, 1GHz, 512kB cache, running Linux, with
approximate memory usage:

  System                                               time     Memory
  --------------------------------------------------+--------+---------
  GiNaC-1.0.8 (from CVS), using GCC-3.1 prerelease:    57.3s      8MB
  MuPAD 1.4.2:                                        119.2s     25MB
  MuPAD 2.0:                                          117.0s     30MB
  Form 3.0:                                             0.5s      6MB
  Maxima 5.6 (GCL 2.4.0):                              51.0s      8MB
  MapleVR3:                                            16.7s      7MB
  MapleVR4:                                            17.9s      8MB
  MapleVR5:                                            18.4s      8MB
  Mathematica 4.1:                                     80.5s     11MB
  Reduce 3.6:                                     Segmentation Fault
  Yacas 1.0.48:                 N/A (smaller examples fail to expand fully)

Richard, I noticed a mistake in the printed version of the Mathematica
program (a missing Expand[]), and the MockMMA one doesn't seem right to me
either.  Note that Form can really only do ring operations like in this
test, but its speed there should come as no surprise.  You probably just
cannot beat it with any more general system.

Regards
  -richy.
--
Richard B. Kreckel


From fateman at cs.berkeley.edu  Thu Mar 28 20:14:03 2002
From: fateman at cs.berkeley.edu (Richard Fateman)
Date: Thu, 28 Mar 2002 11:14:03 -0800
Subject: Form is really slow and fast for me.  What could this mean?
References:
Message-ID: <3CA36B7B.2050003@cs.berkeley.edu>

Thanks for this data.  Your timings look like mine for the ones that
overlap.  The difference between 1 GHz and 933 MHz is not that great, and
I guess the other parameters are close too.  It is odd that Maple 7 is so
much slower than Maple4 etc.  Maybe the change to longer forms slowed it
down.

I tried downloading form3.0, and I find it mysterious.  Maybe I am not
using it correctly.  My first version took 101.23 seconds.  I repeated it
and it took 97 seconds.

This is what I did:

C:\form>form sample3.frm
form sample3.frm
FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:26:41 2002
    Symbols x,y,z;
    Local q=(1+x+y+z)^20;
    Local q1=q+1;
    Local res=q*q1;
    .end
.....

This changed program however took only 0.98 seconds:

C:\form>form sample7.frm
form sample7.frm
FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:50:53 2002
    Symbols x,y,z;
    Local q2=(1+x+y+z)^20*(1+(1+x+y+z)^20);
    .end

Time =       1.04 sec    Generated terms =      14112
            q2           Terms in output =      12341
                         Bytes used      =     305378

Richard B. Kreckel wrote:
> Hi,
>
> On Sun, 24 Mar 2002, Richard Fateman wrote:
>> This seems to me to be a design that should be revisited.  For the
>> univariate case it is particularly bad, since multiplying polynomials
>> of degree u and v gives you only degree u+v, and you will have
>> allocated (u+1)*(v+1) cells.
>
> Sure.  Done now.
>
>> When this problem works on GiNaC I'd like to know its speed!
>
> Here are timings on a P-III, 1GHz, 512kB cache, running Linux, with
> approximate memory usage:
>
>   System                                               time     Memory
>   --------------------------------------------------+--------+---------
>   GiNaC-1.0.8 (from CVS), using GCC-3.1 prerelease:    57.3s      8MB
>   MuPAD 1.4.2:                                        119.2s     25MB
>   MuPAD 2.0:                                          117.0s     30MB
>   Form 3.0:                                             0.5s      6MB
>   Maxima 5.6 (GCL 2.4.0):                              51.0s      8MB
>   MapleVR3:                                            16.7s      7MB
>   MapleVR4:                                            17.9s      8MB
>   MapleVR5:                                            18.4s      8MB
>   Mathematica 4.1:                                     80.5s     11MB
>   Reduce 3.6:                                     Segmentation Fault
>   Yacas 1.0.48:                 N/A (smaller examples fail to expand fully)
>
> Richard, I noticed a mistake in the printed version of the Mathematica
> program (a missing Expand[]), and the MockMMA one doesn't seem right to me
> either.  Note that Form can really only do ring operations like in this
> test, but its speed there should come as no surprise.  You probably just
> cannot beat it with any more general system.
>
> Regards
>   -richy.
From parisse at mozart.ujf-grenoble.fr  Thu Mar 28 20:03:56 2002
From: parisse at mozart.ujf-grenoble.fr (parisse)
Date: Thu, 28 Mar 2002 20:03:56 +0100
Subject: Form is really slow and fast for me.  What could this mean?
References: <3CA36B7B.2050003@cs.berkeley.edu>
Message-ID: <3CA3691C.3@fourier.ujf-grenoble.fr>

> I tried downloading form3.0, and I find it mysterious.  Maybe I am not
> using it correctly.  My first version took 101.23 seconds.  I repeated it
> and it took 97 seconds.
>
> This is what I did:
>
> C:\form>form sample3.frm
> form sample3.frm
> FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:26:41 2002
>     Symbols x,y,z;
>     Local q=(1+x+y+z)^20;
>     Local q1=q+1;
>     Local res=q*q1;
>     .end
> .....
>
> This changed program however took only 0.98 seconds:
>
> C:\form>form sample7.frm
> form sample7.frm
> FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:50:53 2002
>     Symbols x,y,z;
>     Local q2=(1+x+y+z)^20*(1+(1+x+y+z)^20);
>     .end
>
> Time =       1.04 sec    Generated terms =      14112
>             q2           Terms in output =      12341
>                          Bytes used      =     305378

The reason might be that the first power is expanded as
(1+x+y+z)*(1+x+y+z)*...*(1+(1+x+y+z)*...*(1+x+y+z)), which will expand
much faster than making the product of 1+x+y+z expanded.  This would be
the same reason why it is never a good idea to use the chinese powering
algorithm for the expansion of powers of multivariate polynomials.


From pearu at cens.ioc.ee  Thu Mar 28 21:45:25 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu, 28 Mar 2002 22:45:25 +0200 (EET)
Subject: Form is really slow and fast for me.  What could this mean?
In-Reply-To: <3CA3691C.3@fourier.ujf-grenoble.fr>
Message-ID:

Hi,

I just discovered Form for myself.  It is just amazing how efficient and
powerful, and yet how inefficient and limited, this program is (the latter
when it is misused, I guess).  Here follows another example of misusing
the Form program.

Just out of curiosity I tried to evaluate q/q where q=(1+x+y+z)^20.
Any (?) other CAS would result in 1, but not Form:

-------------------------------------------
FORM by J.Vermaseren,version 3.0(Oct 31 2001) Run at: Thu Mar 28 21:30:41 2002
    Symbols x,y,z;
    Local q=(1+x+y+z)^20;
    Local res=q/q;
    .end

Time =       0.06 sec    Generated terms =       1771
            q            Terms in output =       1771
                         Bytes used      =      31782

Sorted function argument too long.
Called from InFunction
----------------------------------------------

Using a smaller exponent shows what is going on:

-----------------------------------------------
FORM by J.Vermaseren,version 3.0(Oct 31 2001) Run at: Thu Mar 28 21:35:35 2002
    Symbols x,y,z;
    Local q=(1+x+y+z);
    Local res=q/q;
    print;
    .end

Time =       0.00 sec    Generated terms =          4
            q            Terms in output =          4
                         Bytes used      =         54

Time =       0.00 sec    Generated terms =          4
            res          Terms in output =          4
                         Bytes used      =        122

   q = 1 + x + y + z;

   res = 1/(1 + x + y + z) + 1/(1 + x + y + z)*x + 1/(1 + x + y + z)*y
       + 1/(1 + x + y + z)*z;
--------------------------------------------

Regards,
	Pearu
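A quick sanity check on the term counts Form prints in this thread: the
expansion of (1+x+y+z)^n contains every monomial x^a y^b z^c with
a+b+c <= n, so the counts are just binomial coefficients,

    \#\{\, x^a y^b z^c : a+b+c \le n \,\} = \binom{n+3}{3},
    \qquad \binom{23}{3} = 1771,
    \qquad \binom{43}{3} = 12341,

which matches the 1771 terms reported for q=(1+x+y+z)^20 above and the
12341 output terms of the degree-40 product in Richard Fateman's earlier
sample7.frm run (whose 14112 generated terms happen to equal 1771 + 12341).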
From pearu at cens.ioc.ee  Thu Mar 28 22:27:25 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu, 28 Mar 2002 23:27:25 +0200 (EET)
Subject: Form is really slow and fast for me.  What could this mean?
In-Reply-To: <3CA36B7B.2050003@cs.berkeley.edu>
Message-ID:

Hi,

On Thu, 28 Mar 2002, Richard Fateman wrote:
> C:\form>form sample3.frm
> form sample3.frm
> FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:26:41 2002
>     Symbols x,y,z;
>     Local q=(1+x+y+z)^20;
>     Local q1=q+1;
>     Local res=q*q1;
>     .end
> .....

Note that using .sort makes the above example approx. 4 times faster:

    Symbols x,y,z;
    Local q=(1+x+y+z)^20;
    .sort
    Local q1=1+q;
    .sort
    Local res=q1*q;
    .end

Regards,
	Pearu


From fateman at cs.berkeley.edu  Thu Mar 28 23:58:49 2002
From: fateman at cs.berkeley.edu (Richard Fateman)
Date: Thu, 28 Mar 2002 14:58:49 -0800
Subject: Form is really slow and fast for me.  What could this mean?
References:
Message-ID: <3CA3A029.1050106@cs.berkeley.edu>

It is not necessary to have two .sorts in there.  I took out the first
one, and it was just as fast.  Why should this help?

I tried this:

    Symbols x,y,z;
    Local q=(1+x+y+z)^20*(1+(1+z+x+y)^20)-(1+x+z+y)^20*(1+(1+z+x+y)^20);
    Print;
    .end

and it took about twice as long as the longest time (190 seconds).

I also tried

    Local q=(1+x+y+z)^40

and that takes about .98 seconds;

    Local q=(1+x+y+z)^40- (1+x+z+y)^40

takes about 1.7 seconds.

What is going on?

Pearu Peterson wrote:
> Hi,
>
> On Thu, 28 Mar 2002, Richard Fateman wrote:
>> C:\form>form sample3.frm
>> form sample3.frm
>> FORM by J.Vermaseren,version 3.0(Jan 28 2001) Run at: Thu Mar 28 10:26:41 2002
>>     Symbols x,y,z;
>>     Local q=(1+x+y+z)^20;
>>     Local q1=q+1;
>>     Local res=q*q1;
>>     .end
>> .....
>
> Note that using .sort makes the above example approx. 4 times faster:
>
>     Symbols x,y,z;
>     Local q=(1+x+y+z)^20;
>     .sort
>     Local q1=1+q;
>     .sort
>     Local res=q1*q;
>     .end
>
> Regards,
> 	Pearu


From parisse at mozart.ujf-grenoble.fr  Fri Mar 29 07:09:17 2002
From: parisse at mozart.ujf-grenoble.fr (parisse)
Date: Fri, 29 Mar 2002 07:09:17 +0100
Subject: Form is really slow and fast for me.  What could this mean?
References: <3CA3A029.1050106@cs.berkeley.edu>
Message-ID: <3CA4050D.1050201@fourier.ujf-grenoble.fr>

Richard Fateman wrote:
> It is not necessary to have two .sorts in there.  I took out the first
> one, and it was just as fast.  Why should this help?
>
> I tried this:
>
>     Symbols x,y,z;
>     Local q=(1+x+y+z)^20*(1+(1+z+x+y)^20)-(1+x+z+y)^20*(1+(1+z+x+y)^20);
>     Print;
>     .end
>
> and it took about twice as long as the longest time (190 seconds).
>
> I also tried
>
>     Local q=(1+x+y+z)^40
>
> and that takes about .98 seconds;
>
>     Local q=(1+x+y+z)^40- (1+x+z+y)^40
>
> takes about 1.7 seconds.
>
> What is going on?

Taking the power is much faster simply because you do *not* apply
the chinese powering algorithm, as a simple cost analysis shows
for any multivariate polynomial.
Just make the 40* multiplication loop.


From kreckel at thep.physik.uni-mainz.de  Fri Mar 29 18:17:01 2002
From: kreckel at thep.physik.uni-mainz.de (Richard B. Kreckel)
Date: Fri, 29 Mar 2002 18:17:01 +0100 (CET)
Subject: Form is really slow and fast for me.  What could this mean?
In-Reply-To: <3CA4050D.1050201@fourier.ujf-grenoble.fr>
Message-ID:

Hi,

On Fri, 29 Mar 2002, parisse wrote:
> Richard Fateman wrote:
> > It is not necessary to have two .sorts in there.  I took out the first
> > one, and it was just as fast.  Why should this help?
> >
> > I tried this:
> >
> >     Symbols x,y,z;
> >     Local q=(1+x+y+z)^20*(1+(1+z+x+y)^20)-(1+x+z+y)^20*(1+(1+z+x+y)^20);
> >     Print;
> >     .end
> >
> > and it took about twice as long as the longest time (190 seconds).
> >
> > I also tried
> >
> >     Local q=(1+x+y+z)^40
> >
> > and that takes about .98 seconds;
> >
> >     Local q=(1+x+y+z)^40- (1+x+z+y)^40
> >
> > takes about 1.7 seconds.
> >
> > What is going on?
>
> Taking the power is much faster simply because you do *not* apply
> the chinese powering algorithm, as a simple cost analysis shows
> for any multivariate polynomial.  Just make the 40* multiplication loop.

Honestly, I've never heard the term "chinese powering".  I assume you mean
fast exponentiation (by successively going through the binary
representation of the integer exponent)?  Sure, the multiplication loops
are much faster.  I guess this is what Knuth means in volume 2, in his
section "evaluation of powers".  BTW, even faster you can directly build
up the result with the proper coefficients in place (as GiNaC does in
power::expand_add(), slightly fixed in CVS, too).

Errhh, D. Knuth gives a citation of a review article in that section which
sounds interesting to me.  Since it is quite hard to get, could that be
made available online by its author, or was it prior to electronic
preparation of manuscripts?  ;-)

However, I am not sure how that can be the culprit why Form is so slow in
the second case.  The difference for exponent 20 seems to be a factor of
ten, based on my own tests.  Certainly not a factor of 100 or more...

Regards
  -richy.
--
Richard B. Kreckel


From fateman at cs.berkeley.edu  Fri Mar 29 18:33:18 2002
From: fateman at cs.berkeley.edu (Richard Fateman)
Date: Fri, 29 Mar 2002 09:33:18 -0800
Subject: Form is really slow and fast for me.  What could this mean?
References:
Message-ID: <3CA4A55E.7D044FB6@cs.berkeley.edu>

My guess is that in Form, when you write

    Local q=(1+x+y+z)^40

what is produced is some internal encoded form that is inadequate for
addition of polynomials, but is somehow adequate to know how many terms
there are.  Form, in its devotion to handling large expressions, is not
so efficient at adding this expression, which has "only" 13,000 terms and
can be handled by other computer algebra systems.  (Like a page-oriented
sort.)  But this may be entirely wrong.

If the review article is something I wrote (I wrote 2 papers on polynomial
powering: sparse and dense), I'll try to get it online.

RJF

"Richard B. Kreckel" wrote:
>
> Hi,
>
> On Fri, 29 Mar 2002, parisse wrote:
> > Richard Fateman wrote:
> > > It is not necessary to have two .sorts in there.  I took out the first
> > > one, and it was just as fast.  Why should this help?
> > >
> > > I tried this:
> > >
> > >     Symbols x,y,z;
> > >     Local q=(1+x+y+z)^20*(1+(1+z+x+y)^20)-(1+x+z+y)^20*(1+(1+z+x+y)^20);
> > >     Print;
> > >     .end
> > >
> > > and it took about twice as long as the longest time (190 seconds).
> > >
> > > I also tried
> > >
> > >     Local q=(1+x+y+z)^40
> > >
> > > and that takes about .98 seconds;
> > >
> > >     Local q=(1+x+y+z)^40- (1+x+z+y)^40
> > >
> > > takes about 1.7 seconds.
> > >
> > > What is going on?
> >
> > Taking the power is much faster simply because you do *not* apply
> > the chinese powering algorithm, as a simple cost analysis shows
> > for any multivariate polynomial.  Just make the 40* multiplication loop.
>
> Honestly, I've never heard the term "chinese powering".  I assume you mean
> fast exponentiation (by successively going through the binary
> representation of the integer exponent)?  Sure, the multiplication loops
> are much faster.  I guess this is what Knuth means in volume 2, in his
> section "evaluation of powers".  BTW, even faster you can directly build
> up the result with the proper coefficients in place (as GiNaC does in
> power::expand_add(), slightly fixed in CVS, too).
>
> Errhh, D. Knuth gives a citation of a review article in that section which
> sounds interesting to me.  Since it is quite hard to get, could that be
> made available online by its author, or was it prior to electronic
> preparation of manuscripts?  ;-)
>
> However, I am not sure how that can be the culprit why Form is so slow in
> the second case.  The difference for exponent 20 seems to be a factor of
> ten, based on my own tests.  Certainly not a factor of 100 or more...
>
> Regards
>   -richy.
> --
> Richard B. Kreckel
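For the record, here is a minimal sketch of the two powering strategies
contrasted above, with a polynomial represented simply as a map from
exponent vector to coefficient and a straightforward expanding
multiplication.  None of this is Form or GiNaC code; building the expanded
power directly from multinomial coefficients, as GiNaC's
power::expand_add() is said to do above, avoids even the intermediate
products of the loop but is longer to write down.

    #include <cstdint>
    #include <map>
    #include <vector>

    using Expo = std::vector<int>;                 // exponent vector of a monomial
    using Poly = std::map<Expo, std::int64_t>;     // monomial -> coefficient

    // Straightforward expanding multiplication, combining terms on the fly.
    Poly poly_mul(const Poly& a, const Poly& b)
    {
        Poly c;
        for (const auto& ta : a)
            for (const auto& tb : b) {
                Expo e = ta.first;
                for (std::size_t i = 0; i < e.size(); ++i)
                    e[i] += tb.first[i];
                c[e] += ta.second * tb.second;
            }
        return c;
    }

    // The simple multiplication loop (n >= 1 assumed): n-1 products, and one
    // factor is always the small base polynomial.
    Poly pow_loop(const Poly& base, unsigned n)
    {
        Poly r = base;
        for (unsigned i = 1; i < n; ++i)
            r = poly_mul(r, base);
        return r;
    }

    // Binary ("chinese") powering: fewer multiplications, but the factors are
    // the big intermediate squares -- the reason given above why it pays off
    // for integers and not for expanding multivariate polynomials.
    Poly pow_binary(const Poly& base, unsigned n)
    {
        Expo zero(base.begin()->first.size(), 0);  // assumes base is non-empty
        Poly r;
        r[zero] = 1;                               // the constant polynomial 1
        Poly sq = base;
        for (unsigned e = n; e > 0; e >>= 1) {
            if (e & 1)
                r = poly_mul(r, sq);
            if (e > 1)
                sq = poly_mul(sq, sq);
        }
        return r;
    }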
From cbauer at thep.physik.uni-mainz.de  Sun Mar 31 20:42:15 2002
From: cbauer at thep.physik.uni-mainz.de (Christian Bauer)
Date: Sun, 31 Mar 2002 20:42:15 +0200
Subject: GiNaC 1.0.8 now available
Message-ID: <20020331204215.A23556@higgs.physik.uni-mainz.de>

Hello!

Following our recent trend of Sunday evening releases we are proud to
announce the immediate availability of GiNaC 1.0.8, which you can get from

  ftp://ftpthep.physik.uni-mainz.de/pub/GiNaC

Changes include, but are not limited to

 - Improvements in memory usage of the expand() methods.

Hm, wait... that's the only change!?  Aha, sorting of vector is a little
faster, and the namespace fix for is_ex_the_function() is in.  Alas, long
gone now are the days of 0.9.x where changelog pages filled screen after
screen...

Bye,
 Christian

--
  / Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/