From tkluck at infty.nl Mon Aug 13 01:19:22 2012
From: tkluck at infty.nl (Timo Kluck)
Date: Mon, 13 Aug 2012 01:19:22 +0200
Subject: [GiNaC-devel] Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To: <50283963.6256b40a.1fd5.ffffb000@mx.google.com>
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com>
Message-ID:

Hi,

You may know the CAS Sage [1], which uses GiNaC. It wraps GiNaC
expressions in Python, and it adds some extra methods of its own.

One of those methods is variables(), which returns a list of symbols
appearing in the expression. It does so by recursively walking down
the expression tree. At one point, that was a bottleneck in my
algorithm, and I realized that it would be very easy to cache those
variables at construction time from the subexpressions.

I think the natural place to implement such a thing would be in GiNaC
itself, so that everyone (not only Sage users) may benefit. Is this
planned and/or would you accept such a patch?

Best regards,
Timo Kluck

[1] www.sagemath.org

From alexei.sheplyakov at gmail.com Mon Aug 13 08:55:05 2012
From: alexei.sheplyakov at gmail.com (Alexei Sheplyakov)
Date: Mon, 13 Aug 2012 09:55:05 +0300
Subject: [GiNaC-devel] Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To:
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com>
Message-ID: <20120813065505.GA10841@vargsbox.jinr.ru>

Hello,

On Mon, Aug 13, 2012 at 01:19:22AM +0200, Timo Kluck wrote:
> One of those methods is variables(), which returns a list of symbols
> appearing in the expression. It does so by recursively walking down
> the expression tree. At one point, that was a bottleneck in my
> algorithm, and I realized that it would be very easy to cache those
> variables at construction time from the subexpressions.
>
> I think the natural place to implement such a thing would be in GiNaC
> itself, so that everyone (not only Sage users) may benefit.

I don't think it's a good idea.
This adds a non-negligible overhead to every eval() and expression
construction, even if the actual calculation does not need
variables() (or symbols(), or whatever it is called).

Besides, I don't think caching really helps. Consider the following
code:

    symbol x, y;
    ex e = x + y;
    ex g = x - y;
    ex e_plus_g = e + g;

One would expect e_plus_g.symbols() to return just x. That means
symbols() needs to scan the whole e_plus_g expression (to find out
that it doesn't contain y), which makes the cache kind of useless.

Best regards,
Alexei

From kreckel at ginac.de Mon Aug 13 09:02:19 2012
From: kreckel at ginac.de (Richard B. Kreckel)
Date: Mon, 13 Aug 2012 09:02:19 +0200
Subject: [GiNaC-devel] Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To:
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com>
Message-ID: <5028A67B.60007@ginac.de>

Hi,

On 08/13/2012 01:19 AM, Timo Kluck wrote:
> You may know the CAS Sage [1] which uses GiNaC. It wraps GiNaC
> expressions in Python, and it adds some extra methods of its own.
>
> One of those methods is variables(), which returns a list of symbols
> appearing in the expression. It does so by recursively walking down
> the expression tree. At one point, that was a bottleneck in my
> algorithm, and I realized that it would be very easy to cache those
> variables at construction time from the subexpressions.
>
> I think the natural place to implement such a thing would be in GiNaC
> itself, so that everyone (not only Sage users) may benefit. Is this
> planned and/or would you accept such a patch?

Wouldn't it be better to go one step further and cache those variables
when the variables() function is first used? This avoids quite a lot
of overhead in space and time! And then, variables() wouldn't have to
be a member function, right?

Cheers
  -richy.
-- 
Richard B. Kreckel

From tkluck at infty.nl Mon Aug 13 10:31:56 2012
From: tkluck at infty.nl (Timo Kluck)
Date: Mon, 13 Aug 2012 10:31:56 +0200
Subject: [GiNaC-devel] Fwd: Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To: <5028b107.2562b40a.5afc.413d@mx.google.com>
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com> <5028A67B.60007@ginac.de> <5028b107.2562b40a.5afc.413d@mx.google.com>
Message-ID:

On Mon, Aug 13, 2012 at 9:02, Richard B. Kreckel wrote:
> Wouldn't it be better to go one step further and cache those variables when
> the variables() function is first used? This avoids quite a lot of overhead in
> space and time! And then, variables() wouldn't have to be a member
> function, right?

That is probably best, especially in the light of the (x+y) - y
example that Alexei mentioned.

I could easily do that in Sage, but I think it wouldn't help in my
case because I'm constructing so many expressions.

However, I'm thinking it may make sense to do that in GiNaC, because
from what I've seen it has some sort of a copy-on-write mechanism for
subexpressions? As in:

    f = sin(x+y)
    g = f * cos(x-y)
    h = f * tan(z)

will make g and h share the instance (and therefore the cache!) for f?
Please correct me if I've misunderstood that. But if it has, this is
abstracted away in Sage, so caching the variables() function in Sage
would have much less benefit than caching it in GiNaC.

I'm not sure whether this will actually give me any speed benefit in
practice, because even though I'm using very similar subexpressions
over and over again, I'm not sure if I could construct them as the
same object without writing really awkward code. I'm interested in
your opinions, though.

So my next question is: would you accept a patch implementing a
recursive, cached version of symbols(), for Sage to call directly, and
then have responsibility for optimizing / caching that going to the
level of GiNaC?
I would normally think that it would be a method of ex (and basic).
Where would you say the cache lives if it were a global function?

Timo

From kreckel at ginac.de Tue Aug 14 09:54:17 2012
From: kreckel at ginac.de (Richard B. Kreckel)
Date: Tue, 14 Aug 2012 09:54:17 +0200
Subject: [GiNaC-devel] Fwd: Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To:
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com> <5028A67B.60007@ginac.de> <5028b107.2562b40a.5afc.413d@mx.google.com>
Message-ID: <502A0429.60301@ginac.de>

On 08/13/2012 10:31 AM, Timo Kluck wrote:
> On Mon, Aug 13, 2012 at 9:02, Richard B. Kreckel wrote:
>> Wouldn't it be better to go one step further and cache those variables when
>> the variables() function is first used? This avoids quite a lot of overhead in
>> space and time! And then, variables() wouldn't have to be a member
>> function, right?
>
> That is probably best, especially in the light of the (x+y) - y
> example that Alexei mentioned.
>
> I could easily do that in Sage, but I think it wouldn't help in my
> case because I'm constructing so many expressions.
>
> However, I'm thinking it may make sense to do that in GiNaC, because
> from what I've seen it has some sort of a copy-on-write mechanism for
> subexpressions? As in:
>
> f = sin(x+y)
> g = f * cos(x-y)
> h = f * tan(z)
>
> will make g and h share the instance (and therefore the cache!) for f?

Yes, that is right.

> Please correct me if I've misunderstood that. But if it has, this is
> abstracted away in Sage, so caching the variables() function in Sage
> would have much less benefit than caching it in GiNaC.
>
> I'm not sure whether this will actually give me any speed benefit in
> practice, because even though I'm using very similar subexpressions
> over and over again, I'm not sure if I could construct them as the
> same object without writing really awkward code. I'm interested in
> your opinions, though.
Hmm, note that even if you cannot construct them as one object they
might end up as one object after a while (cf. the private
ex::share(ex) method). Whether that helps or doesn't help in your case
depends on the actual use pattern.

There remains Alexei's worry that this is likely to substantially slow
down all applications. Optimization is full of surprises. In any case,
you should experiment with different strategies and carefully observe
their impact!

> So my next question is: would you accept a patch implementing a
> recursive, cached version of symbols(), for Sage to call directly, and
> then have responsibility for optimizing / caching that going to the
> level of GiNaC?

Well, any patch would have to go into Pynac, in order to be of
advantage in Sage. Pynac is a fork of GiNaC. (In the past, we've
repeatedly fixed Sage bugs in GiNaC and suggested patches for Pynac
and vice versa, but that doesn't mean that changes in GiNaC propagate
automatically into Sage. At least I'm not aware of that.)

> I would normally think that it would be a method of ex (and basic).
> Where would you say the cache lives if it were a global function?

It should be possible to equip that global function with a static
hashmap ex -> symbols. Of course, that opens a memory leak because the
map is never cleaned up. So, as an alternative, one could maintain
that map only while it can really speed up things: store it in an
object and delete it when that object is destructed.

Cheers
  -richy.
-- 
Richard B. Kreckel

From tkluck at infty.nl Tue Aug 14 11:00:03 2012
From: tkluck at infty.nl (Timo Kluck)
Date: Tue, 14 Aug 2012 11:00:03 +0200
Subject: [GiNaC-devel] Fwd: Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To: <502A0429.60301@ginac.de>
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com> <5028A67B.60007@ginac.de> <5028b107.2562b40a.5afc.413d@mx.google.com> <502A0429.60301@ginac.de>
Message-ID:

2012/8/14 Richard B. Kreckel:
> There remains Alexei's worry that this is likely to substantially slow down
> all applications.

I think his worry related to my initial suggestion for doing this at
construction time. I think that just adding a member function
implementing its own cache shouldn't add overhead to applications not
using it. (The only difference would be the constructor having to
initialize the cache to NULL).

>
> Well, any patch would have to go into Pynac, in order to be of advantage in
> Sage. Pynac is a fork of GiNaC. (In the past, we've repeatedly fixed Sage
> bugs in GiNaC and suggested patches for Pynac and vice versa, but that
> doesn't mean that changes in GiNaC propagate automatically into Sage. At
> least I'm not aware of that.)

Thanks for pointing that out; I was under the impression that Pynac
was a Python wrapper for GiNaC. I might just have to look at Pynac
then. Do you happen to know what the reason was for forking?

>> I would normally think that it would be a method of ex (and basic).
>> Where would you say the cache lives if it were a global function?
> It should be possible to equip that global function with a static hashmap ex
> -> symbols. Of course, that opens a memory leak because the map is never
> cleaned up. So, as an alternative, one could maintain that map only while it
> can really speed up things: store it in an object and delete it when that
> object is destructed.

There would be no memory leak if the cache were to live in the object.

From kreckel at ginac.de Wed Aug 22 00:51:39 2012
From: kreckel at ginac.de (Richard B. Kreckel)
Date: Wed, 22 Aug 2012 00:51:39 +0200
Subject: [GiNaC-devel] Fwd: Add method ex::symbols for obtaining all symbols appearing in an expression
In-Reply-To:
References: <50283963.6256b40a.1fd5.ffffb000@mx.google.com> <5028A67B.60007@ginac.de> <5028b107.2562b40a.5afc.413d@mx.google.com> <502A0429.60301@ginac.de>
Message-ID: <503410FB.2050202@ginac.de>

Hi!

On 08/14/2012 11:00 AM, Timo Kluck wrote:
> 2012/8/14 Richard B. Kreckel:
>> There remains Alexei's worry that this is likely to substantially slow down
>> all applications.
> I think his worry related to my initial suggestion for doing this at
> construction time. I think that just adding a member function
> implementing its own cache shouldn't add overhead to applications not
> using it. (The only difference would be the constructor having to
> initialize the cache to NULL).

That is correct. How would you manage that list of symbols? One may
have to take care about when to destroy it, in order not to break
refcounting and create a memory leak, I suppose.

> Do you happen to know what the reason was for forking?

Sage already had its own number system and didn't want to bring
another one (CLN). Also, there wasn't much interest in some of the
special algebra for high energy physics. No worries.

> There would be no memory leak if the cache were to live in the object.

That's correct, too. Still, you should carefully benchmark your
proposal to see if it really helps. If it does, and if a .symbols()
member function is really a bottleneck in some applications, then,
well, why not add it to GiNaC? And, please, discuss this with the
Pynac developers, too, for the obvious reasons.

Cheers
  -richy.
-- 
Richard B. Kreckel
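
The variant the thread converges on (a cache that lives in the
expression object and is filled on first use, not at construction
time) can be modelled roughly as follows. This is only a toy sketch in
Python, the language in which Sage exposes variables(); it is not
GiNaC or Pynac code. The Node class, its symbols() method, the
_symbols_cache attribute and the scans counter are all hypothetical
names chosen for illustration. The sketch performs no simplification,
so it does not model the cancellation case from Alexei's example.

```python
from typing import FrozenSet, Optional, Tuple


class Node:
    """Toy expression node (hypothetical; not GiNaC's ex/basic classes)."""

    scans = 0  # number of nodes actually scanned, kept for illustration

    def __init__(self, op: str, children: Tuple["Node", ...] = (),
                 name: Optional[str] = None):
        self.op = op
        self.children = children
        self.name = name
        # Nothing is computed at construction time: the cache starts
        # empty, mirroring "initialize the cache to NULL".
        self._symbols_cache: Optional[FrozenSet[str]] = None

    def symbols(self) -> FrozenSet[str]:
        """Return the symbol names in this subtree, computed on first use."""
        if self._symbols_cache is None:
            Node.scans += 1
            if self.op == "symbol":
                result = frozenset([self.name])
            else:
                # Union of the (possibly already cached) child results.
                result = frozenset().union(
                    *(child.symbols() for child in self.children))
            self._symbols_cache = result
        return self._symbols_cache


def sym(name: str) -> Node:
    return Node("symbol", name=name)


def call(op: str, *args: Node) -> Node:
    return Node(op, tuple(args))


# The example from the thread: g and h share the same object for f.
x, y, z = sym("x"), sym("y"), sym("z")
f = call("sin", call("+", x, y))
g = call("*", f, call("cos", call("-", x, y)))
h = call("*", f, call("tan", z))

g_symbols = g.symbols()
scans_after_g = Node.scans
h_symbols = h.symbols()
scans_after_h = Node.scans

print(sorted(g_symbols))              # ['x', 'y']
print(sorted(h_symbols))              # ['x', 'y', 'z']
print(scans_after_h - scans_after_g)  # 3: only h, tan(z) and z are scanned
```

Because the cache is stored on the node itself, it is released together
with the node, and code that never calls symbols() pays only for one
empty attribute per node. Whether this helps in a real workload still
has to be benchmarked, as Richard points out.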