[CLN-list] CLN

Mon Nov 8 09:17:54 CET 2010

Dear Kåre,

On 10/29/2010 09:33 PM, Kåre Olaussen wrote:
> Dear Bruno and Richard.
>
> I tried to subscribe to the CLN mailing list, but the subscription page does not work.

Indeed. There is a packaging problem in openSUSE 11.3 that leads to
spurious error messages after subscribing. (The version of Mailman in
openSUSE 11.3 is older than the one in previous openSUSE releases and
has not been updated for the Python 2.6 shipped with openSUSE 11.*.)
This will be fixed at some point.

Anyway, you are subscribed now, according to the members list. I'm 
CC'ing you just to be sure.

> We want to use CLN to make world record calculations in various directions,
> but this requires effective use of parallel (or multi-thread) computations.
>
> Our experience with (careful) multithread computations using CLN is spotless.
> I have read your reservation in the mailing list archive  that CLN is not 'thread-safe'.
>
> I don't understand what you mean by that!

All CLN objects are reference-counted, and the reference-counting code
is susceptible to race conditions: different threads manipulate the
reference counts without any synchronization with concurrent threads.

> Is that like declaring that "no car is collision safe" (in which sense I think no software
> package is 'thread-safe'), or are there specific problem areas you know of?
> We have already made many-many-million CLN algebraic operations both in
> serial and threaded (using openMP), and always obtained the same results.

That is entirely possible if you don't share CLN objects between threads 
or do so only very infrequently. The most likely problem you'll ever hit 
is probably a segmentation fault while accessing memory another thread 
has prematurely freed.
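
To make the reference-count hazard concrete, here is a minimal sketch in
plain C++. It is not CLN code: the type Counted and the helpers acquire,
release and hammer are hypothetical names chosen for illustration, and
the layout is not CLN's. It shows a count that is updated atomically, so
that concurrent copies and drops of one shared handle stay consistent.
With a plain integer in place of the atomic, two threads could lose an
update, and the object could be freed while another thread still uses it.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a reference-counted heap object
// (an assumption for illustration, not CLN's actual layout).
struct Counted {
    // With a plain `long` here, concurrent updates would be a data race.
    std::atomic<long> refs{1};
};

// Copying a handle increments the count.
inline void acquire(Counted& c) {
    c.refs.fetch_add(1, std::memory_order_relaxed);
}

// Dropping a handle decrements the count; returns true when the caller
// held the last reference and would be responsible for freeing the object.
inline bool release(Counted& c) {
    return c.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}

// Let `threads` threads each copy and drop one shared handle `rounds`
// times, then report the final count.
long hammer(int threads, int rounds) {
    Counted shared;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&shared, rounds] {
            for (int i = 0; i < rounds; ++i) {
                acquire(shared);
                release(shared);
            }
        });
    }
    for (auto& th : pool) th.join();
    return shared.refs.load();
}
```

Calling hammer(4, 100000) ends with the count back at 1, because every
increment is matched by a decrement and no update is lost. Code that
never shares an object between threads avoids the question altogether,
which is consistent with the results you observed.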

> We also want to run our (CLN) programs on large computer clusters -- using MPI.
> This could be made much more efficient if we could transfer for instance CL_F
> numbers between different computers in binary form.
>
> Could you please help me understand how this can be done?

Instead of hacking something low-level, let me make a suggestion:
convert your floating-point numbers to a triple (sign, mantissa, 
exponent), serialize it to a buffer in base 32, send that across your 
cluster and let the receiver reverse this operation. Check the manual 
for the type decoded_float.

The point is that conversion to/from a base that is a power of two is 
O(N) with little overhead. It is significantly more efficient than 
conversion to any other base, which is O(N*log(N)*log(log(N))). And 32 
is the largest supported base. Each base-32 digit carries 5 bits but 
occupies an 8-bit character, so the overhead compared to the internal 
representation is only 1-log2(32)/log2(256)=3/8. Unless your program is 
extremely communication-intensive (in which case you should think about 
redesigning it), this is almost optimal.
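
The scheme suggested above can be sketched as follows. This is only an
illustration under stated assumptions: it uses a built-in double as a
stand-in for a CLN float and a 64-bit integer as a stand-in for the
mantissa, so that it compiles without CLN. The names Triple, decode,
encode, to_base32, from_base32, serialize and deserialize, as well as
the '@' separator in the text format, are hypothetical. With CLN you
would use the decoding functions and integer I/O described in the
manual instead. Infinities and NaNs are not handled.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical wire triple: value = sign * mantissa * 2^exponent.
struct Triple {
    int sign;               // +1 or -1
    std::uint64_t mantissa; // integer mantissa (at most 53 bits for a double)
    int exponent;           // binary exponent
};

// Split a finite double into (sign, mantissa, exponent).
Triple decode(double x) {
    int e = 0;
    double m = std::frexp(std::fabs(x), &e);  // m is 0 or in [0.5, 1)
    auto mant = static_cast<std::uint64_t>(std::ldexp(m, 53)); // exact
    return { std::signbit(x) ? -1 : 1, mant, e - 53 };
}

// Reassemble the double from the triple.
double encode(const Triple& t) {
    return t.sign * std::ldexp(static_cast<double>(t.mantissa), t.exponent);
}

// Base 32 is a power of two, so each digit is just 5 bits: shift and mask.
std::string to_base32(std::uint64_t n) {
    static const char digits[] = "0123456789abcdefghijklmnopqrstuv";
    std::string out;
    do {
        out.insert(out.begin(), digits[n & 31]);
        n >>= 5;
    } while (n != 0);
    return out;
}

// Inverse of to_base32; expects only the characters 0-9 and a-v.
std::uint64_t from_base32(const std::string& s) {
    std::uint64_t n = 0;
    for (char c : s) {
        unsigned d = (c >= '0' && c <= '9') ? c - '0' : c - 'a' + 10;
        n = (n << 5) | d;
    }
    return n;
}

// Text form: sign character, base-32 mantissa, '@', decimal exponent.
std::string serialize(double x) {
    Triple t = decode(x);
    return std::string(t.sign < 0 ? "-" : "+") + to_base32(t.mantissa)
           + "@" + std::to_string(t.exponent);
}

// Parse the text form produced by serialize.
double deserialize(const std::string& s) {
    std::size_t at = s.find('@');
    Triple t{ s[0] == '-' ? -1 : 1,
              from_base32(s.substr(1, at - 1)),
              std::stoi(s.substr(at + 1)) };
    return encode(t);
}
```

The sender would put the string returned by serialize into its message
buffer, and the receiver would call deserialize on it. Because no digit
conversion other than shifting and masking is involved, the round trip
reproduces the value exactly.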

Best wishes
   -richy.
-- 
Richard B. Kreckel
<http://www.ginac.de/~kreckel/>