Tips for performance tuning on a specific architecture:

1. Choose the optimal limb size (intDsize). This is fundamental. On 32-bit
   platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be
   better, especially if there is a 64x64-bit multiplication in hardware.
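
To make the effect of the limb size concrete, the following is a minimal
sketch, not code from the library. The helper names limbs_for,
schoolbook_products and print_limb_comparison are hypothetical and exist
only for this illustration. It counts how many limbs an operand of a given
bit length occupies and how many limb-by-limb products a schoolbook
multiplication of two such operands needs.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper (not part of the library): number of limbs needed
// to store `bits` bits when each limb is `limb_bits` wide.
constexpr std::uint64_t limbs_for(std::uint64_t bits, unsigned limb_bits) {
    return (bits + limb_bits - 1) / limb_bits;  // round up
}

// Schoolbook multiplication of two n-limb operands performs n*n
// limb-by-limb products.
constexpr std::uint64_t schoolbook_products(std::uint64_t bits,
                                            unsigned limb_bits) {
    return limbs_for(bits, limb_bits) * limbs_for(bits, limb_bits);
}

// Print the comparison for both candidate limb sizes.
void print_limb_comparison(std::uint64_t bits) {
    const unsigned sizes[] = {32u, 64u};
    for (unsigned limb_bits : sizes) {
        std::printf("intDsize=%u: %llu limbs, %llu limb products\n",
                    limb_bits,
                    static_cast<unsigned long long>(limbs_for(bits, limb_bits)),
                    static_cast<unsigned long long>(
                        schoolbook_products(bits, limb_bits)));
    }
}
```

For 4096-bit operands this gives 128 limbs and 16384 products with 32-bit
limbs, against 64 limbs and 4096 products with 64-bit limbs. The fourfold
reduction only pays off if a single 64x64-bit product is not much slower
than a 32x32-bit one, which is why hardware support matters.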

3. The break-even points between several algorithms for the same task
   have to be determined experimentally.
     cl_DS_mul.cc             karatsuba_threshold
     cl_DS_mul.cc             function cl_fftm_suitable
     cl_DS_div.cc             function cl_recip_suitable
     cl_2DS_recip.cc          recip2adic_threshold
     cl_2DS_div.cc            function cl_recip_suitable
     cl_DS_sqrt.cc            function cl_recipsqrt_suitable
     cl_LF_sqrt.cc            "if (len > ...)"
     cl_I_gcd.cc              cl_gcd_double_threshold
   binary->decimal conversion:
     cl_I_to_digits.cc        cl_digits_div_threshold
     cl_LF_pi.cc              best of 4 algorithms
     cl_F_expx.cc             factor limit_slope of isqrt(d)
     cl_R_exp.cc              inside function exp
     cl_R_ln.cc               inside function ln
     cl_LF_eulerconst.cc      function compute_eulerconst
     cl_F_sinx.cc             factor limit_slope of isqrt(d)
     cl_R_sin.cc              inside function sin
     cl_R_cos.cc              inside function cos
     cl_R_cossin.cc           inside function cl_cos_sin
     cl_F_sinhx.cc            factor limit_slope of isqrt(d)
     cl_R_sinh.cc             inside function sinh
     cl_R_cosh.cc             inside function cosh
     cl_R_coshsinh.cc         inside function cl_cosh_sinh
     cl_F_atanx.cc            factor limit_slope of isqrt(d)
     cl_F_atanx.cc            inside function atanx
     cl_F_atanhx.cc           factor limit_slope of isqrt(d)
     cl_F_atanhx.cc           inside function atanhx
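
As an illustration of how one such break-even point could be measured, here
is a minimal sketch for a Karatsuba-style threshold. It is not the
library's code: Poly, mul_schoolbook, mul_karatsuba, seconds_for and
measure_break_even are hypothetical names. To stay short it multiplies
coefficient vectors modulo 2^64 rather than real limb sequences with carry
propagation, so the value it finds does not transfer to
karatsuba_threshold; only the measuring procedure does.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Poly = std::vector<std::uint64_t>;  // coefficients, arithmetic mod 2^64

// Schoolbook product: O(n^2) coefficient multiplications.
Poly mul_schoolbook(const Poly& a, const Poly& b) {
    if (a.empty() || b.empty()) return {};
    Poly r(a.size() + b.size() - 1, 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            r[i + j] += a[i] * b[j];
    return r;
}

// Karatsuba product for equal-length inputs. Falls back to schoolbook
// for lengths below `threshold`, the tunable under study.
Poly mul_karatsuba(const Poly& a, const Poly& b, std::size_t threshold) {
    const std::size_t n = a.size();
    if (n < threshold || n < 2) return mul_schoolbook(a, b);
    const std::size_t h = n / 2;  // low part has h entries, high part n-h >= h
    Poly a0(a.begin(), a.begin() + h), a1(a.begin() + h, a.end());
    Poly b0(b.begin(), b.begin() + h), b1(b.begin() + h, b.end());
    Poly sa(a1), sb(b1);          // sa = a0 + a1, sb = b0 + b1
    for (std::size_t i = 0; i < h; ++i) { sa[i] += a0[i]; sb[i] += b0[i]; }
    Poly z0 = mul_karatsuba(a0, b0, threshold);
    Poly z2 = mul_karatsuba(a1, b1, threshold);
    Poly z1 = mul_karatsuba(sa, sb, threshold);
    // z1 becomes the middle term a0*b1 + a1*b0.
    for (std::size_t i = 0; i < z0.size(); ++i) z1[i] -= z0[i];
    for (std::size_t i = 0; i < z2.size(); ++i) z1[i] -= z2[i];
    Poly r(2 * n - 1, 0);
    for (std::size_t i = 0; i < z0.size(); ++i) r[i] += z0[i];
    for (std::size_t i = 0; i < z1.size(); ++i) r[i + h] += z1[i];
    for (std::size_t i = 0; i < z2.size(); ++i) r[i + 2 * h] += z2[i];
    return r;
}

// Average wall-clock seconds per call of f over `reps` repetitions.
template <class F>
double seconds_for(F&& f, int reps) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) f();
    const std::chrono::duration<double> dt =
        std::chrono::steady_clock::now() - t0;
    return dt.count() / reps;
}

// Smallest tested length at which a single Karatsuba step (recursing into
// schoolbook) is faster than pure schoolbook, or 0 if none up to max_len.
std::size_t measure_break_even(std::size_t max_len, int reps) {
    volatile std::uint64_t sink = 0;  // keeps the products from being elided
    for (std::size_t n = 2; n <= max_len; n += 2) {
        Poly a(n), b(n);
        for (std::size_t i = 0; i < n; ++i) { a[i] = 3 * i + 1; b[i] = 5 * i + 7; }
        const double ts = seconds_for(
            [&] { sink = sink + mul_schoolbook(a, b)[0]; }, reps);
        const double tk = seconds_for(
            [&] { sink = sink + mul_karatsuba(a, b, n)[0]; }, reps);
        if (tk < ts) return n;
    }
    return 0;
}
```

Calling measure_break_even(256, 200) from a main function returns a
candidate threshold for the machine it runs on. Single timings are noisy,
so a real tuning run would repeat the measurement, use optimized builds,
and confirm the result over a range of lengths around the candidate before
changing any constant listed above.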