I wrote a test which calculates the sine of 10000000 randomly generated numbers between -16 and +16, and the SSE2 based algorithm was about 10x faster than fsin. So I think we should stick with the current implementation.

Btw, someone mentioned that 6.2831853071795862 = 2 * PI, so sin(x) should theoretically be 0. This is a 32/64 bit difference that shows up in one sputnik test, since the test rejects the x87 result.

Regards,
Zoltan
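A timing test of the shape described above could be sketched roughly as follows. This is a minimal harness, not the test that was actually run: the names `time_sin` and `compare_ranges` are hypothetical, the implementation being measured is passed in as a parameter, and Python's `math.sin` only stands in for the SSE2 and x87 versions, which are not reachable from here.

```python
import math
import random
import time


def time_sin(sin_impl, count=1_000_000, low=-16.0, high=16.0, seed=0):
    """Return (seconds, checksum) for calling sin_impl on uniform samples in [low, high]."""
    rng = random.Random(seed)
    # Generate the inputs up front so random-number generation is not timed.
    samples = [rng.uniform(low, high) for _ in range(count)]
    checksum = 0.0
    start = time.perf_counter()
    for x in samples:
        # Accumulate the results so the calls cannot be skipped as dead code.
        checksum += sin_impl(x)
    elapsed = time.perf_counter() - start
    return elapsed, checksum


def compare_ranges(sin_impl, ranges, count=200_000):
    """Time sin_impl separately on each (low, high) input range."""
    return {
        (low, high): time_sin(sin_impl, count=count, low=low, high=high)[0]
        for (low, high) in ranges
    }
```

Calling `time_sin(math.sin, count=10_000_000)` mirrors the sample count and input interval mentioned above, and `compare_ranges(math.sin, [(-1.0, 1.0), (-16.0, 16.0), (-1e6, 1e6)])` gives a per-range breakdown. The fixed seed is there so that two implementations are fed the same inputs.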
Would be interesting to see a performance comparison across a range of values.
Geoff
On Jun 7, 2012, at 11:34 AM, Zoltan Herczeg <zherczeg@inf.u-szeged.hu> wrote:
What kind of optimization did you have in mind?
A 3 line assembly snippet which mimics the old libc:

    fld [src]    ; src must be a double
    fsin
    fst [dst]

Not sure what the best way of adding it to JSC is.
And we should also check which is correct:
32 bit fsin: sin(6.2831853071795862) = -2.4492127076447545e-16
64 bit software based fsin: sin(6.2831853071795862) = -2.4492935982947064e-16
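The difference is quite big: the two results diverge from the sixth significant digit on, a relative difference of about 3.3e-5.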
Regards,
Zoltan
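The question of which result is correct can be checked without trusting either sine implementation. The sketch below assumes only that 6.2831853071795862 is the double closest to 2 * PI and lies slightly below it, so the true sine is the negated gap between that double and the real 2 * PI. The names `gap`, `reference` and `relative_error` are illustrative, and the two result constants are copied from the figures quoted in this thread.

```python
from decimal import Decimal, getcontext
import math

getcontext().prec = 60

# pi to 50 decimal places, far more than a double can hold.
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

x = 6.2831853071795862                     # the double closest to 2 * PI
X87_RESULT = -2.4492127076447545e-16       # 32 bit fsin value quoted in the thread
SOFTWARE_RESULT = -2.4492935982947064e-16  # 64 bit software value quoted in the thread

# Decimal(x) converts the double exactly, so this is the true gap to 2 * PI.
gap = 2 * PI - Decimal(x)

# sin(x) = -sin(2 * PI - x), and for a gap this small sin(gap) equals gap
# to far beyond double precision (the cubic term is around 1e-48).
reference = -float(gap)


def relative_error(value):
    """Relative distance of a computed sine from the high-precision reference."""
    return abs(value - reference) / abs(reference)


print("reference:      ", reference)
print("x87 error:      ", relative_error(X87_RESULT))
print("software error: ", relative_error(SOFTWARE_RESULT))
print("math.sin here:  ", math.sin(x))
```

Under these assumptions the reference agrees with the 64 bit software value, while the 32 bit fsin value is off by a relative error of roughly 3.3e-5. That would be consistent with fsin reducing its argument using a limited-precision approximation of PI, though this sketch does not test that directly.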