sin is slow on 64 bit
Hi guys,

I played with result differences between 32-bit and 64-bit systems today, and one sputnik test revealed something interesting:

sputnik/Conformance/15_Native_Objects/15.8_Math/15.8.2/15.8.2.16_sin/S15.8.2.16_A7.html

Try this snippet on a 32-bit and a 64-bit system:

    #include <math.h>
    #include <stdio.h>

    int main()
    {
        int i;
        double d = 6.283185307179586, k;

        for (i = 0; i < 10000000; ++i)
            k = sin(d);
        return 0;
    }

The 32-bit build is 15x faster. Why? Because it simply calls fsin, while the 64-bit build has its own SSE2-based implementation. I know x87 is obsolete and everything, but it is 15x faster... Shall we optimize this?

Regards,
Zoltan
On Thu 07 Jun 2012 16:05, Zoltan Herczeg <zherczeg@inf.u-szeged.hu> writes:
int main() { int i; double d = 6.283185307179586, k;
for (i = 0; i < 10000000; ++i) k = sin(d); return 0; }
The 32 bit is 15x faster.
At doing nothing? :) With -O2, this loop folds entirely at compile-time.

Andy
--
http://wingolog.org/
What kind of optimization did you have in mind?
A three-line assembly sequence that mimics the old libc:

    fld [src]   (must be a double)
    fsin
    fst [dst]

I am not sure what the best way of adding it to JSC is. And we should also check which result is correct:

32-bit fsin: sin(6.2831853071795862) = -2.4492127076447545e-16
64-bit software-based sin: sin(6.2831853071795862) = -2.4492935982947064e-16

The difference is quite big: the two results agree to only five significant digits.

Regards,
Zoltan
Would be interesting to see a performance comparison across a range of values.

Geoff

On Jun 7, 2012, at 11:34 AM, Zoltan Herczeg <zherczeg@inf.u-szeged.hu> wrote:
What kind of optimization did you have in mind?
A three-line assembly sequence that mimics the old libc: fld [src] (must be a double) fsin fst [dst]
I am not sure what the best way of adding it to JSC is.
And we should also check which result is correct:
32-bit fsin: sin(6.2831853071795862) = -2.4492127076447545e-16
64-bit software-based sin: sin(6.2831853071795862) = -2.4492935982947064e-16
The difference is quite big.
Regards, Zoltan
I made a test which calculates the sin of 10000000 randomly generated numbers between -16 and +16, and the SSE2-based algorithm was about 10x faster. So I think we should stick with the current implementation.

Btw, someone mentioned that 6.2831853071795862 = 2 * PI, so sin(x) should theoretically be 0. This is a 32/64-bit difference issue on one sputnik test, since the test rejects the x87 result.

Regards,
Zoltan
participants (3)
- Andy Wingo
- Geoffrey Garen
- Zoltan Herczeg