HomeUser Control Panel (unavailable in archive)ForumsTutorialsArt GalleryResourcesMaps

Self made trigometric funcs?

07-18-2006, 12:50 PM#1
BertTheJasser
On wc3sear.ch sombody submitted this peace of code, and guessed this being faster then the natives.(I doubt this, but no matter)

I opened this thread to ask if it does actually work for all values for x, and how to get to the algorithm.

I know Pow is dumb and very slow, I just copied what was posted on wc3sear.ch.
Collapse JASS:
//sin(x) = x- (x^3)/3+ (x^5)/5! + (x^7)/7
function TSin takes real x returns real
 return (x - Pow(x,3)/3 + Pow(x,5)/5 + Pow(x,7)/7)
endfunction

//cos(x) = 1- (x^2)/2+ (x^4)/4 - (x^6)/6
function TCos takes real x returns real
 return (1 - Pow(x,2)/2 + Pow(x,4)/4 - Pow(x,6)/6)
endfunction

Thx for reply.

EDIT: I hope I corrected the formula right right now.
07-18-2006, 01:19 PM#2
Vexorian
3 pows + non-native function call sure are slower than native trig functions. Time to test.

Should it take radians or degrees?
07-18-2006, 01:27 PM#3
Mezzer
Go for degrees. And I actually asked you if this method would be faster a while ago Vex, but nevermind. It's also a lot more imprecise, but I've no idea how much that will affect stuff ingame, as the sum of the infinite progession of those is actually equal to sin and cos. That's probably how the native's calculate it, but I don't know how many they take into account
07-18-2006, 01:28 PM#4
Vexorian
Tests show:
Sin : 11.21 seconds
TSin : 36.06 seconds

Collapse 5000 times:
function yyY takes nothing returns nothing
 local real x=-2*bj_PI
 local real L=-x
 local real y
    loop
        exitwhen x>L
        set y=(T)Sin(x)
        set x=x+0.01
    endloop
 set udg_log=udg_log+1
endfunction

Edit: Should I test in degrees instead then, ok I will?
07-18-2006, 01:38 PM#5
Vexorian
Even multiplying Sin's argument with bj_RADTODEG and not doing so on TSin, and using values from -360 to 360, TSin takes 21.28 seconds and Sin 8.53 seconds, would like to know who said that TSin was faster than Sin
07-18-2006, 01:40 PM#6
iNfraNe
sounds like a stupid idea to start with anyway.. pow is sooo slow.
07-18-2006, 01:50 PM#7
Mezzer
Actually, those funcs there have nothing to do with sin and cos now that I look at them, they should be
sin => x - pow(x,3)/6 + pow(x,5)/120 - pow(x,7)/5040
cos => 1 - pow(x,2)/2 + pow(x,4)/24 - pow(x,6)/720

And you could speed them up by swapping pow(x,3) with x*x*x and the like
07-18-2006, 06:00 PM#8
BertTheJasser
Yes there were locla var versions without pow too, but the primaray aspect was to get some one on hand, to explain me WHY it returns the same values as sin/cos and how to get, to expire or to "invent" such a algorithm.

@Mezzer: Have look at the top, edited the func for easier reading.

@Vexorian: I have no clue if it should take degrees or radians.
07-18-2006, 06:21 PM#9
iNfraNe
Pow(x,3/6) is NOT (x*x*x)/6
Pow(x,3)/6 is.

Oh and, no it doesnt work for all values of x, sin/cos is an polynomial (as it seems, also dont know if its the correct term for it, im rather sloppy on my terms :) ) and you would have to continue with x^9/9! etc.
07-18-2006, 06:27 PM#10
PipeDream
Hehe, I remember when those were submitted. Not only are they WAY SLOWER but they're wrong in a few different ways.

The usual algorithm:
1. transform angle to range 0..pi/2 or at least 0..2pi
2. run a few iterations of an angle reduction identity like sin3x = 3sinx - 4sin^3x
3. terminate with a few terms of the taylor series or a polynomial that approximates the trig function close to zero. for sin, just 'x' can be good enough, for cos, just 1.

You could study blizzard's routine as well. It's faked in integer arithmetic but the structure of the algorithm should be apparent.
Code:
.text:6F395D70 game_sin        proc near               ; CODE XREF: sub_6F08B1A0+E2p
.text:6F395D70                                         ; sub_6F1F19E0+17Ep ...
.text:6F395D70
.text:6F395D70 var_4           = dword ptr -4
.text:6F395D70
.text:6F395D70                 push    ebp
.text:6F395D71                 mov     ebp, esp
.text:6F395D73                 push    ecx
.text:6F395D74                 push    ebx
.text:6F395D75                 push    esi
.text:6F395D76                 push    edi
.text:6F395D77                 mov     edi, ecx
.text:6F395D79                 push    offset dword_6F874BD8
.text:6F395D7E                 lea     ecx, [ebp+var_4] ; float *
.text:6F395D81                 call    game_multiply
.text:6F395D86                 mov     eax, [eax]
.text:6F395D88                 lea     ecx, [ebp+var_4]
.text:6F395D8B                 mov     [ebp+var_4], eax
.text:6F395D8E                 call    sub_6F395060
.text:6F395D93                 mov     ecx, eax
.text:6F395D95                 and     ecx, 0FFh
.text:6F395D9B                 mov     esi, ecx
.text:6F395D9D                 shl     esi, 8
.text:6F395DA0                 or      ecx, esi
.text:6F395DA2                 mov     edx, eax
.text:6F395DA4                 sar     eax, 12h
.text:6F395DA7                 mov     esi, ecx
.text:6F395DA9                 and     eax, 3
.text:6F395DAC                 sar     edx, 8
.text:6F395DAF                 shl     esi, 10h
.text:6F395DB2                 mov     ebx, eax
.text:6F395DB4                 and     edx, 3FFh
.text:6F395DBA                 or      ecx, esi
.text:6F395DBC                 test    bl, 1
.text:6F395DBF                 jz      short loc_6F395DE2
.text:6F395DC1                 lea     eax, ds:0[edx*4]
.text:6F395DC8                 mov     edx, offset unk_6F726E68
.text:6F395DCD                 sub     edx, eax
.text:6F395DCF                 mov     esi, [edx]
.text:6F395DD1                 mov     edx, offset unk_6F726E64
.text:6F395DD6                 sub     edx, eax
.text:6F395DD8                 mov     eax, esi
.text:6F395DDA                 sub     eax, [edx]
.text:6F395DDC                 mul     ecx
.text:6F395DDE                 neg     edx
.text:6F395DE0                 jmp     short loc_6F395DF4
.text:6F395DE2 ; ---------------------------------------------------------------------------
.text:6F395DE2
.text:6F395DE2 loc_6F395DE2:                           ; CODE XREF: game_sin+4Fj
.text:6F395DE2                 mov     esi, ds:dword_6F725E68[edx*4]
.text:6F395DE9                 mov     eax, ds:dword_6F725E6C[edx*4]
.text:6F395DF0                 sub     eax, esi
.text:6F395DF2                 mul     ecx
.text:6F395DF4
.text:6F395DF4 loc_6F395DF4:                           ; CODE XREF: game_sin+70j
.text:6F395DF4                 shl     ebx, 1Eh
.text:6F395DF7                 add     edx, esi
.text:6F395DF9                 sar     ebx, 1Fh
.text:6F395DFC                 xor     edx, ebx
.text:6F395DFE                 sub     edx, ebx
.text:6F395E00                 lea     ecx, [ebp+var_4]
.text:6F395E03                 call    I2R
.text:6F395E08                 mov     eax, [eax]
.text:6F395E0A                 mov     ecx, eax
.text:6F395E0C                 and     ecx, 7F800000h
.text:6F395E12                 neg     ecx
.text:6F395E14                 sbb     ecx, ecx
.text:6F395E16                 and     ecx, 0F800000h
.text:6F395E1C                 sub     eax, ecx
.text:6F395E1E                 mov     [edi], eax
.text:6F395E20                 mov     eax, edi
.text:6F395E22                 pop     edi
.text:6F395E23                 pop     esi
.text:6F395E24                 pop     ebx
.text:6F395E25                 mov     esp, ebp
.text:6F395E27                 pop     ebp
.text:6F395E28                 retn
.text:6F395E28 game_sin        endp
.text:6F395E28

.text:6F395060 sub_6F395060    proc near               ; CODE XREF: sub_6F008FB0+143p
.text:6F395060                                         ; sub_6F008FB0+285p ...
.text:6F395060                 push    esi
.text:6F395061                 mov     esi, [ecx]
.text:6F395063                 mov     edx, esi
.text:6F395065                 shr     edx, 17h
.text:6F395068                 and     edx, 0FFh
.text:6F39506E                 cmp     edx, 7Fh
.text:6F395071                 jnb     short loc_6F395077
.text:6F395073                 xor     eax, eax
.text:6F395075                 pop     esi
.text:6F395076                 retn
.text:6F395077 ; ---------------------------------------------------------------------------
.text:6F395077
.text:6F395077 loc_6F395077:                           ; CODE XREF: sub_6F395060+11j
.text:6F395077                 mov     eax, esi
.text:6F395079                 and     eax, 7FFFFFh
.text:6F39507E                 mov     ecx, 96h
.text:6F395083                 or      eax, 800000h
.text:6F395088                 sub     ecx, edx
.text:6F39508A                 js      short loc_6F39509D
.text:6F39508C                 shr     eax, cl
.text:6F39508E                 mov     ecx, esi
.text:6F395090                 sar     ecx, 1Fh
.text:6F395093                 pop     esi
.text:6F395094                 mov     edx, eax
.text:6F395096                 mov     eax, ecx
.text:6F395098                 xor     eax, edx
.text:6F39509A                 sub     eax, ecx
.text:6F39509C                 retn
.text:6F39509D ; ---------------------------------------------------------------------------
.text:6F39509D
.text:6F39509D loc_6F39509D:                           ; CODE XREF: sub_6F395060+2Aj
.text:6F39509D                 neg     ecx
.text:6F39509F                 shl     eax, cl
.text:6F3950A1                 mov     ecx, esi
.text:6F3950A3                 sar     ecx, 1Fh
.text:6F3950A6                 pop     esi
.text:6F3950A7                 mov     edx, eax
.text:6F3950A9                 mov     eax, ecx
.text:6F3950AB                 xor     eax, edx
.text:6F3950AD                 sub     eax, ecx
.text:6F3950AF                 retn
.text:6F3950AF sub_6F395060    endp
07-18-2006, 06:27 PM#11
BertTheJasser
Ok, this seems to get even more usless, without any reachable aim ...

Vex, plx close this thread or just delete it.
Thx.
07-18-2006, 06:42 PM#12
Mezzer
You should change it in the code Bert since what you are doing there is taking x to the 0.5th power and the like and not to the 3rd power and the dividing by 6 as you should be.
And infrane, you don't need to do the whole thing, the first few will suffice to a certain degree of error
07-18-2006, 06:46 PM#13
Vexorian
actually if you want to improve speed you could do what old school game developers have been doing over the time and use a look up table. Wait a second some one is calling, I am gonna edit this post soon with more info

note to self: learn asm
07-18-2006, 06:51 PM#14
iNfraNe
Quote:
Originally Posted by Mezzer
You should change it in the code Bert since what you are doing there is taking x to the 0.5th power and the like and not to the 3rd power and the dividing by 6 as you should be.
And infrane, you don't need to do the whole thing, the first few will suffice to a certain degree of error
I only said that cuz he asked if it worked for all x, with only the first 3 therms it will divert after some time and will look like x^7/7! because that will be so much bigger than say x^5 for large x.
So if you want it for all x you must include the whole polynomial, which you cant so you must use either a radial/degree normalization.

Or just use a table, indeed, which is alot faster.
07-18-2006, 06:54 PM#15
Mezzer
Ok, true enough. Still, if he means to use it, he need the actual values in there