Re: [squirrelfish] Coalescing slow cases to improve JIT memory usage.

26 Jun 2010

Hey Nathan,

Absolutely - this is effectively what we do for calls, where the bulk of the work of the slow case is performed by a shared routine.  We experimented with using this technique more broadly in the early stages of developing the JIT, and back then there was a measurable performance degradation from the call overhead, but the code has changed a lot since then, and the tradeoffs and requirements may be different now (particularly across the varying hardware platforms the JIT has now been ported to).

However, returning to an offset from the return address is probably not a good plan.  Upon executing a call, processors commonly cache the return address of the call instruction in a circular buffer used to predict return destinations.  When the processor reaches the return instruction, it pops a value from this return address stack to predict the destination of the return.  If you change the address you're going to get a mispredict and probably a pipeline flush.  (We modify the return address in our exception handling path, but we don't expect exceptions to be high performance.)

Also bear in mind that there is no conditional call instruction on x86, so to eliminate the slow path altogether you'd have to litter the hot path with inverted branches over the calls out to the trampolines.  I'd suggest you'd be more likely to find success by keeping the hot path branching out to slow cases, and experimenting with moving the bulk of the work of larger slow cases out into shared routines (which would be called from the slow case).
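To make the return-prediction point concrete, here is a toy Python model of a return address stack.  It is only a sketch: the depth of 16, the addresses, the 8-byte offset and the names `ReturnAddressStack` and `run` are all made up for illustration, and real predictors differ in depth and behaviour from one processor to the next.

```python
from collections import deque


class ReturnAddressStack:
    """Toy model of a hardware return predictor (hypothetical depth)."""

    def __init__(self, depth=16):
        # Fixed-size buffer: once full, pushing drops the oldest entry.
        self.entries = deque(maxlen=depth)
        self.mispredicts = 0

    def call(self, return_address):
        # A call records the address of the instruction after the call.
        self.entries.append(return_address)

    def ret(self, actual_target):
        # A return pops the most recent entry and uses it as the prediction.
        predicted = self.entries.pop() if self.entries else None
        if predicted != actual_target:
            self.mispredicts += 1
        return predicted


def run(calls, offset):
    """Make `calls` call/return pairs, returning to return_address + offset."""
    ras = ReturnAddressStack()
    for site in range(calls):
        return_address = 0x1000 + site * 0x40  # made-up call-site addresses
        ras.call(return_address)
        ras.ret(return_address + offset)
    return ras.mispredicts


print("plain returns:   ", run(1000, 0), "mispredicts")
print("adjusted returns:", run(1000, 8), "mispredicts")
```

In this model a plain return always matches the popped prediction, while a return to an adjusted address never does, so every trip through a shared slow case that fixes up its return address would pay for a mispredict.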

This is certainly an interesting area to investigate.

cheers,
G.

On Jun 25, 2010, at 4:20 PM, Nathan Lawrence wrote:
...
The size of our JIT generated code is a known memory issue.  According to Oliver, the slow cases for some of our operations are on the order of 128 bytes.  It occurred to me that we could reduce the size of the JITed code by only compiling the slow case once and having all of the subsequently generated code jump to that shared slow case.  The issue with this is that our slow cases jump back to specific locations in the hot path, with potentially different values on the stack, as opposed to a normal function, which returns to a very specific state.  We can circumvent this issue by hand writing the assembly to return to an offset from the return address with the required state.
What do people think?
-- Nathan
_______________________________________________
squirrelfish-dev mailing list
squirrelfish-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/squirrelfish-dev

Gavin Barraclough