I tried to count the number of cycles taken by my 3D transform methods and I got 814 cycles just for the vertex transformation and almost 400 cycles for the vertex 2D projection, which is actually more than yours! But I counted 70 cycles per MULS instruction and 160 cycles per DIVS, which is pessimistic and probably more than the average case. In fact it is: I benchmarked about 10000 vertices transformed per second, which means close to 800 cycles per vertex instead of the 1200 counted here.
You can see the source here:
https://code.google.com/p/sgdk/sourc...rc/maths3D_a.s
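To make the comparison between the hand count and the benchmark explicit, here is a small sketch of the arithmetic. The clock value is an assumption (NTSC 68000 at roughly 7.67 MHz), and the function names are hypothetical, not part of SGDK.

```c
#include <stdint.h>

/* Assumption: NTSC Genesis 68000 clock, roughly 7.67 MHz. */
#define M68K_CLOCK_NTSC_HZ 7670454UL

/* Hand-counted worst case per vertex: transformation plus 2D projection.
   The post says "almost 400" for the projection, so 400 is an upper bound. */
static uint32_t counted_cycles_per_vertex(void)
{
    return 814U + 400U;
}

/* Average cycles per vertex implied by a measured throughput. */
static uint32_t measured_cycles_per_vertex(uint32_t cpu_hz,
                                           uint32_t vertices_per_second)
{
    return cpu_hz / vertices_per_second;
}
```

With 10000 vertices per second, measured_cycles_per_vertex(M68K_CLOCK_NTSC_HZ, 10000) gives 767, which is the "close to 800" figure, against roughly 1200 for the pessimistic count.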
About the polygon filling, I tried to do the count on a per-line basis. I think I have a minimum of 250/300 cycles per line and a maximum of 650/700 cycles... It's funny to see we have really close numbers here, I guess our code is somewhat similar
Same for the bitmap clear, I used the MOVEM instruction heavily to make it as fast as possible.
Again, you can see the code here:
https://code.google.com/p/sgdk/sourc...nk/src/bmp_a.s
I don't think there is any way to use DMA for the transfer, and it seems you cannot use it either, at least not without heavy changes to the bitmap rendering code.
I'm also using the extended blank area to transfer to VRAM at full speed (at least, full 68k speed). That's why the Bitmap engine uses a 256x160 resolution. I actually split the transfer over 3 frames on NTSC and 2 frames on PAL systems, which also means you are limited to a maximum of 20 FPS on NTSC and 25 FPS on PAL.
See the bitmap engine description in the bmp.h header file for more details:
https://code.google.com/p/sgdk/sourc.../include/bmp.h
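The frame rate limits above follow directly from spreading one bitmap upload over several display frames. A minimal sketch of that calculation, with hypothetical helper names and nominal 60/50 Hz refresh rates; the 4 bits per pixel used for the buffer size is my assumption here, not something stated in this post:

```c
#include <stdint.h>

/* Nominal display refresh rates in Hz. */
#define NTSC_REFRESH_HZ 60U
#define PAL_REFRESH_HZ  50U

/* Maximum bitmap frame rate when one full upload to VRAM
   is split over several display frames. */
static uint32_t max_bitmap_fps(uint32_t refresh_hz, uint32_t frames_per_upload)
{
    return refresh_hz / frames_per_upload;
}

/* Bitmap buffer size in bytes. bits_per_pixel is an assumption
   (4 would match 16-colour tile data). */
static uint32_t bitmap_bytes(uint32_t width, uint32_t height,
                             uint32_t bits_per_pixel)
{
    return (width * height * bits_per_pixel) / 8U;
}
```

max_bitmap_fps(NTSC_REFRESH_HZ, 3) gives 20 and max_bitmap_fps(PAL_REFRESH_HZ, 2) gives 25. Under the 4 bits per pixel assumption, bitmap_bytes(256, 160, 4) is 20480 bytes per full upload.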
Funnily enough, it seems we have done very similar things after all, I'm not the only crazy guy attempting 3D on the Sega Genesis
I'm really looking forward to seeing your own Star Fox / 3D demo, by the way