Oh yeah, they can be of some use for reducing code size (though the instruction still eats 1 extra byte) or just for easy indexing with an offset, but when you want fast code you simply avoid using them...
In the 4PCM driver I need a fast buffer read followed by a write, and in that case the POP/PUSH instructions are very efficient :)
Quote:
Not sure why you're so down on the block processing instructions though. I can see why you didn't use them in your 4-PCM driver: you didn't need the counter so wasting BC for that purpose is undesirable, but it's not really faster. The fastest copy using pop looks something like:
Code:
pop de ;10
ld (hl), e ;7, 17
inc l ;4, 21
ld (hl), d ;7, 28
inc l ;4, 32
That's 32 cycles to copy two bytes, which is 16 cycles per byte, assuming you don't need to check for some kind of end condition and you don't need to be able to cross a 256 byte boundary. ldi also takes 16 cycles to copy a byte, but also decrements a counter for you and doesn't have the 256 byte boundary limitation. ldir adds another 5 cycles, but given that you're getting looping, 5 cycles seems pretty cheap. It's also nice for code density since 2 bytes gives you what is basically a memcpy instruction.
In the XGM driver I do use the LDI instruction, as I need to preserve SP to keep track of the V-interrupt, but destroying the BC register is really a problem for me... and 16 cycles per byte is still a lot!
The LDIR instruction is even slower: 21 cycles per byte copied, because of the lazy internal implementation... Of course it does everything in a single instruction, but that is a really slow way to copy a block of memory :-/ With an optimized implementation it could have been 8 cycles per byte plus init time (taking bus access limitations into account).
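As a quick sanity check of the per-byte figures discussed so far, here is a small Python tally. It is only arithmetic on the cycle counts quoted in this thread, not a benchmark; the `CYCLES` table and the `per_byte` helper are names made up for this illustration.

```python
# Z80 T-state costs as quoted in this thread.
CYCLES = {
    "pop de": 10,
    "ld (hl),e": 7,
    "ld (hl),d": 7,
    "inc l": 4,
    "ldi": 16,
    "ldir": 21,  # cost per byte while the instruction repeats; the last byte costs 16
}

def per_byte(instructions, bytes_copied):
    """Total cycles of an instruction sequence divided by the bytes it moves."""
    return sum(CYCLES[name] for name in instructions) / bytes_copied

# pop-based copy from the quoted post: one pop, two stores, two 8-bit increments.
pop_copy = per_byte(["pop de", "ld (hl),e", "inc l", "ld (hl),d", "inc l"], 2)
ldi_copy = per_byte(["ldi"], 1)
ldir_copy = per_byte(["ldir"], 1)

print(pop_copy, ldi_copy, ldir_copy)  # 16.0 16.0 21.0
```

So the pop-based copy and LDI tie at 16 cycles per byte, and LDIR pays 5 more for the built-in loop.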
Also, with the SP register you can do this:
Code:
exx ; 4 (save registers)
ld sp,src ; 10 (pop reads upward from SP)
pop af ;10
pop bc ;10
pop de ;10
pop hl ;10
ld sp,dest+8 ; 10 (push writes downward from SP)
push hl ;11
push de ;11
push bc ;11
push af ;11
; repeat as needed to copy whole buffer
....
exx ; 4 (restore registers)
104 cycles to copy 8 bytes (not counting the two exx), so 13 cycles per byte, already a nice improvement over the default 16 cycles (and you can lower it to 12.5 cycles per byte if you can erase all your registers :p).
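The 13 and 12.5 cycles-per-byte figures for the stack-based copy can be reproduced with a short Python tally. This is a sketch under assumptions: `stack_copy_cost` is a hypothetical helper, and the 12.5 case assumes that "erasing all your registers" means also using the shadow BC', DE' and HL' pairs, reached with one exx during the pops and one during the pushes.

```python
# Z80 T-state costs used in the stack copy above.
POP, PUSH, LD_SP, EXX = 10, 11, 10, 4

def stack_copy_cost(register_pairs, exx_swaps=0):
    """Cycles per byte for one chunk: two SP loads, one pop and one push
    per register pair, plus any exx needed to reach the shadow registers."""
    cycles = 2 * LD_SP + register_pairs * (POP + PUSH) + exx_swaps * EXX
    return cycles / (2 * register_pairs)  # each pair carries 2 bytes

# AF, BC, DE, HL: the 8-byte chunk shown in the post.
four_pairs = stack_copy_cost(4)

# Assumed variant: AF, BC, DE, HL plus BC', DE', HL' (14 bytes per chunk).
seven_pairs = stack_copy_cost(7, exx_swaps=2)

print(four_pairs, seven_pairs)  # 13.0 12.5
```

The gain comes from spreading the two fixed `ld sp` loads over more bytes per chunk.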
But you know, I was not the one who started comparing these CPUs ;) IMO they just don't play in the same class... but I have so often seen claims that the 6502 is fast and could be faster than a 68000 at the same clock rate. That just does not make any sense; the 68000 is far more advanced. Even the 65816 is far behind the 68000 in terms of efficiency. As you said, the fast implementation in the PC-Engine makes it somewhat "interesting" (and the HuC6280 adds many new instructions which make the CPU definitely better), but that was a very atypical design, and neither Nintendo nor Sega could afford such an expensive memory solution.
Quote:
As for the actual topic, I find the comparison between the 65XX and the 68000 pretty amusing. As has been stated, at typical clock speeds for each it's not much of a contest. The PC-Engine with its ~7MHz 65C02 makes things a little interesting, but it's hard to really make a fair comparison between a 16/32-bit CPU and an 8-bit one. The Z80/6502 comparison is more interesting, but at least in the home computer arena the Z80 was typically clocked 3-4X as fast as the 6502 in contemporary systems (presumably due to the RAM constraints Stef mentioned). My fastest Z80 implementation of the little 6502 code sample is 225 cycles and a perhaps more realistic one is 264. On a clock for clock basis, that's terrible, but if we're comparing a 3.5MHz Z80 system to a 1MHz 6502 it's more like 64-75 cycles (assuming we're normalizing to the 6502 clock).
Of course, if we're limiting ourselves to consoles the clock ratios are not so bad (only 2X comparing the NES and the Master System) and the 6502 pulls ahead again.
Anyone have any actual price data from the relevant time periods?
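The clock normalization in the quoted post is simple to reproduce. The Python sketch below only rescales the 225 and 264 cycle counts given there; `normalized` is a hypothetical helper, and the 2:1 case just uses the console clock ratio mentioned in the quote.

```python
def normalized(z80_cycles, z80_mhz, m6502_mhz):
    """Express a Z80 cycle count in 6502 clock periods of the same duration."""
    return z80_cycles * m6502_mhz / z80_mhz

# Home computer case from the quote: 3.5 MHz Z80 versus 1 MHz 6502.
fast_home = normalized(225, 3.5, 1.0)
realistic_home = normalized(264, 3.5, 1.0)

# Console case from the quote: the Z80 runs at only twice the 6502 clock.
fast_console = normalized(225, 2.0, 1.0)
realistic_console = normalized(264, 2.0, 1.0)

print(round(fast_home, 1), round(realistic_home, 1))  # 64.3 75.4
print(fast_console, realistic_console)                # 112.5 132.0
```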
I do agree the Z80 versus 6502 comparison is more interesting. Honestly, I don't like the Z80 much and I always have a bad time coding sound drivers for this CPU :p
I think it's not very efficient (still better than the 6502), and the custom GB Z80 seems to fix some of the Z80's flaws and remove useless parts, except they should have kept the second set of registers (very useful for interrupts or fast saves in general), but all the other changes are definitely very positive =)

