I should probably post this over at spritesmind, but I thought it would be fun to explore this topic here.
So here are the guidelines: we want to make a 'driver' that will frequency scale 4 PCM channels, independently of course, apply soft volume (independent for each channel), and finally mix them all into a single stream (8bit output). Aka.. a driver for a MOD type player.
The approach is to have the 68k handle all the prep work and the z80 strictly handle the PCM playback of the final mixed output stream. The goal is to get a 4 channel MOD type driver running on the system, at a decent playback rate, while still having enough cpu resource to do something else (besides showing a simple title screen or static image).
For the playback rate, 22khz sounds doable. I remember TmEE saying when going above that rate, sample writes start to get missed (unless you consistently poll the status flag). IIRC, his method at 22khz was to write the same sample value 2 or 3 times to the DAC (safety measure).
22khz output is 367 samples per frame (at 1/60). The z80 is 3.57mhz. In a single 1/60 frame, the z80 has ~59,659 T-cycles, so divided by 367 is ~162 T-cycles per sample write for timed code. Hmm.. it's been a while, I think T-cycles is the right notation. Depending on the overhead on the 68k side, the playback rate might need to be dropped down a little bit (say 16khz).
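To make the budget easy to recompute for other rates, here's a tiny C sketch of the arithmetic above. The clock constant (3,579,545 Hz, the NTSC figure) and the function names are my own assumptions for illustration; integer division rounds down, so treat the results as approximate.

```c
#include <stdint.h>

/* Assumed figures: NTSC Z80 clock and a 60 Hz frame rate. */
#define Z80_CLOCK_HZ   3579545u
#define FRAMES_PER_SEC 60u

/* Whole samples the Z80 has to output during one video frame. */
uint32_t samples_per_frame(uint32_t rate_hz)
{
    return rate_hz / FRAMES_PER_SEC;
}

/* Approximate Z80 T-cycles available between two DAC writes. */
uint32_t cycles_per_sample(uint32_t rate_hz)
{
    return Z80_CLOCK_HZ / rate_hz;
}
```

Plugging in 22050 gives 367 samples per frame and about 162 T-cycles per sample; dropping to 16000 gives about 223 T-cycles per sample.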
So frequency scaling is easy, technically. It's just like scaling graphics, nearest neighbor, except in this case you only have a single bitmap line to scale instead of a whole image (relative to the analogy). Well, a bitmap line for each channel. Enough analogies. You can scale a frequency by skipping samples or repeating samples. Using a fixed point counter (accumulator), you step through the sample stream. The fixed point accumulator could look something like 8bit:16bit. That's a 24bit value with the upper 8bit being the whole number, which applies directly to the index (pointer) of the PCM stream, and the lower 16bit being the fractional part. When the fractional part accumulates and "overflows", it contributes a carry of 1 to the 8bit index/pointer part. Of course, the 68k doesn't benefit from 8bit values and it's often faster to pad out to 16bit instead. So maybe something like 32bit as 16bit:16bit. There are many ways to optimize this, but I want to show a solid idea first.
To do: // I'll post some pseudo code to show how it works
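As a placeholder, here's a rough C sketch of the 16bit:16bit accumulator idea (a real driver would be 68k assembly, this is only to show the mechanics). The `Channel` struct, `pitch_step`, and `channel_resample` are hypothetical names. The sketch assumes signed 8bit source data, no looping, and that `length` and `step` are small enough that `pos` never wraps past 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-channel state; field names are illustrative only. */
typedef struct {
    const int8_t *data;  /* source PCM, assumed signed 8-bit           */
    uint32_t length;     /* number of source samples (assumed <= 32768) */
    uint32_t pos;        /* 16.16 position: high word = sample index    */
    uint32_t step;       /* 16.16 increment: 0x10000 = original pitch   */
} Channel;

/* 16.16 step needed to play src_rate material at out_rate output. */
uint32_t pitch_step(uint32_t src_rate, uint32_t out_rate)
{
    return (uint32_t)(((uint64_t)src_rate << 16) / out_rate);
}

/* Fill out[0..count-1] by nearest-neighbour stepping through the source.
 * Writes 0 (silence) once the source is exhausted.
 * Returns how many real source samples were written. */
size_t channel_resample(Channel *ch, int8_t *out, size_t count)
{
    size_t produced = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t index = ch->pos >> 16;   /* whole part is the index */
        if (index >= ch->length) {
            out[i] = 0;
            continue;
        }
        out[i] = ch->data[index];
        ch->pos += ch->step;              /* fraction carries into the index */
        produced++;
    }
    return produced;
}
```

A step of 0x8000 repeats every source sample twice (one octave down), and 0x20000 skips every other sample (one octave up).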
Software volume is pretty easy. There are a few ways to handle it. The straightforward approach is to simply think of the max volume of a sample as 1*sample. So software volume is basically scaling the amplitude or height of the sample. But in this case, we're only reducing it - we don't need to scale it up. Multiplication is pretty slow on the 68k, so that's out the window. Another method is binary shifting. If you shift a value to the right by one bit, you divide the value by 2. If you shift by more than one bit, it's the same as dividing by 2 that many times over (so dividing by 2, 4, 8, 16, 32, etc). It works and is decently fast, but the volume steps are fairly coarse. Typically the route taken is a LUT of pre-calculated multiplications of the two values (sample and volume), where the sample itself is just the index into the table for the chosen volume. It's decently fast, and it gives a much finer gradient to the volume selection/range.
To do: // post some example code
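In the meantime, a minimal C sketch of the LUT approach. The 16 volume levels, the table layout (`volume_table[vol][sample]`), and the function names are assumptions I picked for illustration; samples are treated as signed 8bit. At 16 levels the table costs 16*256 = 4KB.

```c
#include <stdint.h>

#define VOLUME_LEVELS 16  /* assumption: 0 = silent, 15 = full volume */

/* One 256-entry row per volume level, indexed by the raw sample byte. */
static int8_t volume_table[VOLUME_LEVELS][256];

/* Pre-calculate sample * vol / max_vol for every byte value. */
void volume_table_init(void)
{
    for (int vol = 0; vol < VOLUME_LEVELS; vol++) {
        for (int s = 0; s < 256; s++) {
            int sample = (s < 128) ? s : s - 256;  /* byte as signed value */
            volume_table[vol][s] =
                (int8_t)(sample * vol / (VOLUME_LEVELS - 1));
        }
    }
}

/* At playback time the "multiply" is a single table lookup. */
int8_t apply_volume(int8_t sample, unsigned vol)
{
    return volume_table[vol][(uint8_t)sample];
}
```

The multiply only happens once at init time; per sample it's just an indexed read, which is the whole point on a CPU with slow multiplication.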
And lastly, mixing multiple channels into a single stream. This is by far the easiest part. If you've ever taken trig, or calc, or even just college algebra, you'd probably remember that given two functions, f(x) and g(x), f(x) + g(x) is just that: the sum of the two Y values is the final output. You simply add all the channels together. That's it... sort of. The output of the target DAC is 8bit, so you don't want a value to "overflow" and wrap around in that 8bit value. One method is to simply make sure the sum of all the channel outputs can't exceed the final target range. That is to say, limit the range of Y. If we restrict each channel to 6bit range (0 to 63), then the sum of all four channels won't exceed 8bit (0 to 255; the worst case is 4*63 = 252). 6bit+6bit has a max value of 7bit, 7bit+7bit has a max value of 8bit. This is the safe method. There is another method that gives back some resolution to each channel, but in return allows clipping (distortion, not wrap around) whenever too many channels are too "loud" at the same point in time. It's not an automatic thing either; you have to build a saturation mechanism to catch the overflow (in either direction) and convert it to a ceiling or floor output (max or min value). How often do you do this? It depends directly on the number of channels being accumulated as well as their height in relation to the final output range. If each channel is 7bit instead of 6bit, then you could accumulate the first two channels without checking, since together they have a potential output of 8bit range, but every channel accumulated after them would need the check. So two check operations out of four channels, for that example. If you flip the problem on its head, you'll see that you could actually do eight 6bit PCM channels with only 4 checks, on the last four accumulations. But that's not the goal here.
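Both mixing methods can be sketched in a few lines of C. The function names are hypothetical, and a wider `int` stands in for what a 68k routine would do by testing for overflow after a byte add. The safe version assumes unsigned 6bit inputs (0 to 63); the clipping version assumes signed 7bit inputs (-64 to 63).

```c
#include <stdint.h>

/* Safe method: four unsigned 6-bit channels (0..63) sum to at most 252,
 * so the result always fits in 8 bits with no checks at all. */
uint8_t mix4_safe(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)(a + b + c + d);
}

/* Clamp a wider intermediate value to the signed 8-bit range. */
static int8_t saturate8(int v)
{
    if (v > 127)  return 127;   /* ceiling */
    if (v < -128) return -128;  /* floor   */
    return (int8_t)v;
}

/* Clipping method: four signed 7-bit channels (-64..63).
 * The first add cannot leave 8-bit range; the next two can. */
int8_t mix4_saturated(int8_t a, int8_t b, int8_t c, int8_t d)
{
    int acc = a + b;            /* -128..126, no check needed */
    acc = saturate8(acc + c);   /* check 1 */
    acc = saturate8(acc + d);   /* check 2 */
    return (int8_t)acc;
}
```

Note that clamping after each add (as here, mirroring an 8bit accumulator) can give a slightly different result from summing everything at full width and clamping once at the end; the latter is more accurate but needs a wider accumulator.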
A note about channel PCM format. Technically it can be 2's complement or 1's complement format, as well as unsigned - and then converted to the native DAC format after the final mix. Although I really don't recommend the last one. It results in your "center line" constantly moving, as if there's an additional random waveform being accumulated into the mix. And there's no reason to use it here, since the hardware DAC channel input is signed IIRC. I personally like 2's complement because it's the fastest for multiple accumulations.
//.................
So anyway, with the primer out of the way and the example code to be added later, what I wanted this thread to be - is an open forum of how to get the best/fastest method of a MOD type driver. Contribute, critique, criticize, corrections, etc.
Note: I'll come back and edit any grammar or spelling mistakes later on.

