So it's not really that the hardware is hard to work with. From what I remember, the Genesis sound system is actually pretty simple and straightforward to use. Yuzo Koshiro has even said in interviews that he found it much easier to work with than the SNES, and he's mentioned the struggles they had with Actraiser due to sample and cart size limitations.
Where the issue comes in is that FM synth takes a bit of effort to get good-sounding waveforms. A lot of developers didn't want to put that effort in. Capcom really had no excuse with Street Fighter II. The YM2612 is very similar to the YM2151 they used in the arcade, so they could have done very direct conversions while just losing two channels (the YM2151 has 8 FM channels, the YM2612 has 6). Those two channels could have been reworked to use PSG instead. The tricky part would be the PCM samples, which as shown isn't impossible. After all, they did pull off a driver that can play 2 PCM samples at once; they just failed to deal with DMA properly.
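To make the "lose two channels, move them to PSG" idea concrete, here's a rough Python sketch. The channel counts are the real hardware numbers, but the priority scheme and the `map_channels` name are just my own assumptions for illustration, not anything Capcom's driver actually does:

```python
# Hypothetical sketch: spread 8 arcade FM channels over 6 Genesis FM channels
# plus PSG. Only the channel counts are hardware facts; the rest is assumed.

YM2151_CHANNELS = 8  # FM channels on the arcade chip
YM2612_CHANNELS = 6  # FM channels on the Genesis chip


def map_channels(priorities):
    """Map each arcade channel to a Genesis voice.

    priorities: list of 8 ints, higher = more important to keep on FM.
    Returns a dict: arcade channel index -> (chip name, channel index).
    """
    if len(priorities) != YM2151_CHANNELS:
        raise ValueError("expected one priority per YM2151 channel")

    # Most important channels first; sorted() is stable, so ties keep
    # their original channel order.
    order = sorted(range(YM2151_CHANNELS), key=lambda ch: -priorities[ch])

    mapping = {}
    for slot, ch in enumerate(order):
        if slot < YM2612_CHANNELS:
            mapping[ch] = ("FM", slot)
        else:
            # The two leftover channels land on PSG tone channels 0 and 1.
            mapping[ch] = ("PSG", slot - YM2612_CHANNELS)
    return mapping
```

The two lowest-priority parts (say, a simple harmony line) are the ones that would get rewritten for PSG square waves, since those lose the least when they leave FM.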
So I wouldn't say it was hard to work with. Capcom was just lazy and didn't really go about it in a smart way. People have done direct conversions of the arcade instruments to the Genesis, and they sound pretty good. This one just makes no attempt to rework the missing 2 FM channels into PSG or something:
https://www.youtube.com/watch?v=hLyVmgXEiac
The point is, Street Fighter II could have sounded a lot better with minimal effort.
Street Fighter II SCE has more complete voice samples than the SNES version. That alone would account for a bigger ROM. It also has the full arcade intro, which would require more ROM space as well.
So let's take a look at that. Here's Ryu's "Tatsumaki Senpukyaku!" sample on the Genesis:
https://i.imgur.com/EibfN7r.png
Here it is on SNES:
https://i.imgur.com/4n8vJGk.png
Both of these were recorded from emulators for simplicity's sake. The Genesis clip was recorded with Fusion, the SNES clip with bsnes.
Now take a close look at the two. Notice how the SNES one goes completely dead in between key syllables while the Genesis one doesn't. This is a trick Capcom did to save space on the samples. The samples are broken up into "key" pieces, then put back together with slight pauses to keep some sense of tempo, while the "unnecessary" parts of the sample are removed to reduce space. If you listen to the two of them, you'll notice the SNES clip sounds very choppy compared to the Genesis sample. You'll also notice it's sped up a little on the SNES. You can listen to both here:
http://www.mediafire.com/file/q144bs...Comparison.zip
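If it helps to picture the space-saving trick, here's a small Python sketch of the general idea: cut the sample into loud "key" pieces, throw away what's between them, and rebuild it with dead silence in the gaps. This is not Capcom's actual tool. The function names, the threshold, and the rule for what counts as "unnecessary" (here, simply the quiet parts) are all my own assumptions:

```python
def split_key_pieces(samples, threshold=4, min_gap=3):
    """Split signed PCM samples into loud pieces separated by quiet gaps.

    Returns a list of (gap_before, piece) tuples, where gap_before is the
    number of quiet samples that were dropped ahead of that piece.
    """
    pieces = []
    current = []     # loud piece currently being collected
    quiet_run = []   # quiet samples seen since the last loud sample
    gap_before = 0

    for s in samples:
        if abs(s) <= threshold:
            quiet_run.append(s)
            continue
        if current and len(quiet_run) >= min_gap:
            # Gap is long enough: close the piece and drop the quiet part.
            pieces.append((gap_before, current))
            current = []
            gap_before = len(quiet_run)
        elif current:
            # Short dip inside a syllable: keep it as part of the piece.
            current.extend(quiet_run)
        else:
            # Leading silence before the first piece.
            gap_before = len(quiet_run)
        quiet_run = []
        current.append(s)

    if current:
        pieces.append((gap_before, current))
    return pieces


def rebuild(pieces, pause_scale=0.5):
    """Reassemble pieces with shortened, completely silent pauses."""
    out = []
    for gap_before, piece in pieces:
        out.extend([0] * int(gap_before * pause_scale))
        out.extend(piece)
    return out
```

Only the pieces and the gap lengths need to be stored, which is where the space saving would come from. The rebuilt clip has flat zeros between pieces, which matches the "goes completely dead" look in the SNES waveform.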
I don't think the official Genesis version has the issue with instruments dropping out. I think that's only in the beta version, where the driver can only do 1 PCM channel. The official release can do 2 and doesn't have that issue, last I checked. As for the music tempo, some parts are off, but so is the SNES version when compared to the CPS1 versions. It seems like with SCE Capcom tried to make the game sound like the SNES version instead of the arcade original, for some bizarre reason.
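For anyone wondering what "2 PCM channels" means in practice, it generally comes down to the driver mixing two samples in software before the result goes out as one stream. A minimal Python sketch of that mixing step, assuming unsigned 8-bit samples centered on 128 (the `mix_two_pcm` name and the clamping behavior are my assumptions, not taken from Capcom's driver):

```python
def mix_two_pcm(a, b):
    """Mix two unsigned 8-bit sample streams into one unsigned 8-bit stream."""
    length = max(len(a), len(b))
    out = []
    for i in range(length):
        # A stream that has already ended counts as silence (128 = midpoint).
        sa = a[i] if i < len(a) else 128
        sb = b[i] if i < len(b) else 128
        # Convert to signed, add, convert back to unsigned.
        mixed = (sa - 128) + (sb - 128) + 128
        # Clamp so loud overlaps clip instead of wrapping around.
        out.append(max(0, min(255, mixed)))
    return out
```

On real hardware this has to happen for every output sample on a tight schedule, which is why anything that stalls the driver shows up as audible problems.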

