LPC (think Speak & Spell, PCjr speech adapter, etc.) and its variants are really the only way to achieve this. A 1.44MB floppy has 1,457,664 bytes available after formatting. LPC at 2400 bits per second means you can store about 80 minutes of speech on a 1.44MB floppy.
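For anyone who wants to check the arithmetic, here is a quick sketch in Python. The two constants are the figures quoted above; the function name `speech_minutes` is just something made up for illustration, and the calculation assumes a constant bitrate with no container or framing overhead.

```python
# Figures quoted in the comment above; nothing here is measured.
FLOPPY_BYTES = 1_457_664      # usable bytes on a formatted 1.44MB floppy
LPC_BITS_PER_SECOND = 2400    # LPC bitrate quoted above


def speech_minutes(capacity_bytes: int, bits_per_second: int) -> float:
    """Return how many minutes of audio fit at a constant bitrate."""
    total_bits = capacity_bytes * 8
    seconds = total_bits / bits_per_second
    return seconds / 60


minutes = speech_minutes(FLOPPY_BYTES, LPC_BITS_PER_SECOND)
print(f"{minutes:.1f} minutes of speech")
```

That works out to just under 81 minutes, so "80 minutes" is a fair round number.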
A big flaw in the OP's testing is that he converted the audio to 8-bit PCM before feeding it to the encoders. That limited the audio to roughly 48dB of signal-to-noise ratio, adding quantization noise and distortion for absolutely no reason. His sibilants ("s" sounds) might have sounded better had he not done that.
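To put a rough number on that, here is a small Python sketch that rounds a full-scale sine wave to 8-bit and 16-bit PCM and measures the resulting signal-to-noise ratio. It is only an illustration under assumptions of my own: a 997 Hz test tone at 48 kHz, no dither, and a 16-bit source for comparison (the OP's actual source format isn't stated). The function name `quantization_snr_db` is made up for this example.

```python
import math


def quantization_snr_db(bits: int, n: int = 48_000,
                        freq: float = 997.0, rate: float = 48_000.0) -> float:
    """SNR in dB of a full-scale sine after rounding to signed PCM of `bits` bits."""
    scale = 2 ** (bits - 1) - 1          # largest positive sample value
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = math.sin(2 * math.pi * freq * i / rate)
        q = round(x * scale) / scale     # quantize, then map back to [-1, 1]
        signal_power += x * x
        noise_power += (x - q) ** 2      # energy of the rounding error
    return 10 * math.log10(signal_power / noise_power)


snr_8 = quantization_snr_db(8)
snr_16 = quantization_snr_db(16)
print(f"8-bit SNR:  {snr_8:.1f} dB")
print(f"16-bit SNR: {snr_16:.1f} dB")
print(f"Difference: {snr_16 - snr_8:.1f} dB")
```

Each bit is worth about 6 dB, so dropping 8 bits costs about 48 dB of noise floor, and an encoder fed the 8-bit version has that much more noise to deal with.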