Re: [linrad] Speech processing (Linrad-01.25)
Over the years I have done quite a bit of experimentation with
audio (and RF) processing for the SSB transmitter. It is my opinion
that audio equalization is perhaps more important than level processing.
My conclusions are as follows:
1) Start with equalization that reduces low frequencies and peaks
audio in the 1 to 2 kHz range. This will vary depending on voice and mic.
2) The second stage should be a fast-attack, slow-decay audio
compressor that will tend to keep the amount of processing constant.
It should have a dynamic range of perhaps 10 dB. Noise gating (-20 dB) can be used to
limit background noise.
3) After the signal is converted to RF, a proper filter should limit the
bandwidth that enters the RF clipper.
4) The RF clipper at this stage should be variable from about 6 dB to
20 dB of clipping.
5) A matching RF filter cleans up distortion products. The filter can
have a slope to reduce some unneeded low-frequency
energy. In other words, it might be flat from 1 kHz to 3 kHz, but
500 Hz may be 3 to 6 dB down and 250 Hz could be 6 to 12 dB down.
This concentrates RF power in the frequency range where most of the
intelligibility occurs. The filter could peak around 1.5 to 2 kHz and
allow some additional rolloff at 3 kHz, when 3 kHz might otherwise be considered
the upper end of the filter (normal -3 dB point). This should restore a more natural
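As a rough illustration of stage 2 above, here is a minimal Python sketch of a fast-attack, slow-decay compressor with a noise gate. Only the 10 dB range and the -20 dB gate come from the list above. The function names, the time constants, the sample rate and the choice to hold the gain at unity below the gate are all assumptions made for the example.

```python
import math

def db_to_lin(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def compress(samples, sample_rate=8000, attack_ms=1.0, decay_ms=300.0,
             max_gain_db=10.0, gate_db=-20.0, target=1.0):
    """Fast-attack, slow-decay compressor with a simple noise gate.

    attack_ms, decay_ms and sample_rate are assumed values, not
    figures from the text above.
    """
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    decay = math.exp(-1.0 / (sample_rate * decay_ms / 1000.0))
    max_gain = db_to_lin(max_gain_db)      # at most 10 dB of boost
    gate = target * db_to_lin(gate_db)     # gate 20 dB below the target level
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # The envelope rises quickly on peaks and falls slowly afterwards.
        coeff = attack if level > env else decay
        env = coeff * env + (1.0 - coeff) * level
        if env < gate:
            gain = 1.0                     # assumed: no boost for background noise
        else:
            gain = min(max_gain, target / env)
        out.append(x * gain)
    return out
```

The output still overshoots the target on sudden peaks, so the filter and clipper stages that follow are still needed.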
A full implementation of this might be difficult and expensive when
done with hardware, but could be relatively cheap and easy when
done in software. Some degree of variability may be desirable.
Most people would not have the ability to intelligently select various
parameters, so it might be a good idea to have only 2 or 3 presets.
When less processing is used, the bandwidth can generally be
wider, and when more processing is used, the bandwidth can generally be
narrower. When 3 settings are used, they could be labeled as follows:
1) Ragchew ( little clipping and wide band ).
2) Contest ( moderate clipping and bandwidth ).
3) DX ( maximum clipping and minimum bandwidth for difficult conditions ).
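In software, the three presets could be a small table. The sketch below is only an illustration. The 6 and 20 dB clipping figures are the ends of the clipper range given earlier. The 12 dB middle value, all of the filter edges and the names Preset, PRESETS and clip_threshold are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Preset:
    clip_db: float   # amount of RF clipping in dB
    low_hz: float    # assumed lower filter edge
    high_hz: float   # assumed upper filter edge

    @property
    def bandwidth_hz(self):
        return self.high_hz - self.low_hz

# Filter edges and the 12 dB figure are placeholders, not measured values.
PRESETS = {
    "Ragchew": Preset(clip_db=6.0, low_hz=200.0, high_hz=3000.0),
    "Contest": Preset(clip_db=12.0, low_hz=300.0, high_hz=2700.0),
    "DX": Preset(clip_db=20.0, low_hz=400.0, high_hz=2400.0),
}

def clip_threshold(peak, clip_db):
    """Clipping level that removes clip_db of a signal peaking at 'peak'."""
    return peak / 10.0 ** (clip_db / 20.0)

for name, p in PRESETS.items():
    print(name, p.clip_db, "dB,", p.bandwidth_hz, "Hz,",
          "threshold", round(clip_threshold(1.0, p.clip_db), 3))
```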
As well, 2 or 3 frequency response presets for stage one could help
tailor the rig to someone's voice or choice of microphone. For instance,
3 switches could be selected on or off as follows:
1) Bass cut on/off (-6 dB).
2) Midrange boost on/off (+6 dB).
3) Treble boost on/off (+6 dB).
Hidden pots can be made available for the advanced user with
default 6 dB settings for the beginner.
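A small sketch of how the three switches might map to per-band gain factors in software. The function name eq_gains and the band labels are made up for the example. The step_db parameter plays the role of the hidden pot, with the 6 dB default.

```python
def eq_gains(bass_cut=False, mid_boost=False, treble_boost=False, step_db=6.0):
    """Return linear gain factors per band for the three on/off switches."""
    gains_db = {
        "bass": -step_db if bass_cut else 0.0,
        "mid": step_db if mid_boost else 0.0,
        "treble": step_db if treble_boost else 0.0,
    }
    # Convert dB to amplitude ratio: 6 dB is roughly a factor of 2.
    return {band: 10.0 ** (db / 20.0) for band, db in gains_db.items()}

print(eq_gains(bass_cut=True, mid_boost=True))
```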
In my opinion:
Audio clipping can give more intelligibility under weak-signal conditions
when the overall frequency response is not optimal. This is only because
clipping generates harmonics that are heard better over the noise. Harmonics
are a poor substitute for the frequencies that are really present in the human
voice, hence the need for good equalization combined with RF clipping.
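To illustrate the harmonic generation, here is a small sketch that hard-clips one period of a sine wave and measures individual harmonics with a direct Fourier sum. The names, the 6 dB clipping depth and the single test tone are assumptions for the example. A real voice signal is of course far more complex.

```python
import math

def hard_clip(x, level):
    """Symmetric hard clipper."""
    return max(-level, min(level, x))

def harmonic_amplitude(samples, k):
    """Amplitude of harmonic k, assuming samples hold exactly one period."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

N = 1024
tone = [math.sin(2 * math.pi * i / N) for i in range(N)]
clipped = [hard_clip(s, 0.5) for s in tone]   # 0.5 is 6 dB below the peak

for k in (1, 2, 3, 5):
    print("harmonic", k,
          "clean:", round(harmonic_amplitude(tone, k), 4),
          "clipped:", round(harmonic_amplitude(clipped, k), 4))
```

The clean tone has energy only at the fundamental. After symmetric clipping, odd harmonics appear, and with audio clipping those that land inside the transmit passband go out on the air.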
Currently, I do not use all of my ideas here. I use the Yaesu FT-736R.
I have increased the RF clipping in the processor. There is an RF filter
both before and after the RF clipper. It is sufficient to copy my own SSB
echoes off the moon.
I hope this summary of my findings and opinions is useful.
73, Jim Shaffer, WB9UWA.
On 24 Sep 2004 at 20:43, Leif Åsbrink wrote:
> Hi All,
> Working with the WSE hardware for Linrad (and any other
> SDR package that someone might want to use it with) has
> diverted my focus from Linrad to performance of ham radio
> transceivers in general.
> As I see it, the use of ALC for speech processing is the
> most severe limitation for dynamic range right now. It is
> absolutely not the right way of compressing speech, that
> should be done before the signal is sent to the bandwidth-
> defining filter.
> There seems to be a general consensus among amateurs that
> RF clipping is very much better than audio clipping and
> therefore (I think) the obvious modification which is to
> use the clipped signal intended for FM to feed the SSB
> generator has not become popular.
> A voice peak that has been clipped to a flat top resembles
> a square wave. The corners caused by clipping contain energy
> over a wide frequency range. When the clipped signal is sent
> through a rectangular filter, there will be oscillations
> corresponding to the removed signal energy outside the passband.
> These oscillations are around the clipping level and they reach
> as high as 3 dB above the clipping level for very hard clipping.
> Using an ALC to flatten the waveform will of course restore the
> energy outside the passband which is a really bad idea. Energy
> outside the passband will be useless to the QSO partner.
> A properly operating ALC will set the gain for the peak of the
> oscillation to not saturate the power amplifier, but then the
> average power suffers slightly.
> There are many possible solutions to this problem. One wants to
> limit the peaks in the speech in a way that does not create signals
> outside the passband. One way is to have a chain of several band pass
> filters. The clipping then gradually becomes softer. Another way is to
> use a soft limiter (like vacuum tubes in HIFI amplifiers are supposed
> to work). It is also possible to find a signal that fits to the filter
> and that one can AM modulate the SSB signal with to generate a
> waveform that does not go above the desired maximum level but that
> retains as much as possible of the information content.
> I have just started to write a little about these things
> for Dubus. I do not know how to deduce what kind of processing
> to prefer from theory so I decided to do it experimentally.
> Linrad-01.25 contains a package "voicelab" which produces
> random Phonetics, Alfa, Bravo, Charlie.... with selectable
> processing and a fixed peak signal level. To this processed
> voice signal one can add white noise at suitable levels.
> When running the program one should then press the correct
> keyboard key for each letter or number, Linrad then gives
> the percentage of correct key pressings.
> When I do the testing with my own voice I find that there
> is a very small difference between RF clipping and audio
> clipping. Audio clipping actually gives slightly better
> intelligibility at the threshold. At high S/N the RF clipper
> sounds much better than the audio clipper.
> You can use your own voice to record a file containing the
> Phonetics and then you can evaluate what processing will give
> the best readability in your own and in your friends' ears.
> You may also find that although very hard clipped signals
> can be copied at very low S/N, such signals sound very ugly
> and can not be copied to 100% at any signal level. In real life
> it may be better to lose 1 or 2 dB on the detection limit in
> order to have easy copy when QSB lifts the signal above the
> I would be interested in some feedback on your findings if you
> are interested to take the time to do some statistics on
> how you can copy at different signal levels. My interest is
> twofold. Firstly I want to know for sure how it really is
> for several different voices before I write what I think one
> should do to avoid the ALC generated splatter in modern
> transceivers. Secondly I want to know what types of processing
> to implement for the Linrad transmitter.
> The voicelab package is intended as an aid when setting up
> the speech processing for the Linrad transmitter. It is far
> too complicated and it is very inefficient in terms of
> CPU usage to be practically useful in the Linrad transmitter,
> but once I know better what kind of processing to use I will
> make fast code to produce the same result. The current filters
> are implemented by FFTs that span the entire time of one
> Phonetic (2 seconds) and there are many filters at AF and RF
> (operating on real or complex waveforms)
> Leif / SM5BSZ
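The overshoot described in the quoted message, where a clipped signal sent through a rectangular filter oscillates above the clipping level, can be seen in a simple baseband sketch. This is only an illustration under assumed conditions: a single hard-clipped sine, and an ideal filter modelled by keeping the first few harmonics. It shows the effect but does not reproduce the 3 dB figure quoted for hard RF clipping. All names and numbers are chosen for the example.

```python
import math

def clipped_sine(n, level):
    """One period of a unit sine, hard-clipped at +/- level."""
    return [max(-level, min(level, math.sin(2 * math.pi * i / n)))
            for i in range(n)]

def brickwall_lowpass(samples, max_harmonic):
    """Ideal rectangular filter for a one-period waveform:
    keep harmonics 1..max_harmonic and discard everything else."""
    n = len(samples)
    out = [0.0] * n
    for k in range(1, max_harmonic + 1):
        a = 2.0 / n * sum(s * math.cos(2 * math.pi * k * i / n)
                          for i, s in enumerate(samples))
        b = 2.0 / n * sum(s * math.sin(2 * math.pi * k * i / n)
                          for i, s in enumerate(samples))
        for i in range(n):
            angle = 2 * math.pi * k * i / n
            out[i] += a * math.cos(angle) + b * math.sin(angle)
    return out

level = 0.1                          # 20 dB of clipping on a unit sine
x = clipped_sine(1024, level)
y = brickwall_lowpass(x, 5)          # passband holds harmonics 1 to 5 only
peak = max(abs(v) for v in y)
overshoot_db = 20.0 * math.log10(peak / level)
print("peak after filter is", round(overshoot_db, 2), "dB above the clipping level")
```

The filtered waveform no longer has a flat top. Its peaks sit above the clipping level, which is the headroom that either the ALC or the power amplifier then has to deal with.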