DG1SCR > DSP 08.04.95 17:33l 506 Lines 20077 Bytes #999 (999) @ EU
BID : 0845DB0RBS7E
Read: DL2WP DL1EJW GUEST DK5RAS
Subj: comp.dsp-faq 3/8
Path: DB0GOS<DB0HAG<DB0END<DB0ACC<DB0PKE<DB0ACH<DB0NDK<DB0AIS<DB0HOM<DB0RBS
Sent: 950408/1448z @:DB0RBS.#BW.DEU.EU [SCHWIEBERDINGEN OP:DL1SEM] BCM1.36
From: DG1SCR @ DB0RBS.#BW.DEU.EU (Juergen)
To : DSP @ EU
1.2.14 What is MathViews? Where can I get it?
Package-Name: mathview.zip
MathViews for Windows/32 - Math Software for Windows
(32-bit). Current version is 1.60. "MathViews for
Windows/32 is Matlab look-alike. It has a full set of linear
algebra and signal processing functionality."
No sources. Windows 3.1. Shareware. Try:
ftp.cica.indiana.edu, oak.oakland.edu or wuarchive.wustl.edu
Author: Dr. Shalom Halevy 70274.2564@compuserve.com PO BOX
22564, San Diego, CA 92192 (619) 552-9031 USA (Tel/FAX)
1.2.15 What is Shorten? Where can I get it?
Shorten is a compressor/coder for waveform files. Two major
changes have been made since the last announcement:
a) Thanks to the efforts of two users there is now a MS-DOS
executable (version 1.09) available on:
file://svr-ftp.eng.cam.ac.uk/comp.speech/sources/shn109.exe
b) The latest version, 1.11, has early support for lossy
compression. This is achieved by quantisation of the
prediction residual which maximises the segmental signal to
noise ratio. This works well for many waveforms - for
speech the quality is sometimes better and sometimes worse
than the various CCITT ADPCM standards. The advantages are
that the code is very fast, will accept most known file
formats and will code from lossless compression down to
three bits per sample. The disadvantage is that this is a
variable bit rate scheme and so is more suited to storage
than transmission applications. It is available from:
file://svr-ftp.eng.cam.ac.uk/comp.speech/sources/shorten-1.11.tar.Z
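As a rough illustration of what "quantisation of the prediction
residual" means, the sketch below predicts each sample from the
previously reconstructed one and codes only the difference. It is a
minimal sketch under stated assumptions, not code from Shorten: the
first-order predictor, the bit-dropping quantiser and all function
names are hypothetical, and no attempt is made to maximise the
segmental signal to noise ratio.

```c
#include <assert.h>
#include <stddef.h>

/* Encode: res[n] = x[n] - prediction, where the prediction is the
 * previously *reconstructed* sample, so that the decoder can form
 * exactly the same prediction. shift = 0 is lossless; shift > 0 drops
 * that many low-order bits of each residual (lossy).
 * All names here are hypothetical, chosen for illustration only. */
void residual_encode(const int *x, int *res, size_t n, int shift)
{
    int prev = 0;                       /* predictor state */
    size_t i;

    assert(shift >= 0 && shift < 15);
    for (i = 0; i < n; i++) {
        int e = x[i] - prev;            /* prediction residual */
        int q = e / (1 << shift);       /* coarse quantisation, truncates toward 0 */
        res[i] = q;
        prev += q * (1 << shift);       /* track what the decoder will rebuild */
    }
}

/* Decode: accumulate the (rescaled) residuals to rebuild the signal. */
void residual_decode(const int *res, int *y, size_t n, int shift)
{
    int prev = 0;
    size_t i;

    assert(shift >= 0 && shift < 15);
    for (i = 0; i < n; i++) {
        prev += res[i] * (1 << shift);
        y[i] = prev;
    }
}
```

With shift = 0 the decoder returns the input exactly. With shift = s
every reconstructed sample is within 2^s - 1 of the original, because
the encoder feeds its own quantisation error back into the next
prediction instead of letting it accumulate.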
The MS-DOS version comes with no support whatsoever - you
have been warned. I'll be able to test and maintain this
code when someone decides that it is worth funding the kit
to enable me to do this.
The UNIX version has been tested on many platforms and there
are no known portability problems. If you have problems,
then please tell me.
Feedback from USENET readers has been very valuable in the
past, and I'd like to ask for this again. I'll incorporate
as many suggestions as I can into version 2.0.
Contact: Tony Robinson (ajr@dsl.eng.cam.ac.uk)
Q2.1: Where can I get some algorithms for general DSP?
The following archives contain things such as matrix
operations, FFT's and generally useful things like that, as
opposed to complete applications.
Netlib, which serves some of this software via email. Try mail to
netlib@ORNL.GOV with "send help" in the subject field.
For Europe:
Internet: netlib@nac.no
EARN/BITNET: netlib%nac.no@norunix.bitnet
X.400: s=netlib; o=nac; c=no;
EUNET/uucp: nac!netlib
For the Pacific, try netlib@draci.cs.uow.edu.au
For background about netlib, see Jack J. Dongarra and Eric Grosse,
"Distribution of Mathematical Software Via Electronic Mail,"
Comm. ACM (1987) 30,403--407.
A similar collection of statistical software is available from
statlib@temper.stat.cmu.edu.
The symbolic algebra system REDUCE is supported by
reduce-netlib@rand.org.
The Naval Surface Warfare Center has a library of mathematical
Fortran subroutines that may be of use. From the report itself:
NSWC Library of Mathematical Subroutines
Report No.: NSWC TR 90-21, January 1990
by Alfred H. Morris, Jr.
Naval Surface Warfare Center (E43)
Dahlgren, VA 22448-5000
U.S.A.
Distribution: Approved for public release; distribution unlimited.
Abstract:
The NSWC library is a library of general-purpose Fortran
subroutines that provide a basic computational capability in
a variety of mathematical activities. Emphasis has been
placed on the transportability of the codes. Subroutines are
available in the following areas: Elementary Operations,
Geometry, Special Functions, Polynomials, Vectors, Matrices,
Large Dense Systems of Linear Equations, Banded Matrices,
Sparse Matrices, Eigenvalues and Eigenvectors, l1 Solution of
Linear Equations, Least-Squares Solution of Linear Equations,
Optimization, Transforms, Approximation of Functions, Curve
Fitting, Surface Fitting, Manifold Fitting, Numerical
Integration, Integral Equations, Ordinary Differential
Equations, Partial Differential Equations
[Witold Waldman, witold@hotblk.aed.dsto.gov.au]
This is available from
file://euler.math.usma.edu/pub/misc/nswc.tar.Z
This is a 3.2 Mbyte file containing the 800+ Fortran routines
mentioned above.
The Fortran source code from the IEEE Press book "Programs
For Digital Signal Processing" is available from
file://nimios.eng.mcmaster.ca/pub/IEEE/software/dsp.zip
It includes FIR and IIR filter design software, FFT
subroutines, interpolation programs, a coherence and
cross-spectral estimation program, linear prediction analysis
programs, and a frequency domain filtering program.
[Witold Waldman, witold@hotblk.aed.dsto.gov.au, from
Charles Owen, mgcbo@uxa.ecn.bgu.edu]
Also, see the summary of DSP-related FTP sites, at the end of
this FAQ.
If you don't know where to find what you're after, try archie.
SigLib
SigLib is an ANSI C source DSP library. Current version is
1.61. SigLib has been compiled to run on IBM PCs, Sun
Workstations and the following DSPs : TMS320C30, TMS320C40,
DSP96002 and ADSP21020.
SigLib contains about 130 base functions, from which many
others are derived and over 80 demonstration programs, all of
which exercise more than one part of the library at any one
time. The library source and examples supplied total more
than 18000 lines of code. SigLib also includes DFilter, an
FIR and IIR digital filter design program and WinBuf, a
Windows 3 graphical front end, for display of process
results.
Some applications of SigLib include drill string vibration
analysis, room response analysis, audio effects,
telecommunications, active control of sound and vibration,
system simulation and medical imaging.
Registered users of SigLib get one year's free upgrade and
maintenance.
Spectrum analysis: FFTs and IFFTs; real, complex, zoom and
  spectrograms, microscan.
Windowing types: Hanning, Hamming, Blackman, Triangle, Rectangle,
  Kaiser and Blackman-Harris.
Fixed coefficient filtering: FIR, comb, IIR and one pole IIR
  filters, filter design methods, polyphase multi-rate filters,
  differentiation and integration filters, Hilbert transformers.
Adaptive coefficient filtering: LMS.
Convolution and correlation: convolve, correlate.
Imaging: conv3x3, histogram and 2DFFT.
Signal generation: Sine, Cosine, White noise, Chirp (linear and
  non-linear), Square, Triangular, Sawtooth, Impulse, PN sequence.
Modulation: AM, complex shift, FSK, spectral inversion, FM, QAM.
Statistical analysis: sum, mean, average, standard deviation and
  variance, kurtosis.
Regression analysis: linear, logarithmic, exponential, power.
Digital effects: reverb, distortion, echo, pitch shifting.
Utility functions: scaling (lin and log), offset, min/max find,
  clip, rotate, buffer lengthen, buffer shorten, buffer addition,
  multiplication etc., histogram, quantise, absolute, peak hold,
  polynomial expansion.
Control: PID.
Graphics: display_buffer, display_buffer_line, print_buffer,
  display_3d_buffer, pole_zero_plot, xy_plot.
Data stream disk I/O functions.
ITEM                               PRICE
SigLib object code                 UKP20, US$30
SigLib source and object code      UKP35, US$60
Educational Price (Source code)    UKP25, US$40
Available on 3.5" diskette
UK shipping per package            UKP3
Non UK shipping per diskette       UKP5, US$8
These fees include 1 year's free upgrade and maintenance.
Payment preferably by Money Order or Cheque.
email:johned@cix.compulink.co.uk
John Edwards, Numerix, 157 Sileby Road, Barrow-on-Soar,
Leics, LE12 8LW, UK.
Phone : +44 (0)509 413195, UK time between 5.30 PM and 9.00
PM.
Q2.2: What are CELP and LPC? Where can I get the source for CELP and LPC?
CELP stands for "code excited linear prediction". LPC stands for
"linear predictive coding". They are compression algorithms used for
low bit rate (2400 and 4800 bps) speech coding.
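To make "linear prediction" concrete: each sample is approximated as
a weighted sum of the samples before it, and the coder transmits the
weights rather than the waveform. The sketch below computes such
weights with the autocorrelation method and the Levinson-Durbin
recursion, a standard textbook approach. It is not taken from the
FS-1015 or FS-1016 sources; the function names and the order limit
LPC_MAX_ORDER are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define LPC_MAX_ORDER 16   /* arbitrary limit for this sketch */

/* Autocorrelation r[0..order] of the frame x[0..n-1]. */
void autocorr(const double *x, size_t n, int order, double *r)
{
    int k;
    size_t i;

    for (k = 0; k <= order; k++) {
        r[k] = 0.0;
        for (i = (size_t)k; i < n; i++)
            r[k] += x[i] * x[i - (size_t)k];
    }
}

/* Levinson-Durbin recursion. On return a[1..order] holds predictor
 * coefficients such that x[n] is predicted by the sum over k of
 * a[k] * x[n-k]; a[0] is unused. Returns the remaining prediction
 * error energy. The frame must not be silent (r[0] > 0), and the
 * sketch does not guard against a perfectly predictable signal. */
double levinson(const double *r, int order, double *a)
{
    double tmp[LPC_MAX_ORDER + 1];
    double err = r[0];
    int i, j;

    assert(order >= 1 && order <= LPC_MAX_ORDER);
    assert(r[0] > 0.0);

    for (i = 1; i <= order; i++) {
        double acc = r[i];
        double k;

        for (j = 1; j < i; j++)
            acc -= a[j] * r[i - j];
        k = acc / err;                  /* reflection coefficient */

        for (j = 1; j < i; j++)         /* update lower-order coefficients */
            tmp[j] = a[j] - k * a[i - j];
        for (j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;

        err *= (1.0 - k * k);
    }
    return err;
}
```

A caller would typically window one frame of speech, call autocorr()
and then levinson() with an order of around 10, and quantise the
resulting coefficients for transmission.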
The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C
simulation source codes are available for worldwide distribution (on
DOS diskettes, but configured to compile on Sun SPARC stations) from
NTIS and DTIC. Example input and processed speech files are
included. A Technical Information Bulletin (TIB), "Details to Assist
in Implementation of Federal Standard 1016 CELP," and the official
standard, "Federal Standard 1016, Telecommunications: Analog to
Digital Conversion of Radio Voice by 4,800 bit/second Code Excited
Linear Prediction (CELP)," are also available.
This is available through the National Technical Information Service:
NTIS
U.S. Department of Commerce
5285 Port Royal Road
Springfield, VA 22161
USA
Phone: +1 703 487-4650
Source code may also be obtained from:
file://svr-ftp.eng.cam.ac.uk/comp.speech/sources/celp_3.2a.tar.gz and
file://super.org/pub/speech/celp_3.2a.tar.Z
The code (C, FORTRAN, diskio) has all been built and tested on a Sun4
under SunOS4.1.3. If you want to run it somewhere else, then you may
have to do a bit of work. (A Solaris 2.x-compatible release is
planned soon.)
One note to PCers. The files:
cbsearch.F celp.F csub.F mexcite.F psearch.F
are meant to be passed through the C preprocessor (cpp). We gather that DOS (or
whatever it's called) can't distinguish the .F from a .f. Be careful!
Very limited support is available from the authors (Joe, et
al.). Please do not send questions or suggestions without first
reading the documentation (README files, the Technical Information
Bulletin, etc.). The authors would enjoy hearing from you, but they
have limited time for support and would like to use it as efficiently
as possible. They welcome bug reports, but, again, please read the
documentation first. All users of FS-1016 CELP software are strongly
encouraged to acquire the latest release (version 3.2a as of this
writing).
The "AD" ordering number for the CELP software is AD M000 118 (US$
90.00) and for the TIB it's AD A256 629 (US$ 17.50). The LPC-10
standard, described below, is FIPS Pub 137 (US$ 12.50). There is a
$3.00 shipping charge on all U.S. orders. The telephone number for
their automated system is 703-487-4650, or 703-487-4600 if you'd
prefer to talk with a real person.
(U.S. DoD personnel and contractors can receive the package from the
Defense Technical Information Center: DTIC, Building 5, Cameron
Station, Alexandria, VA 22304-6145. Their telephone number is
703-274-7633.)
The following articles describe the Federal-Standard-1016 4.8-kbps
CELP coder (it's unnecessary to read more than one):
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
Federal Standard 1016 4800 bps CELP Voice Coder," Digital Signal
Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
DoD 4.8 kbps Standard (Proposed Federal Standard 1016)," in Advances
in Speech Coding, ed. Atal, Cuperman and Gersho, Kluwer Academic
Publishers, 1991, Chapter 12, p. 121-133.
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
Proposed Federal Standard 1016 4800 bps Voice Coder: CELP," Speech
Technology Magazine, April/May 1990, p. 58-64.
The U.S. DoD's Federal-Standard-1015/NATO-STANAG-4198 based 2400 bps
linear prediction coder version 53 (LPC-10e v53) Fortran or C
simulation source codes are available on a limited basis upon written
request to:
Tom Tremain
Department of Defense
Ft. Meade, MD 20755-6000
USA
This is also available from
file://svr-ftp.eng.cam.ac.uk/comp.speech/sources/celp_3.2a.tar.gz and
file://super.org/pub/speech/lpc10-1.0.tar.gz
There is also a section about FS-1015 in the book: Panos
E. Papamichalis, Practical Approaches to Speech Coding, Prentice-Hall,
1987.
The following article describes the FS 1016 4.8-kbps CELP coder:
Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
Proposed Federal Standard 1016 4800 bps Voice Coder: CELP," Speech
Technology Magazine, April/May 1990, p. 58-64.
Copies of the official standard "Federal Standard 1016,
Telecommunications: Analog to Digital Conversion of Radio Voice by
4,800 bit/second Code Excited Linear Prediction (CELP)" are available
for US$ 5.00 each from:
GSA Federal Supply Service Bureau
Specification Section, Suite 8100
470 E. L'Enfant Place, S.W.
Washington, DC 20407
(202)755-0325
The U.S. Federal Standard 1015 (NATO STANAG 4198) is described in:
Thomas E. Tremain, "The Government Standard Linear Predictive Coding
Algorithm: LPC-10," Speech Technology Magazine, April 1982, p. 40-49.
The voicing classifier used in the enhanced LPC-10 (LPC-10e) is
described in: Campbell, Joseph P., Jr. and T. E. Tremain,
"Voiced/Unvoiced Classification of Speech with Applications to the
U.S. Government LPC-10E Algorithm," Proceedings of the IEEE
International Conference on Acoustics, Speech, and Signal Processing,
1986, p. 473-6.
Realtime DSP code for FS-1015 and FS-1016 is sold by several vendors,
including DSP Software Engineering and Analogical Systems (see the
vendor address list in section 5 for contact info). DSP Software
Engineering's FS-1016 code can run on a DSP Research's Tiger 30 or on
Intellibit's AE2000 TMS320C31 based 3" by 2.5" card. See section 4.1
for more on these cards. Analogical's product runs on a 27 MHz
DSP56001 chip.
[Most of the above from Joe Campbell, jpcampb@afterlife.ncsc.mil, with
additions from Dan Frankowski, dfrankow@cs.umn.edu, and Ed Hall,
edhall@rand.org]
Q2.3: What is ADPCM? Where can I get sources for it?
ADPCM stands for Adaptive Differential Pulse Code Modulation. It is a
family of speech compression and decompression algorithms. A common
implementation takes 16-bit linear PCM samples and converts
them to 4-bit samples, yielding a compression ratio of 4:1.
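The following toy coder shows the general shape of such a scheme:
code the difference from a running prediction with 4 bits, adapt the
step size as you go, and pack two codes per byte. It is a minimal
sketch only. It is NOT the Intel/DVI (IMA) algorithm or any CCITT
standard; the step adaptation rule, the struct and every function
name are assumptions invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy adaptive differential coder (hypothetical, for illustration). */
struct toy_adpcm_state {
    int pred;   /* last reconstructed sample */
    int step;   /* current quantiser step size */
};

void toy_adpcm_init(struct toy_adpcm_state *s)
{
    s->pred = 0;
    s->step = 16;
}

/* Apply one 4-bit code (bit 3 = sign, bits 0-2 = magnitude) to the
 * state. Encoder and decoder both call this, which keeps their
 * predictions and step sizes identical. */
static void toy_adpcm_update(struct toy_adpcm_state *s, unsigned code)
{
    int mag = (int)(code & 7u);
    int diff = mag * s->step + s->step / 2;

    s->pred += (code & 8u) ? -diff : diff;
    if (s->pred > 32767) s->pred = 32767;
    if (s->pred < -32768) s->pred = -32768;

    /* Grow the step after large codes, shrink it after small ones. */
    s->step = (mag >= 4) ? s->step * 2 : (s->step * 3) / 4;
    if (s->step < 1) s->step = 1;
    if (s->step > 4096) s->step = 4096;
}

/* Encode n 16-bit samples into (n + 1) / 2 bytes, two codes per byte. */
void toy_adpcm_encode(const short *in, unsigned char *out, size_t n,
                      struct toy_adpcm_state *s)
{
    size_t i;

    assert(s->step >= 1);
    for (i = 0; i < n; i++) {
        int diff = in[i] - s->pred;
        unsigned sign = (diff < 0) ? 8u : 0u;
        unsigned code;
        int mag;

        if (diff < 0) diff = -diff;
        mag = diff / s->step;
        if (mag > 7) mag = 7;
        code = sign | (unsigned)mag;

        if (i % 2 == 0)
            out[i / 2] = (unsigned char)code;            /* low nibble */
        else
            out[i / 2] |= (unsigned char)(code << 4);    /* high nibble */

        toy_adpcm_update(s, code);
    }
}

/* Decode n samples from the packed 4-bit codes. */
void toy_adpcm_decode(const unsigned char *in, short *out, size_t n,
                      struct toy_adpcm_state *s)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned code = (i % 2 == 0) ? (in[i / 2] & 15u)
                                     : (unsigned)(in[i / 2] >> 4);
        toy_adpcm_update(s, code);
        out[i] = (short)s->pred;
    }
}
```

Because the encoder updates its state from the code it just emitted
rather than from the true sample, the decoder reproduces the same
state from the codes alone. Eight 16-bit samples (16 bytes) become 4
bytes, which is the 4:1 ratio mentioned above.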
There is public domain C code available via anonymous ftp at
file://ftp.cwi.nl/pub/adpcm.shar written by Jack Jansen (email
Jack.Jansen@cwi.nl). It is very programmer-friendly. The ADPCM code
used is the Intel/DVI ADPCM code which is being recommended by the IMA
Digital Audio Technical Working Group. It allows the following calls:
adpcm_coder(short inbuf[], char outbuf[], int nsample, struct adpcm_state *state);
adpcm_decoder(char inbuf[], short outbuf[], int nsample, struct adpcm_state *state);
The routines have been tested on an SGI Indigo running Irix 4.0.2 and
on a Sparcstation 1+ running SunOS 4.1.1. On a Sun, the code will
compress at 250Ksample/sec and decompress at 300Ksample/sec. On an
SGI, the compressor runs at 350Ksample/sec and the decompressor at
700Ksample/sec.
Note that this is NOT a CCITT G722 coder. The CCITT ADPCM standard is
much more complicated, probably resulting in better quality sound but
also in much more computational overhead.
You can get a G.721/722/723 package by email to teledoc@itu.arcom.ch,
with GET ITU-3022 as the *only* line in the body of the message. This
was originally written by Sun, and includes ADPCM and ulaw encoders.
This is also available as:
file://svr-ftp.eng.cam.ac.uk/comp.speech/sources/G711_G722_G723.tar.Z
[From Dan Frankowski, dfrankow@cs.umn.edu; Jack Jansen, Jack.Jansen@cwi.nl]
Q2.4: What is GSM? Where can I get source for it?
The Communications and Operating Systems Research Group (KBS) at the
Technische Universitaet Berlin is currently working on a set of
UNIX-based tools for computer-mediated telecooperation that will be
made freely available.
As part of this effort we are publishing an implementation of the
European GSM 06.10 provisional standard for full-rate speech
transcoding, prI-ETS 300 036, which uses RPE/LTP (residual pulse
excitation/long term prediction) coding at 13 kbit/s.
GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling
rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility
with typical UNIX applications, our implementation turns frames of 160
16-bit linear samples into 33-byte frames (1650 Bytes/s). The quality
of the algorithm is good enough for reliable speaker recognition; even
music often survives transcoding in recognizable form (given the
bandwidth limitations of 8 kHz sampling rate).
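The rates quoted above follow directly from the frame sizes, and the
arithmetic can be checked with a few lines of C. The constants come
from the text; the helper names, and the assumption that a final
partial frame is padded out to a full one, are ours and are not part
of the TU Berlin library API.

```c
#include <assert.h>

/* Figures quoted in the text for GSM 06.10 full-rate coding. */
enum {
    GSM_SAMPLE_RATE_HZ    = 8000,
    GSM_SAMPLES_PER_FRAME = 160,
    GSM_BITS_PER_FRAME    = 260,
    GSM_BYTES_PER_FRAME   = 33   /* 264 bits: 4 more than the 260 coded bits */
};

/* 8000 / 160 = 50 frames per second (one frame every 20 ms). */
int gsm_frames_per_second(void)
{
    return GSM_SAMPLE_RATE_HZ / GSM_SAMPLES_PER_FRAME;
}

/* 50 * 260 = 13000 bit/s, the 13 kbit/s of the standard. */
int gsm_bits_per_second(void)
{
    return gsm_frames_per_second() * GSM_BITS_PER_FRAME;
}

/* 50 * 33 = 1650 bytes/s for the byte-aligned frame format. */
int gsm_bytes_per_second(void)
{
    return gsm_frames_per_second() * GSM_BYTES_PER_FRAME;
}

/* Hypothetical helper: bytes needed to store nsamples of input,
 * assuming the last partial frame is padded to a full frame. */
long gsm_encoded_size(long nsamples)
{
    long frames;

    assert(nsamples >= 0);
    frames = (nsamples + GSM_SAMPLES_PER_FRAME - 1) / GSM_SAMPLES_PER_FRAME;
    return frames * GSM_BYTES_PER_FRAME;
}
```

For comparison, the 16-bit linear input runs at 16000 bytes/s, so the
33-byte frame format is a little under a 10:1 reduction.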
The interfaces offered are a front end modelled after compress(1), and
a library API. Compression and decompression run faster than realtime
on most SPARCstations. The implementation has been verified against
the ETSI standard test patterns.
Contacts:
Jutta Degener (jutta@cs.tu-berlin.de)
Carsten Bormann (cabo@cs.tu-berlin.de)
Communications and Operating Systems Research Group, TU Berlin
Fax: +49.30.31425156, Phone: +49.30.31424315
An implementation can be had from:
file://tub.cs.tu-berlin.de/pub/tubmik/gsm-1.0.tar.Z
with file://tub.cs.tu-berlin.de/pub/tubmik/gsm-1.0-patch1 and
file://tub.cs.tu-berlin.de/pub/tubmik/gsm-1.0-patch2
or as a faster but not always up-to-date alternative:
file://liasun3.epfl.ch/pub/audio/gsm-1.0pl2.tar.Z
[From Dan Frankowski, dfrankow@cs.umn.edu]
Q2.5: How does pitch perception work, and how do I implement it on my DSP chip?
Firstly, there is a FAQ devoted purely to the topic of human audio
perception. It is written by Argiris Kranidiotips
(akra@uranus.di.uoa.ariadne-t.gr) and is regularly posted to several
newsgroups, including comp.dsp and comp.speech. However, it is not
restricted to speech topics. A recent version of this text is
available from:
ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/info/HumanAudioPerception
Pitch is officially defined as "That attribute of auditory sensation
in terms of which sounds may be ordered on a musical scale." Several
good examples illustrating the subtleties of pitch perception are
included in the "Auditory Demonstrations CD" which is available from
the Acoustical Society of America, Woodbury, NY 10797 for $20.
A good general reference about the psychology of pitch perception is the book:
B.C.J. Moore, "An Introduction to the Psychology of Hearing", Academic
Press, London, 1989.
This book is available in paperback and makes a good desk reference.
An algorithm implementation that matches a large body of
psychoacoustical work, but which is computationally very intensive, is
presented in the paper:
Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector,"
Proceedings of the International Conference of Acoustics, Speech, and
Signal Processing, 1990, Albuquerque, New Mexico.
The definitive papers describing the use of such a perceptual pitch
detector as applied to the classical pitch literature are:
Ray Meddis and M. J. Hewitt. "Virtual pitch and phase sensitivity of a
computer model of the auditory periphery." Journal of the Acoustical
Society of America 89 (6 1991): 2866-2882 and 2883-2894.
The current work that argues for a pure spectral method starts with
the work of Goldstein:
J. Goldstein, "An optimum processor theory for the central formation
of the pitch of complex tones," Journal of the Acoustical Society of
America 54, 1496-1516, 1973.
Two approaches are worth considering if something approximating pitch
is appropriate. The people at IRCAM have proposed a harmonic analysis
approach that can be implemented on a DSP:
Boris Doval and Xavier Rodet, "Estimation of Fundamental Frequency of
Musical Sound Signals," Proceedings of the 1991 International
Conference on Acoustics, Speech, and Signal Processing, Toronto,
Volume 5, pp. 3657-3660.
The classic paper for time domain (peak picking) pitch algorithms is:
B. Gold and L. Rabiner, "Parallel processing techniques for estimating
pitch periods of speech in the time domain," Journal of the Acoustical
Society of America, 46, pp 441-448, 1969.
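If a rough time-domain estimate is all that is needed, a very small
starting point is to pick the lag at which the signal best matches a
delayed copy of itself. The sketch below does this with a plain
autocorrelation search. It is deliberately simpler than, and not an
implementation of, the Gold and Rabiner parallel processing method
or any of the perceptual models cited above; the function name and
parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Return the lag in [min_lag, max_lag] (in samples) at which the
 * autocorrelation of x[0..n-1] is largest. That lag is taken as the
 * pitch period. Ties go to the smallest lag. */
size_t estimate_period(const double *x, size_t n,
                       size_t min_lag, size_t max_lag)
{
    size_t lag, best_lag = min_lag;
    double best = 0.0;
    int have_best = 0;

    assert(min_lag >= 1 && min_lag <= max_lag && max_lag < n);

    for (lag = min_lag; lag <= max_lag; lag++) {
        double acc = 0.0;
        size_t i;

        for (i = 0; i + lag < n; i++)
            acc += x[i] * x[i + lag];

        if (!have_best || acc > best) {
            best = acc;
            best_lag = lag;
            have_best = 1;
        }
    }
    return best_lag;
}
```

The fundamental frequency estimate is the sampling rate divided by
the returned lag, so a period of 50 samples at 8 kHz corresponds to
160 Hz. The search range matters: too wide a range invites octave
errors, and the function always returns a single value even for
sounds that have no clear pitch or more than one.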
Finally, a word of caution: Pitch is not single-valued. We can hear a
sound and match it to several different pitches. Imagine the number of
instruments in an orchestra, each with its own pitch. Even a single
sound can have more than one pitch. See for example Demonstration 27
from the ASA Auditory Demonstrations CD.
[The above from Malcolm Slaney, Apple Computer, and John Lazzaro,
U.C. Berkeley.]