Nonlinear long-term prediction of speech signals

M. Birgmeier; H.-P. Bernhard; G. Kubin

doi:10.1109/ICASSP.1997.596180

Publication: Chapter in Book/Report/Conference proceeding › Conference paper › peer-reviewed

Abstract

This paper presents an in-depth study of nonlinear long-term prediction of speech signals. While previous studies of nonlinear prediction focused on short-term prediction (with only moderate performance advantage over adaptive linear prediction in most cases), successful long-term prediction strongly depends on the nonlinear oscillator framework for speech modeling. This hypothesis has been confirmed in a series of experiments run on a voiced speech database. We provide results for the prediction gain as a function of the prediction delay using two methods. One is based on an extended form of radial basis function networks and is intended to show what performance can be reached using a nonlinear predictor. The other relies on calculating the mutual information between multiple signal samples. We explain the role of this mutual information function as the upper bound on the achievable prediction gain. We show that with matching memory and dimension, the two methods yield nearly the same value for the achievable prediction gain. We try to make a fair comparison of these values against those obtained using optimized linear predictors of various orders. It turns out that the nonlinear predictor's gain is significantly higher than that for a linear predictor using the same parameters.
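The quantity the abstract reports, prediction gain as a function of prediction delay, can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: a synthetic noisy sinusoid stands in for voiced speech, a least-squares linear predictor stands in for the extended radial basis function network, and all function names are hypothetical. The last helper covers only the jointly Gaussian case, where a mutual information of I bits corresponds to a gain of 20·log10(2)·I ≈ 6.02·I dB; the abstract does not state which conversion the authors use.

```python
import numpy as np


def delay_embed(x, order, delay):
    """Build (inputs, targets) for predicting x[n] from x[n-delay], ..., x[n-delay-order+1]."""
    x = np.asarray(x, dtype=float)
    start = delay + order - 1  # first n for which all inputs exist
    cols = [x[start - delay - k: len(x) - delay - k] for k in range(order)]
    return np.column_stack(cols), x[start:]


def prediction_gain_db(target, prediction):
    """Prediction gain in dB: signal power over prediction-error power."""
    err = target - prediction
    return 10.0 * np.log10(np.mean(target ** 2) / np.mean(err ** 2))


def linear_gain_db(x, order, delay):
    """Fit a least-squares linear predictor on the first half, score it on the second."""
    inputs, targets = delay_embed(x, order, delay)
    half = len(targets) // 2
    coeffs, *_ = np.linalg.lstsq(inputs[:half], targets[:half], rcond=None)
    return prediction_gain_db(targets[half:], inputs[half:] @ coeffs)


def gaussian_gain_db_from_mi(mi_bits):
    """Gain in dB equivalent to `mi_bits` of mutual information (Gaussian case only)."""
    return 20.0 * np.log10(2.0) * mi_bits


if __name__ == "__main__":
    # Assumed test signal: period-50 sinusoid with weak additive noise.
    rng = np.random.default_rng(0)
    n = np.arange(4000)
    signal = np.sin(2 * np.pi * n / 50) + 0.01 * rng.standard_normal(n.size)
    for delay in (1, 25, 50):
        print(delay, round(linear_gain_db(signal, order=2, delay=delay), 1))
```

Scoring on held-out samples keeps the gain from being inflated by overfitting, which matters more once the linear fit is replaced by a flexible nonlinear predictor.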

Original language	English
Title of host publication	1997 IEEE International Conference on Acoustics, Speech, and Signal Processing
Publisher	IEEE Computer Society
Pages	1283-1286
Number of pages	4
Volume	2
ISBN (Print)	0-8186-7919-0
DOIs	https://doi.org/10.1109/ICASSP.1997.596180
Publication status	Published - 24 Apr 1997
Externally published	Yes
Event	1997 IEEE International Conference on Acoustics, Speech, and Signal Processing - Munich, Germany Duration: 21 Apr 1997 → 24 Apr 1997

Publication series

Name	1997 IEEE International Conference on Acoustics, Speech, and Signal Processing

Conference

Conference	1997 IEEE International Conference on Acoustics, Speech, and Signal Processing
Period	21/04/97 → 24/04/97

Access to document

10.1109/ICASSP.1997.596180

https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=596180

Other files and links

https://ieeexplore.ieee.org/document/596180/

Cite this

Birgmeier, M., Bernhard, H.-P., & Kubin, G. (1997). Nonlinear long-term prediction of speech signals. In 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (Vol. 2, pp. 1283-1286). Article 596180 (1997 IEEE International Conference on Acoustics, Speech, and Signal Processing). IEEE Computer Society. https://doi.org/10.1109/ICASSP.1997.596180

@inproceedings{58369045e9464c718830d807c06bc58c,

title = "Nonlinear long-term prediction of speech signals",

abstract = "This paper presents an in-depth study of nonlinear long-term prediction of speech signals. While previous studies of nonlinear prediction focused on short-term prediction (with only moderate performance advantage over adaptive linear prediction in most cases), successful long-term prediction strongly depends on the nonlinear oscillator framework for speech modeling. This hypothesis has been confirmed in a series of experiments run on a voiced speech database. We provide results for the prediction gain as a function of the prediction delay using two methods. One is based on an extended form of radial basis function networks and is intended to show what performance can be reached using a nonlinear predictor. The other relies on calculating the mutual information between multiple signal samples. We explain the role of this mutual information function as the upper bound on the achievable prediction gain. We show that with matching memory and dimension, the two methods yield nearly the same value for the achievable prediction gain. We try to make a fair comparison of these values against those obtained using optimized linear predictors of various orders. It turns out that the nonlinear predictor's gain is significantly higher than that for a linear predictor using the same parameters.",

keywords = "Upper bound, Predictive models, Delay, Radial basis function networks, Mutual information, Speech processing, Radio frequency, Oscillators, Databases, Signal sampling",

author = "M. Birgmeier and H.-P. Bernhard and G. Kubin",

year = "1997",

month = apr,

day = "24",

doi = "10.1109/ICASSP.1997.596180",

language = "English",

isbn = "0-8186-7919-0",

volume = "2",

series = "1997 IEEE International Conference on Acoustics, Speech, and Signal Processing",

publisher = "IEEE Computer Society",

pages = "1283--1286",

booktitle = "1997 IEEE International Conference on Acoustics, Speech, and Signal Processing",

address = "United States",

note = "1997 IEEE International Conference on Acoustics, Speech, and Signal Processing ; Conference date: 21-04-1997 Through 24-04-1997",

}

Birgmeier, M, Bernhard, H-P & Kubin, G 1997, Nonlinear long-term prediction of speech signals. in 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. vol. 2, 596180, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE Computer Society, pp. 1283-1286, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 21/04/97. https://doi.org/10.1109/ICASSP.1997.596180

Nonlinear long-term prediction of speech signals. / Birgmeier, M.; Bernhard, H.-P.; Kubin, G.
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 2 IEEE Computer Society, 1997. pp. 1283-1286 596180 (1997 IEEE International Conference on Acoustics, Speech, and Signal Processing).

TY - GEN

T1 - Nonlinear long-term prediction of speech signals

AU - Birgmeier, M.

AU - Bernhard, H.-P.

AU - Kubin, G.

PY - 1997/4/24

Y1 - 1997/4/24

N2 - This paper presents an in-depth study of nonlinear long-term prediction of speech signals. While previous studies of nonlinear prediction focused on short-term prediction (with only moderate performance advantage over adaptive linear prediction in most cases), successful long-term prediction strongly depends on the nonlinear oscillator framework for speech modeling. This hypothesis has been confirmed in a series of experiments run on a voiced speech database. We provide results for the prediction gain as a function of the prediction delay using two methods. One is based on an extended form of radial basis function networks and is intended to show what performance can be reached using a nonlinear predictor. The other relies on calculating the mutual information between multiple signal samples. We explain the role of this mutual information function as the upper bound on the achievable prediction gain. We show that with matching memory and dimension, the two methods yield nearly the same value for the achievable prediction gain. We try to make a fair comparison of these values against those obtained using optimized linear predictors of various orders. It turns out that the nonlinear predictor's gain is significantly higher than that for a linear predictor using the same parameters.

AB - This paper presents an in-depth study of nonlinear long-term prediction of speech signals. While previous studies of nonlinear prediction focused on short-term prediction (with only moderate performance advantage over adaptive linear prediction in most cases), successful long-term prediction strongly depends on the nonlinear oscillator framework for speech modeling. This hypothesis has been confirmed in a series of experiments run on a voiced speech database. We provide results for the prediction gain as a function of the prediction delay using two methods. One is based on an extended form of radial basis function networks and is intended to show what performance can be reached using a nonlinear predictor. The other relies on calculating the mutual information between multiple signal samples. We explain the role of this mutual information function as the upper bound on the achievable prediction gain. We show that with matching memory and dimension, the two methods yield nearly the same value for the achievable prediction gain. We try to make a fair comparison of these values against those obtained using optimized linear predictors of various orders. It turns out that the nonlinear predictor's gain is significantly higher than that for a linear predictor using the same parameters.

KW - Upper bound

KW - Predictive models

KW - Delay

KW - Radial basis function networks

KW - Mutual information

KW - Speech processing

KW - Radio frequency

KW - Oscillators

KW - Databases

KW - Signal sampling

UR - https://ieeexplore.ieee.org/document/596180/

U2 - 10.1109/ICASSP.1997.596180

DO - 10.1109/ICASSP.1997.596180

M3 - Conference Paper

SN - 0-8186-7919-0

VL - 2

T3 - 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing

SP - 1283

EP - 1286

BT - 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing

PB - IEEE Computer Society

T2 - 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing

Y2 - 21 April 1997 through 24 April 1997

ER -
