Multiple time-instances features of degraded speech for single ended quality measurement

Dubey, Rajesh Kumar; Kumar, Arun

dc.contributor.author	Dubey, Rajesh Kumar
dc.contributor.author	Kumar, Arun
dc.date.accessioned	2017-11-30T07:25:14Z
dc.date.available	2017-11-30T07:25:14Z
dc.date.issued	2017
dc.identifier.citation	Advances in electrical and electronic engineering. 2017, vol. 15, no. 3, p. 400-407 : ill.	cs
dc.identifier.issn	1336-1376
dc.identifier.issn	1804-3119
dc.identifier.uri	http://hdl.handle.net/10084/122089
dc.description.abstract	The use of single time-instance features, where the entire speech utterance is used for feature computation, is neither accurate nor adequate for capturing the time-localized information of short-time transient distortions and distinguishing them from the plosive sounds of speech, particularly when the speech is degraded by impulsive noise. Hence, features are estimated at multiple time-instances. In this work, only the active speech segments of degraded speech are used for feature computation at multiple time-instances on a per-frame basis. Here, active speech means both voiced and unvoiced frames, excluding silence. The features of different combinations of multiple contiguous active speech segments are computed and called multiple time-instances features. A joint GMM is trained using these features along with the subjective MOS of the corresponding speech utterance to obtain the GMM parameters. These GMM parameters and the multiple time-instances features of the test speech are used to compute the objective MOS values of different combinations of multiple contiguous active speech segments. The overall objective MOS of the test speech utterance is obtained by assigning equal weight to the objective MOS values of the different combinations of multiple contiguous active speech segments. This algorithm outperforms Recommendation ITU-T P.563 and recently published algorithms.	cs
dc.format.extent	862748 bytes
dc.format.mimetype	application/pdf
dc.language	Not specified	cs
dc.language.iso	en	cs
dc.publisher	Vysoká škola báňská - Technická univerzita Ostrava	cs
dc.relation.ispartofseries	Advances in electrical and electronic engineering	cs
dc.relation.uri	http://dx.doi.org/10.15598/aeee.v15i3.2330
dc.rights	© Vysoká škola báňská - Technická univerzita Ostrava
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	auditory feature	cs
dc.subject	degraded speech	cs
dc.subject	speech quality	cs
dc.title	Multiple time-instances features of degraded speech for single ended quality measurement	cs
dc.type	article	cs
dc.identifier.doi	10.15598/aeee.v15i3.2330
dc.rights.access	openAccess
dc.type.version	publishedVersion
dc.type.status	Peer-reviewed
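
The final step described in the abstract, equal weighting of the objective MOS values obtained for different combinations of contiguous active speech segments, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which combinations are used, so the sketch assumes every run of contiguous segments, and `contiguous_combinations`, `overall_mos`, `score_fn` and `min_len` are hypothetical names. `score_fn` stands in for the GMM-based mapping from features to objective MOS, which is not reproduced here.

```python
from statistics import fmean
from typing import Callable, List, Sequence, Tuple


def contiguous_combinations(n_segments: int, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return (start, stop) index pairs for every run of contiguous segments.

    Assumption: "combinations of contiguous segments" means all runs of at
    least `min_len` neighbouring active segments; the abstract does not say.
    """
    return [
        (start, stop)
        for start in range(n_segments)
        for stop in range(start + min_len, n_segments + 1)
    ]


def overall_mos(
    segment_features: Sequence[float],
    score_fn: Callable[[Sequence[float]], float],
    min_len: int = 1,
) -> float:
    """Average the per-combination scores with equal weights.

    `segment_features` holds one (hypothetical) feature value per active
    speech segment; silence is assumed to have been removed beforehand.
    `score_fn` is a placeholder for the trained feature-to-MOS mapping.
    """
    spans = contiguous_combinations(len(segment_features), min_len)
    if not spans:
        raise ValueError("no active speech segments to score")
    scores = [score_fn(segment_features[start:stop]) for start, stop in spans]
    # Equal weight for every combination is the same as the arithmetic mean.
    return fmean(scores)
```

With a real model, `score_fn` would be replaced by whatever estimator the trained joint GMM provides. A plain mean is used because the abstract states that all combinations receive equal weight.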

Files in this item

Name: 2330-12521-1-PB.pdf
Size: 842.5Kb
Format: PDF
Description: publishedVersion


Name: license_rdf
Size: 1.329Kb
Format: Unknown


This item appears in the following collections

AEEE. 2017, vol. 15 [102]
