Conceptual design of a database to store recordings from articulographic studies of Polish speech

Robert Wielgat; Rafał Jędryka; Anita Lorenc; Łukasz Mik; Daniel Król

doi:10.5604/01.3001.0010.7558

Authors

Robert Wielgat, State Higher Vocational School in Tarnow, Poland, https://orcid.org/0000-0003-0229-6493
Rafał Jędryka, State Higher Vocational School in Tarnow, Poland, https://orcid.org/0000-0001-6651-4985
Anita Lorenc, Maria Curie-Skłodowska University, Department of Speech Therapy and Applied Linguistics, Poland; Warsaw University, Institute of Applied Polish Studies, Department of Speech and Language Therapy and Voice Production, Poland, https://orcid.org/0000-0002-7614-0881
Łukasz Mik, State Higher Vocational School in Tarnow, Poland, https://orcid.org/0000-0003-2712-6861
Daniel Król, State Higher Vocational School in Tarnow, Poland, https://orcid.org/0000-0002-8611-0838

Keywords:

electromagnetic articulography, database, Bayesian networks, speech inversion, acoustic camera, articulatory phonetics, acoustic phonetics

Abstract

The article describes the structure and functionality of an articulographic database for storing data from studies conducted with an electromagnetic articulograph, an acoustic camera and three video cameras. The database supports selective extraction of various types of data for scientific research and interoperates with the programs that carry out experiments. The structure and construction of the database are described. Potential future applications in statistical analysis and in speech inversion experiments using dynamic Bayesian networks (DBN) are also presented.
