Conceptual design of a database for storing recordings from articulographic studies of Polish speech

Robert Wielgat; Rafał Jędryka; Anita Lorenc; Łukasz Mik; Daniel Król

doi:10.5604/01.3001.0010.7558

Authors

Robert Wielgat, Państwowa Wyższa Szkoła Zawodowa w Tarnowie, https://orcid.org/0000-0003-0229-6493
Rafał Jędryka, Państwowa Wyższa Szkoła Zawodowa w Tarnowie, https://orcid.org/0000-0001-6651-4985
Anita Lorenc, Uniwersytet Marii Curie-Skłodowskiej w Lublinie, Zakład Logopedii i Językoznawstwa Stosowanego; Uniwersytet Warszawski, Zakład Terapii Mowy i Emisji Głosu Instytutu Polonistyki Stosowanej, https://orcid.org/0000-0002-7614-0881
Łukasz Mik, Państwowa Wyższa Szkoła Zawodowa w Tarnowie, https://orcid.org/0000-0003-2712-6861
Daniel Król, Państwowa Wyższa Szkoła Zawodowa w Tarnowie, https://orcid.org/0000-0002-8611-0838

Keywords:

electromagnetic articulography, databases, Bayesian networks, speech inversion, acoustic camera, articulatory phonetics, acoustic phonetics

Abstract

The article describes the structure and functionality of an articulographic database for storing data from studies carried out with an electromagnetic articulograph, an acoustic camera, and three video cameras. The database supports selective retrieval of different types of data, in particular data about the speaker, the recording session, the recordings, and the experiments. The article also presents potential future uses of the database: statistical analyses and speech inversion experiments based on Bayesian network models.
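To make the idea of selective retrieval more concrete, the following is a minimal sketch of how a database organised around speakers, recording sessions, recordings, and experiments might be queried. It is not the schema from the article: every table name, column name, speaker code, date, and file path below is a hypothetical placeholder, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: the four entities come from the abstract,
# but every table and column name here is an illustrative assumption.
SCHEMA = """
CREATE TABLE speaker (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE
);
CREATE TABLE recording_session (
    id          INTEGER PRIMARY KEY,
    speaker_id  INTEGER NOT NULL REFERENCES speaker(id),
    recorded_on TEXT
);
CREATE TABLE recording (
    id         INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES recording_session(id),
    modality   TEXT NOT NULL,   -- e.g. 'ema', 'audio', 'video'
    file_path  TEXT NOT NULL
);
CREATE TABLE experiment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE experiment_recording (
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    recording_id  INTEGER NOT NULL REFERENCES recording(id),
    PRIMARY KEY (experiment_id, recording_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO speaker (id, code) VALUES (1, 'SPK01')")
    conn.execute(
        "INSERT INTO recording_session (id, speaker_id, recorded_on) "
        "VALUES (1, 1, '2000-01-01')"  # placeholder date, not real data
    )
    conn.executemany(
        "INSERT INTO recording (session_id, modality, file_path) VALUES (?, ?, ?)",
        [
            (1, "ema", "spk01/s1/ema.pos"),
            (1, "audio", "spk01/s1/array.wav"),
            (1, "video", "spk01/s1/cam1.avi"),
        ],
    )
    conn.commit()
    return conn


def recordings_for(conn, speaker_code, modality):
    """Return file paths of one modality for one speaker, ordered by recording id."""
    rows = conn.execute(
        """
        SELECT r.file_path
        FROM recording AS r
        JOIN recording_session AS s ON s.id = r.session_id
        JOIN speaker AS sp ON sp.id = s.speaker_id
        WHERE sp.code = ? AND r.modality = ?
        ORDER BY r.id
        """,
        (speaker_code, modality),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `recordings_for(build_demo_db(), "SPK01", "ema")` would return only the articulograph file for that speaker, leaving the audio and video files out. Keeping modality as a column of a single recording table is just one possible design; separate tables per device would serve equally well.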
