
9.6: References


    1. Ansari, T. K., Kumar, R., Singh, S., & Ganapathy, S. (2017, December). Deep learning methods for unsupervised acoustic modeling—Leap submission to ZeroSpeech challenge 2017. In Automatic Speech Recognition and Understanding Workshop (ASRU), 2017 IEEE (pp. 754-761). IEEE.
    2. Badino, L., Mereta, A., & Rosasco, L. (2015). Discovering discrete subword units with binarized autoencoders and hidden-Markov-model encoders. In Sixteenth Annual Conference of the International Speech Communication Association.
    3. Baljekar, P., Sitaram, S., Muthukumar, P. K., & Black, A. W. (2015). Using articulatory features and inferred phonological segments in zero resource speech processing. In Sixteenth Annual Conference of the International Speech Communication Association.
    4. Chen, H., Leung, C. C., Xie, L., Ma, B., & Li, H. (2015). Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: A feasibility study. In Sixteenth Annual Conference of the International Speech Communication Association.
    5. Chen, H., Leung, C. C., Xie, L., Ma, B., & Li, H. (2017). Multilingual bottle-neck feature learning from untranscribed speech.
    6. Chomsky, N. (1986). Knowledge of language: Its nature, origin, and use. Greenwood Publishing Group.
    7. Clair, M. C. S., Monaghan, P., & Christiansen, M. H. (2010). Learning grammatical categories from distributional cues: Flexible frames for language acquisition. Cognition, 116(3), 341-360.
    8. DeRose, S. J. (1998, September). XQuery: A unified syntax for linking and querying general XML documents. In QL.
    9. de Vries, N., Davel, M., Badenhorst, J., Basson, W., de Wet, F., & Barnard, E. (2014). A smartphone-based ASR data collection tool for under-resourced languages. Speech Communication, 56, 119–131.
    10. Ellis, N. C. (2017). Cognition, Corpora, and Computing: Triangulating Research in Usage-Based Language Learning. Language Learning, 67(S1), 40-65.
    11. Gauthier, E., Besacier, L., Voisin, S., Melese, M., & Elingui, U. P. (2016, May). Collecting resources in Sub-Saharan African languages for automatic speech recognition: A case study of Wolof. In 10th Language Resources and Evaluation Conference (LREC 2016).
    12. Gerken, L., Wilson, R., & Lewis, W. (2005). Infants can use distributional cues to form syntactic categories. Journal of Child Language, 32(2), 249-268.
    13. Glass, J. (2012, July). Towards unsupervised speech processing. In Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on (pp. 1-4). IEEE.
    14. Gómez, R. L., & Lakusta, L. (2004). A first step in form-based category abstraction by 12-month-old infants. Developmental Science, 7(5), 567-580.
    15. Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., … & Ng, A. Y. (2014). Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567.
    16. Heck, M., Sakti, S., & Nakamura, S. (2017, December). Feature optimized DPGMM clustering for unsupervised subword modeling: A contribution to ZeroSpeech 2017. In Automatic Speech Recognition and Understanding Workshop (ASRU), 2017 IEEE (pp. 740-746). IEEE.
    17. Jansen, A., Dupoux, E., Goldwater, S., Johnson, M., Khudanpur, S., Church, K., … & JHU CLSP Mini-Workshop Research Team. (2013). A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition.
    18. Jansen, A., & Van Durme, B. (2011, December). Efficient spoken term discovery using randomized algorithms. In Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on (pp. 401-406). IEEE.
    19. Kamper, H., Jansen, A., King, S., & Goldwater, S. (2014, December). Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings. In Spoken Language Technology Workshop (SLT), 2014 IEEE (pp. 100-105). IEEE.
    20. Kilgarriff, A. (2005). Language is never, ever, ever, random. Corpus Linguistics and Linguistic Theory, 1(2), 263-276.
    21. Kuhl, P. K. (1985). In J. Mehler & R. Fox (Eds.), Neonate cognition: Beyond the blooming buzzing confusion (pp. 231–262). Hillsdale, NJ: Lawrence Erlbaum Associates.
    22. Kolodny, O., Lotem, A., & Edelman, S. (2015). Learning a Generative Probabilistic Grammar of Experience: A Process-Level Model of Language Acquisition. Cognitive Science, 39(2), 227-267.
    23. Lany, J., & Gómez, R. L. (2008). Twelve-month-old infants benefit from prior experience in statistical learning. Psychological Science, 19(12), 1247-1252.
    24. Manenti, C., Pellegrini, T., & Pinquier, J. (2017, October). Unsupervised Speech Unit Discovery Using K-means and Neural Networks. In International Conference on Statistical Language and Speech Processing (pp. 169-180). Springer, Cham.
    25. Marcus, M. P., Marcinkiewicz, M. A., & Santorini, B. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313-330.
    26. Miao, Y., Gowayyed, M., & Metze, F. (2015, December). EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding. In Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on (pp. 167-174). IEEE.
    27. Minagawa-Kawai, Y., Cristià, A., & Dupoux, E. (2011). Cerebral lateralization and early speech acquisition: A developmental scenario. Developmental Cognitive Neuroscience, 1(3), 217-232.
    28. Mintz, T. H., Newport, E. L., & Bever, T. G. (2002). The distributional structure of grammatical categories in speech to young children. Cognitive Science, 26(4), 393-424.
    29. Mintz, T. H. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90(1), 91-117.
    30. Mitchell, T. M. (1997). Machine learning. WCB/McGraw-Hill.
    31. Moon, C., Lagercrantz, H., & Kuhl, P. K. (2013). Language experienced in utero affects vowel perception after birth: A two-country study. Acta Paediatrica, 102(2), 156-160.
    32. Monaghan, P., & Rowland, C. F. (2017). Combining language corpora with experimental and computational approaches for language acquisition research. Language Learning, 67(S1), 14-39.
    33. Räsänen, O., Doyle, G., & Frank, M. C. (2015). Unsupervised word discovery from speech using automatic segmentation into syllable-like units. In Sixteenth Annual Conference of the International Speech Communication Association.
    34. Renshaw, D., Kamper, H., Jansen, A., & Goldwater, S. (2015). A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge. In Sixteenth Annual Conference of the International Speech Communication Association.
    35. Schatz, T., Peddinti, V., Bach, F., Jansen, A., Hermansky, H., & Dupoux, E. (2013). Evaluating speech features with the minimal-pair ABX task (I): Analysis of the classical MFC/PLP pipeline. In Proceedings of Interspeech.
    36. Schatz, T., Peddinti, V., Cao, X. N., Bach, F., Hermansky, H., & Dupoux, E. (2014). Evaluating speech features with the minimal-pair ABX task (II): Resistance to noise. In Proceedings of Interspeech.
    37. Pike, K. L. (1967). Language in relation to a unified theory of the structure of human behavior (Vol. 24). Walter de Gruyter GmbH & Co KG.
    38. Pitt, M. A., Johnson, K., Hume, E., Kiesling, S., & Raymond, W. (2005). The Buckeye corpus of conversational speech: Labeling conventions and a test of transcriber reliability. Speech Communication, 45(1), 89-95.
    39. Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., … & Silovsky, J. (2011). The Kaldi speech recognition toolkit. In IEEE 2011 workshop on automatic speech recognition and understanding (No. EPFL-CONF-192584). IEEE Signal Processing Society.
    40. Thiolliere, R., Dunbar, E., Synnaeve, G., Versteegh, M., & Dupoux, E. (2015). A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling. In Sixteenth Annual Conference of the International Speech Communication Association.
    41. Versteegh, M., Anguera, X., Jansen, A., & Dupoux, E. (2016). The zero resource speech challenge 2015: Proposed approaches and results. Procedia Computer Science, 81, 67-72.
    42. Wang, D., & Zhang, X. (2015). THCHS-30: A free Chinese speech corpus. arXiv preprint arXiv:1512.01882.
    43. Yuan, Y., Leung, C. C., Xie, L., Chen, H., Ma, B., & Li, H. (2017). Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation.

    This page titled 9.6: References is shared under a not declared license and was authored, remixed, and/or curated by Matthew J. C. Crump via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
