the computation and language lab
at the University of Rochester



• about •

We study the basic computational processes involved in human language.

overview: colala's research combines computational and experimental approaches to language acquisition, language processing, and the core features of language. Our work draws on corpus methods, behavioral experiments with adults and children, and computational and mathematical modeling, as well as state-of-the-art techniques from machine learning, information theory, statistics, computer science, and linguistics. We also work with an indigenous South American population, the Tsimane' of Bolivia, to study universal aspects of human language, numerical cognition, and cognitive development.

philosophy: Most research in colala centers on creating fully formalized (implemented) theories that address basic questions about human language and cognition using convergent evidence across methods. We work to develop open tools, data, and results.


facilities: Our experimental facilities include rooms for testing adults and children. Our computational facilities include 96 dedicated cores on the University's BlueHive cluster for running models and analyses, as well as several in-house web, corpus, GPU, and HPC servers.

• open positions •

graduate students: Prospective graduate students should apply to the Department of Brain and Cognitive Sciences at the University of Rochester and contact Steve. Applications are due in early January each year. Graduate students should be hard-working, full of ideas, and capable of self-directed research. Students interested in the lab should bring strong quantitative skills, including knowledge of statistics and programming, as well as a serious background in some of the following areas: machine learning, mathematics, theoretical computer science, mathematical logic, cognitive development, language acquisition, language processing, or linguistics.

undergraduates: Undergraduates must be independent, creative, and self-motivated. Our undergraduate research opportunities include work on current large-scale projects, as well as opportunities for self-directed research. Undergraduates will typically have a background in BCS, math, computer science, or linguistics.

• graduate students •

Frank Mollica

Sam Cheyette

• lab manager •

Jenna Register

• principal investigator •

The lab's PI is Steve Piantadosi. He is interested in understanding the computational mechanisms supporting human language learning and use.

• collaborating students & staff •

Amanda Yung collaborates on kelpy

Kyle Mahowald collaborates on information theory and the lexicon

Julian Jara-Ettinger collaborates on studies with the Tsimane'

Richard Futrell collaborates on information theory and language, ngrampy, and Pirahã

Laura Stearns collaborates on the Pirahã corpus

Holly Palmeri collaborates at the Rochester Baby Lab

• collaborating labs & PIs •

Researchers in the lab work in close collaboration with a number of labs at Rochester and around the country:

Celeste Kidd
The Rochester Baby Lab

Dick Aslin
The Rochester Baby Lab

Jessica Cantlon
The Concepts, Actions, and Objects lab

Ben Hayden
The Hayden lab at Rochester

Michael Tanenhaus
The Tanenhaus Lab

Noah Goodman

Ted Gibson
Tedlab, the language lab at MIT

Ev Fedorenko

Dan Everett

• papers •

under review
[50]S. T. Piantadosi, "The computational origin of representation and conceptual change", under review. [pdf]
[49]S. T. Piantadosi, K. Mahowald, C. Kidd, R. Futrell, E. Gibson, "Lexical prescriptivism reduces communicative efficiency", under review. [email]
[48]S. T. Piantadosi, H. Palmeri, R. Aslin, "Limits on composition of conceptual operations in 9-month-olds", under review. [pdf]
[47]K. Mahowald, I. Dautriche, E. Gibson, S. T. Piantadosi, "Word forms are structured for efficient use", under review. [email]
[46]I. Dautriche, K. Mahowald, E. Gibson, A. Christophe, S. T. Piantadosi, "Words cluster phonetically beyond phonotactic regularities", under review. [email]
in press
[45]S. T. Piantadosi, J. Cantlon, "True Numerical Cognition in the Wild", Psychological Science, in press. [email]
[44]S. T. Piantadosi, E. Fedorenko, "Infinitely productive language can arise from chance under communicative pressure", Journal of Language Evolution, in press. [pdf]
[43]F. Mollica, S. T. Piantadosi, "How data drives early word learning: A cross-linguistic waiting time analysis", Open Mind, in press. [email]
[42]S. Ferrigno, J. Jara-Ettinger, S. T. Piantadosi, J. Cantlon, "Universal and uniquely human factors in spontaneous number perception", Nature Communications, in press. [email]
[41]S. T. Piantadosi, "A rational analysis of the approximate number system", Psychonomic Bulletin and Review, 2016, pp. 1-10. [pdf] [doi]
[40]S. T. Piantadosi, J. Tenenbaum, N. Goodman, "The logical primitives of thought: Empirical foundations for compositional cognitive models", Psychological Review, 2016. [pdf]
[39]S. T. Piantadosi, R. Jacobs, "Four problems solved by the probabilistic Language of Thought", Current Directions in Psychological Science, vol. 25, 2016, pp. 54-59. [pdf]
[38]S. T. Piantadosi, C. Kidd, "Extraordinary intelligence and the care of infants", Proceedings of the National Academy of Sciences, vol. 113, no. 25, 2016. [pdf]
[37]S. T. Piantadosi, C. Kidd, "Endogenous or exogenous? The data don’t say (Commentary on Han, Musolino, & Lidz 2016)", Proceedings of the National Academy of Sciences, vol. 113, no. 20, 2016. [pdf]
[36]S. T. Piantadosi, "Efficient estimation of Weber's W", Behavior Research Methods, vol. 48, 2016, pp. 42-52. [pdf]
[35]S. T. Piantadosi, R. Aslin, "Compositional reasoning in early childhood", PLOS ONE, 2016. [pdf]
[34]M. C. Overlan, R. A. Jacobs, S. T. Piantadosi, "A Hierarchical Probabilistic Language-of-Thought Model of Human Visual Concept Learning", in Proceedings of the Cognitive Science Society, 2016. [pdf]
[33]L. Martí, F. Mollica, S. T. Piantadosi, C. Kidd, "What determines human certainty?", in Proceedings of the Cognitive Science Society, 2016. [pdf]
[32]J. Jara-Ettinger, S. T. Piantadosi, E. Spelke, R. Levy, E. Gibson, "Mastery of the logic of natural numbers is not the result of mastery of counting: Evidence from late counters", Developmental Science, 2016. [pdf]
[31]R. Futrell, L. Stearns, D. L. Everett, S. T. Piantadosi, E. Gibson, "A Corpus Investigation of Syntactic Embedding in Pirahã", PLOS ONE, 2016. [pdf]
[30]I. Dautriche, K. Mahowald, E. Gibson, S. T. Piantadosi, "Wordform similarity increases with semantic similarity: an analysis of 100 languages", Cognitive Science, 2016. [pdf]
[29]E. J. Bigelow, S. T. Piantadosi, "A large dataset of generalization patterns in the number game", Journal of Open Psychology Data, vol. 4, 2016. [pdf] [doi]
[28]E. J. Bigelow, S. T. Piantadosi, "Inferring priors in compositional cognitive models", in Proceedings of the Cognitive Science Society, 2016. [pdf]
[27]S. T. Piantadosi, B. Hayden, "Response: 'Commentary: Utility-free heuristic models of two-option choice can mimic predictions of utility-stage models under many conditions'", Frontiers in Neuroscience, vol. 9, no. 299, 2015. [pdf] [doi]
[26]S. T. Piantadosi, "Problems in the philosophy of mathematics: A view from cognitive science", in Mathematics, Substance and Surmise: Views on the Meaning and Ontology of Mathematics, E. Davis, P. J. Davis, Eds., Springer, 2015. [pdf]
[25]S. T. Piantadosi, B. Hayden, "Utility-free models of binomial choice can replicate predictions of utility models in many conditions", Frontiers in Neuroscience, 2015. [pdf]
[24]M. Pelz, S. T. Piantadosi, C. Kidd, "The dynamics of idealized attention in complex learning environments", in The 5th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, 2015. [pdf]
[23]F. Mollica, S. T. Piantadosi, "Towards semantically rich and recursive word learning models", in Proceedings of the Cognitive Science Society, 2015. [pdf]
[22]F. Mollica, S. T. Piantadosi, M. K. Tanenhaus, "The perceptual foundation of linguistic context", in Proceedings of the Cognitive Science Society, 2015. [pdf]
[21]J. Jara-Ettinger, E. Gibson, C. Kidd, S. T. Piantadosi, "Native Amazonian Children Forego Egalitarianism When They Learn to Count", Developmental Science, 2015. [pdf] [doi]
[20]P. Hemmer, K. Persaud, C. Kidd, S. T. Piantadosi, "Inferring the Tsimane's use of color categories from recognition memory", in Proceedings of the Cognitive Science Society, 2015. [pdf]
[19]J. Cantlon, S. T. Piantadosi, S. Ferrigno, K. Hughes, A. Barnard, "The origins of counting algorithms", Psychological Science, 2015. [pdf]
[18]S. Alonso-Diaz, J. Cantlon, S. T. Piantadosi, "Cognition in reach: continuous statistical inference in optimal motor planning", in Proceedings of the Cognitive Science Society, 2015. [pdf]
[17]S. T. Piantadosi, C. Kidd, R. Aslin, "Rich analysis and rational models: Inferring individual behavior from infant looking data", Developmental Science, 2014, pp. 1-16. [pdf]
[16]S. T. Piantadosi, E. Gibson, "Quantitative Standards for Absolute Linguistic Universals", Cognitive Science, vol. 38, no. 4, 2014, pp. 736-756. [pdf] [doi]
[15]S. T. Piantadosi, J. Jara-Ettinger, E. Gibson, "Children's learning of number words in an indigenous farming-foraging group", Developmental Science, vol. 17, no. 4, 2014, pp. 553-563. [pdf] [doi]
[14]S. T. Piantadosi, "Zipf’s word frequency law in natural language: A critical review and future directions", Psychonomic Bulletin & Review, vol. 21, 2014, pp. 1112-1130. [pdf] [doi]
[13]C. Kidd, S. T. Piantadosi, R. N. Aslin, "The Goldilocks Effect in Infant Auditory Attention", Child Development, vol. 85, 2014, pp. 1795-1804. [pdf] [doi]
[12]S. T. Piantadosi, H. Tily, E. Gibson, "Information content versus word length in natural language: A reply to Ferrer-i-Cancho and Moscoso del Prado Martin [arXiv:1209.1751]", ArXiv e-prints, 2013. [pdf]
[11]S. T. Piantadosi, L. Stearns, D. Everett, E. Gibson, "A corpus analysis of Pirahã grammar: An investigation of recursion", Talk presented at the LSA (by E. Gibson), 2012. [pdf]
[10]S. T. Piantadosi, J. Tenenbaum, N. Goodman, "Bootstrapping in a language of thought: a formal model of numerical concept learning", Cognition, vol. 123, 2012, pp. 199-217. [pdf]
[9]C. Kidd, S. T. Piantadosi, R. Aslin, "The Goldilocks Effect: Human Infants Allocate Attention to Visual Sequences That Are Neither Too Simple Nor Too Complex", PLoS ONE, 2012. [pdf]
[8]S. T. Piantadosi, H. Tily, E. Gibson, "Word lengths are optimized for efficient communication", Proceedings of the National Academy of Sciences, vol. 108, no. 9, 2011, p. 3526. [pdf]
[7]S. T. Piantadosi, H. Tily, E. Gibson, "Reply to Reilly and Kean: Clarifications on word length and information content", Proceedings of the National Academy of Sciences, vol. 108, no. 20, 2011, p. E109. [pdf]
[6]S. T. Piantadosi, "Learning and the language of thought", Ph.D. dissertation, MIT, 2011. [pdf]
[5]S. T. Piantadosi, H. Tily, E. Gibson, "The communicative function of ambiguity in language", Cognition, vol. 122, 2011, pp. 280-291. [pdf]
[4]S. T. Piantadosi, J. Tenenbaum, N. Goodman, "Beyond Boolean logic: exploring representation languages for learning complex concepts", in Proceedings of the Cognitive Science Society, 2010. [pdf]
[3]C. Kidd, S. T. Piantadosi, R. Aslin, "The Goldilocks Effect: Infants' preference for visual stimuli that are neither too predictable nor too surprising", in Proceedings of the Cognitive Science Society, 2010. [pdf]
[2]S. T. Piantadosi, H. Tily, E. Gibson, "The communicative lexicon hypothesis", in Proceedings of the Cognitive Science Society, 2009, pp. 2582-2587. [pdf]
[1]S. T. Piantadosi, N. Goodman, B. Ellis, J. Tenenbaum, "A Bayesian model of the acquisition of compositional semantics", in Proceedings of the Cognitive Science Society, 2008. [pdf]

• libraries & software •

A number of research software packages are actively developed by colala and available under the GNU General Public License:

  • LOTlib is a library for modeling the learning of complex concepts as compositions of primitives in a language of thought. GPL3

  • kelpy (kid experimental library in python) is a library for running simple psychology experiments in Python. It is intended primarily for making simple animated displays with simple responses for baby and child experiments. It is built on top of pygame and supports Tobii eye tracking. GPL3

  • ngrampy is a Python library for manipulating large Google n-gram data sets and computing measures such as average surprisal in context from Piantadosi, Tily, & Gibson (2011). Code is included to replicate and extend that finding. GPL3

  • ChurIso is a Scheme library for recovering Church encodings from constraints (email for upcoming publication). GPL3

  • pychuriso is a Python implementation of ChurIso inference. GPL3

  • WeberMCMC provides Bayesian data analysis for estimating Weber ratios. In practice, incorporating the reliability of an estimate of W into statistics allows for more power and correctness. GPL3

  • GPUropolis performs Bayesian inference via Metropolis-Hastings on symbolic expressions and is currently under heavy development. Code is highly parallelized using CUDA to run on graphics hardware. It includes several classic scientific data sets to test with. GPL3
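
As a rough illustration of the average surprisal-in-context measure mentioned for ngrampy above, here is a minimal Python sketch. It does not use ngrampy's own interface: the function name `average_surprisal` and the toy bigram counts are hypothetical, and a one-word (bigram) context is assumed for simplicity. For each word, it averages -log2 P(word | context) over all of that word's occurrences.

```python
import math
from collections import defaultdict


def average_surprisal(bigram_counts):
    """Map each word to its mean -log2 P(word | context) across occurrences.

    bigram_counts maps (context, word) pairs to occurrence counts.
    This is an illustrative sketch, not ngrampy's implementation.
    """
    context_totals = defaultdict(int)  # how often each context occurs
    word_totals = defaultdict(int)     # how often each word occurs
    for (context, word), n in bigram_counts.items():
        context_totals[context] += n
        word_totals[word] += n

    summed_bits = defaultdict(float)
    for (context, word), n in bigram_counts.items():
        p = n / context_totals[context]       # P(word | context)
        summed_bits[word] += n * -math.log2(p)  # surprisal, weighted by count

    return {w: summed_bits[w] / word_totals[w] for w in word_totals}


# Hypothetical toy counts, for illustration only.
toy_counts = {
    ("the", "dog"): 3,
    ("the", "cat"): 1,
    ("a", "dog"): 1,
    ("a", "cat"): 1,
}
result = average_surprisal(toy_counts)
```

With these toy counts, "cat" averages 1.5 bits and "dog" averages less, because "cat" tends to occur in contexts where it is less predictable.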

• data & code •

Data from all completed and in-progress projects are available upon request.

Meliora Hall
University of Rochester, River Campus
Rochester, NY