Data Science for Cog Sci Kids

Cognitive scientists think about how people think about data. Embedded in every cognitive science question is a data science question: What should you infer given some sort of data? Adjust that answer for people’s quirks, biases, constraints, or what have you, and we have a model of human learning. This makes it possible in principle to derive human thinking from first principles (rather than studying it as a “black box”), a dream shared by generations of cognitive scientists since Marr (1982) and Anderson (1990).

Sounds like a sweet deal! Like all things cool in life, it’s easier said than done. While most cognitive scientists can probably appreciate the usefulness of some “ground truth” models, building them is challenging for those who are not data scientists by training. Seeing papers like this, this, and this is quite intimidating, even though the topics look super interesting and relevant. Then again, as I was once told, you don’t have to be a math genius in order to do good computational cognitive science. You just need enough to get your research going!

Two years ago I compiled a “Beginner’s CoCoSci List” that I realized may be too ambitious for any beginners to plough through. Here I’m collecting math/programming/modeling resources that might actually get things off the ground for cog sci kids (like myself) who wish to break into data science. The beginning may be humble but I hope we can all journey on!

Recommendations more than welcome! (You can leave a message below or shoot me an email: yuan_meng [at] berkeley [dot] edu.)


Math

The Basics

I remember going to lunch with a visiting professor where a grad student asked him what he thought would be the most essential course for first-years. He said a math class, starting with basic symbols. For someone who does fancy modeling work, that was surprisingly not a fancy answer. But truly, many of us need it. As an undergrad, I was often scared off by papers that had matrices or integrals in them. If you wish to overcome this fear but don’t have a ton of time, I recommend A Mathematical Primer for Social Statistics by John Fox, which gives you a quick overview of linear algebra, calculus, and probability theory.

While this little green book is very accessible and provides both intuitions and some proofs, it sometimes leaves you wanting more detail. When I wanted in-depth explanations, I watched YouTube/Khan Academy/edX/etc. videos or read tutorials/textbooks to fill in the gaps. Items 2-4 below are some of the resources I referred to.

  1. A Mathematical Primer for Social Statistics
  2. AP Calculus AB on Khan Academy
  3. The Matrix Cookbook
  4. Review of Probability Theory (CS229)

Bayesian Inference

I love the dual role of Bayesian inference: Not only is it a powerful tool for making optimal inferences from data but it also captures people’s intuitive judgments in everyday life (turns out cognitive modelers often confuse the two purposes, Tauber et al. 2017). Given the ever-rising popularity of Bayesian statistics, you may feel at a loss when deciding where to begin.

If you just want a quick intro, read Simon DeDeo’s Bayesian Reasoning for Intelligent People; I found it highly intuitive and entertaining. To get more details, James V. Stone’s Bayes’ Rule is a great first book. If you’re really hooked and want to apply Bayesian statistics in your own data analysis, check out Richard McElreath’s Statistical Rethinking and John Kruschke’s Doing Bayesian Data Analysis. Both authors explain concepts with unmatched clarity and combine code with theory. If you’re interested in modeling human cognition as Bayesian inference, Michael Lee’s Bayesian Cognitive Modeling is one of the few books I know that teaches it step by step, from drawing graphical models to writing WinBUGS/JAGS code. Another fantastic book is Noah Goodman and Josh Tenenbaum’s Probabilistic Models of Cognition, which walks you through state-of-the-art models from iconic papers. If you’re so intrigued by Bayesian inference that you want to get to the bottom of it, you should eventually visit E. T. Jaynes’ seminal work Probability Theory: The Logic of Science!

  1. Bayesian Reasoning for Intelligent People
  2. Bayes’ Rule (MATLAB, Python, R)
  3. Statistical Rethinking (website, lectures)
  4. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan (book, website)
  5. Bayesian Cognitive Modeling (book, website)
  6. Probabilistic Models of Cognition
  7. Probability Theory: The Logic of Science
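If you'd like a taste of Bayes' rule before opening any of the books above, here is a minimal sketch in Python. It is my own toy example, not taken from any of the listed resources: the two hypotheses ("fair" vs. "biased" coin), the 0.8 heads probability, and the flip sequence are all made-up assumptions for illustration. The idea is just that the posterior is proportional to prior times likelihood, and that yesterday's posterior can serve as today's prior.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    Returns a dict mapping hypothesis -> P(hypothesis | data).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data), the normalizing constant
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical setup: a coin is either fair or biased toward heads.
prior = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}  # assumed P(heads | hypothesis)

# Update the belief one flip at a time; each posterior becomes the next prior.
belief = dict(prior)
for flip in "HHTH":  # made-up observations
    likelihood = {
        h: p_heads[h] if flip == "H" else 1 - p_heads[h] for h in belief
    }
    belief = bayes_update(belief, likelihood)

for hypothesis, probability in belief.items():
    print(f"P({hypothesis} | flips) = {probability:.3f}")
```

With equal priors, three heads and one tail shift the belief toward the biased coin, but only moderately, since a fair coin also produces that sequence fairly often. The books above cover what to do when the hypotheses are continuous or the model has many parameters.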

Information Theory

If probability theory captures (human or machine) learners’ beliefs about the world, then its “little brother,” information theory, quantifies how much their beliefs change as a result of learning and, importantly, can guide them in advance on what to learn.

Similarly, you can begin with Simon DeDeo’s Information Theory for Intelligent People, continue with James V. Stone’s Information Theory: A Tutorial Introduction, and dive into David MacKay’s Information Theory, Inference, and Learning Algorithms.

  1. Information Theory for Intelligent People
  2. Information Theory: A Tutorial Introduction
  3. Information Theory, Inference, and Learning Algorithms
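To make the two roles of information theory a bit more concrete, here is a small Python sketch. It is a sketch under my own assumptions: the function names and the toy numbers are invented for illustration. I use KL divergence as the measure of belief change and expected reduction in Shannon entropy as the measure of how useful a question is; the tutorials above discuss these and other measures.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_information_gain(prior, outcome_likelihoods):
    """Expected entropy reduction from asking a question.

    outcome_likelihoods[o][h] = P(outcome o | hypothesis h).
    """
    gain = 0.0
    for likelihood in outcome_likelihoods:
        p_outcome = sum(p * l for p, l in zip(prior, likelihood))
        if p_outcome == 0:
            continue  # this outcome can never happen, so it contributes nothing
        posterior = [p * l / p_outcome for p, l in zip(prior, likelihood)]
        gain += p_outcome * (entropy(prior) - entropy(posterior))
    return gain


# Made-up beliefs over two hypotheses, before and after seeing some data.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]
belief_change = kl_divergence(posterior, prior)  # how far beliefs moved
print(f"Belief change: {belief_change:.3f} bits")

# Two hypothetical questions: one whose answer separates the hypotheses
# perfectly, and one whose answer is equally likely under both.
decisive = [[1.0, 0.0], [0.0, 1.0]]
useless = [[0.5, 0.5], [0.5, 0.5]]
print(f"Decisive question: {expected_information_gain(prior, decisive):.3f} bits")
print(f"Useless question:  {expected_information_gain(prior, useless):.3f} bits")
```

The second half is the "what to learn in advance" part: before seeing any answer, a learner can score candidate questions by how much uncertainty each is expected to remove and ask the best one.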

Programming

R

I use R for almost everything, from data analysis to cognitive modeling (with RStan or JAGS). When learning R, I found Grolemund and Wickham’s R for Data Science and Danielle Navarro’s Learning Statistics with R particularly helpful (you can find a more condensed version on her R for Psychological Science course website).

  1. R for Data Science
  2. Learning Statistics with R
  3. R for Psychological Science

Python

I don’t use Python that much since R satisfies my scientific computing needs. However, as Brad Voytek pointed out, Python is tremendously useful beyond scientific computing/stats purposes and may therefore be a more marketable programming skill. I found many beginner-friendly resources on Jeremy Manning’s Storytelling with Data course page. Brad Voytek also taught Data Science in Practice in Python, and all the course materials are open to the public. When I was first learning Python, I enjoyed How to Think Like a Computer Scientist. (It now lives on Runestone Academy, where there seem to be a bunch of popular CS textbooks.)

  1. Learn Git (Codecademy)
  2. Learn Python (Codecademy)
  3. Introduction to Python
  4. Data Science in Practice
  5. How to Think Like a Computer Scientist

The lists above are meant to be neither comprehensive nor intimidating. Through a quick search, you can surely find far more resources (e.g., here, here, and here) and easily feel like a kid in a candy shop. You just gotta start somewhere and stick with it!

Last but most importantly, there’s no need to finish all of the above before you can get your feet wet! As many would say, learning data science (just like learning anything else) is most effective and motivating when you put theory into practice! So think of a question you’re interested in and see how you can answer it with data! Below are some places to find nice datasets recommended by Jeremy Manning. Enjoy!

Datasets
