Undergraduate Course Descriptions
In this 'big data' era, presidents and popes tweet daily. Anyone can broadcast their thoughts and experiences through social media. Speeches, debates and events are recorded in online text archives. With this explosion of available textual data, journalists and marketers now summarize ideas and events by visualizing the results of textual analysis (the ubiquitous 'word cloud' just scratches the surface of what is possible). Automated text analysis can reveal similarities and differences between groups of people and between ideological positions. In this hands-on course, students will learn how to manage large textual datasets (e.g. from Twitter, YouTube and news stories) to investigate research questions. They will work through a series of steps to collect, organize, analyze and present textual data using automated tools, building toward a final project on a topic of their own interest. The course will cover linguistic theory and techniques that can be applied to textual data, particularly from the fields of corpus linguistics and natural language processing. No prior programming experience is required. Through this course, students will gain skills in writing Python programs to handle large amounts of textual data and become familiar with one of the key techniques used by data scientists, a profession that is currently in high demand.