Professor Joseph Cappella is one of the editors of a special edition of The Annals of the American Academy of Political and Social Science, May 2015, No. 659, entitled “Toward Computational Social Science: Big Data in Digital Environments.” Professor Cappella’s coeditors are Dhavan V. Shah and W. Russell Neuman.
Work by Professor Cappella and several Annenberg faculty, researchers, and students appears in this issue.
Dhavan V. Shah, Joseph N. Cappella, and W. Russell Neuman, “Big Data, Digital Media, and Computational Social Science: Possibilities and Perils.”
The exponential growth in “the volume, velocity and variability” (Dumbill 2012, 2) of structured and unstructured social data has confronted fields such as political science, sociology, psychology, information systems, public health, public policy, and communication with a unique challenge: how can scientists best use computational tools to analyze such data, problematic as they may be, with the goal of understanding individuals and their interactions within social systems? The unprecedented availability of information on discrete behaviors, social expressions, personal connections, and social alignments provides insights into a range of phenomena and influence processes—from personality traits to political behaviors; from public opinion to relationship formation—despite issues of representativeness and uniformity. That is, even though data from social media may not represent the entirety of a population, that does not mean they are without research value for understanding that population. And the challenges of interpreting these sorts of social data are not limited to population biases and tailored content …
Sandra González-Bailón and Georgios Paltoglou, “Signals of Public Opinion in Online Communication: A Comparison of Methods and Data Sources.”
This study offers a systematic comparison of automated content analysis tools. The ability of different lexicons to correctly identify affective tone (e.g., positive vs. negative) is assessed in different social media environments. Our comparisons examine the reliability and validity of publicly available, off-the-shelf classifiers. We use datasets from a range of online sources that vary in the diversity and formality of the language used, and we apply different classifiers to extract information about the affective tone in these datasets. We first measure agreement among classifiers (reliability test) and then compare their classifications with the benchmark of human coding (validity test). Our analyses show that validity and reliability vary with the formality and diversity of the text; we also show that ready-to-use methods leave much room for improvement when analyzing domain-specific content and that a machine-learning approach offers more accurate predictions across communication domains.
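The reliability/validity workflow the abstract describes can be illustrated with a minimal sketch. The mini-lexicons, texts, and human labels below are invented for illustration (the study uses large off-the-shelf classifiers and real social media corpora); the point is the two-step logic: agreement between classifiers serves as the reliability check, and accuracy against human coding serves as the validity check.

```python
# Two toy lexicon classifiers (hypothetical word lists, not from the study)
POS_A = {"good", "great", "happy"}
NEG_A = {"bad", "sad", "awful"}
POS_B = {"good", "love", "happy"}
NEG_B = {"bad", "hate", "awful"}

def classify(text, pos, neg):
    """Label a text positive/negative by counting lexicon hits (ties -> positive)."""
    words = text.lower().split()
    score = sum(w in pos for w in words) - sum(w in neg for w in words)
    return "positive" if score >= 0 else "negative"

# Invented texts with invented "human coder" benchmark labels
texts = ["what a great happy day", "this is awful and sad",
         "I hate mondays", "bad bad news"]
human = ["positive", "negative", "negative", "negative"]

labels_a = [classify(t, POS_A, NEG_A) for t in texts]
labels_b = [classify(t, POS_B, NEG_B) for t in texts]

# Reliability: how often do the two classifiers agree with each other?
agreement = sum(a == b for a, b in zip(labels_a, labels_b)) / len(texts)
# Validity: how often does each classifier agree with human coding?
validity_a = sum(a == h for a, h in zip(labels_a, human)) / len(texts)
validity_b = sum(b == h for b, h in zip(labels_b, human)) / len(texts)
```

Note how classifier A misses “hate” (absent from its lexicon) and mislabels the third text, lowering both its agreement with B and its validity against human coding; real evaluations use chance-corrected agreement statistics rather than raw proportions.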
Stuart Soroka, Lori Young, and Meital Balmas, “Bad News or Mad News? Sentiment Scoring of Negativity, Fear, and Anger in News Content.”
This article examines the prevalence and nature of negativity in news content. Using dictionary-based sentiment analysis, we examine roughly fifty-five thousand front-page news stories, comparing four affect lexicons: one for general negativity and three capturing different measures of fear and anger. We show that fear and anger are distinct measures that capture different sentiments. It may therefore be possible to separate out fear and anger in media content, as in psychology. We also find that negativity is more strongly related to anger than to fear for each measure. This result appears to be driven by a small number of foreign policy words in the anger dictionaries, rather than an indication that negativity in U.S. coverage reflects “anger.” We highlight the importance of tailoring lexicons to domains to improve construct validity when conducting dictionary-based automation. Finally, we connect these results to existing work on the impact of emotion on political preferences and reasoning.
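Dictionary-based sentiment scoring of the kind described above can be sketched in a few lines. The fear, anger, and negativity word lists and the sample headline are invented stand-ins (the article's actual dictionaries are far larger); the sketch shows how the same story receives distinct scores on each emotional dimension.

```python
# Hypothetical mini-dictionaries (illustrative only, not the study's lexicons)
FEAR = {"threat", "risk", "danger", "crisis"}
ANGER = {"outrage", "attack", "hostile", "furious"}
NEGATIVE = FEAR | ANGER | {"failure", "loss"}

def emotion_scores(text):
    """Score a story on each dimension as the share of words hitting that dictionary."""
    words = text.lower().split()
    n = len(words)
    return {
        "fear": sum(w in FEAR for w in words) / n,
        "anger": sum(w in ANGER for w in words) / n,
        "negativity": sum(w in NEGATIVE for w in words) / n,
    }

# Invented example headline
story = "hostile attack sparks outrage amid growing crisis and threat of loss"
scores = emotion_scores(story)
```

Because fear and anger are scored by separate dictionaries, a story can be high on one and low on the other, which is what allows the two sentiments to be separated empirically.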
Matthew Brook O’Donnell and Emily B. Falk, “Big Data under the Microscope and Brains in Social Context: Integrating Methods from Computational Social Science and Neuroscience.”
Methods for analyzing neural and computational social science data are usually used by different types of scientists and generally seen as distinct, but they strongly complement one another. Computational social science methodologies can strengthen and contextualize individual-level analysis, specifically our understanding of the brain. Neuroscience can help to unpack the mechanisms that lead from micro- through meso- to macro-level observations. Integrating levels of analysis is essential to unified progress in social research. We present two example areas that illustrate this integration. First, combining egocentric social network data with neural variables from the “egos” provides insight about why and for whom certain types of antismoking messages may be more or less effective. Second, combining tools from natural language processing with neuroimaging reveals mechanisms involved in successful message propagation, and suggests links from microscopic to macroscopic scales.
Joseph N. Cappella, Sijia Yang, and Sungkyoung Lee, “Constructing Recommendation Systems for Effective Health Messages Using Content, Collaborative, and Hybrid Algorithms.”
Theoretical and empirical approaches to the design of effective messages to increase healthy behavior and reduce risky behavior have shown only incremental progress. This article explores approaches to the development of a “recommendation system” for archives of public health messages. Recommendation systems are algorithms operating on dense data involving both individual preferences and objective message features. Their goal is to predict ratings for items (i.e., messages) not previously seen by the user on the basis of content similarity, prior preference patterns, or their combination. Standard approaches to message testing and research, while making progress, suffer from very slow accumulation of knowledge. This article seeks to leapfrog conventional models of message research, taking advantage of modeling developments in recommendation systems from the commercial arena. After sketching key components in developing recommendation algorithms, this article concludes with reflections on the implications of these approaches in both theory development and application.
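The three algorithm families named in the title can be sketched in miniature. The ratings matrix and binary message features below are invented (not from the article): the content score uses cosine similarity between a candidate message's features and the user's best-rated message, the collaborative score averages other users' ratings of the candidate, and the hybrid score blends the two with a weight `alpha`.

```python
import math

# Hypothetical data: rows = users, columns = health messages; 0 = unrated
ratings = [
    [5, 0, 3, 0],
    [4, 2, 0, 5],
    [0, 1, 4, 4],
]
# Hypothetical binary message features (e.g., narrative, fear appeal, statistics)
features = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(user, item, alpha=0.5):
    # Content-based component: similarity to the user's highest-rated message
    rated = [j for j, r in enumerate(ratings[user]) if r > 0]
    best = max(rated, key=lambda j: ratings[user][j])
    content = cosine(features[item], features[best])
    # Collaborative component: other users' mean rating, rescaled to [0, 1]
    others = [ratings[u][item] for u in range(len(ratings))
              if u != user and ratings[u][item] > 0]
    collab = (sum(others) / len(others)) / 5 if others else 0.0
    # Hybrid: weighted blend of the two signals
    return alpha * content + (1 - alpha) * collab

# Recommend user 0's unseen message with the highest hybrid score
unseen = [j for j, r in enumerate(ratings[0]) if r == 0]
best_item = max(unseen, key=lambda j: hybrid_score(0, j))
```

Production systems estimate these components from dense preference data via matrix factorization or learned feature weights; this sketch only makes the content/collaborative/hybrid distinction concrete.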