POLS 559 - Text as Data
Weds 1:30-4:20
Smith 109
John Wilkerson (jwilker@uw.edu) Text as Data Publications and Projects
Office: Gowen 112
Office hours: Monday 10-12 or by appt (Zoom) any day other than Wednesday
This class is an introduction to computational approaches to collecting, formatting and analyzing text as data (TAD). We begin by considering the research opportunities and challenges associated with using text as a data source.
We then work through the stages of a text as data project, from selecting text, to converting text to data, to quantitative analysis. We will be particularly attentive to validation: to what extent is our text-based measure a valid proxy for our concept of interest?
The instructor is a political scientist specializing in American politics. Most of the examples will reflect these interests. However, students will be encouraged to investigate and share relevant readings that may be closer to their own interests.
This is not a class about the latest large language models. I'm not qualified to teach such a class. We focus instead on established TAD methods that are broadly accessible to social scientists and meet social science standards of replicability.
This is also not a programming class. It assumes that students have some familiarity with R and/or Python, as both will be used in homework assignments. There is no TA to assist with coding questions. For starters, students are encouraged to support each other. CSSCR offers drop-in consulting, 9am-6pm M-F. CSSS also offers consulting, but the hours are more limited.
Weekly activities
-The readings listed for a given week should be completed before we meet that week. For example, on Wednesday January 8th we will be discussing Grimmer, Roberts and Stewart, chapters 1-2.
-The homework for a given week is typically due the following Monday (submitted as Canvas assignments in most cases). For example, the homework assignment listed for the first week is due January 13th. Unless you are highly skilled in R and Python, it is probably best to start working on the homework assignments earlier than Sunday.
-Collaboration is strongly encouraged in this class. Obviously, the people listed on a collaborative assignment should have all contributed, and I generally expect collaborative assignments to demonstrate more thought and effort than those submitted individually.
Required books
The books have been ordered through the UW bookstore and should be available there (as well as online).
- Text as Data: A New Framework for Machine Learning and the Social Sciences. Justin Grimmer, Margaret Roberts, Brandon Stewart.
- Text Mining with R. Julia Silge and David Robinson.
- Text Analysis in Python for Social Scientists: Discovery and Exploration. Dirk Hovy
Grading
- Participation (20%) – Class attendance and contributions to in-class discussions and activities. Readings are for the listed date (they should be completed in advance of that date).
- Homework (30%) – Due Monday evening on Canvas unless otherwise noted.
- Research Project (50%) – We will be thinking about potential projects from the first day of class. My office door is open (so to speak) for brainstorming! Deadlines: Meet to discuss proposal (by Feb 26); Accepted proposal (March 12); Final project (March 23).
Academic honesty
Suspected plagiarism will be reported and disciplinary actions may ensue. For further detail about the University of Washington’s academic honesty policy, please refer to this website.
Using AI for coding assistance is acceptable. It can be very helpful for troubleshooting (and learning). But of course, the longer-term goal is to understand what you are doing.
It is also appropriate to use AI for research assistance. If you rely on AI summaries (e.g. in summarizing the findings of other studies), then you must credit your source as you would any other source. In my areas of expertise, I've found that AI results can be seriously incomplete (overlooking important works) and sometimes wrong (hallucinations). You are ultimately responsible for the content and quality of your submissions.
Accommodations and support
Disability Resources for Students (DRS) offers resources and coordinates reasonable accommodations for students with disabilities. If you have not yet established services through DRS, but have a temporary or permanent disability that requires accommodations (this can include, but is not limited to, mental health, attention-related, learning, vision, hearing, physical or health impacts), you are welcome to contact DRS at 206-543-8924, or uwdrs@uw.edu. See this website.
- The Counseling Center and Hall Health are excellent resources on campus that many UW students utilize. Students may get help with study skills, career decisions, substance abuse, relationship difficulties, anxiety, depression, or other concerns.
- Counseling Center website
- Hall Health website
Washington state law requires the UW to develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).