Department of Journalism and Public Relations

Catalog listing

The unprecedented availability of ”big data” constitutes a surfeit of digital information that is utilized by academics, government institutions, private industry, and digital firms. This course trains students to use the tools (software, methods, and theory) required to access, process, analyze, and compose findings in the manner of public interest and social science journalism.

Room: Online

Email:

Course website: http://bit.ly/csuc-dj-2021

DESCRIPTION

The need for analytic abilities exploded in the last decade. The desire for evidence-based decisions extended into nearly every facet of life. We trust algorithms to deliver us online orders, match us with dates, and even find health professionals. Almost all journalism has moved from print to screens, and this has changed the underlying skill set the profession requires.

This course trains students to use the tools (software, methods and theory) required to access, process, analyze and compose findings in the manner of public interest and data journalism. The course will introduce students to the R programming language, data manipulation, and visualizations.

STUDENT LEARNING OUTCOMES

At the conclusion of this course, students will understand:

  1. Why data journalism exists (a brief history)
  2. Basic data science technologies
  3. Common sources for data (open datasets, websites, etc.)
  4. Using R for data manipulation
  5. A comprehensive grammar of graphics
  6. Identifying good, bad, ugly, and wrong data visualizations
  7. Creating data visualizations
  8. Production of data visualization appropriate for journalistic publication.
  9. Production of journalistic narratives, both text- and image-based, that employ the above methods.

HOW THE CLASS WILL WORK

The course is divided into two sections. In the first eight weeks of the course, you will 1) learn basic web technologies, 2) data storage, extraction and manipulation, 3) basic data summaries, and 4) data visualization. At the end of this portion of the course, you use what you’ve learned to create an exploratory data analysis project (see Project 1 in Assignments and Grading). The second portion of the course will function as a modern newsroom. There will be four sections covering different ‘desks’, each presenting a topic from a particular beat (i.e., science and technology, criminal and social justice, media and politics, and culture and entertainment). We will discuss the examples in these sections, and you will use them to draw inspiration for your final project (see Project 2 in Assignments and Grading).

This course is concerned with an unusual blending of theoretical and practical applications. The theoretical, here, is concerned with what is required to determine truth about a subject that comprises some public interest; the practical concerns, here, address how data are collected, stored, manipulated and analyzed in the pursuit of that truth. With such varied and complex materials, it is imperative that students commit to attending class, completing readings and homework, and consulting me if and when difficulties arise. Further, this is a course in a professional program. This means that students approach this coursework as they would a professional commitment with timeliness (in attendance and submission of assignments), responsibilities (to readings and assignments), and respectful interactions with others in the course. By taking this course, students commit to attending every meeting (absent medical documentation) and arriving in class prepared and on-time. This course meets only sixteen times, which necessitates student attendance for comprehension of material. Therefore, attendance is mandatory and each missed meeting (sans medical documentation) will result in final grade reduction of one full letter grade. There will be opportunities to make up missed assignments and absences throughout the course.

ASSIGNMENTS AND GRADING

This course requires that students attend class to learn basic data manipulation and data visualization skills. Therefore, much of the student evaluation, besides the weekly exercises conducted to ensure comprehension, is based on two data journalism projects.

Exercises (credit): These are in-class exercises and homework assignments. Each missed exercise after the first will cost the student a full letter grade for the course. Completion of these assignments is obligatory.

Project 1 (40%): Reproduce an exploratory data analysis (EDA) of a given data source and accompanying narrative. This project gives students an opportunity to practice data manipulation, visualization, graphing techniques, presentation and framing. Length: 1,200-1,500 words.

Project 2 (40%): Storyboard project. Students will choose from a list of data sources, create a sequence of data visualizations and displays, and provide related commentary and narrative. This project will be published online. This project requires the student to use the R programming to examine an issue of social scientific significance. Length: 1,200-1,500 words.

Weekly quizzes (20% total): These are formative assessments to review material from the previous week.

I will keep the gradebook current in Blackboard. If you have difficulty with material or concepts, please talk to me. There will be many opportunities to review material individually, but I need to be aware of any such difficulties.

THE WEEKLY AGENDA

We meet once a week, during which I give a brief lecture, following a discussion. We will then walk-through ‘live coding’ activities that demonstrate how to collect, clean and visualize a variety of data types. The last portion of class can be used to complete the assigned activities for the week. Exercises are due at the end of each week, and completion will constitute your participation grade for that week.

Prior to our meeting each week, there will be assigned readings. There will be a quiz on this material given immediately at the beginning of each week’s class.

PLAGIARISM

In this course, we are following standard definitions of plagiarism that include: no copying of text (verbatim or partial), no introduction of another’s idea without attribution, no use of another’s intellectual property without consent and clear attribution of source. Cheating on any quiz or exam will be considered a similar violation.

WRITING CODE

Students will encounter a fair amount of code in this course. While this is not a programming course, it’s standard practice to use code snippets from online resources like StackOverflow or online vignettes. R is an open-source programming language, and reusing code encourages collaborators to build on the work of others. Therefore, students are encouraged to copy and paste code from other sources. However, this code usually needs to be adapted to fit a particular use. This is the one exception to the plagiarism policy. All story text and photos must be exclusively your own. I loathe dishonesty, in nearly every form and circumstance. I will immediately fail and request the maximum possible punishment for any student that violates these policies.

BOOKS AND MATERIALS

Data Journalism texts:

The Data Journalism Handbook, Gray, et al., (2012). Online at: http://datajournalismhandbook.org/1.0/en/index.html

The Data Journalism Handbook 2, Gray, et al., (2018). Online at: https://datajournalism.com/read/handbook/two

R programming texts:

The following texts are available online free of charge:

R for Data Science

R Markdown: the definitive guide

Data visualization texts:

The following texts are available online free of charge:

Data Visualization: A practical introduction

Fundamentals of Data Visualization

Additionally, several online readings will be required and assigned in class. Students are expected to have consistent high-speed Internet availability and have access to a computer with modern processing capabilities (open lab is available for those without such a machine).

DISABILITIES

If you have a disability and require special accommodations, please see me to discuss possible arrangements with ARC.

GRADE SCALE

All tests, quizzes and large presentations will be graded on a 100-point scale. All percentages are rounded to the nearest whole percent. Letter grades are based on the following scale:

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.