I had never heard of Data Journalism until a few weeks ago. I’m still not entirely sure I understand what it means – but there are seemingly job openings for Data Journalists. Plenty of them.
What makes a person a data journalist? The ability to deal with data. At first I thought this must be pretty simple: take a statistics class, learn the basics of data interpretation. Want to know more? Take more statistics classes. That was a social scientist’s point of view – and it’s not true for data journalism.
Statistics vs Data Journalism
A data journalist definitely needs to know basic statistics. No competent data journalist would confuse a proportion with a percentage and then report that prices (or profits) had increased 1300%. A data journalist understands the meaning of statistical significance, can accurately interpret reports of scientific research, and is energized, rather than terrified, by the presence of numbers.
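The proportion/percentage mix-up is easy to see with a few lines of arithmetic. The sketch below uses hypothetical prices, and the "multiply by 100 twice" slip is only one assumed way that a figure like 1300% could end up in print.

```python
def proportional_change(old: float, new: float) -> float:
    """Change as a proportion of the old value (0.13 means 13 percent)."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


def percent_change(old: float, new: float) -> float:
    """Change as a percentage: the proportion multiplied by 100, once."""
    return proportional_change(old, new) * 100


# Hypothetical prices, chosen only for illustration.
old_price, new_price = 100.0, 113.0

proportion = proportional_change(old_price, new_price)
percent = percent_change(old_price, new_price)

# The assumed slip: treating a figure that is already a percentage
# as if it were a proportion and multiplying by 100 a second time.
mistaken = percent * 100

print(f"proportion:       {proportion:.2f}")
print(f"percent change:   {percent:.0f}%")
print(f"mistaken report:  {mistaken:.0f}%")
```

Keeping the two quantities in separately named functions makes it harder to lose track of which scale a number is on.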
The difference between the professions is found in the auxiliary skills. Social scientists – the group most commonly compared to data journalists – expect to define variables, collect data that no one has collected before, or create unique data sets. Even social scientists who specialize in secondary analysis – working with data collected by governments, international agencies, or public data sets – are interested primarily in exploring theories or evaluating which academic perspective is more likely to be true. So they study research methods, marinate themselves in the intricacies of social theories, and garner all the tools of academic discourse.
The data journalist does not expect or want to be the creator of data – although she may well aspire to be the one who combines existing data in new ways to generate a new perspective. The data journalist needs all the skills of any journalist – tracking down all angles of a story, gathering the particular details and forming a coherent narrative that is supported by the facts of the situation. For a data journalist, those facts are data – usually numbers – gathered by local, state, national, and international governmental bodies as well as many non-profit agencies and dozens – hundreds – of public relations and advertising firms. Oh yes – there’s also the data being generated by your cell phone, Facebook or LinkedIn, your computer use and Google click-throughs, and the like.
Skills of the Data Journalist
Beyond the basics of statistics – understanding a frequency distribution table or a research report – the data journalist is an organizer of existing data. They pursue topics, not data sets or research agendas. Rather than a long academic review of the literature on an issue like religious freedom or the psychological impact of unemployment, the data journalist wants to discover and create an underlying plot line, and support it with data from government reports, social scientists’ research projects, economic projections, and more.
As soon as one sets out on this path, the problems are apparent. Even though data is officially public, it may not be released quickly. Government websites may offer ready-made analyses – particular tables, constantly updated and ready to go – but make it much more difficult to get the entire data set. Budget cutbacks threaten even data sets as important as the U.S. Census or the American Community Survey – and then where can the data journalist get results to interpret? Data journalists become expert at downloading and organizing data, at computing rates or exact numbers to create comparability across data sets, and at the various database languages needed to carry out these activities. They become better data managers than the most assiduous researchers, with good reason: their data are hard to obtain, sometimes impossible to replace, and almost always found in three or four different formats across dozens of locations.
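To give a feel for what "three or four different formats" means in practice, here is a minimal sketch that pulls the same kind of figure out of a CSV file and a JSON file and puts both into one common shape. The county names, field names, and numbers are all made up for illustration; real sources will differ in their own ways.

```python
import csv
import io
import json

# Two hypothetical sources reporting the same measure in different shapes.
CSV_SOURCE = """county,unemployed
Adams,1200
Baker,450
"""

JSON_SOURCE = '[{"County": "Clark", "Unemployed": "2,310"}]'


def clean_count(value) -> int:
    """Turn a value such as '2,310' or ' 450 ' into an integer."""
    return int(str(value).replace(",", "").strip())


def from_csv(text: str) -> list[dict]:
    """Read the CSV source into the common record shape."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"county": row["county"].strip(), "unemployed": clean_count(row["unemployed"])}
        for row in reader
    ]


def from_json(text: str) -> list[dict]:
    """Read the JSON source, which uses different key names and number formatting."""
    return [
        {"county": item["County"].strip(), "unemployed": clean_count(item["Unemployed"])}
        for item in json.loads(text)
    ]


# One list, one shape, regardless of where each record came from.
records = from_csv(CSV_SOURCE) + from_json(JSON_SOURCE)
for record in records:
    print(record)
```

Each source gets its own small reader, so when an agency changes its file layout only one function has to change.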
So the data journalist learns to find public data and to ask for access to data files from private researchers – especially if the research was funded from tax income. More difficult than finding the data is the process of organizing it for analysis. It may have to be massively reformatted. Raw numbers may need to be converted to rates, or converted to a different basis to appear in the same graph with a second variable. Many of the skills for this part of the profession are found in Computer Science departments, not social science or statistics – and the names of the programs they use for scraping and re-organizing data are not familiar to most mainstream social science researchers.
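Converting raw numbers to rates is the simplest of these steps to show. The sketch below uses invented place names and counts, and assumes a per-100,000 basis, which is a common convention but not the only one.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    events: int       # raw count, e.g. reported incidents (hypothetical)
    population: int   # resident population (hypothetical)


def rate_per(events: int, population: int, base: int = 100_000) -> float:
    """Express a raw count as a rate per `base` residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * base


# Invented figures: the larger place has far more events in absolute terms.
regions = [
    Region("Big City", events=1200, population=2_000_000),
    Region("Small Town", events=45, population=30_000),
]

# Sorting by rate rather than by raw count can reverse the ranking.
for region in sorted(
    regions, key=lambda r: rate_per(r.events, r.population), reverse=True
):
    rate = rate_per(region.events, region.population)
    print(f"{region.name}: {region.events} events, {rate:.1f} per 100,000")
```

With these made-up numbers the small town comes out ahead on the rate even though its raw count is much lower, which is exactly the kind of comparison raw numbers hide.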
There is true journalism in this process. It’s not enough to simply amass a quantity of data. The journalist needs to find a plot line. What changes are occurring? How do they relate to other changes? What seems to be creating or facilitating the changes? Or – why is nothing changing when one would expect it would? Data journalism requires analyzing and organizing the data to find the story – and then telling the story.
Images have become a central part of data journalism. The well-designed and revealing graph replaces the descriptive photo in many stories. Where Dorothea Lange’s photos gave Americans a sense of the devastation of the Dust Bowl of the 1930s, today’s data journalists are expected to find creative ways to present their data so that readers can absorb the onslaught of numbers without their math-phobic nerves being awakened.
Hans Rosling pioneered animated graphs, taking static graphs that showed, for instance, the relationship between birthrate and child mortality for the nations of the world, and setting them in motion. The static graph for any one year became a dynamic story, with the less developed nations racing across the grid. He made the case that some developing nations in 2008 had maternal health statistics that were similar to those of western nations in the 1970s – using the style of football play-by-play announcers rather than the language of statisticians.
Technological tools – software – are certainly important here. But so is the designer’s eye, the ability to use color, space, size, shape and motion to present the story in the data. The data journalist has to write well – but she also needs to design the data presentation well. Some specialize only in the visual presentation of data, creating the story in graphs and charts and leaving the writing – now a less creative task – to writers.
Allied to the growing field of data journalism is the new profession of Data Scientist, with an even greater emphasis on the computer science and mathematical skills. Both are evidence that the information age has arrived – and created the jobs of the future. They just don’t look the way many of us expected them to.