Journalism and Language Precision

Using Technology to Recode for Trust

By Janet Coats, Managing Director, The Consortium on Trust in Media and Technology
College of Journalism and Communications, University of Florida

Summary

The crisis in trust is pervasive, and nowhere more evident than in the relationship between news media and its consumers. It seems logical to pin the erosion of trust on digital and social media, particularly since measures of trust have plummeted as sources of news have multiplied. That, however, would miss the fundamental role language itself plays in establishing or undermining trust. Indeed, understanding the intention in language, and how receivers perceive that intention, is an essential element in coding for trust. To build and sustain trust, then, journalists must measure their intention against the embedded bias of language and choose their words carefully. And technology, it turns out, can be a helpful partner in identifying that language, and in so doing can help restore trust.

I’m the director of a research center called the Consortium on Trust in Media and Technology at the University of Florida. I often say that I’m blessed with a nice, narrow mission focused on an easily solvable problem.

It’s a weak joke, the kind made to acknowledge the obvious: The trust crisis is pervasive, leaving virtually no institution or profession untouched. At the center of it sit the two fields that connect with all others: media and technology. They twine around each other to create a thicket that can be dense with information but short on sunlight.

Throughout the rise of the digital age, we’ve celebrated the advance of technology as a means of connection and despaired of the ways it has driven us apart. We’ve blamed it, reflexively, for the destruction of the news business model and the decline of reported news. We’ve praised social media as a democratizing agent, providing unmediated access to social movements such as #BlackLivesMatter, and we’ve decried it as a powerful engine of disinformation that puts the very republic at risk. Now, the object of fear and promise is artificial intelligence, specifically generative AI. It’s either going to kill us all, free us to do more creative work, or replace us in the workforce.

I tend to be, as President Kennedy said, an idealist without illusion. Having watched my chosen profession dreadfully miscalculate the need to embrace the possibilities of the internet, I’m not inclined to sit the rise of AI out. It’s heartening to see journalism organizations like the Associated Press embrace the complicated business of leveraging AI’s promise while mitigating the harms, and to see funders like the Knight Foundation explore the constructive application of its possibilities for local journalism.

Thus I’m inclined to flip the emerging script, which holds that AI threatens the value of reported, verified news and information and will largely be an engine of distrust, by focusing on this question: Can we turn to technology—natural language processing and generative AI specifically—to help us make journalism more trustworthy?

It Begins with a Question

If the task is to build trust as a coin of social interaction and self-governance, it’s hard to know where to start when you survey the diminished state of journalism (particularly at the local level) and the ever-faster evolution of technology that powers information dissemination. At the Consortium, we’re proposing a basic starting point: Thinking about the relationship between trust and language.

Trust and distrust are “registered into the very language we speak.”1 If the language we use reduces trust, we’re pushed into aversion and fear. But if we can break that code and recode our words away from bias and toward an authentic reflection of the language people use to describe their experiences, can that move us closer to journalism that engenders trust?

“Is there a language of journalism?” That’s the seminal question that prompted this line of inquiry. It was posed by Paul Cheung, CEO of the Center for Public Integrity, and an advisor to our Consortium on Trust. Paul’s question was informed by his experience decoding intention in language as a non-native English speaker.

It’s an intriguing idea: Is there a common language used in news coverage, particularly on contentious topics, that is broadly and reflexively adopted by reporters across news organizations? And what intention does that language convey to those who consume news?

We turned to the field of computational linguistics to help find an answer. Our conclusion: Yes, there is a “language of journalism” that can be documented by analyzing a large body of news stories. That language can range from the subtle to the downright jaw-dropping in terms of the potential for bias.

An example:

We analyzed a large body of news stories covering the public response to the murder of George Floyd, drawn from the corpus called “News on the Web.”2 That analysis found:

  • The word “protest” is commonly used to describe the response.
  • The verbs associated with protest commonly carry negative connotations.
  • The verbs carry an expectation of volatility.

To illustrate: the verbs that commonly appear with “protest” in this context convey images of fire.

In fact, the word “spark” appears in news coverage associated with George Floyd more frequently than it appears in the context of its literal meaning of igniting fire.

The broad use of this fiery terminology suggests that it is embedded in the language journalists use to describe protests. Additionally, we found that it is common for news coverage to qualify the nature of peaceful protests: “mostly peaceful,” “largely peaceful,” “relatively peaceful,” “otherwise peaceful.” The language suggests that a peaceful protest is outside the norm and a condition that could be expected to change rapidly.
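To make the method concrete, here is a minimal sketch of the kind of collocation count that surfaces these patterns. It assumes a local folder of plain-text news stories and a small, illustrative list of fire-imagery verbs; our own analysis drew on the far larger News on the Web corpus and more rigorous concordancing tools.

```python
# A minimal sketch (not the Consortium's actual pipeline): count fire-imagery
# verbs that appear near "protest" and the qualifiers attached to "peaceful"
# in a folder of plain-text news stories. The folder name, window size and
# verb list are illustrative assumptions.
import re
from collections import Counter
from pathlib import Path

WINDOW = 5  # words of context on each side of the target term
FIRE_VERBS = {"spark", "sparked", "sparking", "ignite", "ignited",
              "erupt", "erupted", "flare", "flared", "fuel", "fueled"}

protest_neighbors = Counter()
peaceful_qualifiers = Counter()

for path in Path("news_stories").glob("*.txt"):  # assumed local corpus
    words = re.findall(r"[a-z']+", path.read_text(encoding="utf-8").lower())
    for i, word in enumerate(words):
        if word.startswith("protest"):
            context = words[max(0, i - WINDOW):i] + words[i + 1:i + 1 + WINDOW]
            protest_neighbors.update(w for w in context if w in FIRE_VERBS)
        if word == "peaceful" and i > 0 and words[i - 1] in {
                "mostly", "largely", "relatively", "otherwise"}:
            peaceful_qualifiers[words[i - 1]] += 1

print("Fire-imagery verbs near 'protest':", protest_neighbors.most_common())
print("Qualifiers of 'peaceful':", peaceful_qualifiers.most_common())
```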

Is this the meaning that journalists intend to convey when they cover these events? Or is it a default, used without intent and insensitive to the connotations of these words?

The generous explanation, given the widespread use of these constructs, is that the language has become a default. Our challenge is to develop a method and a tool that help journalists move beyond their “gut” and make word choices that are more intentional, more precise and more authentic to the language used by the communities they serve. Before we describe what that could look like, let’s consider why it is important to think about language in this way and why we should turn to technology to help.

What News Trust-building Efforts Teach Us

Trust-building in news has focused on both the creators and the consumers of news, and lessons drawn from both approaches inform our thinking.

Efforts to help journalists cover news in ways that are worthy of trust have focused on concepts like transparency, word choice and community connection. Media literacy focuses on news consumers, with an emphasis on detecting disinformation/misinformation and identifying high-quality news sources. In both the creation and consumption of news, understanding bias in language plays a critical role. Historically, newsrooms have relied on the “journalist’s gut” when it comes to language choice; it’s an approach not dissimilar to Justice Stewart’s description of pornography: “I know it when I see it.” Media literacy’s approach to biased language is similarly imprecise and puts the burden on the reader/viewer/user to understand the nuance of meaning and connotation.

As noted earlier, signifiers for trust and distrust are embedded in our language choices. Understanding the intention in language and how receivers perceive that intention is an essential element in coding for trust. For instance, we know that persuasive language shifts perceptions, and that disinformation is framed with the intention of provoking powerful emotions like fear and anger.

Language precision is too important to leave to “gut,” and the expectation that audiences will be able to decode our meanings is too high. A data-driven approach, through which journalists see both the language choices they are making and the implicit bias that language may convey, will put the burden where it belongs: On the people who are reporting and writing the news.

The Role of Linguistics

We turn to linguistics to provide the foundation for this model. In fact, the principles of linguistics are not far removed from the core principles of good journalism. Let’s start with the idea of the Cooperative Principle as framed by H.P. Grice.3 His framing relates to conversation, but it is a pretty good proxy for what journalism should accomplish. Summarized: Make the contribution to the conversation you are best positioned to make, at the right moment and toward the accepted purpose of the information exchange. Grice proposes four categories to consider: Quantity (as much information as is required and no more); quality (don’t say what you believe to be false and don’t say that for which you lack evidence); relation (be relevant); manner (avoid ambiguity, be brief, be orderly).

All that requires precision and authenticity of language. To achieve it, we can apply the data-driven learning model; in linguistics, that model is used in second-language instruction, giving learners access to authentic language datasets so they can discover how grammar is actually used, rather than memorizing grammar rules that are often broken or inconsistent.4

For our purposes, we used concordancers to analyze large bodies of news coverage. In corpus linguistics, a concordancer is a computer program that retrieves sorted lists of linguistic data from “real world text” for analysis. That analysis gives us insights into the “authentic language of journalism,” revealing patterns such as those we see in coverage of George Floyd.
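As a rough illustration of the idea, a bare-bones key-word-in-context (KWIC) concordancer can be sketched in a few lines. The file path and search term below are assumptions for illustration; production concordancers add sorting, lemmatization and frequency statistics over much larger corpora.

```python
# A bare-bones key-word-in-context (KWIC) concordancer. The input file and
# search term are illustrative assumptions; real concordancers offer sorting,
# lemmatization and frequency statistics over much larger corpora.
import re

def kwic(text: str, term: str, window: int = 6):
    """Yield (left context, hit, right context) for each occurrence of term."""
    words = re.findall(r"\S+", text)
    term = term.lower()
    for i, word in enumerate(words):
        if re.sub(r"\W", "", word).lower() == term:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            yield left, word, right

if __name__ == "__main__":
    sample = open("news_stories/sample.txt", encoding="utf-8").read()  # assumed path
    for left, hit, right in kwic(sample, "protest"):
        print(f"{left:>45} | {hit} | {right}")
```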

Practical Application

This is all well and good—an interesting approach with findings that could have application in helping journalists communicate more precisely, better matching the intention of their language to the authentic language of the people who consume their reporting. But to bring that to life, we must get the tools into the hands of journalists in a way that is user-friendly.

To that end, we’re building a concordancing tool specifically for journalists that will allow them to analyze their language choices in real time to identify implicit bias and sharpen the precision of their words. The tool (we’re calling it Authentically) uses the methods I’ve described to identify patterns and give journalists the opportunity to ask themselves: Is this really what I meant to say? Does this accurately represent the events I’m describing? Is this language biased?

The tool could return a result that looks something like this:
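(The sketch below is a purely hypothetical stand-in: the field names, the example sentence and the suggested alternatives are illustrative assumptions, not the tool’s actual output format.)

```python
# Purely hypothetical illustration of a flagged result; the field names,
# example sentence and suggested alternatives are assumptions, not the
# tool's actual output format.
flagged_result = {
    "sentence": "Protests sparked across the city after the verdict.",
    "flagged_term": "sparked",
    "note": "Fire imagery; implies volatility rather than describing what happened.",
    "alternatives": ["began", "spread", "grew"],
}
```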

We’re currently in the “beta of a beta,” pre-training a natural-language-processing model on a large language dataset to analyze news stories. Our current model relies on an active search: We identify completed news stories and analyze the language in them. We envision a final version that would enable passive search, so that language can be analyzed as journalists write, providing sentiment analysis, identifying potentially problematic words in real time and suggesting possible alternatives.
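To give a sense of what that passive, real-time review might involve, here is a hedged sketch that scans a draft against a tiny hand-built watchlist. The watchlist, notes and alternatives are illustrative assumptions; the actual tool relies on a model pre-trained on a large news corpus rather than a fixed word list.

```python
# A hedged sketch of "passive" review: scan a draft and flag terms whose
# connotations may not match the writer's intent. The watchlist, notes and
# alternatives are illustrative assumptions, not the trained model.
import re

WATCHLIST = {
    "sparked": ("fire imagery; suggests volatility", ["began", "spread"]),
    "erupted": ("fire/explosion imagery", ["started", "grew"]),
    "mostly peaceful": ("frames peaceful protest as the exception", ["peaceful"]),
}

def review_draft(draft: str):
    """Return (term, note, alternatives) for each watchlist term in the draft."""
    hits = []
    lowered = draft.lower()
    for term, (note, alternatives) in WATCHLIST.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((term, note, alternatives))
    return hits

for term, note, alternatives in review_draft(
        "The mostly peaceful rally sparked debate across the city."):
    print(f"'{term}': {note}; consider {', '.join(alternatives)}")
```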

We don’t need to wait for the tool to be complete to start putting what we’ve learned to work. We’re working now on developing educational materials for journalists and journalism students based on the analysis we’ve done on coverage of abortion and race. We’re making plans to begin analysis of 2024 election coverage. We’re working with early partners to conduct tests with newsrooms and to provide custom analysis for newsrooms focused on coverage areas they want to better understand. We’ve also experimented with using Generative AI as one tool in creating guides for newsrooms based on what we’ve learned through language analysis and see promise in that method. Guides could include word usage and interview frameworks modeling language use that is authentic to the ways communities talk about their concerns. Alongside our product development, we’re undertaking academic research to better understand the effectiveness of the tool in improving credibility with news consumers.

Considering the Word “Trust”

Analyzing the language journalists use with an eye toward precision brings us to the word at the center of this project—a word in the very name of the Consortium I lead. Is trust really the word we want to use to describe the relationship we’re trying to build between journalists and communities?

When I presented our work on language analysis at the Computer History Museum’s “Tech and the Future of News” workshop, discussion turned to the question of whether journalists can ever truly earn trust, particularly in communities that have been misrepresented, ignored and even scorned by news organizations. As Candice Fortman of Outlier Media framed it, trust is not even a starting point for those communities; news organizations would have a lot of work to do before the concept could even be put on the table. Tracie Powell, founder and CEO of The Pivot Fund, said that instead of thinking in terms of “trust,” we should be talking about authenticity and intention.

That resonates with the work we’re doing on language. We’ve been talking about three factors we hope to improve in the “language of journalism”: authenticity, intention and precision. Those ideas must extend beyond language to the act of reporting if there’s to be the possibility of being credible, much less trustworthy. For too long, journalists have “parachuted in” to communities where they have no connection to extract news of the tragic, then exited as quickly as they arrived, not to be seen again until the next tragedy. The default language of this kind of journalism is one of distance. Trust is built up close; to do that requires a language of proximity. We can use technology to help us see the distant “language of journalism” we’re using and understand the bias we may not intend but are demonstrating nonetheless.

A starting point to creating a new, more precise and intentional language of journalism is listening to the way people express themselves: the authentic words they use to describe themselves, their communities and the things they care about. A starting point to the conversation about trust is getting the language right. We’re learning that technology can help us get there.


Janet Coats is the managing director of the Consortium on Trust in Media and Technology at the University of Florida, where she focuses on understanding the dynamics that have undermined trust in news. She came to UF from Arizona State University’s Walter Cronkite School of Journalism and Mass Communication, where she was Executive Director for Innovation and Strategy. Prior to her turn in academia, Coats led multimedia news organizations in Sarasota and Tampa. She has served as a Pulitzer Prize juror multiple times, including as chair of the Public Service jury.

1 Gefen, David; Fresneda, Jorge E.; Larsen, Kai R. “Trust and Distrust as Artifacts of Language: A Latent Semantic Approach to Studying Their Linguistic Correlates.” Frontiers in Psychology, Vol. 11, 26 March 2020.

2 Davies, Mark. (2016-) Corpus of News on the Web (NOW). Available at https://www.english-corpora.org/now/.

3 Grice, H.P. “Logic and Conversation.” Syntax and Semantics, Vol. 3: Speech Acts. 1975, pp. 41-58.

4 Johns, Tim. “Should You Be Persuaded: Two Samples of Data-Driven Learning Materials.” ELR Journal, Vol. 4, pp. 1-16.

