Why partisanship (right-left) might not predict susceptibility to misinformation all that much.
Identifying a potential methodological artefact in Nikolov, Flammini, and Menczer (2021), Right and left, partisanship predicts (asymmetric) vulnerability to misinformation, and a positive suggestion for how to fix it.
(This is not a peer-reviewed review, it’s just my personal opinions at this stage.)
Nikolov, Flammini, and Menczer’s paper is excellent and addresses a very relevant question, a tragically rare combination in the field so far: it is both technically advanced and on a humanly important topic. But there is a potential methodological artefact linking the way they define ‘misinformation’, which partly relies on Zimdars et al. 2016, to variation in the false negative error rates of identifying and grading misinformation sites across the left-right partisanship clusters in the network, variation that is likely to result partly from that definition.
This means that the apparent causal link between partisanship and vulnerability to misinformation might not be as strong as they conclude, or the distribution might be different: I’d guess it’s bimodal rather than one-sided, as in the classic ‘horseshoe’ model of left-right partisanship and political values. I do think there is a real causal nexus, as they say, but I’m dubious that it’s as simple or as strong as they conclude.
Firstly, I don’t have a problem with the heuristic principle of source-checking before fact-checking. Of course it’s fallible, like every other means and medium of knowledge, but it’s very reasonable given the nature of the information environment we live in, the limitations of human cognitive and social psychology, and the vital needs which sufficiently rational decision-making must meet in terms of both accuracy and efficiency: a perfect decision made too late is as maladaptive as an inaccurate one.
Our cognitive capacities are inevitably inadequate for fact-checking all incoming information, so we have to rely on social-network-based heuristics. The choice actually available to us is only how consciously, explicitly, and carefully we apply social heuristics to filter the volume of information in the global network down to a volume we can individually process, not whether or not we rely on social heuristics at all.
Applying social heuristics appropriately prevents our more resource-demanding cognitive processes from getting overloaded and then malfunctioning or ceasing to function (overload which Russia Today’s banner slogan ‘Question More’ seems designed to produce). People, and other animals, can make rational decisions inasmuch as we apportion cognitive resources between high-efficiency and high-accuracy processes in a meta-rational way. Source-checking before fact-checking is an example: an efficient first filter, before checking the resulting smaller sample of information in a more accurate, individually rational way.
I also don’t have a problem with Zimdars et al. 2016’s positive identifications of misinformation sources. I haven’t checked every one, but on a scan read I recognise and agree with most, although I think their qualitative grading of how seriously misinforming the sites are probably underestimates the severity of misinformation on left-wing sites.
My problem is with the sampling methods in Zimdars et al. 2016 and hence in Nikolov et al. 2021. I think false negative errors in identifying misinformation sources are more likely in the left-wing cluster than in the right-wing one, because:
Zimdars et al. 2016 does not specify how they sampled sites to detect misinformation sources. I expect they checked sites as they came to them, or explored around networks of sites and sources, checking as they went. This is also how I’ve done it before, and as an initial exploratory method it’s fine. But when misinformation is going to be a categorical variable in a statistical analysis measuring a hypothesised causal relationship, a systematic sampling method is more appropriate.
Obviously, systematically sampling the whole internet and manually coding which sites are misinformation sources, and at what qualitative grade, would be impractically time-consuming and labour-intensive. But it would at least be viable to apply systematic random sampling to a sub-section of sites across the left-right clustering dimension of the network structure, to check how much the observer’s relative position in the network influences which sites they come across and how they grade them (observer bias), and how much false negative error rates vary across the left-right clusters.
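To make the idea concrete, here is a minimal sketch in Python of what such a check could look like. Everything here is hypothetical: the function name, the data shapes, and the `recode` oracle (standing in for careful manual re-coding of the randomly sampled sites) are my own illustrative assumptions, not anything from either paper.

```python
import random


def estimate_false_negative_rates(sites, original_labels, recode,
                                  per_cluster_n=50, seed=0):
    """Estimate, per partisanship cluster, how often the original coding
    missed a misinformation site (false negative rate).

    sites: dict mapping cluster name -> list of site ids
    original_labels: dict of site id -> True if originally coded as misinformation
    recode: callable site id -> True if careful manual re-coding finds misinformation
            (here a stand-in for the expensive human coding step)
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    rates = {}
    for cluster, members in sites.items():
        sample = rng.sample(members, min(per_cluster_n, len(members)))
        # A false negative: re-coding finds misinformation that the
        # original pass did not label.
        missed = sum(1 for s in sample
                     if recode(s) and not original_labels.get(s, False))
        truly_misinfo = sum(1 for s in sample if recode(s))
        rates[cluster] = missed / truly_misinfo if truly_misinfo else 0.0
    return rates
```

If the estimated rates differ substantially between the left and right clusters, that would be direct evidence of the asymmetric false-negative problem described above.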
Zimdars et al.’s methodology of judging sources as misinformation or not weights stylistic and aesthetic factors quite highly. But is not following the Associated Press (AP) Style Guide really a reliable criterion for misinfo? If a source disagrees with AP terminology, with good reasons, is that relevant to whether it is misinformation? Journalists from culturally subordinate countries more often disagree with WENA cultures’ style guides, due to their experiences, values, and logic. Would the students who participated in Zimdars’ study by categorising and grading sites have noticed, or been instructed to record, when variations from the AP Style Guide occurred for apparently good reasons and probably had no meaningful connection to whether the site is ‘misinformation’?
AP terminology, e.g. https://blog.ap.org/announcements/now-we-say-the-islamic-state-group-instead-of-isil, advises the usage ‘Islamic State group’ for ‘الدولة الإسلامية في العراق والشام’ (ad-Dawlah al-Islāmiyah fī ‘l-ʿIrāq wa-sh-Shām). Arabic reporters often prefer to call it ‘داعش’, transliterated as ‘DAESH’, because that is delegitimizing and offensive to DAESH: it implies denying their claim to be ‘Islamic’ and makes them sound pretentious, as they are. AP, however, uses “Islamic State group”, which risks legitimizing anti-Muslim hate speech. Preferring “DAESH” doesn’t make a source misinfo.
Mainstream (i.e. non-specialist) media often call the war in Syria since 2015 a “civil war”, but without the Iranian and Russian interventions to back up their client regime, headed by Bashar al-Assad, the civil-war part of the war would probably have been over by autumn 2015. Most of the armed forces on the regime’s side are foreign, mainly Iranian-supplied ground forces and Russian air forces, against mainly Syrian Opposition forces. So ‘civil war’ is a misnomer for the kind of war in Syria now. When Syrian Opposition media vary from the AP’s ‘civil war’ framing, that has no meaningful connection to whether their sites are ‘misinformation’ sites.
Enab Baladi is a Syrian Opposition media company originally formed as a women’s cooperative in Daraya during the revolution. They’ve done some very important reporting, e.g. on the emerging civil-society-based local governance system of the Local Coordination Councils, and their reporting is substantially high-quality. However, their English proofreading sucks. That does not make them ‘misinformation’ or a lower grade of source; it just means they should hire a better team of English proofreading editors.
On the flipside, here is an example of stylistic normalcy, according to WENA cultures, with no meaningful connection to reality: AP and other mainstream Western broadcasters often apply statehood terminology to de facto authorities, especially in formerly colonised countries, who have no plausible grounds for claiming representativeness or legitimacy according to democratic values, thereby attributing legitimacy to them without any reasonable connection to reality. Is non-conformity to that stylistic norm a reason to code a source as ‘misinfo’ or ‘politically biased’? And is it really apolitical to apply statehood terminology to de facto authorities who have presented no plausible evidence for their claims to representativeness or public service, when one is neither a subject of their rule nor has ascertained the subjects’ consent to it?
Stylistic and aesthetic factors probably indicate where a site or source is coming from socially and culturally, or where its authors want it to appear to be coming from or associated with, but such preferences don’t necessarily have any meaningful connection to being misinformation or not, or being presented in accord with democratic values.
In Zimdars et al. 2016, were graph-based scraping algorithms used in the sampling method before the human content-analysis steps mentioned in their methodology, in order to make sampling more systematic and representative? If so, what were the designs and underlying assumptions of those scraping algorithms?
Were there any methodological controls to minimize false negatives, and especially to minimize unconscious partisan biases? Statistically biased gaps in contextual knowledge are likely to concern international news issues more often than domestic ones. And how often are domestic vs. international issues discussed in right-wing vs. left-wing sources? That, too, potentially biases the data collection and source coding.
I mean ‘statistically biased’ in the sense that our individual and social positions in the networked community structure of the global information space passively affect what information does and doesn’t reach us. Even if one makes a conscious effort to correct this sort of sampling bias, it inevitably affects what one sees and doesn’t see, to some extent. Such network-structure-based statistical sampling biases on what information we see are related not only to political partisanship (homophily), but also to where we originated in the network (stigmergy), offline and online, because, in general, the earliest influences shape complex processes, such as network community structures, most persistently. Thus, we are less likely to see an even sampling of information from distant countries, even before accounting for political partisanship and ‘left-right’ positionality in the network.
I think it’s reasonable to assume, considering demographics and polling data, that the students and academics who participated in compiling Zimdars et al. 2016’s dataset probably lean liberal-leftish (as do I). Left-wing political narratives tend to have a bigger scope, more often referring to international patterns and processes, which naturally we tend to know less directly and contextually about. It is easier to be unknowingly and unintentionally biased about things one has not directly seen but necessarily has to rely on others for knowledge of (especially when one is enculturated to suppress consciousness of social heuristics as ‘irrational’, and to over-rely on individualistic ‘pure reason’ as if human rationality existed in an asocial, disembodied vacuum).
Compounding that difficulty, spotting inaccuracies, unrepresentatively selected emphases and omissions, and decontextualized or mis-contextualized facts, and noticing unsubstantiated causal or moral nexus attribution claims, often hidden in adverbs or in rhetorical stylistic flourishes, requires much more contextual, historical subject knowledge. It is inherently harder to see what’s not there when the contextual knowledge necessary to make sense of and judge representations of events, actions, and claimed connections is more unfamiliar and spans more diverse subjects.
Furthermore, in the methods section of Nikolov et al., 2021: of the eight major news events included in the timeframe of the dataset, only two (the Mueller investigation into Russian interference; the American intervention in Syria) would I expect to engage left-leaning clusters as much as right-leaning ones. Thus, the dataset they scraped to compare against Zimdars et al. 2016’s dataset is probably also affected by a statistical sampling bias, since less of the activity in the timeframe would have engaged left-wing popular discussion than right-wing. This would be checkable by going back to the dataset, re-graphing it in time-slices for each of the eight major events and issues, and seeing how the activity is distributed socially.
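As a sketch of what that time-slicing check might look like, the following hypothetical Python function counts sharing activity inside per-event time windows, split by partisanship. The function name, data shapes, and event windows are illustrative assumptions of mine, not the papers’ actual formats.

```python
from collections import Counter
from datetime import datetime


def activity_by_event(shares, event_windows):
    """Count sharing activity per event window, split by user partisanship.

    shares: iterable of (timestamp, partisanship) tuples,
            partisanship e.g. "left" or "right"
    event_windows: dict of event name -> (start, end) datetimes
    """
    counts = {event: Counter() for event in event_windows}
    for ts, side in shares:
        for event, (start, end) in event_windows.items():
            if start <= ts <= end:
                counts[event][side] += 1
    return counts
```

Comparing the left/right ratios across the per-event counters would show whether the June 2017 window over-represents right-engaging events, as suspected above.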
The sampling of activity in Nikolov et al., 2021, was continuous, from June 1 to June 30, 2017, but that’s a short period, and it doesn’t appear to have been selected to capture the activity around all eight events evenly.
The time-frame of popular discussion activity on different topics varies significantly; in my personal observations, usually between two days and two weeks. Events and issues are discussed in specialist niches in the network continuously outside that timeframe, but news only breaks through into mainstream popular discussion occasionally, a few days after specialists get it, and popular discussions last a few days to a few weeks. How long popular discussion lasts doesn’t only depend on the audience’s attention span and interest level, but also on how much interested parties amplify them.
At least 5 of the 8 issues I expect would have been heavily re-framed and amplified by the Russian regime to promote their strategic political interests. How long they amplify them for depends on what sort of political decision they’re aiming to influence. Thus, the probability of all the activity about an issue falling within the sampling timeframe varies in a way linked to political partisanship, besides the hypothesized relationship. It also potentially relates to how quickly an influence operation succeeds: if it succeeds sooner, amplification resources will be redirected sooner, and the volume of popular political discussion activity on it will decrease sooner.
There are two more potential sampling bias effects due to Nikolov et al.’s methodology: to focus on active online news consumers, they selected accounts which shared at least ten links from a set of news sources with known political valence (Bakshy et al., 2015) during that period. Bakshy et al.’s sample comes from Facebook, so it is huge and comprehensive, but the sites shared on Facebook and on Twitter don’t necessarily overlap exactly: there might be more omissions on the left than on the right when the dataset is transferred to Twitter data.
And “To focus on users who are vulnerable to misinformation, we further selected those who shared at least one link from a source labeled as low-quality. This second condition excludes 5% of active right-leaning users and 30% of active left-leaning users.” The problem is that how much of that difference in the frequency of sharing ‘low-quality’ sources is due to the methodology for identifying and grading such sources, and how much is due to the hypothesized relationship between partisanship and susceptibility to misinformation, are entangled.
Nikolov et al. don’t specify which of Zimdars et al.’s categories of misinformation they included in ‘low-quality’. I guess they mean all of them, but it’s not clear. If Nikolov et al. include only some of Zimdars et al.’s categories in ‘low-quality’, that makes Zimdars et al.’s qualitative grading of sources more important to Nikolov et al.’s results. And if there is also a biased down-grading of left-wing sources, potentially because the subjects of left-wing narratives are more unfamiliar, diverse, and complex, and information about them is less likely to reach the audience or the coders directly, making it harder to see what’s not there or to notice mis-contextualized but individually true information, that could also confound Nikolov et al.’s result as a methodological artefact.
I am sensitive to the fact that Nikolov et al.’s research probably took in the order of 200x more time and effort to create than to critique, so it would be churlish to critique it without proposing a positive solution to the potential methodological artefacts I’ve mentioned. For this part especially, I’m grateful to Kaneshk Waghmare, a friend and neuroscience doctoral student at Amsterdam university, for discussions which led to this idea on how to fix it:
It’s impractical to comprehensively collect social media data and systematically analyse all of it. But I think it would be practical to collect a random sample of sites presenting themselves as ‘news’ sites, as evenly as possible across the relevant network, manually code them systematically, and then compare the rates at which sites are coded as ‘misinformation’, and at what grade, in that random sample versus the as-it-came, non-random sample, to check whether the misinformation and grading rates significantly vary.
If the unsystematic sample and the systematic random sample differ significantly in their ‘misinformation’ coding rates, that probably indicates a false negative error rate of about that size in the original sample. The timeframe of sampling almost certainly matters significantly too, so this control would have to be performed on a new dataset. It therefore wouldn’t be appropriate to retrospectively subtract the estimated false negative error rate and rerun the statistical analysis, but the control could be applied in subsequent studies, and indicatively to the dataset in Nikolov et al. 2021.
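The comparison of coding rates between the two samples could be done with a standard two-proportion z-test. Below is a minimal self-contained sketch; the function name and the example counts are illustrative assumptions of mine, not figures from either paper.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Two-proportion z-test: is the rate k1/n1 significantly different
    from k2/n2? Returns the z statistic and a two-sided p-value
    (normal approximation; assumes reasonably large samples)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:  # degenerate case: all or none coded as misinformation
        return 0.0, 1.0
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF, via math.erf
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p
```

For example, if the as-it-came sample coded 30 of 200 sites in a cluster as misinformation while the random sample coded 55 of 200, the test would indicate a significant difference, suggesting a substantial false negative rate in the original sample.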
Prof. Kate Starbird replied to my Twitter thread about this and agreed that, in her observations too, the decontextualized or mis-contextualized but individually true kind of misinformation is generally more frequent over time in the left-wing cluster than in the right. That kind is harder to identify, so methods like Zimdars et al. 2016’s are more likely to miss or down-grade left-wing misinformation sources than right-wing ones.
Nikolov et al.’s main conclusion could be true, but without controlling for these potential methodological artefacts it’s not as certain as it appears to be.