Eligibility Check (Additional Analysis)
Do Won Kim
1 Checking Assumption
Question: Do people who follow zero qualifying accounts have consistent enough exposure in their home timeline to be worth including?
By adopting Following OR Hometimeline as our eligibility criterion, we assume that people are repeatedly and consistently exposed to indirect low-quality tweets. That is, if we find indirect low-quality tweets in users' home timelines (i.e., tweets that their friends retweeted or quoted) at the time we collected the reverse-chronological timeline data, we assume that the exposure is not a one-time event.
To check this assumption, let’s compare (1) data we collected by randomly choosing 24 participants for preliminary eligibility analysis (Feb 2) and (2) data collected for the same participants during the main eligibility check (Feb 10).
From the preliminary eligibility check, I retrieved the user_id of the 24 randomly chosen participants and searched for them in our main eligibility results CSV file.
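The lookup described above can be sketched as follows. This is only an illustration: the column names (`user_id`, `home_timeline`, `following`), the toy rows, and the helper `lookup_users` are assumptions, not the actual layout of the eligibility results file.

```python
import csv
import io

# Toy stand-in for the main eligibility results file (hypothetical columns and IDs).
MAIN_RESULTS_CSV = """user_id,home_timeline,following
1001,TRUE,FALSE
1002,FALSE,FALSE
1003,TRUE,TRUE
"""

def lookup_users(csv_text, user_ids):
    """Return the rows of the results file whose user_id is in user_ids."""
    wanted = set(user_ids)
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["user_id"]: row for row in reader if row["user_id"] in wanted}

# IDs from the preliminary sample (hypothetical); 9999 is absent from the main file.
matches = lookup_users(MAIN_RESULTS_CSV, ["1001", "1002", "9999"])
for uid, row in matches.items():
    print(uid, row["home_timeline"], row["following"])
```

Reading user_id as a string avoids precision problems with long numeric IDs.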
In the preliminary check, two users were in the "no following but found in home timeline" condition: user_id 20416629 and 26845113.
- Home timeline (T), Following (F)

user_id = 20416629
- Before: Home timeline (T), Following (F)
- After (List 1): Home timeline (T), Following (F). Regardless of which list we use, this user still follows no qualifying accounts but is exposed to low-quality tweets in the home timeline.

user_id = 26845113
- Before: Home timeline (T), Following (F)
- After (List 1): Home timeline (F), Following (F)
Based on the same list (List 1), no low-quality tweets are found in this user's home timeline anymore. This is evidence against our assumption.
However, when we switch to the longer list (List 4), the status becomes Home timeline (T), Following (T):
- In List 4, direct tweets from [...] and an indirect retweet from RepStefanik are shown in the home timeline.
So, of the two cases where users were not following any eligible accounts but had previously been exposed to indirect tweets from these accounts, one supports our assumption while the other does not.
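As a rough illustration of this tally, the sketch below encodes the before/after flags reported above for the two users under List 1 and checks whether home-timeline exposure persisted. The function name and the data structure are assumptions made for the example.

```python
# (home_timeline, following) flags for the two users, as reported above.
before = {"20416629": (True, False), "26845113": (True, False)}
after_list1 = {"20416629": (True, False), "26845113": (False, False)}

def exposure_persisted(before, after):
    """For users exposed only via the home timeline before, report whether
    low-quality tweets are still found in their home timeline afterwards."""
    result = {}
    for uid, (in_timeline, following) in before.items():
        if in_timeline and not following and uid in after:
            result[uid] = after[uid][0]
    return result

persisted = exposure_persisted(before, after_list1)
n_support = sum(persisted.values())
print(f"{n_support} of {len(persisted)} cases support the assumption")
```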
2 List 4
We are likely to choose List 4 (N=1515). The problem is that the list is long, so muting every account on it will take a lot of time. In addition, given the power-law distribution of followers, almost all of the muting treatment will operate through the accounts with the largest numbers of followers.
Brendan’s suggestion: we reverse rank the accounts by number of followers in the pilot and select some cutoff on how many accounts to include (which we can show covers X% of all eligible accounts) and then mute 70% of those accounts.
Hence, I used the Twitter API to retrieve each account's number of followers, sorted the accounts by follower count in descending order, and calculated the cumulative percentage of followers, in order to determine how many accounts would be muted under a given threshold (e.g., 99% of total followers).
If we set the threshold at 99% (i.e., keep only the top accounts that together account for 99% of the total sum of followers), we have 813 accounts (70% = ~570 accounts to mute; ~3 hours per user).
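The cutoff procedure can be sketched as below. The follower counts are toy numbers, and two details are assumptions rather than the rule actually used: the account that crosses the threshold is kept, and the 70% muting share is rounded up.

```python
import math

def accounts_within_threshold(followers, threshold):
    """Rank accounts by follower count (descending) and keep them until the
    cumulative share of total followers reaches the threshold."""
    ranked = sorted(followers.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(followers.values())
    kept, running = [], 0
    for name, count in ranked:
        if running >= threshold * total:
            break  # threshold already reached by the accounts kept so far
        kept.append(name)
        running += count
    return kept

# Toy follower counts (hypothetical), total = 1000.
followers = {"a": 700, "b": 200, "c": 60, "d": 35, "e": 5}

kept_95 = accounts_within_threshold(followers, 0.95)
kept_99 = accounts_within_threshold(followers, 0.99)
n_to_mute = math.ceil(0.7 * len(kept_99))  # 70% of the shortened list
print(kept_95, kept_99, n_to_mute)
```

Applied to the figures above, the same rounding gives 570 accounts to mute out of 813.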
We now have a shortened list of low-quality accounts to mute for each cutoff (99% vs. 95%). But how well does each shortened list cover the eligible accounts?
- [...]: If TRUE, then the low-quality account is included in the eligible set (= there are eligible users who follow that account)
- In_Hometimeline: [...]
It seems that the 95% threshold better captures eligible accounts (those accounts that are actually followed by users and/or found in users' home timelines).
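One way to make this comparison concrete is sketched below. The account names and set contents are toy data, and both coverage measures are assumptions about how "capturing eligible accounts" might be quantified, not the computation actually used here.

```python
def coverage(shortlist, followed, in_timeline):
    """Return (share of the shortlist that is eligible,
               share of all eligible accounts that the shortlist contains)."""
    eligible = followed | in_timeline  # followed and/or found in home timelines
    hits = [account for account in shortlist if account in eligible]
    return len(hits) / len(shortlist), len(hits) / len(eligible)

# Toy data (hypothetical): accounts followed by eligible users, and accounts
# found in eligible users' home timelines.
followed = {"a", "c"}
in_timeline = {"b", "x"}

cov_95 = coverage(["a", "b", "c"], followed, in_timeline)
cov_99 = coverage(["a", "b", "c", "d"], followed, in_timeline)
print("95% list:", cov_95)
print("99% list:", cov_99)
```

In this toy case the shorter list has a higher eligible share (1.0 vs. 0.75), while both lists contain the same fraction of all eligible accounts. This shows why the two measures should be reported separately.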