WTA_distribution
Do Won Kim
2024-03-13
1 Data Preparation
w1_users
: Wave 1 responses of users who took wave 2 survey so far (247 obs)w2_users
: These same respondents’ wave 2 responses
Based on the above data frames, I made a merged dataset
wta_df
!
library(tidyverse)
library(DT)
library(haven)
library(readr)
# load raw data
df_w1 <- readRDS("~/Downloads/2023-095b_files/2023-095a_client_wave-1.rds")
df_w1_supp <- readRDS("~/Downloads/2023-095b_files/2023-095a_client_wave-1_supp.rds")
df_w2 <- readRDS("~/Downloads/2023-095b_files/2023-095b_client_wave-2.rds")
# bind wave 1 data + filter only Progress == 100
df_w1 = rbind(df_w1, df_w1_supp)
df_w1 |>
select(-user_id) |>
filter(Progress==100) -> df_w1
# invited users list (for merge)
Users_to_invite_Wave2 <- read_csv("Users_to_invite_Wave2.csv",
col_types = cols(user_id = col_character()))
Users_to_invite_Wave2 |>
select(user_id) |>
mutate(userid = user_id) -> for_merge
# merged with invited user list
df_w2 |>
merge(for_merge, by='userid') -> w2_users
# merged with invited user list
w2_users |>
select(userid, user_id) |>
merge(df_w1, by='userid') -> w1_users
w1_users |>
select(userid, WTA_1_18, WTA_1_2, WTA_2) |>
mutate(assigned_wta_w1 = ifelse(!is.na(WTA_2),"open-ended",
ifelse(is.na(WTA_1_2), "scale_within",
"scale_over30")),
WTA_1_18 = as.integer(WTA_1_18),
WTA_1_2 = as.integer(WTA_1_2),
WTA_2 = as.integer(WTA_2),
wta_w1 = ifelse(!is.na(WTA_1_18), WTA_1_18,
ifelse(!is.na(WTA_1_2), WTA_1_2, WTA_2))) |>
select(userid, wta_w1, assigned_wta_w1) -> w1_wta
w2_users |>
select(userid, wta_2:wta_2_again_compre2) |>
mutate(
wta_w2 = ifelse(is.na(wta_2_again), wta_2, wta_2_again),
wta_2_compre1 = as.integer(wta_2_compre1),
wta_2_compre2 = as.integer(wta_2_compre2),
wta_2_compre1_over = as.integer(wta_2_compre1_over),
wta_2_compre2_over = as.integer(wta_2_compre2_over),
wta_2_again_compre1 = as.integer(wta_2_again_compre1),
wta_2_again_compre2 = as.integer(wta_2_again_compre2),
wta_w2_compre1 = ifelse(is.na(wta_2_again),
wta_2_compre1, wta_2_again_compre1),
wta_w2_compre1 = ifelse(is.na(wta_w2_compre1),
wta_2_compre1_over, wta_w2_compre1),
wta_w2_compre2 = ifelse(is.na(wta_2_again),
wta_2_compre2, wta_2_again_compre2),
wta_w2_compre2 = ifelse(is.na(wta_w2_compre2),
wta_2_compre2_over, wta_w2_compre2),
w2_compre1 = ifelse(wta_w2_compre1==1, 1, 0),
w2_compre2 = ifelse(wta_w2_compre2==2, 1, 0)
) |>
select(userid, wta_w2, w2_compre1, w2_compre2) -> w2_wta
1.1 Description of
wta_df
wta_w1
: Wave 1 survey amount of WTAassigned_wta_w1
: In Wave 1, what condition was the user assigned toscale_within
: user chose a number within 0~30 scalescale_over30
: user chose ‘over 30’ and then put the WTA amountopen-ended
: user assigned to open-ended version (as in Wave 2)
wta_w2
: WTA amount in Wave 2w2_compre1
,w2_compre2
: 1 if the user understood (got the answer right), 0 otherwise. Non-responses are NA/empty cells.
2 2X2 comparison
Brendan: how many respondents were under $15 last time vs. this time?
Wave 1 WTA < $15 | Wave 1 WTA >= $15 | |
---|---|---|
Wave 2 WTA < $15 | 82 | 52 |
Wave 2 WTA >= $15 | 26 | 86 |
cf) We didn’t remove (filtered out) those with WTA >= $15 after Wave 1 (N=138).
Note that in Wave 2, we used open-ended ver.
–
(unfavorable switch) 26 users switched from Wave 1 under $15 –> Wave 2 over $15
(favorable switch) 52 users switched from Wave 1 over $ 15 –> Wave 2 under $15
86 (consistently WTA >= $15)
82 (consistently WTA < $15)
3 Open-ended responses
Right after eliciting WTA, we asked the following: “What were your main considerations as you decided on your answer? Please write at least three sentences describing your thought process.” Open-ended responses to this question can be found in the below table.
3.1 Description of the table
(each row represents each respondent)
wta_w1
,wta_w2
: WTA amount in Waves 1 and 2, respectivelywta_considerations
: open-ended responses right after WTA (what were your main considerations…)feedback
: feedback at the end of the survey
wta_df |>
mutate(
w1_under15 = as.factor(ifelse(wta_w1 < 15, 1, 0)),
w2_under15 = as.factor(ifelse(wta_w2 < 15, 1, 0)),
) |>
merge(w2_users, by='userid') |>
mutate(
wta_considerations = ifelse(!is.na(wta_outro), wta_outro, wta_outro2)
) |>
select(wta_w1, wta_w2, wta_considerations, feedback) |>
mutate(
wta_w1 = as.integer(wta_w1),
wta_w2 = as.integer(wta_w2)
) |>
datatable(filter='top')
3.1.1 Only among those who got open-ended ver. in Wave 1:
w2_compre1
,w2_compre2
: if correctly understood, 1; otherwise 0
wta_df |>
mutate(
w1_under15 = as.factor(ifelse(wta_w1 < 15, 1, 0)),
w2_under15 = as.factor(ifelse(wta_w2 < 15, 1, 0)),
) |>
merge(w2_users, by='userid') |>
mutate(
wta_considerations = ifelse(!is.na(wta_outro), wta_outro, wta_outro2)
) |>
filter(assigned_wta_w1=="open-ended") |>
select(wta_w1, wta_w2, w2_compre1, w2_compre2, wta_considerations) |>
datatable(filter='top')
3.2 Notes
I noticed that some participants calculated on a weekly basis.
(WTA = $20) $5 per week for 4 weeks is $20; Being denied access to certain media I may use or enjoy is worth at least 5 a week
(WTA = $40) In thought of $10 a week I’d like to help your survey I’d like to see what it is like not to follow these accounts
(WTA = $80) $20 per week would be a fair amount to have me mute certain accouts
I also noticed that the $10 example may have an anchoring effect ($10 being the mode):
- You offered $10 in the example. I probably would have done it for less. That number seemed like a fair amount. I’d be happy to take part.
🔍 Should we (1) lower the example $ and (2) re-calculate the cutoff based on $ X 4 weeks?
- e.g., $5 in the example, $20 as the cutoff?
Additionally, many participants were curious about the specific accounts we would mute for them.
In general, I think WTA does a good job measuring demand (e.g., participants with lower WTA ratings claiming they don’t care or care less because they don’t use Twitter often or follow news on the platform, etc.).
4 Scatter Plot
Ro’ee: I think it would be useful at some point to create a scatter plot with WTA in round 1 in the x-axis and round 2 in the y-axis (and also just report the correlation). It’s a bit surprising that so many people provided a different answer, we should figure out if this is related to the question wording.
+Among those who received open-ended ver. in wave 1
4.1 Plot 1.
- 14 cases removed (over $100)
library(tidyverse)
library(ggplot2)
library(DT)
library(haven)
library(ggpubr)
library(cowplot)
library(ggthemes)
library(readr)
library(ggExtra)
library(scales)
wta_df <- read_csv("wta/wta_df.csv", col_types = cols(userid = col_character(),
wta_w1 = col_integer(),
wta_w2 = col_integer(),
w2_compre1 = col_integer(),
w2_compre2 = col_integer()))
plot0 <- ggplot(wta_df, mapping = aes(wta_w1, wta_w2)) +
geom_abline(intercept = 0, slope = 1, lty = 3, color = 'black') +
geom_hline(yintercept = 15, lty = 2, color = 'red') +
geom_vline(xintercept = 15, lty = 2, color = 'red') +
geom_jitter(alpha=0.6) +
stat_cor(method = 'pearson') +
xlim(c(0, 100)) +
ylim(c(0, 100)) +
theme_few() +
theme(legend.position = 'bottom') +
xlab('WTA distribution in Wave 1 (0~100)') +
ylab('WTA distribution in Wave 2 (0~100)')
ggMarginal(plot0, fill='gray')
4.2 Plot 2.
Group by
category
based on WTA $always-under15
: user chose below $15 in both wavesalways-over15
: user chose $15 or over in both wavesfavorable-shift
: user chose $15 or over in wave 1 but shifted below $15 in wave 2unfavorable-shift
: user chose below $15 in wave 1 but shifted $15 or over in wave 2
Among those who got open-ended ver. in Wave 1,
always-over15
: 33 (28.4%)always-under15
: 47 (40.5%)favorable-shift
: 20 (17.2%)unfavorable-shift
: 16 (13.8%)
cf) Among those who got scale ver. in Wave 1,
always-over15
: 53 (40.8%)always-under15
: 35 (26.9%)favorable-shift
: 32 (24.6%)unfavorable-shift
: 10 (7.69%)
wta_df |>
filter(assigned_wta_w1=='open-ended') |>
mutate(
category = ifelse(
wta_w1 < 15 & wta_w2 < 15, "always-under15",
ifelse(wta_w1 >= 15 & wta_w2 >= 15, "always-over15",
ifelse(wta_w1 >=15 & wta_w2 < 15, "favorable-shift", "unfavorable-shift")))
) |>
filter(!is.na(wta_w1)) -> op_wta_df
plot1_op <- ggplot(op_wta_df, mapping = aes(wta_w1, wta_w2, color=category)) +
geom_abline(intercept = 0, slope = 1, lty = 3, color = 'black') +
geom_hline(yintercept = 15, lty = 2, color = 'red') +
geom_vline(xintercept = 15, lty = 2, color = 'red') +
geom_jitter(alpha=0.5) +
stat_cor(method = 'pearson') +
xlim(c(0, 100)) +
ylim(c(0, 100)) +
theme_few() +
theme(legend.position = 'bottom') +
xlab('WTA distribution in Wave 1 (0~100)') +
ylab('WTA distribution in Wave 2 (0~100)') +
ggtitle('Scatter Plot', subtitle = 'Only among those with open-ended ver. in Wave 1')
ggMarginal(plot1_op, groupColour = TRUE, groupFill = TRUE)