WTA_distribution

1 Data Preparation

  • w1_users: Wave 1 responses of users who took wave 2 survey so far (247 obs)

  • w2_users: These same respondents’ wave 2 responses

Based on the above data frames, I made a merged dataset wta_df!

library(tidyverse)
library(DT)
library(haven)
library(readr)


# load raw data 
df_w1 <- readRDS("~/Downloads/2023-095b_files/2023-095a_client_wave-1.rds")
df_w1_supp <- readRDS("~/Downloads/2023-095b_files/2023-095a_client_wave-1_supp.rds")
df_w2 <- readRDS("~/Downloads/2023-095b_files/2023-095b_client_wave-2.rds")


# bind wave 1 data + filter only Progress == 100 
df_w1 = rbind(df_w1, df_w1_supp)
df_w1 |>
  select(-user_id) |> 
  filter(Progress==100) -> df_w1

# invited users list (for merge)
Users_to_invite_Wave2 <- read_csv("Users_to_invite_Wave2.csv", 
    col_types = cols(user_id = col_character()))
Users_to_invite_Wave2 |> 
  select(user_id) |> 
  mutate(userid = user_id) -> for_merge

# merged with invited user list 
df_w2 |>
  merge(for_merge, by='userid') -> w2_users

# merged with invited user list 
w2_users |> 
  select(userid, user_id) |>
  merge(df_w1, by='userid') -> w1_users
w1_users |>
  select(userid, WTA_1_18, WTA_1_2, WTA_2) |> 
  mutate(assigned_wta_w1 = ifelse(!is.na(WTA_2),"open-ended",
                                  ifelse(is.na(WTA_1_2), "scale_within", 
                                         "scale_over30")),
         WTA_1_18 = as.integer(WTA_1_18),
         WTA_1_2 = as.integer(WTA_1_2),
         WTA_2 = as.integer(WTA_2),
         wta_w1 = ifelse(!is.na(WTA_1_18), WTA_1_18, 
                      ifelse(!is.na(WTA_1_2), WTA_1_2, WTA_2))) |>
  select(userid, wta_w1, assigned_wta_w1) -> w1_wta
w2_users |>
  select(userid, wta_2:wta_2_again_compre2) |>
  mutate(
    wta_w2 = ifelse(is.na(wta_2_again), wta_2, wta_2_again),
    
    wta_2_compre1 = as.integer(wta_2_compre1),
    wta_2_compre2 = as.integer(wta_2_compre2),
    wta_2_compre1_over = as.integer(wta_2_compre1_over),
    wta_2_compre2_over = as.integer(wta_2_compre2_over),
    wta_2_again_compre1 = as.integer(wta_2_again_compre1),
    wta_2_again_compre2 = as.integer(wta_2_again_compre2),
    
    wta_w2_compre1 = ifelse(is.na(wta_2_again), 
                           wta_2_compre1, wta_2_again_compre1),
    wta_w2_compre1 = ifelse(is.na(wta_w2_compre1), 
                            wta_2_compre1_over, wta_w2_compre1),
    wta_w2_compre2 = ifelse(is.na(wta_2_again), 
                           wta_2_compre2, wta_2_again_compre2), 
    wta_w2_compre2 = ifelse(is.na(wta_w2_compre2), 
                            wta_2_compre2_over, wta_w2_compre2),
    
    w2_compre1 = ifelse(wta_w2_compre1==1, 1, 0),
    w2_compre2 = ifelse(wta_w2_compre2==2, 1, 0)
  ) |> 
    select(userid, wta_w2, w2_compre1, w2_compre2) -> w2_wta
w1_wta |> 
  merge(w2_wta, by='userid') -> wta_df

1.1 Description of wta_df

  • wta_w1: Wave 1 survey amount of WTA

  • assigned_wta_w1: In Wave 1, what condition was the user assigned to

    • scale_within: user chose a number within 0~30 scale

    • scale_over30: user chose ‘over 30’ and then put the WTA amount

    • open-ended: user assigned to open-ended version (as in Wave 2)

  • wta_w2: WTA amount in Wave 2

  • w2_compre1, w2_compre2: 1 if the user understood (got the answer right), 0 otherwise. Non-responses are NA/empty cells.

wta_df |> datatable()

2 2X2 comparison

Brendan: how many respondents were under $15 last time vs. this time?

Wave 1 WTA < $15Wave 1 WTA >= $15
Wave 2 WTA < $158252
Wave 2 WTA >= $152686
  • cf) We didn’t remove (filtered out) those with WTA >= $15 after Wave 1 (N=138).

  • Note that in Wave 2, we used open-ended ver.

  • (unfavorable switch) 26 users switched from Wave 1 under $15 –> Wave 2 over $15

  • (favorable switch) 52 users switched from Wave 1 over $ 15 –> Wave 2 under $15

  • 86 (consistently WTA >= $15)

  • 82 (consistently WTA < $15)

3 Open-ended responses

Right after eliciting WTA, we asked the following: “What were your main considerations as you decided on your answer? Please write at least three sentences describing your thought process.” Open-ended responses to this question can be found in the below table.

3.1 Description of the table

(each row represents each respondent)

  • wta_w1, wta_w2: WTA amount in Waves 1 and 2, respectively

  • wta_considerations: open-ended responses right after WTA (what were your main considerations…)

  • feedback: feedback at the end of the survey

wta_df |>
    mutate(
        w1_under15 = as.factor(ifelse(wta_w1 < 15, 1, 0)),
        w2_under15 = as.factor(ifelse(wta_w2 < 15, 1, 0)),
    ) |> 
  merge(w2_users, by='userid') |>
  mutate(
    wta_considerations = ifelse(!is.na(wta_outro), wta_outro, wta_outro2) 
  ) |>
  select(wta_w1, wta_w2, wta_considerations, feedback) |>
  mutate(
    wta_w1 = as.integer(wta_w1),
    wta_w2 = as.integer(wta_w2)
  ) |>
  datatable(filter='top')

3.1.1 Only among those who got open-ended ver. in Wave 1:

  • w2_compre1,w2_compre2: if correctly understood, 1; otherwise 0
wta_df |>
    mutate(
        w1_under15 = as.factor(ifelse(wta_w1 < 15, 1, 0)),
        w2_under15 = as.factor(ifelse(wta_w2 < 15, 1, 0)),
    ) |> 
  merge(w2_users, by='userid') |>
  mutate(
    wta_considerations = ifelse(!is.na(wta_outro), wta_outro, wta_outro2) 
  ) |>
  filter(assigned_wta_w1=="open-ended") |>
  select(wta_w1, wta_w2, w2_compre1, w2_compre2, wta_considerations) |> 
  datatable(filter='top')

3.2 Notes

I noticed that some participants calculated on a weekly basis.

  • (WTA = $20) $5 per week for 4 weeks is $20; Being denied access to certain media I may use or enjoy is worth at least 5 a week

  • (WTA = $40) In thought of $10 a week I’d like to help your survey I’d like to see what it is like not to follow these accounts

  • (WTA = $80) $20 per week would be a fair amount to have me mute certain accouts

I also noticed that the $10 example may have an anchoring effect ($10 being the mode):

  • You offered $10 in the example. I probably would have done it for less. That number seemed like a fair amount. I’d be happy to take part.

🔍 Should we (1) lower the example $ and (2) re-calculate the cutoff based on $ X 4 weeks?

  • e.g., $5 in the example, $20 as the cutoff?

Additionally, many participants were curious about the specific accounts we would mute for them.

In general, I think WTA does a good job measuring demand (e.g., participants with lower WTA ratings claiming they don’t care or care less because they don’t use Twitter often or follow news on the platform, etc.).


4 Scatter Plot

Ro’ee: I think it would be useful at some point to create a scatter plot with WTA in round 1 in the x-axis and round 2 in the y-axis (and also just report the correlation). It’s a bit surprising that so many people provided a different answer, we should figure out if this is related to the question wording. 

+Among those who received open-ended ver. in wave 1

4.1 Plot 1.

  • 14 cases removed (over $100)
library(tidyverse)
library(ggplot2)
library(DT)
library(haven)
library(ggpubr)
library(cowplot)
library(ggthemes)
library(readr)
library(ggExtra)
library(scales)
wta_df <- read_csv("wta/wta_df.csv", col_types = cols(userid = col_character(), 
                                                      wta_w1 = col_integer(), 
                                                      wta_w2 = col_integer(), 
                                                      w2_compre1 = col_integer(), 
                                                      w2_compre2 = col_integer()))


plot0 <- ggplot(wta_df, mapping = aes(wta_w1, wta_w2)) +
  geom_abline(intercept = 0, slope = 1, lty = 3, color = 'black') +
  geom_hline(yintercept = 15, lty = 2, color = 'red') + 
  geom_vline(xintercept = 15, lty = 2, color = 'red') + 
  geom_jitter(alpha=0.6) +
  stat_cor(method = 'pearson') + 
  xlim(c(0, 100)) + 
  ylim(c(0, 100)) +
  theme_few() +
  theme(legend.position = 'bottom') +
  xlab('WTA distribution in Wave 1 (0~100)') + 
  ylab('WTA distribution in Wave 2 (0~100)')

ggMarginal(plot0, fill='gray') 

4.2 Plot 2.

  • Group by category based on WTA $

    • always-under15: user chose below $15 in both waves

    • always-over15: user chose $15 or over in both waves

    • favorable-shift: user chose $15 or over in wave 1 but shifted below $15 in wave 2

    • unfavorable-shift: user chose below $15 in wave 1 but shifted $15 or over in wave 2

Among those who got open-ended ver. in Wave 1,

  • always-over15: 33 (28.4%)

  • always-under15: 47 (40.5%)

  • favorable-shift: 20 (17.2%)

  • unfavorable-shift: 16 (13.8%)

cf) Among those who got scale ver. in Wave 1,

  • always-over15: 53 (40.8%)

  • always-under15: 35 (26.9%)

  • favorable-shift: 32 (24.6%)

  • unfavorable-shift: 10 (7.69%)

wta_df |>
  filter(assigned_wta_w1=='open-ended') |>
    mutate(
    category = ifelse(
      wta_w1 < 15 & wta_w2 < 15, "always-under15", 
      ifelse(wta_w1 >= 15 & wta_w2 >= 15, "always-over15", 
             ifelse(wta_w1 >=15 & wta_w2 < 15, "favorable-shift", "unfavorable-shift")))
    ) |>
    filter(!is.na(wta_w1)) -> op_wta_df


plot1_op <- ggplot(op_wta_df, mapping = aes(wta_w1, wta_w2, color=category)) +
  geom_abline(intercept = 0, slope = 1, lty = 3, color = 'black') +
  geom_hline(yintercept = 15, lty = 2, color = 'red') + 
  geom_vline(xintercept = 15, lty = 2, color = 'red') + 
  geom_jitter(alpha=0.5) +
  stat_cor(method = 'pearson') + 
  xlim(c(0, 100)) + 
  ylim(c(0, 100)) +
  theme_few() +
  theme(legend.position = 'bottom') +
  xlab('WTA distribution in Wave 1 (0~100)') + 
  ylab('WTA distribution in Wave 2 (0~100)') +
  ggtitle('Scatter Plot', subtitle = 'Only among those with open-ended ver. in Wave 1')


ggMarginal(plot1_op, groupColour = TRUE, groupFill = TRUE)