
Appendix – Statistical analysis

Statistical analysis of the test conducted at Original Coffee

A difference-in-difference (DiD) regression was conducted to estimate the effect size and to test whether the effects of the interventions were statistically significant. With this approach, the change in the share of reusable cups in the test cafés, relative to the change in the control cafés, is used as the estimate of the treatment effect. The results are presented in Table A1.
The estimated DiD treatment effect for the first intervention period (prompt & signs) is 6.26 percentage points (pp) relative to the control (CI = [3.53–9.00]). The effect is statistically significant at the 1% level.
The estimated DiD treatment effect for the second intervention period (prompt & signs & add-on) is 8.33 pp relative to the control (CI = [5.91–10.74]). This effect is statistically significant at the 1% level. When the effect is tested separately for each intervention, it remains significant at the 1% level, although the effect size varies (8.05 pp, 4.64 pp, and 11.89 pp). This indicates that the interventions had positive effects on the use of reusable cups for to-go coffee in the two intervention periods compared to the control.
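The logic of the DiD estimate can be illustrated with a minimal sketch. It assumes the simplest two-group, two-period case without covariates, in which the interaction coefficient of a DiD regression equals the difference between the two changes in group means. The data and the name `did_estimate` are hypothetical and are not taken from the study.

```python
from statistics import mean

# Hypothetical daily shares of reusable cups (fractions of all to-go cups sold).
# Illustrative numbers only; these are NOT the study's data.
shares = {
    ("test", "baseline"): [0.04, 0.05, 0.06],
    ("test", "intervention"): [0.11, 0.12, 0.13],
    ("control", "baseline"): [0.05, 0.05, 0.05],
    ("control", "intervention"): [0.05, 0.06, 0.07],
}


def did_estimate(shares):
    """Return the 2x2 difference-in-difference estimate in percentage points."""
    # Change over time in the test cafés
    change_test = mean(shares[("test", "intervention")]) - mean(shares[("test", "baseline")])
    # Change over time in the control cafés
    change_control = mean(shares[("control", "intervention")]) - mean(shares[("control", "baseline")])
    # The treatment effect is the difference between the two changes
    return 100 * (change_test - change_control)


effect_pp = did_estimate(shares)
print(f"DiD estimate: {effect_pp:.2f} pp")
```

The sketch returns only the point estimate. Standard errors and significance levels such as those in Table A1 would require a regression with robust standard errors.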
Two-sample t-tests were conducted to test the effect of each add-on. To identify the effect of the add-on, we compare the mean share in the first intervention period (signs & prompt) with the mean share in the second intervention period (signs & prompt & add-on). This is done separately for each test café. The estimated effects of the add-ons in Bredgade and Store Kongensgade were both below 2 pp and not significant. The estimated effect of the add-on in Istedgade (lottery mechanism at the point of purchase) was 6.8 pp compared to the first intervention period. This effect is statistically significant at the 1% level. An overview of the effects can be found in the table below.
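A two-sample comparison of this kind can be sketched as follows. The text does not state whether equal variances were assumed, so the sketch uses Welch's unequal-variance version as an assumption. The function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (difference in means, standard error, t statistic, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    diff = mean(sample_b) - mean(sample_a)
    se = sqrt(var_a / n_a + var_b / n_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (var_a / n_a + var_b / n_b) ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return diff, se, diff / se, df


# Hypothetical daily shares in one café; NOT the study's data.
period_1 = [0.10, 0.12, 0.11, 0.13, 0.09]  # signs & prompt
period_2 = [0.17, 0.18, 0.16, 0.19, 0.20]  # signs & prompt & add-on

diff, se, t_stat, df = welch_t(period_1, period_2)
print(f"difference = {100 * diff:.1f} pp, t = {t_stat:.2f}, df = {df:.1f}")
```

The p-value is obtained by comparing the t statistic with a t distribution with `df` degrees of freedom, which a statistics package would normally do.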
By graphical inspection, we identified a few outliers (e.g. the first two days of the first treatment period in BR). In the statistical analyses, outliers were identified using the Z-score: observations that differed by more than 3 standard deviations from the mean were defined as outliers. We re-ran the regressions on the dataset without the outliers. The effects were still statistically significant, although the effect sizes were smaller than when the outliers were included in the estimation.
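The Z-score rule described above can be sketched as follows. The sketch assumes the sample standard deviation and a single pooled mean. Whether the study computed Z-scores per café or per period is not stated. The function name `split_outliers` and the data are hypothetical.

```python
from statistics import mean, stdev


def split_outliers(values, threshold=3.0):
    """Split values into (kept, outliers) using the rule |z| > threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    kept, outliers = [], []
    for v in values:
        z = (v - mu) / sigma
        if abs(z) > threshold:
            outliers.append(v)
        else:
            kept.append(v)
    return kept, outliers


# Hypothetical daily shares with one extreme day; NOT the study's data.
daily_shares = [0.09, 0.10, 0.11] * 6 + [0.10] + [0.60]

kept, outliers = split_outliers(daily_shares)
print(f"kept {len(kept)} observations, dropped {outliers}")
```

With the sample standard deviation, a single observation can only exceed 3 standard deviations when the sample has at least 11 observations, so the rule cannot flag anything in very short series.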
Table A1. Summary of the effects measured at Original Coffee (P<|t|).

Difference-in-difference (P<|t|)

| Intervention | Store Kongensgade, Bredgade & Istedgade | Store Kongensgade | Bredgade | Istedgade |
|---|---|---|---|---|
| Observations | N=135 | N=81 | N=81 | N=102 |
| Prompt & signs | 0.0626*** (0.0093) | | | |
| Prompt & signs & add-on nudge | 0.0833** (0.0230) | | | |
| Prompt & signs & social feedback loop | | 0.0805*** (0.0006) | | |
| Prompt & signs & lottery mechanism (return) | | | 0.0464*** (0.0006) | |
| Prompt & signs & lottery mechanism (buying) | | | | 0.1198*** (0.0004) |

Robust standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1

Two-sample t-test (mean)

| Add-on | Store Kongensgade | Bredgade | Istedgade |
|---|---|---|---|
| Observations | N=28 | N=28 | N=35 |
| Social feedback loop | -0.0007 (0.0209) | | |
| Lottery mechanism (return) | | 0.0142 (0.0265) | |
| Lottery mechanism (buying) | | | 0.0682*** (0.0162) |

Standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1

Statistical analysis of the test conducted at Nordea

This section elaborates on the statistical analysis of the experiment conducted at Nordea. Table A2 summarizes the results.
To test the hypothesis that the share of green cups was larger in the treatment period than in the control period, a one-tailed t-test was conducted (Figure 2). The test shows an increase in the average weekly share of green cups of 9.78 percentage points (CI: [7.97–11.59]) from the control period to the treatment period. The difference is significant at the 1% level.
Figure 3 divides the treatment period into two. A one-tailed t-test was conducted to test whether the increase from the control period to the first treatment period is statistically significant. The test shows that the increase is 5.68 pp (CI: [1.44–9.93]) and that the difference is significant at the 5% level. A second one-tailed t-test was conducted to test whether the increase from the control period to the second treatment period is statistically significant. We find an increase of 11.04 pp compared to the control (CI: [9.51–12.58]), which is significant at the 1% level.
To compare the two intervention periods, a two-tailed t-test was performed to see whether there is a statistically significant difference between them. The average weekly share of green cups sold in the second treatment period was 5.36 pp (CI: [2.39–8.33]) larger than in the first treatment period. The difference is significant at the 1% level.
Figure 4 displays the weekly share of green cups sold during the entire experiment. It indicates that there is a learning period in which the use of green cups gradually increases, especially in the first intervention period. To test the statistical significance of these trends, one-tailed t-tests were conducted for the three periods. We call the gradual increase the average weekly growth. In the control period, the average weekly growth is not significantly different from zero (0.03 pp). The same goes for the second treatment period (0.14 pp). The first treatment period has an average weekly growth in the share of green cups of 2.06 pp (CI: [0.61–3.51], p = 0.0101 in the one-tailed t-test), which is significant at the 5% level. This suggests that there is a learning effect at the beginning of the intervention and that the gradual increase does not continue in the second intervention period.
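One possible reading of the growth test is sketched below. It assumes that "average weekly growth" is the mean of the week-to-week changes in the share, tested against zero with a one-sample t-test. The text does not confirm this construction. The function name `weekly_growth_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def weekly_growth_t(weekly_shares):
    """Return (average weekly growth, standard error, t statistic, degrees of freedom)."""
    # Week-to-week changes in the share of green cups
    growth = [later - earlier for earlier, later in zip(weekly_shares, weekly_shares[1:])]
    n = len(growth)
    avg = mean(growth)
    se = stdev(growth) / sqrt(n)  # standard error of the mean change
    return avg, se, avg / se, n - 1


# Hypothetical weekly shares for one period; NOT the study's data.
weekly_shares = [0.10, 0.12, 0.15, 0.16, 0.19]

avg, se, t_stat, df = weekly_growth_t(weekly_shares)
print(f"average weekly growth = {100 * avg:.2f} pp, t = {t_stat:.2f}, df = {df}")
```

For a one-tailed test of positive growth, the t statistic is compared with the upper tail of a t distribution with `df` degrees of freedom.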
Table A2. Summary of the effects measured at Nordea’s café (P<|t|).

Two-sample t-test (mean)

| Comparison | n | Nordea café |
|---|---|---|
| Descriptive social performance feedback loop (combined) | 27216 | 0.0978*** (0.0086) |
| Descriptive social performance feedback loop (Under iNudgeyou) | 11891 | 0.0568** (0.0140) |
| Descriptive social performance feedback loop (Under Nordea Café) | 21824 | 0.1104*** (0.0072) |
| Descriptive social performance feedback loop (comparing the two intervention periods) | 20717 | 0.0536*** (0.0139) |

One-sample t-test

| Period | Nordea café |
|---|---|
| Time effect for control period | -0.0003 (0.0050) |
| Time effect for intervention 1 | 0.0206** (0.0045) |
| Time effect for intervention 2 | 0.0014 (0.0086) |

Standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1

Statistical analysis of the tests conducted in Sweden

To test whether the nudges produced effects significantly different from zero, we ran difference-in-difference regressions for the three Swedish interventions. The results are displayed in Table A3. The regressions are run on the share of reusable cups, with the variable “nudge” as the main variable of interest. As the table shows, the nudge at Nordrest had a significant effect on the share. This finding is strengthened by the fact that neither the time effect nor the effect of belonging to a particular group of stores was significant. The R² value in the Nordrest column indicates that the model explains around 30 percent of the variation in the share of reusable cups at Nordrest. Note that the magnitude of the positive effect at Nordrest is only 1.13 percentage points. For Espresso House and Circle K, the estimated effects of the interventions are not statistically different from zero.
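Based on the variable names reported for these regressions, the specification can plausibly be written as below. This is a reconstruction under assumptions, not the authors' stated model. It assumes that time_treatment is a dummy for the intervention period, that butik_treatment is a dummy for the treated group of stores, and that nudge is their interaction. The indices i (store) and t (time) are introduced here for illustration.

```latex
\text{share}_{it} = \beta_0
  + \beta_1\,\text{nudge}_{it}
  + \beta_2\,\text{customers}_{it}
  + \beta_3\,\text{time\_treatment}_{t}
  + \beta_4\,\text{butik\_treatment}_{i}
  + \varepsilon_{it},
\qquad
\text{nudge}_{it} = \text{time\_treatment}_{t} \times \text{butik\_treatment}_{i}
```

Under this reading, the coefficient on nudge is the DiD estimate, and the other two dummies absorb the common time effect and the baseline difference between store groups.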
Table A3. Difference-in-difference regressions for Sweden.

| Share reusable cups | Nordrest | Espresso House | Circle K |
|---|---|---|---|
| nudge | 0.01126*** (0.00355) | 0.00264 (0.00236) | -0.00017 (0.00018) |
| customers | 0.00001** (0.000004) | -0.000001 (0.00001) | 0.000001*** (0.0000003) |
| time_treatment | 0.00082 (0.00062) | -0.00020 (0.00115) | -0.00014 (0.00012) |
| butik_treatment | 0.00016 (0.00056) | -0.00038 (0.00075) | 0.00020 (0.00016) |
| Constant | -0.00472** (0.00237) | 0.00135 (0.00097) | -0.00034 (0.00024) |
| Observations | 78 | 332 | 392 |
| R² | 0.31704 | 0.00889 | 0.04978 |
| Adjusted R² | 0.27962 | -0.00323 | 0.03996 |

Regressions run with robust standard errors. Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01