This is an important and interesting article about a large-scale field experiment in the Finnish early childhood education and care (ECEC) sector. The article is well written, the field experiment was well executed, and I very much enjoyed reading about it. Implementing a research design in the field requires a number of adaptations, and the authors guide the reader carefully through all stages of the implementation. They also generously share not only their reflections and considerations on how best to implement the intervention, but also their retrospective evaluation of the choices they made and thoughts on approaches that might have been better. Researchers conducting similar field experiments in the future will greatly benefit from these insights.
The authors find only negligible effects of the intervention on academic and socioemotional skills, results that deviate from existing empirical evidence. In particular, a similar intervention conducted in Norway finds treatment effects of .12 and .13 on a composite index of measured skills at the post-intervention and one-year follow-up surveys (Rege et al., 2024). A possible explanation for the difference is that the intervention changed practice more in the Norwegian experiment than in the Finnish one: While the existing Norwegian preschool curriculum is very nonspecific and unstructured, with a tradition of emphasising free play rather than didactic learning, the Finnish tradition appears to be more didactic. The Finnish intervention built directly on the existing framework, extending it to five-year-olds, while the Norwegian experiment introduced a new, structured and play-based curriculum. This could explain not only the lack of treatment effects on children whose counterfactual would be childcare, but also the large effects on children whose counterfactual would be home care.
Another notable difference between the Finnish and Norwegian experiments is the scale: While the Norwegian experiment was confined to a specific region and 71 childcare centres that volunteered to participate, the Finnish one covered 956 centres throughout the country. It is possible that the Norwegian centres volunteered to take part because they were particularly interested in and motivated to implement this new type of curriculum, in which case the Norwegian results may not generalise to the full population of centres. This illustrates the importance of large-scale interventions as a knowledge foundation for policymakers before nationwide reforms are implemented.
While it is crucial for the quality of implementation to allow municipalities and centres the necessary autonomy to take ownership of the intervention, this also implies less control over the implementation and poses a threat to internal validity. In particular, to ensure sufficient resources to implement the intervention, municipalities relocated their most experienced teachers from control to treated centres. Since this implies that control centres were also affected by the intervention, and some even closed down, this relocation should not be considered an integral part of a bundled treatment. I believe this issue deserves more attention in the article.
It appears that baseline test scores were not controlled for in the analyses, possibly because children in home care were not assessed at baseline. It would be useful to have this clarified and to understand the authors' reasoning on this issue.
Finally, it was interesting to learn about the impressive “culture of experimentation” in the Finnish government and among Finnish policymakers, an example for other countries to follow.