PifferPilfer

Cold Winters, Academic Achievement and Income in Italy

Climate, ancestry, and cognitive outcomes at the municipal level

Davide Piffer
Feb 07, 2026

In a previous post I tested variants of the Cold Winters hypothesis using ancient DNA (aDNA) and cross-population comparisons. Those analyses suggested a recurring pattern: populations exposed to harsher winter environments tend to have higher cognitive polygenic scores, although the effect was relatively small.

The real question is methodological: are we confusing ancestry structure with selection? When you work with aDNA you are not comparing nations or ethnic groups. You are comparing individuals and archaeological populations that differ in date, geography, and mixture history, using polygenic scores and principal components computed from the genomes themselves.

The worry is that apparent climate effects can arise because colder and warmer regions have different ancestry compositions, and polygenic scores can correlate with ancestry for many reasons unrelated to selection (drift, migration, sampling, imperfect GWAS portability, and so on). The hard problem is separating an ecological gradient from population structure.

At the same time, there is also the opposite risk: overcontrolling. If selection really acted in certain environments, then ancestry and climate will partly line up, and aggressively partialling out structure can erase part of the signal.

This tension motivates looking at a simpler setting: a single country with shared institutions, where we can ask whether climate still predicts outcomes once demographic history is taken seriously.


Measuring winter severity directly: gradi-giorno

As the main climate variable I use gradi-giorno (heating degree-days), the metric the Italian state uses to regulate heating.

In simple terms, it works like this. For each day of the year, you take a reference indoor comfort temperature (20 °C). If the outdoor mean temperature is lower, you count the difference. A day with a mean of 5 °C contributes 15 degree-days; a day at 18 °C contributes 2; a day at or above 20 °C contributes zero. Sum these daily contributions across the year.

Places with cold, long winters accumulate large totals. Mild places accumulate small ones.
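The arithmetic above can be written as a minimal Python sketch. The function name and the example temperatures are illustrative assumptions, not part of any official tool; only the 20 °C base and the daily-shortfall rule come from the description above.

```python
from typing import Iterable

BASE_TEMP_C = 20.0  # reference indoor comfort temperature from the text


def heating_degree_days(daily_mean_temps_c: Iterable[float],
                        base_temp_c: float = BASE_TEMP_C) -> float:
    """Sum the daily shortfall of outdoor mean temperature below the base.

    Days at or above the base temperature contribute zero.
    """
    return sum(max(0.0, base_temp_c - t) for t in daily_mean_temps_c)


# Three illustrative days: 5 °C, 18 °C, and one day above 20 °C.
example_days = [5.0, 18.0, 23.0]
total = heating_degree_days(example_days)
print(total)  # 15 + 2 + 0 = 17.0
```

Passing a full year of daily means gives the annual total for a location.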

So this isn’t a climate index I constructed for the analysis. Gradi-giorno is an administrative measure designed for a practical purpose: quantifying how demanding winter is in day-to-day life, via heating needs. On this basis, Italy is officially classified into six heating zones (A–F), and those zones determine key rules, such as when and for how many hours per day heating systems can operate.
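As a rough illustration, mapping a degree-day total to a zone is a simple threshold lookup. The cut-offs below are not given in this post; they are assumed from the national heating regulation (D.P.R. 412/1993) and should be checked against the official text before use. The function name is hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds of zones A-E in degree-days; anything above is F.
# ASSUMPTION: thresholds taken from D.P.R. 412/1993, not from the post.
ZONE_UPPER_BOUNDS = [600, 900, 1400, 2100, 3000]
ZONE_LABELS = "ABCDEF"


def heating_zone(degree_days: float) -> str:
    """Return the heating zone letter (A warmest, F coldest)."""
    # bisect_left finds the first bound >= degree_days, so a value
    # exactly on a bound stays in the lower (warmer) zone.
    return ZONE_LABELS[bisect_left(ZONE_UPPER_BOUNDS, degree_days)]


# Hypothetical totals, chosen only to exercise each branch.
for dd in (550, 1200, 2400, 3500):
    print(dd, heating_zone(dd))
```

The regressions in this post use the continuous gradi-giorno total, which preserves variation within each zone.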

Italy's climate zones and the permitted heating periods
Map of the official gradi-giorno classification used for energy regulation. The metric sums, over the year, how much daily mean outdoor temperature falls below 20 °C; higher totals therefore indicate longer and/or harsher winters and greater heating demand. Municipalities are grouped into zones A (warmest, lowest totals) through F (coldest, highest totals). This administrative indicator provides a direct measure of winter severity independent of latitude and is used here as the study’s climate variable.

Controlling ancestry without genomes: surnames

We do not have genetic principal components for every municipality. But surnames carry information about demographic history. They are inherited, geographically clustered, and shaped by migration and isolation.

Using the approach described in Piffer et al. (2025), municipal surname distributions can be used to estimate how similar a comune is to historically northern versus southern Italian provinces. It is not DNA, but it is a reasonable proxy for population structure when genomes are unavailable.

Think of it as ancestry control with different data.


Outcomes and model

The main outcome is school performance from standardized INVALSI tests.

I run the same specification twice, once with school performance and once with per-capita income. The second model is not meant to “explain away” education through wealth; it provides a comparison. If climate is merely standing in for development, we should see it operate more strongly on income than on test results.

Predictors:

  • winter severity (gradi-giorno)

  • altitude

  • surname-based north/south admixture

  • population size

A key detail: the climate variable is residualized on admixture, altitude, and population. This forces it to represent the part of winter harshness that is not simply the demographic gradient under another name, and it keeps collinearity low.

In other words: once you control for the north–south demographic structure, is there still anything left for climate to explain?
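A minimal sketch of the residualization step, using NumPy and purely synthetic data. The variable names, sample size, and simulated effect sizes are all assumptions made for illustration; this is not the actual dataset or code behind the results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of municipalities

# Synthetic stand-ins for the predictors (assumptions, not real data).
admixture = rng.normal(size=n)
altitude = rng.normal(size=n)
log_pop = rng.normal(size=n)
climate = 0.6 * admixture + 0.5 * altitude + rng.normal(size=n)
outcome = 0.5 * admixture + 0.15 * climate + rng.normal(size=n)


def ols(y, X):
    """OLS coefficients of y on X, with an intercept prepended."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def zscore(x):
    """Standardize to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()


controls = np.column_stack([admixture, altitude, log_pop])

# Step 1: residualize climate on admixture, altitude, and population.
b = ols(climate, controls)
climate_resid = climate - np.column_stack([np.ones(n), controls]) @ b

# Step 2: regress the standardized outcome on standardized predictors,
# with residualized climate in place of raw climate.
X = np.column_stack([climate_resid, controls])
Xz = np.column_stack([zscore(col) for col in X.T])
coefs = ols(zscore(outcome), Xz)  # [intercept, climate_resid, controls...]

print("standardized beta, residualized climate:", round(coefs[1], 3))
print("corr(climate_resid, admixture):",
      round(np.corrcoef(climate_resid, admixture)[0, 1], 6))
```

By construction the residualized variable is uncorrelated with each control in the sample, so its coefficient reflects only the variation in winter severity that the demographic and geographic predictors do not account for.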


Regression results

The dominant pattern is demographic. Even within one country, surname-inferred ancestry strongly predicts both income and school outcomes. Ignoring that would obviously confound any ecological analysis.

But climate does not vanish once those controls are included.

Municipalities that are colder than expected given their surname profile, altitude, and size tend to show better school performance. The standardized effect is modest (β ≈ 0.14) compared with the ancestry gradient.

Income shows a similar association, though slightly smaller (β ≈ 0.11).

Altitude, meanwhile, predicts income but not test scores once the other predictors are in the model. That is intuitive: mountain areas can face economic disadvantages. Yet altitude itself is not the mechanism usually invoked by Cold Winters arguments. Winter severity is.

Because climate has been residualized on the demographic predictors, diagnostics show essentially no collinearity. The model is explicitly asking whether climate adds explanatory power beyond migration history.
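One common collinearity diagnostic is the variance inflation factor (VIF). The sketch below assumes that VIF is the relevant diagnostic and uses synthetic data with hypothetical variable names. It shows why a residualized predictor has a VIF of essentially 1: nothing in it is left for the other predictors to explain.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (n_samples, k)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()  # R^2 of column j on the rest
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


rng = np.random.default_rng(0)
n = 2000  # hypothetical number of municipalities
admixture = rng.normal(size=n)
altitude = rng.normal(size=n)
log_pop = rng.normal(size=n)
climate = 0.6 * admixture + 0.5 * altitude + rng.normal(size=n)

# Residualize climate on the other predictors (with an intercept).
Z = np.column_stack([np.ones(n), admixture, altitude, log_pop])
gamma, *_ = np.linalg.lstsq(Z, climate, rcond=None)
climate_resid = climate - Z @ gamma

vif_raw = vif(np.column_stack([climate, admixture, altitude, log_pop]))
vif_resid = vif(np.column_stack([climate_resid, admixture, altitude, log_pop]))

print("VIF with raw climate:         ", np.round(vif_raw, 3))
print("VIF with residualized climate:", np.round(vif_resid, 3))
```

In this simulation the raw climate variable has a VIF above 1 because it shares variance with admixture and altitude, while the residualized version sits at 1 up to floating-point error.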


Genetic selection, culture, or immediate environment?

The key question is why the surname controls do not absorb the entire education gradient. Under a genetic or cultural transmission model, surnames—being inherited and geographically clustered—should capture a large share of persistent differences, especially since there has been little time for biological evolution over the last few centuries.

There are two broad explanations. First, the climate term could reflect a transient environmental effect: hotter conditions may depress performance directly through heat stress, sleep, classroom comfort, or seasonality. In that case, climate matters now even if surnames fully capture inherited structure.

Second, the surname proxy may simply be too coarse. A north–south “admixture” score captures major demographic history, but Italy contains substantial climatic variation within each macro-area. If the controls mostly remove the national north–south gradient while leaving finer-scale structure unmodeled, a residual climate association can remain, whether the underlying within-region structure is neutral ancestry, persistent cultural differences, or geographically patterned genetic predispositions.

This is why region fixed effects, or North–Center–South fixed effects with interactions, are a useful robustness check. They force the climate coefficient to be identified within areas, not by differences between broad parts of Italy.
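To illustrate what region fixed effects do, the sketch below subtracts each region's mean from both variables (the "within" transformation) before estimating the slope. The data are synthetic and the region count, effect sizes, and function names are assumptions; with additional controls, each of them would be demeaned in the same way.

```python
import numpy as np


def demean_within(x, groups):
    """Subtract each group's mean from x (the 'within' transformation)."""
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    group_means = np.bincount(idx, weights=x) / np.bincount(idx)
    return x - group_means[idx]


def slope(y, x):
    """Simple regression slope of y on x."""
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / (x @ x))


rng = np.random.default_rng(1)
n_regions, per_region = 20, 200  # hypothetical sizes
region = np.repeat(np.arange(n_regions), per_region)

# A region-level factor shifts both climate and the outcome, which
# inflates the pooled slope; the simulated within-region effect is 0.1.
region_effect = rng.normal(size=n_regions)[region]
climate = region_effect + rng.normal(size=region.size)
outcome = region_effect + 0.1 * climate + rng.normal(size=region.size)

pooled = slope(outcome, climate)
within = slope(demean_within(outcome, region),
               demean_within(climate, region))

print("pooled slope:       ", round(pooled, 3))
print("within-region slope:", round(within, 3))
```

Here the pooled slope mixes between-region and within-region variation, while the demeaned version uses only differences among municipalities in the same region.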

The demographic gradient is large, as expected. But what happens to climate after we force the comparison within regions?
