The Rise of Rome and the Rise of Polygenic Scores
Why did Rome, rather than any of its many rivals in Iron Age Italy, become the core of an empire?
A muddy settlement on the Tiber turns into a machine that can raise armies, write laws that outlive empires, build roads that stitch a continent together, and carry water for millions through aqueducts, while running a Mediterranean-wide bureaucracy for centuries. The usual explanations are familiar: institutions, military discipline, geography, luck. All true, and none of them feels fully satisfying on its own. Many societies possessed some of these advantages. Rome was unusual in how consistently it turned them into scalable institutions.

There is another angle that is rarely discussed, mostly because until recently it was not testable. What if part of Rome’s advantage was carried in its people, as average differences in traits linked to learning, planning, and administration?
Ancient DNA makes it possible to ask that question directly. In the Allen Ancient DNA Resource (AADR), Iron Age and Republican-era Romans score unusually high on educational attainment polygenic scores. Besides exceeding earlier Italian groups, they sit at the top of the entire ancient European distribution, even after accounting for sample age and genomic coverage.
That by itself does not explain the rise of Rome. But it does suggest a sharper hypothesis: Rome’s institutions may have been built and operated by a population that, on average, was unusually well suited to master and scale complex social systems.
Why bring genetics into this at all?
A useful precedent for thinking this way comes from economic history. One influential argument is that England’s escape into sustained modern growth may have been helped not only by markets and institutions, but also by slow-moving demographic change. Over many centuries, higher-status groups in preindustrial England often had more surviving children, and their descendants gradually spread through the wider population. If the traits associated with economic success were partly heritable, this process could slowly shift the population distribution of characteristics linked to human capital and competence, making complex economic systems easier to sustain at scale. This is the core mechanism proposed by Gregory Clark in A Farewell to Alms.
This is not a genetic determinist claim. Institutions are a major factor. The point is that institutions are operated by people, and the returns to good rules depend in part on the distribution of skills and dispositions in the population. Even modest shifts in population-level human capital could change how effectively institutions function, how quickly they scale, and how resilient they are under stress.
My recent work with Prof. Connor, which I summarized in a previous post, tries to bring this hypothesis into the molecular era by asking a direct question: do ancient populations differ in polygenic signals for educational attainment in a way that tracks historical differences in complexity and state capacity?
In earlier work, we reported that Republican Romans had higher cognitive polygenic scores than both pre-Roman Italians and later Imperial-period Romans. Here I place Roman samples into a wider European reference frame using AADR, rather than comparing them only within Italy and across a few periods.
The “Republican Rome” label here refers to the AADR group Italy_IA_Republic (ca. 2830–2250 years BP, roughly 880–300 BCE), which spans the late Iron Age through the early and middle Republic. For the purposes of this analysis, it is best understood as a continuous central Italian Iron Age population rather than as two phases, “Regal/Kingdom” and “Republican”, separated by a sharp archaeological break.
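As a quick check on that date range, here is a minimal sketch of the BP-to-BCE conversion. It assumes the usual convention that "present" in years BP means 1950 CE; the function name `bp_to_bce` is illustrative and not part of AADR.

```python
def bp_to_bce(years_bp, reference_year=1950):
    """Approximate BCE year for a date given in years BP (before 1950 CE).

    Only meaningful for dates older than the reference year. It ignores the
    missing year zero, which shifts the result by one year and is negligible
    at the precision of archaeological date ranges.
    """
    return years_bp - reference_year


print(bp_to_bce(2830), bp_to_bce(2250))  # 880 300
```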
The data
This post uses AADR v62, a large public compilation of ancient human genomes with archaeological labels (site, culture, and date). Each individual has genotype data on the standard 1240k capture panel, which targets about 1.24 million SNPs widely used in ancient DNA.
The outcome I analyze is an EA polygenic score. Here EA stands for educational attainment, a GWAS-based phenotype that broadly captures genetic influences on years of schooling and closely related cognitive and behavioral traits. A polygenic score (PGS) is a single number computed by summing many genetic variants across the genome, each weighted by its GWAS effect size. It is not a measure of an individual’s achieved education in the past. It is a genetic predictor derived from present-day samples.
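To make the definition concrete, here is a minimal sketch of a polygenic score as a weighted sum. It is an illustration rather than the pipeline used for this post: the dictionary layout, the function name `polygenic_score`, and the choice to skip unobserved SNPs are all assumptions.

```python
def polygenic_score(dosages, weights):
    """Sum effect-allele dosages weighted by GWAS effect sizes.

    dosages: dict mapping SNP id -> effect-allele count (0, 1, 2),
             or None when the SNP was not observed in the sample
    weights: dict mapping SNP id -> GWAS effect size (beta)
    Returns (score, number of SNPs that contributed).
    """
    score = 0.0
    n_used = 0
    for snp, beta in weights.items():
        dosage = dosages.get(snp)
        if dosage is None:
            continue  # assumption: unobserved SNPs are simply skipped
        score += beta * dosage
        n_used += 1
    return score, n_used


# Hypothetical toy values, for illustration only.
weights = {"rs1": 0.02, "rs2": -0.01, "rs3": 0.03}
dosages = {"rs1": 2, "rs2": 1, "rs3": None}
score, n_used = polygenic_score(dosages, weights)
```

Returning the number of contributing SNPs alongside the score makes it easy to see how much of the score is driven by missing data.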
To reduce dependence on any single GWAS, I compute two educational attainment polygenic scores using genome-wide significant SNPs (p < 5 × 10⁻⁸) from two large studies: Lee et al. (2018), which I refer to as EA3, and Okbay et al. (2022), which I refer to as EA4. I then average EA3 and EA4 within each individual to obtain a single combined EA score per person, and Z-standardize this combined score across individuals so results are reported in standard deviation units and are comparable across analyses.
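The averaging and standardization step can be sketched as follows. This is a simplified illustration under stated assumptions: the two scores are averaged as given (the text does not say whether each was rescaled first), and the population standard deviation is used for the Z-scores. The function name `combined_ea_z` is hypothetical.

```python
from statistics import mean, pstdev


def combined_ea_z(ea3, ea4):
    """Average two per-individual scores, then Z-standardize across people.

    ea3, ea4: equal-length lists, one value per individual.
    Returns the combined score in standard deviation units.
    """
    combined = [(a + b) / 2 for a, b in zip(ea3, ea4)]
    mu = mean(combined)
    sd = pstdev(combined)  # assumption: population SD rather than sample SD
    return [(c - mu) / sd for c in combined]


# Hypothetical toy values, for illustration only.
z = combined_ea_z([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])
```

After this step the combined score has mean 0 and standard deviation 1 across individuals, which is what allows effects to be read in SD units.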
Ancient DNA varies in data quality, so I also compute coverage as the fraction of the 1.24 million 1240k target SNPs that are actually observed in each sample. This matters because missingness can distort polygenic scores if it differs systematically across groups or time periods.
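Coverage as defined here is just a fraction of observed target SNPs. The sketch below assumes genotypes are coded as in the EIGENSTRAT format (0, 1, 2, with 9 for missing); the function name `coverage` and the toy input are illustrative.

```python
MISSING = 9  # EIGENSTRAT code for a missing genotype


def coverage(genotypes):
    """Fraction of target SNPs with an observed (non-missing) genotype.

    genotypes: one genotype code per target SNP for a single sample.
    """
    observed = sum(1 for g in genotypes if g != MISSING)
    return observed / len(genotypes)


# Toy sample with four target SNPs, two of them unobserved.
print(coverage([0, 2, 9, 9]))  # 0.5
```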
Finally, I run the analyses at two levels:
Individual level: each person is one data point
Group level: each population label is summarized by its mean values
Throughout, I control for two core confounders whenever appropriate: sample age (Years BP) and coverage. I also restrict comparisons to groups with at least 10 individuals to avoid unstable means driven by tiny samples.
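The group-level summary with the minimum-size filter can be sketched like this. The record layout (dictionaries with `group`, `ea_z`, `years_bp`, and `coverage` keys) and the function name `group_means` are assumptions made for illustration, not the format of AADR itself.

```python
from collections import defaultdict
from statistics import mean


def group_means(records, min_n=10):
    """Summarize individuals by population label, dropping small groups.

    records: iterable of dicts with keys "group", "ea_z", "years_bp",
             and "coverage" (one dict per individual).
    Returns a dict mapping each retained label to its size and means.
    """
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec["group"]].append(rec)

    summary = {}
    for label, rows in by_group.items():
        if len(rows) < min_n:
            continue  # too few individuals for a stable mean
        summary[label] = {
            "n": len(rows),
            "ea_z": mean(r["ea_z"] for r in rows),
            "years_bp": mean(r["years_bp"] for r in rows),
            "coverage": mean(r["coverage"] for r in rows),
        }
    return summary
```

Keeping the group size `n` in the summary makes it possible to weight groups by sample size in later analyses.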
A baseline pattern in Europe: later samples score higher
Before focusing on Rome, it is necessary to establish the background trend. Across ancient European populations, educational attainment polygenic scores increase toward the present. At the group level, the correlation between mean Years BP and mean Educational Attainment (EA) polygenic score is about −0.55. Older groups tend to score lower, and more recent groups tend to score higher.
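For readers who want to reproduce this kind of statistic, here is a minimal sketch of a Pearson correlation between group-level mean age and mean score. The input values are hypothetical placeholders, not the AADR group means, and the function name `pearson` is illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical group means: older groups (larger Years BP) score lower.
mean_years_bp = [8000, 5000, 3500, 2500, 1200]
mean_ea = [-0.6, -0.2, 0.1, 0.0, 0.4]
r = pearson(mean_years_bp, mean_ea)
```

A negative `r` here means that scores fall as Years BP grows, which is the same thing as scores rising toward the present.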
Figure 1 shows this pattern using population means. Years BP is plotted on the x-axis and reversed, so the present lies on the left and the deep past on the right. The slope is clearly negative, indicating a long-term increase in EA polygenic scores across Europe.
Figure 1. Educational polygenic scores rise toward the present across ancient Europe
Italy_IA_Republic lies well above the regression line, indicating a large positive residual relative to the Europe-wide age trend.
This background trend is important, because it sets the baseline expectation for any late population. The real question is whether Rome is simply late, or unusually high even for its time.
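The idea of being "high for its time" can be expressed as a residual from the age trend. The sketch below fits a simple least-squares line of mean score on mean Years BP and returns each group's distance from it. The function names and the toy numbers are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def residuals(x, y):
    """Observed minus fitted value for each point on the trend line."""
    intercept, slope = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical toy data: the third group sits well above the trend.
res = residuals([1, 2, 3, 4], [0.5, 0.1, 0.9, -0.5])
```

A group with a large positive residual scores higher than the Europe-wide trend predicts for its date, which is a different statement from scoring high in absolute terms.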
Figure 2 shows boxplots of EA polygenic scores for the top and bottom 20 populations with at least 10 individuals. Republican-era Romans (Italy_IA_Republic) sit at the very top of the distribution. This difference is not marginal. Their median and upper range exceed those of most other Iron Age, Medieval, and Bronze Age groups.
Figure 2. Republican Rome sits at the top of the European EA PGS distribution
This visual comparison is suggestive, but it is not sufficient on its own. Later samples tend to score higher in general, and sequencing coverage can bias polygenic scores. The relevant question is therefore not whether Republican Romans score high in absolute terms, but whether they remain high after accounting for age and coverage.
Rome is unusually high even for its time
To answer this, I compare individuals labeled Italy_IA_Republic to all other individuals in the filtered dataset, while controlling for Years BP and genomic coverage.
The effect is large. Republican Romans score about +1.31 standard deviations above the rest, conditional on these controls. The result is highly significant, with a 95 percent confidence interval of roughly +0.73 to +1.89 SD.
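A comparison of this kind can be written as a linear regression with a group indicator plus the two controls. The sketch below is a minimal standard-library illustration, not the code behind the estimate above. The record layout, the function names `ols` and `rome_effect`, and the rescaling of Years BP to thousands of years are assumptions, and it returns only the point estimate, not the confidence interval.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    Solved by Gauss-Jordan elimination with partial pivoting. Assumes the
    columns of X are linearly independent.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def rome_effect(records, label="Italy_IA_Republic"):
    """Adjusted gap between one group and everyone else, in score units.

    Model: ea_z ~ intercept + is_group + years_bp + coverage.
    Years BP is rescaled to thousands of years for numerical stability;
    this does not change the group coefficient.
    """
    X = [
        [1.0,
         1.0 if r["group"] == label else 0.0,
         r["years_bp"] / 1000.0,
         r["coverage"]]
        for r in records
    ]
    y = [r["ea_z"] for r in records]
    return ols(X, y)[1]
```

In practice a statistics package would also report standard errors and confidence intervals; the sketch only shows what "conditional on these controls" means for the point estimate.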
I also tested whether Republican Romans follow a different time trend by allowing a Rome-by-time interaction. The interaction is not significant, indicating that the result reflects a level shift rather than a different temporal slope.
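An interaction of this kind is just an extra column in the regression design. The sketch below builds one design-matrix row under stated assumptions: Years BP is centered at a hypothetical `center_bp` and expressed in thousands of years, and the function name `design_row` is illustrative.

```python
def design_row(is_rome, years_bp, coverage, center_bp=3000.0):
    """One regression row: intercept, group, age, coverage, group x age.

    The last column is zero for everyone outside the group, so its
    coefficient measures how much the group's time slope differs from
    the slope shared by all other individuals.
    """
    rome = 1.0 if is_rome else 0.0
    age_ka = (years_bp - center_bp) / 1000.0  # centered, in thousands of years
    return [1.0, rome, age_ka, coverage, rome * age_ka]


row = design_row(True, 2500, 0.4)
```

If the coefficient on the final column is indistinguishable from zero, the group shares the common time trend and differs only in level, which is the pattern described above.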
To address concerns about individual-level dependence, I repeat the analysis at the group level. The conclusion does not change. The Republican Roman group mean is about +1.27 SD higher than other group means after adjusting for mean Years BP and mean coverage. Weighting groups by sample size yields very similar estimates.
Italy-wide comparison using merged groups
To reduce label clutter and improve inference, closely related central Italian Iron Age groups were merged into a single regional cluster. This cluster combines samples from Tuscany and Lazio that overlap in time and geography and belong to closely connected Etruscan and early Roman contexts. Republican Rome is then compared to this merged group and to the rest of Italy, while controlling for sample age and genomic coverage.
Under this framework, most Italian populations score well below Republican Rome, as shown in Figure 3.
Figure 3. Republican Rome compared with other Italian populations after adjustment
This includes Epigravettian hunter-gatherers, Sardinian groups from all periods, Sicilian Bronze and Iron Age populations, Imperial Romans, Late Antique Italians, and Early Medieval groups from northern Italy. In many cases, the adjusted differences exceed one standard deviation.
Central Italy behaves differently. Once Etruscan-associated and nearby Iron Age groups are merged, their average score lies closer to Republican Rome and is no longer clearly separable after multiple-testing correction. This indicates that Republican Rome does not represent an isolated outlier, but rather sits within a broader zone of elevated scores in central Italy during the late Iron Age.
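The text does not specify which multiple-testing correction was applied. As one common option, here is a minimal sketch of the Holm-Bonferroni step-down adjustment; treating it as the method used here would be an assumption, and the p-values in the example are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Smallest p-value is multiplied by m, the next by m - 1, and so on.
        adj = min(1.0, (m - rank) * pvalues[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values for three group contrasts.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

A contrast that is nominally significant on its own can lose significance after adjustment, which is what "no longer clearly separable" refers to when many groups are compared against the same reference.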
A tighter comparison within the Iron Age
To avoid relying on comparisons across very different periods, the analysis was repeated using only Italian individuals dated between 1800 and 3200 years before present. This restriction places Republican Rome alongside other Iron Age populations and closely adjacent contexts.
Even within this narrow time window, Republican Rome remains about 1.2 standard deviations higher than the Italian average after adjusting for age and coverage. The effect is statistically strong and does not depend on contrasts with much earlier hunter-gatherers or much later medieval populations.
At the same time, the merged analysis clarifies how this difference should be interpreted. Republican Rome is extreme relative to most of Italy, but it is closest to other central Italian Iron Age groups, particularly those associated with Etruscan contexts. This is precisely the comparison where genetic context matters most. Ancient DNA analyses of Iron Age central Italy consistently find that Etruscan-associated groups and contemporaneous Roman Republican individuals were genetically very similar.
Posth et al. (2021) show that individuals from central Italian Etruscan contexts can be modeled as mixtures of Neolithic or Copper Age Italians and steppe-related ancestry, with the latter accounting for roughly a quarter of ancestry when using distal sources such as Yamnaya, and around 50% when using temporally and geographically closer proxies like Bell Beaker groups. Crucially, these Etruscan individuals can also be modeled entirely as mixtures of other European Iron Age and Late Neolithic populations, indicating that by the Iron Age, central Italy was already genetically well integrated into the broader European ancestry landscape.
In principal component analyses, Iron Age and Roman Republican individuals from Tuscany and Lazio, including samples from the city of Rome itself, overlap almost completely. This indicates that steppe-related ancestry was already widespread and homogenized across central Italy by the Iron Age, spanning both Indo-European–speaking groups (such as Italic populations) and non–Indo-European Etruscan speakers. In this context, the similarity in polygenic scores between Republican Rome and nearby Etruscan-associated groups is not surprising. It reflects a shared regional population shaped by long-term interaction and assimilation, rather than a uniquely Roman genetic departure.
This reinforces the main point of the section. Republican Rome appears exceptional relative to most of Italy, but it is embedded within a coherent central Italian Iron Age population, both genetically and in polygenic signal. The Roman result therefore looks structured and regional, not accidental.
Rome after Rome: continuity into the Medieval and Early Modern city
Medieval and Early Modern Italian samples such as Italy_Medieval_EarlyModern and Italy_Tuscany_Siena are not distinguishable from Republican Rome in the adjusted comparisons. The estimated gaps are small relative to their uncertainty, so these groups overlap substantially with the Republican Roman distribution once Years BP and coverage are taken into account.
This sits comfortably with what ancient DNA has already shown about Rome’s demographic volatility. Antonio et al. document a major ancestry shift in the Imperial period toward the eastern Mediterranean, followed by a later shift toward increased European contributions. That “European” input is consistent with historical accounts of post-imperial migrations and the arrival of groups associated with Germanic polities in Italy, even if the magnitude and geography of that contribution varied through time.
A separate, non-exclusive possibility is a demographic reset driven by urban collapse. Medieval Rome was far smaller than Imperial Rome, and some historians and commentators have suggested that repeated population declines could have been followed by repopulation from the surrounding countryside, which would pull Rome’s ancestry back toward a more local central Italian baseline. Razib Khan has argued for this kind of genealogical break between Imperial cosmopolitan Rome and the later medieval city: “Rome after the fall of the Empire was genetically a different Rome. It was replenished from the hills of Latium, now Lazio, just as the ancient city had been stocked with migrants from backwater hill tribes.”
On this reading, the pattern in the polygenic scores is not mysterious. Imperial Rome is lower, plausibly reflecting heavy immigration from regions with lower EA PGS in this dataset, while later Italian groups rebound toward the earlier Iron Age level, consistent with a stronger contribution from local central Italians plus some northern and central European input.
Conclusion
The high polygenic score observed in Republican Rome should not be read as a uniquely Roman anomaly. It reflects a broader concentration of elevated scores in central Italy during the late Iron Age, of which Rome was one expression.
This strengthens the interpretation rather than weakening it. Signals that recur across closely related populations and that reappear after major demographic disruptions are more credible than patterns confined to a single archaeological label. Republican Rome stands out at the continental scale and remains unusual within Italy, but it is also continuous with its central Italian context. It is not a genetically isolated outlier relative to Etruscan-associated neighbors or later Rome-linked populations.
The claim, then, is not that genes caused Rome, but that institutions are ultimately run by people, and population differences in traits tied to learning, planning, and coordination can shift how effectively complex systems are built, staffed, and scaled. When societies grow large, even modest advantages in the supply of such capacities can compound through bureaucracy, law, and organization.
If this reading is right, the Republican period may represent a moment when Rome combined favorable institutional innovations with an unusually strong human-capital base. That combination helps explain why Rome did not merely compete within Italy, but repeatedly out-organized rivals and eventually dominated the Mediterranean.
References
Antonio, M. L., Gao, Z., Moots, H. M., Lucci, M., Candilio, F., Sawyer, S., Oberreiter, V., Calderon, D., Devitofranceschi, K., Aikens, R. C., Aneli, S., Bartoli, F., Bedini, A., Cheronet, O., Cotter, D. J., Fernandes, D. M., Gasperetti, G., Grifoni, R., Guidi, A., La Pastina, F., … Pritchard, J. K. (2019). Ancient Rome: A genetic crossroads of Europe and the Mediterranean. Science, 366(6466), 708–714. https://doi.org/10.1126/science.aay6826
Clark, Gregory (2007). A Farewell to Alms: A Brief Economic History of the World. Princeton University Press.
Lee, J. J., Wedow, R., Okbay, A., Kong, E., Maghzian, O., Zacher, M., Nguyen-Viet, T. A., Bowers, P., Sidorenko, J., Karlsson Linnér, R., Fontana, M. A., Kundu, T., Lee, C., Li, H., Li, R., Royer, R., Timshel, P. N., Walters, R. K., Willoughby, E. A., Yengo, L., … Cesarini, D. (2018). Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics, 50, 1112–1121. https://doi.org/10.1038/s41588-018-0147-3
Okbay, A., …, & Young, A. I. (2022). Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nature Genetics, 54, 437–449.
Posth, C. et al. (2021). The origin and legacy of the Etruscans through a 2000-year archeogenomic time transect. Science Advances, 7, eabi7673. https://doi.org/10.1126/sciadv.abi7673

Thanks for the excellent article.
"If this reading is right, the Republican period may represent a moment when Rome combined favorable institutional innovations with an unusually strong human-capital base. That combination helps explain why Rome did not merely compete within Italy, but repeatedly out-organized rivals and eventually dominated the Mediterranean."
One cannot help but see parallels between the Roman Empire and the current Western Empire.
It is no surprise that when a civilization reaches a level of intelligence superior to others, it surges ahead in dominance. But when an advanced civilization gets out over its skis, and hubris replaces critical thinking, it is just a matter of time before it falls.
Nice work, Davide. I think these analyses of historical PGSs are some of the most important results relevant to social science. I suspect the correlation between complex civilization and PGSs like these will keep replicating in other populations.
Disagree about the "institutions" thing, though. Not sure if that's just a bone for the crazies to keep them off your back, but it doesn't compute.
"Institutions" are enacted social rules. "Enactment" is human behavior, and social rules are _fully_ constructed by human cognition. Both behavior and cognition are largely controlled by genes. More sophisticated cognition creates more complex rules that give better social results (civilization).
"Institutions" are thus just an intermediary between human cognition and civilization. A non-explanatory one.
Institutions are hard to transmit to different cultures, and hard to keep in place without outside influence. People tend to converge to the baseline of their own cognitive models and behaviors.
Am I missing something here?