The farmers of Europe appear to be a very closely related group that derives from a potentially single source. We’ve seen several papers over the last couple of years devoted to farmers. Last year brought us Lipson et al. (2017) and Mathieson et al. (2017). These papers brought us many new samples from the Mediterranean, Central Europe, and the Balkans. The datasets from these two papers are the source I am working with here.

Firstly, I wanted to look at a simple tree to find a decent fit. That led to the following:

This one was not a bad fit. It just had one zero drift edge towards Iron Gates, which would probably be taken care of if more hunters were included that lacked as much ANE as Iron Gates. While the Peloponnese samples are an outgroup to the other farmers, the Koros samples, from the First Temperate Neolithic, appear to be very close to the ancestral population for both the Balkan and Mediterranean groups. For the purpose of starting here, it seems fine. Next, I wanted to add Iberia EN as an offshoot of the Cardial EN samples from Croatia, just to see if the two of Mediterranean origin really are closely related.

This graph pointed to what looks like a needed admixture event from a hunter branch related to Iron Gates into Iberia EN.

This graph actually turned out very nice. Iberia EN was able to branch from the same population as the Croatian Cardial and only needed a little extra HG ancestry. This also removed the zero edge from Iron Gates. For the next run, I am going to place LBK Austria coming off the branch to Starcevo.


The first thing I will try after seeing this worst Z-score is to try an admixture edge from the branch related to Iron_Gates into LBK Austria.

simple Farmer5

Surprisingly, the admixture from a European hunter did not take care of that worst Z-score. So, I scrapped that and decided to go with the admixture from Croatian Cardial into LBK Austria.

simple Farmer6

This graph resulted in LBK being a mix of 59% Starcevo and 41% Cardial. Still, we have a worst Z that wants Starcevo to also be closer to Cardial. In this case, I will first try a shared branch opposite of Koros, and if needed, after the HG-related admixture at B3.

simple Farmer7

While this is not a bad result, we do have a couple zero edges here that I would like to resolve. The admixture from the Cardial branch to LBK has also reduced to 5% in this graph. The worst Z involves the Peloponnese Neolithic and Starcevo, and also Iron Gates and LBK. I first want to try an edge from around Iron Gates to LBK to see how that does.


For this last graph, the A2 node for HG was eliminated since there was a 0 drift edge. All HG admixture now comes off of A1. The extra HG into LBK Austria has now put the worst Z-score around 3, which isn’t too bad. The edges all look good. The surprising part is that LBK comes out nearly 50% Cardial-related. This is interesting because LBK was seen as just a subset of late Starcevo and potentially some Vinca influence. Since this is unexpected, I am going to see if there is more shared drift between the two before splitting, after the extra HG admixture coming after splitting with a group related to Koros EN.


Still, we have LBK Austria coming out as nearly half Cardial-related. While these results are interesting, they are not matching with D-stats, f3-ratio, or qpAdm results. There may be something else here that will take more complex graphs to figure out.

Here is another way of looking at it.


This graph makes a little more sense, with a little more separation between the Mediterranean and Danubian groups. The next step will be to separate the two before Koros EN. I will continue working from here for the rest of the post. If you have any more ideas, let me know. I will post updates as I have more.

Here are the stats that have me thinking there is no real admixture from Croatian Cardial.


Out       Test        Pop1         Pop2      D-stat     Z-score  SNPs
Mbuti_DG  Cardial_EN  LBK_Austria  Starcevo  -0.000114  -0.537   899799
Mbuti_DG  Cardial_EN  LBK_EN       Starcevo   0.000076   0.398   903127
Mbuti_DG  Cardial_EN  LBK_Austria  Koros_EN  -0.000287  -0.959   892981
Mbuti_DG  Cardial_EN  LBK_EN       Koros_EN  -0.000035  -0.119   895930
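
As a rough way to read the D-stats above, here is a minimal Python sketch that parses the four rows and flags any test with |Z| of 3 or more. The cutoff of 3 is a commonly used convention and an assumption here, and the names `DStat`, `parse_dstats`, and `flag_significant` are made up for illustration; only the rows themselves come from the table.

```python
from typing import List, NamedTuple


class DStat(NamedTuple):
    outgroup: str
    test: str
    pop1: str
    pop2: str
    d: float
    z: float
    snps: int


# Rows copied from the D-stat table in this post.
D_ROWS = """\
Mbuti_DG Cardial_EN LBK_Austria Starcevo -0.000114 -0.537 899799
Mbuti_DG Cardial_EN LBK_EN Starcevo 0.000076 0.398 903127
Mbuti_DG Cardial_EN LBK_Austria Koros_EN -0.000287 -0.959 892981
Mbuti_DG Cardial_EN LBK_EN Koros_EN -0.000035 -0.119 895930
"""


def parse_dstats(text: str) -> List[DStat]:
    """Turn whitespace-separated rows into DStat records."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        out, test, p1, p2, d, z, snps = line.split()
        rows.append(DStat(out, test, p1, p2, float(d), float(z), int(snps)))
    return rows


def flag_significant(rows: List[DStat], z_cutoff: float = 3.0) -> List[DStat]:
    """Keep tests whose |Z| reaches the cutoff (assumed convention: 3)."""
    return [r for r in rows if abs(r.z) >= z_cutoff]


d_rows = parse_dstats(D_ROWS)
d_flagged = flag_significant(d_rows)

for r in d_rows:
    print(f"D({r.outgroup}, {r.test}; {r.pop1}, {r.pop2}) Z = {r.z:+.3f}")
print(f"{len(d_flagged)} of {len(d_rows)} tests reach |Z| >= 3")
```

None of the four tests gets anywhere near the cutoff, so Cardial_EN is not measurably closer to LBK than it is to Starcevo or Koros_EN in these stats.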


Source1     Source2     Target       f3         Std.Err   Z       SNPs
Koros_EN    LaBrana     Iberia_EN    -0.002397  0.001668  -1.437  403414
Koros_EN    Iron_Gates  Iberia_EN    -0.000355  0.001208  -0.294  598713
Koros_EN    French_HG   Iberia_EN    -0.001492  0.001914  -0.779  127825
Cardial_EN  LaBrana     Iberia_EN    -0.001178  0.001525  -0.773  410098
Cardial_EN  Iron_Gates  Iberia_EN     0.001799  0.001092   1.648  585125
Cardial_EN  French_HG   Iberia_EN    -0.000595  0.001713  -0.348  131631
Koros_EN    LaBrana     LBK_EN       -0.005487  0.00118   -4.65   580503
Koros_EN    Iron_Gates  LBK_EN       -0.005153  0.000779  -6.615  755300
Koros_EN    French_HG   LBK_EN       -0.003943  0.001382  -2.854  175316
Cardial_EN  LaBrana     LBK_EN       -0.002165  0.001022  -2.118  555955
Cardial_EN  Iron_Gates  LBK_EN       -0.001198  0.000649  -1.847  711734
Cardial_EN  French_HG   LBK_EN       -0.001455  0.001165  -1.249  171888
Starcevo    LaBrana     LBK_EN       -0.00003   0.000841  -0.035  600458
Starcevo    Iron_Gates  LBK_EN        0.000585  0.000568   1.031  770549
Starcevo    French_HG   LBK_EN       -0.000667  0.001019  -0.654  180239
Koros_EN    Iron_Gates  LBK_Austria  -0.005504  0.000912  -6.033  715204
Koros_EN    Iron_Gates  Cardial_EN    0.002384  0.001571   1.517  553269
Koros_EN    Iron_Gates  Iberia_EN    -0.000355  0.001208  -0.294  598713
Starcevo    Iberia_EN   LBK_Austria   0.000357  0.000672   0.53   615071
Starcevo    Iberia_EN   LBK_EN       -0.000208  0.000516  -0.403  665936
Starcevo    Cardial_EN  LBK_Austria   0.00095   0.00067    1.419  581722
Starcevo    Cardial_EN  LBK_EN        0.001295  0.000578   2.24   626360
Starcevo    Iron_Gates  LBK_Austria   0.000502  0.000715   0.702  729333
Starcevo    Iron_Gates  LBK_EN        0.000585  0.000568   1.031  770549
Koros_EN    Iberia_EN   LBK_EN       -0.000175  0.000696  -0.251  647053
Koros_EN    Iberia_EN   Starcevo      0.000738  0.001072   0.689  496318
Koros_EN    Cardial_EN  LBK_EN        0.000021  0.000789   0.026  609865
Koros_EN    Cardial_EN  Starcevo     -0.000584  0.001115  -0.524  470485
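
To make the f3 numbers a bit easier to scan, here is a small Python sketch that takes a subset of the rows above, recomputes Z as f3 divided by its standard error, and keeps only the tests with Z of -3 or lower, which is the usual (and here assumed) bar for calling an admixture f3 significant. The names `F3Row`, `parse_f3`, and `admixture_signals` are hypothetical; the numbers are copied from the table, and only eight rows are included to keep the example short.

```python
from typing import List, NamedTuple


class F3Row(NamedTuple):
    source1: str
    source2: str
    target: str
    f3: float
    se: float
    z: float
    snps: int


# A subset of rows copied from the f3 table in this post.
F3_ROWS = """\
Koros_EN LaBrana LBK_EN -0.005487 0.00118 -4.65 580503
Koros_EN Iron_Gates LBK_EN -0.005153 0.000779 -6.615 755300
Koros_EN French_HG LBK_EN -0.003943 0.001382 -2.854 175316
Cardial_EN Iron_Gates LBK_EN -0.001198 0.000649 -1.847 711734
Starcevo Iron_Gates LBK_EN 0.000585 0.000568 1.031 770549
Koros_EN Iron_Gates LBK_Austria -0.005504 0.000912 -6.033 715204
Starcevo Cardial_EN LBK_Austria 0.00095 0.00067 1.419 581722
Starcevo Cardial_EN LBK_EN 0.001295 0.000578 2.24 626360
"""


def parse_f3(text: str) -> List[F3Row]:
    """Turn whitespace-separated rows into F3Row records."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        s1, s2, tgt, f3, se, z, snps = line.split()
        rows.append(F3Row(s1, s2, tgt, float(f3), float(se), float(z), int(snps)))
    return rows


def admixture_signals(rows: List[F3Row], z_cutoff: float = -3.0) -> List[F3Row]:
    """Keep tests whose reported Z is at or below the (assumed) cutoff."""
    return [r for r in rows if r.z <= z_cutoff]


f3_rows = parse_f3(F3_ROWS)
f3_hits = admixture_signals(f3_rows)

for r in f3_rows:
    # Z is just the estimate over its standard error; the small gap from the
    # reported value comes from rounding in the published columns.
    z_recomputed = r.f3 / r.se
    print(f"f3({r.target}; {r.source1}, {r.source2}) "
          f"Z reported = {r.z:+.3f}, recomputed = {z_recomputed:+.3f}")

print("Significantly negative:")
for r in f3_hits:
    print(f"  {r.source1} + {r.source2} -> {r.target} (Z = {r.z})")
```

In this subset, only the Koros_EN plus hunter-gatherer pairs pass the cutoff (LaBrana and Iron_Gates into LBK_EN, and Iron_Gates into LBK_Austria). None of the tests with Cardial_EN as a source do.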



Lipson, M. et al. Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature 551, 368–372 (2017).

Mathieson, I. et al. The genomic history of Southeastern Europe. Nature 555, 197–208 (2018).


7 thoughts on “European Farmers Part I; Mediterranean vs Danubian”

  1. Hey Chad, big fan of your work. If you don’t mind, I had a question about how exactly f3 testing works. In a (Source, Source: Target) set-up, are you trying to see if you can model the target population as some combination of the two source populations? What do the raw f3 scores and Z score actually tell you about the relationship between the populations you’re testing?


  2. Mousterian,
    Thank you. From my understanding, that is what you are after. The stronger the negative f3 and Z-score, the more likely these populations are similar to the two that can create your target population. However, mixtures can sometimes be more complex, and you may get only a slightly negative or even positive score because a third population is involved. For example, EHG modeled as WHG plus MA1 comes out slightly positive, although that mixture is generally the accepted story.
    What I see here is that Koros plus Iron Gates can be good enough to make LBK and Starcevo. While there is a slight negative with Iberia and slight positive with Cardial, and a more complex admixture could make it true, it seems unlikely with archaeology and the other methods. qpAdm is also not wanting any Iberian or Cardial admixture.
    Another problem could be that Cardial and Koros or Starcevo are not differentiated enough to see if the admixture is really true. It seems unlikely, but there is always the chance for this to change with more samples. As it sits, I wouldn’t feel comfortable claiming that the admixture from Cardial > LBK is real. The simple tree with no mixture between Mediterraneans and Danubians seems more believable.


  3. “The stronger the negative f3 and Z-score, the more likely these populations are similar to the two that can create your target population.”

    Interesting, so only if the raw f3 and the Z scores are negative, then you can infer the source populations (or groups similar to them) work as ancestors of the target? What does it mean if you get significantly positive values instead of negatives?


  4. It doesn’t mean that only they can be, but anything related. It depends on whether or not a source population is a good fit historically too. They could also be involved, but a third is needed and the results are positive. It depends on several factors. If they are very significantly positive, one or both of the populations don’t work, or you are missing a very important population which could be highly differentiated from the other two. It just takes trial and error, along with common sense, or knowledge of archaeology. Sometimes, relying on other methods can give you better results. qpGraph is my favorite as it creates many options with several co-fitted populations. f3 may not be best for very diverse populations, with several admixture events.


  5. The simple tree with farmer plus Iron Gates works fine, but there are some stats I’m looking at. A focus on the Danubians is coming. I’ll explore other potential admixture events that could be causing the issue.


  6. I think these Neolithic groups were constantly evolving and reformulating, as archaeologists have described with the collapse of LBK and the rise of the post-LBK horizon on the Danube versus different, HG-rich groups beyond the loess zone. Then there are possible secondary expansions from the northern Balkans overlaying earlier (LBK) groups.
    A detailed following with uniparentals will also help analyses such as yours.


  7. Sure, I can look at those kinds of things. One thing I have noticed is that all Y-DNA in MN Germany is the same as LBK and HG at Blatterhohle. There isn’t anything that really sticks out as later Danubian, which has quite a bit more of C1a2, H2 and I2c, which is basically nil in LBK and MN Germany. Now, there is a supposed late LBK event involving groups further east and southeast, but we would need samples stretching the whole 5500–4900 BCE, which we don’t really have. I’m really curious about what might have been there before LBK and the whole la Hoguette thing that gets some all fired up. There are three camps on this. One is HG, one Cardial, and the other group sees it just as variation within LBK, with some LBK more of an exterior and pastoral group. I plan on looking into this all very thoroughly in my Danubian and LBK post.


