Steppe Maykop

Thanks to Wang et al (2018), we got our first look at a group referred to as “Steppe Maykop”. Chernykh (2008) briefly describes this group in his work, “Formation of the Eurasian ‘Steppe Belt’ of stockbreeding cultures: viewed through the prism of archaeometallurgy and radiocarbon dating”. He explains that this is a kurgan-using culture outside of the Maykop proper zone, north of the Kuban and Terek Basins, between the Sea of Azov and the Caspian Sea (Chernykh, 2008). These steppe complexes contained Maykop cultural items, mostly pottery (Chernykh, 2008). The image below, taken from Chernykh (2008), illustrates the location of Steppe Maykop.


Fig. 4.
Circumpontic Metallurgical Province at the early formation stage (proto-CMP).
1 – Maykop culture proper;
2 – sites of the so-called Maykop type (“Steppe Maykop”);
3 – Kura-Araks culture;
4 – late (northern) sites of the Uruk community
DNA results of Steppe Maykop
In the supplementary information of Wang et al (2018), the Steppe Maykop samples showed T2e, H2a1, and U7b. The one Y-DNA result was Q1a2. These are some interesting results, especially U7b and Q1a2.
U7b is likely rooted in West Asia between 5-10 KYA (Sahakyan et al, 2017), as it is mostly found in West Asia and Southeast Europe. It is almost completely absent from Central Asia, where U7 and U7a are more typically found (Sahakyan et al, 2017). Ancient remains from Central Asia tell the same story, with U7 and U7a being the only ones found (Narasimhan et al, 2018). Also interesting is the hot spot of U7b in Georgia (Sahakyan et al, 2017), but this is based on modern populations. These Steppe Maykop samples are our first ancient samples with U7b. Given their connection to Maykop, Maykop may be the likely source of this haplogroup.
Q1a2 is a common haplogroup of Northern Asia and the Americas, showing that relationship to Native Americans that was discussed in Wang et al (2018). Today, it is fairly common among Turkic speakers of Siberia and Northern Central Asia, along with groups in the Americas.
Modeling with qpGraph
I started first by building a tree that included Chimp, Ust_Ishim, Botai, West_Siberia_N, Maykop_Novosvobodnaya, and Progress_Eneolithic. These samples come from Narasimhan et al (2018), Jeong et al (2018), and Wang et al (2018).
The next thing I did was add Steppe_Maykop to the column designating labels, but did not manually add them to any tree, to find the most significant scores for the first placement. The worst f-stat was Botai, Steppe_Maykop; West_Siberia_N Steppe_Maykop (Z>-49).
To determine if this stat was driven more by Botai or West_Siberia_N, I looked at the Z-scores where only the two were used (Botai, Steppe_Maykop; Botai, Steppe_Maykop and West_Siberia_N, Steppe_Maykop; West_Siberia_N, Steppe_Maykop).
The more extreme Z-score of the two was -42, between Botai and Steppe_Maykop, versus -30 for West_Siberia_N.
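For readers less familiar with what these scores are built from, the f-statistics that qpGraph fits can be sketched from allele frequencies. This is only a minimal illustration, not the ADMIXTOOLS implementation: the function names and the toy frequencies are hypothetical, and the sample-size bias corrections and block-jackknife standard errors used by the real tools are left out.

```python
def f2(a, b):
    """Mean squared allele-frequency difference between populations a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def f3(target, a, b):
    """f3(target; a, b): negative values suggest target is a mix related to a and b."""
    return sum((t - x) * (t - y) for t, x, y in zip(target, a, b)) / len(target)


def f4(a, b, c, d):
    """f4(a, b; c, d): covariance of the a-b and c-d frequency differences."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)


# Toy allele frequencies at four SNPs (made-up numbers, for illustration only).
src1 = [0.10, 0.80, 0.30, 0.60]
src2 = [0.90, 0.20, 0.70, 0.40]
# A hypothetical 50/50 mixture of the two sources, with no drift afterwards.
mixed = [(x + y) / 2 for x, y in zip(src1, src2)]

print("f2(src1, src2)        =", f2(src1, src2))
print("f3(mixed; src1, src2) =", f3(mixed, src1, src2))
print("f4(src1, src2; src1, src2) =", f4(src1, src2, src1, src2))
```

In this toy case the f3 is negative because the target sits exactly between the two sources, and an f4 that repeats the same pair on both sides reduces to the f2 of that pair, which is the form of the two-population stats mentioned above.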
Trying to model this admixture as coming from just West_Siberia_N and not Botai left many 0 drift edges and a worst f-stat approaching 4, likely seeking admixture from Botai and more from Maykop, making it look less feasible. See first graph below. Second one below is the one where Steppe Maykop first branches off of Botai.
This graph left a worst f-stat asking for an edge from Progress_Eneolithic to Steppe_Maykop.
This last graph did have Steppe_Maykop requiring more Maykop admixture than Progress_Eneolithic, more in line with the f3-ratio stats (see below). However, another way of looking at this created worst Z-scores asking for admixture from EHG, possibly pointing to this group having steppe ancestry that is more like Khvalynsk. This resulted in the tree below.
I also looked at this admixture potentially coming from Early_Maykop, as the f3 score for both Maykop groups was similar. This also worked well.

This last one was more in line with f3-ratio outputs and also resolved the need for extra EHG when removing the 0 drift edge in there.


I decided to look at f3-ratio as well, to look for confirmation of these results. The results showed significant scores where the two sources were like Botai and Early_Maykop/Maykop_Novosvobodnaya. So, there is some good confirmation here that the models line up.
Source 1 Source 2 Target f_3 std. Error Z SNPs
Botai E_Maykop St_Maykop -0.008599 0.002107 -4.081 68545
Botai Maykop_N St_Maykop -0.008444 0.001574 -5.364 126581
Botai Progress St_Maykop -0.003914 0.001447 -2.705 141243
Botai Khvalynsk St_Maykop 0.005247 0.001706 3.075 106850

The f3-ratio using West_Siberia_N as the source 1 pop did produce more significant scores than Botai, but this could be due to the fact that West_Siberia_N is rather intermediate between Botai and Progress, as well as Khvalynsk.
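As a quick sanity check on the Botai table above, each Z-score should simply be the f3 estimate divided by its standard error. The sketch below recomputes them from the rounded values in the table. The cutoff of Z < -3 for calling an f3 significantly negative is a conventional threshold assumed here, not something stated in the table, and the variable names are only illustrative.

```python
# Rows copied from the table above:
# (source 1, source 2, target, f3, std. error, reported Z)
ROWS = [
    ("Botai", "E_Maykop", "St_Maykop", -0.008599, 0.002107, -4.081),
    ("Botai", "Maykop_N", "St_Maykop", -0.008444, 0.001574, -5.364),
    ("Botai", "Progress", "St_Maykop", -0.003914, 0.001447, -2.705),
    ("Botai", "Khvalynsk", "St_Maykop", 0.005247, 0.001706, 3.075),
]

ADMIXTURE_Z = -3.0  # assumed cutoff for a significantly negative f3


def z_score(estimate, std_error):
    """Z-score of an estimate given its standard error."""
    return estimate / std_error


supported = []
for s1, s2, target, f3, se, reported in ROWS:
    z = z_score(f3, se)
    print(f"{s1} + {s2} -> {target}: Z = {z:.3f} (reported {reported})")
    if z < ADMIXTURE_Z:
        supported.append((s1, s2))

print("Pairs passing the assumed cutoff:", supported)
```

The recomputed values agree with the reported ones to within rounding. Under the assumed cutoff only the two Maykop pairings pass, Progress falls just short, and Khvalynsk is positive.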



When looking back at all the data, combining the uni-parental markers with the statistical output, it seems that a population from around the Urals did move into the region north of Maykop proper, and acquired their cultural material (Chernykh, 2008), along with their genetic input. Since Maykop and Steppe Maykop both have the same carbon dates, it could be that Early Maykop is the better source, but there are too few samples across that time-frame to really come to that conclusion. Progress and Khvalynsk, or something in between, make a good fit as the steppe ancestry in these samples. All in all, it seems very possible for this to be the story of the genesis of Steppe Maykop, from a source more related to Native Americans, along with more typical West Asian ancestry.



Chernykh (2008) Formation of the Eurasian “Steppe Belt” of stockbreeding cultures.

Jeong et al (2018) Characterizing the genetic history of admixture across inner Eurasia.

Narasimhan et al (2018) The genetic formation of South and Central Asia.

Sahakyan et al (2017) Origin and spread of human mitochondrial DNA haplogroup U7.

Wang et al (2018) The genetic prehistory of the Greater Caucasus.



Steppe Eneolithic, or Prikaspiiskaya (Caspian Culture) from the “South”



(map created by Rob)

The new Wang et al. (2018) paper has opened up a new can of worms in the formation of Eneolithic steppe cultures. It also has some major gaps that leave room for a lot of speculation. What would be most helpful is samples from the Piedmont between 6000 BCE and 4500 BCE to see the formation of the Prikaspiiskaya Culture.

Vybornov (2016) reiterates a common theme suggesting that this culture originates on the Don at 5500 BCE, but migrations from West Asia are also suggested. The only problem with a Don origin is that we have samples from Eastern Ukraine, and they are nothing like this. I find it highly unlikely that this group originated close to there. I think the origin will be closer to the Caucasus, mixing with an early farming wave. Which direction they came from is more of the debate here.

The Meshoko samples presented are far too Anatolian to have contributed much to the Steppe Eneolithic, and with good reason. They are 1500 years or more after farmers appear in the region and begin at the tail of Shulaveri-Shomu, which had increased contacts with Halaf in levels IV and V, near the end of the culture (Hamon, 2008). This likely increased the Anatolian ancestry of these farmers. Meshoko-Darkveti was also influenced by local potters from the Don, with one site showing a deterioration of pottery quality in later sequences (Kozintsev, 2017).

Rob has also provided me with a nice graph here showing the Mesolithic through to the EBA for the region in question.



Beyond this, the Prikaspiiskaya differs and drifts towards the South and Eastern Caspian in terms of mud-brick architecture, lithics, and also domesticates. The early Jeitun culture is the first to be heavily dependent on sheep and goats (Harris et al, 1996), differing from other traditions, such as Mesopotamians with their cattle and pigs. Prikaspiiskaya currently only shows sheep and goats as domesticates. This is perhaps another clue as to where the contacts came from.

There are curious cases of potentially unrelated farming groups passing through the West Caspian region. Hajji Firuz shows evidence of an earlier group with different pottery. In Azerbaijan, local farmers, at least by 5900BCE could be the source of Shulaveri-Shomu, rather than farmers coming directly from Mesopotamia. Hajji Firuz has a good amount of Anatolian ancestry, probably far too much for Mesopotamians to make up much of the ancestry of even something as late as Meshoko. I would expect Shulaveri-Shomu to be even more shifted towards Iranian farmers, and even West Siberia Neolithic samples. I expect that Jeitun may be not much different from later farmers in the region, potentially between groups like Tepe Hissar ChL and Geoksiur EN. These groups look more like Iran_LN plus some CHG and West Siberia N-related ancestry.

Without these samples it is really hard to say what happened for sure with regard to the direction of flow. Is Prikaspiiskaya like South Caspian farmers, with the samples we have containing back-flow admixture from Khvalynskaya? It is hard to say. I think the big thing to take away is that this movement from the south happened after 6200 BCE, but before 5500 BCE. The Don should be ruled out by the Ukrainian samples. That leaves us with the first Shulaveri-Shomu, or a Jeitun or Kelteminar-related group somehow navigating to the region, either through the Caucasus, via Azerbaijan, by boat, or around the East Caspian. The interesting link in this might be Caspian fluctuations that brought about an increase of 8 m in depth (Naderi Beni et al, 2013) at the beginning of the Jeitun and ended at the time Prikaspiiskaya appears. There is also evidence of a hiatus at Jeitun just before the beginning of the Prikaspiiskaya. Could some have traveled North? They got their sheep somewhere, and it wasn’t from fully developed Shulaveri-Shomu.

Just to test this out, I tried a few runs with qpGraph, to see if these Eneolithic samples from the Piedmont want admixture from hunter-gatherers, like CHG, or from farmers around the South Caspian. I chose Tepe Hissar and Geoksiur as my stand-ins, as there is not much better of an option at this time. The results were pretty interesting.

I started by making a base graph that included Chimp, West_Siberia_N, EHG, CHG, Tepe_Hissar_ChL, and Geoksiur_EN.


This graph basically just left us with Geoksiur wanting some extra admixture from a population closer to West_Siberia_N, so that is where I went with the second graph.


With this all set, I went ahead and added Progress_Eneolithic. The first run, with them unattached, showed the strongest Z-score with EHG. So, that is where I placed them for the first run.


Interestingly, this graph left the most significant Z-scores wanting admixture from a source most like Tepe_Hissar, rather than CHG or even Geoksiur.


After completing this graph, Progress now wants some additional CHG. For the last graph, I put in an admixture edge from CHG to Progress_Eneolithic.


What we are left with is a population that is mostly represented by EHG, with minor CHG, and a good chunk of farmer ancestry from the South Caspian. This could represent several things. One is that the hunter-gatherers before 5500 BCE were EHG with just minor CHG. Two is that the farmers coming in were more CHG-like than Tepe_Hissar. Three is that these samples have some back-flow from Khvalynsk, making them more northern. With the samples available, it is really a guessing game. Nothing can be said for certain until there are samples all around the Caspian from 6500-5000 BCE. Then, we should have a much clearer idea of what happened.

Just to try to look for a different result, I tried adding Ust-Ishim to the next set of graphs to see if anything changed. The graphs were as follows:







As can be seen from the above graphs, adding Ust-Ishim really caused no issues here. The output is basically identical from the first, less complex run. Next, I wanted to explore the route from the South Caspian, around the East Side of the Caspian Sea. For this, I figured that Iranian Mesolithic samples from Hotu and Belt Cave would do. They do not have much coverage, but beggars can’t be choosers here.

Firstly, I wanted to explore just how Iran_Meso compares to Ganj_Dareh_N with a series of graphs. They went as follows:




I was quite surprised to see the extra non-basal admixture into Iran Meso was more like EHG than West_Siberia_N. We need more sampling, but a pop closer to EHG down the Caspian is a possibility. Lots could’ve happened between then and the Eneolithic of Central Asia to create an excess of West_Siberia_N ancestry.

The next step was to use simple graphs to see how Progress Eneolithic would look by adding more “Eastern options” for admixture. For this one, I used Iran_Meso, and Eneolithic Caucasus as admixing sources. Ukraine_N was also added, as a Don HG reference to see if it matches those archeological best-guesses.






As the above graph progression shows, overall, the genetic make-up of the Progress samples best matches the samples outside of the Caucasus. Minimal admixture from Meshoko is all that this graph produced. For the next set, I decided to add CHG. This made it much more complicated, but the overall outcome didn’t change, with CHG only taking a bit from Iran_Meso and Caucasus_Eneolithic.






f3-ratio Test for Admixing Source

Below, testing for an admixing source via f3 revealed no significant stats showing that one is a preferred source in a single event.

X Y Test f3 std. Error Z-score SNPs
CHG EHG Progress -0.001668 0.002449 -0.681 413273
Ganj_Dareh EHG Progress 0.000893 0.002306 0.387 445843
Iran_Meso EHG Progress 0.010461 0.003531 2.963 70383
Tepe_Hissar EHG Progress 0.001471 0.002257 0.652 473798
Hajji_Firuz EHG Progress 0.005077 0.002363 2.149 448044
Tepe_Anau EHG Progress 0.002793 0.002316 1.206 415830
Geoksiur EHG Progress 0.004428 0.002316 1.912 347963
Sarazm EHG Progress 0.004181 0.002509 1.667 352405
Cau_Eneo EHG Progress 0.001736 0.00254 0.683 319180
Iran_LN EHG Progress 0.001374 0.002707 0.508 222793


With all of the evidence here, along with the fact that Iran_Meso seems to fall on a Ganj_Dareh_N > EHG cline, rather than with West_Siberia_N, the question may become how much of the EHG in Progress is from North of the Caspian. Is it all from Khvalynsk or local HG, or did it all come from East of the Caspian? Some of this could also be due to the excess ENA in West_Siberia_N not being present in Central Asia. Lithics point to movements from the South, towards the Urals, from at least the 8th millennium BCE. Could some have come with domesticated animals? It certainly seems plausible, as the domesticates in the steppes do not include pigs, which were common in the Caucasus and Europe. Sheep do not occur in the wild in the Urals and European steppes. These were clearly brought in by other people. Could this dead-end R1b be from East of the Caspian? I suppose it is possible, with R1b being in Botai, in northern Kazakhstan. R1b being rooted in Central Asia is not a new idea. It could have been there for as much as 30 KYA, or more.

These are all just possibilities. The lithics are not linked to the Caucasus, but possibly to Central Asia, and the clear links with pottery and even the limited domesticates point there too. FrankN has done a good job summarizing the southern links with the Steppe Eneolithic, and checking that out would be of interest to some. Kelteminar is speculated to originate from the South Caspian, in Iran. They very well could be just like the Hotu and Belt Cave samples. Clearly, samples from all around the Caspian, including the Caucasus, are needed from 6500 BCE to 5000 BCE to really say what happened. Until then, none of us can really say for sure. Hopefully, more sampling will help to shed light on just who these Eneolithic Steppe folks were.




Harris et al (1996) Jeitun: Recent excavations at an early neolithic site in Southern Turkmenistan.

Naderi-Beni et al (2013) Caspian sea-level changes during the last millennium: historical and geological evidence from the South Caspian Sea.

Narasimhan et al (2018) The genetic formation of South and Central Asia.

Szymczak et al (2006) Exploring the neolithic of the Kyzyl-Kums.

Vybornov et al (2015) The origin of farming in the Lower Volga region.

Wang et al (2018) The genetic prehistory of the Greater Caucasus.









Pots, Not People, in Northwest Anatolia?

As Rob pointed out to me, there appears to be a new date for a sample from Barcin Hoyuk, in Turkey. This sample (I0707) is now dated to the Mesolithic (9650-9291 cal BCE). This is a huge find, as it changes the view of the way agriculture got to the region altogether. I feel that this may be the biggest thing included in the Wang et al (2018) paper, but others interested in Indo-European studies may disagree. I have contacted the authors to verify this, just to be sure.

Nearly everyone has attributed the appearance of agriculture in NW Anatolia to migration from some other place in Anatolia. This often includes Hacilar and Catalhoyuk. This is based on some similarities in pottery, but it ignores the differences in the domestic plants in use, as well as in the architecture of NW Anatolia.

With this new sample, it appears we may have our first evidence of agriculture and animal husbandry being adopted by a local, or near-local, Mesolithic group. Using data available from Mathieson et al (2018) and Lazaridis et al (2016), I decided to see how this Mesolithic sample compares to the farmers of Barcin, 3000 years later. The results were quite shocking.


Anatolia_N and Meso_Anatolia Form a Clade, With Respect To Other Groups

The following D-stats show that the average of all farmers and the Mesolithic Barcin sample are nearly identical, only deviating with |Z| > 2 for Ukraine_N, towards Anatolia_N.

Outgroup Test X Y D Z-Score SNPs
Chimp CHG Anatolia_N Meso_Ana 0.000322 0.792 946889
Chimp Iran_N Anatolia_N Meso_Ana -0.000379 -0.895 751697
Chimp Iron_Gates Anatolia_N Meso_Ana -0.000465 -1.348 945737
Chimp Natufian Anatolia_N Meso_Ana -0.000126 -0.258 454638
Chimp Ukraine_N Anatolia_N Meso_Ana -0.000831 -2.441 877282
Chimp Levant_N Anatolia_N Meso_Ana -0.000047 -0.122 737416
Chimp Hajji_Firuz Anatolia_N Meso_Ana -0.000022 -0.062 879843
Chimp Pponnese_N Anatolia_N Meso_Ana 0.000149 0.435 917602
Chimp Starcevo Anatolia_N Meso_Ana -0.000125 -0.351 880994


F3-ratio test for Potential Admixture

In agreement with the D-stats, there is only one stat that reaches |Z| > 3 for admixture, with Meso_Anatolia as one source and Anatolia_N as the target. This was with Ukraine_N, suggesting that there is potentially a pop nearby that contains some of this ancestry, possibly from around Bulgaria.

Source 1 Source 2 Target f_3 std. Error Z SNPs
Meso_Ana Iron_Gates Anatolia_N -0.00164 0.000896 -1.831 745983
Meso_Ana Ukraine_N Anatolia_N -0.003474 0.000917 -3.787 671035
Meso_Ana Levant_N Anatolia_N -0.000648 0.001014 -0.639 547417
Meso_Ana CHG Anatolia_N 0.001094 0.001347 0.812 691079
Meso_Ana Iran_N Anatolia_N -0.002526 0.001415 -1.785 556216


qpAdm Model of Anatolia_N Using Meso_Anatolia

This testing also showed agreement with D-stats and f3-ratio. Anatolia_N was modeled as a two-way mixture of Meso_Anatolia and Ukraine_N. For the outgroups, I have used Chimp, Iron_Gates, Ust_Ishim, EHG, West_Siberia_N, IBM (Iberomaurusians), Ganj_Dareh_N, Brazil_LopaDoSanto_9600BP, Natufian, and CHG.

The test was also successful in modeling Anatolia_N as a mix of about 97 percent Meso_Anatolia, with 3 percent admixture from a source similar to Ukraine_N.

numsnps used: 520478

best coefficients: 0.970 0.030

std. errors: 0.017 0.017

fixed pat     wt    dof      chisq       tail prob

   00                0        8        5.839      0.665269
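The tail probability in this output can be reproduced from the chi-square value and the degrees of freedom. The sketch below uses the closed-form chi-square survival function that holds for an even number of degrees of freedom, so it needs nothing outside the standard library. The function name is my own, and reading a tail probability well above 0.05 as "the model is not rejected" is the usual convention rather than something printed by qpAdm.

```python
import math


def chi2_tail_even_dof(chisq, dof):
    """Upper-tail probability of a chi-square variable, for even dof only.

    For dof = 2k the survival function is
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = chisq / 2
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Values copied from the qpAdm output above.
tail = chi2_tail_even_dof(5.839, 8)
print(f"tail prob = {tail:.6f}")

# How many standard errors the Ukraine_N coefficient is from zero.
weight_z = 0.030 / 0.017
print(f"Ukraine_N weight / std. error = {weight_z:.2f}")
```

The recomputed tail probability matches the reported 0.665. The second figure is worth keeping in mind: the 3 percent Ukraine_N weight is under two standard errors from zero, so the two-way model fits, but the output does not by itself establish that the weight is non-zero.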


qpGraph Modeling of Anatolia_N

As a final check, I ran qpGraph to see if there would be more confirmation of the above stats.

While more complex, I tried to include several pops as the outgroups, or admixture sources in the creation of Meso_Anatolia, and to see how this compares to Anatolia_N. I have included Chimp as the outgroup, with CHG, Ust_Ishim, Natufians, Iron_Gates, and Ukraine_N as potential sources of ancestry and additional admixture for Anatolia_N. This first graph is the base of Meso Anatolia, where a mix of a pop similar to Iron_Gates, Natufians, and CHG does well to create the potential source population at Barcin Hoyuk during the Mesolithic.


For the next graph, I included Anatolia_N, connected to the Mesolithic Barcin sample, essentially to see if they are a clade, as this was the heavy favorite in Z-score when Anatolia_N was not attached to any one population.


These two populations did indeed essentially form a clade. There isn’t a great deal of reason to improve this, but with the stat connecting Anatolia_N to Ukraine_N, I figured I would explore that and see if it improved the fit.


This final graph actually showed the same thing as qpAdm, in agreement with D-stats and f3-ratio, with Anatolia_N needing admixture from a population similar to Ukraine_N, at a rate of 3 percent. Pending the response from the authors verifying the dating of the sample, it does appear that farming was a local event, potentially adopted by the previous Mesolithic inhabitants.

While this is only based on a single sample, there is enough here to suggest the possibility that the Neolithic package arrived in Barcin through cultural exchange, rather than demographic movements. More high quality samples from other parts of Anatolia will help to see if this holds up. Comparison between the shotgun Boncuklu and Barcin samples may have had some artifact affecting the stats showing movements from the Levant. Soon, I will look back at Europe, as the picture there looks much more complex, with some significant stats towards other populations other than hunter gatherers. This could include from the Levant and Eastern Anatolia.




Mathieson et al. (2018) The genomic history of southeastern Europe. Nature volume 555, pages 197–203 (08 March 2018)

Lazaridis et al (2016) Genomic insights into the origin of farming in the ancient Near East. Nature volume 536, pages 419–424 (25 August 2016)

Wang et al. (2018) The genetic prehistory of the Greater Caucasus.






Of Stone & Blood: The demography of the Megalithic expansions (work in progress)

One subject that has often been overlooked is the genetic influence of the first megalith builders of Northern France. This location is a meeting point between Danubian and Cardial Neolithic groups, and it also saw mixture with Mesolithic hunters of the region. This mixture can be seen quite clearly in the mtDNA (Rivollat et al, 2015; Le Roy et al, 2016). What is also in question is just how much impact Megalithism had on the genetic make-up of Northern European farmers of the Middle and Late Neolithic. Is this a movement of people, or “monumental” ideas?


Since early times, the movement of farmers to Britain was seen as a movement from France, from Michelsberg, or related groups (Childe, 1931). To test this idea, I used datasets from Lipson et al (2017), Olalde et al (2018), and Mathieson et al (2018) to look for a relationship using qpGraph.

Not only are the samples from Southern France and England alike, they are basically identical! Both are probably rooted in the Paris Basin, showing Danubian, Cardial, and Mesolithic roots. The link is shown below, in Le Roy et al (2016). Also, note the shift towards France by later Central Farmers of the Middle to Late Neolithic.









Below is the first graph, showing the genetic ties between England and Southern France. Both are likely from the Paris Basin group.


D-stats showing France_MN and England_N as a clade

Out Test X Y D Z-score SNPs
Chimp Ust_Ishim France_MN England_N -0.000325 -1.052 890030
Chimp WHG France_MN England_N 0.000041 0.116 888629
Chimp Iberia_EN France_MN England_N -0.000248 -0.923 860108
Chimp LBKAustria France_MN England_N -0.000119 -0.469 887340
Chimp GermanMN France_MN England_N 0.000152 0.514 853634
Chimp Gokhem2 France_MN England_N -0.00022 -0.437 209487
Chimp MN_Iberia France_MN England_N -0.000539 -1.933 830076
Chimp GAC_Poland France_MN England_N 0.000002 0.007 694973
Chimp Czech_MN France_MN England_N -0.000167 -0.558 818848

Graph with England_N alone


German Middle Neolithic Funnelbeaker

This same set-up is the one I used to then find the relationship to later Megalith building cultures. The first one I chose to look at was the Funnelbeaker samples from Germany. The surprising thing from this graph is that the German Middle Neolithic samples (minus the Roessen sample) were almost exactly like the England and French Neolithic samples, with a slight bit more Danubian admixture.



Attempt to create TRB-Germany without admixture from French-related groups

The first graph is just Germany_MN, unplaced, with the worst Z being one that asks for Germany_MN and England_N to be closer.


The second graph forces Germany_MN to be attached to LBK


The worst-Z from this graph asks for WHG to be closer to Germany_MN. The following graph includes an edge from WHG to Germany_MN.


As a result of this graph, the worst-Z now asks for Germany_MN and Iberia_EN to be closer. Due to this, a migration edge from Iberia_EN is added to the next graph.


This last graph has many issues. Not only the zero drift edges, but a migration edge of 0% from Iberia to Germany and a worst-Z still not resolved. It seems that deviating from the desire to connect Germany_MN and England_N caused the poor outcome.

Iberian Middle Neolithic at La Mina

The second group that I looked at was the Middle Neolithic group from La Mina, in Northern Spain. Once again, the Middle Neolithic population was almost identical to the group with roots in France. Specifically, those that moved to Southern France.


Swedish Funnelbeaker

While Gokhem2 does not have as much coverage as I would like, I thought that she would also be good to check, to make sure she is also rooted in the same group. As suspected, she was also almost exactly like the English and French samples.



Globular Amphora in Poland

Since Globular Amphora is supposed to come out of the TRB group in Northern Germany, I thought that I would also take a look at them and see if they are also closely related to the farmers from France. They also turned out to have a great deal of their ancestry from this group.



So, the picture that is emerging is that at one time, groups from Iberia, to Britain, up to Scandinavia, and all the way to the steppes of Ukraine were rooted in the French Middle Neolithic.

D-stats suggestive of Cardial gene-flow

Out Test X Y D Z-score SNPs
Chimp England_N Iberia_EN LBKAustria -0.00065 -3.053 922319
Chimp France_MN Iberia_EN LBKAustria -0.00089 -3.519 877516
Chimp GermanMN Iberia_EN LBKAustria 0.000082 0.311 927867
Chimp Gokhem2 Iberia_EN LBKAustria 0.000096 0.253 232081
Chimp MN_Iberia Iberia_EN LBKAustria -0.001163 -4.727 888929
Chimp GAC Iberia_EN LBKAustria 0.000024 0.103 923159
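To make it easier to see which of these rows carry the signal, the table can be screened by Z-score. The sketch below parses the rows exactly as printed above and keeps those with |Z| of at least 3. That cutoff is a conventional choice assumed here, and the helper name is hypothetical.

```python
# Rows copied verbatim from the table above: Out, Test, X, Y, D, Z-score, SNPs.
TABLE = """\
Chimp England_N Iberia_EN LBKAustria -0.00065 -3.053 922319
Chimp France_MN Iberia_EN LBKAustria -0.00089 -3.519 877516
Chimp GermanMN Iberia_EN LBKAustria 0.000082 0.311 927867
Chimp Gokhem2 Iberia_EN LBKAustria 0.000096 0.253 232081
Chimp MN_Iberia Iberia_EN LBKAustria -0.001163 -4.727 888929
Chimp GAC Iberia_EN LBKAustria 0.000024 0.103 923159
"""

Z_CUTOFF = 3.0  # assumed significance threshold


def parse_rows(text):
    """Split each whitespace-separated line into a labeled record."""
    rows = []
    for line in text.strip().splitlines():
        out, test, x, y, d, z, snps = line.split()
        rows.append(
            {"out": out, "test": test, "x": x, "y": y,
             "d": float(d), "z": float(z), "snps": int(snps)}
        )
    return rows


rows = parse_rows(TABLE)
significant = [r["test"] for r in rows if abs(r["z"]) >= Z_CUTOFF]
print("Test pops with |Z| >= 3:", significant)
```

With this cutoff, England_N, France_MN, and MN_Iberia stand out, while GermanMN, Gokhem2, and GAC sit close to zero.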

African Asymmetry

Now, Chimp does have its shortfalls, such as the branch-shortening effect, and ancient attraction, but it is basically symmetrical in relationship to each of these samples. However, the same can not be said for Africans. That is the reason that I have not included them in these graphs. Now, with them, the graphs really don’t change, but you can have the need for edges from farmers to Africans. Farmers, specifically those from Europe, are much closer to basically all Africans, when put against ENA, hunter-gatherers, and more basal sources, such as Natufians.

This is reflected in graphs, and also D-stats. Now, how this is so is the big question. We may see farmer movement taking R1b-V88 and Iberian ancestry as far as South_Africa_2000BP, or something such as ascertainment bias. The fact that Malawi_Hora_Holocene_8100BP is also showing this is quite puzzling and leaning me more towards some kind of bias. The likelihood of a migration of something like a Dzudzuana to SE Africa that early could be possible, but without the actual sample, that is hard to test.

Out Test X Y D Z-Score SNPs
Chimp Mbuti_DG Iberia_EN LBK -0.000074 -0.522 982851
Chimp Mbuti_DG Iberia_EN Natufian -0.000816 -2.636 474711
Chimp Mbuti_DG Iberia_EN GanjDareh -0.000459 -1.691 787482
Chimp Mbuti_DG Iberia_EN IBM 0.000146 0.595 964435
Chimp Mbuti_DG Iberia_EN Van2900BP -0.000726 -2.156 432540
Chimp Mbuti_DG Iberia_EN Iron_Gates -0.000646 -3.658 983550
Chimp Mbuti_DG Iberia_EN EHG -0.00045 -1.947 943281
Chimp Mbuti_DG Iberia_EN SteppeEBA -0.000218 -1.259 981530
Chimp Mota Iberia_EN LBK -0.00013 -0.672 982461
Chimp Mota Iberia_EN Natufian -0.001552 -3.748 474581
Chimp Mota Iberia_EN GanjDareh -0.000572 -1.566 787235
Chimp Mota Iberia_EN IBM -0.000134 -0.4 964105
Chimp Mota Iberia_EN Van2900BP -0.000879 -1.906 432413
Chimp Mota Iberia_EN Iron_Gates -0.000761 -3.119 983154
Chimp Mota Iberia_EN EHG -0.0009 -2.745 942951
Chimp Mota Iberia_EN SteppeEBA -0.00039 -1.64 981147
Chimp S_A2000BP Iberia_EN LBK -0.000432 -2.587 931815
Chimp S_A2000BP Iberia_EN Natufian -0.000884 -2.562 455408
Chimp S_A2000BP Iberia_EN GanjDareh -0.000862 -2.975 750479
Chimp S_A2000BP Iberia_EN IBM -0.000536 -1.999 914899
Chimp S_A2000BP Iberia_EN Van2900BP -0.000479 -1.33 414977
Chimp S_A2000BP Iberia_EN Iron_Gates -0.00093 -4.843 932493
Chimp S_A2000BP Iberia_EN EHG -0.000834 -3.332 894713
Chimp S_A2000BP Iberia_EN SteppeEBA -0.000556 -2.95 930610
Chimp Malawi_HH Iberia_EN LBK -0.000165 -0.79 575904
Chimp Malawi_HH Iberia_EN Natufian -0.000828 -1.801 320660
Chimp Malawi_HH Iberia_EN GanjDareh -0.000506 -1.312 505066
Chimp Malawi_HH Iberia_EN IBM -0.000526 -1.589 571516
Chimp Malawi_HH Iberia_EN Van2900BP -0.000141 -0.311 310516
Chimp Malawi_HH Iberia_EN Iron_Gates -0.000669 -2.642 576018
Chimp Malawi_HH Iberia_EN EHG -0.000607 -1.87 563496
Chimp Malawi_HH Iberia_EN SteppeEBA -0.000593 -2.408 575632
Chimp Yoruba Iberia_EN LBK -0.000133 -0.871 897516
Chimp Yoruba Iberia_EN Natufian -0.000948 -2.915 432020
Chimp Yoruba Iberia_EN GanjDareh -0.00057 -1.919 718598
Chimp Yoruba Iberia_EN IBM 0.001219 4.714 881191
Chimp Yoruba Iberia_EN Van2900BP -0.000978 -2.637 394707
Chimp Yoruba Iberia_EN Iron_Gates -0.000762 -4.026 898020
Chimp Yoruba Iberia_EN EHG -0.000876 -3.353 862918
Chimp Yoruba Iberia_EN SteppeEBA -0.000375 -1.995 896410

Testing with qpGraph also always resulted in gene-flow from a source like Iberia_EN, and sometimes also from Iberomaurusians (IBM). Mbuti is not bad, in that it is nearly perfectly aligned with Danubian and Cardial sources, the problem lies in the HG stats, where Mbuti does want to be closer to farmers than hunters, and the flow is always to Africa. Testing with flow both to and from Iberian farmers illustrates the same with all Africans in a tree. Here is a tree with South_Africa_2000BP as an example:




This asymmetry makes me wary of including them in analysis that includes both hunters and various types of farmers. I also feel adding too many more admixture edges will be distracting and admixture from Eurasians to Africans does need more ancient samples to resolve whether these are all legit or not.

Trying to create a graph without Chimp does lose a more neutral source for sorting out this farmer and hunter ancestry in Northern Europe, although it does seem minimal, judging by this basic tree involving England_N and France_MN. It seems the most affected is the actual Cardial ancestry in the samples. They become less Cardial than Danubian, in the tree not involving TRB samples, but actually are now closer to what they were with the other farmers involved in the tree.

However, moving onto Germany_MN, we see that Chimp would make the fit better, decreasing the worst Z that seems to want more Cardial influence to England_N.



Including Mbuti into the Analyses

As expected, adding an African to a graph, when they are not symmetrically related, created a graph that becomes overly complicated and does not match other statistical methods. This includes haplotype analysis of an Irish farmer that shared more ancestry with Cardial than Danubian farmers.


With this output, it seems safest to avoid using Africans in analysis that is Eurasian specific and stick to using Chimp as a more neutral outgroup. No African can be a purely symmetrical outgroup and can drastically affect the quality of the graph and qpAdm output.
With all of these results, it appears that Funnelbeaker and Globular Amphora not only share a good number of ideas regarding pottery and burial, but actually developed from Atlantic farmers who mixed with local farmer and hunter groups, adopting some of their culture and genes as well.
With the results from the graphs, D-stats, and also parental markers, it seems quite probable that farmers from France are responsible for the spread of Megalithism. Not only do the graphs prefer a closeness between farmers from France, England, and Germany, but there are new markers of “Western” origin in German Middle Neolithic farmers. These include many shared Y-DNA markers, such as I2a1b, and shared mtDNA lineages that are not found in Danubian farmers, such as U3, H3, H4, and others. Also notable is f3(England_N, LBK_Austria; Germany_MN) = -0.000433 (Z = -0.382, 353272 SNPs), which is negative, though not significantly so, and is therefore at least compatible with gene flow from France to Germany.
While more sampling across France will help, it does seem that there is a shared ancestry between England, France, and Germany that goes beyond shared Danubian ancestry.
Gordon Childe, V. (1931). The Continental Affinities of British Neolithic Pottery. Archaeological Journal, 88(1), 37–66. doi:10.1080/00665983.1931.10853568

Melie Le Roy et al (2016) Distinct ancestries for similar funerary practices? A GIS analysis comparing funerary, osteological and aDNA data from the Middle Neolithic necropolis Gurgy “Les Noisats” (Yonne, France)

Mark Lipson et al (2017) Parallel paleogenomic transects reveal complex genetic history of early European farmers. Nature volume 551, pages 368–372 (16 November 2017)
Inigo Olalde et al (2018) The Beaker phenomenon and the genomic transformation of Northwest Europe. Nature volume 555, pages 190–196 (08 March 2018)
Maïté Rivollat et al (2015) When the Waves of European Neolithization Met: First Paleogenetic Evidence from Early Farmers in the Southern Paris Basin. PLoS ONE. doi:10.1371/journal.pone.0125521
Nicole Sheridan (2013) Early Neolithic habitation structures in Britain and Ireland: A matter of circumstance and context.

The Karasuk Culture: Potentially the Ancestors of Iranian and later Scytho-Sarmatian nomads

With the advances in aDNA, we have now begun to tackle questions such as the origin of the “Scythian peoples”. This was first seen with Unterländer et al (2017), and more samples were included in Damgaard et al (2018). With the help of Allentoft et al (2015), Mathieson et al. (2018), and Narasimhan et al. (2018), along with the two previously mentioned papers, I will examine the question of origin for the early Iranian nomads.

Bagley (n.d.) attempted to summarize the work on the early Zhou period and its interaction with the Siberian Bronze Age center. This was based on work by Loewe & Shaughnessy (1999). It highlights interesting aspects of the trade between these two groups, with artifacts related to the Karasuk culture spreading not only to China, but also towards Europe (Bagley, n.d.). While the early dating of this movement (Chernykh, 2008) does not really match the genetic picture to this point, there are later samples which hint in this direction.


Since the time of Herodotus, many have had their own ideas on the origins of the Scythians. Mallory (1989) noted that some thought the origin lay in the west, in the region north of the Black Sea. Others saw the Scythians, and Iranians in general, as originating in Central Asia, or even Siberia. Some have even thought that a multi-regional origin was more likely, with changes being cultural rather than demographic.

Davis-Kimball (2005) was one who saw the Scythians as a multi-ethnic group, rather than a group with a single origin or a label for a single people. Sometimes anything west of Inner Mongolia and China was referred to as Scythian, but the term would also sometimes be restricted to those in the Western and Central Steppes (Di Cosmo, 1999).

The first way to go at this, I feel, is to look at Karasuk, a culture that Mallory (1997) described as very mobile compared to Andronovo, and that is known more by its kurgan burials than its settlements. Karasuk is also seen as highly influential and as starting the animal art so common among the “Scythian” people (Keyser et al, 2009). Mallory (1997) even mentions the potential of the Karasuk to have a specifically “proto-Iranian” identity. The influence of the Yenisei and Slab Grave people should not be underplayed (Mallory, 1997). Okunevo is thought to be a mix of Afanasievo and local Yeniseian groups (Great Soviet Encyclopedia, 1979), in an area later within the Andronovo sphere, and this mixing may well be the formation of the Karasuk culture within the Minusinsk Basin. Okunevo is also thought to be the group that introduced realistic animal art to these later steppe pastoralists.

First of all, I wanted to take a look at the Karasuk cluster that is closer to the Andronovo samples in PCA. To understand the make-up of Karasuk, I first used qpAdm to find a valid model of their origin. The set of right populations, or outgroups, included Mbuti_DG, Ust_Ishim, Kostenki14, EHG, Villabruna, Ganj_Dareh_N, Anatolia_N, Steppe_EMBA, Karitiana, and the Ami.

The most successful model of the Karasuk culture needed excess Han-related ancestry in addition to the ENA found in the Okunevo samples. This is best exemplified by the Shamanka_BA run.


Chi-square Tail-prob Andronovo Okunevo Han
17.866 0.0222543 0.721 0.279 NA
std error 0.02 0.02 NA
Chi-square Tail-prob Andronovo Okunevo Han
9.613 0.211584 0.766 0.178 0.056
std error 0.026 0.04 0.019
Chi-square Tail-prob Andronovo Shamanka_BA
5.1 0.746845 0.814 0.186
std error 0.016 0.016
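As a sanity check on how to read these tables, the tail probability can be recomputed from the chi-square value. The sketch below is my own illustration, not qpAdm output: it assumes the degrees of freedom equal the number of right populations (10 in the list above) minus the number of sources, an assumption that reproduces the tail probabilities reported above to within rounding. The function names are hypothetical.

```python
import math


def chi2_tail(chisq, dof):
    """Upper-tail probability of a chi-square value for integer dof >= 1."""
    y = chisq / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < dof / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


N_RIGHT = 10  # right populations listed in the text above


def qpadm_dof(n_sources, n_right=N_RIGHT):
    # Assumption: dof = number of right populations minus number of sources.
    return n_right - n_sources


# Andronovo + Okunevo (two sources), then Andronovo + Okunevo + Han (three).
print(chi2_tail(17.866, qpadm_dof(2)))
print(chi2_tail(9.613, qpadm_dof(3)))
```

Under that assumption, a higher chi-square with the same number of sources always means a lower tail probability, which is why the two-way Andronovo + Okunevo model looks so much worse than the Shamanka_BA run.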

Looking at the deeper ancestry of the Karasuk culture, I tried to model them as a mix of Sintashta, Afanasievo, and an ENA group from the Baikal area, Shamanka_EN. This made sense as a mixture of a Siberian hunter group, Bronze Age steppe pastoralists, and Middle to Late Bronze Age groups in Central Asia. While the standard errors are a little high, it is clear that the dominant ancestry in Karasuk is Sintashta-related.

Chi-square Tail-prob Sintashta Shamanka_EN Afanasievo
6.196 0.625314 0.686 0.189 0.125
std error 0.069 0.014 0.07

After adding Steppe_MLBA, Germany_MN, and West_Siberia_N to the right (outgroup) populations:

Chi-square Tail-prob Sintashta Shamanka_BA Afanasievo
7.951 0.633621 0.541 0.178 0.281
std error 0.081 0.017 0.081

Interestingly, the Karasuk is also seen to have expanded, or at least spread its influence, all the way towards the Black Sea, and at least as far as the Aral Sea (citation needed).

Other samples dating to about the same time, from north of the Aral Sea, are seen in Mezhovskaya. Even more interesting is that these samples are near genetic duplicates of the Karasuk samples. Could Mezhovskaya be part of the western Karasuk group that creates the great cultural uniformity among earlier Iranian nomads through the Scythian period? Potentially, yes.


Chi-square Tail-prob Andronovo Okunevo Han
12.248 0.140492 0.741 0.259 NA
std error 0.028 0.028 NA
Chi-square Tail-prob Andronovo Okunevo Han
6.036 0.535555 0.784 0.151 0.064
std error 0.032 0.051 0.025
Chi-square Tail-prob Andronovo Shamanka_BA
5.318 0.723087 0.846 0.154
std error 0.022 0.022

With Chechushkov et al (2018), we see that horse-riding in battle may have begun in Central Asia between 1500 and 1200 BCE, which is, of course, during the highly mobile Karasuk period and within the range of these groups.

Mezhovskaya can essentially be modeled as 100% Karasuk with qpAdm, as any additional ancestry is within the standard error of that component.

The next question then is, is Karasuk, and possibly by extension Mezhovskaya, the homeland and ancestors of the Scythians? Are they also ancestral to the western Scythians, as far as Hungary?


art by Johnny Shumate

The first Scythian group I looked at was the Tagar culture, which followed the Karasuk in the Minusinsk Basin. The Karasuk is indeed very important here for the Tagar. Even the Karasuk + Karasuk outlier combination works here. What’s even more interesting about the Tagar culture is the great similarity between their art and that of the European Scythians (Keyser et al, 2009; Encyclopaedia Britannica, n.d.).


Chi-square Tail-prob Karasuk Okunevo
5.291 0.726104 0.933 0.067
std error 0.037 0.037
Chi-square Tail-prob Karasuk Shamanka_BA
7.201 0.515133 0.967 0.033
std error 0.021 0.021



The Pazyryk culture is another well-known group of Scythians, which includes the famous tattooed mummy. Their culture is seen as having been very warlike (citation needed).

They also require a lot of Karasuk ancestry, along with ancestry from groups that lived nearby or are closely related to these samples.

Chi-square Tail-prob Karasuk Okunevo Han
16.037 0.0247822 0.313 0.34 0.347
std error 0.036 0.05 0.023
Chi-square Tail-prob Karasuk ShamankaBA Han
1.761 0.971876 0.43 0.43 0.14
std error 0.028 0.08 0.061



Chi-square Tail-prob Karasuk Okunevo
116.899 1.45E-21 0.41 0.59
std error 0.089 0.089
Chi-square Tail-prob Karasuk BMAC Han
6.075 0.531047 0.568 0.099 0.333
std error 0.045 0.042 0.019


Tian-Shan Saka

Chi-square Tail-prob Karasuk Okunevo BMAC Han
10.628 0.10059 0.574 0.134 0.21 0.082
std error 0.06 0.047 0.023 0.018
Chi-square Tail-prob Karasuk ShamankaBA BMAC
10.703 0.152108 0.618 0.173 0.209
std error 0.04 0.017 0.033

The Tian-Shan Saka graph here did get a little over-complicated for my taste, but with such a complex mixture it might be bound to happen.



Chi-square Tail-prob Karasuk Okunevo Han BMAC
6.786 0.341095 0.429 0.284 0.180 0.107
std error 0.06 0.051 0.02 0.034
Chi-square Tail-prob Karasuk Shamanka_BA BMAC
2.488 0.927977 0.526 0.372 0.102
std error 0.044 0.019 0.036


Scythian_Samara (Steppe_IA)

Chi-square Tail-prob Karasuk Armenia_EBA
20.194 0.00962638 0.923 0.077
std error 0.048 0.048
Chi-square Tail-prob Karasuk BMAC
15.104 0.0571488 0.863 0.137
std error 0.042 0.042
Chi-square Tail-prob Karasuk BMAC West_Siberia
9.936 0.192223 0.769 0.166 0.065
std error 0.067 0.045 0.037
Chi-square Tail-prob Karasuk BMAC Botai
8.261 0.310125 0.674 0.236 0.089
std error 0.108 0.068 0.056
Chi-square Tail-prob Mezhovskaya BMAC
12.765 0.120182 0.913 0.087
std error 0.05 0.05
Chi-square Tail-prob Tagar BMAC
17.493 0.0253636 0.841 0.159
std error 0.042 0.042

Hungarian Scythian

Chi-square Tail-prob Karasuk Hungary_BA
13.624 0.0921081 0.355 0.645
std error 0.035 0.035
Chi-square Tail-prob Karasuk Balkan_BA
13.99 0.0820368 0.247 0.753
std error 0.037 0.037
Chi-square Tail-prob Scythian_Samara Hungary_BA
18.514 0.0176836 0.314 0.686
std error 0.029 0.029
Chi-square Tail-prob Mezhovskaya Hungary_BA
16.258 0.0388319 0.339 0.661
std error 0.043 0.043



Allentoft et al., Population genomics of Bronze Age Eurasia, Nature 522, 167–172 (11 June 2015) doi:10.1038/nature14507

Bagley, R. Shang Archaeology; The Northern Zone. (1999)

“Central Asian arts: Neolithic and Metal Age cultures”. Encyclopædia Britannica Online. Encyclopædia Britannica

Chechushkov et al., Early horse bridle with cheekpieces as a marker of social change: An experimental and statistical study, Journal of Archaeological Science, Volume 97, September 2018, Pages 125-136.

Chernykh (2008) Formation of the Eurasian “Steppe Belt” of stockbreeding cultures: viewed through the prism of archaeometallurgy and radiocarbon dating.

Di Cosmo, Nicola, “The Northern Frontier in Pre-Imperial China (1,500 – 221 BC)”, in: M. Loewe, E.L. Shaughnessy, eds, The Cambridge History of Ancient China: From the Origins of Civilization to 221 BC, Cambridge University Press, 1999, ISBN 9780521470308

Keyser, Christine; Bouakaze, Caroline; Crubézy, Eric; Nikolaev, Valery G.; Montagnon, Daniel; Reis, Tatiana; Ludes, Bertrand (May 16, 2009). “Ancient DNA provides new insights into the history of south Siberian Kurgan people”. Human Genetics. Springer-Verlag.

Mallory, J. P. (1997). Encyclopedia of Indo-European Culture. Taylor & Francis. ISBN 1884964982.

Mathieson et al., (2018) The genomic history of southeastern Europe. Nature 555, 197-203. (Paper / doi:10.1038/nature25778)

Narasimhan et al, The Genomic Formation of South and Central Asia, Posted March 31, 2018, doi:

“Okunev Culture”. The Great Soviet Encyclopedia. 1979

Unterländer et al., Ancestry and demography and descendants of Iron Age nomads of the Eurasian Steppe, Nature Communications 8, Article number: 14615 (2017), doi:10.1038/ncomms14615

Another look at South Asian aDNA

With Narasimhan et al (2018), we got our first look at Central, South Central, and South Asian aDNA. Not only did we get to see new steppe samples throughout the Bronze Age, but also samples from the Chalcolithic through the Bronze Age in the Turan region, including BMAC. While there certainly looks to be steppe ancestry in South Asia, it has likely been highly inflated in models based on previously available aDNA, and in those that did not account for ANE that was already present in the region. With the soon-to-be-released Harappan sample(s), the models will only improve further.

This post will be constantly evolving as I add new outputs from qpAdm and qpGraph, so keep checking back in.

What I have noticed using qpAdm is that South Asian Dravidians do wonders as stand-ins for Harappan ancestry. So, we may see that some group greatly resembles them. I have seen that using the Palliyar and Paniya does work well, but the Irula does seem to work best. I don’t know whether that really means anything or simply reflects the fact that they have more coverage.

The first thing I did was to look for populations to occupy the right pops, that is, populations which create the most significant D-stats between my left populations (those used in the mixture). Aside from an African, Mbuti_DG, I found that Ust-Ishim, Onge, Ami, EHG, Iron_Gates, Anatolia_N, Ganj_Dareh_N, and Karitiana work well. Kostenki14 is hit and miss, as it doesn’t always produce significant stats when comparing two populations. This could be due to the age of the sample, which had not really developed any significant drift that can help differentiate populations in the test. This can lead to higher chi-squares and lower tail-probabilities.

For the following, Brahmin_SGDP and Brahmin_Tiwari did have good marker counts, ranging from 170-200K, but the Brahmin_TN and Brahmin_UP sit around 50K, so they should be taken with a grain of salt.

SIS1= Shahr_I_Sokhta_BA1

Arm_EBA= Armenia_EBA


SGDP chisq tail prob SIS1 Irula Sintashta Dali_EBA Arm_EBA
w Kostenki 2.872 0.82475 0.208 0.623 0.1 0.069 NA
std error 0.038 0.035 0.036 0.035 NA
w/o Kostenki 2.031 0.844866 0.212 0.62 0.103 0.065 NA
std error 0.037 0.034 0.036 0.035 NA
w Kostenki 1.868 0.760109 0.167 0.609 0.065 0.097 0.063
std error 0.071 0.034 0.063 0.051 0.08
w/o Kostenki 2.599 0.761475 0.178 0.61 0.057 0.097 0.058
std error 0.063 0.035 0.065 0.046 0.081
Tiwari chisq tail prob SIS1 Irula Sintashta Dali_EBA Arm_EBA
w Kostenki 8.292 0.217452 0.138 0.583 0.208 0.071 NA
std error 0.025 0.021 0.023 0.02 NA
w/o Kostenki 7.332 0.19711 0.139 0.577 0.211 0.073 NA
std error 0.025 0.02 0.023 0.02 NA
w Kostenki 5.855 0.320606 0.089 0.579 0.154 0.099 0.08
std error 0.04 0.021 0.039 0.026 0.049
w/o Kostenki 5.351 0.253112 0.096 0.574 0.162 0.097 0.071
std error 0.039 0.02 0.039 0.026 0.049
TN chisq tail prob SIS1 Irula Sintashta Dali_EBA Arm_EBA
w Kostenki 0.925 0.988309 0.156 0.656 0.113 0.074 NA
std error 0.043 0.039 0.04 0.04 NA
w/o Kostenki 0.557 0.989882 0.168 0.643 0.117 0.072 NA
std error 0.042 0.037 0.038 0.039 NA
w Kostenki 0.861 0.973004 0.145 0.653 0.097 0.086 0.019
std error 0.079 0.039 0.073 0.049 0.095
w/o Kostenki 0.624 0.960366 0.191 0.638 0.13 0.068 -0.027
std error 0.079 0.038 0.072 0.049 0.093
UP chisq tail prob SIS1 Irula Sintashta Dali_EBA Arm_EBA
w Kostenki 7.561 0.272087 0.147 0.598 0.181 0.075 NA
std error 0.032 0.028 0.031 0.031 NA
w/o Kostenki 5.636 0.343262 0.151 0.59 0.188 0.071 NA
std error 0.031 0.027 0.029 0.029 NA
w Kostenki 7.338 0.196693 0.11 0.599 0.144 0.094 0.054
std error 0.066 0.028 0.059 0.039 0.081
w/o Kostenki 5.866 0.209375 0.136 0.59 0.171 0.08 0.022
std error 0.061 0.027 0.054 0.037 0.073


Dzh1 = Dzharkutan1_BA, Late BMAC

Steppe_E = Steppe_MLBA_East

SGDP chisq tail prob Irula Dzh1 Sintashta Steppe_E Dali_EBA
w Kostenki 6.954 0.433685 0.681 0.203 0.116 NA NA
std error 0.023 0.034 0.028 NA NA
6.294 0.505837 0.678 0.198 NA 0.124 NA
std error 0.023 0.034 NA 0.028 NA
3.787 0.705485 0.629 0.225 NA 0.075 0.07
std error 0.029 0.036 NA 0.037 0.033

The above is interesting in that there are whole graves, spread from India to West Asia, that are completely late BMAC in character. Considering the number of cemeteries and remains, it seems impossible for there not to be detectable BMAC ancestry in South Asia. I think the Harappan sample(s) will show that BMAC ancestry is indeed important in South Asia.

Looking at the Swat Valley samples, it gets even more interesting…

Aligrama chisq tail prob Irula Dzh1 Steppe_E Dali_EBA
12.238 0.0568697 0.49 0.355 0.091 0.063
std error 0.03 0.036 0.034 0.03
Butkara_IA chisq tail prob Irula Dzh1 Steppe_E Dali_EBA
5.148 0.524941 0.404 0.489 0.03 0.077
std error 0.029 0.034 0.034 0.031
Pak_IA_Ali chisq tail prob Irula Dzh1 Steppe_E Dali_EBA
8.014 0.237064 0.431 0.419 0.087 0.063
std error 0.042 0.053 0.05 0.043
S_Sharif_IA chisq tail prob Irula Dzh1 Steppe_E Dali_EBA
8.433 0.208056 0.437 0.364 0.141 0.059
std error 0.018 0.023 0.022 0.018
SPGT chisq tail prob Irula Dzh1 Steppe_E Dali_EBA
11.378 0.0773638 0.316 0.503 0.113 0.069
std error 0.014 0.018 0.018 0.015

Interestingly, there seems to be no need for Andronovo admixture in Butkara or Pakistan_IA_Aligrama, and the first Aligrama can also do okay with just Dali plus late BMAC. Of course, this all depends on the underlying population being similar to the Irula. Either way, though, the Steppe ancestry should not really move. Next, I’ll see how including all BMAC samples affects the output.
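One rough way to see which groups really need the steppe component is to compare each Steppe_E coefficient in the table above to its standard error. This is only a sketch: the cutoff of 2 (or 3) standard errors is my own assumption rather than anything qpAdm reports, and the function name is hypothetical.

```python
# Steppe_E coefficient and standard error for each Swat Valley group,
# copied from the Irula / Dzh1 / Steppe_E / Dali_EBA table above.
STEPPE_E = {
    "Aligrama": (0.091, 0.034),
    "Butkara_IA": (0.03, 0.034),
    "Pak_IA_Ali": (0.087, 0.05),
    "S_Sharif_IA": (0.141, 0.022),
    "SPGT": (0.113, 0.018),
}


def weakly_supported(coeffs, n_se=2.0):
    """Names whose coefficient is less than n_se standard errors from zero."""
    return [name for name, (coef, se) in coeffs.items() if abs(coef) < n_se * se]


print(weakly_supported(STEPPE_E))        # assumed cutoff of 2 standard errors
print(weakly_supported(STEPPE_E, 3.0))   # stricter cutoff of 3 standard errors
```

With the stricter cutoff, the same three groups named in the paragraph above come out as not clearly needing the steppe source, while S_Sharif_IA and SPGT sit far from zero.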

Aligrama chisq tail prob Irula BMAC Steppe_East WSiberia_N
15.054 0.0198432 0.5 0.359 0.083 0.058
std error 0.027 0.036 0.034 0.022
Butkara_IA chisq tail prob Irula BMAC Steppe_East WSiberia_N
8.929 0.177591 0.411 0.477 0.052 0.06
std error 0.025 0.033 0.034 0.022
Pak_IA_Ali chisq tail prob Irula BMAC Steppe_East WSiberia_N
6.755 0.344126 0.438 0.402 0.104 0.055
std error 0.037 0.053 0.05 0.032
S_Sharif_IA chisq tail prob Irula BMAC Steppe_East WSiberia_N
10.458 0.106644 0.442 0.368 0.144 0.046
std error 0.015 0.022 0.022 0.013
SPGT chisq tail prob Irula BMAC Steppe_East WSiberia_N
8.626 0.195754 0.313 0.509 0.102 0.076
std error 0.011 0.016 0.015 0.009


Update 8-22-18: Looking at Shahr_I_Sokhta 1, 2, and 3.

SIS1 chisquare Tail-prob Ganj_Dareh W_Siberia
46.394 7.33E-08 0.953 0.047
std error 0.02 0.02
SIS1 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia
2.715 0.84368 0.775 0.148 0.076
std error 0.03 0.021 0.019
SIS1 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Onge
5.606 0.346473 0.716 0.172 0.095 0.017
std error 0.051 0.027 0.022 0.031
SIS1 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Irula
2.275 0.80994 0.738 0.155 0.073 0.034
std error 0.051 0.022 0.02 0.041
SIS2 chisquare Tail-prob Ganj_Dareh W_Siberia
31.452 5.13E-05 0.837 0.163
std error 0.021 0.021
SIS2 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia
31.336 2.19E-05 0.842 -0.005 0.163
std error 0.036 0.023 0.022
SIS2 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Onge
7.09 0.214035 0.663 0.042 0.15 0.145
std error 0.057 0.031 0.024 0.034
SIS2 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Irula
9.597 0.0874816 0.61 0.031 0.128 0.231
std error 0.058 0.023 0.021 0.049
SIS2 chisquare Tail-prob Ganj_Dareh W_Siberia Irula
11.044 0.0870409 0.662 0.124 0.214
std error 0.043 0.022 0.047
SIS2 chisquare Tail-prob SIS1 Irula
20.805 0.00407046 0.661 0.339
std error 0.046 0.046
SIS2 chisquare Tail-prob SIS1 W_Siberia Irula
12.337 0.0548625 0.636 0.074 0.29
std error 0.045 0.025 0.049
SIS2 chisquare Tail-prob Sarazm_EN Irula
7.525 0.376375 0.707 0.293
std error 0.043 0.043
SIS3 chisquare Tail-prob Ganj_Dareh W_Siberia
270.628 0 0.835 0.165
std error 0.023 0.023
SIS3 chisquare Tail-prob Ganj_Dareh W_Siberia Onge
14.444 0.0250507 0.494 0.089 0.417
std error 0.031 0.024 0.03
SIS3 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Onge
10.019 0.074706 0.401 0.063 0.097 0.439
std error 0.052 0.028 0.023 0.032
SIS3 chisquare Tail-prob Ganj_Dareh Anatolia W_Siberia Irula
6.653 0.247741 0.229 0.004 0.039 0.727
std error 0.058 0.024 0.02 0.048
SIS3 chisquare Tail-prob Ganj_Dareh W_Siberia Irula
6.47 0.372669 0.231 0.035 0.734
std error 0.042 0.02 0.047
SIS3 chisquare Tail-prob SIS1 Irula
5.423 0.608532 0.23 0.77
std error 0.041 0.041
SIS3 chisquare Tail-prob Geoksiur Irula
5.274 0.626561 0.213 0.787
std error 0.038 0.038
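With this many runs, it helps to screen them the same way each time. Below is a small sketch using the SIS2 rows from the table above: it keeps models whose tail probability is at least 0.05 and whose coefficients are all non-negative, then sorts them by tail probability. The 0.05 cutoff is a common convention that I am assuming, and the function name is hypothetical.

```python
# (sources, tail probability, coefficients) for the SIS2 runs above.
SIS2_MODELS = [
    (("Ganj_Dareh", "W_Siberia"), 5.13e-05, (0.837, 0.163)),
    (("Ganj_Dareh", "Anatolia", "W_Siberia"), 2.19e-05, (0.842, -0.005, 0.163)),
    (("Ganj_Dareh", "Anatolia", "W_Siberia", "Onge"), 0.214035,
     (0.663, 0.042, 0.15, 0.145)),
    (("Ganj_Dareh", "Anatolia", "W_Siberia", "Irula"), 0.0874816,
     (0.61, 0.031, 0.128, 0.231)),
    (("Ganj_Dareh", "W_Siberia", "Irula"), 0.0870409, (0.662, 0.124, 0.214)),
    (("SIS1", "Irula"), 0.00407046, (0.661, 0.339)),
    (("SIS1", "W_Siberia", "Irula"), 0.0548625, (0.636, 0.074, 0.29)),
    (("Sarazm_EN", "Irula"), 0.376375, (0.707, 0.293)),
]


def passing_models(models, p_min=0.05):
    """Keep feasible models (no negative coefficients) with tail prob >= p_min."""
    keep = [m for m in models if m[1] >= p_min and all(c >= 0 for c in m[2])]
    return sorted(keep, key=lambda m: m[1], reverse=True)


for sources, p, coeffs in passing_models(SIS2_MODELS):
    print(f"{' + '.join(sources):45s} p={p:.4f}")
```

A high tail probability only says the model is not rejected, so this screen narrows the list of candidates but does not pick a winner by itself.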



Narasimhan et al, The Genomic Formation of South and Central Asia, Posted March 31, 2018, doi:





European Farmers Part I; Mediterranean vs Danubian

The farmers of Europe appear to be a very closely related group that derives from a potentially single source. We’ve seen several papers over the last couple of years devoted to farmers. Last year brought us Lipson et al. (2017), and Mathieson et al. (2017). These papers brought us many new samples from the Mediterranean, Central Europe, and the Balkans. The datasets from these two papers will be the source that I am working with here.

Firstly, I wanted to look at a simple tree to find a decent fit. That led to the following:

Farmer simple

This one was not a bad fit. It just had one zero drift edge towards Iron Gates, which would probably be taken care of if more hunters were included that lacked as much ANE as Iron Gates. While the Peloponnese samples are an outgroup to the other farmers, the Koros samples, from the First Temperate Neolithic, appear to be very close to the ancestral population for both the Balkan and Mediterranean groups. For the purpose of starting here, it seems fine. Next, I wanted to add Iberia EN as an offshoot of the Cardial EN samples from Croatia, just to see if the two of Mediterranean origin really are closely related.

simple Farmer2

This graph left what looks to be a needed admixture event from a hunter branch related to Iron Gates, to Iberia EN.

simple Farmer3

This graph actually turned out very nice. Iberia EN was able to branch from the same population as the Croatian Cardial and only needed a little extra HG ancestry. This also removed the zero edge from Iron Gates. For the next run, I am going to place LBK Austria coming off the branch to Starcevo.


The first thing I will try after seeing this worst Z-score is to try an admixture edge from the branch related to Iron_Gates into LBK Austria.

simple Farmer5

Surprisingly, the admixture from a European hunter did not take care of that worst Z-score. So, I scrapped that and decided to go with the admixture from Croatian Cardial into LBK Austria.

simple Farmer6

This graph resulted in LBK being a mix of 59% Starcevo and 41% Cardial. Still, we have a worst Z that wants Starcevo to also be closer to Cardial. In this case, I will first try a shared branch opposite of Koros, and if needed, after the HG-related admixture at B3.

simple Farmer7

While this is not a bad result, we do have a couple zero edges here that I would like to resolve. The admixture from the Cardial branch to LBK has also reduced to 5% in this graph. The worst Z involves the Peloponnese Neolithic and Starcevo, and also Iron Gates and LBK. I first want to try an edge from around Iron Gates to LBK to see how that does.


For this last graph, the A2 node for HG was eliminated since there was a 0 drift edge. All HG admixture now comes off of A1. The extra HG into LBK Austria has now put the worst Z-score around 3, which isn’t too bad. The edges all look good. The surprising part is that LBK comes out nearly 50% Cardial-related. This is interesting because LBK was seen as just a subset of late Starcevo, with potentially some Vinca influence. Since this is unexpected, I am going to see if there is more shared drift between the two before they split, placed after the extra HG admixture that follows the split from a group related to Koros EN.


Still, we have LBK Austria coming out as nearly half Cardial-related. While these results are interesting, they are not matching with D-stats, f3-ratio, or qpAdm results. There may be something else here that will take more complex graphs to figure out.

Here is another way of looking at it.


This graph makes a little more sense, as it separates the Mediterranean and Danubian groups a little more. The next step will be to separate the two before Koros EN. I will continue working from here for the rest of the post. If you have any more ideas, let me know. I will post updates as I have more.

Here are stats that have me thinking there is nothing here as far as admixture from Croatian Cardial.


Out Test Pop1 Pop2 D-stat Z-score SNPs
Mbuti_DG Cardial_EN LBK_Austria Starcevo -0.000114 -0.537 899799
Mbuti_DG Cardial_EN LBK_EN Starcevo 0.000076 0.398 903127
Mbuti_DG Cardial_EN LBK_Austria Koros_EN -0.000287 -0.959 892981
Mbuti_DG Cardial_EN LBK_EN Koros_EN -0.000035 -0.119 895930


Source Source Target f_3 std. Err Z SNPs
Koros_EN LaBrana Iberia_EN -0.002397 0.001668 -1.437 403414
Koros_EN Iron_Gates Iberia_EN -0.000355 0.001208 -0.294 598713
Koros_EN French_HG Iberia_EN -0.001492 0.001914 -0.779 127825
Cardial_EN LaBrana Iberia_EN -0.001178 0.001525 -0.773 410098
Cardial_EN Iron_Gates Iberia_EN 0.001799 0.001092 1.648 585125
Cardial_EN French_HG Iberia_EN -0.000595 0.001713 -0.348 131631
Koros_EN LaBrana LBK_EN -0.005487 0.00118 -4.65 580503
Koros_EN Iron_Gates LBK_EN -0.005153 0.000779 -6.615 755300
Koros_EN French_HG LBK_EN -0.003943 0.001382 -2.854 175316
Cardial_EN LaBrana LBK_EN -0.002165 0.001022 -2.118 555955
Cardial_EN Iron_Gates LBK_EN -0.001198 0.000649 -1.847 711734
Cardial_EN French_HG LBK_EN -0.001455 0.001165 -1.249 171888
Starcevo LaBrana LBK_EN -0.00003 0.000841 -0.035 600458
Starcevo Iron_Gates LBK_EN 0.000585 0.000568 1.031 770549
Starcevo French_HG LBK_EN -0.000667 0.001019 -0.654 180239
Koros_EN Iron_Gates LBK_Austria -0.005504 0.000912 -6.033 715204
Koros_EN Iron_Gates Cardial_EN 0.002384 0.001571 1.517 553269
Koros_EN Iron_Gates Iberia_EN -0.000355 0.001208 -0.294 598713
Starcevo Iberia_EN LBK_Austria 0.000357 0.000672 0.53 615071
Starcevo Iberia_EN LBK_EN -0.000208 0.000516 -0.403 665936
Starcevo Cardial_EN LBK_Austria 0.00095 0.00067 1.419 581722
Starcevo Cardial_EN LBK_EN 0.001295 0.000578 2.24 626360
Starcevo Iron_Gates LBK_Austria 0.000502 0.000715 0.702 729333
Starcevo Iron_Gates LBK_EN 0.000585 0.000568 1.031 770549
Koros_EN Iberia_EN LBK_EN -0.000175 0.000696 -0.251 647053
Koros_EN Iberia_EN Starcevo 0.000738 0.001072 0.689 496318
Koros_EN Cardial_EN LBK_EN 0.000021 0.000789 0.026 609865
Koros_EN Cardial_EN Starcevo -0.000584 0.001115 -0.524 470485
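For reading the f3 table, the Z column is just the f3 value divided by its standard error. The sketch below recomputes it for the LBK_EN target rows above and lists the source pairs with Z <= -3, which is the usual cutoff for a significant admixture signal and is an assumption on my part. The function name is hypothetical.

```python
# Source Source Target f_3 std.Err Z SNPs (LBK_EN target rows from the table above)
ROWS = """\
Koros_EN LaBrana LBK_EN -0.005487 0.00118 -4.65 580503
Koros_EN Iron_Gates LBK_EN -0.005153 0.000779 -6.615 755300
Koros_EN French_HG LBK_EN -0.003943 0.001382 -2.854 175316
Cardial_EN LaBrana LBK_EN -0.002165 0.001022 -2.118 555955
Cardial_EN Iron_Gates LBK_EN -0.001198 0.000649 -1.847 711734
Cardial_EN French_HG LBK_EN -0.001455 0.001165 -1.249 171888
Starcevo LaBrana LBK_EN -0.00003 0.000841 -0.035 600458
Starcevo Iron_Gates LBK_EN 0.000585 0.000568 1.031 770549
Starcevo French_HG LBK_EN -0.000667 0.001019 -0.654 180239
"""


def f3_z(rows):
    """Return (source1, source2, target, recomputed Z, tabulated Z) per row."""
    out = []
    for line in rows.strip().splitlines():
        src1, src2, target, f3, se, z_tab, snps = line.split()
        out.append((src1, src2, target, float(f3) / float(se), float(z_tab)))
    return out


# Source pairs that give a significantly negative f3 for LBK_EN.
significant = [(a, b) for a, b, _, z, _ in f3_z(ROWS) if z <= -3.0]
print(significant)
```

Only the Koros_EN plus hunter pairs cross that line for LBK_EN, while none of the Cardial_EN pairs do, which fits the D-stats above in showing no clear sign of Croatian Cardial admixture.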



Lipson, M. et al. Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature 551, 368–372 (2017)

Mathieson, I. et al. The genomic history of Southeastern Europe. Nature 555, 197–203 (2018)