Is there really Basal Eurasian and/or Iranian/Caucasus-related ancestry in Anatolia’s first hunters and farmers? Maybe not…

I have been having a go at some new graphs here to see if there is really any Basal Eurasian ancestry in early Anatolians. While I am limited to just a few samples and do not yet have access to Dzudzuana (Lazaridis et al, 2018), it has become quite clear that Basal Eurasian, or Ancient North African, ancestry is not needed in these samples. And while Feldman et al (2019) also reported that early farmers from Boncuklu harbored ancestry from a group related to Iranian farmers or Caucasus hunter-gatherers, this need not be the case.

While going back to look at d-stats between some of these groups, I noticed some oddities that were produced when using African populations and modern East Asians together with other ancient Eurasians. Modern Africans show a preference for more recent Eurasians over more ancient ones. There’s even a significant preference for Chalcolithic Iranians in 10,000 year old samples from Malawi, which makes no sense. This is probably some type of artifact in the data. This, to me, was creating the impression of a Basal Eurasian branch where none need be found. When you compare ancient Eurasians to each other, the effect disappears.

For instance, the Z-scores using Chimp as an outgroup, and even Altai or Denisovan, were completely different from those using Mbuti, Mota, or any other African individual or population. With the non-African outgroups, the significance of these stats vanished when comparing the first Anatolians versus ancient European hunters.

Not only that, it also disappeared when using ancient East Asians and Native Americans such as Tianyuan, Vietnam_N, Vanuatu_2900BP and Brazil_LapaDoSanto_9600BP. Sure enough, when you compare modern East Asians versus these samples, the modern groups end up being significantly closer to Ust_Ishim and Africans. So, instead of all ancient East Asians having Basal Eurasian, or all modern East Asians having ancestry from Ust_Ishim or some special relationship with Africans, this may also be an artifact. The second is possible, but probably minimally so.

So, what we have is d-stats of the form D(Chimp/Altai/Denisovan, Ust_Ishim; Anatolia_HG/Boncuklu_N, SunghirIV/Kostenki14/Vestonice16/Tianyuan/Vietnam_N/Brazil_LapaDoSanto_9600BP) all being quite insignificant. This led me to create some more graphs to look at the issue, and to stay away from using modern East Asians and Africans in the analysis of ancient Eurasians.
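For anyone unfamiliar with what a stat of this form measures, here is a minimal sketch of the usual allele-frequency definition of D(W, X; Y, Z) together with a Z-score. Everything in it is illustrative: the allele frequencies are random toy values, not real genotype data, the function names are my own, and the standard error comes from a plain equal-weight delete-one-block jackknife, which is a simplification of what qpDstat does.

```python
import math
import random


def d_statistic(freqs):
    """D(W, X; Y, Z) from per-SNP allele frequencies.

    freqs is a list of (w, x, y, z) tuples, one per SNP.
    With this sign convention, a negative D means X shares
    more alleles with Y than with Z.
    """
    num = 0.0
    den = 0.0
    for w, x, y, z in freqs:
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    return num / den


def jackknife_z(freqs, n_blocks=10):
    """Z-score from a simple delete-one-block jackknife.

    Blocks are contiguous runs of SNPs, standing in for the
    genomic blocks a real analysis would use.
    """
    n = len(freqs)
    bounds = [n * i // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for i in range(n_blocks):
        kept = freqs[:bounds[i]] + freqs[bounds[i + 1]:]
        leave_one_out.append(d_statistic(kept))
    mean = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum(
        (d - mean) ** 2 for d in leave_one_out
    )
    return d_statistic(freqs) / math.sqrt(var)


# Toy data only: W is fixed at 0 (an outgroup carrying the ancestral
# allele) and X, Y, Z are unrelated random frequencies.
random.seed(1)
toy = [(0.0, random.random(), random.random(), random.random())
       for _ in range(2000)]

print("D =", round(d_statistic(toy), 5), " Z =", round(jackknife_z(toy), 2))
```

Because the toy populations are unrelated to each other, D should land near zero here. The point is only to show how the number and its Z-score are put together.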

The graph that I ended up with combined Natufians, Iranian farmers from Ganj Dareh, both the Anatolian hunter-gatherer from Pinarbasi and early farmers from Boncuklu, Upper Paleolithic European SunghirIV, Iberomaurusians, and Upper Paleolithic samples from Siberia and China, Ust-Ishim, Yana, and Tianyuan, respectively.

For this graph, I grouped SunghirIV and the Anatolians on a branch together, separate from Ust_Ishim and Tianyuan, as neither shows any sign of shared drift with these groups that the other doesn’t have.

 

Anatolians2

This graph brings up the interesting prospect that deep ancestry did not reach Anatolia, or the Caucasus with Dzudzuana, but arrived later in the Levant, Caucasus, and Zagros region. With the very minimal excess sharing between Boncuklu and Iran, I chose to use Boncuklu as the population that donates the Anatolian, or Dzudzuana-like, ancestry to Iran, while the Natufians preferred a source more like Pinarbasi (Anatolia_HG).

Without samples covering a range of time between 30,000-10,000 years ago across all of West Asia, it is hard to say if this will pan out. For now, this is the best that I can do.

While neither the Anatolians nor Sunghir shares any special relationship with Tianyuan or Ust-Ishim, Yana is significantly closer to Upper Paleolithic Europeans, suggesting they stem from the same branch.

Iberomaurusians fit best as a mix of the so-called Basal Eurasian or Ancient North African and a lineage from the same branch as Boncuklu and Anatolia_HG that diverged before those two split.

The Iranian farmers first wanted to be grouped with Boncuklu, then onto Yana and Iberomaurusians. Afterwards, there was a Z-score between 2 and 3 that asked for more ancestry related to Tianyuan, but deeper than the branch to Yana, going into Iranians. I had noticed this long ago and included it in previous posts. It also keeps the graph together much better as more populations are added.

As previously mentioned, I avoided the use of any African or modern East Asian population. I did not want to have any artifacts create unnecessary admixture events or ghost populations.

For the next graph, I added in the Gravettian samples from Vestonice. The graph still held together, with an insignificant Z-score.

Anatolians3

For my final go, I decided to add the Epigravettian sample, Villabruna, a 14,000 year old hunter-gatherer from Italy and the reason behind the name “Villabruna cluster”. Villabruna actually shares significantly more drift with ancient Anatolians than Vestonice does, but also significantly more with Ancient North Eurasians (ANE) like Yana and MA1, than the Anatolians do.

For this reason, I chose to make Villabruna, the much younger sample, a mix of Gravettian, Anatolian, and Yana. Given that there is really no place for some pure population to hide between Dzudzuana and SE European Gravettians, and that R1b and significant relationships with Native Americans are there, it makes sense that ANE would also be in there. To me, this model feels much more parsimonious than trying to make everyone some mixture involving the much younger population, which has no evidence of existing before a much later period. As I said, it is running out of real estate for some place to hide out for 20-30,000 years.

I think the main reason this Basal Eurasian or Ancient North African ancestry seemed to be needed in Anatolians was that they were modeled with significant ancestry from Villabruna, who has that relationship with ANE and East Asians that the Anatolians lack. This means that you then must compensate for the relationship missing in Anatolia by giving them deeper ancestry that is probably not there. They share enough drift that these tools allow you to branch them together, but then you get caught in a catch-22 by having to add all of these extra admixture events. For instance, having to make Anatolia a mix of Natufians/Levant, Iran, and WHG, rather than just another branch of West Eurasians that contributes to North Africans, Mesolithic Europeans, and into the Iranian plateau and maybe as far as India before the Neolithic.

In this last graph, Villabruna is a mix of 77% from a Vestonice and Anatolia_HG source (51% and 49% of that source, respectively) and 23% from a lineage related to Yana.
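To see what those nested percentages mean for Villabruna overall, here is a quick bit of arithmetic. It assumes the 51%/49% split applies inside the 77% component, which is how I read the graph. The function name is just for illustration.

```python
def flatten_nested_mixture(first_share, split, second_name):
    """Turn a two-level admixture into overall ancestry shares.

    first_share: fraction coming from the mixed source (here 0.77)
    split:       how that mixed source itself divides (here 51% / 49%)
    second_name: label for the remaining source (here the Yana-related one)
    """
    shares = {name: first_share * frac for name, frac in split.items()}
    shares[second_name] = 1.0 - first_share
    return shares


villabruna = flatten_nested_mixture(
    0.77,
    {"Vestonice": 0.51, "Anatolia_HG": 0.49},
    "Yana-related",
)

for name, share in villabruna.items():
    print(f"{name}: {share:.1%}")
```

On that reading, Villabruna comes out at roughly 39% Vestonice-related, 38% Anatolia_HG-related, and 23% Yana-related.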

anatolians4.png

While new data may make this impossible, there is certainly nothing right now that rules it out. As far as I know, there isn’t a single stat out there with ancient Eurasians that can break this tree down. I included many populations to make sure that it would hold up. I look forward to seeing more data out of West Asia, and particularly to getting access to the Dzudzuana samples, to see if this holds up.

UPDATE:

Just to make things a little more clear, I will show in more detail what I am talking about. It is the use of Africans and modern East Asians that gets in the way of robust analysis of ancient Eurasians. For whatever reason, there is an artifact that draws modern Asians and ancient West Asians much closer to Africans than UP samples from Europe and Asia.

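Here are a few d-stats to show what I am talking about. Each row gives the four populations W, X, Y, Z of D(W, X; Y, Z), followed by the D value, the Z-score, and the number of SNPs used: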

Chimp Malawi_Ho IBM Ganj_Dareh -0.00011 -0.321 580681
Chimp Malawi_Holocene IBM SunghirIV -0.001876 -4.17 584427
Chimp Malawi_Holocene IBM AnatoliaHG -0.000176 -0.411 507596
Chimp Malawi_Holocene IBM Iran_ChL 0.000235 0.654 569410
Chimp Malawi_Holocene IBM Yana -0.002173 -5.98 589155
Chimp Malawi_Holocene Ganj_Dareh SunghirIV -0.001849 -4.311 581183
Chimp Malawi_Holocene Ganj_Dareh AnatoliaHG -0.000137 -0.33 506101
Chimp Malawi_Holocene Ganj_Dareh Iran_ChL 0.00042 1.45 567512
Chimp Malawi_Holocene Ganj_Dareh Yana -0.002128 -6.497 585873
Chimp Malawi_Holocene SunghirIV AnatoliaHG 0.001456 2.912 507356
Chimp Malawi_Holocene SunghirIV Iran_ChL 0.002128 5.01 569380
Chimp Malawi_Holocene SunghirIV Yana -0.000277 -0.64 591351
Chimp Malawi_Holocene Anatolia_HG Iran_ChL 0.000391 0.959 497265
Chimp Malawi_Holocene Anatolia_HG Yana -0.001986 -4.528 511289
Chimp Malawi_Holocene Iran_ChL Yana -0.002501 -7.398 573952
Chimp MbutiSGDP IBM SunghirIV -0.001592 -4.869 1037166
Chimp MbutiSGDP IBM AnatoliaHG -0.000818 -2.706 836965
Chimp MbutiSGDP Ganj_Dareh SunghirIV -0.001268 -4.134 1003826
Chimp MbutiSGDP Ganj_Dareh AnatoliaHG -0.000363 -1.268 823271
Chimp Mbuti.SGDP SunghirIV AnatoliaHG 0.000712 2.001 847590
Chimp Ust_Ishim SunghirIV AnatoliaHG 0.000044 0.062 845798
MbutiSGDP Ust_Ishim SunghirIV AnatoliaHG -0.000853 -1.281 882870
Tianyuan Ust_Ishim SunghirIV AnatoliaHG 0.000254 0.312 705443
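If you want to screen a table like this quickly, a few lines of Python will do. The sketch below uses five rows copied from the table above and flags anything with |Z| of 3 or more. That cutoff is a common convention and my assumption here, not a hard rule.

```python
# Rows copied from the table above: W X Y Z D Z-score SNPs
ROWS = """\
Chimp Malawi_Holocene IBM SunghirIV -0.001876 -4.17 584427
Chimp Malawi_Holocene IBM AnatoliaHG -0.000176 -0.411 507596
Chimp Malawi_Holocene Iran_ChL Yana -0.002501 -7.398 573952
Chimp Ust_Ishim SunghirIV AnatoliaHG 0.000044 0.062 845798
Tianyuan Ust_Ishim SunghirIV AnatoliaHG 0.000254 0.312 705443
"""


def parse_dstats(text):
    """Parse whitespace-separated rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue  # skip blank or malformed lines
        rows.append({
            "pops": tuple(parts[:4]),
            "d": float(parts[4]),
            "zscore": float(parts[5]),
            "snps": int(parts[6]),
        })
    return rows


def significant(rows, threshold=3.0):
    """Keep rows whose absolute Z-score reaches the threshold."""
    return [r for r in rows if abs(r["zscore"]) >= threshold]


for r in significant(parse_dstats(ROWS)):
    w, x, y, z = r["pops"]
    print(f"D({w}, {x}; {y}, {z}) = {r['d']:+.6f}  Z = {r['zscore']:+.2f}")
```

Of these five rows, only the two with Malawi_Holocene in the second position pass the cutoff. The stats with Ust_Ishim in that position stay insignificant.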

As you can see, there are some things here that seem a little off. It is even worse when you use samples like Ust_Ishim, Yana, and Tianyuan. I can say that it isn’t about there being some type of ancient attraction to Chimp either. The attraction is with Africans. This isn’t only the case for Mbuti, but even for ancient Africans like Malawi_Hora_Holocene. Stats are even more significant for the Chalcolithic Iranians, which makes zero sense for 10,000 year old samples from SE Africa. The trees below show that when you add Mbuti, it doesn’t mess up the Eurasian part of the tree, but it does require admixture from those groups into the Mbuti. The first tree just adds Mbuti to the base of the tree; the second adds the admixture into the Mbuti.

mbutianatolians.png

mbutianatolians2.png

The graph above would be even worse if I added a modern East Asian like the Onge. This would require Onge admixture into the Mbuti too. All in all, I think this shows that if you include an African or modern Asian in your graph to start, you are already setting yourself up for issues that might affect the outcome. While Chimp is maybe not completely ideal as an outgroup, it is more in line with the graph that has no African or Chimp in it. The stats line up much better with the graph. Not only that, it is apparent that any deep ancestry not covered within the Ust_Ishim, West, East branching is still easily detectable without Chimp or an African outgroup in the graph.

Another interesting little thing with this graph that lines up with the stats is the following comparison between Natufians and Iranians in relation to Yana and Tianyuan. If all East Asian ancestry in Iran was mediated via a Yana-like population, then the stat with Tianyuan should be minimal compared to the one with Yana. However, this isn’t the case, which lines up nicely with the graph asking for significant East Asian ancestry going into Iran.

Chimp Tianyuan Natufian Ganj_Dareh 0.00292 4.92 414476
Chimp Yana Natufian Ganj_Dareh 0.00139 2.683 484580

I think this all shows it is definitely safe to run qpGraph with no African or Chimp as an outgroup, and with outpop: NULL. Safer, it seems, than including said groups at the risk of affecting analyses.
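For anyone who wants to try this, a parameter file along the following lines is one way to set it up. This is only a sketch and not the file used for the graphs in this post: the file paths are placeholders, the helper function is hypothetical, and only the genotypename, snpname, indivname and outpop keys are shown. Any other settings you normally use would go alongside them.

```python
def qpgraph_parfile(genotype_path, snp_path, ind_path):
    """Build the text of a minimal qpGraph parameter file.

    The three paths are whatever EIGENSTRAT-format files you are
    working with; the values used below are placeholders.
    """
    lines = [
        f"genotypename: {genotype_path}",
        f"snpname:      {snp_path}",
        f"indivname:    {ind_path}",
        "outpop:       NULL",  # no outgroup population, as discussed above
    ]
    return "\n".join(lines) + "\n"


parfile_text = qpgraph_parfile(
    "data/ancients.geno",  # placeholder path
    "data/ancients.snp",   # placeholder path
    "data/ancients.ind",   # placeholder path
)
print(parfile_text)
```

The graph file itself would then simply contain no Chimp or African label.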

Sources

Feldman et al (2019) Late Pleistocene human genome suggests a local origin for the first farmers of central Anatolia. https://www.nature.com/articles/s41467-019-09209-7

Lazaridis et al (2018) Paleolithic DNA from the Caucasus reveals core of West Eurasian ancestry. https://www.biorxiv.org/content/10.1101/423079v1

 

6 thoughts on “Is there really Basal Eurasian and/or Iranian/Caucasus-related ancestry in Anatolia’s first hunters and farmers? Maybe not…”

  1. I’m still trying to get my head around the reasoning here. Does it really all come down to a problem with using modern Africans as outgroups? (the “Basal Eurasian effect” didn’t even require modern East Asians to trigger it, it just happened that it triggered when comparing with modern East Asians too). So basically:

    D(Mbuti, Ust-Ishim; WHG, Anatolia_N)

    Would be significantly negative, and that’s why it was thought that Anatolia_N had “Basal Eurasian” admixture. If you get that same stat and change Mbuti for Chimp/Neanderthal/Denisovan the effect disappears and the stat becomes insignificant?

    Or maybe you mean that Anatolia_N does have BEu admixture but Boncuklu_N/Anatolia_HG doesn’t, in which case swapping both in the above stat would be negative but then using the other alternative outgroups it becomes insignificant?

    If any of the above scenarios is true, it looks like a rather silly mistake and strange that no one has noticed it earlier. Especially if it implies Anatolia_N (Barcin_N) too (if it’s restricted to Anatolia_HG/Boncuklu_N it would be bit less strange given they have received less attention).

    As per the graph, still Natufian and Iran_N need that BEu/ANA admixture, though in very different amounts (53% vs. 12%), which should have made a significant difference in D-stats trying to figure out the amount of BEu admixture in them (while IIRC they are usually considered both the most “Basal” samples we have with over 50% admixture from it).

    While I never believed much in the concept of Basal Eurasian (instead I think it’s the result of gene flow between Africa and Eurasia, both ways, after the main Out of Africa), the effect itself as seen in D-stats looked real to me. So it’d be surprising if it was just due to a simple mistake in the choice of outgroups.


  2. I have seen the D-stats you posted on Eurogenes now and got a much better idea about the problem. I guess that Ust-Ishim related ancestry in Magdalenians and ANE (and modern East Asians), with Magdalenian+ANE ancestry in WHG could explain this ghost BEu effect.

    So I guess this could be worked around for West Eurasian populations by using Vanuatu_2900BP in the place of Ust-Ishim? And then which of the D-stats in the form (Chimp, Vanuatu_2900BP; West_Eurasian1, West_Eurasian2) are significant, if any? According to the qpGraph, probably any that involved Natufian would be significant. Probably Iran_N too. Any other?


  3. Probably you read this paper back then, but in case you didn’t:

    https://www.biorxiv.org/content/10.1101/674986v1

    “Some of the f4-statistics commonly used to study population history and admixture (13) even shift sign when using array SNPs compared to when using all discovered SNPs, thus incorrectly reversing the direction of the ancestry relationship one would infer from the same set of genomes (for example: f4(BantuKenya,San;Mandenka,Sardinian) is positive (Z=2.9) using all variants but negative (Z=-3.11) when using commonly employed array sites).”


  4. Alberto: Useful pointer to that paper. Their Fig S3 shows changes in Fst from different arrays, including Human Origins, against all discovered variants and variants identified through archaic outgroup. Extracted and plotted some pair differences here: https://imgur.com/a/GQrhWjJ

    Fst is analogous to the F2 statistics from which higher order F3, F4 statistics are built. Fst to Africa generally appears to drop relative to Human Origins on the all* and archaic variant spectrum across the board, but disproportionately high for Africa-Oceania and Africa-Native American comparisons and with higher Fst comparison. Fst does not appear to drop for EastAsia-Oceania or EastAsia-NativeAmerica. So if F2 statistics behaved analogously, you would expect this to possibly deepen splits in graphs within ENA relative to split from Africa in graphs.

    *all must include low frequency new variants from modern people, but as performs similarly to archaic assessed variants, probably mostly deals with ancient variants missed by HumOri.

