With Narasimhan et al. (2018), we got our first look at Central, South Central, and South Asian aDNA. Not only did we get new steppe samples from throughout the Bronze Age, but also samples from the Turan region, including BMAC, spanning the Chalcolithic through the Bronze Age. While there certainly looks to be steppe ancestry in South Asia, it has likely been highly inflated in models built on previously available aDNA, and in those that did not account for ANE that was already present in the region. With the soon-to-be-released Harappan sample(s), the models will only improve further.
This post will be constantly evolving as I add new outputs from qpAdm and qpGraph, so keep checking back in.
What I have noticed using qpAdm is that South Asian Dravidians do wonders as stand-ins for Harappan ancestry. So, we may see that some group greatly resembles them. Using the Palliyar and Paniya works well, but the Irula seem to work best. I don’t know whether that really means anything or simply reflects the fact that they have more coverage.
The first thing I did was look for populations to occupy the right pops: populations that produce the most significant D-stats between my left populations (those set as the populations used in the mixture). Aside from an African, Mbuti_DG, I found that Ust-Ishim, Onge, Ami, EHG, Iron_Gates, Anatolia_N, Ganj_Dareh_N, and Karitiana work well. Kostenki14 is hit and miss, as it doesn’t always produce significant stats when comparing two populations. This could be due to the age of the sample, which may not have developed enough drift of its own to help differentiate populations in the test. This can lead to higher chi-squares and lower tail probabilities.
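To make the "significant D-stats" criterion concrete, here is a minimal sketch of a D-statistic with a Z-score. It assumes the usual allele-frequency form of D and a simple unweighted delete-one-block jackknife, which is cruder than the weighted jackknife ADMIXTOOLS uses. The function names and the toy frequencies are made up for illustration and are not taken from any real dataset.

```python
from math import sqrt


def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (floats in [0, 1])."""
    rows = list(zip(w, x, y, z))
    num = sum((a - b) * (c - d) for a, b, c, d in rows)
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in rows)
    return num / den


def block_jackknife_z(w, x, y, z, n_blocks=5):
    """Return (D, Z), with Z from an unweighted delete-one-block jackknife."""
    n = len(w)
    size = n // n_blocks
    leave_one_out = []
    for b in range(n_blocks):
        start = b * size
        stop = n if b == n_blocks - 1 else start + size
        keep = [i for i in range(n) if i < start or i >= stop]
        subset = [[freqs[i] for i in keep] for freqs in (w, x, y, z)]
        leave_one_out.append(d_statistic(*subset))
    mean = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((d - mean) ** 2 for d in leave_one_out)
    d_full = d_statistic(w, x, y, z)
    return d_full, d_full / sqrt(variance)


# Toy allele frequencies for 10 SNPs (hypothetical, for illustration only).
w = [0.10, 0.40, 0.80, 0.30, 0.60, 0.20, 0.90, 0.50, 0.70, 0.35]
x = [0.20, 0.35, 0.70, 0.45, 0.50, 0.30, 0.80, 0.55, 0.60, 0.40]
y = [0.15, 0.45, 0.75, 0.25, 0.65, 0.15, 0.95, 0.45, 0.75, 0.30]
z = [0.30, 0.30, 0.60, 0.50, 0.40, 0.40, 0.70, 0.60, 0.50, 0.50]

d_value, z_value = block_jackknife_z(w, x, y, z)
print(f"D = {d_value:.4f}, Z = {z_value:.2f}")
```

In this framing, a candidate right population is useful when swapping it in gives a large |Z| between two left populations; a sample like Kostenki14 that yields small |Z| adds little power to tell them apart.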
For the following, Brahmin_SGDP and Brahmin_Tiwari did have good marker counts, ranging from 170-200K, but the Brahmin_TN and Brahmin_UP sit around 50K, so they should be taken with a grain of salt.
SIS1= Shahr_I_Sokhta_BA1
Arm_EBA= Armenia_EBA
| SGDP | chisq | tail prob | SIS1 | Irula | Sintashta | Dali_EBA | Arm_EBA |
| w Kostenki | 2.872 | 0.82475 | 0.208 | 0.623 | 0.1 | 0.069 | NA |
| | | std error | 0.038 | 0.035 | 0.036 | 0.035 | NA |
| w/o Kostenki | 2.031 | 0.844866 | 0.212 | 0.62 | 0.103 | 0.065 | NA |
| | | std error | 0.037 | 0.034 | 0.036 | 0.035 | NA |
| w Kostenki | 1.868 | 0.760109 | 0.167 | 0.609 | 0.065 | 0.097 | 0.063 |
| | | std error | 0.071 | 0.034 | 0.063 | 0.051 | 0.08 |
| w/o Kostenki | 2.599 | 0.761475 | 0.178 | 0.61 | 0.057 | 0.097 | 0.058 |
| | | std error | 0.063 | 0.035 | 0.065 | 0.046 | 0.081 |
| Tiwari | chisq | tail prob | SIS1 | Irula | Sintashta | Dali_EBA | Arm_EBA |
| w Kostenki | 8.292 | 0.217452 | 0.138 | 0.583 | 0.208 | 0.071 | NA |
| | | std error | 0.025 | 0.021 | 0.023 | 0.02 | NA |
| w/o Kostenki | 7.332 | 0.19711 | 0.139 | 0.577 | 0.211 | 0.073 | NA |
| | | std error | 0.025 | 0.02 | 0.023 | 0.02 | NA |
| w Kostenki | 5.855 | 0.320606 | 0.089 | 0.579 | 0.154 | 0.099 | 0.08 |
| | | std error | 0.04 | 0.021 | 0.039 | 0.026 | 0.049 |
| w/o Kostenki | 5.351 | 0.253112 | 0.096 | 0.574 | 0.162 | 0.097 | 0.071 |
| | | std error | 0.039 | 0.02 | 0.039 | 0.026 | 0.049 |
| TN | chisq | tail prob | SIS1 | Irula | Sintashta | Dali_EBA | Arm_EBA |
| w Kostenki | 0.925 | 0.988309 | 0.156 | 0.656 | 0.113 | 0.074 | NA |
| | | std error | 0.043 | 0.039 | 0.04 | 0.04 | NA |
| w/o Kostenki | 0.557 | 0.989882 | 0.168 | 0.643 | 0.117 | 0.072 | NA |
| | | std error | 0.042 | 0.037 | 0.038 | 0.039 | NA |
| w Kostenki | 0.861 | 0.973004 | 0.145 | 0.653 | 0.097 | 0.086 | 0.019 |
| | | std error | 0.079 | 0.039 | 0.073 | 0.049 | 0.095 |
| w/o Kostenki | 0.0624 | 0.960366 | 0.191 | 0.638 | 0.13 | 0.068 | -0.027 |
| | | std error | 0.079 | 0.038 | 0.072 | 0.049 | 0.093 |
| UP | chisq | tail prob | SIS1 | Irula | Sintashta | Dali_EBA | Arm_EBA |
| w Kostenki | 7.561 | 0.272087 | 0.147 | 0.598 | 0.181 | 0.075 | NA |
| | | std error | 0.032 | 0.028 | 0.031 | 0.031 | NA |
| w/o Kostenki | 5.636 | 0.343262 | 0.151 | 0.59 | 0.188 | 0.071 | NA |
| | | std error | 0.031 | 0.027 | 0.029 | 0.029 | NA |
| w Kostenki | 7.338 | 0.196693 | 0.11 | 0.599 | 0.144 | 0.094 | 0.054 |
| | | std error | 0.066 | 0.028 | 0.059 | 0.039 | 0.081 |
| w/o Kostenki | 5.866 | 0.209375 | 0.136 | 0.59 | 0.171 | 0.08 | 0.022 |
| | | std error | 0.061 | 0.027 | 0.054 | 0.037 | 0.073 |
NEW-8-9-18
Dzh1 = Dzharkutan1_BA, Late BMAC
Steppe_E = Steppe_MLBA_East
| SGDP | chisq | tail prob | Irula | Dzh1 | Sintashta | Steppe_E | Dali_EBA |
| w Kostenki | 6.954 | 0.433685 | 0.681 | 0.203 | 0.116 | NA | NA |
| | | std error | 0.023 | 0.034 | 0.028 | NA | NA |
| | 6.294 | 0.505837 | 0.678 | 0.198 | NA | 0.124 | NA |
| | | std error | 0.023 | 0.034 | NA | 0.028 | NA |
| | 3.787 | 0.705485 | 0.629 | 0.225 | NA | 0.075 | 0.07 |
| | | std error | 0.029 | 0.036 | NA | 0.037 | 0.033 |
The above is interesting in that there are whole graves spread from India to West Asia that are completely late BMAC in character. Considering the number of cemeteries and remains, there seems to be no way that BMAC ancestry would not be detectable in South Asia. I think the Harappan sample(s) will show that BMAC ancestry is indeed important in South Asia.
Looking at the Swat Valley samples, it gets even more interesting…
| Aligrama | chisq | tail prob | Irula | Dzh1 | Steppe_E | Dali_EBA |
| | 12.238 | 0.0568697 | 0.49 | 0.355 | 0.091 | 0.063 |
| | | std error | 0.03 | 0.036 | 0.034 | 0.03 |
| Butkara_IA | chisq | tail prob | Irula | Dzh1 | Steppe_E | Dali_EBA |
| | 5.148 | 0.524941 | 0.404 | 0.489 | 0.03 | 0.077 |
| | | std error | 0.029 | 0.034 | 0.034 | 0.031 |
| Pak_IA_Ali | chisq | tail prob | Irula | Dzh1 | Steppe_E | Dali_EBA |
| | 8.014 | 0.237064 | 0.431 | 0.419 | 0.087 | 0.063 |
| | | std error | 0.042 | 0.053 | 0.05 | 0.043 |
| S_Sharif_IA | chisq | tail prob | Irula | Dzh1 | Steppe_E | Dali_EBA |
| | 8.433 | 0.208056 | 0.437 | 0.364 | 0.141 | 0.059 |
| | | std error | 0.018 | 0.023 | 0.022 | 0.018 |
| SPGT | chisq | tail prob | Irula | Dzh1 | Steppe_E | Dali_EBA |
| | 11.378 | 0.0773638 | 0.316 | 0.503 | 0.113 | 0.069 |
| | | std error | 0.014 | 0.018 | 0.018 | 0.015 |
Interestingly, there seems to be no need for Andronovo admixture in Butkara and Pakistan_IA_Aligrama, and the first Aligrama can also do okay with just Dali plus late BMAC. Of course, this all depends on the underlying population being similar to the Irula. Either way though, the Steppe ancestry should really not move. Next, I’ll see how including all BMAC samples affects the output.
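One rough way to read "no need for Andronovo admixture" is to check whether each Steppe_E coefficient sits within a couple of standard errors of zero. The sketch below does that with the numbers from the Dzh1 models above. The cutoff of 2 is an assumed rule of thumb, not something qpAdm reports, and the variable names are mine.

```python
# Steppe_E coefficient and standard error for each Swat group,
# copied from the Irula + Dzh1 + Steppe_E + Dali_EBA models above.
steppe_e = {
    "Aligrama": (0.091, 0.034),
    "Butkara_IA": (0.030, 0.034),
    "Pak_IA_Ali": (0.087, 0.050),
    "S_Sharif_IA": (0.141, 0.022),
    "SPGT": (0.113, 0.018),
}

Z_CUTOFF = 2.0  # assumed rule of thumb, not a value taken from the post


def z_scores(estimates):
    """Coefficient divided by its standard error, per population."""
    return {pop: coef / se for pop, (coef, se) in estimates.items()}


scores = z_scores(steppe_e)
within_cutoff = [pop for pop, z in scores.items() if abs(z) < Z_CUTOFF]

for pop, z in scores.items():
    print(f"{pop:<12} Z = {z:.2f}")
print("Steppe_E within 2 SE of zero:", within_cutoff)
```

Under that assumed cutoff, Butkara_IA and Pak_IA_Ali are the two groups whose Steppe_E estimate is not clearly different from zero, with Aligrama as the borderline case.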
| Aligrama | chisq | tail prob | Irula | BMAC | Steppe_East | WSiberia_N |
| | 15.054 | 0.0198432 | 0.5 | 0.359 | 0.083 | 0.058 |
| | | std error | 0.027 | 0.036 | 0.034 | 0.022 |
| Butkara_IA | chisq | tail prob | Irula | BMAC | Steppe_East | WSiberia_N |
| | 8.929 | 0.177591 | 0.411 | 0.477 | 0.052 | 0.06 |
| | | std error | 0.025 | 0.033 | 0.034 | 0.022 |
| Pak_IA_Ali | chisq | tail prob | Irula | BMAC | Steppe_East | WSiberia_N |
| | 6.755 | 0.344126 | 0.438 | 0.402 | 0.104 | 0.055 |
| | | std error | 0.037 | 0.053 | 0.05 | 0.032 |
| S_Sharif_IA | chisq | tail prob | Irula | BMAC | Steppe_East | WSiberia_N |
| | 10.458 | 0.106644 | 0.442 | 0.368 | 0.144 | 0.046 |
| | | std error | 0.015 | 0.022 | 0.022 | 0.013 |
| SPGT | chisq | tail prob | Irula | BMAC | Steppe_East | WSiberia_N |
| | 8.626 | 0.195754 | 0.313 | 0.509 | 0.102 | 0.076 |
| | | std error | 0.011 | 0.016 | 0.015 | 0.009 |
Update 8-22-18: Looking at Shahr_I_Sokhta 1, 2, and 3.
| SIS1 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia |
| | 46.394 | 7.33E-08 | 0.953 | 0.047 |
| | | std error | 0.02 | 0.02 |
| SIS1 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia |
| | 2.715 | 0.84368 | 0.775 | 0.148 | 0.076 |
| | | std error | 0.03 | 0.021 | 0.019 |
| SIS1 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Onge |
| | 5.606 | 0.346473 | 0.716 | 0.172 | 0.095 | 0.017 |
| | | std error | 0.051 | 0.027 | 0.022 | 0.031 |
| SIS1 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Irula |
| | 2.275 | 0.80994 | 0.738 | 0.155 | 0.073 | 0.034 |
| | | std error | 0.051 | 0.022 | 0.02 | 0.041 |
| SIS2 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia |
| | 31.452 | 5.13E-05 | 0.837 | 0.163 |
| | | std error | 0.021 | 0.021 |
| SIS2 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia |
| | 31.336 | 2.19E-05 | 0.842 | -0.005 | 0.163 |
| | | std error | 0.036 | 0.023 | 0.022 |
| SIS2 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Onge |
| | 7.09 | 0.214035 | 0.663 | 0.042 | 0.15 | 0.145 |
| | | std error | 0.057 | 0.031 | 0.024 | 0.034 |
| SIS2 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Irula |
| | 9.597 | 0.0874816 | 0.61 | 0.031 | 0.128 | 0.231 |
| | | std error | 0.058 | 0.023 | 0.021 | 0.049 |
| SIS2 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia | Irula |
| | 11.044 | 0.0870409 | 0.662 | 0.124 | 0.214 |
| | | std error | 0.043 | 0.022 | 0.047 |
| SIS2 | chisquare | Tail-prob | SIS1 | Irula |
| | 20.805 | 0.00407046 | 0.661 | 0.339 |
| | | std error | 0.046 | 0.046 |
| SIS2 | chisquare | Tail-prob | SIS1 | W_Siberia | Irula |
| | 12.337 | 0.0548625 | 0.636 | 0.074 | 0.29 |
| | | std error | 0.045 | 0.025 | 0.049 |
| SIS2 | chisquare | Tail-prob | Sarazm_EN | Irula |
| | 7.525 | 0.376375 | 0.707 | 0.293 |
| | | std error | 0.043 | 0.043 |
| SIS3 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia |
| | 270.628 | 0 | 0.835 | 0.165 |
| | | std error | 0.023 | 0.023 |
| SIS3 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia | Onge |
| | 14.444 | 0.0250507 | 0.494 | 0.089 | 0.417 |
| | | std error | 0.031 | 0.024 | 0.03 |
| SIS3 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Onge |
| | 10.019 | 0.074706 | 0.401 | 0.063 | 0.097 | 0.439 |
| | | std error | 0.052 | 0.028 | 0.023 | 0.032 |
| SIS3 | chisquare | Tail-prob | Ganj_Dareh | Anatolia | W_Siberia | Irula |
| | 6.653 | 0.247741 | 0.229 | 0.004 | 0.039 | 0.727 |
| | | std error | 0.058 | 0.024 | 0.02 | 0.048 |
| SIS3 | chisquare | Tail-prob | Ganj_Dareh | W_Siberia | Irula |
| | 6.47 | 0.372669 | 0.231 | 0.035 | 0.734 |
| | | std error | 0.042 | 0.02 | 0.047 |
| SIS3 | chisquare | Tail-prob | SIS1 | Irula |
| | 5.423 | 0.608532 | 0.23 | 0.77 |
| | | std error | 0.041 | 0.041 |
| SIS3 | chisquare | Tail-prob | Geoksiur | Irula |
| | 5.274 | 0.626561 | 0.213 | 0.787 |
| | | std error | 0.038 | 0.038 |
Narasimhan et al., The Genomic Formation of South and Central Asia, bioRxiv, posted March 31, 2018, doi: https://doi.org/10.1101/292581
Thanks, I was looking forward to this.
Have you been able to estimate any steppe admixture in Irula itself? I think it could be around 5%, so with the admixture from Irula in the models above hovering around 60%, the steppe admixture might increase by some 3%.
Using Eurogenes’ Global 25 I get around 20% Sintashta in Brahmin_UP and 15% in Brahmin_TN, so if that 5% in Irula is real then the results would be very close.
I think the Irula can have 0% actual steppe. They also lack any R1a, from what I can see, which is not very common. I think that we will see BMAC ancestry start to poke its head out. We will see those graves weren’t a dead end in South Asia. Turan-like admixture may be higher than steppe ancestry in South Asia.
Nice work Chad
It seems that the IA Swat samples are predominantly of Irula-type and BMAC ancestry, with anywhere between 3% and 14% steppe MBA and also some Siberian-type admixture.
Based on the received narrative, are we to expect that most of the Indus region was still non-IE in the late Iron Age? Perhaps there is an archaeologically invisible steppe “hub” in the plains of Punjab.
It’s hard to say and I don’t like getting into linguistics too much, but there are a couple groups from IA Swat that need no Steppe MLBA for a plausible fit. Even Brahmins are comparable to these groups, so I still think Steppe ancestry has been very inflated in South Asia. BMAC is the more visible group archaeologically and would also be genetically speaking here.
Hey, a question from my side. In the admixture analysis plot, I have seen that Mal’ta Boy (ANE) carries some percentage of a component that is maximized in some South Indian groups like the Paniya. Can you please tell me what percentage of ancestry an ANE-like population contributes to the Paniya?
Thanks
Thanks for this Chad keep up the good work.
@Alberto I haven’t been able to access your blog, I keep getting sent to some maintenance page.
@Al Bundy, sorry about that. If you still can’t access it, could you contact me (alberto6674 at gmail dot com) so I can figure out the problem? Thanks for reporting it.
@Chad, sorry, as soon as Al sees my message feel free to delete these off topic messages.
Ok, got it, thanks Alberto and Chad.
Can you enlighten me as to the differences between Kotias/Satsurblia (CHG), Ganj_Dareh_N, and Wezmeh Cave (WC1) with regard to proportional ancestry from Basal Eurasian and EHG, or point me to a paper or commentary?
The paper by Farnaz Broushaki (Science, July 2016), “Early Neolithic genomes from the eastern Fertile Crescent”, makes no mention of the paper by M. Gallego-Llorente/R. Pinhasi, “The genetics of an early Neolithic pastoralist from the Zagros, Iran” (Scientific Reports, August 2016).
Sure. WC1 and Ganj Dareh are pretty much identical early farmers of the Zagros. CHG is largely descended from the same stock as the farmers, but with noticeable input from Anatolian and EHG-like populations.
That answers my question.
A follow-up:
1. Can you tell me what percentage of CHG ancestry is from Anatolian and EHG, respectively?
2. Someone like me has 35% Early Iranian Neolithic(-like) ancestry + 10% from CHG + 5% from ANE (same as EHG?) = the so-called ANI.
Can all this ANI have arrived in South Asia with the Neolithic alone?
Does it also need to come from the steppe much later?
I ask because I’m a Dravidian speaker and can trace my local residence deep in south India for the past 7 generations, in the area of the Irula and Puliya.
Thank you in advance for your time and effort.
Yes, I can elaborate a bit. Here is a qpAdm model for CHG.
left pops:
CHG
Anatolia_HG
Ganj_Dareh_N
EHG
right pops:
Chimp
Ust_Ishim
Brazil_LopaDoSanto_9600BP
MA1
Iron_Gates
Levant_N
Tepe_Abdul_Hosein_N
Kostenki14
Villabruna
West_Siberia_N
numsnps used: 563247
best coefficients: 0.253 0.644 0.102
std. errors: 0.024 0.020 0.017
fixed pat wt dof chisq tail prob
000 0 7 9.031 0.250445 0.253 0.644 0.102
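The tail prob in the output above looks consistent with the upper-tail probability of a chi-square distribution with the listed dof. Here is a minimal sketch under that assumption, using only the Python standard library and the closed-form series for integer degrees of freedom. The function name `chi2_tail` is mine and is not part of qpAdm.

```python
from math import erfc, exp, pi, sqrt


def chi2_tail(chisq, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    half = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, dof // 2):
            term *= half / i
            total += term
        return exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) plus a finite series in odd powers of sqrt(x).
    term = sqrt(2.0 * chisq / pi) * exp(-half)  # first term of the series
    total = 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= chisq / (2 * r - 1)
        total += term
    return erfc(sqrt(half)) + total


# chisq and dof taken from the CHG model output above.
p_chg = chi2_tail(9.031, 7)
print(f"tail prob = {p_chg:.4f}")
```

For the CHG model this comes out close to the 0.250445 that qpAdm reports, which is why a low chisq relative to the dof translates into a comfortable fit.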
As to your second question, you can see from my post on South Asia that some of the ANI is from local groups of C and SC Asia during the Chalcolithic. Some steppe input is needed for some of the Iron Age groups. As far as the Irula and Puliya, they could have all of their West Eurasian ancestry from both farmer and hunter input. The Irula may not have any steppe ancestry, but it wouldn’t be referred to as ANI in them, but more as ASI, referring to the ancestry that was present before the Iron Age. I hope that was clear.
“they could have all of their West Eurasian ancestry from both farmer and hunter input. The Irula may not have any steppe ancestry, but it wouldn’t be referred to as ANI in them, but more as ASI”
Thank you- that helps
@Abraham Joseph, I assume you are from Kerala (or nearby regions). The CHG ancestry in Kerala and the Malabar region might be because of traders from West Asia, many of whom settled there (e.g. Syrian Christians, Mapilas).
@Alberto, can you model the Ror and Jat groups in this paper https://www.cell.com/ajhg/fulltext/S0002-9297(18)30398-7 in terms of Indus peripheries (S1S1/S1S3), BMAC, and the rest of the ancient samples?
Alberto may not see that here. You may want to ask at his blog.
Stupid mistake on my side (I apologize) 🙂 but anyway, Chad, can you model the Ror and Jat groups in terms of Indus_peripheries and other aDNA from South Central Asia?
Yes Tim- precisely – from central Kerala.
And there were Nestorian Christian traders between 5th and 9th centuries plying in Silk & Spice trade between Tang China and the Mediterranean. But their history is unrecorded. And so is the history of Kerala during these 4 centuries- zip. Interesting side to population genomics
Newbie here, sorry if my question is ignorant. I don’t quite understand Irula’s admixture in this above model, can you break it down for me on what their admixture is? I’m guessing Irula are mostly around 90%+ ENA with some minor Iran_N?
Based on your data looking at Irula here, would you guess Mesolithic South Asia were mix of ENA + ANE? or were they fully ENA?
The Irula are rumored to be the closest living population to the Rakhigarhi IVC samples. I use them as a baseline for native ancestry to the Indus. The Irula are mostly ENA, but I have not really looked hard at them. They should have ANE ancestry, on top of what was in the Iranian farmers.
I get very poor fits modeling them as a mixture of Neolithic + ENA in nMonte. Please try to model them with qpAdm when you have time.
Irula:Average
fit: 13.4056,
Onge: 80,
Ganj_Dareh_N: 20,
Irula:Average
fit: 12.9848
Onge: 77.5
Sarazm Eneolithic : 22.5
Irula:Average
fit: 6.0546
Onge: 35
SIS3: 65
“I’m guessing Irula are mostly around 90%+ ENA with some minor Iran_N?” I’m a newbie just like you, but I think Iran_N’s contribution to Irula might be around 20-25%.
“would you guess Mesolithic South Asia were mix of ENA + ANE? or were they fully ENA?” Based on the data so far, it’s entirely possible that Iran_N-like ancestry was present in the Indus region prior to the Neolithic. David Reich himself said that this might be a possibility at a recent literature festival in India.
That is for the Indus region, which is very obvious. But I’m talking about the hunter-gatherer spectrum before the arrival of Iran_N-like folks.
Here is someone’s attempt at modeling them; they get a good fit, using something called “Simulated_AASI” from Anthrogenica and G25, by DMXX I believe.
“Irula:Average”,
“fit”: 1.742,
“Simulated_AASI”: 64.17,
“Ganj_Dareh_N”: 23.33,
“Sintashta_MLBA”: 8.33,
“Thailand_IA”: 4.17,