Another look at South Asian aDNA

With Narasimhan et al. (2018), we got our first look at Central, South Central, and South Asian aDNA. Not only did we get to see new steppe samples from throughout the Bronze Age, but also samples from the Chalcolithic through the Bronze Age in the Turan region, including BMAC. While there certainly looks to be steppe ancestry in South Asia, it has likely been highly inflated in models built on previously available aDNA, and in those that did not account for ANE that was already present in the region. With the soon-to-be-released Harappan sample(s), the models will only improve further.

This post will be constantly evolving as I add new outputs from qpAdm and qpGraph, so keep checking back in.

What I have noticed using qpAdm is that South Asian Dravidians do wonders as stand-ins for Harappan ancestry. So, we may see that one of these groups greatly resembles the Harappans. The Palliyar and Paniya work well, but the Irula seem to work best. I don’t know whether that really means anything or simply reflects the fact that they have more coverage.

The first thing I did was to look for populations to occupy the right pops, that is, populations which create the most significant D-stats between my left populations (those set as the populations used in the mixture). Aside from an African outgroup, Mbuti_DG, I found that Ust-Ishim, Onge, Ami, EHG, Iron_Gates, Anatolia_N, Ganj_Dareh_N, and Karitiana work well. Kostenki14 is hit and miss, as it doesn’t always produce significant stats when comparing two populations. This could be due to the age of the sample, which may not have developed enough distinctive drift to help differentiate populations in the test. This can lead to higher chi-squares and lower tail probabilities.
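
To make the chisq and tail prob columns in the tables below easier to read, here is a minimal Python sketch, not part of qpAdm itself, of how a tail probability follows from a chi-square value and its degrees of freedom. The function names are hypothetical, and the degrees-of-freedom rule (number of right pops minus number of source populations) is an assumption that matches the dof of 7 reported for the CHG model quoted in the comments, which uses 10 right pops and 3 sources.

```python
import math


def chi2_tail(chisq: float, dof: int) -> float:
    """Upper-tail probability P(X >= chisq) for a chi-square variable with integer dof."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if chisq <= 0:
        return 1.0
    half = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, dof // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^(j-1) / (1*3*...*(2j-1))
    term, total = 1.0, 0.0
    for j in range(1, (dof - 1) // 2 + 1):
        if j > 1:
            term *= chisq / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * chisq / math.pi) * math.exp(-half) * total


def assumed_dof(n_right: int, n_sources: int) -> int:
    """Hypothetical helper. Assumption: dof = right pops minus source populations."""
    return n_right - n_sources


# CHG model from the comments: 10 right pops, 3 sources, chisq 9.031, reported dof 7.
chg_dof = assumed_dof(10, 3)
print(chg_dof, chi2_tail(9.031, chg_dof))
```

Under that assumption, the first SGDP row (10 right pops with Kostenki14, 4 sources, chisq 2.872) gets 6 degrees of freedom, and `chi2_tail(2.872, 6)` comes out close to the listed 0.82475. A low tail probability (commonly below 0.05) is usually read as a poor fit.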

For the following, Brahmin_SGDP and Brahmin_Tiwari had good marker counts, ranging from 170K to 200K, but Brahmin_TN and Brahmin_UP sit around 50K, so they should be taken with a grain of salt.

SIS1 = Shahr_I_Sokhta_BA1

Arm_EBA = Armenia_EBA

 

SGDP          chisq       tail prob   SIS1        Irula       Sintashta   Dali_EBA    Arm_EBA
w Kostenki    2.872       0.82475     0.208       0.623       0.1         0.069       NA
std error                             0.038       0.035       0.036       0.035       NA
w/o Kostenki  2.031       0.844866    0.212       0.62        0.103       0.065       NA
std error                             0.037       0.034       0.036       0.035       NA
w Kostenki    1.868       0.760109    0.167       0.609       0.065       0.097       0.063
std error                             0.071       0.034       0.063       0.051       0.08
w/o Kostenki  2.599       0.761475    0.178       0.61        0.057       0.097       0.058
std error                             0.063       0.035       0.065       0.046       0.081
Tiwari        chisq       tail prob   SIS1        Irula       Sintashta   Dali_EBA    Arm_EBA
w Kostenki    8.292       0.217452    0.138       0.583       0.208       0.071       NA
std error                             0.025       0.021       0.023       0.02        NA
w/o Kostenki  7.332       0.19711     0.139       0.577       0.211       0.073       NA
std error                             0.025       0.02        0.023       0.02        NA
w Kostenki    5.855       0.320606    0.089       0.579       0.154       0.099       0.08
std error                             0.04        0.021       0.039       0.026       0.049
w/o Kostenki  5.351       0.253112    0.096       0.574       0.162       0.097       0.071
std error                             0.039       0.02        0.039       0.026       0.049
TN            chisq       tail prob   SIS1        Irula       Sintashta   Dali_EBA    Arm_EBA
w Kostenki    0.925       0.988309    0.156       0.656       0.113       0.074       NA
std error                             0.043       0.039       0.04        0.04        NA
w/o Kostenki  0.557       0.989882    0.168       0.643       0.117       0.072       NA
std error                             0.042       0.037       0.038       0.039       NA
w Kostenki    0.861       0.973004    0.145       0.653       0.097       0.086       0.019
std error                             0.079       0.039       0.073       0.049       0.095
w/o Kostenki  0.624       0.960366    0.191       0.638       0.13        0.068       -0.027
std error                             0.079       0.038       0.072       0.049       0.093
UP            chisq       tail prob   SIS1        Irula       Sintashta   Dali_EBA    Arm_EBA
w Kostenki    7.561       0.272087    0.147       0.598       0.181       0.075       NA
std error                             0.032       0.028       0.031       0.031       NA
w/o Kostenki  5.636       0.343262    0.151       0.59        0.188       0.071       NA
std error                             0.031       0.027       0.029       0.029       NA
w Kostenki    7.338       0.196693    0.11        0.599       0.144       0.094       0.054
std error                             0.066       0.028       0.059       0.039       0.081
w/o Kostenki  5.866       0.209375    0.136       0.59        0.171       0.08        0.022
std error                             0.061       0.027       0.054       0.037       0.073

NEW-8-9-18

Dzh1 = Dzharkutan1_BA, Late BMAC

Steppe_E = Steppe_MLBA_East

SGDP          chisq       tail prob   Irula       Dzh1        Sintashta   Steppe_E    Dali_EBA
w Kostenki    6.954       0.433685    0.681       0.203       0.116       NA          NA
std error                             0.023       0.034       0.028       NA          NA
              6.294       0.505837    0.678       0.198       NA          0.124       NA
std error                             0.023       0.034       NA          0.028       NA
              3.787       0.705485    0.629       0.225       NA          0.075       0.07
std error                             0.029       0.036       NA          0.037       0.033

The above is interesting in that there are whole graves, spread from India to West Asia, that are completely late BMAC in character. Considering the number of cemeteries and remains, there seems to be no way for BMAC ancestry not to be detectable in South Asia. I think the Harappan sample(s) will show that BMAC ancestry is indeed important in South Asia.

Looking at the Swat Valley samples, it gets even more interesting…

Target        chisq       tail prob   Irula       Dzh1        Steppe_E    Dali_EBA
Aligrama      12.238      0.0568697   0.49        0.355       0.091       0.063
std error                             0.03        0.036       0.034       0.03
Butkara_IA    5.148       0.524941    0.404       0.489       0.03        0.077
std error                             0.029       0.034       0.034       0.031
Pak_IA_Ali    8.014       0.237064    0.431       0.419       0.087       0.063
std error                             0.042       0.053       0.05        0.043
S_Sharif_IA   8.433       0.208056    0.437       0.364       0.141       0.059
std error                             0.018       0.023       0.022       0.018
SPGT          11.378      0.0773638   0.316       0.503       0.113       0.069
std error                             0.014       0.018       0.018       0.015

Interestingly, there seems to be no need for Andronovo admixture in Butkara or Pakistan_IA_Aligrama, and the first Aligrama can also do okay with just Dali plus late BMAC. Of course, this all depends on the underlying population being similar to the Irula. Either way, though, the Steppe ancestry should not really move. Next, I’ll see how including all BMAC samples affects the output.

Target        chisq       tail prob   Irula       BMAC        Steppe_East WSiberia_N
Aligrama      15.054      0.0198432   0.5         0.359       0.083       0.058
std error                             0.027       0.036       0.034       0.022
Butkara_IA    8.929       0.177591    0.411       0.477       0.052       0.06
std error                             0.025       0.033       0.034       0.022
Pak_IA_Ali    6.755       0.344126    0.438       0.402       0.104       0.055
std error                             0.037       0.053       0.05        0.032
S_Sharif_IA   10.458      0.106644    0.442       0.368       0.144       0.046
std error                             0.015       0.022       0.022       0.013
SPGT          8.626       0.195754    0.313       0.509       0.102       0.076
std error                             0.011       0.016       0.015       0.009
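
One rough way to read the steppe column in the two Swat Valley tables above is to divide each coefficient by its standard error and to see how much the coefficient shifts between the Dzh1 model and the pooled BMAC model. The Python sketch below does only that. The cutoff of 2 is an assumed rule of thumb, not a qpAdm setting, and the names are hypothetical; the numbers are copied from the tables.

```python
# (coefficient, std error) for the steppe source in each Swat Valley target:
# first the Dzh1 model (Steppe_E), then the pooled BMAC model (Steppe_East).
STEPPE = {
    "Aligrama": ((0.091, 0.034), (0.083, 0.034)),
    "Butkara_IA": ((0.030, 0.034), (0.052, 0.034)),
    "Pak_IA_Ali": ((0.087, 0.050), (0.104, 0.050)),
    "S_Sharif_IA": ((0.141, 0.022), (0.144, 0.022)),
    "SPGT": ((0.113, 0.018), (0.102, 0.015)),
}

Z_CUTOFF = 2.0  # assumed rule of thumb for "distinguishable from zero"


def summarize(results, cutoff=Z_CUTOFF):
    """Return z-scores per model and the coefficient shift between models."""
    summary = {}
    for target, ((c1, s1), (c2, s2)) in results.items():
        z1, z2 = c1 / s1, c2 / s2
        summary[target] = {
            "z_dzh1": z1,
            "z_bmac": z2,
            "shift": c2 - c1,
            "both_above_cutoff": z1 >= cutoff and z2 >= cutoff,
        }
    return summary


summary = summarize(STEPPE)
for target, row in summary.items():
    print(target, round(row["z_dzh1"], 2), round(row["z_bmac"], 2),
          round(row["shift"], 3), row["both_above_cutoff"])
```

Read this way, the steppe coefficient in Butkara_IA sits within about one to two standard errors of zero in both models, while S_Sharif_IA and SPGT are well above the cutoff, and in every target the shift between the two models is smaller than the standard error.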

 

Update 8-22-18: Looking at Shahr_I_Sokhta 1, 2, and 3.

Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia
SIS1          46.394      7.33E-08    0.953       0.047
std error                             0.02        0.02
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia
SIS1          2.715       0.84368     0.775       0.148       0.076
std error                             0.03        0.021       0.019
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Onge
SIS1          5.606       0.346473    0.716       0.172       0.095       0.017
std error                             0.051       0.027       0.022       0.031
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Irula
SIS1          2.275       0.80994     0.738       0.155       0.073       0.034
std error                             0.051       0.022       0.02        0.041
Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia
SIS2          31.452      5.13E-05    0.837       0.163
std error                             0.021       0.021
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia
SIS2          31.336      2.19E-05    0.842       -0.005      0.163
std error                             0.036       0.023       0.022
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Onge
SIS2          7.09        0.214035    0.663       0.042       0.15        0.145
std error                             0.057       0.031       0.024       0.034
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Irula
SIS2          9.597       0.0874816   0.61        0.031       0.128       0.231
std error                             0.058       0.023       0.021       0.049
Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia   Irula
SIS2          11.044      0.0870409   0.662       0.124       0.214
std error                             0.043       0.022       0.047
Target        chisquare   Tail-prob   SIS1        Irula
SIS2          20.805      0.00407046  0.661       0.339
std error                             0.046       0.046
Target        chisquare   Tail-prob   SIS1        W_Siberia   Irula
SIS2          12.337      0.0548625   0.636       0.074       0.29
std error                             0.045       0.025       0.049
Target        chisquare   Tail-prob   Sarazm_EN   Irula
SIS2          7.525       0.376375    0.707       0.293
std error                             0.043       0.043
Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia
SIS3          270.628     0           0.835       0.165
std error                             0.023       0.023
Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia   Onge
SIS3          14.444      0.0250507   0.494       0.089       0.417
std error                             0.031       0.024       0.03
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Onge
SIS3          10.019      0.074706    0.401       0.063       0.097       0.439
std error                             0.052       0.028       0.023       0.032
Target        chisquare   Tail-prob   Ganj_Dareh  Anatolia    W_Siberia   Irula
SIS3          6.653       0.247741    0.229       0.004       0.039       0.727
std error                             0.058       0.024       0.02        0.048
Target        chisquare   Tail-prob   Ganj_Dareh  W_Siberia   Irula
SIS3          6.47        0.372669    0.231       0.035       0.734
std error                             0.042       0.02        0.047
Target        chisquare   Tail-prob   SIS1        Irula
SIS3          5.423       0.608532    0.23        0.77
std error                             0.041       0.041
Target        chisquare   Tail-prob   Geoksiur    Irula
SIS3          5.274       0.626561    0.213       0.787
std error                             0.038       0.038
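
Several of the models above differ by a single added source, so a rough way to ask whether the extra source helps is to look at the drop in chi-square. The Python sketch below treats that drop as a chi-square with one degree of freedom. That is an assumption which only holds if both models were run with the same right pops, and the function name is hypothetical.

```python
import math


def one_source_p_value(chisq_without: float, chisq_with: float) -> float:
    """p-value for the chi-square drop from adding one source (1 dof assumed)."""
    delta = chisq_without - chisq_with
    if delta <= 0:
        return 1.0  # no improvement at all
    # Upper tail of a chi-square with 1 dof: erfc(sqrt(delta / 2))
    return math.erfc(math.sqrt(delta / 2.0))


# Ganj_Dareh + W_Siberia versus Ganj_Dareh + Anatolia + W_Siberia
sis1_p = one_source_p_value(46.394, 2.715)
sis2_p = one_source_p_value(31.452, 31.336)
print(sis1_p, sis2_p)
```

On these numbers, adding Anatolia makes a very large difference for SIS1 and essentially none for SIS2, which fits the Anatolia coefficient of -0.005 in the SIS2 model.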

 

 

Narasimhan et al., The Genomic Formation of South and Central Asia, posted March 31, 2018, doi: https://doi.org/10.1101/292581

 

 

 

 

25 thoughts on “Another look at South Asian aDNA”

  1. Thanks, I was looking forward to this.
    Have you been able to estimate any steppe admixture in Irula itself? I think it could be around 5%, so with the Irula admixture in the models above hovering around 60%, the steppe admixture might increase by some 3%.
    Using Eurogenes’ Global 25 I get around 20% Sintashta in Brahmin_UP and 15% in Brahmin_TN, so if that 5% in Irula is real then the results would be very close.


  2. I think the Irula can have 0% actual steppe. They also lack any R1a, from what I can see, which is not very common. I think that we will see BMAC ancestry start to peek its head out. We will see that those graves weren’t a dead end in South Asia. Turan-like admixture may be higher than steppe ancestry in South Asia.


  3. Nice work, Chad.
    It seems that the IA Swat samples are predominantly of Irula-type & BMAC ancestry, with anything between 3-14% of steppe MBA and also some Siberian-type admixture.
    Based on the received narrative, are we to expect that most of the Indus region was still non-IE in the late Iron Age? Perhaps there is an archaeologically invisible steppe “hub” in the plains of Punjab.


  4. It’s hard to say and I don’t like getting into linguistics too much, but there are a couple of groups from IA Swat that need no Steppe MLBA for a plausible fit. Even Brahmins are comparable to these groups, so I still think Steppe ancestry has been very inflated in South Asia. BMAC is the more visible group archaeologically, and it would be genetically as well here.


    1. Hey, a question from my side. In the admixture analysis plot, I have seen some % of a component in Mal’ta Boy (ANE) that is maximized in some South Indian groups like the Paniya. Can you please tell me what % of ancestry an ANE-like population contributes to the Paniya?
      Thanks


  5. @Al Bundy, sorry about that. If you still can’t access it, could you contact me (alberto6674 at gmail dot com) so I can figure out the problem? Thanks for reporting it.

    @Chad, sorry, as soon as Al sees my message feel free to delete these off topic messages.


  6. Can you enlighten me as to the differences between Kotias/Satsurblia (CHG), Ganj_Dareh_N and Wezmeh Cave (WC1) with regard to proportional ancestry from Basal Eurasian & EHG, or point me to a paper or commentary?
    The paper by Farnaz Broushaki (Science, July 2016), “Early Neolithic Genomes from Eastern Fertile Crescent”, makes no mention of the paper by M. Gallego-Llorente/R. Pinhasi, “The Genetics of pastoralists from Zagros, Iran” (Nature, August 2016).


    1. Sure. WC1 and Ganj Dareh are pretty much identical early farmers of the Zagros. CHG is largely descended from the same stock as the farmers, but with noticeable input from Anatolian and EHG-like populations.


      1. That answers my question.
        Follow-up:
        1. Can you tell me what percent of CHG ancestry is from Anatolian and EHG respectively?
        2. Someone like me has 35% Early Iranian Neolithic(-like) ancestry + 10% from CHG + 5% from ANE (same as EHG?) = the so-called ANI.

        Can all this ANI have arrived in South Asia with the Neolithic alone?
        Does it need to be also from the steppe much later?
        I ask because I’m a Dravidian speaker and can trace my local residence deep in south India for the past 7 generations
        in the area of Irula and Puliya.

        Thank you for your time and effort in advance


      2. Yes, I can elaborate a bit. Here is a qpAdm model for CHG.

        left pops:
        CHG
        Anatolia_HG
        Ganj_Dareh_N
        EHG

        right pops:
        Chimp
        Ust_Ishim
        Brazil_LopaDoSanto_9600BP
        MA1
        Iron_Gates
        Levant_N
        Tepe_Abdul_Hosein_N
        Kostenki14
        Villabruna
        West_Siberia_N

        numsnps used: 563247

        best coefficients: 0.253 0.644 0.102
        std. errors: 0.024 0.020 0.017

        fixed pat wt dof chisq tail prob
        000 0 7 9.031 0.250445 0.253 0.644 0.102

        As to your second question, you can see from my post on South Asia that some of the ANI is from local groups of C and SC Asia during the Chalcolithic. Some steppe input is needed for some of the Iron Age groups. As far as the Irula and Puliya, they could have all of their West Eurasian ancestry from both farmer and hunter input. The Irula may not have any steppe ancestry, but in them it wouldn’t be referred to as ANI, but rather as ASI, referring to the ancestry that was present before the Iron Age. I hope that was clear.


  7. “they could have all of their West Eurasian ancestry from both farmer and hunter input. The Irula may not have any steppe ancestry, but it wouldn’t be referred to as ANI in them, but more as ASI”

    Thank you- that helps


  8. @Abraham Joseph, I assume you are from Kerala (or nearby regions). The CHG ancestry in the Kerala and Malabar region might be because of the traders from West Asia, many of whom settled there (e.g. Syrian Christians, Mapilas).


      1. Stupid mistake from my side (I apologize) 🙂 but anyway, Chad, can you model the Rors and Jats groups in terms of Indus_peripheries and other aDNA from south central Asia?


  9. Yes Tim- precisely – from central Kerala.
    And there were Nestorian Christian traders between 5th and 9th centuries plying in Silk & Spice trade between Tang China and the Mediterranean. But their history is unrecorded. And so is the history of Kerala during these 4 centuries- zip. Interesting side to population genomics


  10. Newbie here, sorry if my question is ignorant. I don’t quite understand Irula’s admixture in this above model, can you break it down for me on what their admixture is? I’m guessing Irula are mostly around 90%+ ENA with some minor Iran_N?

    Based on your data looking at Irula here, would you guess Mesolithic South Asia were mix of ENA + ANE? or were they fully ENA?


    1. The Irula are rumored to be the closest living population to the Rakhigarhi IVC samples. I use them as a baseline for native ancestry to the Indus. The Irula are mostly ENA, but I have not really looked hard at them. They should have ANE ancestry, on top of what was in the Iranian farmers.


      1. I get very poor fits when modeling them as an admixture of Neolithic + ENA in nMonte. Please try to model them with your qpAdm when you have time.

        Irula:Average
        fit: 13.4056,
        Onge: 80,
        Ganj_Dareh_N: 20,

        Irula:Average
        fit: 12.9848
        Onge: 77.5
        Sarazm Eneolithic : 22.5

        Irula:Average
        fit: 6.0546
        Onge: 35
        SIS3: 65


  11. “I’m guessing Irula are mostly around 90%+ ENA with some minor Iran_N?” I am a newbie just like you, but I think Iran_N’s contribution to Irula might be around 20-25%.

    “would you guess Mesolithic South Asia were mix of ENA + ANE? or were they fully ENA?” Based on the data so far, it’s entirely possible that Iran_N-like ancestry was present in the Indus region prior to the Neolithic. David Reich himself said that this might be a possibility at a recent literature festival in India.


    1. That is for the Indus region, which is very obvious. But I’m talking about the hunter-gatherer spectrum before the arrival of Iran_N-like folks.


  12. Here is someone’s attempt at modeling them. They get a good fit, using something called “Simulated_AASI” from Anthrogenica and G25 by DMXX, I believe.

    “Irula:Average”,
    “fit”: 1.742,
    “Simulated_AASI”: 64.17,
    “Ganj_Dareh_N”: 23.33,
    “Sintashta_MLBA”: 8.33,
    “Thailand_IA”: 4.17,

