Wednesday, 29 January 2014

Is there a "East-Asian" influence in Continental Europeans? Part II

This post continues the previous post, "Is there a "East-Asian" influence in Continental Europeans?". It elaborates on the separate Finestructure run using European, Siberian and East-Asian samples shown in the lower part of that post. There we looked only at dimensions 1 and 2; here we move on to the higher dimensions.

The third dimension is the same one we have seen many times in the PCA plots for the European panel in this project. It gives the characteristic "V" shape in which South Europeans cluster at the root, Finns and Saamis on one branch, and Eastern Europeans on the other. This dimension is the Saami-Finnish branch versus South Europe. The variation peaks on one side among Saamis and Finns, and as the gradient map shows, it is also consistently present among the Siberians but not among the East-Asians. On the other side it peaks among Sardinians, Basques and Italians, and the East-Asians cluster here with the South Europeans. I am very unsure about the interpretation, but since neither the Siberians nor the East-Asians are at the extreme on either side, I tend to believe it may represent gene flow from Europe towards Siberia and East-Asia. The northern spread may represent gene flow of Saami/Finnish-like hunter-gatherers eastward into Siberia, and the southern part gene flow from Europe towards East-Asia through today's India.

Dimension 3 - peaks among Saami, Finns vs Sardinians and Basque

Dimension 4 is the equivalent of the other branch of the "V" in Europe. At one extreme we find the Lithuanians, Mordovians and other Eastern European populations (although one Chukchi individual actually has a very slightly higher value than the top Lithuanian). At the other extreme are the Basques, Western Europeans, Saamis, Scandinavians and also the East-Asians. Here too I am unsure about the interpretation, but the pattern appears as consistent as in dimension 3, this time with a different spread. It may reflect gene flow from Eastern Europe eastward through Siberia.

 Dimension 4 - peaks among Lithuanians vs Basque 

Dimension 5 appears to peak among East-Asians on one side and Siberians on the other, with the Europeans in between. As we can see, there is a tendency towards Siberian-like influence in the western part of Europe.

  Dimension 5 - peaks among Siberians vs East-Asians

This dimension is interesting with regard to the question of whether there is any East-Asian influence among continental Europeans. If we zoom in on Europe and remove the PCA elements from outside Europe, leaving only the European ones, we get a more detailed view.

Dimension 5 - peaks among Siberians vs East-Asians (zoomed to Europe)

What is very striking here is that it appears to peak among Eastern Europeans, and to some degree also among Finns, but appears weakest among the Basques, Western and South-West Europeans, and among Norwegians and Saamis. This may suggest gene flow from East-Asia that has split in half an earlier haplotype distribution, one that may once have stretched from Western Europe to Siberia but now remains only among Western Europeans, Scandinavians and Siberians. This dimension bears a striking resemblance to another dimension in Europe that I had earlier believed to be internal European variation.

The PCA coordinates for all individuals and all dimensions can be downloaded here.
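The zoomed view described above amounts to keeping only the European rows and then comparing populations along one dimension. Here is a minimal sketch of that step in Python. The row layout, the region labels and every number are invented for illustration and are not the project's results.

```python
from collections import defaultdict

# Hypothetical rows: (individual id, population, region, dimension-5 coordinate).
# All values are invented for illustration only.
rows = [
    ("ind1", "Lithuanian", "Europe", 0.031),
    ("ind2", "Lithuanian", "Europe", 0.027),
    ("ind3", "Basque", "Europe", -0.022),
    ("ind4", "Basque", "Europe", -0.018),
    ("ind5", "Finn", "Europe", 0.012),
    ("ind6", "Han", "East Asia", 0.140),
]

def population_means(rows, region):
    """Mean coordinate per population, keeping only rows from one region."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _ind, pop, reg, value in rows:
        if reg == region:
            sums[pop] += value
            counts[pop] += 1
    means = {pop: sums[pop] / counts[pop] for pop in sums}
    # Highest mean first, so the two ends of the list are the two "peaks".
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

european = population_means(rows, "Europe")
for pop, mean in european:
    print(f"{pop:12s} {mean:+.3f}")
```

Sorting by the population mean puts the two extremes of the dimension at the top and bottom of the list, which is what "peaks among" refers to in the text.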

Friday, 24 January 2014

Is there a "East-Asian" influence in Continental Europeans?

Updated 27/01/14

This question has been following me since the previous blogpost, where I found this geographical distribution of chunkcount PCA dimension 4 among Eurasian populations.

Everybody would probably agree about the northern distribution shown in blue and green, which apparently shows a genetic connection between North-East European populations like Saamis and Finns and Northern Siberian populations, all the way from Fennoscandia to Beringia in Eastern Siberia. This has been shown many times on this blog, in university research and by other bloggers. However, if we look closer at the color distribution for the map:

We should of course not take this color distribution too literally, as PCA plot distributions can be affected by many things, but as we can see the red shade is close to the brown. This probably means that the whole area from western continental Europe to East and South-East Asia shows haplotype similarity. So there appears to have been not only a northern East-West influence but also a southern one.

To investigate this further I did a separate Chromopainter run using 23k linked SNPs on chromosome 1, with a selection of individuals that had been phased together with over 2000 individuals using a high number of iterations in BEAGLE to keep the error rate to a minimum. I used Chromopainter's admixture functionality to design an admixture test using the relevant East-Asian, African and Siberian populations. As far as I know from previous screening using ADMIXTURE, all these individuals appear unmixed, without any European admixture.

As we can see, North-East Europeans appear to score low on East-Asian influence, while continental Europeans from western and central Europe appear to score rather high. As expected, we see the largest African-like influence among the more southern populations and the North-Siberian influence in the more North-East European populations. Please note that as this result is based on only one chromosome, it will not always correlate 100% with the result of a whole-genome analysis. There is a strong negative correlation between the East-Asian and Siberian component at -0.56, and an even stronger negative correlation between the Siberian and African component. The correlation between the East-Asian and Siberian component is weak at -0.17.
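The correlations quoted above are pairwise correlations between the component scores across populations. Here is a minimal sketch of how such figures can be computed, assuming Pearson's r was the measure used (the post does not say). The scores are invented for illustration and do not reproduce the values above.

```python
import math
from itertools import combinations

# Hypothetical component scores for five populations (invented numbers).
components = {
    "East-Asian": [0.06, 0.05, 0.02, 0.01, 0.04],
    "Siberian":   [0.01, 0.02, 0.08, 0.10, 0.03],
    "African":    [0.03, 0.02, 0.00, 0.00, 0.04],
}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)

# One coefficient per pair of components.
for a, b in combinations(components, 2):
    print(f"{a} vs {b}: r = {pearson(components[a], components[b]):+.2f}")
```

A value near -1 means that populations scoring high on one component tend to score low on the other, while a value near 0 means the two components vary more or less independently.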

We can see this further using an area plot of the same data. The continental European populations from western and central Europe appear quite consistently closer to the East-Asian populations.

In Finestructure (from the previous whole-genome run using superindividuals) the clustering appears to confirm what was observed above. The East-Asian influence appears rather consistent across all European populations included. In the North-East European populations this East-Asian influence appears to be lacking or weaker, but the North-Siberian influence appears as expected from earlier analyses.

Individual results for project participants for PCA dimension 4, shown in the first image above. The first column shows the actual PCA plot value and the second the ranking, with the top towards East-Asian and the bottom towards Siberian.
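The two columns described above, a value and a rank, can be derived from each other with a simple sort. This is a minimal sketch with invented participant ids and values. It assumes the highest value is the East-Asian end of the dimension; the sign convention is a guess and would be flipped if the original plot runs the other way.

```python
# Hypothetical participant values on PCA dimension 4 (invented for illustration).
values = {"P01": 0.0123, "P02": -0.0045, "P03": 0.0310, "P04": -0.0191}

def rank_from_top(values):
    """Return (participant, value, rank) tuples, with rank 1 for the highest value."""
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return [(pid, val, rank) for rank, (pid, val) in enumerate(ordered, start=1)]

table = rank_from_top(values)
for pid, val, rank in table:
    print(f"{pid}  {val:+.4f}  {rank}")
```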

This is a separate PCA run using Europeans, East Asians and Siberians as reference. The X-axis (horizontal) shows dimension 1 and the Y-axis (vertical) shows dimension 2. The X-axis shows the European vs East-Asian influence; the Y-axis shows the European vs Siberian influence.

Europe vs East Asia and Siberians overview

 Europe vs East Asia and Siberians zoomed at Europeans

  Europe vs East Asia and Siberians zoomed at Europeans detailed 

Gradient map of PCA dimension 1 - blue most East-Asian, brown least East-Asian like.

 Gradient map of PCA dimension 2 - blue least Siberian like, brown most Siberian like 
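For readers who want to see how two such axes arise from chunk counts, here is a generic PCA sketch in plain Python. It is not the Finestructure procedure itself. It uses power iteration with deflation, which I have swapped in only to avoid external libraries. The 6 x 3 matrix of chunk counts is invented, and real input would be a much larger individual-by-donor matrix.

```python
import math
import random

# Hypothetical chunk counts: rows are recipient individuals, columns are donor groups
# [European donors, East-Asian donors, Siberian donors]. Invented numbers.
chunkcounts = [
    [90, 5, 5], [88, 6, 6],      # two European-like individuals
    [10, 70, 20], [12, 68, 20],  # two East-Asian-like individuals
    [15, 25, 60], [13, 27, 60],  # two Siberian-like individuals
]

def center_columns(matrix):
    """Subtract each column mean so the axes describe variation around the average."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in matrix]

def covariance(centered):
    """Sample covariance matrix of the already centered columns."""
    n, m = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
            for i in range(m)]

def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def leading_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: repeated multiplication converges to the top eigenvector."""
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in matrix]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    value = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))  # Rayleigh quotient
    return value, vec

centered = center_columns(chunkcounts)
cov = covariance(centered)
val1, vec1 = leading_eigenpair(cov)
# Deflation: remove the first axis from the covariance, then repeat for the second.
size = len(cov)
deflated = [[cov[i][j] - val1 * vec1[i] * vec1[j] for j in range(size)]
            for i in range(size)]
val2, vec2 = leading_eigenpair(deflated, seed=1)

# Dimension 1 and dimension 2 coordinates per individual (X and Y of a PCA plot).
coords = [(sum(a * b for a, b in zip(row, vec1)),
           sum(a * b for a, b in zip(row, vec2))) for row in centered]
for x, y in coords:
    print(f"{x:+8.2f} {y:+8.2f}")
```

In this toy data the first axis separates the European-like rows from the rest and the second separates the East-Asian-like rows from the Siberian-like ones. The sign of each axis is arbitrary, so only relative positions are meaningful.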

Friday, 10 January 2014

Eurasian variation gradient maps

This is a graphic presentation of the 7 Eurasian haplotype-based (chunkcount) PCA plots from the latest project run using 289k linked SNPs. The average number of SNPs in each segment chunk in the world panel is 13 (with a heavy overweight of Europeans, and Northern Europeans in particular). This analysis is at this stage experimental. Please ignore the coloring in Africa, Australia and Greenland, as no populations from these regions are included.
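To illustrate why regions without samples still get a color, here is a minimal sketch of one common way to build a gradient map: inverse-distance weighting (IDW). This is an assumption on my part for illustration only, since the post does not say which mapping tool or interpolation method produced the maps. The sample coordinates and values are invented.

```python
import math

# Hypothetical sample sites: (latitude, longitude, PCA coordinate). Invented values.
samples = [
    (69.0, 23.0, 0.040),
    (55.0, 24.0, 0.010),
    (40.0, 9.0, -0.020),
    (23.0, 105.0, -0.045),
]

def idw(lat, lon, samples, power=2.0):
    """Inverse-distance-weighted average of the sample values at (lat, lon)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, value in samples:
        # Crude flat distance in degrees; a real map would use great-circle distance.
        distance = math.hypot(lat - s_lat, lon - s_lon)
        if distance == 0.0:
            return value  # exactly on a sample site
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# A point far from every sample still receives a value, and therefore a color.
print(idw(-25.0, 135.0, samples))
```

Any interpolation of this kind assigns a value to every grid cell, including cells on continents with no sampled populations, so the coloring there carries no information.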

The first dimension peaks on one side among Finns and Saamis (brown), and on the other side among Panya, Chukchi and Cambodians (light blue or blue). It appears that Fennoscandians in general belong to this brown component. This component appears identical to dimension 1 in the previous analysis. As this is the first dimension, it also explains the largest variation in haplotypes between the populations. As the "blue side" here appears mostly along the coast, peaking among populations that show affiliation to Papuans and Melanesians, I suspect it separates this ancient population from old European hunter-gatherers.

Eurasian dimension 1

The second dimension peaks on one side among Lithuanians and Scandinavians and on the other side among "The Others", which contains the remaining individuals; within the remaining panel it peaks among the Miao and some other East-Asian populations. It seems to show a clear division between West and East Eurasian populations. Fennoscandians in general belong to this western cluster.

Eurasian dimension 2

The third dimension peaks on one side among "The Others", and secondarily among Bedouins and other Middle Eastern populations, and on the other side among North Siberian populations. It seems to represent influence from the African continent. This influence has reached as far north as South Scandinavia, but to a lesser extent among Saamis and Finns. This dimension may be related to dimension 3 in the previous analysis.

Eurasian dimension 3

The fourth dimension peaks among Dai, Cambodians and Han in South-East Asia on one side and among Koryak, Yukaghir and Nganasans in North Siberia on the other. Saamis, Finns and to a degree Scandinavians seem most similar in variance to the North-Siberian group, while Continental Europeans appear more similar to the South-East Asian group.

Eurasian dimension 4

The fifth component peaks among Saamis, Finns and some South-East Asian populations on one side and among Northern Siberians on the other. Scandinavians appear less related to this dimension. This dimension may be related to dimension 3 in the previous analysis.

This component appears more difficult to explain, as other analyses have shown no connection between Saamis, Finns and South-East Asians, but rather one to North-Siberians as in dimension 4. It may be an effect of having a large sample of Europeans and small samples of other populations. However, it is striking that the clustering appears consistent among the various non-European populations and is not spread out randomly, as it would be if there were no structure. As far as I know and can remember, no such wide-scale analysis using linked haplotypes has been done before, so it may be something not seen before.

Eurasian dimension 5

The sixth dimension peaks among East-Asians on one side and the Indian subcontinent on the other. Saamis and Finns appear a little closer to the Indian subcontinent than Scandinavians.

Eurasian dimension 6

The seventh dimension peaks among the Lithuanians and Koryak on one side and among the Chukchi, South-Indians and some North-Siberians on the other. In more general terms, the heatmap shows similarity between Eastern Europe and South-East Asia. This dimension, like dimension 5, appears difficult to explain, and what was said about that dimension applies here as well.

Eurasian dimension 7

The remaining higher dimensions appear to show local variation between Siberian groups.