Breaking Down the Process

The authors split their analysis, and the way the paper presents it, into two major sections, the second of which has several subsections (Benestan et al., 2016).

Their first step was to apply population differentiation approaches to the single nucleotide polymorphisms (SNPs) they had found, using BAYESCAN, ARLEQUIN, and OUTFLANK. They extracted the SNPs and compared them across geographic locations. They had three different lobster populations, so a SNP that was an outlier in all three population datasets (three separate population differentiation determinations) was labeled as under divergent selection. A SNP that was not significant in any of the three was labeled as neutral (not under selection), and a SNP that was significant in some but not all of them was identified as being under balancing selection and removed.
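The labeling rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as summarized here, not the authors' code. The function names, SNP IDs, and outlier flags are hypothetical, and each SNP is assumed to come with one true/false outlier flag per differentiation determination.

```python
def classify_snp(flags):
    """Label one SNP from its per-determination outlier flags (True = outlier)."""
    if not flags:
        raise ValueError("need at least one outlier flag per SNP")
    if all(flags):
        return "divergent"   # outlier in every determination
    if not any(flags):
        return "neutral"     # never flagged as an outlier
    return "balancing"       # flagged in some determinations only


def partition_snps(outlier_flags):
    """Return (all labels, labels of the SNPs kept after dropping 'balancing')."""
    labels = {snp: classify_snp(flags) for snp, flags in outlier_flags.items()}
    kept = {snp: label for snp, label in labels.items() if label != "balancing"}
    return labels, kept


# Made-up illustration data, not values from the paper.
example_flags = {
    "snp_001": (True, True, True),
    "snp_002": (False, False, False),
    "snp_003": (True, False, True),
}

labels, kept = partition_snps(example_flags)
print(labels)
print(kept)
```

In this toy input, snp_003 is flagged in only two of the three determinations, so it is labeled as balancing and left out of `kept`.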

Once this step was complete, the authors moved on to environmental correlation analysis, environmental feature modeling, and redundancy analysis to link the spatial structure of the lobster populations to the presence of genes under divergent selection. For this part of the analysis, the authors have uploaded all of their SNP data and the scripts they used, which can be found here. For this tutorial, we'll first take their identified SNPs and run their R script to see if we can reproduce their results. Then we'll repeat the analysis in Python, and finally we'll try to identify the original SNPs again using the process that they briefly describe.

Benestan, L., Quinn, B. K., Maaroufi, H., Laporte, M., Clark, F. K., Greenwood, S. J., … Bernatchez, L. (2016). Seascape genomics provides evidence for thermal adaptation and current-mediated population structure in American lobster (Homarus americanus). Molecular Ecology, 25(20), 5073–5092.