Searching for a stock structure in Sardina pilchardus from the Adriatic and Ionian seas using a microsatellite DNA-based approach

In the present study the genetic variability of European sardine from Adriatic and Ionian seas was investigated in order to detect the occurrence of genetic structure within and between these basins. In several samples the analysis of genetic variability at eight microsatellite loci showed a number of homozygote individuals higher than expected at Hardy-Weinberg equilibrium. The inter-population differentiation level estimated by AMOVA, qST and r R ST and Bayesian descriptors detected no signs of population differentiation between the samples analysed. These results are consistent with previous studies based on allozymes and several mitochondrial DNA markers and add further evidence contradicting the early identification, based on morphological and reproductive data, of two sub-populations in the Adriatic Sea.


INTRODUCTION
The identification of fish stocks is the first step in management and conservation processes (Waldman 2005). During the last few decades several approaches have been applied in the attempt to characterize these management units: i) use of morphological and meristic data (Waldman 2005); ii) use of specific parasite-host interaction ("natural tagging") (Abaunza et al. 2008); iii) analysis of microchemical composition of otoliths and scales (Feyrer et al. 2007); iv) use of genetic data (Hauser and Seeb 2008); v) and a combination of these approaches (Baibai et al. 2012). The use of molecular tools is one of the most widely used methodologies. This approach is based on the detection of genetic differences to evaluate the degree of reproductive isolation between stocks (Begg and Waldman 1999) and the results obtained are generally quite reliable since the markers employed are fairly selectively neutral and can exclude environmental effects on phenotypic traits (Kapuscinski and Miller 2007). However, detection of genetic differentiation is often a challenge, especially in the marine environment. As a matter of fact, it was demonstrated that marine taxa are less structured than freshwater ones (De Woody and Avise 2000) and the marine environment is characterized by a seeming lack of physical barriers that could enhance the potential for dispersion and gene flow in marine fish populations (Palumbi 1994). Nevertheless, especially in near shore environments, the pathway of oceanographic boundaries coupled with complex shoreline topography could generate local areas of larval retention or create a barrier to migration and thus enhance reproductive isolation between nearby areas, leading to some degree of genetic structuring (Zardoya et al. 2004). In recent years, the vast majority of population genetics studies in marine fishes have employed microsatellites as the main molecular markers to resolve stock structure at a fine scale.
The identification of multiple stocks in small pelagic fish has been a longstanding challenge for fisheries science because these species show a wide distribution range and are characterized by good dispersal capabilities and pelagic spawning with spreading of eggs and larvae (Whitehead 1985). Nevertheless, microsatellite DNA allowed the detection of genetic structure in Clupeiformes, such as the Atlantic herring (Clupea harengus) (Andrè et al. 2011) and the European Sprat (Sprattus sprattus) (Limborg et al. 2012).
The European sardine (Sardina pilchardus Walbaum, 1792; hereafter referred to as sardine) is a small pelagic fish inhabiting the northeastern part of the Atlantic Ocean, from the Senegalese to the Icelandic coasts, as well as the Mediterranean and Black Seas (Whitehead 1985;Grant and Bowen 1998). This species is one of the most exploited fishery resources throughout its distributional range, especially along the Moroccan and Spanish Atlantic coasts (FIGIS 2004). In the Mediterranean Sea, sardine together with the European anchovy (Engraulis encrasicolus) account for approximately 50% of total catches (FAO 2005). In addition, this fish species is one of the most abundant and one of the commercially most important in the Adriatic Sea (Santojanni et al. 2005) and accounts annually for over US $32 million, with a peak of 90000 t landed in the early 1980s (Cingolani et al. 2004). All the studies on sardine population genetics have shown low levels of genetic differentiation (Spanakis et al. 1989, Laurent et al. 2007, Gonzales and Zardoya 2007a, Kasapidis et al. 2012, mainly suggesting the existence of differentiation between the Atlantic Ocean and the Mediterranean Sea (Atarhouch et al. 2006). However, the Adriatic Sea seems to show differences in specific hydrological and oceanographic features (i.e. depth, temperature, salinity), especially between its northern and its southern parts (Artegiani et al. 1997). In addition, this basin shows a semi-enclosed topographic conformation that could promote the genetic isolation of Adriatic sardine from the populations of the rest of the Mediterranean Sea. Early morphological and meristic surveys, coupled with reproductive data, suggested the presence of two distinct sardine stocks within the Adriatic basin (Alegría-Hernandez et al. 1986). These observations led these authors to also hypothesize the existence of a genetic differentiation between the northern and the southern stocks caused by the presence of the Jabuka Pit, which could have acted as a barrier to gene flow (Alegría-Hernandez et al. 1986). However, the more recent application of molecular methods, essentially based on allozymes and mitochondrial DNA variability, did not detect any significant genetic differences either within the Adriatic sardines or between the Adriatic samples and those from the adjacent Ionian Sea (Carvalho et al. 1994, Tinti et al. 2002. Since the Adriatic-Ionian area was not taken into consideration in previous studies that have explored genetic structure of sardine on the basis of microsatellite DNA, the aim of this study was to investigate whether this marker can help to i) detect the occurrence of differentiated genetic units within the Adriatic Sea and ii) distinguish between sardine individuals from the Adriatic and Ionian basins.

Sample collection, DNA extraction, PCR amplification and visualization on gel
Fin clips and muscle tissues from a total of 377 mature individuals of European sardine (Sardina pilchardus) were collected from several fishing grounds in the Adriatic and Ionian seas ( Fig. 1) between April 2009 and March 2010 and preserved in absolute ethanol until DNA extraction. The overall genetic analyses included individuals collected from five localities in the Adriatic Sea (CH, SB, VI, TR and MN) and from two localities in the Ionian basin (IO and CT) ( Table 1). The DNA was extracted using standard phenol-chloroform procedures described in Taggart et al. (1992). All samples were screened at nine microsatellite loci. Seven of these loci were described for European sardine: SAR9, SAR1.5, and SAR1.12 (dinucleotide loci) (Gonzalez and Zardoya 2007b); the primers for Sp10 (dinucleotide locus) were designed from a sequence retrieved from GenBank (accession number AY241279); SpI5, SpI7 and SpIII93 (tetranucleotide loci) primers were designed from sequences published in GenBank (accession numbers HM031962, HM031963 and HM031964, respectively) ( Table 2). Sar1-D06(B) and SarB-A07 were characterized for the Pacific sardine (Sardinops sagax sagax) by Pereyra et al. 2004 (Table 2).
Polymerase chain reaction (PCR) conditions for the above microsatellite loci were optimized and samples were run as described in Ruggeri et al. (2012).

Data analysis and statistical treatment of data
We used genotype and allele frequencies of the microsatellite loci analysed to obtain standard genetic diversity estimates. The presence of null alleles and other genotyping errors (allele dropout and stutter peaks) were assessed with the program MICROCHECKER 2.2.1 (Van Oosterhout et al. 2004). Null allele frequencies were estimated per locus and per locality using the algorithm of Dempster et al. (1977) available in FreeNa (Chapuis and Estoup 2007 Pereyra et al. 2004 †Underlined bases were added to 5′ end of the reverse primers to promote adenylation by Taq DNA polymerase (Brownstein et al. 1996). 1 The authors contributed both to the locus isolation and to the characterization of primer sequence reported here. 2 The authors contributed to the locus isolation only. 3 The primer sequences were described in the present study. ver, allelic richness (R S ) was estimated using the rarefaction index method (El Mousadik and Petit 1996), as implemented in FSTAT.
Hardy-Weinberg genotypic equilibrium within all sampling locations was assessed by estimating the inbreeding coefficient, F IS (Weir and Cockerham 1984), and tested with the exact test implemented in Genepop v.4.0.10 (Rousset 2008) using a Markov chain method of 1000 batches of 2000 iterations each, with the first 500 iterations discarded before sampling (Guo and Thompson 1992). Genepop was also used to test for linkage disequilibrium between loci. The tests were conducted by means of 1000 batches of 2000 iterations each of the Markov chain method. In order to obtain more reliable results, P-values from multiple comparisons of both tests (deviations from Hardy-Weinberg equilibrium and the presence of linkage disequilibrium) were corrected using a sequential Bonferroni correction (Rice 1989).
The presence of loci potentially affected by selection was investigated by the approach of Beaumont and Nichols (1996) implemented in the FDIST2 program (Beaumont 2002), in which coalescent simulations are used to get a null distribution and confidence intervals around the observed locus-specific F ST values.
Levels of genetic divergence between pairs of samples were estimated based on q ST (Weir and Cockerham 1984), an analogue of Wright's F ST , using FSTAT, whereas rR ST (Slatkin 1995) was calculated using R ST -CALC (Goodman 1997). In addition, in order to test the extent to which null alleles influence the pairwise genetic differentiation values, the software FreeNa (Chapuis and Estoup 2007) was used to calculate Weir (1996) F ST pairwise values with the exclusion of null allele (ENA) method.
A Bayesian approach, implemented in the program STRUCTURE v.2.3.2.1 (Pritchard et al. 2000;Falush et al. 2003) was used to detect the number of clusters (K) of sardine at the Hardy-Weinberg equilibrium. In order to estimate the most probable K, we conducted runs assuming K from one to six and selecting an admixture model with correlated allele frequencies.
Each K was performed of six independent runs with a Monte Carlo Markov Chain of 10 6 iterations after a 10 5 burn-in replications period. Prior information about the geographical sub-areas or the basin from which each sample came were provided and used in STRUCTURE simulations. Additionally, in order to improve the detection of a clustering within the dataset, a run was performed using only the locations at the extremes of the area studied herein.
The existence of a genetic structure in sardine from the Adriatic and Ionian basins was also assessed through a hierarchical analysis of molecular variance (AMOVA) using Arlequin 3.0 (Excoffier et al. 2005). We performed the AMOVA analysis by comparing i) the Adriatic samples (CH, SB, VI, TR and MN) and the Ionian samples (IO and CT); and ii) samples from the northern Adriatic Sea (CH and SB) with those from the southern part of this basin (VI, TR and MN). The molecular variance was partitioned among groups (F CT ), among populations within groups (F SC ) and within populations (F ST ), calculating F-statistics from genotypic frequencies. The significance for all AMOVA calculations was assessed by 20000 permutations and P-values were adjusted using a sequential Bonferroni correction.
The relationship between genetic differentiation and geographical distance (isolation by distance) was evaluated through a Mantel test (using ISOLDE in Genepop). The logarithm of the linear shortest sea-distance expressed in kilometres was regressed against the genetic differentiation estimates obtained with both rR ST and q ST estimators and the significance of correlation between geographical and genetic matrices was obtained with a permutation test of 10000 iterations.
A simulation method implemented in an extended version of POWSIM software (Ryman and Palm 2006) was used to assess the statistical power of the dataset, for detection of population differentiation. Tests were carried out between the seven localities and between the two basins using default parameters (1000 dememorizations, 100 batch and 1000 iterations per batch) and several combinations of population divergence time (t) and effective population size (N e ) (N e /t: 1000/10, 1000/20; 5000/5, 5000/10, 5000/20; 10000/2, 10000/5, 10000/10; a simulation scenario with no divergence among samples [1000/0] was also tested). Each N e /t tested were simulated with 1000 replicates.

PCR success and genetic variability within samples
The PCR success was high for eight out of nine microsatellite loci genotyped, since the missing data values represented 0.83% of the global dataset. However, the amplification of locus SpI5 proved to be difficult in several individuals within the MN sample. In addition, SpI5 seems to be systematically affected by the presence of null alleles, as was also detected at other localities. For these reasons, we decided to exclude this locus from the following analyses, which are therefore based on the remaining eight loci genotyped. MICRO-CHECKER test showed a lack of PCR artifacts in all the remaining eight microsatellite loci. In all these loci, with the exception of SAR1D06 and SpI7 loci, we detected the possible presence of null alleles affecting 11 out of 56 tests. Additionally, at least one locus for each sample (except in MN sample) showed the possible presence of null alleles and this genotyping error seems to affect mainly the VI (Vieste) sample showing null alleles at three out of eight loci analysed. We found that correcting the dataset for null allele frequencies with the Brookfield algorithm (Brookfield 1996) did not qualitatively affect the results, since 8 out of 56 tests were still significant. The microsatellite polymorphism was high, varying from 12 alleles at SpI7 locus to 52 alleles at SAR1.12 and SpIII93 loci ( Table 3). The percentage of private alleles amounted to 20.14% of the total number of alleles observed at all screened loci. The CH, IO and CT samples showed the highest percentages of private alleles (3.82%, 3.47% and 3.82% respectively, Table  S1). The mean number of alleles (N A ) per sample (over all loci) seems to be quite similar between all samples, varying from 18.13 alleles in MN to 24.13 alleles in IO (average of 22.43±1.99). The allelic richness (R S ) showed quite similar values between all samples, with an average of 18.51±0.36 alleles ( Table 3). The expected heterozygosity (H E ) showed average values of 0.851±0.007 (Table 3), while the observed heterozygosity (H O ) showed an average value of 0.794±0.020 (Table 3).
A significant departure from Hardy-Weinberg equilibrium was found in five samples (CH, VI, TR, MN and IO) that showed heterozygote deficiency (Table 3).
A total of 11 out of 196 tests were significant for linkage disequilibrium (LD) performed for each locuspair across each sample. Only two tests were still significant (P < 0.05) after a sequential Bonferroni correction involving the SAR1.5-SpIII93 locus-pair in the CH sample and the SAR9-SAR1.12 locus-pair in the VI sample. The results showed no qualitative changes at either intra-population or inter-population level when the statistical treatment was repeated excluding SAR9 or SAR1.12 from the dataset. In addition, the outlier analysis using FDIST2 showed no indication of selection influence in all the loci analysed.

Spatial genetic structure in the Adriatic and Ionian sardine samples
Pairwise values of q ST , rR ST and ENA F ST over all loci were low and ranged from -0.0072 to 0.0192 (q ST ), -0.0039 to 0.0020 (rR ST ) and -0.032 to 0.0024 (ENA F ST ; Table 4), respectively. Two out of 21 pairwise q ST were significant (P<0.05) and involved the CH vs. MN and TR vs. MN sample pairs. None of the pairwise q ST remained significant after a sequential Bonferroni correction. In the pairwise rR ST comparisons 2 out of 21 comparisons were significant (P < 0.05), involving the CH vs. MN and IO vs. MN sample pairs. As in the q ST pairwise comparisons, none of the rR ST values remained significant after a sequential Bonferroni correction (Table 4).
The analysis carried out with STRUCTURE showed a higher posterior probability (Pr[X/K*]) for K=1 (values obtained for mean logarithmic posterior probabilities for K=1, -18452,8; K=2, -19146,5; K=3, -19559,2; K=4, -19535,7; K=5, -20265,4; K=6, -20840,9). This result was confirmed both by treating each sample as a single entity and by pooling the samples into three main "populations" (central and northern Adriatic Sea [CH + SB], southern Adriatic Sea [VI + TR + MN] and Ionian basin [CT + IO]). In addition, when the test was carried out using only the samples from the geographic extremes of the area investigated (CH, CT and IO), no genetic structure was detected.
The AMOVA results revealed no genetic structures between Adriatic and Ionian samples (-0.04%) and the AMOVA analysis detected no genetic differentiation between the northern and the southern Adriatic samples (0.05%; data not shown). In addition, the Mantel test showed a lack of possible structuring determined by isolation by distance phenomenon (IBD). In fact, non-significant values (P>0.05) were obtained when both the pairwise q ST /(1-q ST ) and the rR ST /(1-rR ST ) were plotted against the log of geographical distances (data not shown).
The POWSIM analysis showed that the statistical power of the dataset allows true population differentiation (F ST ) values as large as 0.0010 to be detected  (Table 5).

Genetic diversity and deviation from Hardy-Weinberg equilibrium
The present study shows high levels of polymorphism at eight microsatellite loci in both Adriatic and Ionian sardine samples. The mean values of heterozygosities (H O and H E ), allelic richness (R S ) and the N A observed here are comparable with the high microsatellite variability levels reported for many marine species studies (O'Connell and Wright 1997;DeWoody and Avise 2000) and particularly with those described for S. pilchardus samples from the Atlantic Ocean and the Mediterranean Sea (Gonzalez and Zardoya 2007a).
The dataset here used showed deviations from Hardy-Weinberg genotypic proportions due to an excess of homozygote genotypes in several samples. This phenomenon has commonly been reported for marine species of both invertebrates (Zouros and Foltz 1984) and fish (Waldman andMcKinnon 1993, Maggio et al. 2009) and was often observed in pelagic fish (Laurent et al. 2007, Gonzalez and Zardoya 2007a, Zarraonaindia et al. 2009, André et al. 2011. A homozygote excess may be explained as a consequence of evolutionary or technical processes, such as i) inbreeding; ii) selection; iii) the effect of mixing between different sub-populations (Wahlund effect); and iv) genotyping errors (i.e. null alleles).
Though evolutionary processes have often been the main cause of deviations from Hardy-Weinberg equilibrium, in this case the role of null alleles seems to be the most realistic source of homozygote excess. In fact, null allele is a very common phenomenon in microsatellite DNA studies on many marine fish, mediated by mutations that can modify the microsatellite flanking region sequences and result with one allele un-amplifi-cation (O'Connell and Wright 1997). MICROCHECK-ER investigation showed the possible presence of null alleles in several loci for the samples analysed, even though no specific locus (with the exception of SpI5, that was removed from data set) systematically showed null allele signals. Null alleles could therefore affect our dataset, inducing a number of homozygotes higher than that expected at Hardy-Weinberg equilibrium.

Genetic differentiation within the Adriatic Sea and between the Adriatic and Ionian samples
The results of this study seem to indicate a lack of genetic differentiation both within the Adriatic samples and between Adriatic and Ionian samples. This finding is consistent with previous surveys based on other molecular markers, such as allozyme and mitochondrial DNA (Carvalho et al. 1994, Tinti et al. 2002, and adds further evidence contradicting the early identification of two morphologically differentiated sardine sub-populations separated by the Jabuka Pit (Alegría-Hernández et al. 1986) within the Adriatic Sea.
The presence of a single evolutionary unit of sardine in the Adriatic and Ionian seas is supported by the results of AMOVA and STRUCTURE analyses that detected no differentiations in the samples studied. Consistent with the presence of a genetically homogenous population, we found very low values with no statistically significant pairwise comparison between samples on the basis of both allelic frequency (q ST ) and allelic size (rR ST ). Very low values in q ST and in rR ST estimation are quite common in marine fish (Hoarau et al. 2002, Ward 2006) and they are mainly related to high gene flow mediated by migratory behaviour, the occurrence of technical issues that reduce the molecular marker resolution power, or issues associated with genetic phenomena, such as allele size homoplasy (i.e. convergent evolution of alleles of the same size). Sardine is a pelagic fish with high mobility that could enhance the gene flow on a medium spatial scale and could promote the genetic homogeneity observed in the present study. In fact, the exchange of relatively few individuals among geographically close areas may be sufficient to homogenize allele frequencies and therefore affect the possibility of detecting any genetic differentiation (Slatkin 1987). The existence of a high gene flow within and between our samples could even be in agreement with the previous detection of morphological differences in the Adriatic and Ionian sardine samples (Alegría-Hernández et al. 1986, Carvalho et al. 1994. In fact, the morphological variation is suspected to be determined at micro-geographical scale by environmental features and the existence of this kind of variation in a genetically homogenous population has previously been detected in the round sardine (Sardinella aurita) stock from the coastal waters of Florida (Kinsey et al. 1994). Therefore, like the above-mentioned case of the round sardine, the morphological differentia- tion within the Adriatic Sea for the European sardine could be mediated by the mere effect of phenotypic plasticity.
In species with high gene flow the most probable mechanism that drives genetic differentiation is isolation by distance (IBD). The increase in geographical distance could imply a gradual evolution of genetic divergence caused by some degree of equilibrium between migration and genetic drift (Slatkin 1993, Durand et al. 2005, especially in populations located on the borders of the species range. IBD was described as a possible mechanism of genetic differentiation even in some Atlantic sardine populations (Laurent et al. 2007, Gonzalez andZardoya 2007a). However, in the present study we were not able to detect IBD in the geographical range analysed, probably because of limited distance between the samples analysed.
Another possible explanation for the lack of genetic differentiation observed that should not be ignored can be found in the statistical power of our data. The number of loci and the number of samples analysed in comparison to the high N A detected in the present study, as well as the role of null alleles detected in our data set, may not be sufficient to bring out genetic differences. The power analysis showed true levels of differentiation larger than 0.0010, which is sometimes a threshold higher than the level of differentiation detected in the q ST and the rR ST pairwise estimates obtained here. However, results from other studies aimed at detecting genetic differentiation between sardine populations based on microsatellites Zardoya 2007, Kasapidis et al. 2011), even with comparable or higher number of loci and individuals examined, showed F ST levels as low as those observed in this study. Therefore, detecting low F ST values is common in this species with microsatellite markers, suggesting that low genetic differentiation has not been determined by a low statistical power. In addition, the contribution of null alleles to detect significant pairwise differentiation values between sardine samples seems to be relatively low. In fact, the comparisons between pairwise estimates obtained from q ST , rR ST and those from the F ST estimated with the ENA method did not show values with completely different order of magnitudes.
Another scenario can be related to the influence of allele size homoplasy in determining the failure in the identification of sardine stock structure. In fact, the high levels of polymorphism that characterize microsatellites may also limit the estimation of the degree of genetic differentiation, especially the q ST and rR ST estimates (Goudet et al. 1996, Hedrick 1999. It was established that homoplasy is prevalent in populations with large size, as is the case with marine species, in association with high mutation rate markers, such as microsatellite DNA (Estoup 2002, O'Reilly et al. 2004. Homoplasy could thus hinder the possibility of detecting a real but low degree of genetic differentiation in our dataset. A further support to this hypothesis might be the detection of private alleles mainly in the sam-ples located on the borders of the analysed area (CH and IO-CT). The genetic composition of these samples may suggest a certain degree of differentiation that is masked by homoplasy and is consistent with separate demes (under the "ecological paradigm", see Waples and Gaggiotti 2006) within the Adriatic basin, detected at temporal scale (Ruggeri et al. 2012).
In conclusion, consistently with previous studies in which different molecular tools were used (Carvalho et al. 1994, Tinti et al. 2002, the present study suggests a lack of genetic differentiation in sardine both within the Adriatic Sea and between the Adriatic and Ionian seas. However, the genetic homogeneity observed could be apparent and the identification of a real but subtle structuring in sardine population could be limited by technical difficulties and by the incomplete knowledge of molecular mechanisms that control the mutational processes in microsatellite DNA. In order to fill this gap, it could be useful to use other molecular markers, such as single nucleotide polymorphisms, which could improve the detection of separate stocks within this area by removing homoplasy effects from F ST estimates (Coates et al. 2009). The identification of different stocks within the Adriatic Sea could have important implications for the management of this fishing resource, which has been demonstrated to be affected by fluctuation in demography and a variety of fishing efforts (Ruggeri et al. 2012).