Comparison of several populations of Norway lobster, Nephrops norvegicus (L.), from the Mediterranean and adjacent Atlantic. A biometrics study*

Six populations of Nephrops norvegicus were compared using canonical variate analysis on morphometric characters. The samples were obtained from the south coast of Portugal (Algarve) in the Atlantic and from five areas in the Mediterranean: the Catalan Sea, the Ligurian Sea, the Tyrrhenian Sea, the Adriatic Sea and the Gulf of Euboikos. Each sample consisted of 50 males, chosen as far as possible from the carapace length range of 30-35 mm. Two systems of variables were used: body measurements (74 variables) and counts of spine rows on the carapace (23 variables). Several criteria were used to select the experimental units and variables to include in the statistical analysis: a) homogeneous groups for all areas (using as indicators the mean and the variance of the carapace length), b) normal distribution of the original variables within each group, c) homogeneity of variance between groups, d) variables with few missing values. The conjunction of all these criteria led to a reduction in the number of individuals and variables included in the analysis. In the end, 27 variables representing body measurements and groups of 22 to 43 individuals for each area were kept. The first two canonical correlation coefficients were highly significant (p<0.0001), with the corresponding variates explaining 81% of the variation. No pair of populations showed complete separation, although the degree of overlap differed between pairs. The three populations in the West and Central Mediterranean (Catalan Sea, Ligurian Sea and Tyrrhenian Sea) showed the highest levels of similarity. The population from the Atlantic showed the greatest distances overall, followed by the population from the Euboikos Gulf, representing the Eastern extreme of the geographical range.


INTRODUCTION
Biometric studies have been used to find differences among populations of the same species that may indicate genetic and/or environmental effects. Statistical techniques for detecting biometrical differences range from simple comparisons of regression lines between several groups to multivariate statistical analysis, dealing simultaneously with a large number of variables. The use of regression models for biometric studies in crustaceans is discussed in Lovett and Felder (1989) and Clayton (1990).
Techniques requiring complicated calculations, as is the case of multivariate techniques, have become more popular with the availability of microcomputers. These techniques allow the simultaneous analysis of many variables, for example by rearranging them in linear combinations to produce new variables (variates). They reduce the number of important variables to a much smaller number than those originally measured. Because several variables can be considered simultaneously, interpretations can be made that are not possible with univariate statistics (James and McCulloch, 1990). These techniques also allow the detection of non-obvious patterns in the variability, thereby indicating groups within the data. These properties of multivariate analysis have been used in morphometrics to verify dissimilarities between populations of the same species.
The most widely used multivariate techniques, in the context of data classification, are 'principal component analysis' (PCA) and 'canonical variate analysis' (CVA). (CVA is also referred to as 'linear discriminant function analysis' or 'linear discriminant analysis'; Braak, 1995.) A review of the use of multivariate analysis by James and McCulloch (1990) shows that over 40% of around 500 papers reviewed used one of these two techniques.
Both PCA and CVA are dimension-reduction techniques. The basic data used in morphometric studies consist of a matrix of individuals (rows) and morphometric measurements (columns) formed by several sub-matrices, each sub-matrix corresponding to one of the groups (populations). In CVA, normal standardisation of the data is done within each sub-matrix, using for each group the mean of the corresponding variable in that group. This stresses the differences among the groups, as opposed to PCA, where the standardisation of the data is done ignoring the classes and using a single population mean for all individuals (Digby and Kempton, 1987). CVA seeks linear combinations of the variables that have the greatest between-group variation relative to their within-group variability (Digby and Kempton, 1987). The first linear combination (canonical variate) that maximises this ratio is found. The next variate is the linear combination that meets the same criterion, subject to the condition of being uncorrelated with the previous one. The coefficients of the canonical variates may be used to indicate the importance of each of the original variables for the discrimination of the groups. For such an interpretation, the original variables should not be strongly correlated with one another (Braak, 1995).
The variables used here, body measurements, are highly correlated. For this reason CVA was used only to look for population differences, with no interpretation of the magnitude of the canonical coefficients. This technique, if useful for discriminating the groups, will find a small number of canonical variates explaining a large proportion of the variation in the system.
The approach used in this work was to deal with the largest possible number of variables, using multivariate analysis. This was done in the hope of finding differences of neutral adaptive value that may indicate either population segregation or adaptation to different environments. Multivariate techniques are particularly useful when no single character or constant combination of characters allows separation of groups (Newmann, 1996).
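The criterion described above can be written compactly. The following is a sketch in notation introduced here for illustration (the symbols B, W, a_k, λ_k, r_k, p and g do not appear in the original text): B and W denote the between-group and within-group sums-of-squares-and-products matrices of the p variables, g is the number of groups, and a_k is the coefficient vector of the k-th canonical variate.

```latex
\lambda(\mathbf{a}) = \frac{\mathbf{a}^{\top}\mathbf{B}\,\mathbf{a}}{\mathbf{a}^{\top}\mathbf{W}\,\mathbf{a}},
\qquad
\mathbf{W}^{-1}\mathbf{B}\,\mathbf{a}_k = \lambda_k\,\mathbf{a}_k,
\qquad
\mathbf{a}_j^{\top}\mathbf{W}\,\mathbf{a}_k = 0 \quad (j \neq k),
\qquad
k = 1, \dots, \min(p,\; g-1)

\text{proportion of variation for variate } k = \frac{\lambda_k}{\sum_j \lambda_j},
\qquad
r_k^{2} = \frac{\lambda_k}{1+\lambda_k}
```

Under this formulation the eigenvalues are ordered so that the first variate has the largest ratio, and r_k is the canonical correlation of the k-th variate. With six groups, g - 1 = 5, so at most five canonical variates exist.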
Since the objective of this work was the identification of population differences based on morphometric measurements, the multivariate technique used was CVA.

Sampling procedure
The areas sampled were: the south coast of Portugal (Algarve) off the port of Faro in the Atlantic (P), the Catalan Sea off Barcelona (C), the Ligurian Sea off Genoa (L), the Tyrrhenian Sea off P.S. Stefano (T), the Adriatic Sea off Ancona (A) and the Euboikos Gulf off Athens (G).
Only males were used in this study, to avoid noise introduced by the expected changes in body proportions at the onset of maturity in females. Individuals were selected by attempting to collect as many as possible with carapace length between 30 and 35 mm.
In each of the studied areas 50 male specimens were collected in the fall of 1993. Each specimen was placed in an individual bag so that broken appendages would not be lost. They were frozen and shipped to the same laboratory, where the measurements were taken on all individuals.
In each animal 96 measurements were made, comprising two groups or systems of variables: dimensions of body parts and counts of the numbers of spines in rows of spines. Measurements were made with digital callipers and rounded down to 0.01 mm. To avoid bias due to the measuring procedure, considerable care was taken during this phase of the work, including:
- all measurements and counts were done by the same person using the same measuring instrument.
- 10 individuals, not part of the study samples, were measured first for training.
- all measurements were taken within 1 week.
- the order of measurement of the individuals was established by alternating one individual from each area.
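The alternation of individuals from each area can be illustrated with a short Python sketch. The specimen labels and the function name `alternate_by_area` are hypothetical and serve only as an example; the text does not describe how the order was actually drawn up.

```python
from itertools import zip_longest

def alternate_by_area(specimens_by_area):
    """Interleave specimens so that consecutive measurements come from different areas."""
    order = []
    # Take one specimen from each area in turn; shorter samples simply run out.
    for batch in zip_longest(*specimens_by_area.values()):
        order.extend(s for s in batch if s is not None)
    return order

# Hypothetical labels: area code (P, C, L, T, A, G) plus a running number.
samples = {area: [f"{area}{i}" for i in range(1, 4)] for area in "PCLTAG"}
measuring_order = alternate_by_area(samples)
print(measuring_order[:7])
```

Ordering the work this way spreads any drift in the measuring procedure evenly over the six areas instead of confounding it with one of them.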

Statistical analysis
Table 2 includes the numbers of individuals within each size group for the different areas. The initial objective of obtaining a sample of individuals with carapace length between 30 and 35 mm was not met. The samples for the different areas had significantly different means and variances. In order to obtain comparable groups, an acceptable range of carapace length was defined. The criterion used consisted of finding the maximum range that showed no significant differences between the largest and smallest means and variances for the different areas. The means were compared using a t statistic for two independent groups of unknown variances. The variances were compared using the F max statistic. The first carapace length range considered was 25 to 40 mm. This range was rejected because both means and variances were significantly different at the 0.05 level. Next, the range was reduced to include individuals with carapace length within 25 to 35 mm. This range was accepted because neither means nor variances showed significant differences at the 0.05 level. The results of the statistical analysis of the different groups are presented in Table 3.
The next step consisted of verifying the assumptions for CVA: a) normal distribution within each group and b) homogeneity of variances among the groups. Only individuals belonging to the carapace range selected previously were included. The procedure PROC UNIVARIATE (SAS Inc., 1988) was used to test normality. This procedure computes the Shapiro-Wilk statistic for the null hypothesis that the input data are a random sample from a normal distribution (SAS Inc., 1988). The homogeneity of variances among groups was tested using an F max statistic. The procedure PROC SUMMARY (SAS Inc., 1988) was used to calculate the variances and simple SAS programming allowed the calculation of the test statistics. For these tests, α=0.01 was used. Significant deviations from normality and variables for which homogeneity of variance was rejected are identified in Table 4.
For many of the variables used in this work there were measurements missing in some individuals due to broken parts of the exoskeleton. This was most common for measurements on claws, pereiopods and pleopods. Since CVA requires no missing values for any of the variables included in the analysis, the use of all the variables selected would result in a considerable decrease in sample size. To avoid this, another selection of variables was made, eliminating from the analysis those with a large number of missing values. An arbitrary criterion was used, and only variables with measurements for at least 95% of the individuals were kept. The variables meeting this criterion are identified in Table 4.
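The two statistics used to select the carapace length range can be sketched in Python as follows. This is an illustration only: the text does not say which form of the t statistic was used, so Welch's unequal-variance form is assumed here, the function names are hypothetical, and the carapace lengths are invented. Critical values would still have to be read from tables of the t and F max distributions.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """t statistic and approximate degrees of freedom for two independent groups
    with unknown, possibly unequal variances (Welch's form, an assumption here)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def f_max(groups):
    """F max statistic: ratio of the largest to the smallest group variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

def screen_range(cl_by_area, low, high):
    """Restrict each area to low <= carapace length < high, then compare the
    areas with the largest and smallest means and compute F max over all areas."""
    kept = {area: [x for x in cl if low <= x < high] for area, cl in cl_by_area.items()}
    by_mean = sorted(kept.values(), key=mean)
    t, df = welch_t(by_mean[-1], by_mean[0])
    return t, df, f_max(kept.values())

# Invented carapace lengths (mm) for two areas, for illustration only.
carapace = {
    "P": [26.1, 31.4, 33.0, 38.2, 29.5],
    "C": [27.3, 30.8, 34.9, 36.0, 28.4],
}
t, df, fmax = screen_range(carapace, 25, 35)
print(round(t, 3), round(df, 1), round(fmax, 3))
```

The half-open interval mirrors the notation [25-35[ used for the ranges in Table 3.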
A total of 28 variables met all the criteria discussed, namely normal distribution within each group, homogeneity of variance among groups, and measurements available for at least 95% of the individuals within the carapace range chosen.
For spine counts, most variables were rejected due to deviations from normality. These variables tend to have a uniform distribution, with the same number of spines for most of the individuals and only a few with a different number. Of all the variables within this system, only one, QRSL2, the count for the second row of spines on the left chela, met all the criteria for variable admission. This variable is of little interest because of the large variability in size and shape of this appendage, which is frequently regenerated. For this reason, all variables on spine counts were ignored, leaving only 27 variables on body measurements for CVA (Table 4).
The canonical variate analysis was done using the routine PROC CANDISC, part of the statistical analysis package SAS (SAS Inc., 1985). Distance between populations was evaluated using the Mahalanobis distance (Manly, 1986).
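The completeness criterion is simple to express in code. The sketch below is illustrative only: the variable codes `CL`, `CHELA_L` and `PLEOPOD_L` and the function names are hypothetical (the codes actually used are listed in Table 1), and the records are invented. It keeps the variables measured in at least 95% of the individuals and then retains only the individuals with no missing value among them, as CVA requires.

```python
def complete_enough(records, variables, min_fraction=0.95):
    """Return the variables measured (not None) in at least min_fraction of individuals."""
    n = len(records)
    kept = []
    for var in variables:
        present = sum(1 for r in records if r.get(var) is not None)
        if present / n >= min_fraction:
            kept.append(var)
    return kept

def complete_cases(records, variables):
    """Keep only the individuals with a value for every selected variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]

# Invented records for 20 individuals; None marks a broken or missing part.
records = [{"CL": 30.0 + 0.2 * i, "CHELA_L": 40.0 + i, "PLEOPOD_L": 8.0} for i in range(20)]
records[3]["CHELA_L"] = None     # 19 of 20 measured: 95%, variable kept
records[5]["PLEOPOD_L"] = None
records[9]["PLEOPOD_L"] = None   # 18 of 20 measured: 90%, variable dropped

kept = complete_enough(records, ["CL", "CHELA_L", "PLEOPOD_L"])
usable = complete_cases(records, kept)
print(kept, len(usable))
```

Dropping a sparsely measured variable first, and only then removing incomplete individuals, is what limits the loss of sample size described in the text.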
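As an illustration of the distance measure, the squared Mahalanobis distance between two group means can be computed as in the following Python sketch. It assumes that the pooled within-group covariance matrix is used to scale the difference between means, which the text does not state explicitly; the function names and the data are invented for the example.

```python
from statistics import mean

def pooled_covariance(groups):
    """Pooled within-group covariance; each group is a list of rows of equal length."""
    p = len(groups[0][0])
    s = [[0.0] * p for _ in range(p)]
    dof = 0
    for g in groups:
        m = [mean(col) for col in zip(*g)]
        for row in g:
            d = [x - mj for x, mj in zip(row, m)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
        dof += len(g) - 1
    return [[v / dof for v in row] for row in s]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def mahalanobis_sq(g1, g2, cov):
    """Squared Mahalanobis distance between the mean vectors of two groups."""
    d = [mean(c1) - mean(c2) for c1, c2 in zip(zip(*g1), zip(*g2))]
    return sum(di * yi for di, yi in zip(d, solve(cov, d)))

# Invented two-variable data for two groups.
g1 = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
g2 = [[3.0, 1.0], [5.0, 1.0], [3.0, 3.0], [5.0, 3.0]]
cov = pooled_covariance([g1, g2])
print(mahalanobis_sq(g1, g2, cov))
```

Unlike a plain Euclidean distance between means, this measure accounts for the scale of each variable and for the correlations among them, which matters here because the body measurements are strongly correlated.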

RESULTS
Table 5 shows the results of the CVA for the first five variates. The canonical correlations for the first three variates are significant at the 0.05 level, with the first two showing highly significant values (Prob>F below 0.0001). Figure 1 was obtained from the individual scores based on standardised canonical coefficients for the first two variates. To allow the different populations to be distinguished, a separate graph was drawn for each one, using the same scale. No two populations are distributed over the same area, but there is some degree of overlap between any two groups considered.
Because the original variables are strongly correlated, the canonical structure of the variates, expressed by the standardised canonical coefficients, was not used to explain the importance of each one of the original variables (Braak, 1995).
To complement the information provided by the CVA, a quantification of multivariate differences between populations was done using Mahalanobis distances. The results are shown in Table 6. This global approach can be complemented by the analysis of Figure 2, showing the position of each population represented by the average value for the individual scores relative to the first two canonical variates.

DISCUSSION
The statistical technique used in this work aims at finding morphometric differences that could be interpreted in terms of the geographical distribution of the studied populations and the results of the genetic studies done on the same populations. If both studies define a similar association of groups, then a genetic basis could be proposed for the observed morphometric differences, and these techniques, which are much easier and cheaper to apply, could provide the basis for population identification; such information would then need to be considered for stock identification and fisheries management purposes. Morphometric differences found in the absence of genetic ones indicate environmentally induced differences, which can also be used to define groups and to interpret the influence of environmental variables on the morphology of this species. Such an approach has been used for other decapods in the Mediterranean and Atlantic, such as the identification of populations of Aristeus antennatus (Sardà et al., 1998) and Maja species (Newman, 1996).
Multivariate statistics was used as a tool to understand global population differences indicated by morphometric relationships. Canonical variate analysis has provided such information in other studies of marine species, where the parallel use of genetic and morphometric studies led to the conclusion that CVA on morphometric data was equivalent to genetic analysis for group identification (Corti et al., 1988). This was not always the case; in one study on bivalves, enzymatic differences between two populations did not correspond to the groups identified with CVA (Machado and Costa, 1994).
The canonical variates found in this study show significant differences among the populations studied. The first two canonical variates, of similar discriminating power, account for 81% of the variability and are associated with highly significant canonical correlation coefficients (p-value of the F test below 0.0001 for both variates). Despite this result, no pair of populations shows complete separation. All populations overlap to a lesser or greater degree with all the others. Still, the graphical representation of the variate scores shows a tendency for the points of each population to aggregate in different areas of the graph, indicating differences among the groups.
The three populations in the West and Central Mediterranean (Catalan Sea, Ligurian Sea and Tyrrhenian Sea) showed the highest levels of similarity. The population from the Atlantic showed the greatest distances overall, followed by the population from the Euboikos Gulf, representing the Eastern extreme of the geographical range. Despite these differences, no association can be made between distance indices and geographical distances. As an example, the population of the Atlantic (P) is closest to the one from the Euboikos Gulf (G) in terms of Mahalanobis distance (Table 6) and mean canonical values (Figure 2), but these two populations are at opposite extremes in terms of geographical position.
The lack of association between distance indices and geographical distance was also found when the genetic variability of these same populations was analysed (Maltagliati et al., 1998). As in this work, the differences are small and not in agreement with changes along a gradient. Despite this, it is interesting to note that the distribution in clusters found by Maltagliati et al. is similar to the separation due to the second canonical variable found in this work. Two distinct groups are found: one with the populations of the Adriatic, Euboikos and Catalan Sea (positive mean values for the scores of the second canonical variable) and another with the populations from the Tyrrhenian Sea, Ligurian Sea and Atlantic (negative mean values for the same score). In conclusion, the observed differences may have one of two causes. They may be the result of local isolation of these populations, expressing some small degree of genetic differentiation, or they may just be the result of environmentally induced differences. The existence of a pelagic larval phase, close to the surface, makes the case for genetic isolation difficult to accept without further proof. Environmental causes are the ones most likely to determine the observed morphometric differences.
Nephrops norvegicus is a species considered not to have significant migrations in juvenile and adult stages. Only during the pelagic larval phases could genes be transferred from one population to the others. In the absence of information on larval recruitment mechanisms for this species in this particular region, all hypotheses explaining population differences by geographical isolation must be regarded as speculative.
FIG. 2. - Mean values for the 1st and 2nd canonical variables, for each of the populations studied.
Table 1 lists variables and the corresponding variable codes that will be used during this work.
BIOMETRICS OF N. NORVEGICUS 73

TABLE 1. - List of codes and description of the variables measured in this study.

TABLE 2. - Number of individuals in each 5 mm carapace length class for each area. Mean and variance of the carapace length are also included. Std, standard deviation.

TABLE 3. - Data used to select the appropriate range of carapace length to use in the analysis. Calculations for ranges [25-40[ and [25-35[ are shown. The sample size, mean and variance of each group as well as the basic data used in the calculations of the test statistics are included. N, sample size. Var, variance.

TABLE 4. - Result of variable checking for different criteria of inclusion in CVA. The criteria used were normal distribution for populations of all the areas, homogeneity of variance among areas and missing values for no more than 5% of the individuals.

TABLE 5. - Abridged table with results of the canonical variate analysis, presenting the values for the first 5 canonical variates. Included in the table are the canonical correlation values, the statistics relative to the eigenvalues representing the ratio of the between-class to within-class variation and the probability values for the F test of H0: the canonical correlation in the current row and all that follow are zero (modified from the output of PROC CANDISC, SAS Institute Inc., 1985).

FIG. 1. - Plots of the individual scores for the first two canonical variables resulting from CVA. n, number of individuals.