SEXUAL DIMORPHISM OF LABRADOR RETRIEVER DOGS BY MORPHOMETRY

The domestic dog (Canis familiaris) is the species with the greatest morphological diversity among mammals. Seventy-four Labrador Retriever dogs (27 males and 47 females) were used in this experiment. Thirty quantitative biometric characteristics related to morphology were measured. The objective of this study was to evaluate the morphometric traits of the Labrador Retriever breed in order to establish descriptive biometric attributes that may show sexual dimorphism, using principal component analysis (PCA) and discriminant analysis (DA). The PCA was run first with all the variables and then with a pre-selection of the most correlated variables. The DA was performed with the 30 variables and also with the five variables most correlated with the first principal component (CP1), in order to classify new individuals. The PCA identified sexual dimorphism in size both with the 30 original variables and with the pre-selected variables; the latter optimized the reduction to two principal components. The DA discriminated the two populations both with the 30 variables and with the five variables most correlated with CP1. The functions with five variables can be used to classify other purebred dogs by sex, with an error of about 6.75%. KEYWORDS: Principal component analysis. Canis familiaris. Morphology.


INTRODUCTION
Sexual dimorphism can be considered a key evolutionary trait that can lead to important biological discoveries (AJAYI et al., 2012). Sexual dimorphism evolved in mammals to ensure greater reproductive success for individuals, especially males. Secondary sexual characteristics are attributes that facilitate mate choice, even when they bring little or no apparent benefit to the survival of the individual. These distinctions tend to be more pronounced in species that are polygamous, diurnal and live in open habitats (MCPHERSON; CHENOWETH, 2012). These features are physical or behavioral attributes that are genetically transmitted to offspring.
According to Polák and Frynta (2010), artificial breeding during domestication imposes selection pressures different from those that occur under natural conditions, and therefore has different consequences for body size. This is even more so when breeders select not for body size but for other morphological and behavioral characteristics. Therefore, in domestic breeds, the sexual selection that favors larger males in nature is relaxed as a result of the domestication process. Lark, Chase and Sutter (2006) reported that dogs have been managed by humans for longer than any other domesticated animal, which has provided enough time and opportunity to select for new phenotypic variations. Geographical isolation and selection for various tasks such as herding, guarding, hunting and companionship have created specialized subtypes within the species.
The domestic dog (Canis familiaris) is the species that exhibits the greatest morphological diversity among mammals, and it may exhibit sexual dimorphism in body size (OSTRANDER; LINBLAD-TOH, 2006). In this species, the variation in the skull and skeleton exceeds that of all other species of the family Canidae (WAYNE, 1986). Shultz et al. (2009) asserted that current studies of sexual dimorphism focus mainly on body size. Although some studies include other parts of the body (FAIRBAIRN, 2005), they use linear measurements, so body size is necessarily incorporated in these data. According to Dias and Barros (2009), morphometry deals only with measurements of morphological characters of living beings, such as heights, widths and lengths, among others. Silva et al. (2007) stated that the phenotypic characterization of a particular breed group may be aided by morphometric measurements. The Labrador Retriever is a British dog breed that is widespread in Brazil. It serves people in several roles, such as assistance dog, rescue dog, guide dog, police sniffer dog and, primarily, companion dog. According to Palika (2008), the breed has achieved great popularity in the United States, which led to an increase in breeding that contributed to a lack of breed standardization and hence to great variation within the breed. The objective of this study was to evaluate morphometric characteristics of the Labrador Retriever breed in order to establish descriptive biometric attributes that can differentiate the sexes by means of multivariate statistical techniques.

MATERIAL AND METHODS
Seventy-four Labrador Retriever dogs (27 males and 47 females), obtained from commercial kennels located in the states of Rio de Janeiro, São Paulo and Minas Gerais, were used in this experiment. The dogs met the following requirements: adult over two years old; used as a breeding male or breeding female; registered with the Brazilian Confederation of Cynophilia (CBKC) or the American Kennel Club (AKC); and no noticeable signs of pregnancy or lactation. No two dogs from the same kennel shared the same sire and dam, i.e., no full siblings were evaluated. The aim of this procedure was to maximize genetic variability and to obtain a more reliable morphometric characterization of the Labrador breed in Brazil.
Thirty continuous quantitative biometric characteristics relating to the morphology of the head, trunk, forelimbs and hindlimbs (Table 1) were measured. The measurements were taken with an anthropometric ruler (± 0.1 cm), a measuring tape (± 0.1 cm), a tape for measuring circumferences (± 0.1 cm) and a caliper (± 0.1 cm). Heights, limb circumferences and body lengths were measured with the animal standing, to minimize errors due to variations in posture. The animals were positioned with their body weight distributed evenly over the four limbs, whose axes (the forearm for the forelimbs and the shank for the hindlimbs) had to remain upright. All measurements were taken in centimeters, the unit of measure most used in Brazil. To compare body size between the sexes, the means and standard deviations of the 30 morphometric variables were estimated (Table 2). To demonstrate that quantitative measures of skeletal morphology can show sexual dimorphism in the Labrador Retriever, principal component analysis was performed in two ways: first with the 30 measured variables, and subsequently with 23 pre-selected variables.
Initially, the PCA was run using all 30 variables. Only the first three principal components were used to demonstrate sexual dimorphism, as they explained about 50% of the total variation (Table 3), summarizing much of the original information. Jackson (1993) reported that there is no optimum value for the percentage of variation explained by the first two or three axes. Generally, the larger the number of variables, the lower the percentage explained by these first components (MELO; HEPP, 2008).
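The way the cumulative percentage of explained variation determines how many components are retained can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the PCA is computed on the correlation matrix (the text does not say whether the correlation or the covariance matrix was used), the data are synthetic stand-ins for the 74 animals, and the function names are hypothetical.

```python
import numpy as np

def explained_variance(X):
    """Eigenvalues of the correlation matrix of X and the share of total
    variance explained by each principal component.
    X has one row per animal and one column per variable."""
    R = np.corrcoef(X, rowvar=False)            # p x p correlation matrix
    eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted largest first
    proportion = eigenvalues / eigenvalues.sum()
    return eigenvalues, proportion, np.cumsum(proportion)

def components_needed(cumulative, threshold=0.50):
    """Smallest number of leading components whose cumulative share of
    variance reaches the threshold."""
    return int(np.searchsorted(cumulative, threshold)) + 1

# Synthetic data (NOT the study's measurements): one shared "size" factor
# plus independent noise, for 74 animals and 6 variables.
rng = np.random.default_rng(0)
size = rng.normal(size=(74, 1))
X = size * np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1]) + rng.normal(size=(74, 6))

eigenvalues, proportion, cumulative = explained_variance(X)
k = components_needed(cumulative)
print(np.round(cumulative, 3), k)
```

With a correlation matrix the eigenvalues sum to the number of variables, so each proportion is simply an eigenvalue divided by that number.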
Each principal component is a linear combination of all the variables, some of which are more representative within the component than others (FERREIRA, 2008). According to Abreu et al. (1999), the variables strongly correlated with a particular principal component are the most important for that component. The variables and their correlations with the first three principal components are presented in Table 4.
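The relationship between a component and the original variables can be illustrated with a short sketch. It assumes, as a labeled assumption, a PCA on the correlation matrix, in which case the correlation of variable j with component k equals the eigenvector entry multiplied by the square root of the eigenvalue. The data are synthetic and the function name is hypothetical.

```python
import numpy as np

def scores_and_correlations(X, n_components=3):
    """Component scores and variable-component correlations for a PCA
    on the correlation matrix of X (rows = animals, columns = variables)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables
    R = np.corrcoef(X, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    scores = Z @ eigenvectors                           # linear combinations
    correlations = eigenvectors * np.sqrt(eigenvalues)  # variable x component
    return scores, correlations

# Synthetic data (NOT the study's measurements).
rng = np.random.default_rng(1)
size = rng.normal(size=(74, 1))
X = size * np.array([0.9, 0.8, 0.7, 0.4, 0.2]) + rng.normal(size=(74, 5))

scores, correlations = scores_and_correlations(X)
print(np.round(correlations, 3))
```

The sign of an eigenvector is arbitrary, so a column of correlations that is entirely negative carries the same information as one that is entirely positive; only the orientation of the score axis changes.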
CP1 is the most representative component, accounting for approximately 36% of the total variance. Its correlations with the original variables are all negative, which indicates an inverse relationship between the component scores and the measurements. CP1 is directly linked to the body size of the animals, because it is highly correlated with size-related variables that are also correlated with each other, such as neck perimeter, height to the middle of the torso, perimeter of the muzzle, height at withers and length of the trunk. These are the five variables most correlated with CP1, with correlations of approximately 0.75 or more in absolute value (Table 4).
The principal component analysis was able to characterize sexual dimorphism. CP1 combined with either of the other two components was effective in distinguishing male and female Labrador Retrievers (Figure 1). Three principal components are not required to represent the sexual dimorphism, since the distinction is already visible using only the first two components, which together explain approximately 43% of the total variation. This confirms the capacity of principal components to reduce the dimension of the space of the original 30 variables to only two components.
The separation between the positions of females and males along CP1 (y-axis) suggests a difference in size between males and females in the Labrador Retriever (Figure 1). All variables are negatively correlated with CP1, which gives females higher scores because of their lower values for the measured characteristics. When plotted on the Cartesian plane, the scores of females are positioned higher on the y-axis (CP1) than those of males, which have larger morphometric values and, consequently, lower scores on CP1. Sexual dimorphism in size may be related to male-male competition, which ensures the reproductive success of these animals and results in the selection of larger males (LANDE, 1980). According to Sutter et al. (2008), sexual dimorphism in size is present in most domestic dogs and is evidenced in some breeds by the height at the withers, with males generally larger than females.
The pre-selection of variables aimed at a better data reduction, and was performed by plotting the 30 variables against the first three principal components, which together represent 50% of the total variation. Wold et al. (1987) stated that principal component analysis estimates the correlation structure of the variables, in which the importance of a variable is determined by the size of its residual variance, and that this can be used for variable selection.
By analyzing the plots of the correlations of the variables with the components (Figure 2), we selected the variables lying at a distance of at least 0.6 from the origin of the axes in both plots, since the variables with the highest correlation with a particular component are those that lie furthest from the origin along the axis of that component.
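One possible reading of this selection rule is sketched below. The text does not state which pairs of components the two plots show, so the sketch assumes they are the CP1 x CP2 and CP1 x CP3 planes and keeps a variable only if its point lies at least 0.6 from the origin in both. The correlation values are invented for illustration and the function name is hypothetical.

```python
from math import hypot

def preselect(correlations, threshold=0.6):
    """Keep variables whose point lies at least `threshold` from the origin
    in both assumed planes (CP1 x CP2 and CP1 x CP3).
    `correlations` maps a variable name to its (r_CP1, r_CP2, r_CP3)."""
    selected = []
    for name, (r1, r2, r3) in correlations.items():
        if hypot(r1, r2) >= threshold and hypot(r1, r3) >= threshold:
            selected.append(name)
    return selected

# Hypothetical correlations, NOT the values of Table 4.
example = {
    "height at withers":         (-0.80,  0.10,  0.05),
    "width of the ear":          (-0.30,  0.40,  0.10),
    "length of the tail":        (-0.45,  0.50, -0.10),
    "circumference of the neck": (-0.78, -0.20,  0.15),
}
print(preselect(example))
```

Under this reading, a variable that is far from the origin in only one of the two planes, like the hypothetical tail length above, is not retained.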
The pre-selection of the most correlated variables was performed considering that PCA assumes that the variables are correlated (VICINI; SOUZA, 2005). Using the 23 variables pre-selected from the plots of the variables on the principal components, the PCA yielded the eigenvalues, the percentage of variance explained by each component and the cumulative proportions of variance (Table 5).
Pre-selection reduced the number of principal components needed to explain 50% of the total variation from three, as observed in the first procedure (30 variables), to only two. The correlations of the variables with the components were obtained for the first two components, which together represented approximately 50% of the total variation (Table 6).
In the second procedure (pre-selection of variables), CP1 was again the most representative component, accounting for approximately 42% of the total variance. Its correlations with the original variables were also all negative, indicating the same inverse relationship.
The principal component analysis with pre-selected data was also able to characterize sexual dimorphism. CP1 combined with CP2 explains just over 50% of the total variation and was efficient in distinguishing males and females in the Labrador Retriever (Figure 3). This confirms the capacity of principal components to reduce the dimension of the original space of 23 variables to only two components. The five variables most correlated with CP1 are the same in both procedures (30 or 23 variables), which supports the relevance of muzzle perimeter, neck perimeter, height to the middle of the torso, height at withers and trunk length to sexual dimorphism.
González et al. (2011) asserted that morphometric analysis can be used to evaluate the homogeneity or heterogeneity of the features shown by a particular group of animals and to establish a gradient of distances between them. Therefore, the 30 measured variables were used in a discriminant analysis based on the sexual dimorphism of Labrador Retriever dogs (two populations: males and females). Cruz and Regazzi (2008) pointed out that classification is done by substituting the measurements observed for each individual into the generated functions. The function that yields the highest value after substitution indicates the population to which the individual belongs. New individuals may thus be allocated or classified using the discriminant functions for each population (male and female) (Table 7). The Hotelling test (statistic = 24.69) was significant (p < 0.01), which indicates a difference between the mean vectors of the two sexes. According to Mingoti (2005), it is necessary to assess the quality of the discriminant function built, by assessing its significance and the probability of misclassification. The Hotelling test is a multivariate significance test that compares the mean vectors of independent multivariate normal populations. Ferreira (2008) stated that there are several methods for estimating the misclassification error rate, one of which is the resubstitution method. Mingoti (2005) reported that the resubstitution method, or consistency analysis, uses the information in a contingency table; accordingly, a contingency table was generated, in which only two individuals were misclassified (Table 8). The same prior probability (p = 0.50) was assigned to the two populations (males and females) in the discriminant analysis, since there is no difference in the importance of the populations.
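The allocation rule described above, one function per sex with the individual assigned to the function that gives the highest value, can be sketched as follows. This is a hedged illustration assuming linear classification functions built from a pooled covariance matrix with equal priors; the coefficients of Table 7 are not reproduced, the two-variable data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_classification_functions(groups, priors=None):
    """Linear classification functions, one per population.
    groups maps a label to an (n_k, p) array of measurements.
    Returns a dict: label -> (coefficient vector, constant)."""
    labels = list(groups)
    n_total = sum(len(g) for g in groups.values())
    p = next(iter(groups.values())).shape[1]
    pooled = np.zeros((p, p))
    for g in groups.values():
        pooled += (len(g) - 1) * np.atleast_2d(np.cov(g, rowvar=False))
    pooled /= n_total - len(labels)            # pooled covariance matrix
    if priors is None:                         # equal priors, as in the text
        priors = {k: 1.0 / len(labels) for k in labels}
    functions = {}
    for k in labels:
        mean = groups[k].mean(axis=0)
        coef = np.linalg.solve(pooled, mean)
        const = -0.5 * mean @ coef + np.log(priors[k])
        functions[k] = (coef, const)
    return functions

def classify(x, functions):
    """Allocate x to the population whose function gives the highest score."""
    return max(functions, key=lambda k: functions[k][0] @ x + functions[k][1])

# Synthetic two-variable data (NOT the study's measurements), in cm.
rng = np.random.default_rng(2)
groups = {
    "male": rng.normal([58.0, 45.0], 2.0, size=(27, 2)),
    "female": rng.normal([54.0, 41.0], 2.0, size=(47, 2)),
}
functions = fit_classification_functions(groups)
print(classify(np.array([57.5, 44.0]), functions))
```

With equal priors the logarithmic term is the same for both functions and does not affect the allocation; it is kept so that unequal priors could be supplied.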
The chi-square test (χ² = 0.016) was not significant (p > 0.05), indicating agreement between the observed frequencies (animals classified by the discriminant functions for each sex) and the expected frequencies (animals measured for each sex).
Consistency analysis was used to estimate the probability of misclassifying the sexes. The estimate is given by the apparent error rate, which was 2.7% for the discriminant analysis with 30 variables. The apparent error rate is an underestimate, because the same animals used to generate the classification functions were used to estimate the error rate (FERREIRA, 2008). Nevertheless, with a rate of 2.7%, the discriminant analysis was able to demonstrate sexual dimorphism in Labrador Retriever dogs using the 30 measured variables.
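The apparent error rate is the number of misclassified animals divided by the total number classified, read from the contingency table. The sketch below reproduces the reported 2.7% from two misclassified animals out of 74. How the two errors are split between the sexes is not given in the text, so the split used here (one male, one female) is hypothetical, as is the function name.

```python
def apparent_error_rate(table):
    """table[actual][predicted] = number of animals.
    Returns the share of animals placed in the wrong population."""
    total = sum(sum(row.values()) for row in table.values())
    wrong = sum(
        count
        for actual, row in table.items()
        for predicted, count in row.items()
        if predicted != actual
    )
    return wrong / total

# 27 males and 47 females, two animals misclassified in total.
# The one-and-one split is an assumption for illustration only.
table_30 = {
    "male":   {"male": 26, "female": 1},
    "female": {"male": 1,  "female": 46},
}
aer = apparent_error_rate(table_30)
print(f"{100 * aer:.1f}%")
```

Because the same animals are used to build and to test the functions, this figure is optimistic; a held-out sample or leave-one-out procedure would normally give a higher rate.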
The discriminant analysis was also run using only the five variables most correlated with CP1 in both previous procedures (PCA with 30 and 23 variables), since this component explains the largest share of the total variation. The aim was not only to obtain functions that discriminate males and females, but also to enable the classification of other animals of the breed by sex using a smaller number of characteristics (Table 9). After the measurements are substituted into the two discriminant functions (males and females), the individual is allocated to the population whose function gives the highest result (score) (PINTO et al., 2008).
The Hotelling test (statistic = 201.48) was significant (p < 0.01), indicating a difference between the mean vectors of the two sexes. A new contingency table was then generated (Table 10). The chi-square test (χ² = 0.022) was not significant (p > 0.05), again indicating agreement between the observed frequencies (animals classified by the discriminant functions for each sex) and the expected frequencies (animals measured for each sex).
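For reference, the usual two-sample form of the Hotelling test is given below. The text does not state which form of the statistic was computed or reported, so this is the standard textbook definition rather than a reconstruction of the authors' calculation. The symbols are assumptions introduced here: n_1 and n_2 are the numbers of males and females, p is the number of variables, the barred x vectors are the sample mean vectors, and S_1 and S_2 are the sample covariance matrices.

```latex
\[
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^{\top}
      \mathbf{S}_p^{-1}
      (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2),
\qquad
\mathbf{S}_p = \frac{(n_1 - 1)\,\mathbf{S}_1 + (n_2 - 1)\,\mathbf{S}_2}{n_1 + n_2 - 2}
\]
\[
F = \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p}\; T^2
\;\sim\; F_{p,\; n_1 + n_2 - p - 1}
\quad \text{under } H_0\colon \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2
\]
```

The conversion to an F statistic is what allows the significance level to be read from standard tables; it requires n_1 + n_2 - p - 1 > 0, which holds here for both p = 30 and p = 5.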
With a smaller number of variables, a larger number of misclassifications was observed. In the consistency analysis, an apparent error rate (AER) of 6.75% was obtained.
The pre-selection of variables by principal component analysis also allowed the animals to be discriminated by sex while reducing the number of variables used for classification, with an apparent error rate of approximately 6.75%. Purzyc, Kobrynczuk and Bojarski (2010) considered the correct classification of 70% of the animals by discriminant analysis to be satisfactory. The accuracy obtained in this study was approximately 93.25% and is therefore considered satisfactory.

CONCLUSIONS
When run with 30 variables, principal component analysis reduced the dimension to three components accounting for approximately 50% of the total variation and proved able to identify sexual dimorphism in Labrador Retriever dogs. When run with 23 pre-selected variables, only two principal components were needed to explain approximately 50% of the total variation, which corroborates its efficiency in indicating sexual dimorphism in these dogs. Therefore, the pre-selection of variables optimized the purpose of principal component analysis, given its superiority in reducing dimensionality.
In both analyses, the variables muzzle perimeter, neck perimeter, height to the middle of the torso, height at withers and trunk length were the most important for the morphometric differentiation of the sexes.
Anderson's discriminant analysis differentiated the two populations (males and females) both with the 30 original variables and with the five most correlated variables selected by principal component analysis. The pre-selection performed with PCA reduced the number of variables used in the discriminant analysis without a major impact on the proportion of misclassifications. However, using a greater number of variables made it possible to reduce the probability of misclassification.
Both the discriminant functions obtained with the 30 variables and those obtained with the five most correlated variables can be used in other studies to classify other dogs of the breed by sex. With the five most correlated variables, the error rate is approximately 6.75%.

Figure 2. Correlations of the 30 variables with the first three principal components. 01 = total length, 02 = length of the skull, 03 = width of the skull, 04 = length of the muzzle, 05 = circumference of the muzzle, 06 = width of the bridge of the nose at the base, 07 = width of the bridge of the nose at the tip, 08 = width of the ear, 09 = length of the ear, 10 = circumference of the neck, 11 = height at withers, 12 = height of substernal emptiness, 13 = height of the chest, 14 = height to the middle of the torso, 15 = width of the chest, 16 = circumference of the chest, 17 = height of the rump, 18 = width of the rump, 19 = length of the rump, 20 = height of the insertion of the tail, 21 = width of the base of the tail, 22 = width of the tip of the tail, 23 = circumference of the tail, 24 = length of the tail, 25 = height of the elbow, 26 = circumference of the forearm, 27 = circumference of the metacarpus, 28 = height of the knee, 29 = circumference of the knee, 30 = length of the trunk.

Table 1. Biometric characters measured.

Table 2. Means and standard deviations, in centimeters, of the 30 measured variables.

Table 4. Correlations between the original variables and the first three principal components.

Table 6. Correlations between the original variables and the first two principal components.

Table 7. Coefficients of the discriminant functions using 30 variables.

Table 9. Coefficients of the discriminant functions using the five most correlated variables.

Table 10. Contingency table with the five most correlated variables.