MATERIALS AND METHODS

This study is organized around two kinds of groups: the species groups of Wessels Boer (1968) and what I call character groups. Species groups are the 17 groups of species that Wessels Boer used to organize his monograph. I follow these here, with some modifications. I do not imply that the species in these groups are all closely related, or that the groups have any taxonomic basis; I use them only as a convenience in studying a large and complex genus. Details of each species group are found on the relevant pages.

Character groups, on the other hand, are groups of specimens discovered by the methods of analysis explained here. Character groups have unique combinations of character states and are the entities to which species concepts are applied. Thus character groups are equivalent to species. Within character groups, subgroups may occur, and these may be recognized as subspecies, as explained below.

I use the term variable for any qualitative (binary or multistate) or quantitative (continuous, meristic) attribute. I use the terms character and trait in the sense of the Phylogenetic Species Concept (PSC) of Nixon and Wheeler (1990). Characters are qualitative variables that are found in all comparable individuals within a terminal lineage (i.e., species); traits are variables that are not universally distributed among comparable individuals within such a lineage.

Some authors have pointed out the tenuous distinction between qualitative and quantitative characters (e.g., Stevens, 1991; Thiele, 1993). In this study characters are termed qualitative if they have non-overlapping state distributions. In this sense, use of the term qualitative is a shorthand expression for such characters (Thiele, 1993).

Choice of species concepts. In this study the Phylogenetic Species Concept (PSC) is used. This was defined by Nixon and Wheeler (1990) as: "the smallest aggregation of populations....diagnosable by a unique combination of character states in comparable individuals". Individual specimens are considered comparable because all are fertile. Reasons for the choice of this concept are discussed in more detail in Henderson (2005).

Two operational modifications are necessary in order to apply the PSC. According to Davis and Nixon (1992), phylogenetic species are delimited by successive rounds of aggregation of local populations, based on analysis of traits and characters. Because there is no a priori method of placing specimens in populations, and consequently of distinguishing a priori between traits and characters, all specimens (i.e., treating specimens as populations) and all variables (i.e., traits and characters) were used in the analysis (see below). A second modification of the PSC involves variation in quantitative variables. Some groups of specimens with unique combinations of qualitative character states nevertheless vary greatly in quantitative variables and may occur in disjunct geographic areas. Such subgroups may differ significantly from one another in one or more variables. Luckow (1995), in her discussion of the PSC, stated that "groups of populations that differ not by fixed characters, but by differences in mean values would be recognized as subspecies or varieties [under the PSC]." I follow a slightly modified version of this here. When subgroups can be delimited geographically and can be distinguished by one or more significantly different mean values of a variable, I apply a phylogenetic subspecies concept.

In summary, the PSC is applied to groups of specimens with unique combinations of qualitative character states (i.e., character groups), and a PSC subspecies concept is applied to subgroups that can be delimited geographically and by one or more significantly different mean values of a quantitative variable (i.e., geographic subgroups).

Data matrix construction. Specimens from the following herbaria were examined and scored: AAU, BH, BM, C, F, G, LE, MEXU, MO, NY, PMA, and US (herbarium abbreviations from Holmgren et al., 1990). Type specimens from other herbaria (e.g., C, K, M) were examined but not necessarily scored. Rarely, more than one duplicate of a collection was scored. A search was made for qualitative variables in which two or more states of the variable were present among the specimens and could be scored unequivocally. This search was based on those variables used in previous monographs (e.g., Wessels Boer, 1968) and on a survey of specimens. A dissecting microscope was used to survey floral variables. A search was also made for quantitative variables that could be measured from specimens or taken from specimen labels (in case of ranges, median values were used). Variables were counted or measured with a ruler, digital calipers, or protractor.

Data matrices were constructed with specimens as rows and variables as columns. Additional columns recorded a specimen identification number, collector, collector's number, herbarium, country, latitude, longitude, and elevation. Latitude and longitude were taken from the specimen label or, when the label gave none, estimated from the collection locality using maps or electronic gazetteers.

Data analysis. Four multivariate methods, cluster analysis (CA), principal component analysis (PCA), principal coordinate analysis (PCoA), and discriminant analysis (DA), were carried out using the programs NTSYS (Rohlf, 2000) and Systat (Wilkinson, 1997). The methods described below apply to the study as a whole, although not every method was used in every part of it.

Some inferential statistics are used in this study. Although random samples are required for statistical inference, the samples of herbarium specimens are not random. However, there is no reason to believe that collectors favor any particular kind of specimen over others. Therefore I proceed with inference, and the results should be interpreted with this caveat in mind.

Specimens with missing values were excluded. Analyses are thus based on subsets of the data, as noted. Data were transformed before analysis, either by standardizing or by log transforming. Because some quantitative variables were not normally distributed, they were log10-transformed. Discrete variables were square-root transformed. For PCA and DA, variables were assumed to be multivariate normal, although this assumption is not readily tested (Tabachnick & Fidell, 2001). Although some ordination procedures are considered relatively insensitive to violations of normality assumptions (Tabachnick & Fidell, 2001), results should be viewed as approximate.
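
The transformations described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the NTSYS or Systat routines used in the study. The function names are hypothetical, missing values are assumed to be coded as None, and standardizing is assumed to mean z-scores based on the sample standard deviation.

```python
import math
import statistics


def complete_cases(rows):
    """Keep only specimens (rows) with no missing value (None assumed as the code)."""
    return [row for row in rows if all(value is not None for value in row)]


def log10_transform(values):
    """Log10-transform strictly positive continuous measurements."""
    return [math.log10(value) for value in values]


def sqrt_transform(counts):
    """Square-root transform discrete (count) variables."""
    return [math.sqrt(count) for count in counts]


def standardize(values):
    """Convert values to z-scores (assumption: sample standard deviation)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(value - mean) / sd for value in values]


# Hypothetical specimens: (leaf length in cm, number of pinnae)
rows = [(10.0, 4), (100.0, 9), (None, 16)]
kept = complete_cases(rows)
log_lengths = log10_transform([row[0] for row in kept])
root_counts = sqrt_transform([row[1] for row in kept])
```

Excluding incomplete specimens first, as in the sketch, means that each analysis runs on the subset of specimens scored for every variable it uses.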

Character group (species) and subgroup (subspecies) delimitation. CA was used to divide qualitative variables into traits or characters. The SIMQUAL module of NTSYS with the simple matching coefficient (for binary and multistate variables) was used to produce a similarity matrix. The SAHN module of NTSYS was used to subject the similarity matrix to the unweighted pair group method, arithmetic average (UPGMA) clustering algorithm. Successive analyses were used, with all variables used in the first analysis. Suspected traits (i.e., those variables both states of which occur in adjacent and otherwise homogeneous groupings) were removed, and the analysis run again until groups were found with unique combinations of states. These groups were recognized as character groups. Specimens that had not been included in the analysis because of missing data were then assigned to their respective character group based on their morphology and geography.
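
To make the clustering step concrete, here is a minimal sketch of the simple matching coefficient followed by UPGMA, written in plain Python rather than with the SIMQUAL and SAHN modules. The specimen labels and state codes are invented for illustration, and the function names are hypothetical. The removal of suspected traits between successive runs is a judgment made by inspecting the resulting groupings, so it is not automated here.

```python
from itertools import combinations


def simple_matching(a, b):
    """Proportion of variables in which two specimens share the same state."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def upgma(specimens):
    """Cluster specimens by UPGMA on simple matching similarity.

    specimens maps a label to a tuple of qualitative states.
    Returns the merges in order as (cluster_a, cluster_b, similarity).
    """
    labels = list(specimens)
    sim = {
        frozenset(pair): simple_matching(specimens[pair[0]], specimens[pair[1]])
        for pair in combinations(labels, 2)
    }

    def average_similarity(c1, c2):
        # Unweighted average over all between-cluster specimen pairs
        total = sum(sim[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(*pair))
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical specimens scored for three qualitative variables
specimens = {
    "s1": (0, 1, 0),
    "s2": (0, 1, 0),
    "s3": (1, 0, 1),
    "s4": (1, 0, 0),
}
merges = upgma(specimens)
```

In this toy data set s1 and s2 share all states and join first. The third variable differs between s3 and s4, which otherwise agree, so it behaves like a suspected trait that would be removed before the next run.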

Variation within each character group was then examined, based on analysis of geography, quantitative variables, and traits. The purpose of these analyses was to look for evidence of clinal variation, of geographic subgroups (i.e., subspecies), or of both.

Geographic distribution of character groups was analyzed. Distributions were mapped with Arcview GIS 3.2 (Environmental Systems Research Institute, Inc.) using latitude and longitude data for each specimen. Each dot on the maps represents at least one specimen. Distributions were examined for evidence of disjunctions.

Variation within character groups was analyzed with either PCA of a correlation matrix of standardized quantitative variables, or PCoA of a distance matrix (average taxonomic distance) of standardized quantitative variables and traits.
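
As an illustration of the distance matrix used for PCoA, the sketch below computes average taxonomic distance between specimens, taken here to be the square root of the mean squared difference across standardized variables. It then applies the double-centering step that precedes eigenanalysis in PCoA. The eigenanalysis itself is omitted, the data are invented, and the function names are hypothetical.

```python
import math


def average_taxonomic_distance(x, y):
    """Square root of the mean squared difference across variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def distance_matrix(rows):
    """Symmetric matrix of distances between all pairs of specimens."""
    n = len(rows)
    return [[average_taxonomic_distance(rows[i], rows[j]) for j in range(n)]
            for i in range(n)]


def double_center(dist):
    """Double-center the squared distances, as done before PCoA eigenanalysis."""
    n = len(dist)
    squared = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means
    return [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)]
            for i in range(n)]


# Hypothetical specimens with two standardized variables each
rows = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
dist = distance_matrix(rows)
centered = double_center(dist)
```

The eigenvectors of the centered matrix, scaled by the square roots of their eigenvalues, would give the principal coordinates. A numerical library would normally be used for that step.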

If evidence of subgroups, based on either geography or morphology, was found, then individual variables were examined. A t-test (two-sample, separate variance test on log10-transformed variables) or one-way ANOVA (on log10-transformed variables) was used to test for subgroup differences for each quantitative variable. The Bonferroni pairwise procedure was used to determine which pairs of means differed significantly. If every possible pair of subgroups differed significantly (P < 0.01) in at least one variable, then the subgroups were recognized as geographic subgroups (i.e., subspecies).
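
The sketch below illustrates two parts of this step under stated assumptions. The first function computes the separate-variance (Welch) t statistic and its approximate degrees of freedom for two samples of already log10-transformed values. The P value would then be obtained from the t distribution in a statistics package. The second function encodes the recognition rule, taking as input a hypothetical table of pairwise P values (already Bonferroni-adjusted where relevant). All names and numbers are invented for illustration.

```python
import math
import statistics
from itertools import combinations


def welch_t(sample_a, sample_b):
    """Separate-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    term_a = statistics.variance(sample_a) / n_a
    term_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(term_a + term_b)
    df = (term_a + term_b) ** 2 / (term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1))
    return t, df


def are_geographic_subgroups(subgroups, p_values, alpha=0.01):
    """True if every pair of subgroups differs (P < alpha) in at least one variable.

    p_values maps frozenset({group_1, group_2}) to {variable: P}.
    """
    for pair in combinations(subgroups, 2):
        tests = p_values.get(frozenset(pair), {})
        if not any(p < alpha for p in tests.values()):
            return False
    return True


# Hypothetical pairwise P values for three candidate subgroups
p_values = {
    frozenset(("north", "central")): {"rachis_length": 0.001, "pinnae_number": 0.30},
    frozenset(("north", "south")): {"rachis_length": 0.20, "pinnae_number": 0.005},
    frozenset(("central", "south")): {"rachis_length": 0.20, "pinnae_number": 0.04},
}
recognized = are_geographic_subgroups(["north", "central", "south"], p_values)
```

Here the central and southern subgroups do not differ at P < 0.01 in any variable, so the three candidates would not all be recognized under the rule.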

DA of these pre-classified subspecies was carried out with the same data as for PCA. Wilks' lambda was used to test the hypothesis that group centroids are equal. The most discriminatory variables and percentage classification success are reported.

Linear regression was used to analyze relationships within subspecies between log10-transformed quantitative variables and latitude, longitude, and elevation. If there was a significant correlation between variables, squared multiple R is reported. This shows the proportion of variance in the dependent variable explained by the independent variable.
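
For a single independent variable, squared multiple R equals the squared correlation coefficient. The following sketch shows the calculation with ordinary least squares in plain Python. The function name and the data are hypothetical, and the significance test is left to a statistics package.

```python
def simple_linear_regression(x, y):
    """Least-squares slope, intercept, and R squared for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # proportion of variance in y explained by x
    return slope, intercept, r_squared


# Hypothetical data: elevation (km) and a log10-transformed measurement
elevation = [1.0, 2.0, 3.0]
log_measure = [1.0, 3.0, 2.0]
slope, intercept, r_squared = simple_linear_regression(elevation, log_measure)
```

In this invented example R squared is 0.25, so elevation would account for a quarter of the variance in the transformed variable.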

Taxonomic treatment. A detailed genus description of Geonoma is given, based on data from this study and from Uhl and Dransfield (1987).

Most types (or images of types) of names of Geonoma have been examined for this study. These are indicated by an exclamation mark (!) following the host herbarium. Several names, the types of which have been destroyed, are listed as excluded names. Images of types of new taxa deposited at NY will be available at the website http://www.nybg.org/bsci/herbarium_imaging/.

The dichotomous key for species groups is in an interactive, web page format using Lucid Phoenix (Centre for Biological Information Technology, University of Queensland). Keys to species within species groups, and to subspecies, are in a traditional format.

In the descriptions, only those qualitative (characters and traits) and quantitative variables scored, measured, or counted in this study are reported. For each quantitative variable, one measure of central tendency, the mean, and two measures of variability, the range and coefficient of variation, are given, as well as sample size (e.g., stem length: 2.0(1.4–2.5) m, CV 0.2, N = 7). The coefficient of variation is given instead of the standard deviation so that the relative amount of variation in each taxon and each variable can be compared.
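
The reporting format can be reproduced with a short function. This is a sketch only: the function name and the measurements are hypothetical, and the coefficient of variation is assumed to be the sample standard deviation divided by the mean.

```python
import statistics


def summarize(values, unit="", digits=1):
    """Format mean(min–max), coefficient of variation, and sample size."""
    mean = statistics.mean(values)
    cv = statistics.stdev(values) / mean  # assumption: sample standard deviation
    text = f"{mean:.{digits}f}({min(values):.{digits}f}–{max(values):.{digits}f})"
    if unit:
        text += f" {unit}"
    return f"{text}, CV {cv:.1f}, N = {len(values)}"


# Hypothetical stem lengths in meters
stem_lengths = [1.4, 1.8, 2.0, 2.0, 2.1, 2.2, 2.5]
summary = summarize(stem_lengths, unit="m")
```

Because the coefficient of variation is dimensionless, values for variables measured in different units, or for taxa of very different size, can be compared directly.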

For descriptions of subspecies, only those quantitative characters that are significantly different between two or more subspecies are given. In keys to subspecies, the most discriminatory quantitative characters from DA are used.