Sample sizes and related problems in multivariate archaeology

Westwood, S., 2003. Sample sizes and related problems in multivariate archaeology. MPhil, Nottingham Trent University.


The number of samples available for the statistical analysis of archaeological scientific data, specifically chemical compositional data, is typically small and determined by practical considerations. However, the number of variables measured is often high. We investigate a number of sample size problems that can arise in the multivariate analysis of compositional data with limited sample sizes.

Initially, current trends in published articles on the multivariate analysis of compositional data are reviewed. We report that analyses are typically undertaken on data with between 8 and 20 variables and sample sizes in the region of 30 to 100, and that the data are commonly analysed using principal components analysis or cluster analysis. We investigate the claim that projection pursuit is a 'sharper' tool than principal components analysis for the analysis of multivariate compositional data, but reject it on the grounds that a) limited sample size can frequently result in the detection of spurious structure, and b) the length of time required to fully examine results is excessive.

In trivariate lead-isotope ratio studies it is generally accepted that a minimum of 20 observations is adequate to define a lead-isotope field. Using simulated and actual data with a greater number of observations than have previously been available, we question this assumption and show that 40 or more observations are required to demonstrate non-normality in some cases. Our approach to determining sample sizes in lead-isotope data comprises direct testing of normality and assessment of modality. Our method of assessing modality is to generate kernel density estimates of the data and count modes; this also allows us to compare different methods of kernel density estimation. This work suggests that adaptive kernel density estimates are better able to estimate the density of an unknown population. We use this insight to extend a formal test of modality using kernel density estimation that provides more interpretable results.
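
The mode-counting idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes univariate data, a fixed-bandwidth Gaussian kernel (not the adaptive estimator the abstract favours), and a simple grid search for local maxima. The function names `gaussian_kde` and `count_modes`, the grid size, and the example values are hypothetical choices made for illustration only.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a fixed-bandwidth Gaussian kernel density estimate as a function.

    Assumption: one bandwidth is shared by all observations (non-adaptive).
    """
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


def count_modes(data, bandwidth, grid_size=512):
    """Count local maxima of the density estimate on an evenly spaced grid."""
    lo = min(data) - 3.0 * bandwidth
    hi = max(data) + 3.0 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    f = gaussian_kde(data, bandwidth)
    ys = [f(lo + i * step) for i in range(grid_size)]

    modes = 0
    for i in range(1, grid_size - 1):
        # A grid point is a mode if it rises from the left and does not
        # rise further to the right.
        if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]:
            modes += 1
    return modes


if __name__ == "__main__":
    # Illustrative values only: two tight groups of observations.
    sample = [-0.2, -0.1, 0.0, 0.1, 0.2, 9.8, 9.9, 10.0, 10.1, 10.2]
    for h in (0.5, 10.0):
        print(f"bandwidth={h}: {count_modes(sample, h)} mode(s)")
```

The number of modes found depends on the bandwidth: a small bandwidth separates the two groups in the illustrative sample, while a very large one merges them into a single mode. This sensitivity is one reason the choice of kernel density estimator matters when modality is assessed from a small sample.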

Item Type: Thesis
Creators: Westwood, S.
Date: 2003
ISBN: 9781369316964
Divisions: Schools > School of Science and Technology
Record created by: Linda Sullivan
Date Added: 30 Sep 2020 15:22
Last Modified: 20 Sep 2023 09:07
