
**Data**: SAAREMAA.XLS

### Instructions

In **Statsoft Statistica**, you can find the members of each cluster under *Advanced* → *Members of each cluster distances*.

## Exercise 3668. Points 4, theme: Cluster Analysis
Open exercise

- Which four forest stand types does cluster analysis (k-means clustering, k = 4) separate in the data in the attached file? The Latin names of the tree species are in the worksheet *Puud*.
- What proportion of the differences between stands is described by these clusters?
- How large is the probability that the same or a larger share of the differences would be described if the trees were distributed randomly among stands?
- Do the clusters depend on the distance function (Euclidean, Block, SQ Euclidean, etc.)? Which distance function yields different clusters?
- State which distance function you used to answer the first and second questions.
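To see why the choice of distance function can matter, the sketch below compares the three distances named above on one stand and two cluster centres. All numbers are hypothetical percentages invented for illustration, not values from SAAREMAA.XLS, and the helper names are not taken from any statistics package.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sq_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def block(a, b):
    # City-block (Manhattan) distance: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(point, centres, dist):
    """Index of the centre closest to point under the given distance."""
    return min(range(len(centres)), key=lambda i: dist(point, centres[i]))

# Hypothetical stand and cluster centres: percentages of three tree species.
stand = (40, 30, 30)
centres = [(60, 20, 20), (58, 12, 30)]

for name, dist in [("Euclidean", euclidean),
                   ("SQ Euclidean", sq_euclidean),
                   ("Block", block)]:
    print(name, "-> centre", nearest(stand, centres, dist))
```

Squaring is monotone, so Euclidean and squared Euclidean distance always rank the centres identically for a given point, while the block distance can rank them differently, as it does here. Whether a particular program's clusters actually change also depends on how it uses the distance elsewhere in the algorithm, so check this against the real output.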

- The number of clusters has to be given for *k-means* clustering. Here, k = 4.
- The objects to cluster are the observations in rows; the variables are the tree species proportions in the columns from *Kuusk* to the other-trees category *Muud_puud*. The answer should be a stand name, not a tree species name. For example, if the wood consists mainly of pines, it is a pine wood. The wood is called mixed if there is no clearly dominant tree species. A statistical cluster can also include forest stands dominated by different species.
- If using the **SDC** Cluster analysis, copy the columns with tree proportions to the input cell. If tree names are included, check *Variable names are in the first row*.
- Uncheck *Object names are in the first column*.
- The number of iterations can be zero, as significance is not asked.
- Press *Calculate*.
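For readers who want to see what such a tool does internally, here is a minimal sketch of k-means on rows of tree proportions, followed by naming each cluster after its dominant species. It assumes squared Euclidean distance, takes the first k rows as starting centres, and uses a hypothetical 0.5 dominance threshold; the species columns and data are invented, and the course software may initialise and iterate differently.

```python
def kmeans(rows, k, n_iter=100):
    """Plain Lloyd's algorithm with squared Euclidean distance.
    Starting centres are the first k rows (an arbitrary, deterministic choice)."""
    centres = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest centre.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centres[c])))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

def stand_type(centre, species, dominance=0.5):
    """Name a cluster after its dominant species, or call it mixed.
    The 0.5 threshold is an assumption, not a rule from the exercise."""
    best = max(range(len(species)), key=lambda i: centre[i])
    if centre[best] >= dominance:
        return f"{species[best]} stand"
    return "mixed stand"

# Hypothetical columns and proportions, not the contents of SAAREMAA.XLS.
species = ["spruce", "pine", "birch", "other"]
rows = [
    (0.8, 0.1, 0.1, 0.0), (0.1, 0.8, 0.1, 0.0),
    (0.1, 0.1, 0.8, 0.0), (0.3, 0.3, 0.3, 0.1),
    (0.9, 0.0, 0.1, 0.0), (0.0, 0.9, 0.0, 0.1),
    (0.2, 0.0, 0.7, 0.1), (0.3, 0.4, 0.2, 0.1),
]

labels, centres = kmeans(rows, k=4)
names = [stand_type(c, species) for c in centres]
print(labels)
print(names)
```

Naming from the cluster centre is only a first pass: a cluster may still contain stands dominated by different species, so inspect the members before settling on a name.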

The members of each cluster are in the results panel. The proportion of explained variance is in the header of the results.
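The proportion of explained variance can be read as one minus the within-cluster sum of squares divided by the total sum of squares. The sketch below computes that share and estimates a significance level by permutation, shuffling each species column independently among stands. Treat it as an illustration under stated assumptions: the permutation scheme is one common choice and may not match what the course software does, shuffling columns independently ignores that proportions sum to one, the `dominant` grouping is a stand-in for a real k-means call, and the data are hypothetical.

```python
import random

def col_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sum_sq(rows, centre):
    """Sum of squared deviations of all rows from a centre."""
    return sum((x - m) ** 2 for r in rows for x, m in zip(r, centre))

def explained_share(rows, labels):
    """1 - within-cluster SS / total SS."""
    total = sum_sq(rows, col_means(rows))
    within = 0.0
    for lab in set(labels):
        members = [r for r, l in zip(rows, labels) if l == lab]
        within += sum_sq(members, col_means(members))
    return 1.0 - within / total

def permutation_p(rows, cluster, n_perm=199, seed=0):
    """Share explained on the real data, and the fraction of shuffled
    data sets whose share is at least as large (the observed data set
    itself is counted once)."""
    rng = random.Random(seed)
    observed = explained_share(rows, cluster(rows))
    hits = 0
    for _ in range(n_perm):
        cols = [list(c) for c in zip(*rows)]
        for c in cols:
            rng.shuffle(c)  # redistribute each species among stands
        shuffled = list(zip(*cols))
        if explained_share(shuffled, cluster(shuffled)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

def dominant(rows):
    # Stand-in grouping rule: label each stand by its largest column.
    return [max(range(len(r)), key=lambda i: r[i]) for r in rows]

# Hypothetical two-species data, not taken from SAAREMAA.XLS.
rows = [(1.0, 0.0), (0.8, 0.2), (0.0, 1.0), (0.2, 0.8)]
share, p = permutation_p(rows, dominant)
print(round(share, 3), p)
```

With zero permutations no probability can be estimated, which matches the instruction that the number of iterations may be zero when significance is not needed.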
