
Cluster Analysis

The Non-spatial clusters tool in the Spatial Data Calculator (SDC) provides the functions needed for these exercises. Alternatively, StatSoft Statistica, installed on the computers in room 202, can be used. A cluster tree can be pruned at different levels; both the number of clusters and their membership depend on the pruning level.
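The effect of the pruning level can be illustrated with a minimal sketch in plain Python. It is not the SDC or Statistica procedure; the function name `single_linkage_cut` and the toy coordinates are assumptions made only for illustration. Clusters are merged by single linkage until the next merge would lie above the chosen level.

```python
from itertools import combinations
from math import dist


def single_linkage_cut(points, height):
    """Merge clusters by single linkage and stop at the pruning level `height`.

    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the two clusters that are currently closest to each other.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[a], clusters[b]) > height:
            break  # the next merge lies above the pruning level
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical toy data: two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (20.0, 0.0)]

for height in (0.5, 2.0, 10.0, 20.0):
    print(height, single_linkage_cut(points, height))
```

With these toy points, raising the level from 0.5 to 20 reduces the result from five singleton clusters to a single cluster, so the same tree gives different answers depending on where it is cut.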
ID P Question or exercise
3668 4
  1. What are the four forest stand types separated by cluster analysis (k-means clustering, k = 4) from data in the attached file? Tree species names in Latin are in worksheet Puud.
  2. What proportion of the differences between stands is explained by these clusters?
  3. What is the probability that the same or a larger share of the differences would be explained if the trees were distributed randomly among the stands?
  4. Do the clusters depend on the distance function (Euclidean, Block, SQ Euclidean, etc.)? Which distance function yields different clusters?
  5. Mention the distance function you used for answering the first and second questions.
3665 1 Which two tree species grow most separately (from other species and from each other) according to the data in the attached file, using cluster analysis (Euclidean distance, single linkage)?
3666 5 The attached file contains some (somewhat outdated) data about EU member states. Standardize the variables Elanike arv (population) and GDP per Capita in the worksheet Majandus by subtracting the mean and dividing by the standard deviation (SD).
  1. Which groups of member states are formed by cluster analysis? NB! Every country must be assigned to a cluster.
  2. Give a name to each group.
  3. Which method and distance function did you use to obtain these clusters? Why did you prefer these options?
  4. Why were you asked to standardize the variables?
  5. What proportion of the variability is explained by these clusters?
  6. Add the cluster tree if you used tree clustering.
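
The standardization step and the "proportion of variability" questions above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the numbers are made up and do not come from the attached file, the cluster labels are chosen by hand rather than produced by a clustering run, the sample SD (`statistics.stdev`) is assumed, and the share is taken as the between-cluster sum of squares divided by the total sum of squares. SDC or Statistica may define or report these quantities differently.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-scores: subtract the mean and divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def explained_share(rows, labels):
    """Between-cluster sum of squares divided by the total sum of squares."""
    n_vars = len(rows[0])
    grand = [mean(r[j] for r in rows) for j in range(n_vars)]
    total = sum((r[j] - grand[j]) ** 2 for r in rows for j in range(n_vars))
    within = 0.0
    for label in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == label]
        centre = [mean(r[j] for r in members) for j in range(n_vars)]
        within += sum(
            (r[j] - centre[j]) ** 2 for r in members for j in range(n_vars)
        )
    return 1.0 - within / total


# Hypothetical values, NOT taken from the attached file.
population = [82.0, 60.0, 5.4, 1.3, 10.5, 0.4]     # millions
gdp_per_capita = [30.0, 28.0, 35.0, 12.0, 15.0, 60.0]  # thousand EUR

# Without standardization the variable with the larger numeric spread would
# dominate the Euclidean distances; z-scores put both on a common scale.
rows = list(zip(standardize(population), standardize(gdp_per_capita)))

# Hand-picked example labels; a real run would take them from the clustering.
labels = [0, 0, 1, 2, 2, 1]

print(round(explained_share(rows, labels), 3))
```

The share is 0 when all objects fall into one cluster and 1 when every cluster contains only identical objects, so it can be compared across different numbers of clusters or distance functions.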