Constud is designed for

  • Calculating indices of spatial pattern and writing them to a database or binary raster file;
  • Weighting features and exemplars needed for the most reliable similarity-based predictions;
  • Predicting nominal and numerical variables;
  • Calculating similarity between an observation and exemplars of a given class;
  • Storing the predictions in a database or as binary raster maps.
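
The prediction tasks listed above can be illustrated with a minimal sketch. It assumes features scaled to the range 0..1 and a similarity defined as one minus the weighted mean absolute difference; this text does not specify Constud's similarity measure, so the formula and the function names `similarity`, `predict_nominal` and `predict_numeric` are hypothetical and serve only to show how feature weights and exemplar weights might enter a similarity-based prediction.

```python
from collections import defaultdict


def similarity(a, b, feature_weights):
    """Similarity in [0, 1] between two feature vectors scaled to [0, 1].

    Assumption: 1 - weighted mean absolute difference. This is an
    illustrative measure, not the formula used by Constud.
    """
    total = sum(feature_weights)
    if total == 0:
        return 0.0
    diff = sum(w * abs(x - y) for w, x, y in zip(feature_weights, a, b))
    return 1.0 - diff / total


def predict_nominal(observation, exemplars, feature_weights):
    """Return the class with the largest summed weighted similarity.

    exemplars: list of (features, class_label, exemplar_weight) tuples.
    """
    scores = defaultdict(float)
    for features, label, weight in exemplars:
        scores[label] += weight * similarity(observation, features, feature_weights)
    return max(scores, key=scores.get)


def predict_numeric(observation, exemplars, feature_weights):
    """Return the similarity-weighted mean of the exemplar values.

    exemplars: list of (features, value, exemplar_weight) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for features, value, weight in exemplars:
        s = weight * similarity(observation, features, feature_weights)
        numerator += s * value
        denominator += s
    if denominator == 0:
        raise ValueError("no exemplar is similar to the observation")
    return numerator / denominator
```

In this sketch a feature weight of 0 removes a feature from the comparison, and an exemplar weight of 0 removes an exemplar from the prediction, which is the role that learned weights would play.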

Components

  • Software application Constud.
  • Database(s): knowledge base(s) of observations (feature vectors of so-called cases), parameters of machine learning, results of machine learning and predicted values.
  • Data layers of explanatory variables, mask and pre-classifiers.
    Data layers are needed only when spatial data are used. In the case of non-spatial data, the system consists of two components: the knowledge base and the software.

Special Features

  • The Constud software can generate knowledge base tables designed according to the needs of the Constud system.
  • Automatic computation of local statistics from multiple raster files.
  • Extensive options for local kernel and sample parameters.
  • Ability to compute local statistics in an area defined by a mask layer.
  • Includes indices of spatial pattern rarely found in other applications, e.g. gradient strength, stripeness, second and third mode, and mode weighted by inverse distance.
  • Data layers, metadata, exemplars, parameters, results of machine learning and prediction results are kept in an external database.
  • Raster layers used together may have different pixel sizes.
  • Predictions are calculated according to similarity, with no model.
  • Similarity of spatially close observations is reduced by a parameter called zero-distance. If an exemplar is spatially closer than the zero-distance, its similarity is set to 0. At larger distances, the similarity is reduced as follows: similarity = similarity - zero-distance / distance.
  • In the case of spatial data, applicability of features and exemplars is restricted by validity range (database fields [From] and [Until]).
  • The search for the best solution is a continuous iterative process. Experience obtained during the process is saved in the database as numerical actuality values.
  • Features and cases can be added to and excluded from the training sample without interrupting the learning process.
  • The best set of weights for features and exemplars found so far can be used for map generation at any time.
  • Feature and exemplar weights can optionally be learned separately for each category of a multinomial feature.
  • The system uses a separate validation sample or leave-one-out cross validation while learning.
  • The system can accommodate new data during machine learning.
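
The zero-distance rule described in the list above can be written out as a small function. This is a sketch under two assumptions that the text does not state: the formula is read with ordinary operator precedence (the quotient zero-distance / distance is subtracted from the similarity), and negative results are clamped to 0. The function name `adjust_similarity` is hypothetical.

```python
def adjust_similarity(similarity: float, distance: float, zero_distance: float) -> float:
    """Reduce the similarity of a spatially close exemplar (zero-distance rule)."""
    if distance < zero_distance:
        # Exemplar is closer than zero-distance: similarity is set to 0.
        return 0.0
    if distance == 0:
        # Only reachable when zero_distance is 0; avoid dividing by zero.
        return similarity
    # Assumption: similarity - (zero_distance / distance), clamped at 0.
    return max(0.0, similarity - zero_distance / distance)


if __name__ == "__main__":
    for d in (50, 100, 200, 1000):
        print(d, adjust_similarity(0.9, d, 100))
```

Under this reading the penalty shrinks as the exemplar lies farther away, so distant exemplars keep nearly all of their similarity while exemplars just outside the zero-distance contribute very little.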

Requirements for the Constud Database

The Constud software can validate the structure of the connected database and generate tables designed according to the needs of the Constud system. The current version, Constud 3, is still in development and works only with an MS Access database.

Requirements for Data Layers

In Constud, data layers are used as the source for calculating local spatial statistics, as polygons that spatially limit the calculation of indices, and as a mask layer when predictions are calculated for a limited area.

The spatial data layers, pre-classifiers and interpolation polygons used in Constud must meet the following requirements.

  • All data layers must be headerless binary rasters in unpacked format, stored row by row (Clark Labs Idrisi32 rst format).
  • All raster files used must be in the same projection and in the same coordinate system.
  • Layers of numerical variables must be in Idrisi real (4 bytes per pixel) format.
  • Layers of nominal variables, mask and pre-classifier must be in byte format.
  • The pixel size must remain constant for all map sheets within the same data layer.
  • For explanatory variables, the name of each raster file may contain only digits (representing the map sheet number) and the extension .rst. The map sheet number cannot exceed the maximum value of a 32-bit signed integer (2147483647).
  • Raster files of each data layer must be located in a separate subdirectory. The names of data layers are registered, and their metadata stored, in the table [D_LAYERS].
  • Boundary coordinates of each map sheet must be in the table [Map_sheets].
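
The file-level requirements above can be checked before a layer is registered. The following sketch validates a raster file name and reads a headerless raster into rows. The function names `sheet_number` and `read_raster` are hypothetical, the number of rows and columns is assumed to be supplied by the caller (for example, derived from the stored boundary coordinates and pixel size), and the file is assumed to be little-endian.

```python
import re
import sys
from array import array
from pathlib import Path

MAX_SHEET_NUMBER = 2147483647  # upper limit for map sheet numbers


def sheet_number(file_name: str) -> int:
    """Return the map sheet number encoded in a raster file name."""
    match = re.fullmatch(r"([0-9]+)\.rst", file_name)
    if match is None:
        raise ValueError(f"{file_name!r}: expected digits followed by .rst")
    number = int(match.group(1))
    if number > MAX_SHEET_NUMBER:
        raise ValueError(f"{file_name!r}: map sheet number exceeds {MAX_SHEET_NUMBER}")
    return number


def read_raster(path, rows: int, cols: int, kind: str):
    """Read a headerless, row-ordered binary raster as a list of rows.

    kind is "real" (4 bytes per pixel) for numerical variables or
    "byte" (1 byte per pixel) for nominal, mask and pre-classifier layers.
    """
    typecode = {"real": "f", "byte": "B"}[kind]
    values = array(typecode)
    data = Path(path).read_bytes()
    expected = rows * cols * values.itemsize
    if len(data) != expected:
        raise ValueError(f"{path}: expected {expected} bytes, found {len(data)}")
    values.frombytes(data)
    if sys.byteorder == "big":
        # Assumption: the file is little-endian; swap on big-endian hosts.
        values.byteswap()
    return [values[r * cols:(r + 1) * cols].tolist() for r in range(rows)]
```

The size check relies on the absence of a header: the file length must equal rows × columns × bytes per pixel, so a mismatch points to wrong dimensions, a wrong data type, or a packed file.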