Changes Compared to Constud 2

Differences in the knowledge base structure
The names of database objects and fields are in brackets [] in the following text.

  • Support for MS SQL Server Compact and MS Access has been removed.
  • Prepared standard deviations can be stored either in a single table [STDEV] common to all dependent variables or in a separate table for each dependent variable ([STDEV1], [STDEV2], ...), where the number is the number of the dependent variable.
  • Nominality of a non-spatial variable is decided according to the statistic, not according to the data layer.
  • Replacement features are not supported. An explanatory feature calculated to a data layer should be defined as a separate feature. If, during map generation, a feature replaces the original feature used in learning, the feature parameters in [EXPL_VAR] must also be replaced.
  • Renamed fields in table [DEP_VAR]: [no_part_feature] => [n_part_features] and [no_features] => [n_features].
  • Log tables can be empty.
  • The field [sample] in log tables is replaced by separate fields for the size of the learning sample [tsample] and of the validation sample [vsample].
  • If the database object containing observed cases, predictions or calculated similarities has a field [exemplars_used], then the list of exemplars used and their similarities to the case are stored.
  • Fields of numerical explanatory features are in the four-byte single (floating-point, or real) format.
  • The similarity field [s] is in the single format with values ranging 0...1, not percentages as in the previous Constud version.
  • If the field [DEP_VAR].[fit] is present, it is used (in addition to the log table) to store the best prediction fit reached so far in learning.
  • The field [DEP_VAR].[divisor] is no longer used.
  • Similarity between categories of an explanatory feature (if applied) must be given as a floating-point number ranging 0...1, not as a percentage.
  • The number of learning iterations while learning features is not set from the Constud interface but is an obligatory field [DEP_VAR].[Niter_f].
  • The number of learning iterations while learning exemplars is not set from the Constud interface but is an obligatory field [DEP_VAR].[Niter_e].
  • The zero-similarity is not set from the Constud interface but is an obligatory field [DEP_VAR].[0_sim].
  • The range of similarity in standard deviations is not set from the Constud interface but is an obligatory field [DEP_VAR].[sim_range].
  • The zero-distance is not set from the Constud interface but is an obligatory field [DEP_VAR].[0_dist].
  • The change extent while learning features is not set from the Constud interface but is an obligatory field [DEP_VAR].[FChange].
  • The change extent while learning exemplars is not set from the Constud interface but is an obligatory field [DEP_VAR].[EChange].
  • Dimensionality is not set from the Constud interface but is a binary field [DEP_VAR].[spatial].
  • There are two alternatives for measuring similarity to a given category. The given category can be common to all cases (read from [DEP_VAR].[given]) or defined separately for each case (read from the field [F] in the table of observed cases).
  • The AID codes of explanatory features must be byte format integers ranging 0...254.
  • Fields [pred] and [sim] are replaced by fields [pr] and [s].
  • Presence/absence data must be in byte format, 1 marking presence and 0 marking absence.
  • Fields for minimum and maximum values in [Map_sheets] are in the single format.
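
The value-range and naming changes listed above can be checked with a small script when an older knowledge base is migrated. The following is a minimal sketch in Python, not part of Constud: all function names and the RENAMED_FIELDS mapping are hypothetical, and the pairing [pred] => [pr], [sim] => [s] is assumed from the order in which the fields are listed.

```python
# Hypothetical migration helpers; none of these names exist in Constud.

def percent_to_fraction(value_percent: float) -> float:
    """Convert a similarity stored as a percentage (0...100) to the 0...1 range."""
    if not 0.0 <= value_percent <= 100.0:
        raise ValueError(f"similarity percentage out of range: {value_percent}")
    return value_percent / 100.0


def is_valid_aid(code: int) -> bool:
    """AID codes of explanatory features must be byte integers in 0...254."""
    return isinstance(code, int) and 0 <= code <= 254


def is_valid_presence(value: int) -> bool:
    """Presence/absence data: 1 marks presence, 0 marks absence."""
    return value in (0, 1)


# Old field name -> new field name. The [pred]/[sim] pairing is an assumption.
RENAMED_FIELDS = {
    "no_part_feature": "n_part_features",
    "no_features": "n_features",
    "pred": "pr",
    "sim": "s",
}


def rename_fields(record: dict) -> dict:
    """Return a copy of a record with old field names replaced by the new ones."""
    return {RENAMED_FIELDS.get(name, name): value for name, value in record.items()}
```

For example, rename_fields({"pred": 2, "sim": 0.75}) yields a record keyed by "pr" and "s", and is_valid_aid(255) is False because 255 lies outside 0...254.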

Differences in Constud software
  • Constud processes run in an asynchronous background worker that does not depend on the input and output devices of the PC.
  • The main functions are in dynamic link libraries (DLLs) written in C#.
  • The workflow is shown in the Constud main window, not in a separate form.
  • An Idrisi real-format file is produced when local statistics are calculated to a binary raster.
  • An Idrisi rdc file is added to the rst file when local statistics are calculated to a binary raster.
  • The value of a numerical explanatory feature is read from the database as a floating-point number; a nominal value is read as a byte.
  • For pre-classifiers and given codes, the value 255 means missing data.
  • Temporal overlap of a data layer and an observed case is checked only for spatial variables.
  • If an exemplar and a predictable case have the same VID code, they are considered the same case, and this exemplar is not used for predicting the case in learning.
  • Interpolation of prediction maps is not used and interpolation polygons are not supported.
  • Replacement features are not used.
  • Seamless calculation of local statistics is not supported.
  • In learning the sum of similarity needed for a decision, random multiplication is used instead of linear changes. Formula: SSimMax = SSimMax * Rnd() * 2
  • The predicted category of a boolean or nominal variable is decided according to the mean similarity of exemplars, not the total similarity as in the previous version.
  • Similarity to a given category is enabled for all types of nominal variable. In the previous Constud version it was available only for separately learned categories.
  • The TSS statistic is calculated from predicted and observed presence/absence, not from the sum of true positive and true negative similarity as in the previous version.
  • SQL Server stores the results faster if the database recovery model is set to Simple.
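
Three of the algorithmic changes in this list (random multiplication of SSimMax, deciding the category by mean similarity, and TSS from presence/absence) can be illustrated with a short Python sketch. This is not Constud code. It assumes that Rnd() is uniform on [0, 1), that TSS follows the standard definition sensitivity + specificity - 1, and that the exemplars passed in are the ones already selected for the decision; all function names are hypothetical.

```python
import random


def update_ssimmax(ssimmax: float, rng: random.Random) -> float:
    """Random multiplication: SSimMax = SSimMax * Rnd() * 2.

    With Rnd() assumed uniform on [0, 1), the multiplier lies in [0, 2).
    """
    return ssimmax * rng.random() * 2


def predict_category(exemplars):
    """Return the category with the highest mean similarity.

    `exemplars` is a list of (category, similarity) pairs, similarity in 0...1.
    """
    totals, counts = {}, {}
    for category, similarity in exemplars:
        totals[category] = totals.get(category, 0.0) + similarity
        counts[category] = counts.get(category, 0) + 1
    return max(totals, key=lambda c: totals[c] / counts[c])


def tss(observed, predicted) -> float:
    """TSS from 0/1 presence/absence data (standard definition, assumed here)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("TSS needs at least one observed presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# One strong exemplar of A against three weaker exemplars of B:
# total similarity favours B (1.3 vs 0.9), mean similarity favours A.
exemplars = [("A", 0.9), ("B", 0.5), ("B", 0.5), ("B", 0.3)]
chosen = predict_category(exemplars)
```

In this example the mean-based rule selects "A", whereas a total-based rule would have selected "B", which shows how the change can alter predictions when categories are represented by different numbers of exemplars.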

Differences in Constud data layers and in local statistics
  • The layers of explanatory variables can be in one-, two- or four-byte raster files.
  • Nominal features, pre-classifiers and mask files must be in one-byte raster files.
  • The field [EXPL_VAR].[precl] is removed. This field must be empty if preclassifying polygons are not applied while calculating values of this feature. If a preclassifier is to be used, the field must contain the KID code of the preclassifying layer.
  • The local value is added to the indices for both nominal (index 0) and numerical (index 100) data.
  • When indices are calculated to a binary raster, source layers must be in a folder accessible through the local network.

Attributes in the table [DEP_VAR] and their default values
For technical details, see the Constud tutorial.
The first column gives the default value or one of these markers: ! = user must define; * = user must define if the attribute is used; # = updated by Constud.
Default  Field            Meaning
!        FID              ID of the dependent variable
*        given            Given category of the nominal dependent variable
!        calc             Calculate this variable?
!        spatial          Spatial variable?
#        Log_ID           ID of the best fit in the log table
*        precl            AID of the preclassifier
*        name             Variable name
!        ftype            Variable type (0, 2, 3, 4)
1        n_part_features  Will be used for complex variables
3        sumsimmax        Sum of similarity sought for a decision
!        dataquery        Database object containing learning records
300      t_sample         Approximate size of learning samples
1000     v_sample         Approximate size of validation samples
#        fit              Best validation fit so far
10       n_features       Approximate number of learning features
50       0_dist           Zero-similarity distance
0        0_sim            Minimal accepted similarity
2        sim_range        Similarity extent in SD for a numerical variable
2        FChange          Change vigour used in learning feature weights
2        EChange          Change vigour used in learning exemplar weights
5        Niter_f          Number of iterations in learning feature weights
2        Niter_e          Number of iterations in learning exemplar weights
False    BoostRare        Equalize the proportion of categories in samples for a boolean dependent variable?
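
The defaults in the table above can be written down as a configuration template for preparing a [DEP_VAR] row. The following Python sketch is only an illustration: the function complete_dep_var_row and both constants are hypothetical, and whether Constud itself fills in these defaults when a field is empty is not stated here.

```python
# Default values copied from the table; keys are [DEP_VAR] field names.
DEP_VAR_DEFAULTS = {
    "n_part_features": 1,
    "sumsimmax": 3,
    "t_sample": 300,
    "v_sample": 1000,
    "n_features": 10,
    "0_dist": 50,
    "0_sim": 0,
    "sim_range": 2,
    "FChange": 2,
    "EChange": 2,
    "Niter_f": 5,
    "Niter_e": 2,
    "BoostRare": False,
}

# Fields marked "!" in the table: the user must define them.
REQUIRED_FIELDS = ("FID", "calc", "spatial", "ftype", "dataquery")


def complete_dep_var_row(row: dict) -> dict:
    """Return a copy of `row` with table defaults filled in for unset fields.

    Hypothetical helper: raises ValueError if a field marked "!" is missing.
    """
    missing = [name for name in REQUIRED_FIELDS if row.get(name) is None]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    # Values that are explicitly set override the defaults; None counts as unset.
    explicit = {name: value for name, value in row.items() if value is not None}
    return {**DEP_VAR_DEFAULTS, **explicit}


row = complete_dep_var_row(
    {"FID": 1, "calc": True, "spatial": False, "ftype": 2,
     "dataquery": "cases", "Niter_f": 8}
)
```

Here the explicitly set Niter_f (8) is kept, while t_sample, v_sample and the other unset fields take the defaults from the table. The object name "cases" is a placeholder.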