

GeneXproTools Online Guide

Learn how to use the 5 modeling platforms
of GeneXproTools with the Online Guide

   
 
 
Last update: February 19, 2014

 

Class Encoding

For classification and logistic regression, the learning algorithms of GeneXproTools require a two-valued numeric representation {a, b} for the two classes of the response variable, where a and b are two different real numbers. (For how categorical variables and response variables with multiple classes are handled, see the guides Category Mapping and Class Merging & Discretization, respectively.) You can change the class encoding in the Class Encoding Window.



The rationale for using encodings other than the canonical {0, 1} representation is that the learning algorithms of GeneXproTools for classification and logistic regression, which include different fitness functions and different types of rounding thresholds, can exploit different class encodings to search the solution space much more efficiently. For example, for fitness functions based exclusively on the confusion matrix or the number of hits, the class encoding you choose is irrelevant, as no error measure based on the distance to the target output is used to evaluate fitness. However, for fitness functions that measure some kind of distance between raw model outputs and actual values, such as the mean squared error, using a different encoding can significantly change the fitness landscape, as different ranges for the class representation produce very different results (in theory all ranges are mathematically equivalent, but this doesn’t hold in computational evolutionary systems).

In fact, in these systems the standard {0, 1} representation is far from unbiased, and better models can be created if a more flexible range is used, such as the symmetrical {-1000, 1000}, {-100, 100} or {-10, 10} and the asymmetrical {0, 1000}, {0, 100} or {0, 10}; even the {-1, 1} representation, which is not very far from the usual {0, 1} encoding, results in a richer fitness landscape.
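The difference between hit-based and distance-based fitness can be illustrated with a minimal Python sketch. This is not GeneXproTools code: the function names, the fixed rounding threshold of 0.5 and the raw outputs of the two candidate models are all hypothetical, chosen only to show the effect.

```python
def hits(raw_outputs, labels, threshold=0.5):
    """Number of correct classifications: depends only on the predicted classes."""
    return sum((y >= threshold) == (t == 1) for y, t in zip(raw_outputs, labels))


def mse(raw_outputs, labels, encoding):
    """Mean squared error against targets encoded as (a, b) = (negative, positive)."""
    a, b = encoding
    targets = [b if t == 1 else a for t in labels]
    return sum((y - t) ** 2 for y, t in zip(raw_outputs, targets)) / len(labels)


labels = [0, 0, 1, 1]

# Hypothetical raw outputs of two candidate models (illustrative values only).
model_a = [0.1, 0.2, 0.8, 0.9]    # outputs close to 0 and 1
model_b = [-8.0, -9.0, 9.0, 8.0]  # outputs spread over a wider range

# Hit-based fitness never looks at the target values, only at the classes.
hits_a = hits(model_a, labels)
hits_b = hits(model_b, labels)

# Distance-based fitness changes with the encoding of the targets.
mse_a_01, mse_b_01 = mse(model_a, labels, (0, 1)), mse(model_b, labels, (0, 1))
mse_a_10, mse_b_10 = mse(model_a, labels, (-10, 10)), mse(model_b, labels, (-10, 10))

print("hits:", hits_a, hits_b)
print("MSE with {0, 1}:    ", mse_a_01, mse_b_01)
print("MSE with {-10, 10}: ", mse_a_10, mse_b_10)
```

Both models classify every record correctly, so a hit-based fitness cannot tell them apart under any encoding. The mean squared error, by contrast, prefers model A under {0, 1} and model B under {-10, 10}, which is one simple way in which the encoding reshapes the fitness landscape.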

Note, however, that the default class encoding of GeneXproTools is the standard {0, 1} representation; you can change it in the Class Encoding Window. We especially advise you to do so if you’re working with fitness functions based on some kind of distance measure. The default fitness functions of GeneXproTools for classification (ROC Measure) and logistic regression (Positive Correl) are not distance-dependent, which is why GeneXproTools uses the more common {0, 1} representation by default.
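To see why a correlation-based measure is not distance-dependent, consider the following Python sketch. It assumes that such a measure behaves like the plain Pearson correlation between raw model outputs and encoded targets; the exact definition of the Positive Correl fitness function is not given here, and the function name and data below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


labels = [0, 1, 1, 0, 1]
raw = [0.3, 2.1, 1.7, -0.4, 0.9]  # hypothetical raw model outputs

# The same labels encoded as {0, 1} and as {-100, 100}.
targets_01 = [1 if t == 1 else 0 for t in labels]
targets_100 = [100 if t == 1 else -100 for t in labels]

r_01 = pearson(raw, targets_01)
r_100 = pearson(raw, targets_100)

# The two values agree up to floating-point rounding.
print(r_01, r_100, math.isclose(r_01, r_100, rel_tol=1e-9))
```

Going from {0, 1} to {-100, 100} is a rescaling and shift of the targets with a positive scale factor, and the Pearson correlation is unchanged by such a transformation. A measure of this kind therefore gains nothing from a wider encoding.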

Note also that the choice of class encoding has no bearing on the way GeneXproTools shows the response variable in charts and tables: in all cases GeneXproTools uses the {0, 1} representation.

GeneXproTools also allows you to invert the class encoding, which might be useful if you’d rather think of a certain outcome as Positive rather than Negative. You can also invert the class encoding in the Class Encoding Window.
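As a rough illustration of what encoding and inverting amount to, the following Python sketch maps 0/1 class labels to a pair {a, b} and optionally swaps the two classes first. The `encode` function and its parameters are hypothetical; in GeneXproTools itself these settings are chosen in the Class Encoding Window.

```python
def encode(labels, a=0, b=1, invert=False):
    """Map 0/1 class labels to the numeric pair {a, b}.

    a is used for the negative class and b for the positive class.
    With invert=True the roles of the two classes are swapped first.
    """
    if a == b:
        raise ValueError("a and b must be two different numbers")
    if invert:
        labels = [1 - t for t in labels]
    return [b if t == 1 else a for t in labels]


labels = [0, 1, 1, 0]

symmetric = encode(labels, a=-1, b=1)               # [-1, 1, 1, -1]
inverted = encode(labels, a=-1, b=1, invert=True)   # [1, -1, -1, 1]

print(symmetric)
print(inverted)
```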



 

 
 
 

Copyright (c) 2000-2014 Gepsoft Ltd. All rights reserved.