 

Logistic Regression Framework

Confusion Matrix


In its Logistic Regression Framework, GeneXproTools infers and shows two different Confusion Matrices: the Logistic Confusion Matrix and the ROC Confusion Matrix. Both matrices indicate the accuracy of a model (of both the core model and the final logistic regression model), and they can also be used to fine-tune the logistic regression model.

Confusion Matrix Analysis - Logistic Regression Framework
 

The Logistic Confusion Matrix is derived from the logistic regression model and infers the Most Likely Class from the predicted probability evaluated for each sample case. Probabilities higher than or equal to 0.5 (the Logistic Cutoff Point) indicate a Positive response; lower probabilities indicate a Negative response. The model output closest to the Logistic Cutoff Point is highlighted in light green in the Confusion Matrix Table. Note that the exact value of the Logistic Cutoff Point is shown in the companion Logistic Confusion Matrix Stats Report.
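The cutoff rule described above can be illustrated with a minimal Python sketch. This is not GeneXproTools code: the function name `most_likely_class` and the sample probabilities are hypothetical, and only the 0.5 cutoff and the "higher than or equal to" rule come from the text.

```python
LOGISTIC_CUTOFF = 0.5  # the Logistic Cutoff Point stated in the text


def most_likely_class(probability, cutoff=LOGISTIC_CUTOFF):
    """Return 1 (Positive) when probability >= cutoff, else 0 (Negative)."""
    return 1 if probability >= cutoff else 0


# Hypothetical predicted probabilities of the positive class.
probabilities = [0.91, 0.50, 0.49, 0.08]
predicted = [most_likely_class(p) for p in probabilities]

# Index of the sample case whose output lies closest to the cutoff
# (the row that would be highlighted in the table).
closest = min(range(len(probabilities)),
              key=lambda i: abs(probabilities[i] - LOGISTIC_CUTOFF))

print(predicted)  # [1, 1, 0, 0]
print(closest)    # 1
```

Note that a probability of exactly 0.5 is classified as Positive, because the comparison is inclusive.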

In the Confusion Matrix Table you have access not only to the predicted probabilities for each class but also to the Most Likely Class and to how these predictions compare with the actual target values. GeneXproTools also shows in the table the Type of each classification (true positive, true negative, false positive, or false negative) for every sample case; these are all the values needed to build the Confusion Matrix that is displayed in the graphic section.
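As a rough illustration of how the per-case Type column adds up to the confusion matrix, the following Python sketch labels each sample case and counts the four outcomes. The helper `classification_type` and the data are hypothetical and are not taken from GeneXproTools.

```python
from collections import Counter


def classification_type(actual, predicted):
    """Label one sample case as TP, TN, FP, or FN (classes are 0 and 1)."""
    if predicted == 1:
        return "TP" if actual == 1 else "FP"
    return "TN" if actual == 0 else "FN"


# Hypothetical target values and Most Likely Class predictions.
actual = [1, 0, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0]

types = [classification_type(a, p) for a, p in zip(actual, predicted)]
counts = Counter(types)

# The four counts are the cells of the 2x2 confusion matrix.
accuracy = (counts["TP"] + counts["TN"]) / len(actual)

print(types)   # ['TP', 'FP', 'FN', 'TN', 'TP', 'TN']
print(counts["TP"], counts["TN"], counts["FP"], counts["FN"])  # 2 2 1 1
```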

The ROC Confusion Matrix, on the other hand, is inferred using the Optimal Cutoff Point, a parameter derived from the ROC Curve. This means that a Positive response is predicted for model scores higher than or equal to the Optimal Model Threshold, and a Negative response otherwise. Note that although the diagram of the ROC Confusion Matrix is displayed in this section, the confusion matrix data (Predicted Class, Match, and Type) are shown in the Cutoff Points Table.
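The text does not say how the Optimal Cutoff Point is chosen on the ROC Curve. The sketch below therefore assumes one common criterion, Youden's J statistic (sensitivity + specificity - 1), purely for illustration; GeneXproTools may use a different rule. The function names and the data are hypothetical.

```python
def confusion_counts(actual, scores, threshold):
    """Count (TP, FP, TN, FN), predicting Positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for a, s in zip(actual, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and a == 1:
            tp += 1
        elif predicted == 1:
            fp += 1
        elif a == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def optimal_threshold(actual, scores):
    """Pick the candidate threshold that maximizes Youden's J (an assumption).

    Requires at least one positive and one negative case in `actual`.
    """
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(actual, scores, threshold)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold


# Hypothetical raw model scores (not probabilities) and target values.
actual = [0, 0, 1, 0, 1, 1, 1]
scores = [-2.1, -0.7, -0.3, 0.4, 1.2, 2.0, 3.5]

threshold = optimal_threshold(actual, scores)
print(threshold)                                    # 1.2
print(confusion_counts(actual, scores, threshold))  # (3, 0, 3, 1)
```

Unlike the logistic case, the threshold here applies to the raw model score, so it is not tied to the fixed 0.5 probability cutoff.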

Note, however, that the statistics evaluated at the Optimal Cutoff Point (or OCP statistics, for short) may differ slightly from the ones derived from the ROC Confusion Matrix. Remember that OCP statistics are evaluated using the direct readings of all the parameters at the Optimal Cutoff Point (this point, which is highlighted in green both in the ROC Curve Table and in the Cutoff Points Table, is also highlighted here in green for comparison with the Logistic Cutoff Point). For inverted models, for instance, the ROC Confusion Matrix is adjusted to match the default predictions of binomial logistic regression, which always predicts the “1” or positive class. The OCP statistics, however, are not adjusted for inversion and correspond to the actual values for the model. Also note that if you decide to export an inverted model to the Classification Framework, the confusion matrix you’ll get there using the Optimal Model Threshold will match the OCP statistics rather than the ROC Confusion Matrix.

Besides the canonical confusion matrix, GeneXproTools also shows a new kind of confusion matrix, the Distribution Confusion Matrix. It plots the distribution of all the classification outcomes (TP, TN, FP, FN) along the different quantiles or buckets. This shows clearly what each model is doing and where its strengths and weaknesses lie. And by comparing both Distribution Confusion Matrices (logistic and ROC), you can also see how both systems are operating. This is valuable information that you can use in different ways, but most importantly you can use it immediately to fine-tune the number of quantiles in your system so that you get the most out of the logistic fit (as a reminder, the ROC Confusion Matrix is quantile-independent and can be used as a reference for fine-tuning the logistic regression model, which is quantile-dependent).
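The idea of distributing the classification outcomes over quantiles can be sketched in Python as follows. The sketch assumes equal-count buckets formed by ranking the sample cases by predicted probability; how GeneXproTools actually forms its quantiles is not stated here, and the function name and data are hypothetical.

```python
from collections import Counter


def distribution_confusion_matrix(actual, probabilities, n_quantiles, cutoff=0.5):
    """Count TP/TN/FP/FN per bucket, with cases ranked by probability.

    Buckets are equal-count by rank (an assumption made for illustration).
    """
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i])
    buckets = [Counter() for _ in range(n_quantiles)]
    for rank, i in enumerate(order):
        bucket = rank * n_quantiles // len(order)
        predicted = 1 if probabilities[i] >= cutoff else 0
        if predicted == 1:
            kind = "TP" if actual[i] == 1 else "FP"
        else:
            kind = "TN" if actual[i] == 0 else "FN"
        buckets[bucket][kind] += 1
    return buckets


# Hypothetical target values and predicted probabilities.
actual = [0, 0, 1, 0, 1, 0, 1, 1]
probabilities = [0.05, 0.20, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]

buckets = distribution_confusion_matrix(actual, probabilities, n_quantiles=4)
for number, bucket in enumerate(buckets, start=1):
    print(number, dict(bucket))
```

In this toy data the errors (one FN and one FP) fall in the two middle buckets, next to the cutoff, which is the kind of pattern such a plot makes visible.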

Distribution Confusion Matrix - Logistic Regression Framework

 
 
 
 

Copyright (c) 2000-2013 Gepsoft Ltd. All rights reserved.