Default software parameter
  • The default GeneXproTools settings / parameters change depending on the dataset loaded, i.e., the number of cases and variables. Can you explain which parameters are subject to change and what triggers the change, please?

    Thank you in advance,

    Darren
  • Hi Darren,

    That’s true: GeneXproTools uses preset default parameters that depend on the number of records and variables for all modeling categories and also on the class distribution for Classification and Logistic Regression.

    The algorithms that control the default settings of GeneXproTools were optimized for each modeling category in order to provide a good starting point for the modeling process for all kinds of datasets and users (more advanced or first-time users). The parameters that change are numerous and include dataset partitioning schemes, subsampling, learning algorithm, population size, number of genes, gene size, linking function, function set, genetic operators and their rates, fitness function, random numerical constants settings, handling of missing values and categorical variables, and so on.

    Candida Ferreira

  • Thank you for the response. Is this process transparent? I ask because I would like to be able to monitor when and why parameters change for reporting purposes and so I can take account of any increase or decrease in model performance. Best wishes, Darren
  • Hi Darren,

    The process is totally transparent in the sense that you can see straightaway the default settings GeneXproTools selects for each dataset. I mean GeneXproTools could operate as a black box and show you just the model code; but you know that's not how it operates: everything is transparent and you can change all the settings you want to suit your needs.
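
    For reporting purposes, one simple way to keep track of what changed between two datasets is to note the defaults shown for each one and compare them. Below is a minimal Python sketch of that idea. It assumes the settings are transcribed by hand into dictionaries; the helper `diff_settings`, the parameter names, and the values are all hypothetical and are not GeneXproTools output or part of any GeneXproTools API.

```python
def diff_settings(before, after):
    """Return {name: (old, new)} for every setting that differs between two snapshots.

    A setting missing from one snapshot is reported with None on that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical values, copied by hand from the settings shown for each dataset.
run_a = {"records": 500, "variables": 8, "population_size": 30, "number_of_genes": 3}
run_b = {"records": 50000, "variables": 40, "population_size": 30, "number_of_genes": 5}

# Print only the settings that differ, in alphabetical order.
for name, (old, new) in diff_settings(run_a, run_b).items():
    print(f"{name}: {old} -> {new}")
```

    Keeping one such snapshot per run, next to the model's performance figures, would make it easier to tell a change in the defaults apart from a change in the data.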

    Having said that, the algorithms for selecting the default settings of GeneXproTools are very complex and are based on numerous heuristics that we've been fine-tuning as we gain more and more experience and the software becomes more and more sophisticated. Like I said previously, they are optimized to cover effectively as wide a range of scenarios as possible so that everyone can create a good model from the start.

    Individually, the heuristics that GeneXproTools uses for the defaults are not very complicated and they are not different from the heuristics any experienced user of GeneXproTools would come up with: with the defaults we are offering just a good starting point to all users to get them started. And then they can tweak the settings a bit to better suit their needs or just to try different approaches. That's what I do when I'm working on a problem: I start with the defaults and then I explore and learn more about the problem by trying different settings and approaches.

    Just to give you an example of some of the heuristics used in GeneXproTools for the defaults, you can take a look at the complete description of the algorithm for selecting the defaults just for Function Finding in version 4.0 (you can also find similar descriptions for Classification, Time Series Prediction, and Logic Synthesis):

    http://www.gepsoft.com/gxpt4kb/Chapter13/Section1/SS14.htm

    Now, in version 5.0, the complexity of the software is much, much higher and full descriptions like this one are almost impossible to create. And since they are totally inconsequential to the user, there's no need to provide them.

    Candida Ferreira
