External Custom Fitness

In this article we show how to create an external Custom Fitness Function for GeneXproTools using Microsoft Visual Basic 6.0. Until now, creating custom fitness functions required familiarity with Javascript; with the techniques in this article you can write them in any language that supports the creation of COM components. This covers almost every language, from scripting languages like Perl and Python to C++, Delphi, C# and Visual Basic. You can download all the project files, including complete projects for Visual Basic 6, VB.NET and C#, from the resource links at the bottom of this page.

How Does It Work?

The key to this technique is to delegate the fitness function calculation to an in-process component that is created and called from the Javascript fitness function in GeneXproTools. The Javascript is very simple:


var proc = new ActiveXObject("[PROGID.CLASSID]");
return proc.Calculate(gxptTarget, gxptOutput, gxptParams, gxptModelInfo);


The code above creates a new object and passes the GeneXproTools arrays to its Calculate method. These arrays are Javascript VBArray types that are new in version 4.0 of GeneXproTools; they correspond to the various arrays of the previous version, which are still supported. Each one is a variant containing an array of variants, with the following contents:

  • gxptTarget: Contains the values of the dependent variable
  • gxptOutput: Contains the values calculated by the current model
  • gxptParams: Contains settings of the run
    • gxptParams(0) = number of samples
    • gxptParams(1) = averaged target output
    • gxptParams(2) = variance of the target output
    • gxptParams(3) = 0/1 rounding threshold
    • gxptParams(4) = number of samples in the predominant class
    • gxptParams(5) = minimum program size
    • gxptParams(6) = maximum program size
  • gxptModelInfo: Contains information about the model
    • gxptModelInfo(0) = program size
    • gxptModelInfo(1) = used variables
    • gxptModelInfo(2) = number of literals
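
The index layout above is easy to mix up, so it can help to give the entries names. The following is a minimal sketch and not part of GeneXproTools: the function names describeParams and describeModelInfo are hypothetical, and the sketch assumes the values have already been copied into plain Javascript arrays in the documented order.

```javascript
// Hypothetical helper: names the documented gxptParams entries.
// Assumes `params` is a plain Javascript array in the documented order.
function describeParams(params) {
    return {
        samples: params[0],                 // number of samples
        targetAverage: params[1],           // averaged target output
        targetVariance: params[2],          // variance of the target output
        roundingThreshold: params[3],       // 0/1 rounding threshold
        predominantClassSamples: params[4], // samples in the predominant class
        minProgramSize: params[5],          // minimum program size
        maxProgramSize: params[6]           // maximum program size
    };
}

// Hypothetical helper: names the documented gxptModelInfo entries.
function describeModelInfo(info) {
    return {
        programSize: info[0],
        usedVariables: info[1],
        literals: info[2]
    };
}

// Example with made-up values
var exampleParams = describeParams([10, 2.5, 0.8, 0.5, 6, 3, 40]);
var exampleInfo = describeModelInfo([12, 3, 2]);
```

Reading exampleParams.samples instead of an entry at index 0 makes a fitness function easier to follow, at the cost of building a small object on each call.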

The "[PROGID.CLASSID]" string in the code above must be replaced with the respective ProjectName.ClassName of your VB project. In our sample project the complete code would be:


var proc = new ActiveXObject("VBCustomFitness.MSE");
return proc.Calculate(gxptTarget, gxptOutput, gxptParams, gxptModelInfo);


And this is all that needs to be added to the custom fitness function in GeneXproTools. The image below shows the Javascript part of the custom fitness function in GeneXproTools.


Creating the VB Fitness Function

The easiest way to test this feature is to download the tutorial’s files, open the project VBCustomFitness in Visual Basic 6.0 and compile the library. Then start GeneXproTools, open the file VBCustomFitnessTest.gep (also in the tutorial's files) and start a run. If the run fails to start with the error “The custom fitness code does not compile”, review the Javascript code and make sure that it matches the example and that the ActiveX string is correct.

If you prefer to start your own project, create a new VB project of type ActiveX DLL, add a class and implement the fitness function. You will also have to adjust the class id (the ProjectName.ClassName string) to match your component. If you change the name of the method Calculate or its signature, you also have to adapt the Javascript code. Finally, it is also possible to debug your fitness function from within Visual Studio or VB6's IDE. For VB6, open the project, add a breakpoint in the function body and press F5. Then open GeneXproTools and start the run.

For fitness functions created with VB.NET or C# you must open the project and change the debugging properties to start an external program. Point this property to the GeneXproTools executable (usually at c:\Program Files\GeneXproTools 4\GeneXproTools.exe) and then press F5 to start a debugging session.

Note that aborting a debug session will probably unload GeneXproTools, losing all your unsaved changes. In some cases it may also corrupt the run file, so we strongly suggest that you create backups of your runs before using them to debug a fitness function.

The tutorial’s projects implement the Mean Squared Error fitness function, with the following Calculate method (using the VB6 version):

Public Function Calculate(ByRef gxptTarget As Variant, _
                          ByRef gxptOutput As Variant, _
                          ByRef gxptParams As Variant, _
                          ByRef gxptModelInfo As Variant) _
                          As Variant

    ' gxptParams(0) holds the number of samples
    Dim nSamples As Long: nSamples = gxptParams(0)
    Dim modelMinusTargetSquared As Double
    Dim MSE As Double
    Dim temp1 As Double
    Dim i As Long

    ' Accumulate the squared differences between model output and target
    For i = 0 To nSamples - 1
        temp1 = gxptOutput(i) - gxptTarget(i)
        temp1 = temp1 * temp1
        modelMinusTargetSquared = modelMinusTargetSquared + temp1
    Next

    MSE = modelMinusTargetSquared / nSamples

    ' Treat a negligible error as a perfect fit
    If MSE <= 0.000000001 Then
        MSE = 0#
    End If

    ' Map the error to a fitness between 0 and 1000 (1000 = perfect fit)
    Calculate = (1 / (1 + MSE)) * 1000

End Function
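
To check the arithmetic outside VB, the same calculation can be sketched in Javascript. This is not the code shipped with the sample projects: the function name mseFitness is hypothetical, and the sketch assumes that the target and output values are plain Javascript arrays with at least nSamples entries.

```javascript
// Javascript sketch of the Mean Squared Error fitness calculation.
// Assumes `target` and `output` are plain arrays with at least nSamples entries.
function mseFitness(target, output, nSamples) {
    var sumSquaredError = 0;
    for (var i = 0; i < nSamples; i++) {
        var diff = output[i] - target[i];
        sumSquaredError += diff * diff;
    }

    var mse = sumSquaredError / nSamples;

    // Treat a negligible error as a perfect fit, as the VB version does
    if (mse <= 0.000000001) {
        mse = 0;
    }

    // Map the error to a fitness between 0 and 1000 (1000 = perfect fit)
    return (1 / (1 + mse)) * 1000;
}
```

With this formula a model that reproduces the target exactly scores 1000, and a model with an MSE of 1 scores 500.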


Performance

This approach is about 10% faster than the same implementation of MSE in Javascript. This value is approximate and was measured for a very simple fitness function; the real advantage of the technique shows in more complex fitness functions involving more time-consuming calculations.

With this approach you can store information between instantiations of your code, so that expensive operations such as opening files are performed only once. This is where you will see the major performance improvements.
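
The idea of doing expensive work once and reusing the result on later calls can be sketched as follows. This is an illustration of the pattern written in Javascript, not the code of the sample projects; in a COM component the cached state would be kept by the component itself. The names makeCachedLoader and loadReferenceData are hypothetical, and the "expensive" operation is a stand-in that only counts its calls.

```javascript
// Wraps an expensive loader so that it runs only on the first call.
function makeCachedLoader(loader) {
    var loaded = false;
    var cached;
    return function () {
        if (!loaded) {
            cached = loader();
            loaded = true;
        }
        return cached;
    };
}

// Hypothetical expensive operation (for example reading a file).
// Here it only counts how many times it has been run.
var loadCount = 0;
function loadReferenceData() {
    loadCount++;
    return [1.5, 2.5, 3.5];
}

var getReferenceData = makeCachedLoader(loadReferenceData);
getReferenceData(); // runs the loader
getReferenceData(); // returns the cached array without running it again
```

The fitness function is evaluated many times during a run, so paying the loading cost once instead of on every evaluation is where this pattern helps.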

The GXPT4CFHelper.DataHelper Library

This library is a free add-on to GeneXproTools that allows you to access run datasets from within your custom fitness functions. The library is part of this article's files, which can be downloaded from the Resources section at the bottom of this page. To install the DataHelper library, unzip the files, open a command prompt, navigate to the folder containing GXPT4CFHelper.dll and run the following command:

regsvr32 GXPT4CFHelper.dll

If you need to uninstall the library run the command:

regsvr32 GXPT4CFHelper.dll /u

From this point on the DataHelper class will be available to your internal and external Custom Fitness functions.

DataHelper methods

  • DataHelper.Initialise(ByRef RunPath As String, ByRef gxptTarget As Variant, ByRef gxptOutput As Variant, ByRef gxptParams As Variant, ByRef gxptModelInfo As Variant)

    This method initialises the library. It must be called every time the fitness function is calculated, and the call must appear before any other call into the library.

    Parameters:

    RunPath: string – Must contain the complete path to the GeneXproTools file.
    gxptTarget, gxptOutput, gxptParams, gxptModelInfo: variant – These are the internal GeneXproTools variant arrays that are passed to the fitness function and must be passed on to the library.

  • GetDoubleValue(ByVal Column As Long, ByVal Row As Long, ByVal DataSet As DataSetEnum)

    This method returns a data point from either the training set or the testing set (if it exists).

    Parameters:

    Column: long – The variable index starting at zero.
    Row: long – The sample index starting at zero.
    DataSet: enum – Flag that selects the set to fetch the data from:
    • dsTrainingSet(1) – Training Set
    • dsTestingSet(2) – Testing Set

DataHelper properties

  • CurrentSet: enum - dsTrainingSet(1), dsTestingSet(2)
    This property returns the set that is currently being processed. It relies on the two sets having different numbers of samples, so it does not return a correct value when the Training and Testing sets have the same number of samples.

  • Columns: long
    Returns the number of columns in the dataset, that is, the number of independent variables plus one for the dependent variable.

  • TrainingSamples: long
    Returns the number of samples in use in the Training set.

  • TestingSamples: long
    Returns the number of samples in use in the Testing set.

  • ErrorNumber: long
    Returns the error number of the last DataHelper call. A non-zero value means that the previous call failed, so check this property after each call.

  • ErrorDescription: string
    Short description of the last error.
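
The calling sequence described above (Initialise first, then data access, checking ErrorNumber after each call) can be sketched in Javascript as follows. This is a sketch under stated assumptions rather than code from the sample projects: the function name sumTrainingColumn is hypothetical, and the helper argument is assumed to be an object exposing the members documented above. Taking the helper as a parameter also lets the function be exercised with a stand-in object outside GeneXproTools.

```javascript
// Value of the documented dsTrainingSet flag
var DS_TRAINING_SET = 1;

// Hypothetical example: sums one column of the training set.
// `helper` is assumed to expose Initialise, GetDoubleValue, TrainingSamples,
// ErrorNumber and ErrorDescription as documented above.
// Returns null if the library reports an error.
function sumTrainingColumn(helper, runPath, column, target, output, params, modelInfo) {
    // Initialise must come before any other call into the library
    helper.Initialise(runPath, target, output, params, modelInfo);
    if (helper.ErrorNumber != 0) {
        return null;
    }

    var sum = 0;
    for (var row = 0; row < helper.TrainingSamples; row++) {
        var value = helper.GetDoubleValue(column, row, DS_TRAINING_SET);
        // The library fails silently, so check ErrorNumber after each call
        if (helper.ErrorNumber != 0) {
            return null;
        }
        sum += value;
    }
    return sum;
}
```

Inside GeneXproTools the helper object would be created from the registered library in the same way as the fitness component; the exact ProgID to use is not stated here, so check the sample projects for it.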

How It Works and Limitations

The DataHelper works by opening the GeneXproTools file and reading its datasets into memory. For the library to work correctly you must point it to the same run file that you are working on in GeneXproTools, and you should save the run file before you start a run. The DataHelper caches the loaded data for the duration of the session, so if you replace any of the datasets you must restart GeneXproTools. Because the data is loaded only once, the library is not reset when you open a different run either; you will have to restart GeneXproTools every time you want to use a different file.

The library does not raise any errors and fails silently when your code requests invalid data (such as a nonexistent data point). When an error is detected, the ErrorNumber property is set to a non-zero value and ErrorDescription is set to the description of the error.

The sample projects include examples of how to use this library in VB, VB.NET and C#.

Precautions

This is an advanced and powerful feature that also carries some responsibility. Your code is responsible for handling any exceptions and freeing any memory it allocates. It must not modify the GeneXproTools parameters or attempt to deallocate them. Finally, to allow GeneXproTools to function correctly, your fitness function must always return a double value between 0 and the maximum fitness.
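
One way to meet the last two requirements is to wrap the calculation in a guard that catches exceptions and clamps the result. The following Javascript is a minimal sketch of that idea, not part of GeneXproTools: the name safeFitness is hypothetical, and the choice to return 0 when something goes wrong is an assumption about what is appropriate for your fitness function.

```javascript
// Hypothetical guard: always returns a number between 0 and maxFitness.
// `calculate` is a function that performs the actual fitness calculation.
function safeFitness(calculate, maxFitness) {
    try {
        var value = Number(calculate());
        // NaN, infinite and negative results are treated as the worst fitness
        if (!isFinite(value) || value < 0) {
            return 0;
        }
        return Math.min(value, maxFitness);
    } catch (e) {
        // Any exception is also treated as the worst fitness
        return 0;
    }
}
```

For the MSE example in this article the maximum fitness is 1000, so a call would look like safeFitness(function () { return proc.Calculate(gxptTarget, gxptOutput, gxptParams, gxptModelInfo); }, 1000).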


Resources

VB6, VB.NET, C# projects and the DataHelper library

Mean Squared Error

User Defined Fitness Functions

Files for the first edition of this article. VB6 project

Last modified: July 16, 2007

Disclaimer & License: The code made available in this article is provided as-is; it is copyright of Gepsoft Limited and falls under the license of GeneXproTools. The DataHelper library is also provided as-is under the same license agreement as GeneXproTools. Even though this library is not part of the product, we will provide support for its use at our discretion.

 

Copyright (c) 2000-2014 Gepsoft Ltd. All rights reserved.