How to read this tutorial

This document describes in detail how to extend the KNIME framework with your own node. Although this manual describes the development of one simple node, it is not necessary to read everything in consecutive order. Use the content index to jump to the section you are interested in. You would rarely implement all the described functionality in one single node.

Contents:

  1. Section 1: Overview of the Eclipse plugin concept
  2. Section 2: The first steps on how to create your own node with the extension wizard
  3. Section 3: How to implement your own algorithm in a NodeModel with a NodeDialog
    1. Implementing the NodeModel
      1. The validateSettings method
      2. The loadValidatedSettings / saveSettings methods
      3. The configure method
      4. The execute method
    2. Implementing the NodeDialog
  4. Section 4: How to implement a NodeView for your node
  5. Section 5: How to save and load your internal representation
  6. Section 6: How to implement HiLiting
  7. Section 7: How to save and load an external model
  8. Section 8: How to adapt your NodeDescription
  9. Section 9: How to test and use your node
    1. Export and deploy your node as a plugin
  10. Download: The complete source code.

Section 1:

Overview of the Eclipse plugin concept

Although the core KNIME functionality does not depend on Eclipse, Eclipse is used as the workbench framework to provide a professional graphical user interface. For this reason all KNIME components are built up as so-called Eclipse plugins. In Eclipse everything is a plugin plugged into the extension points of other plugins. At its base, there is just a small runtime engine that executes the plugins and determines their dependencies. The required structure of an Eclipse plugin comprises a "plugin.xml" file, which contains the dependencies to other plugins and the extension points to which the plugin wants to connect. Furthermore, a plugin can provide its own extension points to which additional new plugins can be connected.

The KNIME workbench itself connects to several extension points of the Eclipse workbench (e.g. the editor, preference page and perspective extension points). KNIME itself also provides two extension points through which external providers can contribute to the functionality of KNIME. These extension points are the "Categories" extension point (org.knime.workbench.repository.categories) and the "Nodes" extension point (org.knime.workbench.repository.nodes). The "Categories" extension point allows you to introduce new category folders displayed in the node repository by simply adding an entry to the plugin.xml. The "Nodes" extension point enables you to contribute a new functional node to the node repository; the node is connected to the Java code by registering the corresponding NodeFactory in the plugin.xml. Besides the plugin.xml file, an Eclipse plugin requires a so-called "Bundle Activator", which inherits from the Eclipse class "Plugin". This is a housekeeping class that contains practically no functionality in terms of KNIME extensions but is nevertheless required by Eclipse. The "Bundle Activator" is registered in the obligatory "MANIFEST.MF" file located in the META-INF directory. Furthermore, the "MANIFEST.MF" file contains information about the class path, required plugins, the vendor, which Java packages should be made visible outside, etc.

The final file required is the "build.properties" file. It configures the way a plugin is exported (deployed). It defines the name of the jar file in which the classes of the plugin should be stored as well as the files to be included in deployment; it also enables you to define two build settings, one for a build that includes the source code and one for a build that does not. A source folder (usually "src") is also provided that contains the Java sources of the corresponding plugin. All this infrastructure is automatically created by the extension wizard described below, so that a developer can immediately focus on the real problem and does not have to worry about the Eclipse infrastructure.


Section 2:

The first steps on how to create your own node

If you want to extend the functionality of KNIME you can implement your own Nodes and contribute them to the KNIME Node Repository. A node consists of four basic classes:

  1. The NodeModel: contains the main algorithm and administrates the data flow.
  2. The NodeDialog (optional): provides the means to configure the algorithm of the NodeModel.
  3. The NodeView (optional): displays information about the result of the NodeModel's algorithm.
  4. The NodeFactory: Bundles all the classes together.

In addition an XML file is mandatory (with exactly the same name as the factory). It specifies the name and type of the node, and describes the node, the configuration options, the in- and output, and the view(s) of the node.

So that the user does not have to create all files by hand, the KNIME platform provides a new-node-extension-wizard, which collects necessary information from the user and automatically creates a KNIME plugin.

In the KNIME perspective, select File->New->Other->Konstanz Information Miner->Create a new KNIME Node-Extension. Continue by clicking the Next button. In the next screen enter the details about your node. Let's assume you want to write a numeric binner that assigns the data to equidistant bins according to the value of a specific attribute. We will call that node NumericBinnerNode. Your new-node-extension-wizard should look similar to the one shown in Illustration 1:

Illustration 1: Screenshot of the extension wizard

Enter a project name, under which the project will appear in the KNIME workflow navigator. The node class name is the name of your node. The necessary classes described above all start with this name (e.g. NumericBinnerNodeModel, NumericBinnerNodeDialog, etc.). In the Package name field specify the package name of the node. In the Node vendor field enter your name. Your name appears as the author name in the Javadoc. In the Node description text field provide a short description of your node. Finally specify the node type: this assigns a specific background color to the node in the KNIME workflow.

After clicking the Finish button the new-node-extension-wizard creates the project and all necessary directories and files required for a proper KNIME plugin as depicted in Illustration 2:

Illustration 2: Screenshot of the resulting files in the package explorer

If the project contains errors concerning the Java build path, try the following: select Project->Clean... from the menu, choose "Clean projects selected below" and select the newly created project, or choose "Clean all projects". This fixes the errors in most cases.

The created classes contain only example method stubs that can be completely deleted. You must add the desired functionality. The following sections explain how to do this step-by-step.


Section 3:

How to implement your own algorithm in a NodeModel with a NodeDialog

We start off by explaining a very simple binner. The bins are equally spaced, such that the whole range of a certain attribute is divided into n intervals. The data points with an attribute value within the k-th interval are considered to belong to the k-th bin. Therefore, the output is the original table with the binning information appended for each instance, i.e. row. The node also requires a dialog, as the user should be able to determine the number of bins and also specify the column on which the values should be binned.
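To make the binning rule concrete, the following sketch computes the upper bounds of n equally wide intervals and looks up the bin of a value. It is plain Java without any KNIME classes; the class EquidistantBinning and its methods are hypothetical names chosen for illustration, and the bins are numbered starting at 0.

```java
/** Illustrative helper for equidistant binning; not part of the KNIME API. */
final class EquidistantBinning {

    private EquidistantBinning() {
    }

    /** Returns the upper bound of each of the n equally wide intervals over [min, max]. */
    static double[] upperBounds(final double min, final double max, final int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Number of bins must be at least 1");
        }
        double[] bounds = new double[n];
        double width = (max - min) / n;
        for (int k = 0; k < n; k++) {
            bounds[k] = min + (k + 1) * width;
        }
        // make sure rounding errors never exclude the maximum from the last bin
        bounds[n - 1] = max;
        return bounds;
    }

    /**
     * Returns the zero-based index of the first interval whose upper bound
     * is not smaller than the value, or -1 if the value exceeds all bounds.
     */
    static int binOf(final double value, final double[] upperBounds) {
        for (int k = 0; k < upperBounds.length; k++) {
            if (value <= upperBounds[k]) {
                return k;
            }
        }
        return -1;
    }
}
```

For example, upperBounds(0.0, 10.0, 5) yields the bounds 2, 4, 6, 8 and 10, so the value 2.5 falls into bin 1. Values below the minimum end up in bin 0, because only upper bounds are compared.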

NodeModel:

Before we start to implement the actual binning algorithm in the execute method, we have to define the fields we need in the NodeModel. (After creation the NodeModel already contains example code, which can be deleted.) A convenient way to exchange the settings between the NodeModel and the NodeDialog is provided by the SettingsModel. As you will see later on, the NodeDialog also works with SettingsModels, which is why we use them for the number of bins and the column on which the values should be binned:

	// the settings model for the number of bins 
	private final SettingsModelIntegerBounded m_numberOfBins =
		new SettingsModelIntegerBounded(NumericBinnerNodeModel.CFGKEY_NR_OF_BINS,
                    NumericBinnerNodeModel.DEFAULT_NR_OF_BINS,
                    1, Integer.MAX_VALUE);
	
	// the settings model storing the column to bin
	private final SettingsModelString m_column = new SettingsModelString(
            NumericBinnerNodeModel.CFGKEY_COLUMN_NAME, "");
	

In order to obtain the settings from the dialog, they must be written into a NodeSettings object. The NodeSettings transfer the settings from the dialog to the model and vice versa. A key is needed for each field to identify and retrieve it from the NodeSettings. It is good practice to define the static final string used as the key in the NodeModel.

    /** The config key for the number of bins. */ 
    public static final String CFGKEY_NR_OF_BINS = "numberOfbins"; 
    /** The config key for the selected column. */
    public static final String CFGKEY_COLUMN_NAME = "columnName";
	

Transfer of the settings between the NodeModel and the NodeDialog is realized by implementing the validateSettings, loadValidatedSettings and saveSettings methods. All these methods can be safely delegated to the SettingsModels. In the validateSettings method a check is made to see if the values are present and valid (for example in a valid range, etc.).

    /**
     * @see org.knime.core.node.NodeModel
     *      #validateSettings(org.knime.core.node.NodeSettingsRO)
     */
     @Override
    protected void validateSettings(final NodeSettingsRO settings)
            throws InvalidSettingsException {
            
    	// delegate this to the settings models
    	
        m_numberOfBins.validateSettings(settings);
        m_column.validateSettings(settings);
    }
	

When the loadValidatedSettings method is called, the settings are already validated and can be loaded into the local fields, which in this case are the SettingsModels for the number of bins and the selected column.

    /**
     * @see org.knime.core.node.NodeModel
     *      #loadValidatedSettingsFrom(org.knime.core.node.NodeSettingsRO)
     */
     @Override
    protected void loadValidatedSettingsFrom(final NodeSettingsRO settings)
            throws InvalidSettingsException {
            
        // loads the values from the settings into the models.
        // It can be safely assumed that the settings have already been
        // validated by validateSettings.
        
        m_numberOfBins.loadSettingsFrom(settings);
        m_column.loadSettingsFrom(settings);

    }
	

In the saveSettings method the local fields are written into the settings such that the dialog displays the current values.

    /**
     * @see org.knime.core.node.NodeModel
     *      #saveSettingsTo(org.knime.core.node.NodeSettingsWO)
     */
     @Override
    protected void saveSettingsTo(final NodeSettingsWO settings) {

        // save settings to the config object.
    	
        m_numberOfBins.saveSettingsTo(settings);
        m_column.saveSettingsTo(settings);
    }
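The interplay of the three methods can be pictured without KNIME at all. The following sketch is a simplified stand-in, not KNIME code: a plain Map plays the role of the NodeSettings object, the class name BinnerSettingsSketch is hypothetical, and the default of 5 bins is an assumption, since the value of DEFAULT_NR_OF_BINS is not shown here.

```java
import java.util.Map;

/** Plain-Java stand-in for a node model; the Map plays the role of NodeSettings. */
final class BinnerSettingsSketch {

    static final String CFGKEY_NR_OF_BINS = "numberOfbins";
    static final String CFGKEY_COLUMN_NAME = "columnName";

    // assumed default values, for illustration only
    private int m_numberOfBins = 5;
    private String m_column = "";

    /** Checks the incoming settings without touching the current fields. */
    void validateSettings(final Map<String, Object> settings) {
        Object bins = settings.get(CFGKEY_NR_OF_BINS);
        if (!(bins instanceof Integer) || (Integer) bins < 1) {
            throw new IllegalArgumentException("Invalid number of bins: " + bins);
        }
        if (!(settings.get(CFGKEY_COLUMN_NAME) instanceof String)) {
            throw new IllegalArgumentException("No column name given");
        }
    }

    /** Copies settings that have already been validated into the fields. */
    void loadValidatedSettingsFrom(final Map<String, Object> settings) {
        m_numberOfBins = (Integer) settings.get(CFGKEY_NR_OF_BINS);
        m_column = (String) settings.get(CFGKEY_COLUMN_NAME);
    }

    /** Writes the current fields back so that a dialog can display them. */
    void saveSettingsTo(final Map<String, Object> settings) {
        settings.put(CFGKEY_NR_OF_BINS, m_numberOfBins);
        settings.put(CFGKEY_COLUMN_NAME, m_column);
    }
}
```

Because validation is a separate step that does not modify any field, invalid settings can be rejected while the model keeps its previous, consistent state.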
	

The methods described above are only one step in checking whether the node is executable with the current settings. It is also very important to check whether or not it might work with the incoming data table. This is accomplished by the configure method. The configure method is executed as soon as the inport has been connected. In the small example of our numeric binner, a check is performed to see if at least one numeric column is available and if the incoming data table contains a column with the selected column name. Otherwise the node is not executable. The DataTableSpec contains the required information and is passed to the configure method.

    /**
     * @see org.knime.core.node.NodeModel
     *      #configure(org.knime.core.data.DataTableSpec[])
     */
    protected DataTableSpec[] configure(final DataTableSpec[] inSpecs)
            throws InvalidSettingsException {
        // first of all validate the incoming data table spec
        
        boolean hasNumericColumn = false;
        boolean containsName = false;
        for (int i = 0; i < inSpecs[IN_PORT].getNumColumns(); i++) {
            DataColumnSpec columnSpec = inSpecs[IN_PORT].getColumnSpec(i);
            // we can only work with it, if it contains at least one 
            // numeric column
            if (columnSpec.getType().isCompatible(DoubleValue.class)) {
                // found one numeric column
                hasNumericColumn = true;
            }
            // and if the column name is set it must be contained in the data 
            // table spec
            if (m_column != null 
                    && columnSpec.getName().equals(m_column.getStringValue())) {
                containsName = true;
            }
            
        }
        if (!hasNumericColumn) {
            throw new InvalidSettingsException("Input table must contain at " 
                    + "least one numeric column");
        }
        
        if (!containsName) {
            throw new InvalidSettingsException("Input table does not contain " 
                    + "the column " + m_column.getStringValue() + ". Please (re-)configure " 
                    + "the node.");
        }
        
        
        // so far the input is checked and the algorithm can work with the 
        // incoming data
        ...
	

Just as we rely on the incoming specification of the data, the successor nodes also require information about the data format, which is provided after execution. For this reason, a specification for the output of our node must also be created in the configure method.

    ...
	// now produce the output table spec, 
    // i.e. specify the output of this node
    DataColumnSpec newColumnSpec = createOutputColumnSpec();
    // and the DataTableSpec for the appended part
    DataTableSpec appendedSpec = new DataTableSpec(newColumnSpec);
    // since it is only appended the new output spec contains both:
    // the original spec and the appended one
    DataTableSpec outputSpec = new DataTableSpec(inSpecs[IN_PORT],
            appendedSpec);
    return new DataTableSpec[]{outputSpec};
	...
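Since the configure method is shown in two fragments, the following sketch puts both halves together in one piece. It is a plain-Java stand-in under simplifying assumptions: arrays of column names and numeric flags replace the DataTableSpec, IllegalArgumentException replaces InvalidSettingsException, and the class ConfigureSketch is a hypothetical name.

```java
/** Plain-Java stand-in for the configure step; arrays replace the DataTableSpec. */
final class ConfigureSketch {

    private ConfigureSketch() {
    }

    /**
     * @param columnNames names of the incoming columns
     * @param numeric whether the column at the same index is numeric
     * @param selected the column chosen in the dialog
     * @return the output column names: all input columns plus "Bin Number"
     */
    static String[] configure(final String[] columnNames,
            final boolean[] numeric, final String selected) {
        boolean hasNumericColumn = false;
        boolean containsName = false;
        for (int i = 0; i < columnNames.length; i++) {
            hasNumericColumn |= numeric[i];
            containsName |= columnNames[i].equals(selected);
        }
        if (!hasNumericColumn) {
            throw new IllegalArgumentException(
                    "Input table must contain at least one numeric column");
        }
        if (!containsName) {
            throw new IllegalArgumentException(
                    "Input table does not contain the column " + selected);
        }
        // the output spec is the input spec with one column appended
        String[] out = new String[columnNames.length + 1];
        System.arraycopy(columnNames, 0, out, 0, columnNames.length);
        out[columnNames.length] = "Bin Number";
        return out;
    }
}
```

The important point is that the output specification is derived from the input specification and the settings alone, so successor nodes can be configured before any data has been processed.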
	

Since a DataColumnSpec must be created for the newly appended column in both the configure and the execute method, the code for the creation of the DataColumnSpec is extracted in a separate method:

    private DataColumnSpec createOutputColumnSpec() {
        // we want to add a column with the number of the bin 
        DataColumnSpecCreator colSpecCreator = new DataColumnSpecCreator(
                "Bin Number", IntCell.TYPE);
        // if we know the number of bins we also know the number of possible
        // values of that new column
        DataColumnDomainCreator domainCreator = new DataColumnDomainCreator(
                new IntCell(0), new IntCell(m_numberOfBins.getIntValue() - 1));
        // and can add this domain information to the output spec
        colSpecCreator.setDomain(domainCreator.createDomain());
        // now the column spec can be created
        DataColumnSpec newColumnSpec = colSpecCreator.createSpec();
        return newColumnSpec;
    }
	

Once this has been completed and implemented, the actual algorithm for equidistant binning can be written. The algorithm operating on the data must be placed in the execute method. In this example only one column is appended to the original data. For this purpose the so-called ColumnRearranger is used. It requires a CellFactory, which returns the appended cells for a given row.

        ...        
	    // instantiate the cell factory
        CellFactory cellFactory = new NumericBinnerCellFactory(
               createOutputColumnSpec(), splitPoints, colIndex);
        // create the column rearranger
        ColumnRearranger outputTable = new ColumnRearranger(
                inData[IN_PORT].getDataTableSpec());
        // append the new column
        outputTable.append(cellFactory);
	    ...
	

Having created the ColumnRearranger, it can be transferred together with the input table to the ExecutionContext to create a BufferedDataTable which is returned by the execute method, i.e. provided at the outport. Each node buffers the data in a BufferedDataTable. In order to avoid redundant buffering of the same data the ColumnRearranger is used. In this way only the appended column is buffered in our node. That is why we have to retrieve the BufferedDataTable from the ExecutionContext:

	    ...
        // and create the actual output table
        BufferedDataTable bufferedOutput = exec.createColumnRearrangeTable(
                inData[IN_PORT], outputTable, exec);
        // return it
        return new BufferedDataTable[]{bufferedOutput};	
        ...
	

For purposes of the CellFactory it is necessary to implement a NumericBinnerCellFactory. This extends the SingleCellFactory and only implements the getCell method. The passed row is checked to find out which bin contains the value from the selected column. It returns the number of the bin as a DataCell.

    /**
     * @see org.knime.core.data.container.SingleCellFactory#getCell(
     * org.knime.core.data.DataRow)
     */
    @Override
    public DataCell getCell(DataRow row) {
        DataCell currCell = row.getCell(m_colIndex);
		// check the cell for missing value
        if (currCell.isMissing()) {
            return DataType.getMissingCell();
        }
        double currValue = ((DoubleValue)currCell).getDoubleValue();
        int binNr = 0;
        for (Double intervalBound : m_intervalUpperBounds) {
            if (currValue <= intervalBound) {
                return new IntCell(binNr);
            }
            binNr++;
        }
        return DataType.getMissingCell();
    }
	

NodeDialog:

When the NumericBinnerNodeDialog is created you will see that the constructor already contains some example code. You may delete it and add the code for your desired control elements instead. For the NumericBinnerNodeDialog we need two GUI elements: one to set the number of bins and one to select the column for the binning. The KNIME framework provides a very convenient way to add standard dialog elements to the NodeDialog. Thus, your NumericBinnerNodeDialog extends the DefaultNodeSettingsPane by default. If the default dialog components do not suit your needs, for example if some components should be enabled or disabled depending on the user's settings, you may extend the NodeDialogPane directly. In our case a DialogComponentNumber for the number of bins and a DialogComponentColumnNameSelection for the column need to be added. Each component's constructor requires a new instance of a SettingsModel. The SettingsModel expects a string identifier, which it uses to store and load the value of the component, and a default value, which it holds until a new value is loaded. Additional parameters are necessary, depending on the type of component. The loading from and saving to the settings is executed automatically via the key passed in the constructor. We recommend using the key defined in the NodeModel. If you do this, the key must be public in the NodeModel.

public class NumericBinnerNodeDialog extends DefaultNodeSettingsPane {

    /**
     * New pane for configuring NumericBinner node dialog.
     * Contains control elements to adjust the number of bins 
     * and to select the column to bin.
     * Suppress warnings here: this is unavoidable since the 
     * allowed types are passed as a generic array. 
     */
	@SuppressWarnings ("unchecked")
    protected NumericBinnerNodeDialog() {
        super();
        // nr of bins control element
        addDialogComponent(new DialogComponentNumber(
                new SettingsModelIntegerBounded(
                    NumericBinnerNodeModel.CFGKEY_NR_OF_BINS,
                    NumericBinnerNodeModel.DEFAULT_NR_OF_BINS,
                    1, Integer.MAX_VALUE),
                    "Number of bins:", /*step*/ 1));
        // column to bin
        addDialogComponent(new DialogComponentColumnNameSelection(
                new SettingsModelString(
                    NumericBinnerNodeModel.CFGKEY_COLUMN_NAME,
                    "Select a column"),
                    "Select the column to bin",
                    NumericBinnerNodeModel.IN_PORT,
                    DoubleValue.class));                    
    }
}

After you have created your node and have implemented the NodeModel and the NodeDialog, don't forget to edit your node description in the XML file (with exactly the same name as your NodeFactory). Describe your node, the dialog settings, the in- and outports and, later on, the view. This is explained in detail in Section 8.


Section 4:

How to implement a NodeView for your node

In this section a NodeView for our node is implemented. In order to display information about the work of this node we display the bins in a histogram, where the height of each bin indicates the number of rows in this bin.

Internal Representation:

A model is required to represent the outcome of our algorithm. Obviously a model representing a bin is appropriate. A bin contains a number of rows and is graphically represented by a rectangle. Therefore, we create this data structure.

public class NumericBin {
    
    private final Set<DataCell> m_containedRowIds;
    
    /**
     * Creates a new, empty bin.
     */
    public NumericBin() {
        m_containedRowIds = new HashSet<DataCell>();
    }
    
    /**
     * Adds another row to this bin.
     * @param rowId the row to add to this bin.
     */
    public void addRowToBin(final DataCell rowId) {
        m_containedRowIds.add(rowId);
    }
    
    /**
     * 
     * @return the number of rows in this bin.
     */
    public int getSize() {
        return m_containedRowIds.size();
    }
	

To fill the bins with the corresponding row IDs, an array of empty bins must be passed to the NumericBinnerCellFactory, which adds each row to the matching bin in the getCell method:

    /**
     * @see org.knime.core.data.container.SingleCellFactory#getCell(
     * org.knime.core.data.DataRow)
     */
    @Override
    public DataCell getCell(DataRow row) {
        DataCell currCell = row.getCell(m_colIndex);
		...
        int binNr = 0;
        for (Double intervalBound : m_intervalUpperBounds) {
            if (currValue <= intervalBound) {
                m_bins[binNr].addRowToBin(row.getKey().getId());
                return new IntCell(binNr);
            }
            binNr++;
        }
		...
	

Drawing Component:

Before we can start to implement the NodeView we have to implement a component that actually draws our bins. For this purpose we create our own JPanel using a quite simple paint method:

public class NumericBinnerViewPanel extends JPanel 
...
    /**
     * @see javax.swing.JComponent#paint(java.awt.Graphics)
     */
    @Override
    public void paint(Graphics g) {
        super.paint(g);
        if (m_bins != null && m_bins.length > 0) {
            int maxNr = 0;
            // determine the largest bin
            for (int i = 0; i < m_bins.length; i++) {
                maxNr = Math.max(m_bins[i].getSize(), maxNr);
            }
            // if no size information available (creation) set default size
            int width = getWidth();
            if (width == 0) {
                width = SIZE;
            }
            int height = getHeight();
            if (height == 0) {
                height = SIZE;
            }
            // calculate the bin width
            int binWidth = width / m_bins.length;
            for (int i = 0; i < m_bins.length; i++) {
                // the left side of the rectangle
                int x = i * binWidth;
                // the height of the bin
                int binHeight = height;
                // the larger the bin the higher the rect
                double sizeFactor = ((double)(maxNr - m_bins[i].getSize())
                        / (double)maxNr); 
                // since y-axis starts on top subtract 
                binHeight -= sizeFactor * height;
                Rectangle rect = new Rectangle(x, height - binHeight, binWidth, 
                        binHeight);
                m_bins[i].setViewRepresentation(rect);
                // draw the bin in black
                g.setColor(Color.BLACK);
                g.fillRect(rect.x, rect.y, rect.width, rect.height);
                // draw a border in white to make the bins distinguishable
                g.setColor(Color.WHITE);
                g.drawRect(rect.x, rect.y, rect.width, rect.height);
            }
        }
    }
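The arithmetic of the paint method can be checked in isolation. The sketch below mirrors it as a pure function without any Swing classes; the class BinGeometry and the {x, y, width, height} array layout are hypothetical choices for illustration. It additionally guards against an empty histogram, which is an assumption about the desired behaviour and not taken from the code above.

```java
/** Illustrative helper that mirrors the arithmetic of the paint method. */
final class BinGeometry {

    private BinGeometry() {
    }

    /**
     * @param binSizes the number of rows in each bin
     * @param width the width of the panel in pixels
     * @param height the height of the panel in pixels
     * @return one {x, y, width, height} array per bin
     */
    static int[][] layout(final int[] binSizes, final int width,
            final int height) {
        int[][] rects = new int[binSizes.length][];
        if (binSizes.length == 0) {
            return rects;
        }
        // determine the largest bin
        int maxNr = 0;
        for (int size : binSizes) {
            maxNr = Math.max(size, maxNr);
        }
        int binWidth = width / binSizes.length;
        for (int i = 0; i < binSizes.length; i++) {
            // share of the panel height that stays empty above the bar;
            // if all bins are empty, every bar gets height zero
            double sizeFactor = maxNr == 0 ? 1.0
                    : (double) (maxNr - binSizes[i]) / maxNr;
            int binHeight = (int) (height - sizeFactor * height);
            // the y-axis starts at the top, so the bar hangs from
            // (height - binHeight) down to the bottom edge
            rects[i] = new int[] {i * binWidth, height - binHeight,
                    binWidth, binHeight};
        }
        return rects;
    }
}
```

For bins of size 1, 2 and 4 on a 300 x 100 panel, the bars are 100 pixels wide and 25, 50 and 100 pixels high: the largest bin always fills the full height and all others are scaled relative to it.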

NodeView:

Implementation of the NodeView is very simple. In the constructor we create the drawing component (our NumericBinnerViewPanel) using the bins retrieved from the model. We set it as our view content via the setComponent method:

public class NumericBinnerNodeView extends NodeView {

    // panel which actually paints the bins
    private NumericBinnerViewPanel m_panel;
    
    /**
     * Creates a new view.
     * 
     * @param nodeModel the model class: {@link NumericBinnerNodeModel}
     */
    protected NumericBinnerNodeView(final NodeModel nodeModel) {
        super(nodeModel);
        // get the bins
        NumericBin[] bins = ((NumericBinnerNodeModel)getNodeModel())
        	.getBinRepresentations();
        if (bins != null && bins.length > 0) {
            // create the panel that draws the bins
            m_panel = new NumericBinnerViewPanel(bins);
        } 
        // sets the view content in the node view
        setComponent(m_panel);
    }
	

Note that the panel might be null. In this case the view by default displays a message that no data is available. Another situation might be that the view is open and the model changes (due to a different input or a different configuration). Then the view has to be updated. This is realized by the modelChanged method in the NodeView:

    /**
     * @see org.knime.core.node.NodeView#modelChanged()
     */
    protected void modelChanged() {
        // if the model had changed get the new bins
        NumericBin[] bins = ((NumericBinnerNodeModel)getNodeModel())
        .getBinRepresentations();
        if (bins != null && bins.length > 0 && m_panel != null) {
            // and paint the bins
            ((NumericBinnerViewPanel)m_panel).updateView(bins);
        }
    }
	

The NumericBinnerViewPanel also needs an update method, as follows:

    /**
     * If the view is updated the new bins are set and then painted.
     * 
     * @param bins the new bins to display.
     */
    public void updateView(final NumericBin[] bins) {
        m_bins = bins;
        repaint();
    }
	


Section 5:

How to save and load your internal representation

Now try the following: execute the node, save, close and re-open the workflow. Your node is still executed but when you open the view no data is available. This is because your internal representation, i.e. the bins, is not saved automatically. To ensure that your internal representation can be stored and loaded, the KNIME framework provides two methods in the NodeModel: loadInternals and saveInternals. The following explains how to implement these methods for our numeric binner node. There are three ways to store an internal representation:

  1. Create a ModelContent object. To the ModelContent you can add the raw types of Java, such as int, double, String, etc. DataCells and subconfigs can also be stored. Subsequently, the ModelContent can be written to XML with saveToXML and loaded with the static method loadFromXML.
  2. If you have a DataTable, DataArray or similar as your internal model you can use the convenient static method:
    DataContainer.writeToZip(DataTable yourTable, File zipFile);
  3. If neither of the above-mentioned methods fits, you can serialize your internal model and write it to a file.

Since the serialization of objects is slow and prone to errors, we use the ModelContent approach to store our internal model. Since a numeric bin is not a raw type, it cannot be stored directly by ModelContent. However, each NumericBin consists only of raw types. Moreover, it is enough to store the contained row IDs, because the visual representation can be restored from this information: the size of a drawn bin depends on the width and height of the panel (which is evaluated in the paint method) and the number of contained rows. Each NumericBin therefore gets two methods to save itself to and load itself from a ModelContent; the NodeModel then stores these ModelContents inside its main ModelContent:

    /**
     * Adds the IDs of the contained rows to the settings. This is sufficient in order 
     * to later on restore the visual representation, since that only depends on
     * the dimension of the panel and the number of contained rows per bin.
     * 
     * @param modelContent the model content object to save to.
     */
    public void saveTo(final ModelContentWO modelContent) {
        DataCell[] cellArray = new DataCell[m_containedRowIds.size()];
        m_containedRowIds.toArray(cellArray);
        modelContent.addDataCellArray(CFG_KEY_CELLS, cellArray);
    }

    /**
     * Loads the contained row IDs.
     *  
     * @param modelContent the model content object to load from.
     * @throws InvalidSettingsException if the row IDs cannot be read from
     *             the model content.
     */
    public void loadFrom(final ModelContentRO modelContent) 
        throws InvalidSettingsException {
        DataCell[] cellArray = modelContent.getDataCellArray(CFG_KEY_CELLS);
        m_containedRowIds.addAll(Arrays.asList(cellArray));
    }
	

Again, we need an internal key to identify the stored field, in this case CFG_KEY_CELLS. It only has to be unique within this class, as each object receives its own ModelContent object. Now we can save and load our bins in the NodeModel's saveInternals and loadInternals methods. In saveInternals we are given the directory in which to store our files. We create a new ModelContent object and for each bin we create a sub model content, which is passed to the bin. The bin itself writes the necessary information into the sub model content. Afterwards we create a new file in the given directory, create an output stream and let the main model content write itself to XML.

    /**
     * 
     * @see org.knime.core.node.NodeModel#saveInternals(java.io.File, 
     * org.knime.core.node.ExecutionMonitor)
     */
    protected void saveInternals(final File internDir,
            final ExecutionMonitor exec) throws IOException,
            CanceledExecutionException {
       // create the main model content
       ModelContent modelContent = new ModelContent(INTERNAL_MODEL);
       for (int i = 0; i < m_bins.length; i++) {
           // for each bin create a sub model content
           ModelContentWO subContent = modelContent.addModelContent(
                   NUMERIC_BIN + i);
           // save the bin to the sub model content
           m_bins[i].saveTo(subContent);
       }
       // now all bins are stored to the model content
       // but the model content must be written to XML
       // internDir is the directory for this node
       File file = new File(internDir, FILE_NAME);
       FileOutputStream fos = new FileOutputStream(file);
       modelContent.saveToXML(fos);
    }
    

The loading of the internal model works accordingly. Create the file and an input stream and let the main model content load from the XML file. Then fetch the sub model content for every bin and let each bin load itself from this sub model content. Add the bin to your field.

    /**
     * 
     * @see org.knime.core.node.NodeModel#loadInternals(java.io.File, 
     * org.knime.core.node.ExecutionMonitor)
     */
    protected void loadInternals(final File internDir,
            final ExecutionMonitor exec) throws IOException,
            CanceledExecutionException {
        m_bins = new NumericBin[m_numberOfBins.getIntValue()];
        File file = new File(internDir, FILE_NAME);
        FileInputStream fis = new FileInputStream(file);
        ModelContentRO modelContent = ModelContent.loadFromXML(fis);
        try {
            for (int i = 0; i < m_numberOfBins.getIntValue(); i++) {
                NumericBin bin = new NumericBin();
                ModelContentRO subModelContent = modelContent
                        .getModelContent(NUMERIC_BIN + i);
                bin.loadFrom(subModelContent);
                m_bins[i] = bin;
            }
        } catch (InvalidSettingsException e) {
            throw new IOException(e.getMessage());
        }
    }
    

When you now execute, save, close and re-open the workflow again, you will see that the view displays the desired information.
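
The save/load round trip described above can be tried outside KNIME with a small stand-in. The following sketch is not KNIME API: it uses java.util.Properties and its XML serialisation in place of ModelContent, and the class name BinStoreSketch, the methods save and load, and the key names are hypothetical. It only illustrates the pattern of writing one entry per bin under an indexed key and reading the entries back with the same keys.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Properties;

// Stand-in sketch: Properties replaces ModelContent; all names are hypothetical.
class BinStoreSketch {
    static final String NUMERIC_BIN = "bin_";
    static final String NUMBER_OF_BINS = "numberOfBins";

    /** Writes one entry per bin and serialises the content to XML. */
    static byte[] save(final int[] binSizes) throws IOException {
        Properties content = new Properties();
        content.setProperty(NUMBER_OF_BINS, Integer.toString(binSizes.length));
        for (int i = 0; i < binSizes.length; i++) {
            content.setProperty(NUMERIC_BIN + i, Integer.toString(binSizes[i]));
        }
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            content.storeToXML(out, "internal model");
            return out.toByteArray();
        }
    }

    /** Reads the XML back and restores the bins using the same keys. */
    static int[] load(final byte[] xml) throws IOException {
        Properties content = new Properties();
        try (ByteArrayInputStream in = new ByteArrayInputStream(xml)) {
            content.loadFromXML(in);
        }
        int count = Integer.parseInt(content.getProperty(NUMBER_OF_BINS));
        int[] binSizes = new int[count];
        for (int i = 0; i < count; i++) {
            String value = content.getProperty(NUMERIC_BIN + i);
            if (value == null) {
                // comparable to turning an InvalidSettingsException
                // into an IOException in loadInternals
                throw new IOException("Missing entry for bin " + i);
            }
            binSizes[i] = Integer.parseInt(value);
        }
        return binSizes;
    }

    public static void main(final String[] args) throws IOException {
        int[] restored = load(save(new int[] {3, 0, 7}));
        System.out.println(Arrays.toString(restored));
    }
}
```

Unlike the tutorial code, this sketch stores the number of bins in the saved content itself, so loading does not depend on the current dialog settings. Whether that is preferable for a real node is a design decision.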


Section 6:

How to implement HiLiting

In the KNIME framework the technique known as linking and brushing is called HiLiting. This means that whenever a data point is hilited in one view, it is immediately hilited in all other views displaying that data point as well. If we had a view that displayed the data points directly, we would have to implement the HiLiteListener interface to be informed about any change in the hiliting. The HiLiteListener interface has three methods:

    /** 
     * Invoked when some item(s) were hilit. 
     * 
     * @param event contains a list of row keys that were hilit
     */
    void hiLite(final KeyEvent event);

    /** 
     * Invoked when some item(s) were unhilit.
     * 
     * @param event contains a list of row keys that were unhilit
     */
    void unHiLite(final KeyEvent event);
    
    /**
     * Invoked when everything (all rows) is unhilit.
     */
    void unHiLiteAll();
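
To make the role of these three callbacks concrete, the following sketch shows what a row-based view might do with them. It is a simplified stand-in and not the KNIME interface: the listener here receives plain sets of row key strings instead of a KeyEvent, and the names RowHiLiteListenerSketch and RowViewSketch are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the listener shown above; NOT the KNIME interface.
interface RowHiLiteListenerSketch {
    void hiLite(Set<String> rowKeys);
    void unHiLite(Set<String> rowKeys);
    void unHiLiteAll();
}

/** A row-based view keeps the hilited keys and repaints after every change. */
class RowViewSketch implements RowHiLiteListenerSketch {
    private final Set<String> m_hilited = new HashSet<String>();
    private int m_repaints = 0;

    public void hiLite(final Set<String> rowKeys) {
        m_hilited.addAll(rowKeys);
        repaint();
    }

    public void unHiLite(final Set<String> rowKeys) {
        m_hilited.removeAll(rowKeys);
        repaint();
    }

    public void unHiLiteAll() {
        m_hilited.clear();
        repaint();
    }

    Set<String> getHilited() {
        return Collections.unmodifiableSet(m_hilited);
    }

    int getRepaintCount() {
        return m_repaints;
    }

    // a real view would trigger a Swing repaint here; we only count the calls
    private void repaint() {
        m_repaints++;
    }

    public static void main(final String[] args) {
        RowViewSketch view = new RowViewSketch();
        view.hiLite(new HashSet<String>(Arrays.asList("row1", "row2")));
        view.unHiLite(Collections.singleton("row1"));
        System.out.println(view.getHilited());
    }
}
```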
	

But since we have an aggregated view of the data points, it does not make much sense to implement the HiLiteListener interface. We would rather hilite a bin and see all the data points in that bin hilited in other views. In the following we explain how this is implemented. First of all, our NumericBin has to know whether it is selected and whether it is hilited. We therefore simply introduce two flags to indicate the status:

    /**
     * @param isHilite sets the hilite status of this bin.
     */
    public void setHilited(final boolean isHilite) {
        m_isHilite = isHilite;
    }
    
    /**
     * 
     * @return true if this bin contains hilited keys, false otherwise.
     */
    public boolean isHilited() {
        return  m_isHilite;
    }
    
    /**
     * 
     * @return true if this bin is selected. false otherwise.
     */
    public boolean isSelected() {
        return m_isSelected;
    }
    
    /**
     * 
     * @param selected true, if the bin is selected, false otherwise.
     */
    public void setSelected(final boolean selected) {
        m_isSelected = selected;
    }
	

The next step is to listen to mouse events in order to find out whether a bin has been selected. For this, the bin has to know its graphical representation, i.e. the painted rectangle (otherwise we cannot tell whether it was clicked):

    /**
     * 
     * @return the graphical representation as a rectangle.
     */
    public Rectangle getViewRepresentation() {
        return m_viewRepresentation;
    }
    
    /**
     * The graphical representation can only be calculated outside with the
     * knowledge of the number of bins, the maximal and minimal size 
     * and the available width and height. This is done in the 
     * {@link NumericBinnerViewPanel#paint(java.awt.Graphics)}
     * 
     * @param rectangle the graphical representation
     */
    public void setViewRepresentation(final Rectangle rectangle) {
        m_viewRepresentation = rectangle;
    }
	

In order to listen to mouse events we have to add a MouseListener to the drawing component in the NodeView's constructor. The selected bins are stored in the field m_selected:

        ...
        m_selected = new HashSet<NumericBin>();
        m_panel.addMouseListener(new MouseAdapter() {

            /**
             * @see java.awt.event.MouseAdapter#mouseReleased(
             * java.awt.event.MouseEvent)
             */
            @Override
            public void mouseReleased(final MouseEvent e) {
                if (!e.isControlDown()) {
                    m_selected.clear();
                    for (NumericBin bin : m_panel.getBins()) {
                        bin.setSelected(false);
                    }
                }
                for (NumericBin bin : m_panel.getBins()) {
                    if (bin.getViewRepresentation() != null
                            && bin.getViewRepresentation().contains(
                                    e.getX(), e.getY())) {
                        bin.setSelected(true);
                        m_selected.add(bin);
                        break;
                    }
                }
            ...
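
The selection rule implemented by this listener (a plain click replaces the selection, a click with the control key held extends it, and only the first bin under the mouse is picked) can be isolated from Swing and tested on its own. The following is a minimal sketch under that reading of the code; the class BinSelectionSketch and its click method are hypothetical, and bins are represented by their rectangles and indices only.

```java
import java.awt.Rectangle;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: the selection rule without any Swing event handling.
class BinSelectionSketch {

    /**
     * Returns the new selection (as bin indices) after a click at (x, y).
     * Without the control key the previous selection is dropped first.
     */
    static Set<Integer> click(final List<Rectangle> bins,
            final Set<Integer> selected, final int x, final int y,
            final boolean ctrlDown) {
        Set<Integer> result = new LinkedHashSet<Integer>();
        if (ctrlDown) {
            result.addAll(selected);
        }
        for (int i = 0; i < bins.size(); i++) {
            Rectangle rect = bins.get(i);
            // bins that were never painted have no rectangle yet
            if (rect != null && rect.contains(x, y)) {
                result.add(i);
                break;
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        List<Rectangle> bins = Arrays.asList(
                new Rectangle(0, 0, 10, 50), new Rectangle(10, 0, 10, 50));
        Set<Integer> selection = click(bins, new HashSet<Integer>(), 5, 5, false);
        selection = click(bins, selection, 15, 5, true);
        System.out.println(selection);
    }
}
```

Note that, as in the listener above, a plain click on empty space clears the selection.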
	

So far we are able to select one or more bins. To hilite them, we add a menu to the NodeView that lets us hilite or unhilite the selected bins, or clear the hiliting.

        // create the hilite menu 
        // the HiliteHandler provides standard names 
        m_hilite = new JMenuItem(HiLiteHandler.HILITE_SELECTED);
        m_hilite.setEnabled(false);
        m_hilite.addActionListener(new ActionListener() {

            /**
             * @see java.awt.event.ActionListener#actionPerformed(
             * java.awt.event.ActionEvent)
             */
            public void actionPerformed(final ActionEvent e) {
                Set<DataCell> toBeHilited = new HashSet<DataCell>();
                for (NumericBin bin : m_selected) {
                    // store all row ids from the selected bin
                    toBeHilited.addAll(bin.getContainedRowIds());
                    // count the number of hilited bins for a
                    // correct menu display (see below); a bin that is
                    // already hilited must not be counted twice
                    if (!bin.isHilited()) {
                        m_numberOfHilitedBins++;
                    }
                    // set the bin hilited
                    bin.setHilited(true);
                }
                // now get the hilite handler and hilite the rows
                getNodeModel().getInHiLiteHandler(
                        NumericBinnerNodeModel.IN_PORT).fireHiLiteEvent(toBeHilited);
                // and repaint to have the hilited bins displayed correctly
                m_panel.repaint();
            }
            
        });
        m_unhilite = new JMenuItem(HiLiteHandler.UNHILITE_SELECTED);
        m_unhilite.setEnabled(false);
        m_unhilite.addActionListener(new ActionListener() {

            /**
             * @see java.awt.event.ActionListener#actionPerformed(
             * java.awt.event.ActionEvent)
             */
            public void actionPerformed(final ActionEvent e) {
                Set<DataCell> toBeUnhilited = new HashSet<DataCell>();
                for (NumericBin bin : m_selected) {
                    // store all row ids that should be unhilited
                    toBeUnhilited.addAll(bin.getContainedRowIds());
                    // decrease the number of hilited bins, but only
                    // for bins that were actually hilited
                    if (bin.isHilited()) {
                        m_numberOfHilitedBins--;
                    }
                    // unhilite the bin
                    bin.setHilited(false);
                }
                // get the hilite handler and unhilite the rows
                getNodeModel().getInHiLiteHandler(
                        NumericBinnerNodeModel.IN_PORT).fireUnHiLiteEvent(toBeUnhilited);
                // repaint to have the bins displayed correctly
                m_panel.repaint();
            }
            
        });
        
        JMenuItem clear = new JMenuItem(HiLiteHandler.CLEAR_HILITE);
        clear.addActionListener(new ActionListener() {

            /**
             * @see java.awt.event.ActionListener#actionPerformed(
             * java.awt.event.ActionEvent)
             */
            public void actionPerformed(final ActionEvent e) {
                // get the hilite handler and unhilite all
                getNodeModel().getInHiLiteHandler(
                        NumericBinnerNodeModel.IN_PORT).fireClearHiLiteEvent();
                // unhilite all bins
                for (NumericBin bin : m_panel.getBins()) {
                    bin.setHilited(false);
                }
                // no bin is hilited anymore
                m_numberOfHilitedBins = 0;
                // repaint to display the bins correctly
                m_panel.repaint();
            } 
        });
        // create the menu and add all the menu items to it
        JMenu menu = new JMenu(HiLiteHandler.HILITE);
        menu.add(m_hilite);
        menu.add(m_unhilite);
        menu.add(clear);
        // get the JMenu bar of the NodeView and add this menu to it
        getJMenuBar().add(menu);
        ...
	

The HiLiteHandler provides standard names for the menu items. The getJMenuBar method returns the menu bar of the NodeView, to which the additional menu can be added. To further improve our small human-computer interface we can enable and disable the menu items depending on the current selection and hilite status: the hilite menu entry should only be enabled when some bins are selected, and accordingly the unhilite menu entry should only be enabled when some bins are selected and hilited. We add this functionality to the MouseListener. (This, by the way, is the reason why the two menu items are fields rather than local variables.)

            public void mouseReleased(final MouseEvent e) {
                ...
                // update the hilite menu
                m_hilite.setEnabled(!m_selected.isEmpty());
                m_unhilite.setEnabled(m_numberOfHilitedBins > 0
                        && !m_selected.isEmpty());
                m_panel.repaint();
            }
        });
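
The counter m_numberOfHilitedBins has to be kept in sync with the bins by hand. One possible alternative, shown here only as a sketch, is to derive the menu state directly from the bins' flags whenever it is needed. The class HiliteMenuStateSketch and its nested Bin class are hypothetical stand-ins for the view and NumericBin. Note that this variant is slightly stricter than the tutorial code: it enables the unhilite entry only if at least one selected bin is itself hilited.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: menu state computed from the bins instead of a counter.
class HiliteMenuStateSketch {

    /** Minimal stand-in for NumericBin, carrying only the two status flags. */
    static final class Bin {
        final boolean m_selected;
        final boolean m_hilited;

        Bin(final boolean selected, final boolean hilited) {
            m_selected = selected;
            m_hilited = hilited;
        }
    }

    /** The hilite entry is enabled as soon as any bin is selected. */
    static boolean hiliteEnabled(final List<Bin> bins) {
        for (Bin bin : bins) {
            if (bin.m_selected) {
                return true;
            }
        }
        return false;
    }

    /** The unhilite entry is enabled if a selected bin is also hilited. */
    static boolean unhiliteEnabled(final List<Bin> bins) {
        for (Bin bin : bins) {
            if (bin.m_selected && bin.m_hilited) {
                return true;
            }
        }
        return false;
    }

    public static void main(final String[] args) {
        List<Bin> bins = Arrays.asList(new Bin(true, false), new Bin(false, true));
        System.out.println("hilite enabled:   " + hiliteEnabled(bins));
        System.out.println("unhilite enabled: " + unhiliteEnabled(bins));
    }
}
```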
	

So far we are able to select and hilite (or unhilite) the bins and the rows they contain. But if you run the code implemented so far, you immediately encounter the frustrating fact that our view does not distinguish between selected, hilited and normal bins. Thus, we have to add this to the paint method of the drawing component, the NumericBinnerViewPanel (the excerpt also shows how the graphical rectangle of each bin is updated on every paint):

    ...
    Rectangle rect = new Rectangle(x, height - binHeight, binWidth,
            binHeight);
    m_bins[i].setViewRepresentation(rect);
    // choose the fill color according to the status of the bin
    Color color = Color.BLACK;
    if (m_bins[i].isHilited()) {
        color = ColorAttr.HILITE;
    }
    if (m_bins[i].isSelected()) {
        color = ColorAttr.SELECTED;
    }
    if (m_bins[i].isHilited() && m_bins[i].isSelected()) {
        color = ColorAttr.SELECTED_HILITE;
    }
    Graphics2D g2 = (Graphics2D)g;
    g2.setColor(color);
    g2.fillRect(rect.x+2, rect.y+2, rect.width-2, rect.height-2);
    // draw a border in white to make the bins distinguishable
    g2.setColor(Color.WHITE);
    g2.setStroke(new BasicStroke(2));
    g2.drawRect(rect.x, rect.y, rect.width, rect.height);
    ...
	

Now the view looks good and is correctly displayed if a bin is selected, hilited, both, or none. We use the KNIME standard colors defined in the ColorAttr to have a uniform coloring in all views.
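
The three consecutive if statements in the paint code rely on their order: a later match overrides an earlier one. The same precedence can be written as a single decision, which may be easier to verify. The sketch below is an illustration only; the class BinColorSketch and the enum State are hypothetical, and each state would be mapped to the corresponding ColorAttr constant (or Color.BLACK for a normal bin) when painting.

```java
// Hypothetical sketch: the display state of a bin as one explicit decision.
class BinColorSketch {

    /** The four display states; each maps to one fill color when painting. */
    enum State { NORMAL, HILITE, SELECTED, SELECTED_HILITE }

    /** Selected and hilited together take precedence over either flag alone. */
    static State stateOf(final boolean hilited, final boolean selected) {
        if (hilited && selected) {
            return State.SELECTED_HILITE;
        } else if (selected) {
            return State.SELECTED;
        } else if (hilited) {
            return State.HILITE;
        }
        return State.NORMAL;
    }

    public static void main(final String[] args) {
        // print the state for all four flag combinations
        boolean[] flags = {false, true};
        for (boolean hilited : flags) {
            for (boolean selected : flags) {
                System.out.println("hilited=" + hilited + ", selected="
                        + selected + " -> " + stateOf(hilited, selected));
            }
        }
    }
}
```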


Section 7:

How to save and load an external model

Suppose you want to implement a learner that learns a model from the data, followed by a predictor node that uses the learned model to classify new data. To accomplish this, you need a facility to pass the learned model from the learner to the predictor. The KNIME framework provides this functionality through the ModelPort concept. To outline this concept, we will now create the learned model for our numeric binner and write the interval bounds of every bin into a model. For this purpose it is necessary to override the saveModelContent method of the NodeModel. If we had a node with a ModelInport, we would have to override the loadModelContent method in order to load the model when it is connected to the inport.

Although it is possible to write your model directly to the ModelContent object, it is good practice and highly recommended to have an object that represents the external model of your node and is solely responsible for loading from and saving to the ModelContent. This makes it very easy to share your model with other nodes. In our case we create a class NumericBinModel. To increase the usability of the model we store the lower and upper bound of every bin as an interval. Since we only have a ModelOutport, we only have to implement the saveTo method, as can be seen below:

    /**
     * Saves this model to the model content
     * @param modelContent the model content to save to.
     */
    public void saveTo(final ModelContentWO modelContent) {
        modelContent.addInt(NUMBER_OF_BINS, m_intervals.size());
        int intervalNr = 0;
        for (Interval interval : m_intervals) {
            ModelContentWO intervalModel = modelContent.addModelContent(
                    INTERVAL + intervalNr++);
            intervalModel.addDouble(LOWER_BOUND, interval.getLowerBound());
            intervalModel.addDouble(UPPER_BOUND, interval.getUpperBound());
        }
    }
    

The interval simply stores a pair of lower and upper bound. In addition, the model provides methods to add and retrieve intervals:

    /**
     * 
     * @param binNumber the number of the bin for which the lower bound 
     * should be returned
     * @return the lower bound of the specified interval.
     */
    public double getLowerBoundForInterval(final int binNumber) {
        return m_intervals.get(binNumber).getLowerBound();
    }

    /**
     * 
     * @param binNumber the number of the bin for which the upper bound 
     * should be returned
     * @return the upper bound of the specified interval.
     */
    public double getUpperBoundForInterval(final int binNumber) {
        return m_intervals.get(binNumber).getUpperBound();
    }
    
    /**
     * 
     * @return the number of bins, i.e. intervals.
     */
    public int getNumberOfBins() {
        return m_intervals.size();
    }
    

    /**
     * Adds an interval to this model.
     * @param lowerBound the lower bound of the interval
     * @param upperBound the upper bound of the interval
     */
    public void addInterval(final double lowerBound, final double upperBound) {
        Interval interval = new Interval(lowerBound, upperBound);
        m_intervals.add(interval);
    }
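
To illustrate how a predictor node might use such a model, the following sketch looks up the bin for a new value. It does not use the NumericBinModel class itself: BinLookupSketch and findBin are hypothetical, and the convention that every interval includes its lower bound and excludes its upper bound (except for the last interval, which includes both) is an assumption, since the tutorial does not specify how boundary values are assigned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a lookup a predictor could perform on the intervals.
class BinLookupSketch {
    // each entry holds {lowerBound, upperBound}
    private final List<double[]> m_intervals = new ArrayList<double[]>();

    void addInterval(final double lowerBound, final double upperBound) {
        m_intervals.add(new double[] {lowerBound, upperBound});
    }

    /**
     * Returns the index of the interval containing the value, or -1 if the
     * value lies outside all intervals. Assumed convention: lower bound
     * inclusive, upper bound exclusive, except for the last interval.
     */
    int findBin(final double value) {
        for (int i = 0; i < m_intervals.size(); i++) {
            double[] interval = m_intervals.get(i);
            boolean isLast = i == m_intervals.size() - 1;
            boolean belowUpper = value < interval[1]
                    || (isLast && value <= interval[1]);
            if (value >= interval[0] && belowUpper) {
                return i;
            }
        }
        return -1;
    }

    public static void main(final String[] args) {
        BinLookupSketch model = new BinLookupSketch();
        model.addInterval(0.0, 1.0);
        model.addInterval(1.0, 2.0);
        model.addInterval(2.0, 3.0);
        System.out.println("bin of 1.5: " + model.findBin(1.5));
    }
}
```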
	

Now, we have to fill the model in the for-loop of the NodeModel's execute method:

        ...
        double intervalUpperBound = lowerBound;
        // create the external model
        m_model = new NumericBinModel();
        double intervalLowerBound = lowerBound;
        for (int i = 0; i < m_numberOfBins.getIntValue(); i++) {
            intervalLowerBound = intervalUpperBound;
            intervalUpperBound += interval;
            // fill the external model
            m_model.addInterval(intervalLowerBound, intervalUpperBound);
            splitPoints.add(intervalUpperBound);
            // fill the bins with empty representations
            m_bins[i] = new NumericBin();
        }
        ...
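
As a worked example of the interval computation, the sketch below derives the bounds of equidistant bins from a minimum, a maximum and the number of bins. It is an illustration under assumptions: the class EquidistantBinsSketch is hypothetical, and the bin width is taken to be (max - min) / numberOfBins, which the excerpt above suggests but does not show. Instead of repeatedly adding the width, each bound is computed from its index, which avoids accumulating floating point rounding errors; the last upper bound is set to the maximum explicitly.

```java
import java.util.Arrays;

// Hypothetical sketch: bounds of equidistant bins computed from the bin index.
class EquidistantBinsSketch {

    /** Returns one {lowerBound, upperBound} pair per bin. */
    static double[][] intervals(final double min, final double max,
            final int numberOfBins) {
        if (numberOfBins < 1 || !(max > min)) {
            throw new IllegalArgumentException(
                    "Need at least one bin and max > min");
        }
        double width = (max - min) / numberOfBins;
        double[][] result = new double[numberOfBins][2];
        for (int i = 0; i < numberOfBins; i++) {
            result[i][0] = min + i * width;
            // the last bin ends exactly at the maximum
            result[i][1] = (i == numberOfBins - 1)
                    ? max : min + (i + 1) * width;
        }
        return result;
    }

    public static void main(final String[] args) {
        for (double[] interval : intervals(0.0, 10.0, 4)) {
            System.out.println(Arrays.toString(interval));
        }
    }
}
```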
    

Once the model has been filled with the information about the intervals it is possible to save it in the saveModelContent method of the NodeModel:

    /**
     * Only the saveModelContent method has to be overridden, since there is
     * only a ModelOutport.
     *
     * @see org.knime.core.node.NodeModel#saveModelContent(int,
     * org.knime.core.node.ModelContentWO)
     */
    @Override
    protected void saveModelContent(final int index, 
            final ModelContentWO modelContent) throws InvalidSettingsException {
        m_model.saveTo(modelContent);
    }
    

Since we now provide an external model of our node, we have to add a ModelOutport. This is done in the constructor of the NodeModel, where the first two parameters define the number of data inports and outports and the last two parameters the number of model inports and outports. Since we have no ModelInport and one ModelOutport, the resulting new constructor looks like this:

    /**
     * Constructor for the node model.
     */
    protected NumericBinnerNodeModel() {
        // we have one inport for the numeric data to bin 
        // and two outports:
        // one for the original data with the binning information appended
        // and one for the bins and their used interval bounds.
        super(1, 1, 0, 1);
    }
    

Having adapted your constructor in this way you will notice the following error the next time you start your workbench:

ERROR	 NumericBinnerNodeFactory	 CODING PROBLEM	Missing or surplus predictor output port name
    

To avoid this you have to adapt your node description, which is explained in the next section.


Section 8:

How to adapt your NodeDescription

The NodeDescription is the information displayed in the NodeDescription window of your workbench. You can edit this information in the XML file that has exactly the same name as your NodeFactory, in our case NumericBinnerNodeFactory.xml. The NodeDescription explains the functionality of the node, the configuration options, the node's view and the meaning of the inports and outports. Since we added a ModelOutport, we also have to add a description of that port. When you look at the NumericBinnerNodeFactory.xml file you will notice that some fields already exist, but most of them are filled with placeholders. We will go through each of the fields and add the appropriate information.

  • fullDescription: The full description should explain the functionality of your node and help the user to use it. For example: "Divides the domain of a selected numeric column into the selected number of equidistant bins and puts the input data into these bins accordingly. The binning information (which row is in which bin) is provided in an extra column at the output."
  • option: for each control element in your dialog, describe its meaning here. The option name is the label used for it in the dialog. We add:
    <option name="Number of bins">Define the number of bins</option>
    <option name="Column to bin">Select the numeric column which should be binned</option>
  • ports:
    • dataIn: the data inports with the index, a name displayed in the tooltip and a description, which is displayed in the NodeDescription,
    • dataOut: the data outports with the index, a name displayed in the tooltip and in the context menu and a description, which is displayed in the NodeDescription,
    • modelIn: the model inports with the index, a name displayed in the tooltip and a description for the NodeDescription,
    • modelOut: the model outports with the index, a name displayed in the tooltip and the context menu and a description for the NodeDescription.
    We add:
    <ports>
    <dataIn index="0" name="Data to bin">Data to bin</dataIn>
    <dataOut index="0" name="Binned data">The input data with an additional column containing the corresponding bin number for each row.</dataOut>
    <modelOut index="0" name="Bin intervals">Bounds of the intervals</modelOut>
    </ports>
  • views: explains what is displayed in the view with an index of the view (since a node may have several views), a name displayed in the context menu and a description for the NodeDescription. We add:
    <views>
    <view index="0" name="Histogram">Displays the relative size of each bin in a histogram</view>
    </views>


Section 9:

How to test and use your Node

In order to test our new node implementation we need to start a new "runtime workbench" (which is basically an Eclipse instance started from our running Eclipse). To do so, go to the "Run" menu and select "Run ...". In the dialog, select "Eclipse Application" on the left and click the "New launch configuration" button as shown in the following screenshot:

[Screenshot of the launch configuration dialog]

You may want to give the new configuration a meaningful name and/or change its runtime parameters, such as enabling assertions or giving the newly spawned Java process more memory. For example, entering "-ea -Xmx512M" under "VM arguments" in the arguments tab enables assertions (-ea) and increases the available heap space to 512 MB (-Xmx512M).

Launch the configuration using the "Run" button. In the runtime workbench that opens, switch to the KNIME perspective (i.e. "Window" - "Open Perspective" - "Konstanz Information Miner"). You will notice that the node repository now also contains your newly defined node. Please note that you have to perform these steps only once: the run configuration is saved and available in the run history, i.e. you can restart the workbench by clicking the run button.

Export and deploy your Node as a plugin

Once you have tested your new plugin using the runtime workbench, it is time to deploy your node as a plugin. This is done with the "Export deployable plug-ins and fragments" wizard. Just right-click your plugin and choose "Export...". In the window that appears select the "Plug-in Development" folder and, within it, the "Deployable plug-ins and fragments" wizard. After you press the "Next" button, the wizard displays a list of all available plugins in your workspace, with the one you right-clicked already selected. Thus, you just have to select either a directory or an archive file to deploy your plugin to. If you select a directory, the plugin is exported to this destination within a directory called "plugin", which then contains the plugin as a .jar file. If you select an archive file, the "plugin" folder containing the .jar file is zipped to the specified archive.

Once the plugin is exported, it can be installed in any KNIME installation, or any Eclipse installation that contains the basic KNIME plugins, by copying the content of the "plugin" folder to the plugin folder of the KNIME / Eclipse installation. Note that you have to restart your installation if it is currently running.


Download:

The complete source code
