How to read this tutorial
This document describes in detail how to extend the KNIME framework with your own node. Although this manual describes the development of one simple node, it is not necessary to read everything in consecutive order. Use the content index to jump to the section you are interested in. You would rarely implement all the described functionality in one single node.
Contents:
- Section 1: Overview of the Eclipse plugin concept
- Section 2: The first steps on how to create your own node with the extension wizard
- Section 3: How to implement your own algorithm in a NodeModel with a NodeDialog
- Section 4: How to implement a NodeView for your node
- Section 5: How to save and load your internal representation
- Section 6: How to implement HiLiting
- Section 7: How to save and load an external model
- Section 8: How to adapt your NodeDescription
- Section 9: How to test and use your node
- Download: The complete source code.
Section 1:
Overview of the Eclipse plugin concept
Although the core KNIME functionality does not depend on Eclipse, Eclipse is used as the workbench framework to provide a professional graphical user interface. For this reason all KNIME components are built up as so-called Eclipse plugins. In Eclipse everything is a plugin plugged into the extension points of other plugins. At its base, there is just a small runtime engine that executes the plugins and determines their dependencies. The required structure of an Eclipse plugin comprises a "plugin.xml" file, which contains the dependencies to other plugins and the extension points to which the plugin wants to connect. Furthermore, a plugin can provide its own extension points to which additional new plugins can be connected.
The KNIME workbench itself connects to several extension points of the Eclipse
workbench (e.g. the editor-, preferences page-, perspective-extension point,
etc.). KNIME itself also provides two extension points to which
external providers can contribute to the functionality of KNIME. These extension
points are the "Categories" extension point
(org.knime.workbench.repository.categories) and the "Nodes" extension
point (org.knime.workbench.repository.nodes). The "Categories" extension
point allows you to introduce new category folders displayed in the node repository by
simply adding an entry to the plugin.xml. The "Nodes" extension point enables you to
contribute a new functional node to the node repository that is connected to the
Java code by registering the corresponding NodeFactory in the plugin.xml.
Besides the plugin.xml file, an Eclipse plugin requires a so-called "Bundle Activator",
which inherits from the Eclipse class "Plugin". This is a housekeeping class that
contains practically no functionality in terms of KNIME extensions but is nevertheless required by Eclipse. The "Bundle Activator" is registered in the obligatory "MANIFEST.MF"
file located in the META-INF directory. Furthermore, the "MANIFEST.MF" file contains
information about the class path, required plugins, the vendor, which Java packages
should be made visible outside, etc.
The final file required is the "build.properties" file. It configures the way a plugin is exported (deployed). It defines the name of the jar file in which the classes of the plugin should be stored as well as the files to be included in deployment; it also enables you to define two build configurations: one that includes the source code and one that does not. A source folder (usually "src") is also provided that contains the Java sources of the corresponding plugin. All this infrastructure is automatically created by the extension wizard described below, so that a developer can immediately focus on the real problem and does not have to worry about the Eclipse infrastructure.
Section 2:
The first steps on how to create your own node
If you want to extend the functionality of KNIME you can implement your own Nodes and contribute them to the KNIME Node Repository. A node consists of four basic classes:
- The NodeModel: contains the main algorithm and administrates the data flow.
- The NodeDialog (optional): provides the means to configure the algorithm of the NodeModel.
- The NodeView (optional): displays information about the result of the NodeModel's algorithm.
- The NodeFactory: Bundles all the classes together.
In addition an XML file is mandatory (with exactly the same name as the factory). It specifies the name and type of the node, and describes the node, the configuration options, the in- and output, and the view(s) of the node.
So that the user does not have to create all files by hand, the KNIME platform provides a new-node-extension-wizard, which collects necessary information from the user and automatically creates a KNIME plugin.
In the KNIME perspective, select File->New->Other-> Konstanz Information Miner->Create a new KNIME Node-Extension. Continue by clicking the Next button. In the next screen enter the details about your node. Let's assume you want to write a numeric binner that assigns the data into equidistant bins according to the value of a specific attribute value. We will call that node NumericBinnerNode. Your new-node-extension-wizard would look similar to the one shown in Illustration 1:
Enter a project name under which you will find the project in the KNIME workflow navigator. The node class name is the name of your node. The necessary classes described above all start with this name (e.g. NumericBinnerNodeModel, NumericBinnerNodeDialog, etc.). In the Package name field specify the package name of the node. In the Node vendor field enter your name; it appears as the author name in the Javadoc. In the Node description text field provide a short description of your node. Finally specify the node type: this assigns a specific background color to the node in the KNIME workflow.
After clicking the Finish button the new-node-extension-wizard creates the project and all necessary directories and files required for a proper KNIME plugin as depicted in Illustration 2:
If the project contains errors concerning the Java build path, try the following: Select Project->Clean... from the menu. Select "Clean projects selected below" and select the newly created project, or select "Clean all projects". This fixes the errors in most cases.
The created classes contain only exemplary method stubs that can be completely deleted. You must add the desired functionality. The following sections explain how to do this step-by-step.
Section 3:
How to implement your own algorithm in a NodeModel with a NodeDialog
We start off by explaining a very simple binner. The bins are equally spaced, such that the whole range of a certain attribute is divided into n intervals. The data points with an attribute value within the k-th interval are considered to belong to the k-th bin. Therefore, the output is the original table with the binning information appended for each instance, i.e. row. The node also requires a dialog, as the user should be able to determine the number of bins and also specify the column on which the values should be binned.
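The binning rule described above can be sketched in plain Java, independent of the KNIME API. The class name EquidistantBins and its two methods are hypothetical helpers introduced only for illustration; the tutorial's own code works on DataCells and a list of interval upper bounds instead. The sketch assumes that the minimum and maximum of the selected column are already known.

```java
/**
 * Hypothetical helper (not part of the KNIME API): equal-width binning
 * on plain double values.
 */
final class EquidistantBins {

    private EquidistantBins() { }

    /** Returns the upper bound of each of the n equal-width intervals over [min, max]. */
    static double[] upperBounds(final double min, final double max, final int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Number of bins must be >= 1");
        }
        if (max < min) {
            throw new IllegalArgumentException("max must not be smaller than min");
        }
        final double width = (max - min) / n;
        final double[] bounds = new double[n];
        for (int k = 0; k < n; k++) {
            bounds[k] = min + (k + 1) * width;
        }
        // avoid rounding effects: the last interval ends exactly at max
        bounds[n - 1] = max;
        return bounds;
    }

    /**
     * Returns the index of the first interval whose upper bound is not
     * smaller than the value, or -1 if the value lies above all bounds.
     */
    static int binOf(final double value, final double[] upperBounds) {
        for (int k = 0; k < upperBounds.length; k++) {
            if (value <= upperBounds[k]) {
                return k;
            }
        }
        return -1;
    }
}
```

For the range 0 to 10 and five bins the upper bounds are 2, 4, 6, 8 and 10. A value of exactly 2.0 falls into bin 0, because the comparison is "less than or equal", while 2.1 falls into bin 1. Values below the minimum also end up in bin 0 in this sketch.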
NodeModel:
Before we start to implement the actual binning algorithm in the execute method, we have to define the fields we need in the NodeModel. (After creation the NodeModel already contains exemplary code, which can be deleted.) A convenient way to exchange the settings between the NodeModel and the NodeDialog is provided by the SettingsModel. As you will see later on, the NodeDialog also works with SettingsModels, which is why we use them for the number of bins and the column on which the values should be binned:
// the settings model for the number of bins
private final SettingsModelIntegerBounded m_numberOfBins =
new SettingsModelIntegerBounded(NumericBinnerNodeModel.CFGKEY_NR_OF_BINS,
NumericBinnerNodeModel.DEFAULT_NR_OF_BINS,
1, Integer.MAX_VALUE);
// the settings model storing the column to bin
private final SettingsModelString m_column = new SettingsModelString(
NumericBinnerNodeModel.CFGKEY_COLUMN_NAME, "");
In order to obtain the settings from the dialog, they must be written into a NodeSettings object. The NodeSettings transfer the settings from the dialog to the model and vice versa. A key is needed for each field to identify and retrieve it from the NodeSettings. It is good practice to define the static final string used as the key in the NodeModel.
/** The config key for the number of bins. */
public static final String CFGKEY_NR_OF_BINS = "numberOfbins";
/** The config key for the selected column. */
public static final String CFGKEY_COLUMN_NAME = "columnName";
Transfer of the settings from the NodeModel to the NodeDialog is realized by implementing the validateSettings, loadValidatedSettingsFrom and saveSettingsTo methods. All these methods can be safely delegated to the SettingsModels. In the validateSettings method a check is made to see if the values are present and valid (for example, within a valid range).
/**
* @see org.knime.core.node.NodeModel
* #validateSettings(org.knime.core.node.NodeSettingsRO)
*/
@Override
protected void validateSettings(final NodeSettingsRO settings)
throws InvalidSettingsException {
// delegate this to the settings models
m_numberOfBins.validateSettings(settings);
m_column.validateSettings(settings);
}
When the loadValidatedSettingsFrom method is called, the settings are already validated and can be loaded into the local fields, which in this case are the SettingsModels for the number of bins and the selected column.
/**
* @see org.knime.core.node.NodeModel
* #loadValidatedSettingsFrom(org.knime.core.node.NodeSettingsRO)
*/
@Override
protected void loadValidatedSettingsFrom(final NodeSettingsRO settings)
throws InvalidSettingsException {
// loads the values from the settings into the models.
// It can be safely assumed that the settings were already validated
// by the validateSettings method.
m_numberOfBins.loadSettingsFrom(settings);
m_column.loadSettingsFrom(settings);
}
In the saveSettingsTo method the local fields are written into the settings so that the dialog displays the current values.
/**
* @see org.knime.core.node.NodeModel
* #saveSettingsTo(org.knime.core.node.NodeSettingsWO)
*/
@Override
protected void saveSettingsTo(final NodeSettingsWO settings) {
// save settings to the config object.
m_numberOfBins.saveSettingsTo(settings);
m_column.saveSettingsTo(settings);
}
The methods described above are only one step in checking whether the node is executable with the current settings. It is also very important to check whether it can work with the incoming data table. This is accomplished by the configure method. The configure method is executed as soon as the inport has been connected. In the small example of our numeric binner, a check is performed to see if at least one numeric column is available and if the incoming data table contains a column with the selected column name. Otherwise the node is not executable. The DataTableSpec contains the required information and is passed to the configure method.
/**
* @see org.knime.core.node.NodeModel
* #configure(org.knime.core.data.DataTableSpec[])
*/
@Override
protected DataTableSpec[] configure(final DataTableSpec[] inSpecs)
throws InvalidSettingsException {
// first of all validate the incoming data table spec
boolean hasNumericColumn = false;
boolean containsName = false;
for (int i = 0; i < inSpecs[IN_PORT].getNumColumns(); i++) {
DataColumnSpec columnSpec = inSpecs[IN_PORT].getColumnSpec(i);
// we can only work with it, if it contains at least one
// numeric column
if (columnSpec.getType().isCompatible(DoubleValue.class)) {
// found one numeric column
hasNumericColumn = true;
}
// and if the column name is set it must be contained in the data
// table spec
if (m_column != null
&& columnSpec.getName().equals(m_column.getStringValue())) {
containsName = true;
}
}
if (!hasNumericColumn) {
throw new InvalidSettingsException("Input table must contain at "
+ "least one numeric column");
}
if (!containsName) {
throw new InvalidSettingsException("Input table does not contain the "
+ "column " + m_column.getStringValue() + ". Please (re-)configure "
+ "the node.");
}
// so far the input is checked and the algorithm can work with the
// incoming data
...
Just as we rely on the incoming specification of the data, the successor nodes also require information about the data format, which is provided after execution. For this reason, a specification for the output of our node must also be created in the configure method.
...
// now produce the output table spec,
// i.e. specify the output of this node
DataColumnSpec newColumnSpec = createOutputColumnSpec();
// and the DataTableSpec for the appended part
DataTableSpec appendedSpec = new DataTableSpec(newColumnSpec);
// since it is only appended the new output spec contains both:
// the original spec and the appended one
DataTableSpec outputSpec = new DataTableSpec(inSpecs[IN_PORT],
appendedSpec);
return new DataTableSpec[]{outputSpec};
...
Since a DataColumnSpec must be created for the newly appended column in both the configure and the execute method, the code for the creation of the DataColumnSpec is extracted in a separate method:
private DataColumnSpec createOutputColumnSpec() {
// we want to add a column with the number of the bin
DataColumnSpecCreator colSpecCreator = new DataColumnSpecCreator(
"Bin Number", IntCell.TYPE);
// if we know the number of bins we also know the number of possible
// values of that new column
DataColumnDomainCreator domainCreator = new DataColumnDomainCreator(
new IntCell(0), new IntCell(m_numberOfBins.getIntValue() - 1));
// and can add this domain information to the output spec
colSpecCreator.setDomain(domainCreator.createDomain());
// now the column spec can be created
DataColumnSpec newColumnSpec = colSpecCreator.createSpec();
return newColumnSpec;
}
Once this has been completed and implemented, the actual algorithm for equidistant binning can be written. The algorithm operating on the data must be placed in the execute method. In this example only one column is appended to the original data. For this purpose the so-called ColumnRearranger is used. It requires a CellFactory, which returns the appended cells for a given row.
...
// instantiate the cell factory
CellFactory cellFactory = new NumericBinnerCellFactory(
createOutputColumnSpec(), splitPoints, colIndex);
// create the column rearranger
ColumnRearranger outputTable = new ColumnRearranger(
inData[IN_PORT].getDataTableSpec());
// append the new column
outputTable.append(cellFactory);
...
Having created the ColumnRearranger, it can be transferred together with the input table to the ExecutionContext to create a BufferedDataTable which is returned by the execute method, i.e. provided at the outport. Each node buffers the data in a BufferedDataTable. In order to avoid redundant buffering of the same data the ColumnRearranger is used. In this way only the appended column is buffered in our node. That is why we have to retrieve the BufferedDataTable from the ExecutionContext:
...
// and create the actual output table
BufferedDataTable bufferedOutput = exec.createColumnRearrangeTable(
inData[IN_PORT], outputTable, exec);
// return it
return new BufferedDataTable[]{bufferedOutput};
...
For purposes of the CellFactory it is necessary to implement a NumericBinnerCellFactory. This extends the SingleCellFactory and only implements the getCell method. The passed row is checked to find out which bin contains the value from the selected column. It returns the number of the bin as a DataCell.
/**
* @see org.knime.core.data.container.SingleCellFactory#getCell(
* org.knime.core.data.DataRow)
*/
@Override
public DataCell getCell(DataRow row) {
DataCell currCell = row.getCell(m_colIndex);
// check the cell for missing value
if (currCell.isMissing()) {
return DataType.getMissingCell();
}
double currValue = ((DoubleValue)currCell).getDoubleValue();
int binNr = 0;
for (Double intervalBound : m_intervalUpperBounds) {
if (currValue <= intervalBound) {
return new IntCell(binNr);
}
binNr++;
}
return DataType.getMissingCell();
}
NodeDialog:
When the NumericBinnerNodeDialog is created you will see that the constructor already contains some exemplary code. You may delete it and add the code for your desired control elements instead. For the NumericBinnerNodeDialog we need two GUI elements: one to set the number of bins and one to select the column for the binning. The KNIME framework provides a very convenient way to add standard dialog elements to the NodeDialog: by default, your NumericBinnerNodeDialog extends the DefaultNodeSettingsPane. If the default dialog components do not suit your needs, for example if some components should be enabled or disabled depending on the user's settings, you may extend the NodeDialogPane directly. In our case a DialogComponentNumber for the number of bins and a DialogComponentColumnNameSelection for the column need to be added. Each component's constructor requires a new instance of a SettingsModel. The SettingsModel expects a string identifier, which it uses to store and load the value of the component, and a default value, which it holds until a new value is loaded. Additional parameters are necessary, depending on the type of component. The loading from and saving to the settings is executed automatically via the key passed in the constructor. We recommend using the key defined in the NodeModel; for this, the key must be declared public there.
public class NumericBinnerNodeDialog extends DefaultNodeSettingsPane {
/**
* New pane for configuring NumericBinner node dialog.
* Contains control elements to adjust the number of bins
* and to select the column to bin.
* Suppress warnings here: this is unavoidable since the
* allowed types are passed as a generic array.
*/
@SuppressWarnings ("unchecked")
protected NumericBinnerNodeDialog() {
super();
// nr of bins control element
addDialogComponent(new DialogComponentNumber(
new SettingsModelIntegerBounded(
NumericBinnerNodeModel.CFGKEY_NR_OF_BINS,
NumericBinnerNodeModel.DEFAULT_NR_OF_BINS,
1, Integer.MAX_VALUE),
"Number of bins:", /*step*/ 1));
// column to bin
addDialogComponent(new DialogComponentColumnNameSelection(
new SettingsModelString(
NumericBinnerNodeModel.CFGKEY_COLUMN_NAME,
"Select a column"),
"Select the column to bin",
NumericBinnerNodeModel.IN_PORT,
DoubleValue.class));
}
}
After you have created your node and have implemented the NodeModel and the NodeDialog, don't forget to edit your node description in the XML file (with exactly the same name as your NodeFactory). Describe your node, the dialog settings, the in- and outports and, later on, the view. This is explained in detail in Section 8.
Section 4:
How to implement a NodeView for your Node
In this section a NodeView for our node is implemented. In order to display information about the work of this node we display the bins in a histogram, where the height of each bin indicates the number of rows in this bin.
Internal Representation:
A model is required to represent the outcome of our algorithm. Obviously a model representing a bin is appropriate. A bin contains a number of rows and is graphically represented by a rectangle. Therefore, we create this data structure.
public class NumericBin {
private final Set<DataCell> m_containedRowIds;
/**
* Creates an empty bin.
*/
public NumericBin() {
m_containedRowIds = new HashSet<DataCell>();
}
/**
* Adds another row to this bin.
* @param rowId the row to add to this bin.
*/
public void addRowToBin(final DataCell rowId) {
m_containedRowIds.add(rowId);
}
/**
*
* @return the number of rows in this bin.
*/
public int getSize() {
return m_containedRowIds.size();
}
To obtain the bins filled with the corresponding row IDs, an array of empty bins must be passed to the NumericBinnerCellFactory, which adds each row to the matching bin in the getCell method:
/**
* @see org.knime.core.data.container.SingleCellFactory#getCell(
* org.knime.core.data.DataRow)
*/
@Override
public DataCell getCell(DataRow row) {
DataCell currCell = row.getCell(m_colIndex);
...
int binNr = 0;
for (Double intervalBound : m_intervalUpperBounds) {
if (currValue <= intervalBound) {
m_bins[binNr].addRowToBin(row.getKey().getId());
return new IntCell(binNr);
}
binNr++;
}
...
Drawing Component:
Before we can start to implement the NodeView we have to implement a component that actually draws our bins. For this purpose we create our own JPanel using a quite simple paint method:
public class NumericBinnerViewPanel extends JPanel {
...
/**
* @see javax.swing.JComponent#paint(java.awt.Graphics)
*/
@Override
public void paint(Graphics g) {
super.paint(g);
if (m_bins != null && m_bins.length > 0) {
int maxNr = 0;
// determine the largest bin
for (int i = 0; i < m_bins.length; i++) {
maxNr = Math.max(m_bins[i].getSize(), maxNr);
}
// if no size information available (creation) set default size
int width = getWidth();
if (width == 0) {
width = SIZE;
}
int height = getHeight();
if (height == 0) {
height = SIZE;
}
// calculate the bin width
int binWidth = width / m_bins.length;
for (int i = 0; i < m_bins.length; i++) {
// the left side of the rectangle
int x = i * binWidth;
// the height of the bin
int binHeight = height;
// the larger the bin the higher the rect
double sizeFactor = ((double)(maxNr - m_bins[i].getSize())
/ (double)maxNr);
// since y-axis starts on top subtract
binHeight -= sizeFactor * height;
Rectangle rect = new Rectangle(x, height - binHeight, binWidth,
binHeight);
m_bins[i].setViewRepresentation(rect);
// draw the bin in black
g.setColor(Color.BLACK);
g.fillRect(rect.x, rect.y, rect.width, rect.height);
// draw a border in white to make the bins distinguishable
g.setColor(Color.WHITE);
g.drawRect(rect.x, rect.y, rect.width, rect.height);
}
}
}
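The geometry used by the paint method can also be looked at in isolation. The following is a minimal sketch, assuming only the bin sizes and the panel dimensions are known; HistogramLayout and its layout method are hypothetical names that do not appear in the tutorial's source code. Unlike the listing above, the sketch rounds the bar height instead of truncating it and treats the case that all bins are empty explicitly.

```java
import java.awt.Rectangle;

/** Hypothetical helper: computes one bar rectangle per bin for a histogram. */
final class HistogramLayout {

    private HistogramLayout() { }

    static Rectangle[] layout(final int[] binSizes, final int width, final int height) {
        final Rectangle[] bars = new Rectangle[binSizes.length];
        if (binSizes.length == 0) {
            return bars;
        }
        // determine the largest bin, it gets the full panel height
        int maxSize = 0;
        for (int size : binSizes) {
            maxSize = Math.max(maxSize, size);
        }
        final int barWidth = width / binSizes.length;
        for (int i = 0; i < binSizes.length; i++) {
            // scale relative to the largest bin; if all bins are empty the bars have height 0
            final int barHeight = maxSize == 0 ? 0
                : (int) Math.round((double) binSizes[i] / maxSize * height);
            // the y-axis starts at the top, so the bar begins at height - barHeight
            bars[i] = new Rectangle(i * barWidth, height - barHeight, barWidth, barHeight);
        }
        return bars;
    }
}
```

For bin sizes 2, 4 and 1 in a panel of 300 x 200 pixels, each bar is 100 pixels wide and the bar heights are 100, 200 and 50 pixels, so the bars start at y = 100, y = 0 and y = 150.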
NodeView:
Implementation of the NodeView is very simple. In the constructor we create the drawing component (our NumericBinnerViewPanel) using the bins retrieved from the model. We set it as our view content via the setComponent method:
public class NumericBinnerNodeView extends NodeView {
// panel which actually paints the bins
private NumericBinnerViewPanel m_panel;
/**
* Creates a new view.
*
* @param nodeModel the model class: {@link NumericBinnerNodeModel}
*/
protected NumericBinnerNodeView(final NodeModel nodeModel) {
super(nodeModel);
// get the bins
NumericBin[] bins = ((NumericBinnerNodeModel)getNodeModel())
.getBinRepresentations();
if (bins != null && bins.length >0) {
// create the panel that draws the bins
m_panel = new NumericBinnerViewPanel(bins);
}
// sets the view content in the node view
setComponent(m_panel);
}
Note that the panel might be null. In this case the view displays, by default, a message that no data is available. Another situation is that the view is open and the model changes (due to a different input or a different configuration). The view then has to be updated. This is realized by the modelChanged method in the NodeView:
/**
* @see org.knime.core.node.NodeView#modelChanged()
*/
protected void modelChanged() {
// if the model had changed get the new bins
NumericBin[] bins = ((NumericBinnerNodeModel)getNodeModel())
.getBinRepresentations();
if (bins != null && bins.length > 0 && m_panel != null) {
// and paint the bins
((NumericBinnerViewPanel)m_panel).updateView(bins);
}
}
The NumericBinnerViewPanel also needs an update method, as follows:
/**
* If the view is updated the new bins are set and then painted.
*
* @param bins the new bins to display.
*/
public void updateView(final NumericBin[] bins) {
m_bins = bins;
repaint();
}
Section 5:
How to save and load your internal representation
Now try the following: execute the node, save, close and re-open the workflow. Your node is still executed but when you open the view no data is available. This is because your internal representation, i.e. the bins, is not saved automatically. To ensure that your internal representation can be stored and loaded, the KNIME framework provides two methods in the NodeModel: loadInternals and saveInternals. The following explains how to implement these methods for our numeric binner node. There are three ways to store the internal model:
- Create a ModelContent object. To the ModelContent you can add basic Java types such as int, double, String, etc. DataCells and sub-configs can also be stored. Subsequently, the ModelContent can be written to XML with saveToXML and loaded with the static method loadFromXML.
- If you have a DataTable, DataArray or similar as your internal model you can use the convenient static method:
DataContainer.writeToZip(DataTable yourTable, File zipFile);
- If neither of the above-mentioned methods fits, you can serialize your internal model and write it to a file.
Since the serialization of objects is slow and prone to errors, we use the ModelContent approach to store our internal model. Since a NumericBin is not a basic type, it cannot be stored directly in a ModelContent. However, each NumericBin consists only of DataCells (the row IDs), which a ModelContent can store. Moreover, it is enough to store the contained row IDs, because the visual representation can be restored from this information: the size of a drawn bin depends only on the width and height of the panel (which are evaluated in the paint method) and the number of contained rows. Each NumericBin therefore gets two methods to save itself to and load itself from a ModelContent; the NodeModel then stores these per-bin ModelContents inside its main ModelContent:
/**
* Adds the IDs of the contained rows to the settings. This is sufficient in order
* to later on restore the visual representation, since that only depends on
* the dimension of the panel and the number of contained rows per bin.
*
* @param modelContent the model content object to save to.
*/
public void saveTo(final ModelContentWO modelContent) {
DataCell[] cellArray = new DataCell[m_containedRowIds.size()];
m_containedRowIds.toArray(cellArray);
modelContent.addDataCellArray(CFG_KEY_CELLS, cellArray);
}
/**
* Loads the contained row IDs.
*
* @param modelContent the model content object to load from.
* @throws InvalidSettingsException if the row IDs cannot be read.
*/
public void loadFrom(final ModelContentRO modelContent)
throws InvalidSettingsException{
DataCell[] cellArray = modelContent.getDataCellArray(CFG_KEY_CELLS);
m_containedRowIds.addAll(Arrays.asList(cellArray));
}
Again, we need an internal key to identify the stored field, in this case CFG_KEY_CELLS. It only has to be unique in this class, as each object receives its own ModelContent object. Now we can save and load our bins in the NodeModel's saveInternals and loadInternals methods. In saveInternals we get the directory in which to store our files. We create a new ModelContent object and for each bin we create a sub model content which is passed to the bin. The bin itself writes the necessary information into the sub model content. Afterwards we create a new file in the given directory, create an output stream and let the main model content write itself to XML.
/**
*
* @see org.knime.core.node.NodeModel#saveInternals(java.io.File,
* org.knime.core.node.ExecutionMonitor)
*/
protected void saveInternals(final File internDir,
final ExecutionMonitor exec) throws IOException,
CanceledExecutionException {
// create the main model content
ModelContent modelContent = new ModelContent(INTERNAL_MODEL);
for (int i = 0; i < m_bins.length; i++) {
// for each bin create a sub model content
ModelContentWO subContent = modelContent.addModelContent(
NUMERIC_BIN + i);
// save the bin to the sub model content
m_bins[i].saveTo(subContent);
}
// now all bins are stored to the model content
// but the model content must be written to XML
// internDir is the directory for this node
File file = new File(internDir, FILE_NAME);
FileOutputStream fos = new FileOutputStream(file);
modelContent.saveToXML(fos);
}
The loading of the internal model works accordingly. Create the file and an input stream and let the main model content load from the XML file. Then fetch the sub model content for every bin and let each bin load itself from this sub model content. Add the bin to your field.
/**
*
* @see org.knime.core.node.NodeModel#loadInternals(java.io.File,
* org.knime.core.node.ExecutionMonitor)
*/
protected void loadInternals(final File internDir,
final ExecutionMonitor exec) throws IOException,
CanceledExecutionException {
m_bins = new NumericBin[m_numberOfBins.getIntValue()];
File file = new File(internDir, FILE_NAME);
FileInputStream fis = new FileInputStream(file);
ModelContentRO modelContent = ModelContent.loadFromXML(fis);
try {
for (int i = 0; i < m_numberOfBins.getIntValue(); i++) {
NumericBin bin = new NumericBin();
ModelContentRO subModelContent = modelContent
.getModelContent(NUMERIC_BIN + i);
bin.loadFrom(subModelContent);
m_bins[i] = bin;
}
} catch (InvalidSettingsException e) {
throw new IOException(e.getMessage());
}
}
When you now execute, save, close and re-open the workflow again, you will see that the view displays the desired information.
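The save and load round trip described in this section follows a simple pattern: one numbered entry per bin, written to XML and read back in the same order. The following sketch mimics this pattern with java.util.Properties as a stand-in for ModelContent, so that it runs without KNIME on the class path. BinStoreSketch, its key names and the use of plain String row IDs are assumptions made for this illustration only and are not part of the tutorial's source code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical stand-in for the ModelContent round trip, based on java.util.Properties. */
final class BinStoreSketch {

    private BinStoreSketch() { }

    /** Writes the row IDs of all bins to an XML document. */
    static byte[] toXml(final List<List<String>> bins) {
        final Properties props = new Properties();
        props.setProperty("binCount", Integer.toString(bins.size()));
        for (int i = 0; i < bins.size(); i++) {
            final List<String> rowIds = bins.get(i);
            // one "sub content" per bin, identified by the bin index
            props.setProperty("bin." + i + ".size", Integer.toString(rowIds.size()));
            for (int j = 0; j < rowIds.size(); j++) {
                props.setProperty("bin." + i + ".row." + j, rowIds.get(j));
            }
        }
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            props.storeToXML(out, "numeric binner internals (sketch)");
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Restores the bins, in their original order, from the XML document. */
    static List<List<String>> fromXml(final byte[] xml) {
        final Properties props = new Properties();
        try (ByteArrayInputStream in = new ByteArrayInputStream(xml)) {
            props.loadFromXML(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        final int binCount = Integer.parseInt(props.getProperty("binCount", "0"));
        final List<List<String>> bins = new ArrayList<>();
        for (int i = 0; i < binCount; i++) {
            final int size = Integer.parseInt(props.getProperty("bin." + i + ".size", "0"));
            final List<String> rowIds = new ArrayList<>();
            for (int j = 0; j < size; j++) {
                rowIds.add(props.getProperty("bin." + i + ".row." + j));
            }
            bins.add(rowIds);
        }
        return bins;
    }
}
```

In this sketch the number of bins is stored alongside the bins themselves, whereas the loadInternals method above takes it from the node settings.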
Section 6:
How to implement HiLiting
In the KNIME framework the technique known as linking and brushing is called HiLiting. This means that whenever a datapoint is hilited in one view it is immediately also hilited in all other views displaying this datapoint. If we had a view that displayed the datapoints directly we would have to implement the HiLiteListener interface to be informed about any change in the hiliting. The HiLiteListener interface has three methods:
/**
* Invoked when some item(s) were hilit.
*
* @param event contains a list of row keys that were hilit
*/
void hiLite(final KeyEvent event);
/**
* Invoked when some item(s) were unhilit.
*
* @param event contains a list of row keys that were unhilit
*/
void unHiLite(final KeyEvent event);
/**
* Invoked, when everything (all rows) are unhilit.
*/
void unHiLiteAll();
But since we have an aggregated view of the datapoints it does not make much sense to implement the HiLiteListener interface. We would rather hilite a bin and see all the datapoints in that bin hilited in other views. In the following we explain how this is implemented. First of all we have to prepare our NumericBin to know when it has been selected and if it is hilited. Therefore we simply introduce two flags to indicate the status:
/**
* @param isHilite sets the hilite status of this bin.
*/
public void setHilited(final boolean isHilite) {
m_isHilite = isHilite;
}
/**
*
* @return true if this bin contains hilited keys, false otherwise.
*/
public boolean isHilited() {
return m_isHilite;
}
/**
*
* @return true if this bin is selected. false otherwise.
*/
public boolean isSelected() {
return m_isSelected;
}
/**
*
* @param selected true, if the bin is selected, false otherwise.
*/
public void setSelected(final boolean selected) {
m_isSelected = selected;
}
The next step is to listen to the mouse events to be informed about whether a bin is selected or not. For this, the bin has to know its graphical representation, i.e. the painted rectangle (otherwise we cannot know whether it was clicked or not):
/**
*
* @return the graphical representation as a rectangle.
*/
public Rectangle getViewRepresentation() {
return m_viewRepresentation;
}
/**
* The graphical representation can only be calculated outside with the
* knowledge of the number of bins, the maximal and minimal size
* and the available width and height. This is done in the
* {@link NumericBinnerViewPanel#paint(java.awt.Graphics)}
*
* @param rectangle the graphical representation
*/
public void setViewRepresentation(final Rectangle rectangle) {
m_viewRepresentation = rectangle;
}
In order to listen to mouse events we have to add a MouseListener to the drawing component in the NodeView's constructor. The selected bins are stored in a local data structure, m_selected:
...
m_selected = new HashSet<NumericBin>();
m_panel.addMouseListener(new MouseAdapter() {
/**
* @see java.awt.event.MouseAdapter#mouseReleased(
* java.awt.event.MouseEvent)
*/
@Override
public void mouseReleased(final MouseEvent e) {
if (!e.isControlDown()) {
m_selected.clear();
for (NumericBin bin : m_panel.getBins()) {
bin.setSelected(false);
}
}
for (NumericBin bin : m_panel.getBins()) {
if (bin.getViewRepresentation() != null &&
bin.getViewRepresentation().contains(
e.getX(), e.getY())){
bin.setSelected(true);
m_selected.add(bin);
break;
}
}
...
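The selection logic of the listener can also be looked at in isolation, without any Swing wiring. The following is only a minimal sketch under stated assumptions: SketchBin is a hypothetical, stripped-down stand-in for the tutorial's NumericBin, and SelectionSketch.click is a hypothetical helper that receives the mouse coordinates and the state of the control key as plain parameters.

```java
import java.awt.Rectangle;
import java.util.List;
import java.util.Set;

/** Hypothetical, stripped-down stand-in for the tutorial's NumericBin. */
class SketchBin {
    private Rectangle m_viewRepresentation;
    private boolean m_isSelected;

    Rectangle getViewRepresentation() {
        return m_viewRepresentation;
    }

    void setViewRepresentation(final Rectangle rectangle) {
        m_viewRepresentation = rectangle;
    }

    boolean isSelected() {
        return m_isSelected;
    }

    void setSelected(final boolean selected) {
        m_isSelected = selected;
    }
}

/** The selection step of the mouse listener as a plain static method. */
class SelectionSketch {
    static void click(final List<SketchBin> bins,
            final Set<SketchBin> selected, final int x, final int y,
            final boolean controlDown) {
        // without the control key a click starts a new selection
        if (!controlDown) {
            selected.clear();
            for (SketchBin bin : bins) {
                bin.setSelected(false);
            }
        }
        // select the first bin whose painted rectangle contains the point
        for (SketchBin bin : bins) {
            Rectangle rect = bin.getViewRepresentation();
            if (rect != null && rect.contains(x, y)) {
                bin.setSelected(true);
                selected.add(bin);
                break;
            }
        }
    }
}
```

A click outside all rectangles without the control key therefore clears the selection, while a click with the control key held down extends it.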
So far we are able to select one or more bins. If we want to hilite them we have to add a menu to the NodeView, to enable us to hilite or unhilite the selected bins, or clear the hiliting.
// create the hilite menu
// the HiliteHandler provides standard names
m_hilite = new JMenuItem(HiLiteHandler.HILITE_SELECTED);
m_hilite.setEnabled(false);
m_hilite.addActionListener(new ActionListener() {
/**
* @see java.awt.event.ActionListener#actionPerformed(
* java.awt.event.ActionEvent)
*/
public void actionPerformed(final ActionEvent e) {
Set<DataCell> toBeHilited = new HashSet<DataCell>();
for (NumericBin bin : m_selected) {
// store all row ids from the selected bin
toBeHilited.addAll(bin.getContainedRowIds());
// set the bin hilited
bin.setHilited(true);
// count the number of hilited bins for a
// correct menu display (see below)
m_numberOfHilitedBins++;
}
// now get the hilite handler and hilite the rows
getNodeModel().getInHiLiteHandler(
NumericBinnerNodeModel.IN_PORT).fireHiLiteEvent(toBeHilited);
// and repaint to have the hilited bins displayed correctly
m_panel.repaint();
}
});
m_unhilite = new JMenuItem(HiLiteHandler.UNHILITE_SELECTED);
m_unhilite.setEnabled(false);
m_unhilite.addActionListener(new ActionListener() {
/**
* @see java.awt.event.ActionListener#actionPerformed(
* java.awt.event.ActionEvent)
*/
public void actionPerformed(final ActionEvent e) {
Set<DataCell> toBeUnhilited = new HashSet<DataCell>();
for (NumericBin bin : m_selected) {
// store all row ids that should be unhilited
toBeUnhilited.addAll(bin.getContainedRowIds());
// unhilite the bin
bin.setHilited(false);
// decrease the number of hilited bins
m_numberOfHilitedBins--;
}
// get the hilite handler and unhilite the rows
getNodeModel().getInHiLiteHandler(
NumericBinnerNodeModel.IN_PORT).fireUnHiLiteEvent(toBeUnhilited);
// repaint to have the bins displayed correctly
m_panel.repaint();
}
});
JMenuItem clear = new JMenuItem(HiLiteHandler.CLEAR_HILITE);
clear.addActionListener(new ActionListener() {
/**
* @see java.awt.event.ActionListener#actionPerformed(
* java.awt.event.ActionEvent)
*/
public void actionPerformed(final ActionEvent e) {
// get the hilite handler and unhilite all
getNodeModel().getInHiLiteHandler(
NumericBinnerNodeModel.IN_PORT).fireClearHiLiteEvent();
// unhilite all bins
for (NumericBin bin : m_panel.getBins()) {
bin.setHilited(false);
}
// no bin is hilited anymore
m_numberOfHilitedBins = 0;
// repaint to display the bins correctly
m_panel.repaint();
}
});
// create the menu and all the menu items to it
JMenu menu = new JMenu(HiLiteHandler.HILITE);
menu.add(m_hilite);
menu.add(m_unhilite);
menu.add(clear);
// get the JMenu bar of the NodeView and add this menu to it
getJMenuBar().add(menu);
...
The HiLiteHandler provides standard names for the menu items. The getJMenuBar method returns the menu bar of the NodeView, to which the additional menu can be added. To further improve our small human-computer interface we can enable and disable the menu items depending on the current selection and hilite status: the hilite menu entry should only be enabled when some bins are selected, and accordingly the unhilite menu entry should only be enabled when some bins are selected and hilited. We add this functionality to the MouseListener. (This is also the reason why the two menu items are fields of the view rather than local variables.)
public void mouseReleased(final MouseEvent e) {
...
// update the hilite menu
if (m_selected.size() > 0) {
m_hilite.setEnabled(true);
} else {
m_hilite.setEnabled(false);
}
if (m_numberOfHilitedBins > 0 && m_selected.size() > 0) {
m_unhilite.setEnabled(true);
} else {
m_unhilite.setEnabled(false);
}
m_panel.repaint();
}
});
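Note that the counter m_numberOfHilitedBins is incremented for every selected bin each time the hilite action is triggered, so it can drift if the same bin is hilited twice. As a variation (not the approach used in this tutorial), the menu state could be derived from the bins themselves. The following sketch assumes a hypothetical minimal interface HilitableBin and uses a slightly stricter rule for the unhilite entry: it is enabled only if at least one selected bin is hilited.

```java
import java.util.Collection;

/** Derives the menu state from the selected bins instead of a counter. */
class MenuStateSketch {

    /** Hypothetical minimal view of a bin: only the hilite flag is needed. */
    interface HilitableBin {
        boolean isHilited();
    }

    /** The hilite entry is enabled as soon as any bin is selected. */
    static boolean hiliteEnabled(
            final Collection<? extends HilitableBin> selected) {
        return !selected.isEmpty();
    }

    /** The unhilite entry is enabled if a selected bin is hilited. */
    static boolean unhiliteEnabled(
            final Collection<? extends HilitableBin> selected) {
        for (HilitableBin bin : selected) {
            if (bin.isHilited()) {
                return true;
            }
        }
        return false;
    }
}
```

In the listener the results would simply be passed to setEnabled of the two menu items.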
So far we are able to select and hilite (unhilite) the bins and the contained rows. But if you run the code implemented so far you will immediately notice that our view does not distinguish between selected, hilited and normal bins. Thus, we have to add this to the paint method of the drawing component, the NumericBinnerViewPanel (the code also shows how the graphical rectangle of each bin is updated in every paint):
...
Rectangle rect = new Rectangle(x, height - binHeight, binWidth,
binHeight);
m_bins[i].setViewRepresentation(rect);
// choose the fill color from the bin's status; the white border
// drawn afterwards makes the bins distinguishable
Color color = Color.BLACK;
if (m_bins[i].isHilited()) {
color = ColorAttr.HILITE;
}
if (m_bins[i].isSelected()) {
color = ColorAttr.SELECTED;
}
if (m_bins[i].isHilited() && m_bins[i].isSelected()) {
color = ColorAttr.SELECTED_HILITE;
}
Graphics2D g2 = (Graphics2D)g;
g2.setColor(color);
g2.fillRect(rect.x+2, rect.y+2, rect.width-2, rect.height-2);
g2.setColor(Color.WHITE);
g2.setStroke(new BasicStroke(2));
g2.drawRect(rect.x, rect.y, rect.width, rect.height);
...
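The three consecutive if statements mean that later checks overwrite earlier ones: "selected and hilited" wins over "selected", which wins over "hilited". This precedence can be written as one small function. The sketch below is an illustration only; the color constants are placeholders chosen for this example and are not the values of ColorAttr.

```java
import java.awt.Color;

/** Fill color of a bin as a function of its two status flags. */
class BinColorSketch {
    // placeholder colors for this sketch, NOT the KNIME ColorAttr values
    static final Color NORMAL = Color.BLACK;
    static final Color HILITE = Color.ORANGE;
    static final Color SELECTED = Color.BLUE;
    static final Color SELECTED_HILITE = Color.RED;

    static Color colorFor(final boolean hilited, final boolean selected) {
        if (hilited && selected) {
            return SELECTED_HILITE;
        }
        if (selected) {
            return SELECTED;
        }
        if (hilited) {
            return HILITE;
        }
        return NORMAL;
    }
}
```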
Now the view looks good and is correctly displayed if a bin is selected, hilited, both, or none. We use the KNIME standard colors defined in the ColorAttr to have a uniform coloring in all views.
Section 7:
How to save and load an external model
Suppose you want to implement a learner that learns a certain model from the data, and subsequently a predictor node that uses the learned model to classify new data. To accomplish this, you need a facility to pass the learned model from the learner to the predictor. The KNIME framework provides this functionality with the ModelPort concept. To roughly explain this concept we will now create a model for our numeric binner and write the interval bounds of every bin into it. For this purpose it is necessary to override the saveModelContent method of the NodeModel. If we had a node with a ModelInport we would also have to override the loadModelContent method in order to be able to load the model when it is connected to the inport. Although it is possible to write your model directly to the ModelContent object, it is good practice and highly recommended to have an object that represents the external model of your node and which is solely responsible for loading from and saving to the ModelContent. In our case we create a class NumericBinModel. It is then very easy to share your model with other nodes. To increase the usability of the model we store the lower and upper bound as an interval for every bin. Since we only have a ModelOutport we only have to implement the saveTo method, as can be seen below:
/**
* Saves this model to the model content
* @param modelContent the model content to save to.
*/
public void saveTo(final ModelContentWO modelContent) {
modelContent.addInt(NUMBER_OF_BINS, m_intervals.size());
int intervalNr = 0;
for (Interval interval : m_intervals) {
ModelContentWO intervalModel = modelContent.addModelContent(
INTERVAL + intervalNr++);
intervalModel.addDouble(LOWER_BOUND, interval.getLowerBound());
intervalModel.addDouble(UPPER_BOUND, interval.getUpperBound());
}
}
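To make the resulting key layout easier to picture, the following sketch mimics it with nested maps: one entry for the number of bins and one sub-content per interval holding the two bounds. Everything here is an assumption for illustration. The maps are only a stand-in for the ModelContent object, the values of the key constants are invented, and the load method merely shows what a reading counterpart would have to do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Nested maps as a hypothetical stand-in for the ModelContent. */
class ModelLayoutSketch {
    // the key values are assumptions; the tutorial does not show them
    static final String NUMBER_OF_BINS = "numberOfBins";
    static final String INTERVAL = "interval";
    static final String LOWER_BOUND = "lowerBound";
    static final String UPPER_BOUND = "upperBound";

    /** Each interval is given as {lowerBound, upperBound}. */
    static Map<String, Object> save(final List<double[]> intervals) {
        Map<String, Object> content = new LinkedHashMap<>();
        content.put(NUMBER_OF_BINS, intervals.size());
        int intervalNr = 0;
        for (double[] interval : intervals) {
            // one sub-content per interval: "interval0", "interval1", ...
            Map<String, Object> sub = new LinkedHashMap<>();
            sub.put(LOWER_BOUND, interval[0]);
            sub.put(UPPER_BOUND, interval[1]);
            content.put(INTERVAL + intervalNr++, sub);
        }
        return content;
    }

    /** Reads the intervals back in the order they were written. */
    @SuppressWarnings("unchecked")
    static List<double[]> load(final Map<String, Object> content) {
        int numberOfBins = (Integer) content.get(NUMBER_OF_BINS);
        List<double[]> intervals = new ArrayList<>();
        for (int i = 0; i < numberOfBins; i++) {
            Map<String, Object> sub =
                    (Map<String, Object>) content.get(INTERVAL + i);
            intervals.add(new double[] {(Double) sub.get(LOWER_BOUND),
                    (Double) sub.get(UPPER_BOUND)});
        }
        return intervals;
    }
}
```

Storing the number of bins first is what allows a reader to know how many numbered sub-contents to expect.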
The interval simply stores a pair of lower and upper bound. In addition, the model provides methods to add and retrieve intervals:
/**
*
* @param binNumber the number of the bin for which the lower bound
* should be returned
* @return the lower bound of the specified interval.
*/
public double getLowerBoundForInterval(final int binNumber) {
return m_intervals.get(binNumber).getLowerBound();
}
/**
*
* @param binNumber the number of the bin for which the upper bound
* should be returned
* @return the upper bound of the specified interval.
*/
public double getUpperBoundForInterval(final int binNumber) {
return m_intervals.get(binNumber).getUpperBound();
}
/**
*
* @return the number of bins, i.e. intervals.
*/
public int getNumberOfBins() {
return m_intervals.size();
}
/**
* Adds an interval to this model.
* @param lowerBound the lower bound of the interval
* @param upperBound the upper bound of the interval
*/
public void addInterval(final double lowerBound, final double upperBound) {
Interval interval = new Interval(lowerBound, upperBound);
m_intervals.add(interval);
}
Now, we have to fill the model in the for-loop of the NodeModel's execute method:
...
double intervalUpperBound = lowerBound;
// create the external model
m_model = new NumericBinModel();
double intervalLowerBound = lowerBound;
for (int i = 0; i < m_numberOfBins.getIntValue(); i++) {
intervalLowerBound = intervalUpperBound;
intervalUpperBound += interval;
// fill the external model
m_model.addInterval(intervalLowerBound, intervalUpperBound);
splitPoints.add(intervalUpperBound);
// fill the bins with empty representations
m_bins[i] = new NumericBin();
}
...
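The way the loop derives the bounds can be tried out independently of the node. The sketch below assumes that the bin width is the range of the column divided by the number of bins; this is what the variable interval is taken to mean here, since its computation is not shown in this excerpt. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes the bounds of equidistant bins for a numeric range. */
class EquidistantBinsSketch {

    /** Returns one {lowerBound, upperBound} pair per bin. */
    static List<double[]> intervals(final double lowerBound,
            final double upperBound, final int numberOfBins) {
        if (numberOfBins <= 0) {
            throw new IllegalArgumentException(
                    "number of bins must be positive");
        }
        // assumption: all bins share the same width
        double width = (upperBound - lowerBound) / numberOfBins;
        List<double[]> result = new ArrayList<>();
        double intervalUpperBound = lowerBound;
        for (int i = 0; i < numberOfBins; i++) {
            // each bin starts where the previous one ended
            double intervalLowerBound = intervalUpperBound;
            intervalUpperBound += width;
            result.add(new double[] {intervalLowerBound,
                    intervalUpperBound});
        }
        return result;
    }
}
```

Because the upper bound is accumulated by repeated addition, the last upper bound can differ slightly from the maximum of the column for widths that are not exactly representable as a double.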
Once the model has been filled with the information about the intervals it is possible to save it in the saveModelContent method of the NodeModel:
/**
* Only the saveModelContent method has to be overridden, since there is
* only a ModelOutport.
*
* @see de.unikn.knime.core.node.NodeModel#saveModelContent(int,
* de.unikn.knime.core.node.ModelContentWO)
*/
@Override
protected void saveModelContent(final int index,
final ModelContentWO modelContent) throws InvalidSettingsException {
m_model.saveTo(modelContent);
}
Since we now provide an external model of our node we have to add a ModelOutport. This is done in the constructor of the NodeModel, where the number of DataIn- and Outports is defined with the first two parameters and the number of ModelIn- and Outports with the last two parameters. Since we have no ModelInport and one ModelOutport the resulting new constructor looks like this:
/**
* Constructor for the node model.
*/
protected NumericBinnerNodeModel() {
// we have one inport for the numeric data to bin
// and two outports:
// one for the original data with the binning information appended
// and one for the bins and their used interval bounds.
super(1, 1, 0, 1);
}
Having adapted your constructor in this way you will notice the following error the next time you start your workbench:
ERROR NumericBinnerNodeFactory CODING PROBLEM Missing or surplus predictor output port name
To avoid this you have to adapt your node description, which is explained in the next section.
Section 8:
How to adapt your NodeDescription
The NodeDescription is the information displayed in the NodeDescription window of your workbench. You can edit the information in the XML file, which is named exactly like your NodeFactory. In our case it is the NumericBinnerNodeFactory.xml. The NodeDescription explains the functionality of the node, the configuration options, the node's view and the meaning of the in- and outports. Since we added a ModelOutport we also have to add a description of that port. When you look at the NumericBinnerNodeFactory.xml file you will notice that there are already some fields, but most of them are filled with placeholders. We will go through each of the fields and add the appropriate information.
- fullDescription: The full description should explain the functionality of your node and help a user to use it. For example: "Divides the domain of a selected numeric column into the selected number of equidistant bins and puts the input data into these bins accordingly. The binning information (which row is in which bin) is provided in an extra column at the output."
- option: for each control element in your dialog, describe its meaning here. The option name is the label used for it in the dialog. We add:
<option name="Number of bins">Define the number of bins</option>
<option name="Column to bin">Select the numeric column which should be binned</option>
- ports:
- dataIn: the data inports with the index, a name displayed in the tooltip and a description, which is displayed in the NodeDescription,
- dataOut: the data outports with the index, a name displayed in the tooltip and in the context menu and a description, which is displayed in the NodeDescription,
- modelIn: the model inports with the index, a name displayed in the tooltip and a description for the NodeDescription,
- modelOut: the model outports with the index, a name displayed in the tooltip and the context menu and a description for the NodeDescription.
<ports>
<dataIn index="0" name="Data to bin">Data to bin</dataIn>
<dataOut index="0" name="Binned data">The input data with an additional column containing the referring bin number for each row.</dataOut>
<modelOut index="0" name="Bin intervals">Bounds of the intervals</modelOut>
</ports>
- views: explains what is displayed in the view, with an index of the view (since a node may have several views), a name displayed in the context menu and a description for the NodeDescription. We add:
<views>
<view index="0" name="Histogram">Displays the relative size of each bin in a histogram</view>
</views>
Section 9:
How to test and use your Node
In order to test our new node implementation we need to start a new "runtime workbench" (which is basically an Eclipse instance started from our running Eclipse). To do so, go to the "Run" menu and select "Run ...". In the dialog, select "Eclipse Application" on the left and click the "New launch configuration" button as shown in the following screenshot:
You may want to give the new configuration a meaningful name and/or change its runtime parameters, such as enabling assertions or giving the newly spawned Java process more memory. For example, enter "-ea -Xmx512M" under "VM arguments" in the arguments tab in order to enable assertions (-ea) and increase the available heap space to 512MB (-Xmx512M).
Launch the configuration using the "Run" button. In the opened runtime workbench open the KNIME perspective (i.e. "Window" - "Open Perspective" - "Konstanz Information Miner"). You will notice that it also contains your newly defined node in the node repository. Please note that you only have to do these steps once: the run configuration is saved and available in the run history, so you can restart the workbench by clicking the run button.
Export and deploy your Node as a plugin
Once you have tested your new plugin using the runtime workbench, it is time to deploy your node as a plugin. This is done with the "Export deployable plug-ins and fragments" wizard. Just right-click your plugin and choose "Export...". In the window that appears select the "Plug-in Development" folder and, within this, the "Deployable plug-ins and fragments" wizard. After pressing the "Next" button the wizard displays a list of all available plugins in your workspace, with the one you right-clicked already selected. Thus, you just have to select either a directory or an archive file to deploy your plugin to. In case you select a directory the plugin is exported to this destination within a directory called "plugin". This directory then contains the plugin as a .jar file. In case you select an archive file, the "plugin" folder containing the .jar file is zipped to the specified archive.
Once the plugin is exported it can be installed in any KNIME installation, or any Eclipse installation that contains the basic KNIME plugins, by copying the content of the "plugin" folder to the plugin folder of the KNIME / Eclipse installation. Note that you have to restart your installation in case it is currently running.