KNIME: FAQ
This is only a small selection of possible questions. If you have a question that is not answered in the FAQ, the quickstart guide, or the extension howto, don't hesitate to write us an email; we will try to answer it and add it to our FAQ section.
General
Questions:
- What is KNIME, what does KNIME stand for and who has developed KNIME?
- Can I modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part?
- How much data can I process with KNIME?
- I'm getting errors like java.lang.OutOfMemoryError: PermGen space. What is wrong?
- I try to import a KNIME workflow from a directory or zip-file but after I browsed for the directory (zip-file) the project list is still empty. Why is the project I want to import not listed?
- The Node Description window doesn’t work on Linux, it shows the error "System browser cannot be initialized. No nodedescription will be displayed."
- When copying Meta nodes such as Cross Validation or Meta Nodes x:x the inner nodes are not copied.
- I'm working with Windows Vista. In the node dialogs, I cannot open a FileChooser to select a data file.
- I get the following error message when trying to update from the v1.1 KNIME Product to version 1.2: "Resulting configuration does not contain the platform."
- The chemistry extensions have changed a lot in version 1.2.0, in particular the molecular parsing nodes, the 2D structure rendering, and the CDK integration. Why?
- I'm running the Gnome window manager under Linux and can't open any Weka Dialog. Why is that and how can I fix it?
- Is there any way to run KNIME in batch mode, i.e. only on command line and without graphical user interface?
- Since KNIME 1.3.1 I've had problems with the node description window; it shows an error "Unable to create view: Plug-in org.knime.workbench.helpview was unable to load class org.knime.workbench.helpview.view.HelpView."
KNIME is available under a dual licensing model. An open source license is available for non-profit use. For commercial usage of KNIME, please contact us at contact@knime.org. Please refer to the Copyright for more information.
Add -XX:MaxPermSize=128m to the java command. The KNIME product already uses this setting by default. You can also try another Java implementation such as the ones from IBM or BEA. See also Eclipse's and Sun's bug reports.
Installing mozilla-xulrunner and exporting an environment variable
(export MOZILLA_FIVE_HOME=/usr/lib/xulrunner) fixes the problem.
Please note that pointing this variable to, for instance, /usr/lib/firefox
or /usr/lib/mozilla is likely not going to work. For other distributions
refer to the SWT FAQ.
Yes, we changed a few things along those lines. The reasoning was that we wanted to stop converting between formats automatically (that is, behind the scenes) and leave control over such conversions to the user. Previously we converted SMILES automatically into a CDK representation, which enabled us to show 2D representations.
Now you can read in, for example, a SMILES string and explicitly translate it into a CDK representation (or something else using Open Babel). (Note that the FileReader now allows you to choose SMILES directly for columns containing string-like elements.)
The reason why you do not see structures in the CDK cell is that we split this functionality. A lot of functions in CDK are somewhat, let's say "premature", so we are trying not to combine too much into one node. The SMILES->CDK converter therefore does not automatically trigger the 2D coordinate generation; you need to use the appropriate node to do this. Afterwards the CDK cell contains this information and the 2D renderer is available. The same holds for 3D representations, but the CDK 3D coordinate generator crashes frequently, so we did not include it in our standard CDK distribution. However, if you, for example, read SDF files containing 3D coordinate information and convert them to CDK, you will subsequently be able to see those 3D structures.
To summarize: in order to see structures in 2D, you need to read in SMILES or SDF (or generate something via Open Babel), translate those to CDK types using the appropriate translator, add 2D coordinates (if not already included in the SDF representation) and then connect the Interactive View.
Note that the Tripos tools add a specific renderer for SMILES, so once their tools come out you will also be able to see 2D representations for columns holding "just" SMILES.
The Look and Feel (L&F) implementation that is used
under Gnome (com.sun.java.swing.plaf.gtk.GTKLookAndFeel)
has a bug, which causes a NullPointerException to be thrown at:
weka.gui.PropertySheetPanel.addPropertyChangeListener(PropertySheetPanel.java:160)
The workaround for this problem is to specify another default L&F when
launching the java process by providing the commandline argument
-Dswing.systemlaf=javax.swing.plaf.metal.MetalLookAndFeel
Our next release 1.2.1 will have this fix included. In the current
release of KNIME you can enable this option depending on the
product you use as follows:
- KNIME Product
- Either add this line to the .knime.ini file (located in the KNIME directory) or launch the knime.sh executable with the option -vmargs -Dswing.systemlaf...
- KNIME Developer Version
- Either add this line to the eclipse.ini file or launch the executable eclipse with the option -vmargs -Dswing.systemlaf...
- KNIME Runtime Workspace (the one you get when testing your own node implementation)
- Add the above mentioned line to the "Arguments" tab in the run configuration, i.e. choose "Run" -> "Run..." and then select the appropriate run configuration on the left.
There is an experimental (and therefore undocumented) command line option that allows you to run KNIME 1.2.1 in batch mode. To see a list of possible arguments, execute the following line on a command prompt (for Linux):
knime.sh -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION
On a Windows system, you need to add two more options to enable system messages (by default any message to System.out gets suppressed):
knime.exe -consoleLog -noexit -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION
The -consoleLog option causes a new window to be opened containing
the log messages, and -noexit keeps the window open after the execution
has finished. You will need to close the window manually and, unfortunately, will get an error
message from the java process, which you can safely ignore. (If you happen to find out how
this procedure can be avoided or simplified, please let
us know.)
Windows users: please remember to add these two options to the
command line examples below in order to see KNIME's output messages.
In order to run a (pre-configured) workflow, say Knime_project, contained in the
workspace directory, execute (in one line)
knime.sh -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION
-workflowDir="workspace/Knime_project"
It's also possible to change configuration options of individual nodes contained
in the workflow. This comes in handy if, for instance, you want to change the
URL of the file that the file reader node reads from. You will need to find out
the ID of the target node and the name and type of the option you wish
to change. To do so, browse the project directory (for instance
workspace/Knime_project) and find the configuration directory
of the target node (e.g. "File Reader (#4)", where
4 is the ID of the node). Open the file settings.xml and look
for the xml element "model" (all elements nested in
"model" can be changed). The individual options should be more or
less self-explanatory; in this example we would look for the option
"DataURL", which contains a String value. The command line then
looks as follows:
knime.sh -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION
-workflowDir="workspace/Knime_project"
-option=4,DataURL,"file:/home/wiswedel/benchmarks/iris/data.tst",String
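The same argument list can also be assembled programmatically, for example when a script needs to run one workflow against several input files. The following is only a minimal sketch and not part of KNIME: the class BatchCommand and its methods are hypothetical names, and it uses nothing beyond the arguments shown in this FAQ (-nosplash, -application, -workflowDir, -option). It only builds the list; starting the process, for instance with java.lang.ProcessBuilder, is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles the batch-mode arguments described above. */
final class BatchCommand {
    private final List<String> args = new ArrayList<>();

    BatchCommand(String executable, String workflowDir) {
        args.add(executable);
        args.add("-nosplash");
        args.add("-application");
        args.add("org.knime.product.KNIME_BATCH_APPLICATION");
        args.add("-workflowDir=" + workflowDir);
    }

    /** Adds one argument of the form -option=nodeId,name,value,type. */
    BatchCommand option(int nodeId, String name, String value, String type) {
        args.add("-option=" + nodeId + "," + name + "," + value + "," + type);
        return this;
    }

    /** Returns a copy of the argument list, one list element per argument. */
    List<String> toList() {
        return new ArrayList<>(args);
    }
}
```

Calling new BatchCommand("knime.sh", "workspace/Knime_project").option(4, "DataURL", "file:/home/wiswedel/benchmarks/iris/data.tst", "String").toList() yields the arguments of the example above. The shell quotes are not needed here because each list element is passed as one argument.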
This is a bug in KNIME 1.3.1. It occurs in a multi-user environment, when KNIME is installed under a different user name (for instance root) than it is being used as. KNIME is trying to write to the installation directory, which causes problems as it does not have write permissions.
This bug was fixed in KNIME 1.3.3.
Getting Started
Questions
- How can I create/use another workspace location?
- After the orange splash screen I get an error message referring to a log file and that's it. What does this mean?
- The workspace is empty. How do I create a new project?
- The Node Repository shows only a few nodes (or none at all) - is that all?!?
- Under Linux I cannot see the table in the output port view, the window shows only blank content. What is wrong here?
- Install the libXp package, e.g. with yum -y install libXp.
- Add export AWT_TOOLKIT=MToolkit to your ~/.bashrc (or whatever file your shell executes upon startup).
Developers FAQ
Questions:
- How do I implement my own Node?
- In the KNIME Developer version of KNIME the auto-completion does not work when editing source code (using Ctrl-Space). Why is that?
- What version of Java do I need?
- How do I use KNIME Developer version? I only see an ordinary Eclipse IDE...
- I used the KNIME Node-Extension wizard which created a new plugin and one Node. Now I want to add a Node in this plugin. How can I do this?
- My new node appears on the top level of the node repository. How do I put it into a (new) category?
- What is exactly the idea behind the node model, node dialog and the node view? Where are the differences between the node dialog and the node view? Why doesn’t the node dialog write directly to the node model?
- When should the modelChanged() function be called explicitly?
- Why are there different types (for example in the chem plugin the SMILES data type, etc.)? Wouldn't it be easier to have only strings and let the node care about the content of the string?
- My node generates values based on the values of the input data. Should I add this information or simply output the new values?
- How do I handle errors and exceptions during execution of the node model?
- How do I set the progress bar correctly?
- Where do I set default values for my user settings?
- What is the difference between the configure() and the validateSettings() method?
- Why are validateSettings() and loadValidatedSettings() split into two methods? Isn't that duplicating code?
- How can I show debug messages for selected packages only?
- If I use the javax.swing.JFileChooser or KNIME's DialogComponentFileChooser, it takes very long to open the dialog or the dialog does not open at all, causing KNIME to hang.
- I downloaded the KNIME developer version and it doesn't seem to use my default java installation. Why?
- How do I include and use external java libraries in my new KNIME plugin?
Select "New", "Other", and in the category "Konstanz Information Miner" select "Create new KNIME Node-Extension". Follow the instructions of the wizard. For further information refer to the extension howto where it is explained in detail.
A new category is defined using
the extension point org.knime.workbench.repository.categories.
Use the Eclipse editor of the "plugin.xml",
which is contained in the plugin project, to register this extension point.
Then go to the tab Extensions and click the
"add..." button to add the org.knime.workbench.repository.categories
extension (if not already there). On the new list entry, do a right click
and follow "New", "category" as shown in the following
screenshot:
In the "Extension Element Details" section on the right,
you need to enter the information for the new category, that is:
name: The name as it appears in the node repository.
path: The path in the node repository. Put '/' to put it into the root.
level-id: The internal identifier for this category. Enter this id in the node extension to put the node into this category or use it in another category's 'path' element to get sub categories.
after: Allows you to order the categories. Find out which category identifiers are available by looking into the KNIME plugins.
description: The tooltip text.
icon: The icon of this category.
To finally add nodes to this category, go to each node's extension
definition and enter the category identifier (along with a preceding '/')
in the category-path.
- The views: they require access to the entire model as we do not know (and cannot know) which pieces of the (node) model are needed to display it properly. In some instances (Scatterplot...) it can be very useful to see different plots at the same time (imagine looking at three different combinations of variables).
- The dialog: it changes settings of the (node) model that control
its operation (what to compute...). It has nothing to do
with what is subsequently contained in the model itself; this is
determined during execution, based on the settings that were changed
by the dialog. So why do we not enable the dialog to write directly
into the (node)model?
There are two reasons:
- We want to be able to store those settings when the workflow is saved. If we make sure everything is transported from the dialog to the model in a clearly defined container (the NodeSettings object) we can serialize this object and be sure that we do not lose anything.
- More importantly: we want to be able to cancel a dialog and to check that the new settings are correct before writing anything to the model. To do this, we cannot have the dialog write directly into the model, because then we would not be able to revert to the previous settings. This is also the reason for the separate validate and apply methods. When a dialog wants to apply the new settings, the model validates them first (and rejects them if incorrect), and only afterwards are all settings written to the model.
- The (node)model: in order for us to not only load the
configuration (nodes, connections, and node settings) but also the
content of the models and the data itself, we need to store what
the node created during execution. Since we don't really have
control over what happens inside the model during execute() we
leave it to the user to write this out to a specific directory.
For some nodes it may be sufficient to do nothing because all
information is already contained in the NodeSettings (which are
stored automatically) - for instance a column filter or a node
computing some properties.
Also the data provided at the data outport is stored automatically.
For other nodes (such as a decision
tree) we do need to store the entire tree. Note that this is not
necessarily the same as what is transported to the Model-Ports -
the tree inside the node also needs to remember which Rows to
highlight when a branch is selected. In almost all instances you
only need to worry about writing and reading NodeContents if that
node provides views.
To summarize:
- The validateSettings(), loadValidatedSettings() and saveSettings() methods have to be implemented if the node has a dialog (and therefore settings).
- Use loadInternals() and saveInternals() if the node provides views.
- If the node has a model outport, implement loadModelContent() and saveModelContent().
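The validate-then-apply flow between dialog and model can be illustrated without the KNIME API. The sketch below is a simplified stand-in: SketchModel, its Map-based settings container and the key "threshold" are hypothetical, and the methods only mimic the roles of validateSettings(), loadValidatedSettings() and saveSettings(). The real methods work on KNIME's settings objects and have different signatures.

```java
import java.util.Map;

/** Hypothetical model; a Map stands in for the settings container. */
final class SketchModel {
    private int threshold = 10; // current, known-good setting

    /** Checks the new settings without touching the model's state. */
    void validateSettings(Map<String, String> settings) {
        String raw = settings.get("threshold");
        if (raw == null) {
            throw new IllegalArgumentException("threshold is missing");
        }
        int value;
        try {
            value = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("threshold is not a number: " + raw);
        }
        if (value <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
    }

    /** Applies settings that have already passed validateSettings(). */
    void loadValidatedSettings(Map<String, String> settings) {
        threshold = Integer.parseInt(settings.get("threshold"));
    }

    /** Writes the current settings into the container, e.g. when saving. */
    void saveSettings(Map<String, String> settings) {
        settings.put("threshold", Integer.toString(threshold));
    }

    /** Roughly what happens when a dialog is applied: validate first, then load. */
    boolean apply(Map<String, String> fromDialog) {
        try {
            validateSettings(fromDialog);
        } catch (IllegalArgumentException e) {
            return false; // rejected: the model keeps its previous settings
        }
        loadValidatedSettings(fromDialog);
        return true;
    }

    int getThreshold() {
        return threshold;
    }
}
```

Because validation never modifies the model, a rejected or cancelled dialog leaves the previous settings intact, which is the point of keeping the two steps separate.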
- The modelChanged() function is essentially the notification to all views
that a model has changed (reset or execute) inside the MVC-model and is
called internally by the framework. Therefore there is no need to call it
explicitly.
- If the error is so severe that no data can be provided at the outport, throw an exception. The node then stays unexecuted and an error icon with the message of that exception is displayed.
- If something unusual happened or you want to inform the user about some implicitly made decisions, you can set a warning with setWarningMessage(String message) in the execute method. The node will be executed but shows a warning icon displaying the text of the warning.
exec.setProgress((double) currentRowNr / numberOfRows, "Processing row nr: " + currentRowNr);
If the task to be done is divided into subtasks, you
can create a subprogress for each one, covering its fraction of the whole task.
With two equally long subtasks the code would be:
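// each sub-progress covers half of the node's overall progress
ExecutionMonitor exec1 = exec.createSubProgress(0.5);
ExecutionMonitor exec2 = exec.createSubProgress(0.5);
// task one reports its progress to exec1
task1(input, exec1);
// task two: the table is created via the execution context (exec) and reports to exec2
exec.createBufferedDataTable(result, exec2);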
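Two details of the progress arithmetic are easy to get wrong: the progress value is a fraction, so dividing two int values needs a cast (in Java, 3 / 10 evaluates to 0), and each subprogress contributes only its share to the overall value. The following sketch illustrates both points in plain Java. ProgressSketch and its nested Sub class are hypothetical stand-ins and are not the KNIME ExecutionMonitor API.

```java
/** Hypothetical stand-in for a progress monitor; not the KNIME API. */
final class ProgressSketch {
    private double progress; // overall progress in the range 0..1

    /** Fraction of rows done; the cast avoids integer division. */
    static double fraction(int currentRowNr, int numberOfRows) {
        return (double) currentRowNr / numberOfRows;
    }

    /** Creates a sub-task that owns the given share of the overall progress. */
    Sub createSub(double share) {
        return new Sub(share);
    }

    double getProgress() {
        return progress;
    }

    /** Progress of one sub-task, scaled by its share before it reaches the parent. */
    final class Sub {
        private final double share;
        private double own; // progress of this sub-task alone, 0..1

        Sub(double share) {
            this.share = share;
        }

        void setProgress(double p) {
            progress += (p - own) * share; // add only the scaled difference
            own = p;
        }
    }
}
```

With two sub-tasks of share 0.5 each, finishing the first and completing half of the second gives an overall progress of 0.75.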
The dialog's loadSettings method
is probably the appropriate place to put in these values. If the
model has no (default) values, it will not write values into the
settings object (in its saveSettings method), and
the dialog will therefore miss these values when it tries to load settings.
In that case the dialog needs to set some default (or initial) values to
be displayed in its components.
In validateSettings() you do basic checks on the new values. In
configure() you check whether the node can run with the current
settings and whether the values are consistent with the incoming table spec.
In validateSettings(), settings are rejected only if required values
are missing or values are obviously invalid (e.g. you read a negative
number when you know the value must be positive, or you get a null
or empty string for a column name). You can also check the consistency
of the values with each other (for example, a lower bound should be
smaller than an upper bound). At this point you
can't check the consistency of the settings with respect
to the incoming data table; this is done in the configure()
method. There you complain if a chosen column name doesn't
exist in the incoming table spec or if a selected column has an
incorrect data type. If configure() succeeds, the node will be
in the executable state.
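As a rough illustration of this division of labour, the sketch below separates the two kinds of checks in plain Java. It does not use the KNIME API: the class RangeFilterSketch, its parameters and the use of plain sets of column names in place of a table spec are all assumptions made for the example.

```java
import java.util.Set;

/** Hypothetical node logic showing which check belongs to which method. */
final class RangeFilterSketch {

    /** Settings-only checks: nothing about the input table is known here. */
    static void validateSettings(String column, double lower, double upper) {
        if (column == null || column.isEmpty()) {
            throw new IllegalArgumentException("No column selected");
        }
        if (lower >= upper) {
            throw new IllegalArgumentException("Lower bound must be smaller than upper bound");
        }
    }

    /**
     * Checks against the incoming table, reduced here to two sets of
     * column names that stand in for the table spec.
     */
    static void configure(String column, Set<String> allColumns, Set<String> numericColumns) {
        if (!allColumns.contains(column)) {
            throw new IllegalStateException("Column '" + column + "' does not exist in the input table");
        }
        if (!numericColumns.contains(column)) {
            throw new IllegalStateException("Column '" + column + "' is not numeric");
        }
        // Reaching this point corresponds to the node becoming executable.
    }
}
```

Settings such as an empty column name or inverted bounds fail in validateSettings() regardless of the input, whereas a well-formed column name can still fail in configure() once the actual input columns are known.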
Why are validateSettings() and loadValidatedSettings() split into two methods? Isn't that duplicating code?
In the .metadata directory of your runtime workspace (not your development workspace), there is
a subdirectory called knime that contains the default log4j configuration (log4j.xml).
A short comment inside the file explains how to enable debug messages for selected packages only.
However, enabling debug messages this way only affects the output written to stdout, which shows up
in the Console of your Eclipse IDE but not in the KNIME Console.
If I use the javax.swing.JFileChooser or
KNIME's DialogComponentFileChooser, it takes
very long to open the dialog or the dialog does not open
at all, causing KNIME to hang.
NodeDialogPane.
Two possible solutions are: (1) Initialize the file chooser
on demand, that is, the first time you need to access the file
system, or (2) Add the creation to the event dispatching thread
by using
SwingUtilities.invokeAndWait(new Runnable() { ... });.
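Solution (1) amounts to lazy initialization. The following sketch shows one possible way to do it. LazyValue and DialogSketch are hypothetical helpers and not KNIME classes, and wiring the chooser into an actual NodeDialogPane is left out.

```java
import java.util.function.Supplier;
import javax.swing.JFileChooser;

/** Hypothetical helper: creates its value on first access, not at construction time. */
final class LazyValue<T> {
    private final Supplier<T> factory;
    private T value;

    LazyValue(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the value on the first call and returns the same instance afterwards. */
    synchronized T get() {
        if (value == null) {
            value = factory.get();
        }
        return value;
    }

    synchronized boolean isCreated() {
        return value != null;
    }
}

/** Hypothetical dialog: constructing it does not create the file chooser yet. */
final class DialogSketch {
    private final LazyValue<JFileChooser> chooser = new LazyValue<>(JFileChooser::new);

    /** Intended to be called from the browse action, i.e. when the file system is first needed. */
    JFileChooser getChooser() {
        return chooser.get();
    }
}
```

The cost of creating the JFileChooser is paid only when the user first browses for a file, so building the dialog itself stays cheap.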
- Create a lib directory in your KNIME plugin.
- Copy the file(s) into the lib directory. (Java libraries are packed either as zip or jar archive.)
- Edit the plugin.xml file with "Plug-in Manifest Editor".
- Go to the tab "Runtime" and add all necessary libraries to the "Classpath" list in the bottom right corner using the "Add..." button.
- Go to the tab "Build" and add the files to the list contained in the section "Extra classpath entries".
- Make sure to have the lib directory selected in both the "Binary Build" and "Source Build" list (in the same tab).
- (Please note, adding the jar files to the plugin's build path, i.e. project context menu → "Java Build Path" → "Libraries" is not necessary.)