File Reader Node

The most common way to store relatively small amounts of data is still a text file. Among text files, the most common format so far has been CSV (Comma Separated Values). The “comma” in the acronym is just one of the possible characters used to separate data inside the file. Semicolons, colons, dots, tabs, and many other characters are equally acceptable.
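To illustrate the point about separators outside of KNIME, here is a minimal Python sketch using the standard `csv` module. It is not how the File Reader node works internally. The helper name `parse_delimited` and the sample values are made up for this illustration.

```python
import csv
import io


def parse_delimited(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same two-row table written with three different separators
# (illustrative values, not taken from a real file).
comma_rows = parse_delimited("age,workclass\n39,State-gov\n", ",")
semicolon_rows = parse_delimited("age;workclass\n39;State-gov\n", ";")
tab_rows = parse_delimited("age\tworkclass\n39\tState-gov\n", "\t")

print(comma_rows)
print(semicolon_rows == comma_rows, tab_rows == comma_rows)
```

Once the reader is told which separator to use, all three variants produce the same table.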

When dealing with text files, you also need to deal with encoding, a possibly irregular structure, missing values, strings that contain commas and are therefore usually enclosed in quotation marks, new lines, and so on. A more rigid interpretation of the file structure of course makes for faster reading. However, sometimes you need a more flexible description of the file structure to get a result, even if it requires a somewhat longer configuration time.
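The issues listed above can be made concrete with a short Python sketch, again using only the standard `csv` module and again only as an illustration, not as the File Reader node's implementation. The function `read_rows`, the `"?"` missing-value marker, and the sample data are assumptions chosen for this example.

```python
import csv
import io


def read_rows(raw_bytes, encoding="utf-8", delimiter=",", missing_marker="?"):
    """Decode bytes, parse quoted fields, and map missing values to None.

    `missing_marker` is an assumed convention for this sketch; real files
    may mark missing values differently or simply leave the field empty.
    """
    text = raw_bytes.decode(encoding)
    # newline="" leaves line endings untouched so that the csv module can
    # handle new lines embedded inside quoted fields.
    reader = csv.DictReader(
        io.StringIO(text, newline=""), delimiter=delimiter, quotechar='"'
    )
    rows = []
    for record in reader:
        cleaned = {}
        for key, value in record.items():
            is_missing = value is None or value.strip() in ("", missing_marker)
            cleaned[key] = None if is_missing else value
        rows.append(cleaned)
    return rows


# Illustrative data: a quoted string containing a comma, a quoted string
# containing a new line, a "?" marker, and an empty field.
sample = (
    'name,comment,age\n'
    '"Doe, Jane","first line\nsecond line",39\n'
    'Smith,?,\n'
).encode("utf-8")

rows = read_rows(sample)
for row in rows:
    print(row)
```

Each parameter of `read_rows` corresponds to a decision you also have to make when configuring a reader for a text file: which encoding, which separator, which quote character, and how missing values are marked.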

The most versatile node to access a text file, whatever its format, is the File Reader node.

 

A reference workflow is available on the EXAMPLES server under:
01_Data_Access/01_Common_Type_Files/04_Use_the_File_Reader*

Exercise

  • In the workflow named “ETL_Basics”, read the adult.csv dataset using a File Reader node.

 


* The link opens the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer, version 3.2.0 or higher).