Parse File Action
Parse File Action Node
NOTE: This node has been replaced by the Flat File Parser.
The purpose of this node is to convert data found within a file into a dataset.
The files that we can convert are:-
- RTF Table
- HTML Table
- CSV Table
Parsing Text - An Example
Given the text below, let's say we want to extract specific data from it into a table:-
For the OrderID, we have selected 5 characters after "OrderID=" and set the repeat to true (as there will be more than one).
We do the same for the CustomerID but without the repeat. A regular expression is then generated for each search.
We then set the Telephone column to be the characters between Tel= and \r\n (a new line).
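The three searches above can be sketched in Python as follows. This is only an illustration of the kind of regular expression such a search might produce, not the exact expression Presence generates. The sample text is a hypothetical stand-in for Customer.txt, whose real contents are not reproduced here.

```python
import re

# Hypothetical stand-in for Customer.txt (the real file is not shown).
text = (
    "CustomerID=C1001\r\n"
    "Tel=01234 567890\r\n"
    "OrderID=A0001\r\n"
    "OrderID=A0002\r\n"
)

# Chars After: capture the 5 characters that follow the search string.
order_pattern = re.compile(r"OrderID=(.{5})")
customer_pattern = re.compile(r"CustomerID=(.{5})")
# Chars Between: capture everything between "Tel=" and the next new line.
tel_pattern = re.compile(r"Tel=(.*?)\r\n")

# Repeat set to true: collect every match in the file.
order_ids = order_pattern.findall(text)

# Repeat not set: take the first match only.
customer_match = customer_pattern.search(text)
customer_id = customer_match.group(1) if customer_match else None

tel_match = tel_pattern.search(text)
telephone = tel_match.group(1) if tel_match else None

print(order_ids, customer_id, telephone)
```

With the repeat enabled the search yields one value per occurrence, which is why OrderID produces a list while CustomerID and Telephone each produce a single value.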
Finally we add a search for County, although you will notice that there is no County text in the Customer.txt file.
On the Error Handling tab we can decide on what action to take when the field cannot be found.
In this case we are looking for a County text that is not there.
We can get Presence to throw an exception, to continue and ignore the missing field, or to add the column but leave it empty.
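The three error handling choices can be pictured with the minimal Python sketch below. The names `OnMissing` and `extract_field` are hypothetical and exist only for this illustration; they are not part of Presence.

```python
import re
from enum import Enum


class OnMissing(Enum):
    """Hypothetical names for the three error handling choices."""
    THROW = "throw"      # throw an exception
    IGNORE = "ignore"    # continue and ignore the field
    EMPTY = "empty"      # add the column but leave it empty


def extract_field(text, name, pattern, on_missing, row):
    """Add the first match of pattern to row, applying on_missing if absent."""
    match = re.search(pattern, text)
    if match:
        row[name] = match.group(1)
    elif on_missing is OnMissing.THROW:
        raise LookupError(f"Field {name!r} not found")
    elif on_missing is OnMissing.EMPTY:
        row[name] = ""
    # OnMissing.IGNORE: leave the row untouched, so no column is added.
    return row


sample = "Tel=01234 567890\r\n"  # hypothetical text with no County entry
county_pattern = r"County=(.*?)\r\n"

row_empty = extract_field(sample, "County", county_pattern, OnMissing.EMPTY, {})
row_ignored = extract_field(sample, "County", county_pattern, OnMissing.IGNORE, {})
```

Here `row_empty` gains a County column holding an empty string, while `row_ignored` has no County column at all.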
Finally when we run our test we get the following:-
Parsing HTML, CSV or RTF Tables - An Example
If your data is in a table format then it will be easier to parse as the data is already separated into cells.
Once we have chosen the type of file we are going to parse and its URL, we can click Populate.
This pre-parses the table in the file and adds a list of available cells to the table.
You should also see that the drop-down box in the Search1 cell now contains all the available cells.
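As a rough picture of what this pre-parse step does, the Python sketch below lists every cell of a small CSV table together with its position. The function name `list_cells` and the sample data are hypothetical, and treating a position as (column, row) is an assumption made for this example only.

```python
import csv
import io


def list_cells(csv_text):
    """Return ((column, row), text) for every cell in the CSV text."""
    cells = []
    for y, row in enumerate(csv.reader(io.StringIO(csv_text))):
        for x, value in enumerate(row):
            cells.append(((x, y), value))
    return cells


# Hypothetical sample table.
sample_csv = "TEAM NAME,Rovers\nLOCATION,Leeds\n"
available_cells = list_cells(sample_csv)

for position, value in available_cells:
    print(position, value)
```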
You will then probably need to delete the rows for the cells you aren't interested in. This includes the cells that hold data values, as opposed to the cells that hold field names.
We can then get the parser to search for a cell by giving it either a location, such as pt(0,0) or text to search for, such as TEAM NAME.
When it finds this cell we can then return:-
- Cell Above - The cell above this cell will be added to the datatable.
- Cell Right - The cell to the right of this cell will be added to the datatable.
- Cell Below - The cell below this cell will be added to the datatable.
- Cell Left - The cell to the left of this cell will be added to the datatable.
- Chars At Cell - This cell itself will be added to the datatable.
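The cell search and the return options can be sketched in Python as below. This is a minimal illustration, not the Presence implementation: the names `OFFSETS`, `find_cell` and `lookup` are hypothetical, a location is assumed to be (column, row), and returning nothing for a neighbour outside the table is also an assumption.

```python
# Offset (columns, rows) from the found cell for each return option.
OFFSETS = {
    "Cell Above": (0, -1),
    "Cell Right": (1, 0),
    "Cell Below": (0, 1),
    "Cell Left": (-1, 0),
    "Chars At Cell": (0, 0),
}


def find_cell(table, search):
    """Locate a cell by (column, row) location or by its exact text."""
    if isinstance(search, tuple):
        return search
    for y, row in enumerate(table):
        for x, value in enumerate(row):
            if value == search:
                return (x, y)
    return None


def lookup(table, search, search_type):
    """Return the text of the cell selected by search_type, or None."""
    position = find_cell(table, search)
    if position is None:
        return None
    dx, dy = OFFSETS[search_type]
    x, y = position[0] + dx, position[1] + dy
    if 0 <= y < len(table) and 0 <= x < len(table[y]):
        return table[y][x]
    return None  # the neighbouring cell lies outside the table


# Hypothetical sample table.
table = [
    ["TEAM NAME", "Rovers"],
    ["LOCATION", "Leeds"],
]
team_name = lookup(table, "TEAM NAME", "Cell Right")
below_origin = lookup(table, (0, 0), "Cell Below")
```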
Given the RTF file we have below, let us say we want to extract the 2 Contact Telephone Numbers, the Team Name and the Team Location.
We would set up our scan criteria as follows:-
Notice that "Repeat" is selected for "Telephone Numbers".
This is because we know there is more than one telephone number and so we'd like Presence to keep returning them until it runs out.
By editing the repeat options, we can relate each telephone number to a new row in the datatable or a new column (in the form of TEL_1,TEL_2 etc).
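The two repeat layouts can be compared with the short Python sketch below. The function names and sample values are hypothetical, and copying the non-repeating columns into every new row is an assumption made for the illustration.

```python
def repeat_as_rows(base_row, column, values):
    """One datatable row per repeated value; other columns are copied."""
    return [{**base_row, column: value} for value in values]


def repeat_as_columns(base_row, prefix, values):
    """One row, with numbered columns such as TEL_1, TEL_2."""
    row = dict(base_row)
    for index, value in enumerate(values, start=1):
        row[f"{prefix}_{index}"] = value
    return row


# Hypothetical values found by the parser.
base = {"TEAM_NAME": "Rovers"}
telephones = ["0111 111111", "0222 222222"]

as_rows = repeat_as_rows(base, "TEL", telephones)
as_columns = repeat_as_columns(base, "TEL", telephones)
```

The row layout suits cases where each telephone number is processed separately later in the task, while the column layout keeps everything about one team on a single row.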
Here are the results:-
Explanation of Search Type options
- Cell Above - Searches for the cell defined in Search1, then returns the data in the cell above it.
- Cell Right - Searches for the cell defined in Search1, then returns the data in the cell to the right of it.
- Cell Below - Searches for the cell defined in Search1, then returns the data in the cell below it.
- Cell Left - Searches for the cell defined in Search1, then returns the data in the cell to the left of it.
- Chars At Cell - Searches for the cell defined in Search1, then returns the data in it.
- Chars Before - This will create a regular expression that returns the number of characters specified in Chars1 before the search string.
- Chars After - This will create a regular expression that returns the number of characters specified in Chars1 after the search string.
- Chars Between - This will create a regular expression that returns the characters between the strings specified in Search1 and Search2.
- Chars At Positions - This will return the characters from the index specified in Chars1 to the index specified in Chars2.
In the case of Chars Before, Chars After and Chars Between, a regular expression is created in the "Regular Expression" column.
Presence then uses the text in this "Regular Expression" column to search the file.
If you are familiar with regular expressions then you can edit this text for a more specific search.
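To give a feel for the expressions involved, the Python sketch below builds one plausible pattern for each character-based search type. The function names are hypothetical and the patterns are an assumption about what a generated expression could look like; the text Presence actually writes into the column may differ. For Chars At Positions the end index is treated as exclusive, which is also an assumption.

```python
import re


def chars_before(search, count):
    """Pattern capturing `count` characters immediately before `search`."""
    return rf"(.{{{count}}}){re.escape(search)}"


def chars_after(search, count):
    """Pattern capturing `count` characters immediately after `search`."""
    return rf"{re.escape(search)}(.{{{count}}})"


def chars_between(start, end):
    """Pattern capturing the shortest run of characters between two strings."""
    return rf"{re.escape(start)}(.*?){re.escape(end)}"


def chars_at_positions(text, first, last):
    """Characters from index `first` up to, but not including, `last`."""
    return text[first:last]


# Hypothetical sample strings.
before = re.search(chars_before("kg", 3), "Weight=075kg").group(1)
after = re.search(chars_after("OrderID=", 5), "OrderID=A0001").group(1)
between = re.search(chars_between("Tel=", "\r\n"), "Tel=0123\r\nFax").group(1)
positions = chars_at_positions("ABCDEFGH", 2, 5)
```

Escaping the search strings keeps characters such as '.' or '?' in the search text from being read as regular expression syntax. Someone editing the expression by hand would need to do the same.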
There is also a button by the regular expression options, that allows you to enable/disable the following:-
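- CASE_INSENSITIVE - Enables case-insensitive matching, so Telephone would match TELephone when checked.
- MULTILINE - Treats the file as a series of lines, so that in standard regular expression syntax '^' and '$' match at the start and end of each line rather than only at the start and end of the whole file.
- DOTALL - By default the regular expression '.' matches any character except a line terminator. With DOTALL enabled, '.' matches line terminators as well, so a match can run across several lines.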
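The effect of each option can be tried out with Python's `re` module, which offers equivalent flags. This is an illustration of standard regular expression behaviour rather than of Presence itself. Note that Python names the first flag `re.IGNORECASE`, and the sample text is hypothetical.

```python
import re

text = "TELephone=0123\nFax=0456\n"

# Case-insensitive matching: "Telephone" only matches with the flag set.
strict = re.search(r"Telephone", text)
relaxed = re.search(r"Telephone", text, re.IGNORECASE)

# MULTILINE: '^' matches at the start of every line, not just the first.
first_words_default = re.findall(r"^\w+", text)
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# DOTALL: '.' also matches the new line, so the match runs on past it.
value_default = re.search(r"=(.*)", text).group(1)
value_dotall = re.search(r"=(.*)", text, re.DOTALL).group(1)

print(first_words_default, first_words_multiline)
print(repr(value_default), repr(value_dotall))
```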