Saturday, April 17, 2021
 
July 26, 2020
 
Configuring the Sample TextConverter 4 Project (S62)

This article walks through the basic steps of configuring a SiMX TextConverter 4 data extraction project and highlights the following features:

> Tag-based field extraction
> Reusing a field tag for multiple fields
> Using field headers to name fields
> Multi-line fields
> Positional field extraction
> Multiple output tables
> Using a script

The configured sample project can be downloaded from here.

The input text file, shown below, contains two sample invoices with two levels of information: the invoice level (pink) and the items level (blue).

 

 

  1. Loading the ‘S62 - Invoices.txt’ input file by dragging and dropping it into TextConverter.

  2. Configuring the ‘Invoices’ template.
    - Rename the default ‘D1’ template to ‘Invoices’ by editing the ‘Template name’ property.
    - Highlight the ‘INVOICE:’ string in the input preview and use the ‘Add tag’ fly menu command (or the corresponding button on the toolbar) to add the ‘INVOICE’ input field. 
     
     
     
    - Next, reuse the same tag (‘INVOICE:’) to extract the invoice recipient’s name and address, this time with a slightly different technique. Instead of generating input fields with fly menu commands, use Ctrl+C/Ctrl+V (or Ctrl+drag) to create two copies of the ‘InvoiceNo’ field and rename them to ‘Name’ and ‘Address’.
    - Now set the areas of the input text from which the data for each of these fields should be extracted.
    - Highlight the part of the line where the recipient’s name is expected and use the ‘Set value for’ > ‘Name’ fly menu command to associate this area with the ‘Name’ input field.
     
     
     
    - Do the same for the address part, but highlight a multi-line rectangular area large enough to cover the longest possible address.
    - To limit the height of the extracted section, set the ‘Bottom boundary’ property of the ‘Address’ input field to the keyword ‘Note:’.
    - Next, generate the first and last name output fields from the ‘Name’ input field using the ‘Composite field’ feature. To do so:
    - Right-click the ‘Name’ input field and select the ‘Add ‘Name’ block’ fly menu command. Five output fields are generated: ‘FName’ (first name), ‘LName’ (last name), ‘MName’ (middle name), ‘PName’ (prefix) and ‘SName’ (suffix). Remove the unneeded fields, leaving only ‘FName’ and ‘LName’.
    - Repeat the same for the ‘Address’ field to generate the address-related output fields: Address1, Address2, City, State, Zip and Country. Delete the unneeded Country field.
    - Now use another technique to add the tag-based ‘TOTAL’ field. This time, instead of highlighting the tag, highlight the value and use the ‘Add field’ > ‘TOTAL’ fly menu command.
    - Move the added ‘TOTAL’ field to the end of the input fields list.

    The result should look like this.
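For readers who think in code, the idea behind step 2 can be sketched in Python. This is only an illustration of tag-based extraction, not TextConverter's implementation: the sample text, the column ranges and the helper names (`build_sample`, `extract_invoice`, `split_name`) are all hypothetical, and the real ‘S62 - Invoices.txt’ layout may differ.

```python
# Hypothetical column ranges, standing in for the highlighted areas.
NAME_COLS = slice(20, 40)
ADDRESS_COLS = slice(40, 70)

def build_sample():
    """Build a made-up invoice; the real input file may look different."""
    return [
        "INVOICE: 1001".ljust(20) + "John Smith".ljust(20) + "12 Main St",
        " " * 40 + "Apt 4",
        " " * 40 + "Springfield, IL 62701",
        "Note: paid in full",
        "TOTAL: 150.00",
    ]

def extract_invoice(lines):
    """Extract InvoiceNo, Name, Address and TOTAL from one invoice."""
    record = {}
    for i, line in enumerate(lines):
        if line.startswith("INVOICE:"):
            # One tag anchors three fields, each with its own area.
            record["InvoiceNo"] = line[len("INVOICE:"):NAME_COLS.start].strip()
            record["Name"] = line[NAME_COLS].strip()
            address = []
            for follow in lines[i:]:
                if follow.startswith("Note:"):  # the bottom boundary
                    break
                part = follow[ADDRESS_COLS].strip()
                if part:
                    address.append(part)
            record["Address"] = address
        elif line.startswith("TOTAL:"):
            record["TOTAL"] = line[len("TOTAL:"):].strip()
    return record

def split_name(name):
    """Naive stand-in for a composite field: first word and last word."""
    parts = name.split()
    if not parts:
        return {"FName": "", "LName": ""}
    return {"FName": parts[0], "LName": parts[-1]}

record = extract_invoice(build_sample())
names = split_name(record["Name"])
```

The address loop stops at the first line that starts with ‘Note:’, which mirrors the role of the ‘Bottom boundary’ property: the rectangular area can be taller than any real address without picking up unrelated lines.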

  3. Configuring the ‘Items’ template.
    - Use the ‘Add Detailed template’ fly menu command (or the corresponding toolbar button) to add another Detailed template. Rename it to ‘Items’.
    - Change the Role property of the added template from ‘Normal’ to ‘Sub-detailed’. This makes it work as a lower-level Detailed template that does not produce output records of its own.
    - Now configure the Items data extraction part. To do so:
    - Select one of the lines containing the Items-related data.
    - Use the ‘Set fields from selected line’ fly menu command (or the corresponding toolbar button) to generate four positional input fields.
    - Right-click the line containing the column names (line 10) and choose the ‘Assign field names’ command (or use the corresponding toolbar button) to update the input field names.
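The two commands in step 3 can be pictured with a short Python sketch: field positions are derived from one sample data line, and the header line is then sliced at the same positions to name the fields. How TextConverter actually detects field boundaries is not described here, so the rule below (a field runs from its first character to the start of the next field) is an assumption, as are the sample lines and the function names.

```python
import re

def field_spans(sample_line):
    """Derive (start, end) column spans from one data line.

    Assumption: fields are separated by two or more spaces, so a single
    space inside a description does not split the field.
    """
    starts = [m.start() for m in re.finditer(r"\S+(?: \S+)*", sample_line)]
    ends = starts[1:] + [None]  # the last field runs to the end of the line
    return list(zip(starts, ends))

def field_names(header_line, spans):
    """Name each field after the header text found in its column span."""
    return [header_line[start:end].strip() for start, end in spans]

def parse_row(line, names, spans):
    """Cut one data line into named positional fields."""
    return {name: line[start:end].strip()
            for name, (start, end) in zip(names, spans)}

# Made-up lines with four fixed-width columns.
header = "Description".ljust(24) + "Quantity".ljust(10) + "Price".ljust(10) + "Total"
row_a = "Blue widget".ljust(24) + "2".ljust(10) + "10.00".ljust(10) + "20.00"
row_b = "Large red gadget".ljust(24) + "1".ljust(10) + "5.50".ljust(10) + "5.50"

spans = field_spans(row_a)
names = field_names(header, spans)
items = [parse_row(row, names, spans) for row in (row_a, row_b)]
```

Because the spans come from a single selected line, that line should be representative: a row with an unusually short or shifted value would produce spans that cut other rows in the wrong place.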

  4. Adding and configuring the ‘DBItems’ data object.
    - In the Script Editor, right-click in the Variables area to open the fly menu and select the ‘Data Properties > DBCreator’ item. The ‘DBCreator’ object is added to the Variables list.

    - Click twice (with a pause between clicks) on the ‘DBCreator’ object’s name and rename it to ‘DBItems’.
    - Double click on the ‘DBItems’ data object’s name to open the data object’s property editor.
    - On the ‘Data Source’ tab click on the ‘Set Data Source’ button to connect to a destination database or set the destination file.

     

    - Switch to the ‘Dictionary’ tab and define the Items table fields.
    - Add the ‘ItemID’ field and set its type to ‘Integer’ and its ‘Field Attributes’ to ‘Primary Key’ and ‘Auto unique’.
    - Drag and drop (or copy and paste) the InvoiceNo field from the ‘Invoices’ template input dictionary, and the Description, Quantity, Price, and Total fields from the ‘Items’ input dictionary.
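As a rough analogy, the dictionary defined in step 4 corresponds to a table definition like the one below, written here with Python's built-in `sqlite3` module. The column types other than ‘ItemID’ are assumptions, since the article does not state them, and SQLite's `AUTOINCREMENT` is only an approximation of the ‘Auto unique’ attribute.

```python
import sqlite3

def create_items_table(conn):
    """Create an Items table resembling the dictionary from step 4."""
    conn.execute(
        """
        CREATE TABLE Items (
            ItemID      INTEGER PRIMARY KEY AUTOINCREMENT,  -- key, auto unique
            InvoiceNo   TEXT,     -- copied from the 'Invoices' dictionary
            Description TEXT,     -- the rest come from the 'Items' dictionary
            Quantity    INTEGER,  -- assumed type
            Price       REAL,     -- assumed type
            Total       REAL      -- assumed type
        )
        """
    )

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
create_items_table(conn)
conn.execute(
    "INSERT INTO Items (InvoiceNo, Description, Quantity, Price, Total) "
    "VALUES (?, ?, ?, ?, ?)",
    ("1001", "Blue widget", 2, 10.0, 20.0),
)
conn.execute(
    "INSERT INTO Items (InvoiceNo, Description, Quantity, Price, Total) "
    "VALUES (?, ?, ?, ?, ?)",
    ("1001", "Large red gadget", 1, 5.5, 5.5),
)
item_ids = [row[0] for row in conn.execute("SELECT ItemID FROM Items ORDER BY ItemID")]
```

The InvoiceNo column is what links each item row back to its invoice, which is why it is copied from the ‘Invoices’ dictionary rather than defined from scratch.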

     

  5. Configuring the script to populate the Items table.
    Two parts need to be implemented:
       a) Creating the ‘Items’ output table and
       b) Populating this table with the extracted data.
    Here is what should be done.
    - Right-click in the Script Editor area to open the fly menu with the list of Context Functions.
    - Select the OnStartProcess menu item to add the corresponding context function to the script.
    - Repeat the same for OnInputRecord.

     

    - Add the ‘if not this.append then dbItems.Create’ statement to the OnStartProcess() function. It makes the ‘dbItems’ data object create the ‘Items’ output table when the conversion process starts, unless the Append property of TextConverter is set to ‘Append to existing table’. Keep in mind that typing a dot after an object name brings up a hint list of its sub-objects and methods. Two objects are used here: ‘this’ (TextConverter) and ‘dbItems’ (the user-added data object).


     
     
    - Add the script that uses the ‘dbItems’ data object to populate and append an output record for each input record generated by the ‘Items’ template.
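The control flow of step 5 can be mimicked in Python to make the roles of the two context functions clearer. This is not TextConverter script: `ItemsTable` and `Conversion` are hypothetical stand-ins for the ‘dbItems’ object and for ‘this’, the method names are invented, and the assumption that the record handler also sees ‘Invoices’ records is mine.

```python
class ItemsTable:
    """Stand-in for the 'dbItems' data object; rows live in a plain list."""

    def __init__(self, existing_rows=None):
        self.rows = list(existing_rows or [])

    def create(self):
        # Assumption: creating the table starts from an empty table.
        self.rows = []

    def append_row(self, **fields):
        # Imitate an auto unique key by numbering rows from 1.
        self.rows.append({"ItemID": len(self.rows) + 1, **fields})


class Conversion:
    """Stand-in for 'this': holds the append flag and the two handlers."""

    def __init__(self, table, append=False):
        self.table = table
        self.append = append
        self.invoice_no = None  # last invoice number seen

    def on_start_process(self):
        # Mirrors: if not this.append then dbItems.Create
        if not self.append:
            self.table.create()

    def on_input_record(self, template, fields):
        if template == "Invoices":
            self.invoice_no = fields["InvoiceNo"]
        elif template == "Items":
            # Each item row carries the number of the invoice it belongs to.
            self.table.append_row(InvoiceNo=self.invoice_no, **fields)


table = ItemsTable(existing_rows=[{"ItemID": 1, "InvoiceNo": "0999"}])
run = Conversion(table, append=False)
run.on_start_process()  # append is off, so the old row is discarded
run.on_input_record("Invoices", {"InvoiceNo": "1001"})
run.on_input_record("Items", {"Description": "Blue widget", "Quantity": "2",
                              "Price": "10.00", "Total": "20.00"})
run.on_input_record("Items", {"Description": "Large red gadget", "Quantity": "1",
                              "Price": "5.50", "Total": "5.50"})
```

With `append=True`, `on_start_process` leaves existing rows in place and new items are added after them, which corresponds to the ‘Append to existing table’ setting.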

 

 