Handling Inline Data
Trailer Lines and Inline Entity Data
In the Footwear Orders sample MAPPER report, the asterisk lines that follow the last line item in an order are known as trailer lines. For example:

Trailer lines often contain data for an entity that is separate from, but related to, the entity represented by the column-formatted, tab data lines. For the Footwear Orders report, the trailer lines represent the customer contact information or "addressee" for the preceding order items:

MJ considers entity data embedded in trailer lines as "inline".
Identifying Inline Data Tuples
Just as with tab lines containing the columnar data fields for the primary entity in the MAPPER report, it's important to identify which trailer lines comprise a tuple (or row) of inline entity data, and how those tuples are organized. One trailer line may represent a distinct inline data tuple. Or, as in the Footwear Orders sample report, multiple trailer lines may together comprise one inline data tuple. Additionally, the trailer lines may not be column-formatted. In Footwear Orders, the trailer lines contain delimited, variable-length fields.
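As a rough illustration of the grouping described above, the following Java sketch collects each run of consecutive trailer lines into one block of inline text. It assumes that trailer lines begin with an asterisk, as in Footwear Orders. The class and method names (TrailerGrouper, groupTrailers) and the sample lines are hypothetical and are not part of MJ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of MJ: groups each run of consecutive
// trailer lines (assumed to start with '*') into one inline tuple.
class TrailerGrouper {

    static List<String> groupTrailers(List<String> reportLines) {
        List<String> tuples = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : reportLines) {
            if (line.startsWith("*")) {
                // Trailer line: append it to the tuple being built.
                if (current.length() > 0) {
                    current.append('\n');
                }
                current.append(line);
            } else if (current.length() > 0) {
                // A non-trailer line ends the current run of trailer lines.
                tuples.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tuples.add(current.toString());
        }
        return tuples;
    }

    public static void main(String[] args) {
        // Placeholder report lines, invented for this sketch.
        List<String> lines = Arrays.asList(
                "order item 1", "*trailer a", "*trailer b",
                "order item 2", "*trailer c");
        System.out.println(groupTrailers(lines).size() + " inline tuples");
    }
}
```

Whether one trailer line or several make up a tuple depends on the report, so a real handler would apply the report's own grouping rule rather than this simple "consecutive lines" rule.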
IInlineDataHandler Interface
Translation of inline MAPPER report data is informed by a Spring bean that implements IInlineDataHandler, an interface designed specifically for composing tuples from trailer lines. MJ provides IInlineDataHandler beans for common forms of inline data, such as parsing data fields with regular expressions and storing inline text as CLOBs. For specially formatted trailer lines, custom inline handlers may be written by implementing the IInlineDataHandler interface.
Handling Inline Data with Regular Expressions
The RegexInlineDataHandler bean recognizes and extracts fields from one or more trailer lines, then maps and loads those fields into attributes of a data entity. Noteworthy properties of the RegexInlineDataHandler bean include:
Property | Description
matchFullInput | Indicates whether field parsing operates in match-and-advance mode (false), where the portion of the input text that was matched is discarded before the next field is matched, or in match-full-input mode (true), where each regular expression is matched against the full input text.
textStripRegex | Regular expression to match and strip from the input text before fields are parsed.
textReplaceRegex | Regular expression to match and replace in the input text before fields are parsed.
unixLines | Controls use of java.util.regex.Pattern.UNIX_LINES for regular expressions employed by the bean.
multiline | Controls use of java.util.regex.Pattern.MULTILINE for regular expressions employed by the bean.
caseInsensitive | Controls use of java.util.regex.Pattern.CASE_INSENSITIVE for regular expressions employed by the bean.
fieldParseRegex<N> | Regular expression used to parse field N from the input text. Each property names the field and supplies a regular expression that recognizes the field and selects its contents as a capturing group.
For example, this bean is used to translate the customer address information in the asterisk trailer lines of the Footwear Orders sample report into the FootwearOrderAddressee entity. Below are the fieldParseRegex properties that recognize and extract the addressee and street from the first asterisk trailer line in Footwear Orders:
<bean id="addrInlineHandler"
      class="com.arsi.mj.maprpt.parser.tuple.inline.RegexInlineDataHandler"
      . . .
      p:matchFullInput="false"
      p:multiline="true"
      p:unixLines="true"
      p:textStripRegex="^\*">
    <property name="fieldParseRegex1">
        <!-- column name, field selector regex -->
        <list>
            <value>addressee</value>
            <value>^[^\t]*\t([^\t]*)</value>
        </list>
    </property>
    <property name="fieldParseRegex2">
        <!-- column name, field selector regex -->
        <list>
            <value>street</value>
            <value>\t([^\t]*)$</value>
        </list>
    </property>
    . . .
</bean>
Note that match-and-advance mode is selected by setting matchFullInput to false, and that the textStripRegex property strips the asterisk line type from the trailer line before it is parsed.
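To make the combined effect of the multiline, unixLines and textStripRegex settings more concrete, here is a minimal Java sketch that uses java.util.regex directly. It assumes that the bean compiles its expressions with the corresponding Pattern flags and strips matches by replacing them with an empty string; that behavior is an assumption, not documented MJ internals. The class name StripSketch, the layout of the first sample line, and the second sample line are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not MJ code.
class StripSketch {

    // Assumed equivalent of p:multiline="true" and p:unixLines="true".
    static final int BEAN_FLAGS = Pattern.MULTILINE | Pattern.UNIX_LINES;

    // Removes every match of stripRegex from text, using the given flags.
    static String strip(String text, String stripRegex, int flags) {
        return Pattern.compile(stripRegex, flags).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Two trailer lines joined by '\n'; the content is a placeholder.
        String trailer = "*\tDave Bennet\t123 Main St\n*\tsecond trailer line";

        // With MULTILINE, '^' matches at the start of every line,
        // so the leading asterisk is stripped from both lines.
        System.out.println(strip(trailer, "^\\*", BEAN_FLAGS));

        // Without MULTILINE, '^' matches only at the start of the input,
        // so only the first asterisk is stripped.
        System.out.println(strip(trailer, "^\\*", 0));
    }
}
```

With UNIX_LINES set, only '\n' is treated as a line terminator by '^' and '$', which matters when the input text might contain other line-break characters.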
Referencing the example below of asterisk trailer lines from the Footwear Orders report:

the following illustrates how the fieldParseRegex1 and fieldParseRegex2 bean properties match fields in the first trailer line. The screenshots are from a Java regex tester.
Regular expression property fieldParseRegex1 matches the addressee, "Dave Bennet".
<property name="fieldParseRegex1">
    <!-- column name, field selector regex -->
    <list>
        <value>addressee</value>
        <value>^[^\t]*\t([^\t]*)</value>
    </list>
</property>

Regular expression property fieldParseRegex2 matches the street number and name, "123 Main St".
<property name="fieldParseRegex2">
    <!-- column name, field selector regex -->
    <list>
        <value>street</value>
        <value>\t([^\t]*)$</value>
    </list>
</property>
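The two matches above can also be reproduced with a few lines of Java. The sketch below is a simplified stand-in for match-and-advance parsing, not the RegexInlineDataHandler implementation: it assumes that "advancing" means discarding the input up to the end of each match, and that the first trailer line, once its asterisk is stripped, consists of tab-separated fields as shown. The class name, the parse method, and the exact sample line are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of match-and-advance parsing, not MJ code.
class MatchAndAdvanceSketch {

    // Assumed equivalent of p:multiline="true" and p:unixLines="true".
    static final int FLAGS = Pattern.MULTILINE | Pattern.UNIX_LINES;

    // Each entry of fields is { field name, regex with one capturing group }.
    static Map<String, String> parse(String input, String[][] fields,
                                     boolean matchFullInput) {
        Map<String, String> result = new LinkedHashMap<>();
        String remaining = input;
        for (String[] field : fields) {
            Matcher m = Pattern.compile(field[1], FLAGS).matcher(remaining);
            if (m.find()) {
                result.put(field[0], m.group(1));
                if (!matchFullInput) {
                    // Match-and-advance: drop the text up to the end of the match.
                    remaining = remaining.substring(m.end());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] fields = {
            {"addressee", "^[^\\t]*\\t([^\\t]*)"},   // fieldParseRegex1
            {"street", "\\t([^\\t]*)$"}              // fieldParseRegex2
        };
        // Assumed layout of the first trailer line after the asterisk is stripped.
        String line = "\tDave Bennet\t123 Main St";
        System.out.println(parse(line, fields, false));
    }
}
```

Because MULTILINE is set, the `$` in the second expression matches at the end of the first line, so the street would still be captured if further trailer lines followed in the same input text.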

Defining the RDBMS Table and Columns For Inline Data
Once the recognition and extraction of inline data fields are specified, the target database table and columns where those fields are stored must be provided. In the following example for Footwear Orders, the database table and entity class are given in the tableName and className properties, while database columns are defined by collaborating column definition beans:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
           http://www.springframework.org/schema/util
           http://www.springframework.org/schema/util/spring-util-3.0.xsd">
    <bean id="addrInlineHandler"
          class="com.arsi.mj.maprpt.parser.tuple.inline.RegexInlineDataHandler"
          p:className="com.arsi.mj.testapp.hibgen.model.FootwearOrderAddressee"
          p:baseClassSuffix="Base"
          p:tableName="FOOTWEAR_ORDER_ADDRESSEE"
          p:column1-ref="addrInlineHandler.coldefAddressee"
          p:column2-ref="addrInlineHandler.coldefStreet"
          p:column3-ref="addrInlineHandler.coldefCityState"
          p:column4-ref="addrInlineHandler.coldefPostalCode"
          p:column5-ref="addrInlineHandler.coldefEmail"
          p:column6-ref="addrInlineHandler.coldefPhone"
          . . .
          >
    </bean>
Below are example column definition beans for addressee and street fields in the Footwear Orders report. Each column is defined via arguments passed to the ColumnDef constructor:
    <bean id="addrInlineHandler.coldefAddressee"
          class="com.arsi.mj.config.atoms.ColumnDef">
        <!-- entity attrname, column name, length, NULLABLE, data type -->
        <constructor-arg index="0" value="addressee"/>
        <constructor-arg index="1" value="addressee"/>
        <constructor-arg index="2" value="50"/>
        <constructor-arg index="3" value="false"/>
        <constructor-arg index="4">
            <util:constant static-field="org.hibernate.type.StandardBasicTypes.STRING"/>
        </constructor-arg>
    </bean>
    <bean id="addrInlineHandler.coldefStreet"
          class="com.arsi.mj.config.atoms.ColumnDef">
        <!-- entity attrname, column name, length, NULLABLE, data type -->
        <constructor-arg index="0" value="street"/>
        <constructor-arg index="1" value="street"/>
        <constructor-arg index="2" value="80"/>
        <constructor-arg index="3" value="false"/>
        <constructor-arg index="4">
            <util:constant static-field="org.hibernate.type.StandardBasicTypes.STRING"/>
        </constructor-arg>
    </bean>
Columns are defined with the ColumnDef bean instead of AnnotatedColumnDef because the inline data is not column-formatted and does not contain heading lines that give the MAPPER name and size of each column. The entity attribute name provided as the first argument to the ColumnDef constructor must match the field name specified in the corresponding fieldParseRegex property ("addressee" and "street", for example).
RegexInlineDataHandler Example: FootwearOrderAddressee
See the full Spring XML for the Footwear Orders sample MAPPER report, including configuration for translating inline "addressee" entity data into the FootwearOrderAddressee persistence class and FOOTWEAR_ORDER_ADDRESSEE table.