Release Notes for TM v3.12
Release 3.12 (released June 2006) contained the following new features:
Flat File Support
Version 3.12 of TM provides built-in flat file adapters. These can be configured to read/write a variety of flat file formats (e.g. comma separated, fixed width).
Importing Flat File Models
Select Import Model | Flat File... from the File menu. This launches the Flat File Model Wizard.
A model consists of one or more elements. The Elements page lists the elements and allows users to add, edit or delete elements.
Each element has its own associated Flat File configuration. This is a description of the format of the data in the file, and is used by the read and write adapters.
At least one element must be in the model, before continuing to specify a model name, and then perform the model import.
Adding an Element
On the Elements page of the Flat File Model Wizard, press Add. This launches the New Flat File Element wizard:
1. Select File Page
If an example data file exists, you can point at this, and configure the file.
2. Settings Page - This is where the flat file configuration is specified.
A configuration can be:
- Fixed width or
- Delimited (the data is separated by a known delimiter)
The following details must be specified:
- whether the first row contains column (or field – the terms are interchangeable) names. This is used for reading and writing
- the row number of the first row of actual data in a file (First line is line 1)
- whether to write out a dotted line under the column name row
- whether to validate header row – if this is selected, an exception will be thrown at run-time by the read adapter if:
- the column names are not in the same order as expected by the model
- the number of columns is different from the number of attributes in the model
Additional Delimited settings:
Delimiter
The delimiter is specified – a comma, tab, semicolon, space, pipe, or a user-specified custom character (must be 1 character in length).
Field Quoted
Columns may be quoted always, never or if necessary (i.e. if the data contains the delimiter character).
The quote character can also be specified (either single or double quote)
1. Columns Page
If importing a model based on an actual file, the data from the file will be displayed in the table here, dependant on the settings specified.
Each column represents an attribute within the element being added. The attribute names can be specified by editing the table directly. For fixed width models, the column widths are set here.
Columns may be added, deleted and the order of columns manipulated.
2. Element Name Page
Specify a name for this element. All element names must be unique within a particular model.
Editing a flat file configuration
In an open project, the configuration for a particular element can be changed by right-clicking and selecting Edit Model | Flat File... from the Context menu.
User Defined Function – SML Procedures
Many TM projects have transforms that contain some common transform statements. It is now possible to create a User Defined Function containing SML transform statements that are to be called from other transforms. Such a function is known as an SML Procedure.
To declare a new SML Procedure go into Edit | User Defined Functions… and select New SML Procedure. The following dialog will appear:
Declaring a procedure is very similar to declaring a User define function.
You may specify a name for the SML Procedure by selecting Change Name:
Select the Edit button. An SML editor will appear in which you may add your SML code:
SML Procedures are used in the same way as UDFs in the transforms.
It is possible to pass values to a procedure by declaring a procedure parameter. When a procedure parameter is added, TM will ask the user for a parameter name so that the user can refer to the parameter later in the procedure.
It is also possible to return some calculation inside a procedure. To do so, the Return Type for the procedure must be selected.
To return a value from a procedure, the built-in function RETURN must be used, adding the code to the procedure.
The main difference with the main tool is that the source and target elements are abstract elements that would accept any relationship or attribute. A procedure does not depend on the transform currently displayed. You can drag and drop anything from the models into the procedure editor, the only condition being that the SML inside it has to be well formed, but the validity is not checked at that stage. When the project is built, it is possible to check for every call to the procedure that the attributes and relationships used in the procedure are consistent with the elements of the map.
Recently Added Functions
GETCASCADENAME - a cascade function
GETCASCADENAME(onSrc)
returns the name, if any, of the last relationship caller.
This takes a one Boolean parameter 'onSource'.
If the parameter is true, this returns the name of the source element name of the transform that called this transform. If no transform called this transform (that is, it is an independent transform) then this will return an empty string.
If the parameter is false, this returns the name of the target element name of the transform that called this transform. If no transform called this transform then this will return an empty string.
COUNT - a database function
COUNT(relName[,onSrc])
evaluates a 'count(*)' operation in a named relationship (assumed on source).
COUNT takes a one or two parameters. The first is the quoted path to the table you wish to count. The second optional parameter is a Boolean 'onSrc'. This is true by default.
The COUNT function simply issues a count start call on the underlying database. This call may only be used in database systems. In general, for database systems, this is a much better call to make than SIZEOF or AGGSIZE.
INSERTBUILD - a database function
INSERTBUILD()
returns true if % is interpreted as '-' rather than as *.
It returns at run time the setting used at design time in the flag: SML generation | Generate For Insert ('%' interpretation)
PERFORMFUNCTION - a database function
PERFORMFUNCTION(relName,functionName,sqlString[,onSrc])
evaluates any extended operation on a named relationship (assumed on source).
May be used to perform any SQL function on a database. For example, the following code would reproduce the effect of the COUNT function:
sFUNCTION_COUNT = JAVACONST('net.etltm.TMDHIterator','FUNCTION_COUNT');
sSQL_COUNT_SELECT = JAVACONST('net.etltm.TMConstants',' SQL_COUNT_SELECT');
PERFORMFUNCTION('relPath' ,sFUNCTION_COUNT , sSQL_COUNT_SELECT, onSrc);
PERFORMFUNCTION takes four parameters:
- The path to the source item/table
- The name of the function
- The name SQL for the function
- The Boolean 'onSrc' - by default this is true.
JAVACONST - a Java function
JAVACONST(className,staticField)
JAVACONST requires two parameters className and memberName. The memberName must be a public static final member of the class. It returns the value of the class member. It can be used in place of actual values for greater portability. Thus, instead of defining 'eSELECT' to an actual literal value, one can write
eSELECT = JAVACONST('net.etltm.qp. TMUpdateableElement ', 'eSELECT');
CLEARPERSISTENCE - an other function
CLEARPERSISTENCE(sessionLevel)
resets all persistent variables for the session level.
TM 3.10 and above introduced the concept of persistent local variables. These retain their values between map invocations. In this sense, they resemble global variables but, while global variables always retain all values for all time, persistent local variables are bound to session level; and are cleared at the end of the session. Persistent local variables are also private to the transform in which they are declared (and any SML procedures called from those transforms).
The default Persistence level is zero so Persistent local variables are declared as:
yx : string : persists ;
which is the same as
yx : string : persists[0] ;
where the zero is the session level used to clear the persistence.
You can also write:
-- FL : jobject : persists[2] {<java.lang.Float>} := 2.6;
or
-- LA : jobject persists {<samples.UDFExamples.LoadAll>};
Note that :
-- FL : jobject : persists[2] {<java.lang.Float>} == 2.6;
declaring a local variable to be immutable and persistent is legal but pointless.
CLEARPERSISTENCE may also be used to clear persistent local variables without the issuing of an End Session call. The persistent local variables are then set back to default values (empty String, zero etc) and may be re-assigned.
CLEARPERSISTENCE takes a one or two parameters. The first is the session level. The second optional parameter is the variable name.
GETQPMODE - an other function
GETQPMODE
returns the current inverse qp mode.
This returns the 'BasicOpCode' for the current inverse transform. The opcodes are defined in net.etltm.qp.TMUpdateableElement as :
eSELECT eDELETE eINSERT eUPDATE eMERGE_INTO eMERGE_DELETE eREFERNCE ePURE_UPDATE eMIXEDSET eUNASSIGNED eNO_OP eILLEGAL
RETURN - an other function
RETURN([value])
returns a value from an SML procedure and stops the execution of that procedure.
Example
return ('27.8');
SETARBITRARYMAPDATA - an other function
SETARBITRARYMAPDATA(name,value[,targetDomain,sourceDomain])
sets a name value pair property on the map targetDomain - sourceDomain.
If not defined, it will set the name value pair to the current map.
SETPARAMS - an other function
SETPARAMS(relName,paramNumber,expression[,onSrc])
sets the parameter on a named relationship (assumed on source).
This is used to set a parameterised value (a '$Px' value) defined for relationship sub condition flexibly at run-time. Once set, the value remains set until another call to SETPARAMS is called on the same relationship and 'ParamNumber.'
SETPARAMS takes four parameters:
- relPath – The path to the relationship
- oParamNumber – The parameter number to set
- oValue – The value to set
- The Boolean 'onSrc' – This is true by default
Native Excel File Adapter
A new adapter based around the publicly available JExcelAPI provides basic support for reading Excel files without the use of a JDBC-ODBC bridge. This allows access to Excel files from all Java platforms and does not require a copy of Excel to be installed as the ODBC bridge does.
Alternative XSD Model Loader
An alternative XSD model loader is provided which preserves more of the structure of the original XML Schema. The loader is available from the Import Generic Model dialog from the TM Design Tool, or the project wizard. The loader facilitates reuse via complex types and type hierarchies. The following main features differentiate it from the standard XSD loader.
Complex Types and type derivation
Complex types are available in the model so that transforms can be written to/from complex types. Thus, if many elements have the same type, a common transform can be written. The type hierarchy is preserved, so that common transform code can be written for supertypes, and specialist transform code for particular subtypes.
Element groups
Element groups can optionally be preserved, to allow transforms to/from element groups to be written.
xsi:type support
On the source side, xsi:type is accounted for when selecting the correct transform to run. On the target, xsi:type is generated where appropriate according to the actual subtype created.
Other features of the existing loader are also supported, including simple type derivation, includes and imports, and substitution groups.
Support for these features has been initiated to provide support for use of FpML model (and its derivatives) for our Finance sector clients. However it is envisaged that the Generic Model Loader will be used by clients in all sectors for all XSD models in future.
Automatic Test Set Generation
TM is able to generate pseudo-random test data for any source format (RDBMS, XML, Java, Flat Files etc.) and any entities and attributes required for test purposes. The process is easily applied to developments where transforms are developed iteratively and pseudo-random test data is generated periodically for source data.
TM’s random data generator can generate pseudo-random numbers of type int, long, float or double. These numbers are created using an algorithm that takes a ‘seed’ and ‘grows’ a sequence of numbers from it. Initialising the algorithm twice with the same seed produces the same sequence because the algorithm is deterministic. This repeatability may be very important when testing. In addition, TM’s random data generator can also generate strings as a chosen set of words from a word file of random length. When creating a test data set it will also ensure that relationship cardinality will always be met.
To generate a pseudo-random test data set it is necessary to:
- Create a model of the data that is to be generated. Note that in creating such a copy model we may optionally create a subset of the model containing only the source entities that have been mapped, thus reducing the amount of test data we create each time.
- Create an ‘Identity’ Transform Project, which simply copies values from ‘source’ to the test data set to be populated. We refer to this as an identity transform as we set up the identification seeds within this transform.
- Run the Identity Project, connecting to the generic random source adapter that generates the pseudo-random data set.
Please refer to the whitepaper, Automatic Test Set Generation for further detail.
Performance Improvements
Multi-table Batching
To improve target insertion and update, pattern-batching algorithms that are much more theoretically efficient have been introduced.
Turning on pattern batching allows TM to continue to use JDBC batching operations even where constraints are present between various tables. In order to do this TM performs some very sophisticated dynamic pattern analysis on the items it is updating or inserting. It then groups items according to domain. Thus, the updates are grouped together by domain for efficiency but are still in the correct order to satisfy the constraints.
By default, pattern batching (sometimes referred to as batching across constraints) is disabled when TM is writing to a database target. It is usually called once in the $document transform. It is enabled by calling the function ALLOWBATCHONCONSTRAINTS(true).
Again, these improved algorithms take advantage of the fact that a number of update or insert operations may be batched together in one SQL query. The complexity arises because each batched set of inserts must operate on ONE table at a time only, yet must ensure that the relevant constraints are not violated. In other words, all parent items must be inserted before all child items.
High Speed RDBMS Adapter
A High Speed RDBMS Read Adapter has been introduced that is able to make fewer reads of larger sets of data making using of the IN Query RDBMS feature, thereby increasing performance dramatically. For example, when working with extremely large sets of data, (e.g. in the Geoscience sector), reading fairly complex tables
for 40 Wells and associated Wellbore, Completion and Production Data (nested 4 deep), the performance improvement in the execution time, using the High Speed RDBMS Read Adapter (including batch improvements discussed above) was a factor of 15 on Oracle 9 and 10 on Oracle 10g, compared to the standard RDBMS Read Adapter. In this relatively complex example, this represents approximately 780 rows per second on Oracle 9 when run locally and 285 rows per second when run on a remote server, and roughly 1100 rows per second on Oracle 10g when run locally and 400 rows per second when run on a remote server.
The High Speed RDBMS Read Adapter that capitalizes upon Oracle’s® ability to make complex IN queries has been added. It allows one IN query to be substituted for a very large number of individual queries. The adapter does not attempt to navigate the source piecemeal but reads an entire data set using a small series of automatically constructed, but complex, SQL queries.
To operate via the new mode go to Tools | Preferences… | SML Generation | Generate Fast JDBC Read Adapter Code and ensure the box is checked:

Dramatic improvements have been made when accessing source data using the TM High-Speed Read Adapter on both Oracle 9 (source read time reduced by factor of 29 and execution by a factor of 15) and Oracle 10g (source read time reduced by a factor of 32 and execution by a factor of 11). Furthermore, on Oracle 10g, even greater improvements are available by increasing the MaxRead count up to a maximum of 1000.

Figure 1 – Oracle 9 Summary Chart

Figure 2 – Oracle 10g Summary Chart

Figure 3 – Oracle 10g Increasing MaxRead Chart

Figure 4 – Transformation Examples Summary
GUI Changes
Various options are available on the model tree view and transform source/target tree views. These are available from the View context menu on an element or relationship, or from the preferences dialog.
1. Super types and sub types can be shown or hidden.
2. Inherited relationships can be shown. When selected on a complex type relationships and attributes from the super types will also be shown and can be used directly in a transform.
3. In the model view, local elements can be optionally hidden as it is more common for local elements to be reached via relationships than as the source or target of a main transform.
4. Complex types and element groups have different icons to elements. Super and sub type relationships are shown with up and down arrows respectively.
Miscellaneous
In addition to some bug fixes, this release includes improved TM Design Tool support for the alternative XSD loader, allowing relationships to be followed directly from an element, which avoids the need to map from the complex type in some cases. Some internal restructuring has been done to facilitate further improvements for the next release.
