Transformation Manager Glossary

 

This Glossary contains definitions and explanations of all of the special terms used in the online help. It also contains a few terms that occur in the environment in which TM is used; in these cases the equivalent TM term is indicated.

Term

Definition

Aggregate

A collection, or array, of values. The many side of a one-to-many relationship returns an aggregate. Each item within an aggregate is an instance. SML can treat a singleton as an aggregate with one instance.

Alternative Group

A group of alternative elements. Exactly one of the elements listed must appear in the instance document. Known as xs:choice in XML Schema.
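
As a minimal sketch of how an alternative group behaves, the following Java uses the standard javax.xml.validation API to validate documents against a small schema. The schema, the element names CONTACT, EMAIL and PHONE, and the class name are hypothetical and exist only for illustration; this is not how TM itself processes models.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class ChoiceGroupSketch {
    // Hypothetical schema: CONTACT must contain exactly one of EMAIL or PHONE.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='CONTACT'><xs:complexType><xs:choice>"
      + "<xs:element name='EMAIL' type='xs:string'/>"
      + "<xs:element name='PHONE' type='xs:string'/>"
      + "</xs:choice></xs:complexType></xs:element></xs:schema>";

    // Returns true when the instance document conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document breaks a schema rule
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A CONTACT containing only EMAIL, or only PHONE, is accepted; a CONTACT containing both, or neither, is rejected.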

Annotation

Provides information about the XSD (or its components) to the user or to other applications.

Attribute

An element parameter that modifies or refines the meaning of the element.
An XML attribute, such as title in <BOOK title='fred'>. This corresponds to a column in a relational database table. Note that the pseudo-attribute $Content is also available to represent immediate node content.
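
For readers who want to see an XML attribute being read programmatically, here is a short sketch using the standard Java DOM API. It reads the title attribute from the BOOK example above. The class and method names are hypothetical, and the code is independent of TM and SML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class AttributeSketch {
    // Parses the XML text and returns the value of the root element's title attribute.
    static String readTitle(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return root.getAttribute("title");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Calling readTitle with <BOOK title='fred'/> returns fred.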

BLOB

Binary Large Object - any binary data stored as an entity in a database. Images, sound and other similar data are typically stored as BLOBs.

Build

To build a project is to generate Java source files from the project definition held in the repository, and then compile them into executable programs. These two steps, known as the generation and compilation phases, are usually performed as one step by the builder.

Cardinality

A data modelling term describing the number of items a relationship points at. A relationship may have single cardinality, such as 'wife' (I have one wife), or multiple cardinality, such as 'children' (I have one or more children). See also Relationship.

Cascade

A Transformation Manager term used to describe one transform calling another. This is similar to a subroutine call.

For example, consider the assignment B := E; where B and E are relationship names. The destination element of the B relationship is the D element, and the destination element of the E relationship is the F element. When TM reaches the B := E; statement it calls a transform between the destination elements of the B and E relationships, i.e. a transform between D and F.

TM has built-in cycle detection to avoid infinite loops when two or more transforms call each other in a repeated sequence.
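
The idea of cycle detection can be sketched in Java as follows. This is only an illustration under stated assumptions: the call graph (a map from a transform name to the transforms it cascades to), the class name and the skip-on-repeat policy are all hypothetical, and TM's actual mechanism may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class CascadeSketch {
    // Runs the start transform and everything it cascades to,
    // returning the names in the order they were entered.
    static List<String> run(String start, Map<String, List<String>> calls) {
        List<String> executed = new ArrayList<>();
        visit(start, calls, new ArrayDeque<>(), executed);
        return executed;
    }

    private static void visit(String name, Map<String, List<String>> calls,
                              Deque<String> active, List<String> executed) {
        if (active.contains(name)) {
            return; // already on the current call stack: stop here to avoid an infinite loop
        }
        active.push(name);
        executed.add(name);
        for (String callee : calls.getOrDefault(name, List.of())) {
            visit(callee, calls, active, executed);
        }
        active.pop();
    }
}
```

With a graph in which A calls B and B calls A, run("A", graph) enters A, then B, and then stops instead of re-entering A.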

Child

The child axis contains the children of the context node. (See XPath.)

Class

A term that describes a specification for a collection of objects with common properties.

Classpath

A classpath is used by a Java application to locate the Java classes it needs to run. The Classpath lists the directories, .jar files, or .zip files that contain the compiled Java classes.
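
As a small illustration, a running Java program can list its own classpath entries through the standard java.class.path system property. The class and method names below are hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class ClasspathSketch {
    // Splits a classpath string into its entries using the given separator.
    static List<String> split(String classpath, String separator) {
        return Arrays.asList(classpath.split(Pattern.quote(separator)));
    }

    // Returns the entries of the classpath the current JVM was started with.
    // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
    static List<String> currentEntries() {
        return split(System.getProperty("java.class.path", ""), File.pathSeparator);
    }
}
```

Each entry returned is a directory, a .jar file or a .zip file, as described above.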

CLOB

Character Large Object. A data type that accepts character strings of large length, up to the maximum length specified in the data type declaration. CLOBs are mostly used for columns that can hold more than 4192 characters.

Column

A column in a table in a relational database corresponds to an XML attribute.

Compile

See Build.

Complex Type

Defines the attributes and contents of an element.

Content Group

Lists the child elements belonging to an element. It can further define content groups within itself.

Context Menu

A menu of options or actions which are relevant only to a specific situation. Context menus are displayed by right-clicking on items such as nodes in a model; different options are available depending on the particular node selected.

Data dictionary

In database management systems, a file that defines the basic organisation of a database. A data dictionary contains a list of all files in the database, the number of records in each file, and the names and types of each field.

Data dictionaries do not contain any actual data from the database, only bookkeeping information for managing it. Without a data dictionary, a database management system cannot access data from the database.

Data Model

A model describes a data structure. The TM repository stores information about the structures and formats of the source and target models. A model may describe a database, a set of XML instances or a Java class. A database model consists of the definitions of the tables, columns and constraints. An XML model can be defined in a DTD or an XSD. Both describe the valid elements and attributes for a set of XML instances.

Data Oriented XML

XML that is primarily used as a data transport mechanism and is designed for machine consumption. The XML file follows a precise structure, with fine-grained elements which are often subject to strict constraints. There is little or no mixed content. The order of sibling elements is usually unimportant. The XML is usually machine generated.

See also Document Oriented XML.

Data store

General term describing a mechanism of storing data, with no implications about the kinds of data stored or the mechanisms used for storage or retrieval.

DDL

Data Definition Language. A subset of SQL92 which is used to create and amend database objects. It includes the structure definition statements such as CREATE and ALTER.

Deployment

Installation and use of completed transformation programs in a production environment. Transformation programs created with TM can be run either as ‘stand-alone’ utilities or be integrated into larger data-processing systems. See also Production System.

Destination

The entity or node to which a relationship is defined. Not to be confused with a target data store in a transform.

Development System

See Production System.

DML

Data Manipulation Language.  A set of statements used to store, retrieve, modify, and delete data from a database.

Document Element

The outermost element in an XML document hierarchy. It is the ancestor of all other elements. Also known as the Root Element.

Document Oriented XML

A document-oriented XML file has little structure, with most of the elements being coarse-grained. The XML is primarily designed for human consumption and does not have a particularly regular structure. The ordering of sibling elements is usually very important. The XML is usually written by hand.

See also Data Oriented XML.

DOM

Document Object Model. A W3C specification. A set of language-independent interfaces for programmatic access to the elements of an XML document.

DTD

Document Type Definition. A DTD can be embedded in an XML document, or be specified in an external document and referenced from an XML document. In either case, it specifies the rules to which the document claims to conform. A document can be validated against the DTD; see Valid. A DTD has its own syntax; it is not an XML document.

Element

An identifiable object in an instance document. It can contain text and/or child elements or be empty. An element in XML is equivalent to a table in a relational database.

Entity

Data modelling - a unit of data that can be classified and can have stated relationships to other entities.

XML - a named object that can be referred to.

Expression

A statement in SML which returns a result, such as a calculation or a true/false test. All functions in SML which accept a variable or attribute as an input argument will also accept an expression which evaluates to a value of the correct data type.

External Function

See Java Function

Facet

Restrictions are used to control acceptable values for XML elements and attributes.  Restrictions on XML elements are called facets.

Field

A general term for a single piece of data, corresponding to a column in a relational database table or an ‘attribute’ in XML.

Function

Sometimes used as a short form of Java Function

Generate

See Build.

Global Variable

Declarations (usually of variables) that are available to the entire project. TM and SML support scoping.

Group

See Inclusive Group, Alternative Group, and Sequential Group.

GUI

Graphical User Interface. The TM Design Tool can only be used through its graphical interface; the TM Test Tool has a graphical interface, but it can also be used by typing commands at a command prompt.

Identifying attribute

An identifying attribute corresponds to a key column in a relational database. SML provides control over the way multiple occurrences of the same data in the source are handled; an identifying attribute can be used to ensure that only a single instance occurs in the target.

Inclusive Group

Either none or all of the elements listed must appear in the instance document. The elements may be in any order.

Instance

Broadly speaking, a ‘record’ in a data store. An instance in an XML document corresponds approximately to a row in a relational database table. An instance usually conforms to some model data held as meta data.

Inter-Project Relationship (IPR)

Inter-Project Relationships (IPRs) link the primary project with its associated secondary projects. The secondary projects are invoked by calling IPRs in the transformation statements.

Iterator

The control variable in a FOR_EACH loop.

Java

Java is an object-oriented language.  Java source code files (files with a .java extension) are compiled into bytecode (files with a .class extension), which can then be executed by a Java interpreter. Compiled Java code can run on most computers because Java interpreters and run-time environments, known as Java Virtual Machines (JVMs), exist for most operating systems.

Java Function

A function defined in Java which provides features not available in SML. Java functions use Java instead of SML. They are used to fulfil unusual requirements which can not be easily satisfied with normal SML functions. Often Java functions are simply calls to standard Java library routines.

The TM Design Tool enables Java source to be entered to define the function. Note that these functions are static and global, and must have a return value.

Java run-time

See JRE.

Java Scalar Types

Also known as primitive data types.  These are:

byte

short

int

long

float

double

char

boolean
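
As a quick reference, the sketch below collects the storage width in bits of each numeric and character primitive, using the SIZE constants of the standard wrapper classes. The class and method names are hypothetical. boolean is omitted because Java does not define a SIZE constant for it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ScalarTypesSketch {
    // Maps each primitive type name to its width in bits, in declaration order.
    static Map<String, Integer> sizesInBits() {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        sizes.put("byte", Byte.SIZE);
        sizes.put("short", Short.SIZE);
        sizes.put("int", Integer.SIZE);
        sizes.put("long", Long.SIZE);
        sizes.put("float", Float.SIZE);
        sizes.put("double", Double.SIZE);
        sizes.put("char", Character.SIZE);
        return sizes;
    }
}
```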

JDBC

JDBC (Java Database Connectivity) is the mechanism that Transformation Manager uses to access tabular data sources such as relational database tables.

JDK

JDK (Java Development Kit) is the short name for the set of Java development tools, consisting of the API classes, a Java compiler and the Java Virtual Machine interpreter. The JDK is used to compile Java applications and applets.

JRE

Stands for Java Runtime Environment. The JRE enables Java applications to be run on a native machine.

JVM

The Java Virtual Machine executes the instructions that a Java compiler generates. This run-time environment, or JVM, is embedded in various products, such as web browsers, servers and operating systems.

Java was designed to allow application programs to be built that could be run on any platform without having to be rewritten or recompiled by the programmer for each separate platform. A Java virtual machine makes this possible because it is aware of the specific instruction lengths and other particularities of the platform.

LiteralPath

A term used to describe a path through related nodes. The LiteralPath may be an aggregate path, i.e. one that moves to the many side of a one-to-many relationship. It may also be a combination of aggregate and single cardinality relationships.

Locale

A Locale represents a specific geographical, political, or cultural region.

By holding a default locale, computer systems can format certain information according to the customs and conventions of the user's native country, region or culture. A good example is the difference in date format between the UK and the USA. The UK convention is dd/MM/yyyy; the US convention is MM/dd/yyyy.
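
The two date conventions above can be sketched with the standard java.time API. The patterns are the ones quoted in this entry; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class LocaleDateSketch {
    // Formatters built from the UK and US patterns quoted in the glossary entry.
    private static final DateTimeFormatter UK =
        DateTimeFormatter.ofPattern("dd/MM/yyyy", Locale.UK);
    private static final DateTimeFormatter US =
        DateTimeFormatter.ofPattern("MM/dd/yyyy", Locale.US);

    // Formats the given date using the UK convention (day first).
    static String formatUk(int year, int month, int day) {
        return LocalDate.of(year, month, day).format(UK);
    }

    // Formats the given date using the US convention (month first).
    static String formatUs(int year, int month, int day) {
        return LocalDate.of(year, month, day).format(US);
    }
}
```

For 7 March 2024, formatUk(2024, 3, 7) gives 07/03/2024 and formatUs(2024, 3, 7) gives 03/07/2024.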

Meta-data

A data modelling term simply meaning "data about data". A DTD or Oracle catalogue, for example, is normally described as meta data. However note that the term is relative - an XML instance document which contained SQL DDL statements is an instance document from one perspective and yet meta data from another perspective. The same is true of the actual data rows contained in an Oracle catalogue or other system tables.

Migration

See Transformation. Usually used when the transform is used to define a route from one model to a previous or successive version of itself.

Mixed Content

An XML term.  A combination of text and elements. For example, a paragraph may contain text and emphasis elements.

Namespace

A namespace is defined by a URI, and is an ‘environment’ within which element and attribute names are guaranteed to be unique.
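
A brief sketch of how a namespace-aware parser reports the namespace URI and local name of an element, using the standard Java DOM API. The URI urn:example:books, the prefix b and the class name are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class NamespaceSketch {
    // Returns { namespace URI, local name } of the document's root element.
    static String[] describeRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in the JAXP DOM factory
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return new String[] { root.getNamespaceURI(), root.getLocalName() };
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

For <b:BOOK xmlns:b='urn:example:books'/> the result is the URI urn:example:books and the local name BOOK; the prefix b is only a shorthand for the URI.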

Nesting (functions)

Using a function ‘within’ a function so that the result from the ‘inner’ function becomes an input argument for the ‘outer’ function.

Node

A term used to represent every item of significance in an XML document, including attributes, elements or textual data.

Object

In object-oriented programming, for example, an object is a self-contained entity that consists of both data and procedures to manipulate the data.

Object-orientated programming

A type of programming in which developers define both the data type of a data structure and the functions that can be applied to the data structure. In this way, the data structure becomes an object that includes both data and functions. In addition, programmers can create relationships between one object and another. For example, objects can inherit characteristics from other objects.

ODBC

Abbreviation of Open DataBase Connectivity, a standard database access method developed by Microsoft Corporation. The goal of ODBC is to make it possible to access any data from any application, regardless of which database management system (DBMS) is handling the data. ODBC manages this by inserting a middle layer, called a database driver, between an application and the DBMS. The purpose of this layer is to translate the application's data queries into commands that the DBMS understands. For this to work, both the application and the DBMS must be ODBC-compliant; that is, the application must be capable of issuing ODBC commands and the DBMS must be capable of responding to them.

Optionality

A data modelling term describing whether a relationship must exist. Thus relationships such as 'wife' or 'children' are optional, whereas the relationship to my 'National Insurance Number' is mandatory (at least in the UK and USA).

Origin

The entity or node from which a relationship is defined. Not to be confused with the source data store in a transform.

Parent

The parent axis contains the parent of the context node, if there is one. (See XPath.)

Particle

Specifies whether the list of elements in a content group belongs to a Sequential Group, Alternative Group or Inclusive Group.

Polymorphism

Polymorphism refers to a programming language's ability to process objects differently depending on which class of object and data type they are.
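
A minimal Java sketch of polymorphism: the same call, area(), is handled differently depending on the class of the object it is made on. The Shape, Square and Rectangle types are hypothetical examples.

```java
final class PolymorphismSketch {
    // Common interface: every shape can report its area.
    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Rectangle implements Shape {
        private final double width;
        private final double height;
        Rectangle(double width, double height) { this.width = width; this.height = height; }
        @Override public double area() { return width * height; }
    }

    // Sums the areas without knowing the concrete class of each shape.
    static double totalArea(Shape... shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

totalArea never checks the type of each shape; the correct area() implementation is selected at run time.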

Production system

The computer system containing the programs and data which a business or organisation uses for its day-to-day operations. It is distinct from a development system, which is used to create new programs and amend existing ones. There may also be a separate testing system. It is not normally necessary for TM to be installed on a production system where transformations are performed on ‘live’ data.

Property

Characteristic of an object.

Pseudo-attribute

Pseudo-attributes are special attributes that provide attribute-like access to non-attribute data. For example, $Content represents an element's immediate text content and $Name represents the local name of the element. Pseudo-attributes are used in assignment statements to simplify the copying of data. For example, to reference all text content in an XML node and its sub-nodes, the pseudo-attribute $BigContent can be used in a single assignment statement instead of writing several separate statements. See the Language Reference for full details of all of the pseudo-attributes available in SML.

RDBMS

Short for Relational Database Management System. Data is stored in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.

Record

A general term for a group of related data, usually referring to a single row in a relational database table or an ‘element’ in an XML document.

Re-generate

To re-create Java source files for a changed project. See also Build.

Relation

An almost obsolete term for a table in a relational database. TM uses the XML term ‘element’.

Relational Database Management System

See RDBMS.

Relationship

The definition of the correspondence between one or more instances of an origin element and one or more instances of a destination element. The existence of a destination element can be optional. Useful relationships usually have a specific set of properties that determine their semantic meaning. These can be positional (some XPath tells me how to get from the origin to the destination), rely on value equivalence (some combination of values in the origin has the same combination of values in the destination) or be intrinsic (i.e. in Java object models).

The cardinality of a relationship defines whether the origin element relates to one destination element, or to many. The optionality defines whether the relationship must exist.

Repository

The database used by TM to store projects, transforms and models. The repository stores meta data, it does not store actual data. The TM repository is a relational database. It is completely separate from the data stores which are sources and targets for transformation projects.

Root Element

The element that encloses an entire XML document, and as such has no parent.  Another term for Document Element

Row

TM uses the XML term ‘element’.

Schema

The structure of a database system, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables.

Schemas are generally stored in a data dictionary. Although a schema is defined in a textual database language, the term is often used to refer to a graphical depiction of the database structure.

Segment

TM uses the XML term ‘element’.

Sequential Group

The elements listed must appear in the instance document in sequential order.

Simple Type

Defines the datatype associated with an attribute or elements with text-only content.

Singleton

An item of which there can be only one. For example, an attribute value for the destination end of a relationship with single cardinality.

Source

The XML document or RDBMS containing the data which is to be transformed or migrated.

SQL and SQL92

Structured Query Language. Originally designed for maintaining and reporting on data held in relational databases, ‘SQL grammar’ is now widely used for extracting and filtering data from many different kinds of data stores. ‘SQL92’ is the international standard version of the language established in 1992. Most databases provide partial support for SQL92 but also extend it with their own types.

Stack

This is a Transformation Manager term.  Where a project cascades through a number of dependent transforms, those transforms can be referred to as a stack.

stderr

The standard error stream, to which programs write error messages. In DOS this will be seen as output to the console.

stdout

The standard output stream, to which programs write their normal output. In DOS this will be seen as output to the console.

Table

A table in a relational database corresponds to an XML element.

Target

The RDBMS to which data is to be written, or the XML document to be created, when a transformation is performed.

Terminal Element

An element that only includes $Content, i.e. it has no child elements or attributes.

TM Adapter

A TM component used to connect to source and target models and instance data.

TM Design Tool

The TM component which is used for creating and developing transformation projects.

TM Test Tool

The TM component which provides facilities for running transformation projects which have been developed with the TM Design Tool. Intended primarily for testing transformation projects during development.

Tools.jar

A Java archive (.jar) file provided by Sun which, when added to the JRE, allows Java code to be compiled.

Transform

A transform specifies how elements in the source are transformed to elements in the target. For relational models, the source or target element is a table. For XML models, the source or target elements are specified using XPath. A transform can contain simple assignments, or it can include processing instructions to change a format or a value (for example, a number of litres in the source may need to be converted to gallons in the target). It describes how a specific part of the source model will transform to a specific part of the target model. Each transform comprises one or more transformation statements, most of which are assignment statements. Assignment statements can simply copy attributes from the source to the target, or specify sophisticated paths through relationships.

Transform Project

The complete specification of a data transformation or migration process defined in SML. A project contains one or more transforms plus references to the source and target models, global declarations and Java functions. Projects are stored in the TM repository.

Transformation

The conversion of data from one form to another. Often used as an approximate synonym for ‘migration’, which generally means the movement of data from one database to another. Migration can mean straightforward copying, but it most often involves transformation, where the data inserted or merged into the target database is different in some way from the data in the source database.

Tuple

An almost-obsolete term for a row in a relational database. TM uses the XML term ‘element’.

UDF

User Defined Function - See Java Function.

Unicode

Computers store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers.

Unicode provides a unique number for every character, which is platform, program and language independent.
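
The number Unicode assigns to a character (its code point) can be inspected with standard Java string methods, as in this small sketch. The class and method names are hypothetical.

```java
final class UnicodeSketch {
    // Returns the Unicode code point of the first character in U+XXXX notation.
    static String codePoint(String text) {
        return String.format("U+%04X", text.codePointAt(0));
    }
}
```

For example, the letter A is U+0041 and the letter é is U+00E9, on every platform and in every program.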

URI

Uniform Resource Identifier. A generic term which includes URL (Uniform Resource Locator) and URN (Uniform Resource Name). URL is now regarded as ‘an informal term (no longer used in technical specifications) associated with popular URI schemes: http, ftp, mailto, etc.’ [W3C].

URL

Uniform Resource Locator. The URL is the address of a resource, or file, available on the Internet. The URL contains the protocol of the resource (e.g. http:// or ftp://), the domain name for the resource, and the hierarchical name (address) of the file. For example, a page on the Internet may be at the URL http://www.etlsolutions.com/Product/productStrengths.htm. The first part, http://, gives the protocol. The next part, www.etlsolutions.com, is the domain name: www points to a computer or resource, while the main domain is etlsolutions.com. The rest, /Product/productStrengths.htm, points to the specific file on that server.

A URL could point to other things such as programs, graphic files, or other resources available on the Internet.
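
The parts of a URL described above can be separated with the standard java.net.URI class, as in this sketch. The class and method names are hypothetical.

```java
import java.net.URI;

final class UrlPartsSketch {
    // Returns { protocol (scheme), domain name (host), path to the file }.
    static String[] parts(String url) {
        URI uri = URI.create(url);
        return new String[] { uri.getScheme(), uri.getHost(), uri.getPath() };
    }
}
```

For http://www.etlsolutions.com/Product/productStrengths.htm the three parts are http, www.etlsolutions.com and /Product/productStrengths.htm.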

User-Defined Function

See Java Function

Valid

A valid XML document is both well-formed and conforms to its DTD or XSD. A document that does not have a DTD can only be well-formed.

Variable

See Global Variable.

W3C

World Wide Web Consortium. An organisation which develops open standards for interoperable technologies on the World-Wide Web.

Well-formed

A well-formed XML document conforms to the generic XML rules for how a document should be structured. Specifically, each start tag must have an end tag, and elements must be properly nested.
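
One way to check well-formedness is simply to parse the document and see whether the parser reports a fatal error. The sketch below does this with the standard Java DOM parser; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedSketch {
    // Returns true if the text parses as XML, false if the parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and rethrows fatal errors as exceptions.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // e.g. a missing end tag or badly nested elements
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

<a><b/></a> is well-formed; <a><b></a> is not, because the b element has no end tag before a is closed.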

White Space

A character used to separate words in text, and parameters in markup. White space characters include space (' '), horizontal tab and end-of-line codes.

XML Parser

Software designed for checking for syntactical and logical errors within an XML file

XML Schema (XSD)

A standard for modelling data-oriented XML documents that overcomes the limitations imposed by DTDs. It is an XML document in its own right.

XPath

A specification for traversing XML nodes, specifying either a relative position from the node in context, or an absolute position from the root node.
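
As an illustration of an absolute XPath, the sketch below evaluates an expression from the root of a document using the standard javax.xml.xpath API. The LIBRARY and BOOK element names and the class name are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathSketch {
    // Evaluates the expression against the XML text and returns the result as a string.
    static String evaluate(String expression, String xml) {
        try {
            return XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML", e);
        }
    }
}
```

Given <LIBRARY><BOOK title='fred'/><BOOK title='jim'/></LIBRARY>, the absolute path /LIBRARY/BOOK[2]/@title selects the title of the second BOOK, which is jim.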

XSD Facet

A restriction used to control the acceptable values for XML elements and attributes. Defined as part of the XSD.

XSLT

Extensible Stylesheet Language Transformations. A language for transforming XML documents from one form to another.
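
A minimal sketch of running an XSLT stylesheet from Java with the standard javax.xml.transform API. The stylesheet, which writes the title attribute of a BOOK element as plain text, and the class name are hypothetical; this is unrelated to how TM transforms are defined in SML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSketch {
    // Hypothetical stylesheet: outputs the BOOK element's title attribute as text.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/BOOK'><xsl:value-of select='@title'/></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the transformed output.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Applied to <BOOK title='fred'/>, the stylesheet produces the text fred.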