Data Source Sync

The Data Source Sync system is a mechanism for updating metadata from a remote data source. There are currently two types of data sources supported: database (via JDBC), and XML. The sync system is controlled by a single configuration file, called syncedMetadata.xml, which is located in the .../netx/config directory of the NetX installation. Sync events are logged in exogen.log.

Sync tasks

There are currently three different types of sync tasks that the system can perform: scalar, multivalue, and valuelookup.

Scalar Sync Task

This type of task updates the metadata for a group of assets, categories, or views. The incoming data is linked to attributes in the DAM on one or more fields. A search is done for the assets, categories or views to update for each row of incoming data (either directly from a database, or from XML), based on the linked fields. If one or more objects are found that match each row, the metadata for those objects are updated from that data row.

Multivalue Sync Task

This type of task updates the group of values in a multivalued attribute (i.e. a pulldown or tag.) Note that this type of sync task doesn't update specific assets, just updates the group of valid values for a particular attribute.

Valuelookup Sync Task

This type of task updates a single attribute of a single asset, category, or view. It is useful for setting initial values on linked attributes when new assets are imported into the system.

Scalar Sync Type

When you set up the sync system, you will define one or more sync tasks. Each task defines a set of source data, a set of destination objects, and how the source data is to be matched to the destination objects. You can define several different types of sync tasks.

Lets look at a simple example:

<?xml version="1.0" encoding="UTF-8" ?>
<metadatasync version="1.0">
<sync type="scalar" log="debug" name="UpdateTeams">
	<source name="DB1" type="database" jdbc="jdbc:mysql://127.0.0.1/synced_metadata" query="SELECT * FROM teams" username="sa" password="secret">
		<field name="teamId" column="id"/>
		<field name="teamName" column="name"/>
		<field name="teamHome" column="teamhome"/>
		<field name="league" column="league"/>
		<field name="conference" column="conference"/>
		<field name="division" column="division"/>

		<callback type="success" sql="UPDATE teams SET synced = 1 WHERE id = %1" param1="teamId" frequency="row"/>

		<callback type="failure" sql="UPDATE teams SET sync_failed = 1 WHERE id = %1" param1="teamId" frequency="row"/>
	</source>

	<destination type="DAM">
			<records name="Hockey teams" type="asset">
				<link field="teamId" attribute="Team Id" />

				<map field="conference" attribute="Conference"/>
				<map field="division" attribute="Division"/>
				<map field="teamHome" attribute="Team Home"/>
				<map field="league" attribute="League"/>
				<map field="teamName" attribute="Team Name"/>
 
				<setvalue attribute="Games Won" value="0"/>
			</records>
		</destination>
	</sync>
</metadatasync>

 This configuration file defines a single sync task, named “UpdateTeams”. Each sync task must contain a single source and a single destination.

 In this example, we're going to get data from a database table and use it to update the metadata for some assets in our DAM. We're going to get data from the teams table in our database, which looks something like this:

In the source tag, we set the JDBC URL to connect to the database, as well as the username and password to use. The SQL query is also specified that will return the rows of data that we want to use. This can be any query that returns a result set from the database, including calling a stored procedure that returns a result set.

Inside the source tag, we define the fields that we want to take from the result set in a set of field tags. These are the values that will be available in the destination. The column attribute of the field tag contains the column name as returned by the SQL query, and the name attribute is how you will refer to this field in the rest of the configuration file (i.e. this is a name you specify, essentially a variable; it can be the same as the column name if you want.)

The last items defined inside the source tag of this example are two callbacks. Callbacks are available in the database data source type, and will be called upon the failure or success of the data update. A SQL query is defined, as well as the callback type (success or failure). Up to five parameters may be sent to the callbacks, specified inside the query as %1 - %5. The param1param5 attributes contain the fields used for the parameters (these must be fields defined in a field tag.) The frequency attribute controls whether the callback is called for each row, or once after the entire data set has been processed.

Currently, the only valid type of destination is DAM. There could be others defined in the future. Each destination tag contains one or more records tags. The records tag defines the mapping between database fields and attributes, as well as specifies how each row of source data will be matched to assets, categories, or views. It also specifies which type of attributes to update (asset, category, or view).

The link tag defines how source data rows are matched to assets. There can be multiple link tags, but there must be at least one link tag. This tag has two attributes, a field attribute that will specify one of the fields defined in the data source (must match value set for name in the field tag), and an attribute attribute that names the metadata attribute in the DAM that the field will be matched to. It is important to note that it is possible for more than one asset to match each row of data in the source data (it is also possible that 0 assets will match.)

In the above example, the id column in the database is linked to the Team ID attribute through the teamId field . Lets say that the data source contains five rows, like these:

When the data is processed, each row will be processed in turn. So starting with the first row, the sync system will search for all assets which have the value “35” in their Team ID attribute. If eight assets are found with this value, then all eight assets will be updated from the first data row.

We define the attributes which will be updated in these eight assets with the map tags. Each map tag maps a field to an attribute in the DAM. So for each of the eight matching assets, the attributes named in the map tags will be updated to the matching values in the source data row.

Additionally, attributes can be initialized with the setvalue tag. This sets a specified attribute to an explicit value defined in the tag (see above example).

Once all eight assets have been updated, and any appropriate callbacks have been called, the next row will be processed (the one where id = 36.)

A sync task can contain a third type of tag, which is the value_map tag. This allows you to map discrete values in the source data to new values in the destination data. For example, you might define the following value map:

<value_map name="league_map">
	<map_value from="WHL" to="Western Hockey League"/>
	<map_value from="NHL" to="National Hockey League"/>
	<map_value from="OHL" to="Ontario Hockey League"/>
	<map_value from="QMJHL" to="Quebec Major Junior Hockey League"/>
	<map_value from="CHL" to="Canadian Hockey League"/>
</value_map>

<destination type="DAM">
	<records name="Example" type="asset">
		<link field="teamId" attribute="Team Id" />
		<map field="league" attribute="League Name" value_map="league_map"/>
	</records>
</destination>

In this example, values in the source data will be checked to see if they match anything in the value map, with new values being substituted if they match, before updating the destination attribute. You can define multiple mappings within a value map and each value map relates only the task in which it's defined. When configuring the destination for the data, you can then also specify which value map you want to use for each mapped attribute (express within the map element, as illustrated in the example above).

Multivalue Sync Type

The multivalue sync type allows you to update a pulldown or tag attribute with a new set of allowed values (i.e. update the domain for the attribute.) Let's have an example!

Suppose you have a pulldown attribute that displays a list of hockey league acronyms:

Now let us suppose that you have a database table with hockey league information, with this data:

You might like to periodically update the available list of leagues in your pulldown attribute from the data in the table, so that when new leagues are added, they will appear in the pulldown, and when leagues are removed, they are removed from your pulldown.

This is how to set up a multivalue sync type:

<sync type="multivalue" log="debug" name="UpdateLeagues">
	<source name="DB1" type="database" jdbc="jdbc:mysql://127.0.0.1/synced_metadata" query="SELECT league_name_short FROM leagues WHERE visible = 1 ORDER BY league_name_short" username="sa" password="secret">
		<field name="leagueNames" column="league_name_short"/>
		<callback type="success" sql="UPDATE leagues SET synced = 1" frequency="end"/>
		<callback type="failure" sql="UPDATE leagues SET sync_failed = 1" frequency="end"/>
	</source>
	<destination type="DAM">
		<records name="Hockey leagues" type="asset">
			<map field="leagueNames" attribute="League"/>
		</records>
	</destination>
</sync>

This database query will produce a single-column result set with multiple values. These values will be used to set the list of values allowed in the pulldown mentioned in the destination records block.

Value Lookup

A third type of sync task is available, which allows you to update a single asset with values looked up from the data source. This is useful in situations where you need to set an id field on a newly imported asset so that subsequent sync operations will update that asset.

This type of sync task must be triggered by an AutoTask, because it must have context information available, which is supplied by the AutoTask. The following is an example of a value lookup task definition.

<sync type="valuelookup" log="debug" name="GetTeamID">
	<source name="Hockey DB" type="database" jdbc="jdbc:mysql://127.0.0.1/synced_metadata" query="SELECT getTeamId(%1, %2) AS team_id" username="sa" password="secret">
		<param type="ASSET_FILE_BASENAME" position="1"/>
		<param type="CATEGORY_NAME" position="2"/>
		<field name="TeamID" column="team_id"/>
	</source>
	<destination type="DAM">
		<records name="Imported Asset" type="asset">
			<map field="TeamID" attribute="Team Id"/>
 
			<setvalue attribute="Games Won" value="0" valueLookup="success"/>
		</records>
	</destination>
</sync>

The param elements in the data source definition use context data supplied by the AutoTask to build the query used to look up the desired data. Please note: the query used for this sync task type should result in a single row with a single column of data, i.e. a single value with which to update a single attribute on a specific asset. The type attribute of the param element names the context data item to be used, and the position attribute specifies the location in the query where that item should be substituted (if position is “1”, then the string “%1” will be replaced with that value in the query.)

These are the context values which are defined:

Name

Description

ASSET_FILENAME

Full asset file name

ASSET_FILE_BASENAME

Asset file name without extension

ASSET_FILENAME_EXTENSION

Asset file name extension

ASSET_ID

Asset ID

ASSET_ATTRIBUTE

Value of named attribute on the asset

CATEGORY_NAME

Name of category

CATEGORY_PATH

Full path name of category

CATEGORY_ID

Category ID

CATEGORY_ATTRIBUTE

Not currently implemented

USER_ID

ID of user who triggered the autotask

DATASET_ID

ID of dataset which triggered the autotask

Placeholders in the SQL query should NOT be wrapped in quotes. When the placeholders are substituted with data, the type of the data is taken into account and wrapped with quotes if necessary.

The autotask job for using a ValueLookup type task is different than the one for other tasks, because you must specify which task is the one that looks up the ID value, and then list the other tasks you want to run using the new ID. Here is an example of an autotask configuration for ValueLookup:

<autotask>
	<task name="ValueLookupJob Task">
		<matchCriteria type="or">
		<criteria type="action" value="import" />
		<criteria type="action" value="resync" />
		</matchCriteria>
		<customJob idSyncName="lookupIdTask" initMetadata="task1,task2,task3" className="com.netxposure.products.imageportal.module.metadata.sync.job.ValueLookupJob"/>
	</task>
</autotask>

 

Filters

There is a filtering mechanism to limit the set of assets updated by a particular sync task. There are currently two types of filters available. The first is a Solr query filter, where a Solr query is specified that limits the assets that can be matched to a specific row of data. In terms of sets, the set of assets that will be updated if there is a Solr query filter is the intersection of the Solr query result set and the set of assets that match on the linked fields for that data row.

Filters are defined in the records block. For example, for a particular sync task, you may want to limit the assets updated to the specific assets where the league = “NHL”:

<destination type="DAM">
	<records name="Hockey teams" type="asset">
		<filter type="solrquery" query="league:NHL"/>
		<link field="teamId" attribute="Team Id" />
		<map field="conference" attribute="Conference"/>
		<map field="division" attribute="Division"/>
		<map field="teamHome" attribute="Team Home"/>
		<map field="league" attribute="League"/>
		<map field="teamName" attribute="Team Name"/>
		<map field="league" attribute="League Name" value_map="league_map"/>
	</records>
</destination>

The other type of filter is a context filter. This type of filter will only work if the sync task is being run from an AutoTask where there is context information available (i.e. an AutoTask triggered by import/resync/reimport, etc.) This type of filter is useful, for example, if you need to give a newly imported asset initial values for a set of its attributes. For example:

<sync type="scalar" log="debug" name="InitialValues" include_in_default="false">
	<source name="DB1" type="database" jdbc="jdbc:mysql://127.0.0.1/synced_metadata" query="SELECT * FROM teams WHERE id = %1" username="sa" password="secret">
		<param type="ASSET_ATTRIBUTE" position="1" attribute="Team Id" datatype="integer"/>
		<field name="teamId" column="id"/>
		<field name="teamName" column="name"/>
		<field name="teamHome" column="teamhome"/>
		<field name="league" column="league"/>
		<field name="conference" column="conference"/>
		<field name="division" column="division"/>
	</source>
	<value_map name="league_map">
		<map_value from="WHL" to="Western Hockey League"/>
		<map_value from="NHL" to="National Hockey League"/>
		<map_value from="OHL" to="Ontario Hockey League"/>
		<map_value from="QMJHL" to="Quebec Major Junior Hockey League"/>
		<map_value from="CHL" to="Canadian Hockey League"/>
	</value_map>
	<destination type="DAM">
		<records name="Imported Asset Initial Values" type="asset">
			<filter type="context">
				<context type="ASSET_ID" />
			</filter>
			<link field="teamId" attribute="Team Id" />
			<map field="conference" attribute="Conference"/>
			<map field="division" attribute="Division"/>
			<map field="teamHome" attribute="Team Home"/>
			<map field="league" attribute="League"/>
			<map field="teamName" attribute="Team Name"/>
			<map field="league" attribute="League Name" value_map="league_map"/>
		</records>
	</destination>
</sync>

This task is designed to set initial values for a set of attributes after an asset has been imported. You would want to run this after a value lookup sync task to make sure that the Team Id attribute was given the correct value. The context filter ensures that the only asset updated by this task is the one with the asset id given by the context. ASSET_ID is the only context value supported currently, but this is easily expanded to support all of the context values that are availabe as query parameters.

Note the use of the ASSET_ATTRIBUTE parameter to the SQL query. For this parameter type, the name of the attribute must be given in the attribute XML attribute, and you must specify a data type in the datatype attribute so that the query can be formatted properly.

Note that only one filter per records block is allowed.

Callbacks

There are currently two types of callbacks supported. The first type is a database datasource callback, where a query can be executed on the database during or after a sync task. The other type of callback is a mechanism where a job can be run in the DAM during or after the sync task. Both types of callbacks can be configured to either run if the sync task is successful or fails. Multiple callbacks can be defined, so you can define a success callback and a failure callback, or any other set of callbacks you need.

Database Callbacks

Here is an example of database callbacks:

<source name="DB1" type="database" jdbc="jdbc:mysql://127.0.0.1/synced_metadata?useUnicode=true&amp;characterEncoding=utf8" query="SELECT * FROM teams" username="user" password="password">
	<field name="teamId" column="id"/>
	<field name="teamName" column="name"/>
	<field name="teamHome" column="teamhome"/>
	<field name="league" column="league"/>
	<field name="conference" column="conference"/>
	<field name="division" column="division"/>

	<callback type="success" sql="UPDATE teams SET synced = 1 WHERE id = %1" param1="teamId" frequency="row"/>
	<callback type="failure" sql="UPDATE teams SET sync_failed = 1 WHERE id = %1" param1="teamId" frequency="row"/>

	<callback type="success" sql="INSERT INTO sync_journal (result, synctime) VALUES ('succeeded', NOW())" frequency="end"/>
	<callback type="failure" sql="INSERT INTO sync_journal (result, synctime) VALUES ('failed', NOW())" frequency="end"/>
</source>

Parameter substitution is available in the callback query. Allowed parameters are the field names defined in the field elements. Up to five parameters may be defined. Each param should be set with a param attribute, param1 - param5. The symbols %1 - %5 are used in the query to determine where the substitution should take place. The callback system determines whether the value needs to be quoted or not, so please do not include quotes in the query.

Database callbacks can be defined to run either after every row in the data source is processed, or at the end of the task after all rows are processed. Parameter substitution will not work in callbacks that run at the end of the sync task, since they are not executed in the context of a specific row of data.

DAM Callback Jobs

DAM callback jobs are jobs that are run during or after a sync task has completed. These jobs must be classes that inherit from the com.netxposure.products.imageportal.module.metadata.sync.job.callback.CallbackJob class. The jobs can be configured to run at the end of the sync task, after each row of source data is processed, or after each asset has been updated (in this case a job may be executed more than once per source data row, if that row matches multiple assets in the DAM.)

Placeholders

It is possible to create an asset placeholder if no assets match a specific row of data. The placeholder will be given the metadata as mapped for that particular data row, and will be named using the value in a specific source field. Here is an example of how to configure placholders:

<records name="Hockey teams" type="asset">
	<link field="teamId" attribute="Team Id" />
	<map field="conference" attribute="Conference"/>
	<map field="division" attribute="Division"/>
	<map field="teamHome" attribute="Team Home"/>
	<map field="league" attribute="League"/>
	<map field="teamName" attribute="Team Name"/>
 
	<placeholder create="true" placeholderCategoryId="23" nameField="teamName"/>
</records>

A specific category id must be given for the placeholders to be created in.

Placeholders may only be configured for record blocks of type asset, and may only be used in scalar sync types.

XML Data Sources

In addition to using a database as a data source, the system currently supports using XML files as data sources. Two basic forms of XML data are supported. XML with element-contained data, and XML with attribute-contained data. An Xpath expression must be given to grab rows of data from the XML.

XML with element-contained data looks like this:

<?xml version="1.0"?>
<data>
	<rows>
		<row>
			<id>1</id>
			<teamhome>Anaheim</teamhome>
			<name>Ducks</name>
			<league>NHL</league>
			<conference>Western</conference>
			<division>Pacific</division>
			<wins>54</wins>
			<losses>20</losses>
			<overtime_losses>8</overtime_losses>
			<shootout_losses>0</shootout_losses>
		</row>
		<row>
			<id>2</id>
			<teamhome>Boston</teamhome>
			<name>Bruins</name>
			<league>NHL</league>
			<conference>Eastern</conference>
			<division>Atlantic</division>
			<wins>54</wins>
			<losses>19</losses>
			<overtime_losses>9</overtime_losses>
			<shootout_losses>0</shootout_losses>
		</row>
	</rows>
</data>

XML with attribute-contained data looks like this:

<?xml version="1.0"?>
<data>
	<rows>
		<row id="1" teamhome="Anaheim" name="Ducks" league="NHL" conference="Western" division="Pacific" wins="54" losses="20" overtime_losses="8" shootout_losses="0"/>
		<row id="2" teamhome="Boston" name="Bruins" league="NHL" conference="Eastern" division="Atlantic" wins="54" losses="19" overtime_losses="9" shootout_losses="0"/>
	</rows>
</data>

Here is an example of a task definition to sync with XML:

<sync type="scalar" log="debug" name="XML Metadata sync" include_in_default="true">
	<source name="Team XML" type="xml" filepath="/opt/netx/xml/teams.xml" category_id="32" row_xpath="//row" container_type="attributes" delete_data_file="false" file_to_process="newest">
		<field name="teamId" column="id"/>
		<field name="teamName" column="name"/>
		<field name="league" column="league"/>
		<field name="conference" column="conference"/>
		<field name="division" column="division"/>
	</source>
	<destination type="DAM">
		<records name="New Team Data" type="asset">
			<link field="teamId" attribute="Team Id" />
			<map field="conference" attribute="Conference"/>
			<map field="division" attribute="Division"/>
			<map field="league" attribute="League"/>
			<map field="teamName" attribute="Team Name"/>
			<map field="league" attribute="League Name"/>
		</records>
	</destination>
</sync>

The container_type element specifies which format the XML file uses, and the value should either be “attributes” or “elements”. The row_xpath attribute contains the Xpath expression used to retrieve data rows.

There are two options for the location of the XML file. The file can be somewhere in the filesystem, in which case the path to the file must be given in the filepath attribute. Alternatively, the XML file can be an asset in the DAM. In this case, the category_id attribute must have the id of the category containing the file, and the file_to_process attribute must specify which file to process. The options are “newest”, in which case the most recently imported asset is used as the source XML file, or “all”, in which case all contents of the category are treated as XML and used to sync.

If the file should be deleted after the sync, the delete_data_file attribute should be set to true (this does not apply to assets, only to XML files contained in the filesystem.)

Pushing Data From NetX

It is possible to sync data FROM NetX to an external data destination. Currently supported data destination types are XML files and an external database via JDBC.

There are three options for choosing which data to use in a sync task where NetX is the data source: by DAM folder, based on the value of an attribute, or based on the modification date. Here are examples of all three data source configurations:

<!-- ISSUE/NOTE: Only "scalar" sync types make sense for DAM==>JDBC/XML sync -->
<sync type="scalar" log="debug" name="UpdateTeamRecords">
    <!-- This will always dump metadata for all assets in the specific category (must use either categoryName or categoryId, but not both) -->
    <source type="DAM" name="netx dam" objectType="asset" category_id="99" categoryName="some/category/name">
        <!-- NOTE: In this context "column" refers to the attribute name -->
        <!-- NOTE: The following "columns" are also available in most contexts: -->
        <!-- asset.assetId -->
        <!-- asset.name -->
        <!-- asset.description -->
        <!-- asset.attribute.attribute_name (this is not needed if the objectType is "asset" since any names will be interpreted as asset attribute names) -->
        <field name="field.teamId" column="Team ID"/>
        <field name="field.teamName" column="Team Name"/>
        <field name="field.teamCity" column="Team City"/>
    </source>

<sync type="scalar" log="debug" name="AddNewTeamsWhenAttributeChanges">
    <!-- This will always dump metadata for all assets where the attribute "Send To External Database" is set to "true" -->
    <source type="DAM" name="netx dam" objectType="asset" attributeName="Send To External Database" attributeValue="true">
        <field name="field.teamId" column="Team ID"/>
        <field name="field.teamName" column="Team Name"/>
        <field name="field.teamCity" column="Team City"/>
    </source>

<sync type="scalar" log="debug" name="AddNewTeamsByDate">
    <!-- This will always dump metadata for all assets where the asset mod date has been update in the last 2 days -->
    <!-- s - seconds -->
    <!-- m - minutes -->
    <!-- h - hours -->
    <!-- d - days -->
    <!-- w - weeks -->
    <!-- M - months -->
    <!-- These MUST appear in the following order in the string: Mwdhms -->
    <!-- The period string may contain spaces between components, so either of the following is valid: "2d10h30m" or "2d 10h 30m" -->
    <source type="DAM" name="netx dam" objectType="asset" modDatePeriod="2d">
        <field name="field.teamId" column="Team ID"/>
        <field name="field.teamName" column="Team Name"/>
        <field name="field.teamCity" column="Team City"/>
    </source>

XML Destination

For syncing data to an XML file, a path and filename must be given in the records block. Also, the same two basic XML formats that are supported in syncing from source XML are supported for syncing TO XML. Here are a couple of examples:

<destination type="xml">
    <!-- NOTE: Each records block would result in its own XML file -->
    <!-- NOTE: Use $TIMESTAMP in the filename to include date/time in YYYYMMDDHHSS format -->
    <records name="Hockey teams" container_type="attributes" filepath="/opt/netx/xml/teams_$TIMESTAMP.xml">
    <!-- NOTE: See metadata sync docs in confluence for discussion of container_type (attributes vs. elements) -->
        <!-- NOTE: There is no link element, we're just dumping records -->
        <!-- NOTE: In this context attribute is the name of the data field in the XML (either the XML element or the XML attribute, depending on what is specified in the container_type setting -->
        <map field="teamId" attribute="team_id"/>
        <map field="teamName" attribute="team_name"/>
        <map field="teamCity" attribute="team_city"/>
    </records>
</destination>
<destination type="xml">
    <records name="Hockey teams" container_type="elements" filepath="/opt/netx/xml/teams2.xml">
        <map field="teamId" attribute="team_id"/>
        <map field="teamName" attribute="team_name"/>
        <map field="teamCity" attribute="team_city"/>
    </records>
</destination>

JDBC Database Destination

For syncing to an external database, the JDBC connection string, username, password, and query must be provided in the records block. Attribute names are used symbolically in the SQL query. Following are two examples:

<destination type="jdbc">
    <!-- NOTE: Each records block would result in its own external JDBC call -->
    <records name="Hockey teams" jdbc="jdbc:mysql://127.0.0.1/external_database" query="UPDATE teams SET name = :team_name, city = :team_city WHERE id = :team_id" username="sa" password="secret">
        <!-- NOTE: There is no link element, we're just dumping records -->
        <!-- NOTE: In this context attribute is the name of the data field in the SQL -->
        <map field="field.teamId" attribute="team_id"/>
        <map field="field.teamName" attribute="team_name"/>
        <map field="field.teamCity" attribute="team_city"/>
    </records>
</destination>
<destination type="jdbc">
    <records name="Hockey teams" jdbc="jdbc:mysql://127.0.0.1/external_database" query="INSERT INTO teams (id, name, city) VALUES (:team_id, :team_name, :team_city)" username="sa" password="secret">
        <map field="field.teamId" attribute="team_id"/>
        <map field="field.teamName" attribute="team_name"/>
        <map field="field.teamCity" attribute="team_city"/>
    </records>
</destination>

Invoking The Sync System

This system is designed to be invoked from an AutoTask. The job to specify in the AutoTask is com.netxposure.products.imageportal.module.metadata.sync.job.MetadataSyncJob. Optionally a list of specific task names can be specified for specific contexts in which the job should be run. Here is an example of a set of AutoTasks:

<?xml version="1.0" encoding="UTF-8"?>
<autotask>
	<task id="Import Asset Sync Metadata" name="Import Asset Sync Metadata">
		<matchCriteria type="or">
			<criteria type="action" value="import"/>
			<criteria type="action" value="resync"/>
		</matchCriteria>
		<customJob syncTaskNames="getTeamId,setTeamDefaults" className="com.netxposure.products.imageportal.module.metadata.sync.job.MetadataSyncJob"/>
	</task>
	<task id="Morning metadata sync" name="Morning metadata sync">
		<matchCriteria type="and">
			<criteria hour="11" interval="daily" minute="00" type="periodic"/>
		</matchCriteria>
		<customJob 	className="com.netxposure.products.imageportal.module.metadata.sync.job.MetadataSyncJob"/>
	</task>
	<task id="Afternoon metadata sync" name="Afternoon metadata sync">
		<matchCriteria type="and">
			<criteria hour="20" interval="daily" minute="00" type="periodic"/>
		</matchCriteria>
		<customJob 	className="com.netxposure.products.imageportal.module.metadata.sync.job.MetadataSyncJob"/>
	</task>
</autotask>

In this example, the first task, “Import Asset Sync Metadata”, is run when an asset is imported or resynced. A list of sync tasks is included, getTeamId, and setTeamDefaults. The getTeamId task would be a value lookup sync task that would set the Team ID attribute so that the new asset can be associated with data in future sync operations. The second task is the setTeamDefaults task, which uses the newly set Team ID to retrieve initial values for the assets metadata.

The next two tasks run a general metadata sync at specific times every day. In this example, it is done at 11am and 8pm. No specific sync tasks are specified with the job, so all of the default tasks are run (all tasks where the include_in_default attribute is not set to false.)

Notes

It is possible that a database data source may be used that is not using the same database platform that NetX itself uses (for example, NetX might be using SQL Server, but a sync task needs to be defined that gets data from a MySQL database.) The JDBC driver for whatever type of connection is being made from the sync system must be present in the ROOT/WEB-INF/lib directory of the NetX installation. This might mean that there are multiple JDBC driver jars in a single NetX installation, if the database platforms are different.

When I get time, I'll create an XML schema definition or doctype definition or something for the config file, which would make validation a lot easier, since its not a simple configuration.

It is possible to set a version flag on the root element, i.e.:

<metadatasync version="1.0">

It is a really good idea to do this, because the format of this file will undoubtedly change (like I'll think up better names than “scalar” and “multivalue” for those sync task types.) The file reader is written so that it will be easy to have versioned file readers.

Configuration Reference

metadatasync

This is the root element for the configuration file, and contains one or more sync element.

Attribute Description Required Default
version
Config file version No 1.0
 

sync

This is the definition of a sync task. It must contain exactly one source element, one destination element, and zero or more value_map elements.

Attribute Description Required Default Values
type
Sync task type Yes   scalar,multivalue,valuelookup
log
Logging level No summary summary,detail,debug
name
Name of tast No    
include_in_default
Indicates if this task should be included in the default sync call   true true / false

source

This is the data source definition. The list of options depends on what kind of data source it is. It will contain one or more field elements, defining the data fields taken from the source, and can optionally contain one or more param elements and one or more callback elements.

Database

Attribute Description Required Values
name
Name of data source No  
type
Type of data source Yes database,xml
jdbc
JDBC connection string Yes  
query
SQL query to get data Yes  
username
Name of user to connect with Yes  
password
Password to connect with Yes  
 

XML

Attribute Description Required Values
name
Name of data source No  
type
Type of data source Yes database,xml
filepath
Filesystem path to XML file No  
category_id
ID of DAM category with XML file(s) No  
row_xpath
Xpath to get data rows Yes  
container_type
Format of XML file Yes attributes,elements
delete_data_file
Indicates whether or not to delete file (this only applies if the XML file is in the filesystem, NOT when the file is an asset.) No  
file_to_process
Indicates which file to use in a category No  

param

This element defines context parameters for the query in a database data source.

Attribute Description Required Values
type
Type of parameter Yes See discussion
position
Replacement string in query Yes  
attribute
Name of attribute for attribute type No  
datatype
How to format data in query No  
skip_null
Skip null values in response data No  
skip_empty_string
Skip empty string values in response data No  

field

This element defines the fields being retrieved from the data source.

Attribute Description Required
name
Name of the field Yes
column
Name of the column in the source Yes

callback

In a database data source, defines callback queries to be run on success or failure.

Attribute Description Required Default Values
type
Type of callback Yes success success, failure
sql
Query to run Yes    
param1
First parameter (replaces %1 in query) No    
param2
Second parameter (replaces %2 in query) No    
param3
Third parameter (replaces %3 in query) No    
param4
Fourth parameter (replaces %4 in query) No    
param5
Fifth parameter (replaces %5 in query) No    
frequency
When callback should run No    
 

In a records block inside a DAM destination, a callback indicates a job to run in the DAM after a set of assets have been updated.

 

placeholder

Configures placeholder creation. This is only valid for scalar sync tasks, and for asset records.

Attribute Description Required Default Values
create
Set to true to create placeholders Yes false true / false
placeholderCategoryId
ID of category where placeholders are to be created Yes    
nameField
Name of source data field to use to name asset Yes    
 

value_map

Defines a value map. Contains one or more map_value elements.

Attribute Description Required
name
Name of the value map Yes
 

map_value

Contains the value mappings inside a value_map.

Attribute Description Required
from
Value to change Yes
to
Value to replace Yes
 

destination

Definition of data destination. This will contain one or more records elements.

Attribute Description Required Default Values
type
Destination type Yes DAM DAM
 

records

Records block definition. This defines a set of assets to update with the incoming data. This will contain one or more link elements, one or more map elements, and optionally, a single filter element.

Attribute Description Required Default Values
name
Name of records block No    
type
Type of records to be updated Yes asset asset, category, view
assetMatcher
Allows override of default asset matching to use the database to match assets to data rows instead of Solr. This can be useful if a sync task is dependent on another sync task to update an attribute, causing a race condition where Solr may not be updated before the second task runs. Currently, this is only relevant if the type attribute is set to asset. No solr solr, database
 

filter

Defines a destination asset filter to reduce the number of assets updated by the current data row.

Attribute Description Required Values
type
Type of filter Yes solr, context
query
Solr query in a solr filter Yes  
 

context

Defines context fields in a context filter.

Attribute Description Required Values
type
Type of context Yes ASSET_ID
 

Defines the attribute to source column links that match data rows to assets.

Attribute Description Required
field
Name of field in link Yes
attribute
Name of attribute in link Yes
 

map

Maps incoming data columns to attributes.

Attribute Description Required Default Values
field
Name of field in map Yes    
attribute
Name of attribute in map Yes    
datatype
Applies formatting to value before update No   currency
value_map
Name of value map to use on field No    
default
Default value to set attribute if the source data is null or an empty string No    
match
Regular expression; not implemented No    
format
Format of data; not implemented No    
replacement
Not implemented No    
skip_null
Skip null values in response data No    
skip_empty_string
Skip empty string values in response data No    
overwriteExistingValue
Overwrite existing attribute value No true true / false
 

setvalue

Allows attributes to be initialized to a specific value. This is only allowed in the context of scalar or valuelookup sync task types.

Attribute Description Required Default Views
attribute
Name of attribute to initialize Yes    
value
Value to initialize attribute to Yes    
valueLookup
Controls whether values are set in a valuelookup task depending on whether the value lookup succeeded or failed, or either. No both success, fail, both
overwriteExistingValue
Overwrite existing attribute value No true true / false
 
Was this article helpful?
0 out of 0 found this helpful