System Settings: DST > Tagged XML

By creating a template for XML files, you can determine how elements and attributes are used to generate and display paragraphs in crossDesk. Templates are especially recommended in order to ensure user-friendly translation of XML files and to exploit the full potential of XML translation in Across.

In Across, XML documents can be edited in two different ways: as Tagged XML or Visual XML. Information on the differences between these two types is available here.

Across already contains the following Tagged XML standard templates:

Default template      Purpose
INX                   For translating Adobe InDesign files with crossTransform
INCX                  For translating Adobe InCopy files with crossTransform
Android strings-xml   For localizing Android string resources with crossTransform
BlackBerry resources  For localizing BlackBerry string resources with crossTransform
iPhone strings        For localizing iPhone string resources with crossTransform

To prevent the standard templates from being damaged or functionally impaired by faulty customization, they are write-protected and cannot be customized. If you still want to edit a standard template, you can first export it and then re-import it using the respective buttons. Re-imported templates are no longer write-protected and can be edited.

Template management

Click New to create a new template. Click Delete to remove the selected template from the list. If a DTD is linked to a template and the template is used in a project, the template cannot be deleted.

Filling templates

In the Customize Tagged XML settings section, you can define how XML elements are to be processed during check-in and subsequently displayed in crossDesk.

To do this manually, click Add and enter the name of the element. Please note that the name must precisely correspond to the name in the XML document. Alternatively, you can load the XML document via Load. The elements contained in the document will then automatically be listed in the template. Subsequently, you can select the elements one by one and configure them as needed by means of the Edit button.
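To check element names before entering them manually, you can list the names that occur in a document. The following is a minimal sketch using Python's standard library; it is not part of Across, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

def list_element_names(xml_text: str) -> list:
    """Return the distinct element names in order of first appearance."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():  # walks the tree in document order
        if element.tag not in names:
            names.append(element.tag)
    return names

# Hypothetical sample document
sample = "<doc><title>Guide</title><p>This is <b>boldface</b>.</p><p>More.</p></doc>"
print(list_element_names(sample))
```

The names returned this way are spelled exactly as in the document, which is what the template expects.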

The following file formats can be imported:

Format  Description
*.xml   Normal XML file
*.dtd   Document type definition: defines the structure, the elements, and the attributes of XML files.
*.xsd   XML Schema definition: defines the structure, the elements, and the attributes of XML files by means of XML Schema.
*.sta   XML files from the company Schema
*.ini   Configuration file

Tip

You can import additional XML-based document formats by selecting the option All files - process like XML in the drop-down list during the document selection.

Attention

Please note that Across does not support source files that are processed with Tagged XML and that have the "UTF-16 Big Endian" encoding.
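To find affected files before check-in, you can inspect their first bytes. The sketch below is a heuristic written for illustration, not an Across function: it looks for the UTF-16 Big Endian byte order mark (FE FF) or, if there is no byte order mark, for an XML declaration that starts with the bytes 00 3C 00 3F.

```python
import codecs

def looks_like_utf16_be(data: bytes) -> bool:
    """Heuristic: True if the bytes start like a UTF-16 Big Endian XML file."""
    if data.startswith(codecs.BOM_UTF16_BE):  # byte order mark FE FF
        return True
    # Without a byte order mark, "<?" in UTF-16 BE is encoded as 00 3C 00 3F.
    return data.startswith(b"\x00<\x00?")

sample = '<?xml version="1.0"?><doc/>'
print(looks_like_utf16_be(codecs.BOM_UTF16_BE + sample.encode("utf-16-be")))
print(looks_like_utf16_be(sample.encode("utf-8")))
```

Files flagged in this way can be converted to another encoding (e.g. UTF-8) before they are checked in.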

The General Tab

In the General tab, you define which content type corresponds to the element.

You can choose between:

  • Normal: normal elements
  • CDATA: Sections in XML documents that the parser does not interpret as XML source code. CDATA sections are often used for text sections that contain many special characters (<, >, ", ').
  • EMPTY: elements with empty content (e.g., line breaks, images, etc.)

Assign the element type. The element may be internal (inline) or external.

External vs. inline

External elements are located outside the body text, never within a text line. Usually, they cause a line break. In contrast, inline elements are located within the body text; for example, they may cause a certain word within a string to be displayed in bold type. Elements usually consist of a start tag (e.g. <i>) and an end tag (e.g. </i>).

The following example shows a string with an external element <p> (p for paragraph) and an inline <b> element enclosing the word "boldface" (b for bold):

<p>This is <b>boldface</b>.</p>

The <p> and </p> tags are located outside the body text and mark the beginning and end of the string or paragraph. The <b> and </b> tags, however, are located within the body text and mark the beginning and end of the bold text (in this case, the word boldface).
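The distinction can also be illustrated programmatically. In the following sketch (standard-library Python, not an Across function), the external element delimits the paragraph, while the inline element remains part of the paragraph content.

```python
import xml.etree.ElementTree as ET

def paragraph_content(xml_text: str) -> str:
    """Return what lies between the external start and end tags, inline tags included."""
    external = ET.fromstring(xml_text)
    # Text before the first child, then each child serialized together with
    # the text that follows it (ET.tostring includes the trailing text).
    parts = [external.text or ""]
    parts += [ET.tostring(child, encoding="unicode") for child in external]
    return "".join(parts)

print(paragraph_content("<p>This is <b>boldface</b>.</p>"))
```

The <p> tags are gone from the result because they only mark the paragraph boundaries, whereas the <b> tags stay within the text.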

Additionally, you can determine how the element is to be displayed in crossDesk.

Attention

Please note that the settings for processing an element also apply to all subordinate elements of the element (child elements). For example, if you determine that a higher-ranking element (parent element) is to be displayed as locked in crossDesk, all associated child elements will also be displayed as locked.

To process child elements differently from the parent element, you can use Conditional XML. For this purpose, select the Conditional option in the properties of the parent element, click Settings, and then click Add. Determine the desired processing mode (e.g. Locked) and confirm with OK. Subsequently, select the child element that is to be processed differently from the parent element, select Conditional, and click Settings and Add. Select the desired processing mode (e.g. Normal) and confirm with OK. In this example, the parent element is displayed as locked in crossDesk, but the child element can be processed/translated as usual.

Treating internal elements as white spaces

If the corresponding option is enabled, internal elements between segments (i.e. sentences) of a paragraph are treated as white spaces. In this way, Across can recognize these elements as segment or sentence delimiters. Normally, Across does not interpret internal elements as segment delimiters.

The following example explains how this option works:

Consider an HTML file containing the following string:

<p>Sentence A.<br>Sentence B.</p>

If the option is enabled, <br> will be interpreted as white space. Thus, Across will detect a sentence delimiter between "Sentence A." and "Sentence B." and split the string into two segments/sentences.

If the option is disabled, <br> will not be treated as white space. Therefore, Across will not detect any sentence delimiter between "Sentence A." and "Sentence B.", as internal tags are not interpreted as sentence delimiters. Thus, the two sentences will be treated as one segment.
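The effect of the option can be sketched with a deliberately naive sentence splitter. This is an illustration only: the actual segmentation rules of Across are more elaborate, and the function name and the handling of <br> are assumptions.

```python
import re

BR_TAG = r"<br\s*/?>"

def split_segments(paragraph: str, tags_as_whitespace: bool) -> list:
    """Split a paragraph after ., ! or ? when a delimiter follows."""
    if tags_as_whitespace:
        # Option enabled: a <br> tag counts as white space and can end a segment.
        boundary = rf"(?<=[.!?])(?:\s|{BR_TAG})+"
    else:
        # Option disabled: only real white space can end a segment.
        boundary = r"(?<=[.!?])\s+"
    return [segment for segment in re.split(boundary, paragraph) if segment]

print(split_segments("Sentence A.<br>Sentence B.", tags_as_whitespace=True))
print(split_segments("Sentence A.<br>Sentence B.", tags_as_whitespace=False))
```

Note that this sketch simply drops the tag at the segment boundary, which is a simplification made for brevity.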

External elements

For external elements, you can use regular expressions by activating the respective option. If the external element matches the regular expression, the element will be displayed accordingly in crossDesk. If it does not match the regular expression, the element will be displayed as "Normal". You can select a regular expression from the drop-down list, e.g. the regular expression for e-mail addresses, and click Insert. Alternatively, you can enter the required regular expression manually in the respective input field.
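The matching logic can be sketched as follows. The e-mail pattern, the mode name "Locked", and the assumption that the entire element content must match are illustrative choices, not the expressions or behavior shipped with Across.

```python
import re

# Simplified, hypothetical pattern for e-mail addresses
EMAIL = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

def display_mode(content: str, pattern: str, mode_on_match: str) -> str:
    """Return the configured mode if the whole content matches, else "Normal"."""
    if re.fullmatch(pattern, content.strip()):
        return mode_on_match
    return "Normal"

print(display_mode("support@example.com", EMAIL, "Locked"))
print(display_mode("Contact our support team.", EMAIL, "Locked"))
```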

Conditional ML

Select Conditional and click Settings to define conditions that determine which text is to be translated and which parts are to be invisible to the translator.

Conditional ML: Example

The following example demonstrates the use of Conditional ML:

In the properties of an element, click Settings.

Click Add to add a condition.

With Add attribute, you can add an attribute with a specific value. Moreover, you can add parents and children. Finally, you can select a value for the processing and display of the condition.

[Screenshot: element properties with a conditional element, value "no"]

The result in crossDesk: The text "Do not translate" is hidden from the translator.
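The way such a condition is evaluated can be sketched in a few lines. The attribute name translate, the mode names, and the sample document are hypothetical and serve only to illustrate the principle.

```python
import xml.etree.ElementTree as ET

# Hypothetical condition: elements with translate="no" are hidden.
CONDITIONS = [({"translate": "no"}, "Hidden")]

def processing_mode(element, default: str = "Normal") -> str:
    """Return the mode of the first condition whose attributes all match."""
    for required, mode in CONDITIONS:
        if all(element.get(name) == value for name, value in required.items()):
            return mode
    return default

doc = ET.fromstring(
    '<doc><note translate="no">Do not translate</note>'
    "<note>Translate me</note></doc>"
)
visible = [n.text for n in doc.iter("note") if processing_mode(n) != "Hidden"]
print(visible)
```

Only the second note remains visible; the first one meets the condition and is hidden.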

Embedded markup code

Sometimes, documents in markup languages contain code snippets in another markup language. For example, some content management systems generate XML documents that also contain HTML sections. The "alien" code can be embedded in two different ways: by tagging the codes as CDATA sections (<![CDATA[ ... ]]>) or by masking the code by means of character entities (e.g. &lt; for <).

To support these forms of mixed code, the respective elements can be defined as embedded code. You can define how the embedded code is masked: with CDATA or by means of character entities.
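Both masking types carry the same embedded code. The following sketch, which uses a made-up <body> element, shows that an XML parser returns identical text for the CDATA form and the entity form.

```python
import xml.etree.ElementTree as ET

# The same HTML snippet embedded in two different ways (hypothetical sample)
cdata_form = "<body><![CDATA[<p>Hello <b>world</b></p>]]></body>"
entity_form = "<body>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</body>"

def embedded_code(xml_text: str) -> str:
    """Return the unmasked text content of the root element."""
    return ET.fromstring(xml_text).text or ""

print(embedded_code(cdata_form))
print(embedded_code(cdata_form) == embedded_code(entity_form))
```

After unmasking, the embedded code can be processed as markup in its own right.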

Attention

Please note that tags may only occur in one of the two code areas. For example, the <p> tag may only occur in the XML code or in the alien code (e.g. HTML), but not in the XML code and in the alien code.

By activating the corresponding option, you can define which structure attribute the element belongs to.

Structure attributes contain information on which area of a document contains an element or which area a translation comes from. It may be relevant to the translation of a segment whether the segment is a chapter heading, a list element, or a GUI button. By selecting a structure attribute from the drop-down list, you can, for example, determine that the current element is a heading.

Length-restricted elements

Finally, you can determine the maximum length of the elements. In certain cases, it may be necessary for an element not to exceed a certain number of characters, e.g. to ensure that the contents will be displayed correctly. In this case, activate the corresponding option and enter the maximum number of characters.

When translating external elements with length restriction, the length restriction icon in the toolbar of the Target Editor shows the remaining number of characters. The permitted number of characters may be exceeded while editing the paragraph. However, to prevent invalid documents, the paragraph cannot be stored in this state. A pop-up will indicate that the maximum length has been exceeded.

When translating internal elements with length restriction, the permitted number of characters may initially be exceeded while editing the paragraph. However, the storage of the paragraph will be prevented in this case, too.

The length restriction of elements can also be implemented via attributes (e.g. maxlength="5"). If your XML files contain elements with such attributes, you can have the length restriction taken into consideration automatically during check-in by activating the corresponding option in the Attributes tab.

Attention

If a length restriction is defined for an element both via the element and via the attribute, the length restriction via the attribute will have priority over the length restriction via the element.
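The priority rule and the remaining-character display can be sketched as follows. The function names are hypothetical, and counting characters with len() is an assumption about how the length is measured.

```python
from typing import Optional

def effective_limit(element_limit: Optional[int],
                    attribute_limit: Optional[int]) -> Optional[int]:
    """A limit from the attribute takes priority over one from the element."""
    return attribute_limit if attribute_limit is not None else element_limit

def remaining_characters(text: str, limit: Optional[int]) -> Optional[int]:
    """Number of characters left; negative once the limit is exceeded."""
    return None if limit is None else limit - len(text)

def can_be_stored(text: str, limit: Optional[int]) -> bool:
    """A paragraph can only be stored while it is within the limit."""
    return limit is None or len(text) <= limit

limit = effective_limit(element_limit=20, attribute_limit=5)
print(limit, remaining_characters("Save", limit), can_be_stored("Cancel", limit))
```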

Further information on restricting the length via attributes is provided in the explanations concerning the Attributes tab in the following section.

A penalty of 1% is applied to crossTank hits that are longer than the predefined number of characters.

During pre-translation, crossTank entries that are too long and would violate length restrictions are not inserted in the target document. In the reports, the respective entries are displayed under the separate category Match/not inserted (paragraph validation failed).

In case several 100% matches are available for an element with length restriction that is to be translated, you can additionally define bonus points.
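The handling of hits that are too long can be sketched as follows. The Hit class and the function names are hypothetical; only the 1% penalty and the exclusion from pre-translation are taken from the description above.

```python
from dataclasses import dataclass

LENGTH_PENALTY = 1  # percentage points, as described above

@dataclass
class Hit:
    text: str
    match_rate: int  # in percent

def rated(hit: Hit, limit: int) -> int:
    """Apply the penalty if the hit exceeds the permitted length."""
    return hit.match_rate - LENGTH_PENALTY if len(hit.text) > limit else hit.match_rate

def insertable(hits: list, limit: int) -> list:
    """Hits that would violate the restriction are not used for pre-translation."""
    return [hit for hit in hits if len(hit.text) <= limit]

hits = [Hit("Abbrechen", 100), Hit("Stop", 100)]
print([rated(hit, 5) for hit in hits])
print([hit.text for hit in insertable(hits, 5)])
```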

The Attributes Tab

The attributes of the respective ML element are managed under the Attributes tab. You can use the buttons to add new attributes, edit the name of existing attributes, and delete existing attributes. Using the drop-down list in the Mode column, you can also define how the particular attribute is to be processed in Across.

You can select one of the following options:

Option                  Description
Ignored                 The attribute and the corresponding attribute value are displayed as read-only and as part of the respective element in crossDesk.
Translatable            The attribute value can be edited (see below).
Length restriction      The attribute is length-restricted (see below).
As comment              The attribute value is displayed as a read-only comment in the Source View of crossDesk.
Show as paragraph name  In crossView in crossDesk, the attribute (instead of the entire content of the element) is used as the paragraph designation.

Example translatable attributes

In this example, the <sample> element contains the lang attribute. If you define the lang attribute as Translatable, you can edit the attribute value in crossDesk. If, however, you select Ignored, it will be displayed in crossDesk as read-only and as part of the respective element.

Selection Translatable:  The value of the lang attribute can be edited in the Target Editor.
Selection Ignored:       The value of the lang attribute cannot be edited in the Target Editor, but is displayed as read-only.

Attributes with Length Restriction

By selecting Length restriction, you can determine that the attribute is length-restricted. Both external and internal ML elements may contain attributes with length restrictions (e.g. maxlength="5"). These length restrictions indicate the maximum number of characters that the respective element may contain. If the ML documents to be processed contain elements with such length restrictions, select the Length restriction option. In this way, the contained length restrictions will be read during check-in and taken into consideration during the translation in crossDesk.
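How such restrictions could be read from a document is sketched below. This is an illustration with a made-up sample, not the check-in routine of Across; the attribute name maxlength is taken from the example above.

```python
import xml.etree.ElementTree as ET

def read_length_limits(xml_text: str, attribute: str = "maxlength") -> list:
    """Return (element name, text, limit) for every element carrying the attribute."""
    limits = []
    for element in ET.fromstring(xml_text).iter():
        value = element.get(attribute)
        if value is not None and value.isdigit():
            limits.append((element.tag, element.text or "", int(value)))
    return limits

# Hypothetical sample document
sample = '<ui><button maxlength="5">Save</button><label>Name</label></ui>'
print(read_length_limits(sample))
```

Elements without the attribute, such as <label> in the sample, are simply not restricted.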

In the case of external elements, the length restriction icon in the toolbar of the Target Editor shows the remaining number of characters. The permitted number of characters may be initially exceeded while editing the paragraph. However, to prevent invalid documents, the paragraph cannot be stored in this state. A pop-up will indicate that the maximum length has been exceeded.

In the case of internal elements with length restrictions, the permitted number of characters may also be initially exceeded while editing the paragraph. However, the storage of the paragraph will be prevented in this case, too.

A penalty of 1% is applied to crossTank hits that are longer than the predefined number of characters.

During pre-translation, crossTank entries that are too long and would violate length restrictions are not inserted in the target document. In the reports, the respective entries are displayed under the separate category Match/not inserted (paragraph validation failed).

In case several 100% matches are available for an element with length restriction that is to be translated, you can additionally define bonus points.

A length restriction of elements can also be implemented by means of the element settings.

Attention

If a length restriction is defined for an element both via the element and via the attribute, the length restriction via the attribute will have priority over the length restriction via the element.

Tip

Further information on restricting the length via the element settings is provided in the explanations concerning the General tab in the preceding section.

The Formatting Tab

In the Formatting tab, you can determine whether the element content to be translated is to be displayed with special formatting in crossDesk. For this purpose, activate the checkbox Use special font and select the formatting.

When you are finished, click OK.

Click Configure to access the items Splitting Settings, Preview, Advanced, DTD Settings, and XSD Settings for further settings in connection with the processing and display of XML files.

Preview

Using the Preview command, you can generate previews of XML files based on the settings of the current document settings template. In this way, you can see how the differently defined contents of the file are processed during the check-in in Across and subsequently displayed in crossDesk.

Various colors can be selected to display the different contents.

To generate a preview, click Browse and select the desired XML file. Then click OK. The preview will be created and displayed in a preview window.

Advanced

The Advanced command enables you to make additional settings.

Translating scripts

First, you can determine how scripts contained in XML documents are to be handled. Naturally, scripts are not translated. Nevertheless, the script may contain certain passages that need to be translated. Thus, you can enable the translation of all strings, L_ strings only, or no strings at all.

Usually, variable names with contents that need to be translated are marked with an L_ prefix.
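The three settings can be illustrated with a small extraction routine. The mode names, the sample script, and the restriction to double-quoted strings assigned to variables are assumptions made for this sketch.

```python
import re

# A variable name, an equals sign, and a double-quoted string
ASSIGNMENT = re.compile(r'\b(\w+)\s*=\s*"([^"]*)"')

def translatable_strings(script: str, mode: str) -> list:
    """mode is "all", "l_only", or "none" (hypothetical names)."""
    if mode == "none":
        return []
    found = ASSIGNMENT.findall(script)
    if mode == "l_only":
        found = [(name, value) for name, value in found if name.startswith("L_")]
    return [value for _, value in found]

script = 'var L_Title = "Welcome";\nvar cssClass = "header";'
print(translatable_strings(script, "all"))
print(translatable_strings(script, "l_only"))
```

With the L_ setting, technical strings such as the class name in the sample stay out of the translation.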

Handling of META charsets

For tags for META charsets, i.e. meta information that determines the character encoding, you can determine that a missing <META> tag may be inserted, the value of the <META> tag may be modified, or the value may not be modified.

Target text encoding

If XML files are not encoded in UTF-8 or UTF-16, it may be necessary to change the encoding of the target documents in order to ensure correct display of all characters. For such cases, you can determine the encoding to be used for the target documents.

You can choose:

  • that the correct encoding can be auto-detected by Across,
  • that the encoding of the source document can be used in the target document, or
  • that a particular encoding should be used (via a drop-down list).
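Why the target encoding matters can be shown with a short check. The sketch below is not the auto-detection used by Across; it merely tests whether a translated text can be represented in the encoding of the source document and otherwise falls back to UTF-8 (an assumed fallback).

```python
def can_encode(text: str, encoding: str) -> bool:
    """True if every character of the text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

def choose_target_encoding(text: str, source_encoding: str,
                           fallback: str = "utf-8") -> str:
    """Keep the source encoding if possible, otherwise use the fallback."""
    return source_encoding if can_encode(text, source_encoding) else fallback

print(choose_target_encoding("Größe", "iso-8859-1"))
print(choose_target_encoding("Zażółć", "iso-8859-1"))
```

A German target text fits into ISO-8859-1, whereas the Polish sample contains characters that this encoding cannot represent.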

Processing of undefined tags

For tags not defined in the respective document processing templates, you can define whether these undefined tags are to be treated as inline tags or as external tags.

Especially if the ML structure of the file to be translated is so dynamic that you do not always know in advance which external tags may occur, this setting can be very useful.

Character entities

Furthermore, you can determine whether and which character entities are to be converted automatically. The entities are consolidated to sets. For a conversion to be performed, you must activate the respective checkbox. The contents of the corresponding set will be displayed in the lower window pane.

Click Add to create a new entity set. Subsequently, you can define new entities by clicking Add in the pane below or import an entity set as a file (in the *.ent format).
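The conversion can be sketched as follows, assuming that an entity set file contains standard DTD entity declarations of the form <!ENTITY name "value">. The function names and the sample set are hypothetical.

```python
import html
import re

ENTITY_DECLARATION = re.compile(r'<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>')

def load_entity_set(ent_text: str) -> dict:
    """Map entity names to characters; numeric references in values are resolved."""
    return {name: html.unescape(value)
            for name, value in ENTITY_DECLARATION.findall(ent_text)}

def convert_entities(text: str, entity_set: dict) -> str:
    """Replace known entities; unknown entities are left untouched."""
    def replace(match):
        return entity_set.get(match.group(1), match.group(0))
    return re.sub(r"&([\w.-]+);", replace, text)

ent_file = '<!ENTITY nbsp "&#160;">\n<!ENTITY copy "&#169;">'
entities = load_entity_set(ent_file)
print(convert_entities("&copy;&nbsp;2024 &unknown;", entities))
```

Entities that are not part of an activated set, such as &unknown; in the sample, remain unchanged.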

Handling white spaces

ML editors often insert special white spaces such as soft line breaks and tabs in order to give the documents a clearer visual structure. However, these white spaces are not relevant to the translation and merely generate unnecessary processing overhead. You can determine whether these special white spaces are to be kept or "normalized". The option Treat boundary space as parts of a paragraph allows you to keep white spaces at the beginning and end of paragraphs. The option Normalize white spaces first converts both external and internal white spaces to normal blank spaces (that is, normalizes them) and then collapses several consecutive spaces into a single space.
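The normalization can be sketched in a few lines. How the two options interact in Across is not specified here, so the combination shown (a keep_boundary_space parameter on a single function) is an assumption.

```python
import re

def normalize_white_space(text: str, keep_boundary_space: bool = False) -> str:
    """Convert line breaks and tabs to blanks and collapse runs to one space."""
    collapsed = re.sub(r"\s+", " ", text)
    return collapsed if keep_boundary_space else collapsed.strip()

raw = "  First\n\tline   here \n"
print(repr(normalize_white_space(raw)))
print(repr(normalize_white_space(raw, keep_boundary_space=True)))
```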

Treatment of invalid tags

If the ML document contains invalid tags, the option Treat invalid tags as text can be activated so that these tags are interpreted as plain text, thus enabling the ML document to be checked in and processed. For example, the invalid tags may be unmasked XML characters. In the following example, the < ("less than") is mistakenly not masked as &lt;:

<p>if x < 1 then write ('test')</p>

If the option Treat invalid tags as text is deactivated, a parsing error will occur when the respective document is checked in. In contrast, the document can be checked in if the option is activated.
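The parsing error can be reproduced with any standard XML parser. The following sketch uses Python's standard library to compare the unmasked example with its correctly masked counterpart.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """True if a standard XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

unmasked = "<p>if x < 1 then write ('test')</p>"
masked = "<p>if x &lt; 1 then write ('test')</p>"
print(is_well_formed(unmasked), is_well_formed(masked))
```

A strict parser rejects the unmasked variant, which corresponds to the behavior with the option deactivated.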

Treatment of paragraphs without text

For paragraphs that do not contain any translatable text but only placeables or internal tags, a new option can be activated to determine whether or not these paragraphs are to be extracted at check-in and thus displayed in crossDesk.
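Whether a paragraph contains any text besides tags can be checked as sketched below. This is a simplification: placeables are not modeled, and the function name is hypothetical.

```python
import re

def has_translatable_text(paragraph: str) -> bool:
    """True if something other than tags and white space remains."""
    without_tags = re.sub(r"<[^>]+>", "", paragraph)
    return without_tags.strip() != ""

print(has_translatable_text('<img src="logo.png"/><br/>'))
print(has_translatable_text("<b>Note</b>"))
```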

DTD Settings

Using the DTD Settings command, you can import document type definitions. The imported DTD will automatically be taken into consideration if the corresponding template is used during project creation and will be used for the XML files to be processed. In addition, the QM criterion for the validity check checks the validity of the target text on the basis of the imported master DTD.

Click Load DTD to import the desired DTD. Any dependencies referred to in the DTD are also taken into consideration.

The imported DTD serves as master DTD. This means that this DTD will be used even if an XML file that is checked in to Across makes reference to another external DTD or contains another internal DTD.

If the master DTD is updated (by importing a new DTD in the document settings template), this new DTD will automatically be taken into consideration for XML files that have already been checked in. Thus, these XML files do not need to be checked in again.

XSD settings

Click XSD Settings to import XML schema definitions (XSD). In this way, the imported XSDs will automatically be taken into consideration when using the respective document settings template during project creation. Moreover, the QM criterion for the validity check will check the validity of the target text on the basis of the imported XSD.

In addition to XSD files, XML files in which the XML schema is integrated (XML MetaData; XMD) are also supported.

Click Add XSD to import the desired XSD. Click Add from XML to load an XML schema from an XML file.

The imported XSD serves as master XSD. This means that this XSD will be used even if an XML file that is checked in to Across makes reference to another external XSD or contains another internal XSD.

If the master XSD is updated (by importing a new XSD in the document settings template), this new XSD will automatically be taken into consideration for XML files that have already been checked in. Thus, these XML files do not need to be checked in again.