Paragraph or Sentence-Based Segmentation

When a text is checked in to Across, the source text is automatically divided into segments. Translation in crossDesk and storage of the source and target texts in crossTank are then also carried out segment by segment. This segmentation is fundamental to the functioning of a translation memory system: it allows the system to find a previously translated sentence or paragraph in the translation memory, so that it normally does not have to be translated again.

In Across, there are two different segmentation modes: segmentation by sentences and segmentation by paragraphs.

You can define the segmentation method separately for each document format. The configuration is carried out under Tools > System Settings > General > Segmentation.

Sentence-Based Segmentation

Sentence-based segmentation is activated by default in Across for all document formats.

This method first divides the source texts into paragraphs and subsequently into sentences by means of predefined sentence rules.
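The two-step procedure can be illustrated with a minimal Python sketch. It is not the Across implementation: the function names are hypothetical, and the single regular expression stands in for the predefined sentence rules, which in practice also have to handle abbreviations, numbers, and similar cases.

```python
import re

# Assumed, simplified sentence rule: break after ".", "!" or "?" when
# followed by whitespace. Real sentence rules are more elaborate.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def split_paragraphs(text):
    """Step 1: treat each non-empty line as one paragraph."""
    return [p.strip() for p in text.splitlines() if p.strip()]


def split_sentences(paragraph):
    """Step 2: split a single paragraph into sentences."""
    return [s for s in SENTENCE_BREAK.split(paragraph) if s]


def segment(text, mode="sentence"):
    """Return the segments of text for the given segmentation mode."""
    paragraphs = split_paragraphs(text)
    if mode == "paragraph":
        return paragraphs
    return [s for p in paragraphs for s in split_sentences(p)]


sample = "Open the lid. Insert the battery.\nClose the lid."
sentence_segments = segment(sample)
paragraph_segments = segment(sample, mode="paragraph")
```

With the sample text, the sentence mode yields three segments and the paragraph mode yields two, because the first paragraph contains two sentences.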

Advantage: A sentence is found in crossTank even if the other sentences of the stored paragraph do not match those of the paragraph to be translated.
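The following sketch illustrates this advantage with a toy translation memory. It assumes exact-match lookup in plain dictionaries, which is a simplification and does not reflect how crossTank stores or matches segments; the sample sentences and variable names are invented for illustration.

```python
# Invented sample data: one stored paragraph with its translation.
stored_source = ["Open the lid.", "Insert the battery.", "Close the lid."]
stored_target = [
    "Öffnen Sie den Deckel.",
    "Legen Sie die Batterie ein.",
    "Schließen Sie den Deckel.",
]

# Toy memories (exact match only): one entry per sentence versus
# one entry for the whole paragraph.
sentence_tm = dict(zip(stored_source, stored_target))
paragraph_tm = {" ".join(stored_source): " ".join(stored_target)}

# New paragraph to translate: only the middle sentence differs.
new_paragraph = ["Open the lid.", "Remove the old battery.", "Close the lid."]

# Sentence-based lookup still finds the two unchanged sentences.
sentence_hits = [s for s in new_paragraph if s in sentence_tm]

# Paragraph-based lookup finds nothing, because the paragraph as a whole differs.
paragraph_hit = " ".join(new_paragraph) in paragraph_tm
```

Here the sentence-level memory returns two of the three sentences, while the paragraph-level memory returns no exact match for the changed paragraph.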


Paragraph-Based Segmentation

The source text is divided into paragraphs only; the paragraphs are not split further into sentences.

This can be useful to prevent sentences from being inserted into the translation out of context, e.g. during pre-translation.