...

Decoration takes a text segment in a given source language and 'decorates' it with term translations into a given target language. Decorated terms are marked with <mrk> XML tags. The cases and features involved in decoration are described below:

XML tags in segment

When the input segment itself contains XML tags, such as XLIFF tags, the decoration ignores them. First, the tags are parsed and removed from the segment. Then the decoration algorithm adds the translated terms to the segment, and finally the removed XML tags are added back to the decorated segment at the appropriate places. If an XML tag conflicts with a <mrk> tag, the XML tag is split. <mrk> tags are never split, because one <mrk> tag always represents exactly one translated term and splitting it would affect the result of the decoration. Split XML tags, especially XLIFF tags with ids, might produce an invalid format, so the user who receives the response should handle them.

Code Block
languagexml
Example Segment: <g id=123>Terminology Management</g> Software
Example Term: Management Software
Decoration: <g id=123>Terminology </g><mrk><g id=123>Management</g> Software</mrk>
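The three steps above (remove the tags, decorate, add the tags back and split them where they conflict with <mrk>) can be sketched in Python as follows. This is only an illustration, not TermWeb's implementation: the function names `strip_tags` and `reinsert` are hypothetical, the sketch assumes well-nested paired tags only (no self-closing tags), and the <mrk> spans are passed in as character offsets instead of being computed by a term search.

```python
import re

# Matches an opening or closing tag; group 1 is "/" for closing tags.
TAG = re.compile(r"<(/?)([A-Za-z][\w:.-]*)[^<>]*>")


def strip_tags(segment):
    """Return (plain_text, pairs); each pair is (start, end, open_tag, close_tag)
    with offsets measured in the tag-free text. Assumes well-nested paired tags."""
    plain, pairs, stack = [], [], []
    pos = offset = 0
    for m in TAG.finditer(segment):
        text = segment[pos:m.start()]
        plain.append(text)
        offset += len(text)
        pos = m.end()
        if m.group(1):  # closing tag: pair it with the most recent opening tag
            start, open_tag = stack.pop()
            pairs.append((start, offset, open_tag, m.group()))
        else:
            stack.append((offset, m.group()))
    plain.append(segment[pos:])
    return "".join(plain), pairs


def reinsert(plain, pairs, mrk_spans):
    """Add <mrk> tags and the removed tags back; split removed tags that
    partially overlap a <mrk> span. <mrk> spans are never split."""
    spans = []
    for start, end, open_tag, close_tag in pairs:
        cuts = set()
        for s, e in mrk_spans:
            if s < start < e < end:      # mrk ends inside the tag pair
                cuts.add(e)
            elif start < s < end < e:    # mrk starts inside the tag pair
                cuts.add(s)
        bounds = [start] + sorted(cuts) + [end]
        spans += [(a, b, open_tag, close_tag) for a, b in zip(bounds, bounds[1:])]
    spans += [(s, e, "<mrk>", "</mrk>") for s, e in mrk_spans]
    indexed = list(enumerate(spans))
    out = []
    for i in range(len(plain) + 1):
        # Innermost spans close first, outermost spans open first.
        closing = sorted((x for x in indexed if x[1][1] == i),
                         key=lambda x: (-x[1][0], -x[0]))
        opening = sorted((x for x in indexed if x[1][0] == i),
                         key=lambda x: (-x[1][1], x[0]))
        out.extend(x[1][3] for x in closing)
        out.extend(x[1][2] for x in opening)
        if i < len(plain):
            out.append(plain[i])
    return "".join(out)


plain, pairs = strip_tags("<g id=123>Terminology Management</g> Software")
# "Management Software" occupies characters 12..31 of the tag-free text.
result = reinsert(plain, pairs, [(12, 31)])
```

With the example segment, `result` reproduces the decoration shown above, including the duplicated `id` on the split <g> tag that the receiver has to handle.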

Conflicting terms (sharing common words)

In some cases, multiple terms that share common words are eligible for decoration. Not all of them can be included. Since TermWeb version 3.18.0.7, the term with the most characters is prioritized.

Code Block
languagexml
Example Segment: Terminology Management Software
Example Terms: Terminology Management (22 characters), Management Software (19 characters)
Decoration: <mrk>Terminology Management</mrk> Software ('Terminology Management' is used instead of 'Management Software', because it is longer in characters)
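A minimal Python sketch of the "longest term wins" rule follows. It is an illustration under assumptions, not the actual TermWeb algorithm: `resolve_conflicts` and `decorate` are hypothetical names, terms are matched as plain substrings, and ties between equally long terms are broken by leftmost position, which the documentation does not specify.

```python
def resolve_conflicts(segment, terms):
    """Pick non-overlapping matches, preferring terms with more characters."""
    # Collect every occurrence of every term as a (start, end) span.
    matches = []
    for term in terms:
        start = segment.find(term)
        while start != -1:
            matches.append((start, start + len(term)))
            start = segment.find(term, start + 1)
    # Longest first; leftmost first on ties (the tie-break is an assumption).
    matches.sort(key=lambda m: (-(m[1] - m[0]), m[0]))
    chosen = []
    for start, end in matches:
        # Keep a match only if it does not overlap an already chosen one.
        if all(end <= s or start >= e for s, e in chosen):
            chosen.append((start, end))
    return sorted(chosen)


def decorate(segment, terms):
    """Wrap the chosen matches in <mrk> tags."""
    out, pos = [], 0
    for start, end in resolve_conflicts(segment, terms):
        out.append(segment[pos:start])
        out.append("<mrk>" + segment[start:end] + "</mrk>")
        pos = end
    out.append(segment[pos:])
    return "".join(out)


decorated = decorate(
    "Terminology Management Software",
    ["Terminology Management", "Management Software"],
)
# decorated == "<mrk>Terminology Management</mrk> Software"
```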

Homonyms

Homonyms are terms with exactly the same name but different meanings. Homonyms are usually created in different concepts. All homonyms are included in the decoration.

Code Block
languagexml
Example Segment: Italian food is good.
Example Terms: italian (the adjective), Italian (the language)
Decoration: <mrk term:sourceTerm="italian"><mrk term:sourceTerm="Italian"/>Italian</mrk> food is good.

Synonyms

Synonyms are included in the decoration in the same manner as homonyms.

Code Block
languagexml
Example Segment: Italian food is good.
Example Terms: food (eng), Essen (ger), Lebensmittel (ger)
Decoration: Italian <mrk term:tgt="Essen"><mrk term:tgt="Lebensmittel"/>food</mrk> is good.
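The nesting used in the example (the first translation on the outer <mrk> tag, each further translation as an empty <mrk/> child) can be sketched in Python as below. The helper name `mark_term` is hypothetical, and the order in which translations are listed is an assumption; only the output shape is taken from the example above.

```python
from xml.sax.saxutils import escape, quoteattr


def mark_term(source_text, translations):
    """Build one <mrk> element carrying all target translations of a term."""
    first, *rest = translations
    # Every additional translation becomes an empty child <mrk/> element.
    extra = "".join(f"<mrk term:tgt={quoteattr(t)}/>" for t in rest)
    return f"<mrk term:tgt={quoteattr(first)}>{extra}{escape(source_text)}</mrk>"


decorated = "Italian " + mark_term("food", ["Essen", "Lebensmittel"]) + " is good."
```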

Deprecated terms

There is an option to highlight deprecated/forbidden terms, which are identified by a certain field and value. The field and value are configured in the template. The decoration tag of a term identified as deprecated contains the attribute term:deprecated="true".

Code Block
languagexml
Example Segment: Wireless Network
Example Terms: Wireless (eng), WiFi (ger - identified as deprecated)
Decoration: <mrk term:tgt="WiFi" term:deprecated="true">Wireless</mrk> Network

Accepted terms

There is an option to highlight accepted terms, which are identified by a certain field and value. The field and value are configured in the template. The decoration tag of a term identified as accepted contains the attribute term:accepted="true".

Code Block
languagexml
Example Segment: Wireless Network
Example Terms: Wireless (eng), kabellos (ger - identified as accepted)
Decoration: <mrk term:tgt="kabellos" term:accepted="true">Wireless</mrk> Network
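As a rough Python sketch of how a configured field/value pair could map to the term:deprecated and term:accepted attributes: the function names, the dictionary shape of a term's fields, and the field name "Status" are all hypothetical and only serve the illustration. The real field and value come from the template configuration.

```python
def status_attributes(term_fields, deprecated_rule=None, accepted_rule=None):
    """Return the extra <mrk> attributes for one term.

    term_fields: dict of field name -> value for the term (assumed shape).
    *_rule: (field, value) pair, standing in for the template configuration.
    """
    attrs = {}
    if deprecated_rule and term_fields.get(deprecated_rule[0]) == deprecated_rule[1]:
        attrs["term:deprecated"] = "true"
    if accepted_rule and term_fields.get(accepted_rule[0]) == accepted_rule[1]:
        attrs["term:accepted"] = "true"
    return attrs


def mark(source_text, target_term, attrs):
    """Render a <mrk> tag with the target term and any status attributes."""
    extra = "".join(f' {name}="{value}"' for name, value in attrs.items())
    return f'<mrk term:tgt="{target_term}"{extra}>{source_text}</mrk>'


# "Status" and its values are made-up examples of a configured field and value.
deprecated = status_attributes({"Status": "deprecated"},
                               deprecated_rule=("Status", "deprecated"))
decorated = mark("Wireless", "WiFi", deprecated) + " Network"
```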

Terms with no translation

There is an option to show source-language terms that have no translation, i.e. no term in the target language.

Punctuation marks in segment

To search for terms during decoration, the segment is split into words. Punctuation marks are ignored in this process. Since TermWeb version 3.18.0.7, the apostrophe character is also treated as a space.
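The word splitting can be approximated in Python as shown below. This is a sketch under an assumption: every character that is not a letter, digit or underscore is treated as a separator, which covers the apostrophe rule but may differ from TermWeb's exact handling of other punctuation such as hyphens. The function name `split_into_words` is hypothetical.

```python
import re


def split_into_words(segment):
    """Split a segment into words, dropping punctuation.

    The apostrophe is not a word character, so it acts like a space.
    """
    return re.findall(r"\w+", segment)


words = split_into_words("Italian food is good.")
# words == ["Italian", "food", "is", "good"]
possessive = split_into_words("the user's guide")
# possessive == ["the", "user", "s", "guide"]
```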

Term search limit during decoration

After the segment is split into words, the search algorithm considers only a limited number of terms when looking for translations. By default, for each word of the segment only the first 200 source terms starting with that word are used in the search. This limit can be changed by editing the following TermWeb property:

...
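The effect of the per-word limit described above can be illustrated with a short Python sketch. The function name `candidate_terms` is hypothetical, and the case-insensitive comparison and the ordering of the term list are assumptions, since the documentation does not say how "the first 200" are ordered.

```python
from itertools import islice


def candidate_terms(word, source_terms, limit=200):
    """Return at most `limit` source terms whose first word equals `word`."""
    word = word.lower()
    # Comparing case-insensitively is an assumption of this sketch.
    matching = (t for t in source_terms if t.lower().split()[:1] == [word])
    return list(islice(matching, limit))


terms = ["Management Software", "Management", "Terminology Management"]
limited = candidate_terms("management", terms, limit=1)
# limited == ["Management Software"]
```

With a limit lower than the number of matching terms, later terms are never considered, so a term may be missing from the decoration even though it exists in the term base.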