Term Extraction
Stopwords or Terms
When the terminologist responsible for the term extraction opens the task, a list of term candidates is displayed.
With a mouse click, the terminologist determines which term candidates are terms and which ones are stopwords.
To mark stopwords, the user must have the right to manage stopwords (see the right Stopwords in the System Settings section of the user group rights). Terminologists have this right by default.
Stopwords are words that are filtered out during term extraction and are not offered as term candidates. Typical stopwords are articles, expletives, and conjunctions. The larger the list of stopwords, the more precise the results of the term extraction will be.
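The effect of a stopword list on the candidate list can be illustrated with a minimal sketch. It is not how Across implements term extraction; the function name `extract_candidates`, the sample stopword list, and the simple word-splitting rule are assumptions made only for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list (assumption); Across maintains its own lists.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "is"}

def extract_candidates(text, stopwords=STOPWORDS):
    """Count single-word candidates and skip every word on the stopword list."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(w for w in words if w not in stopwords)

candidates = extract_candidates(
    "The valve and the pump: a valve is a part of the pump housing."
)
print(candidates.most_common())
```

With a longer stopword list, fewer irrelevant words reach the candidate list, so the terminologist has less to sort through.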
The context often plays a major role in deciding whether a word is a term or a stopword. Therefore, the icon above the term candidate list can be used to display the context of the respective term candidate.
Please note that terms cannot be extracted from source documents in Asian languages due to the morphological structure of these languages.
Learning system
Term candidates for which crossTerm entries already exist are displayed in blue and bold type in the list. Thus, the terminologist can concentrate on what matters: terms that are new and that have not yet been translated. In addition, once the term extraction task has been completed, all stopwords are saved in a list and are no longer displayed as term candidates. The more often and the more intensively you use the term extraction feature, the more valuable it will become to you as a translation tool.
Words highlighted in bold and blue in the term candidate list already exist as terms in crossTerm. They only need to be selected as terms and translated if crossTerm does not yet contain target-language equivalents for them.
For words already marked as stopwords in Across, the checkbox is activated and grayed out.
When you double-click a term candidate, it is highlighted in color in the Source View. When you double-click it again, the display jumps to the next occurrence of the term candidate in the Source View.
Editing Term Candidates
Term candidates may need to be edited, e.g. to change a plural noun to singular. To do this, click the selected term candidate and make the required changes. Press Enter or switch to another term candidate to save the changes.
Source-language terms can no longer be modified during the term translation that follows the term extraction. Any changes to source-language terms must therefore be made during the term extraction.
List of term candidates
The term candidate list can be customized. For example, it can be sorted alphabetically or by frequency by clicking the respective column head. Furthermore, various filter functions can be used to hide the following elements from the list:
- Terms: All term candidates that are already marked as terms by activating the respective checkbox are hidden.
- Non-terms: All term candidates not yet marked as terms are hidden. Accordingly, all words marked as terms are displayed.
- Words whose frequency is below a defined threshold.
- Words whose number of characters is below a defined threshold.
- Single words: All term candidates consisting of only one word are hidden.
- Three-word combinations: All term candidates consisting of three words are hidden.
- Stopwords: All term candidates already marked as stopwords are hidden.
In addition to the filter functions, you can use the icon to add individually selected words in the Source View to the term candidate list.
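The filter functions listed above can be thought of as independent conditions that each hide matching candidates. The following is a minimal sketch under that assumption; the `Candidate` class, the `is_visible` function, and all parameter names are hypothetical and do not correspond to any Across API.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    frequency: int
    is_term: bool = False
    is_stopword: bool = False

def is_visible(c, hide_terms=False, hide_non_terms=False, min_frequency=0,
               min_chars=0, hide_single_words=False, hide_three_words=False,
               hide_stopwords=False):
    """Return True if no active filter hides the candidate."""
    word_count = len(c.text.split())
    hidden = (
        (hide_terms and c.is_term)
        or (hide_non_terms and not c.is_term)
        or c.frequency < min_frequency          # frequency threshold
        or len(c.text) < min_chars              # character-count threshold
        or (hide_single_words and word_count == 1)
        or (hide_three_words and word_count == 3)
        or (hide_stopwords and c.is_stopword)
    )
    return not hidden

candidates = [
    Candidate("pump", 12, is_term=True),
    Candidate("the", 40, is_stopword=True),
    Candidate("pressure relief valve", 3),
    Candidate("housing", 1),
]

# Hide stopwords and everything that occurs fewer than two times.
shown = [c.text for c in candidates
         if is_visible(c, hide_stopwords=True, min_frequency=2)]
print(shown)
```

In this sketch the filters combine freely, so enabling several of them at once narrows the list further.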
Finish task
Upon completion of the term extraction, i.e. after you have selected all desired term candidates as terms or stopwords, you can finish the task by clicking the icon in the crossDesk toolbar.
After a term extraction task is finished, all term candidates marked as stopwords are automatically added to the respective stopword list under Tools > System Settings > Terminology > Stopwords.
Term candidates marked as terms are offered for translation in the subsequent term translation.
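The outcome of finishing the task can be summarized in a short sketch: candidates marked as stopwords extend the stopword list, and candidates marked as terms are passed on to the term translation. The function `finish_task` and the data layout are assumptions for illustration only and say nothing about how Across stores this data.

```python
def finish_task(marks, stopword_list):
    """Split marked candidates into an updated stopword list and terms to translate.

    marks maps each candidate to "term", "stopword", or None (unmarked).
    """
    new_stopwords = {text for text, mark in marks.items() if mark == "stopword"}
    terms = sorted(text for text, mark in marks.items() if mark == "term")
    return set(stopword_list) | new_stopwords, terms

marks = {
    "pump": "term",
    "the": "stopword",
    "housing": "term",
    "and": "stopword",
    "very": None,
}
stopwords, to_translate = finish_task(marks, {"a", "the"})
print(sorted(stopwords), to_translate)
```

Because the stopword list only grows, the words marked here would no longer be offered as candidates in later extractions, which matches the learning behavior described above.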