New features in TransTools 3.15

TransTools v.3.15 contains several useful features and fixes.

New features for clean-up & formatting of Word documents

If you work with Word documents converted from PDFs, or convert PDF documents to Word format on your own using OCR or PDF conversion tools, you certainly know that such documents contain a lot of excessive formatting, making it more difficult to translate documents efficiently. Document Cleaner is an integrated tool for tag-cleaning and re-formatting of such documents to make them easier to translate both in CAT tools and without them. The latest version contains several enhancements that should be useful for you.

New bookmark removal features

Bookmarks in Word documents are special invisible elements used to refer to content inside a document. Typically, bookmarks are created automatically when you insert a table of contents in your Word document, although there are a number of other situations when bookmarks are used, and they can also be created manually. When you translate a document converted from a PDF file, unlike a document created from scratch, bookmarks are rarely useful. Most of these bookmarks are used by tables of contents which will need to be regenerated after translation, so there is no need to retain them while you translate the document.

How bookmarks are shown inside memoQ CAT tool

Document Cleaner provides several methods to remove excessive bookmarks:

  • You can remove excessive bookmarks as part of the tag-cleaning operation using the Tag Cleaner command. To do this, check the option called “Remove excessive bookmarks”. When you enable its suboption called “Not referenced from inside document”, Tag Cleaner will remove all bookmarks that are not referred to from hyperlinks, buttons or fields inside the same document. It is recommended to use this option whenever you process a converted Word document. If you also want to remove bookmarks used by tables of contents, check the box called “TOC bookmarks”. Again, you can safely remove Table of Contents (TOC) bookmarks because you will need to regenerate tables of contents after translation anyway.
    Bookmark cleaning options in Tag Cleaner
  • You can remove the majority of document bookmarks except hidden bookmarks. This is accomplished by a quick action called “Editing – Remove all bookmarks (except hidden bookmarks)” under the Other Tools / Quick Actions tab. This command removes most of the bookmarks, including TOC bookmarks, but it does not remove some hidden bookmarks because these bookmarks may have been inserted for some special purpose.
  • You can remove absolutely all document bookmarks using a quick action called “Editing – Remove all bookmarks (including hidden bookmarks)” under the Other Tools / Quick Actions tab. Most of the time, you can use this command with documents converted from PDF files.
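
To make the two Tag Cleaner options more concrete, here is a minimal Python sketch of how they might select bookmarks for removal. It is not TransTools code: the function names, the way the two options are combined, and the idea of passing in a ready-made set of referenced names are all assumptions made for illustration. The sketch relies only on the Word convention that automatically generated TOC bookmarks have names starting with `_Toc`.

```python
def is_toc_bookmark(name: str) -> bool:
    # Word names the bookmarks it generates for tables of contents "_Toc...".
    return name.startswith("_Toc")


def select_for_removal(names, referenced, remove_unreferenced=True, remove_toc=False):
    """Return the bookmark names a clean-up pass would delete (hypothetical model).

    names       -- all bookmark names found in the document
    referenced  -- names that hyperlinks, buttons or fields point to
    """
    referenced = set(referenced)
    selected = []
    for name in names:
        unreferenced = remove_unreferenced and name not in referenced
        toc = remove_toc and is_toc_bookmark(name)
        if unreferenced or toc:
            selected.append(name)
    return selected


names = ["_Toc123", "Intro", "_Ref456", "Appendix"]
referenced = {"_Toc123", "_Ref456"}

# Only "Not referenced from inside document" is enabled.
print(select_for_removal(names, referenced))
# "TOC bookmarks" is enabled as well.
print(select_for_removal(names, referenced, remove_toc=True))
```

In this model, a referenced TOC bookmark survives the first pass and is only dropped when the TOC option is also enabled, which mirrors the description of the two checkboxes above.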

In this version of TransTools, all bookmark cleaning options listed above were improved. To take advantage of the changes, download the latest version of TransTools and install it. Recommendations for further improvements are much appreciated!

Dealing with hyperlinks

Hyperlinks are special text elements that allow navigation to websites or places inside the same document. In a CAT tool, hyperlinks are shown as several tags appearing before and after the hyperlinked text:

Text with hyperlinks in Microsoft Word

How hyperlinks are shown inside memoQ CAT tool

In some cases, especially in documents converted from PDFs, hyperlinks do not need to be preserved during translation, or need to be removed for some specific purpose:

  1. If the converted PDF file was not recognized properly and the website URL of many hyperlinks needs to be edited, it is better to convert the hyperlinks to regular text, edit the text, and then re-create the hyperlinks.
  2. If hyperlinks contain readable URL addresses such as http://www.microsoft.com, it may not be necessary to make them clickable, so they can be converted to regular text.
  3. Some documents copied from legal reference systems contain superfluous hyperlinks cross-referencing other areas in the document, and there is no need to preserve these cross-references after translation.

If you would like to convert hyperlinks to regular text, you can use a new command called “Editing – Convert hyperlinks to text (selection)”, which is available under the Other Tools / Quick Actions tab. Select the portion of the document that contains the hyperlinks to be converted, and then run the command.

To take advantage of this and other changes, download the latest version of TransTools and install it.

Other new features

Here is a list of other new features released in recent updates (since the last newsletter):

TransTools for Word

  • New text separators were added for conversion of selected text to dual-language text in Dual-Language Document Assistant tool [added in version 2.11.1 on 15/06/2017].
  • Remove Highlight tool – you can now remove all highlight colors with the exception of a single highlight color that you specify [added in version 2.11 on 26/02/2017].
  • Autoformat command (Document Cleaner tool) – you can now set text wrapping to None for all tables inside your document, and there are more options for changing cell alignment in all tables [introduced in version 2.11 on 26/02/2017].
  • The last selected profile is now automatically loaded when you open Document Cleaner or Dual-Language Document Assistant tools [introduced in version 2.11 on 26/02/2017].
  • Several bug fixes.

TransTools for AutoCAD

  • When drawing text is extracted into a Microsoft Word document, paragraph and line breaks are represented as paragraph and line breaks, respectively, without using any special codes which were used in previous versions [introduced in version 2.1 on 07/03/2017].
  • When drawing text is extracted into a Microsoft Excel spreadsheet, paragraph breaks are represented as line breaks, and line breaks – as <br/> text sequences [introduced in version 2.1 on 07/03/2017].
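
The Excel representation described above can be illustrated with a short Python sketch. It is only a model of the mapping (paragraph breaks become line breaks inside the cell, line breaks become `<br/>` sequences); the function names and the nested-list input format are hypothetical and do not come from TransTools for AutoCAD.

```python
def to_excel_cell(paragraphs):
    """Join drawing text into a single cell string.

    paragraphs -- list of paragraphs, each given as a list of lines
    """
    # Lines within one paragraph are joined by <br/>,
    # paragraphs are joined by a line break inside the cell.
    return "\n".join("<br/>".join(lines) for lines in paragraphs)


def from_excel_cell(cell_text):
    """Reverse the mapping, e.g. when translated text goes back into the drawing."""
    return [paragraph.split("<br/>") for paragraph in cell_text.split("\n")]


drawing_text = [["PUMP P-101", "see note 3"], ["DN 50"]]
cell = to_excel_cell(drawing_text)
print(repr(cell))
```

Because the two kinds of break map to two different markers, the structure can be restored without loss as long as the translated text keeps the `<br/>` sequences in place.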

For a detailed list of changes in version 3.15 and previous versions, click here. You can download and install the latest version of TransTools from here.

New software announcement

Terminology is a very important part of CAT tools. Creating a good terminology database for your translation projects speeds up translation and improves its quality. However, adding terms to your terminology database does not always mean that these terms will be detected during translation and QA. If your CAT tool does not recognize a specific inflection (case, tense, number, etc.) of your term, the term will not always be detected in the source or target text. This means that the term's translation will not be suggested to you (so you may end up using an inconsistent translation), and the CAT tool will not find potential mistakes during terminology checks as part of the QA operation.

Various CAT tools handle term inflections differently. Some CAT tools (e.g., SDL Trados Studio, Déjà Vu X and Wordfast Pro) use a fuzzy approach to detecting term inflections. Other CAT tools – specifically memoQ and Memsource Cloud – need to know how inflections are formed from the main form of the term, so it is the responsibility of the terminologist to add special ‘mark-up’ (* and | symbols) inside the terminology database to allow detection of all term inflections.
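
To show what such mark-up achieves, here is a rough Python sketch that turns a marked-up term into a regular expression. It is an approximation for illustration only, not the matching logic of memoQ or Memsource Cloud: it assumes that `|` marks the end of the unchanging stem of a word, that a trailing `*` allows any ending, and that a word without mark-up must match exactly. The function name is hypothetical.

```python
import re


def markup_to_regex(term: str) -> re.Pattern:
    """Build a regex that accepts inflected forms of a marked-up term (simplified model)."""
    parts = []
    for word in term.split():
        if "|" in word:
            # Everything before the pipe is the stem; any ending may follow.
            stem = word.split("|", 1)[0]
            parts.append(re.escape(stem) + r"\w*")
        elif word.endswith("*"):
            # Trailing asterisk: the whole word is the stem.
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            # No mark-up: the word must appear unchanged.
            parts.append(re.escape(word))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


pattern = markup_to_regex("bandage| pneumatique|")
match = pattern.search("deux bandages pneumatiques")
print(match.group(0) if match else None)
```

Without the mark-up, the same term would only be found in its dictionary form, which is why unprepared term bases miss inflected occurrences in these tools.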

Term Morphology Editor is a new standalone Windows application designed for memoQ and Memsource Cloud users who want to speed up the process of morphological preparation of terminology databases in order to make them more efficient. It simplifies this process in several ways:

  1. It provides a visual morphology editor for editing term morphology. Using this editor, you can quickly change the position where the variable part of each word starts:
    Editing morphology of each word using a visual term editor
    To do this, you use the Left button to expand the variable part (ending) of each word, and the Right button to shrink it. The variable part is shown in red against a yellow background.
  2. It provides morphological suggestions for each processed term. These suggestions are offered from two sources: Hunspell (a spell-checking engine) and the Morpher web service (for Russian and Ukrainian only). The best suggestion is applied automatically when the term base is imported into Term Morphology Editor, and the linguist can always apply a different suggestion if it is more suitable. The list of languages for which suggestions are offered is provided here. These suggestions are often very accurate, so you may not even need to edit morphology manually. In the screenshot below, you can see the automatic suggestion from the Hunspell engine for the Polish term “dobry tłumacz” (good translator):
    Morphology suggestion from Hunspell engine
    Morphology suggestions from Morpher web service (for Russian and Ukrainian languages) are especially accurate, even for common nouns (the example below is for the Russian translation of “Fyodor Mikhailovich Dostoyevsky”):
    Morphology suggestion from Morpher.ru engine
  3. When you update a term base file, the updated terms’ morphology can be saved to a special database. If you process the same terms in the future, the program will automatically offer correct term morphology from the database, so there is no need to edit morphology when you encounter the same term.
  4. It offers additional commands for applying a specific type of matching (Custom, Exact, Fuzzy, etc.) to selected terms. For instance, by applying Custom matching mode to all terms in a memoQ term base, without any manual editing, you will increase the number of detected inflections in many languages. In the example below, when we apply Custom matching to a French term “bandage pneumatique”, memoQ will automatically accept the plural form “bandages pneumatique” during translation:
    Applying Custom matching to a composite French term to allow detection of plural form
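
The visual editing described in item 1 can be modelled as moving a split position inside each word. The Python sketch below is a hypothetical illustration of that idea, not the application's own code: the class and method names are invented, and the rule that at least one character must stay in the stem is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WordSplit:
    word: str
    split: int  # index where the variable part (ending) starts

    def expand_ending(self) -> None:
        # "Left" action: move the split one character to the left.
        # Assumption: keep at least one character in the stem.
        self.split = max(1, self.split - 1)

    def shrink_ending(self) -> None:
        # "Right" action: move the split one character to the right.
        self.split = min(len(self.word), self.split + 1)

    def markup(self) -> str:
        # Render the stem and the variable ending separated by a pipe.
        return self.word[:self.split] + "|" + self.word[self.split:]


w = WordSplit("tłumacz", split=len("tłumacz"))
w.expand_ending()
print(w.markup())
```

Storing the split as a single integer per word also makes it easy to save the result and reapply it when the same term is processed again, as item 3 describes.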

Here is a screenshot of Term Morphology Editor in action:

Term Morphology Editor - full screenshot

If you would like to read more about Term Morphology Editor and test it on your computer, feel free to visit its web page.

I hope you have found this information useful. See you in future newsletters. Don't forget to subscribe to TransTools on Facebook, Twitter, Google Plus, LinkedIn or Scoop.it.

November 10, 2017

Developed by Stanislav Okhvat, 2007–2017

Microsoft Word®, Excel®, PowerPoint® and Visio® are registered trademarks of Microsoft Corporation.
AutoCAD© is copyright of Autodesk, Inc.
SDL Trados® (including SDL Trados Studio, Trados Workbench, TagEditor and Microsoft Word Addin) is a registered trademark of SDL plc.
memoQ is copyright of Kilgray Translation Technologies.
Wordfast© is copyright of Yves Champollion.

