New features - TransTools version 2.0

Version 2.0 of TransTools includes a number of significant additions. Here is a summary of all the updates.

Document Cleaner - a new suite of tools for cleaning up the formatting of documents produced by OCR tools and PDF-to-text converters and for reducing the number of tags in CAT software

From time to time, we have to deal with badly formatted documents. They can be produced by optical character recognition (OCR) software, by PDF-to-text converters, or by inexperienced authors. If you use a CAT tool, importing such a document before translation may produce many tags, which make the translation process far from enjoyable. If you have not yet migrated to a CAT tool, you still have to fix some formatting problems in order to create a good-quality translated document.

Document Cleaner, which is part of TransTools for Word, is a new collection of commands that will allow you to minimize the amount of time you spend formatting documents before or during translation. It provides the following commands:

  • Reformat - Use it to remove text and paragraph shading (as well as highlighting), reset uneven character spacing, and remove hyphenation. All of these formatting attributes are often added by OCR tools to match the original document formatting as closely as possible, or to show which characters were recognized unreliably. Such formatting is one of the biggest sources of tags in CAT software.
  • Resave - Some documents, when imported into CAT tools, contain hundreds or even thousands of tags for no apparent reason. Such tags often appear around spaces between words. The command saves the document to RTF and back to the original format, which often eliminates practically all such 'rogue' tags.
  • Table Column Aligner - When you run OCR on a document containing multi-page tables, these tables are recognized as several tables, one per page. When you join these tables, however, they will often have misaligned vertical borders. This command helps to format such tables properly, creating a perfectly aligned table.
  • Line Removal - Some OCR tools insert vertical or horizontal floating lines instead of borders. Removing these lines is very tiresome, as they are difficult to find. This command helps you track them down and remove them much more easily.
  • UnFrame - Most OCR tools insert frames around tables, images, or text blocks. The only purpose of these frames is to position objects more accurately, but they are often unnecessary, and they present a number of problems for the translator: translated text may not fit because of frame size restrictions, and it cannot flow onto several pages if the text expands. This command helps you remove such frames, retaining their contents.
  • Bookmark Cleanup - If a document contains bookmarks, they are imported into CAT tools as pairs of tags. This command removes specific types of bookmarks, such as bookmarks which are not referenced from fields or hyperlinks, or 'table of contents' bookmarks.
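To make the Bookmark Cleanup selection rule more concrete, here is a minimal Python sketch of how such a filter could work. It is not the TransTools implementation: the function name `bookmarks_to_remove` and its parameters are hypothetical, and the only Word-specific fact it relies on is that Word names its automatically generated 'table of contents' bookmarks with a `_Toc` prefix.

```python
def bookmarks_to_remove(bookmark_names, referenced_names, remove_toc=True):
    """Return the bookmark names that a cleanup pass might delete.

    bookmark_names   -- all bookmark names found in the document
    referenced_names -- names targeted by fields or hyperlinks
    remove_toc       -- also delete Word's auto-generated '_Toc...' bookmarks

    Hypothetical helper for illustration only; not part of TransTools.
    """
    referenced = set(referenced_names)
    doomed = []
    for name in bookmark_names:
        if name.startswith("_Toc"):
            # 'Table of contents' bookmarks are handled by their own switch.
            if remove_toc:
                doomed.append(name)
            continue
        if name not in referenced:
            # Nothing points at this bookmark, so it only adds a tag pair.
            doomed.append(name)
    return doomed


if __name__ == "__main__":
    names = ["_Toc123456", "Intro", "Clause5"]
    print(bookmarks_to_remove(names, referenced_names=["Clause5"]))
```

Keeping the 'table of contents' rule behind a separate switch reflects the fact that removing those bookmarks affects the table of contents, so a user may want to decide on it independently.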

Screenshot: Document Cleaner dialogue

For more information on Document Cleaner, refer to its online help page.

I am planning to add new commands to Document Cleaner in the future. If you have ideas on how to fix common formatting problems in Word documents or minimize the number of tags in CAT software, please feel free to drop me a line and I will consider adding this functionality to Document Cleaner.

Hide / Unhide Text command - selective translation using CAT software

Most CAT tools provide a way to import only parts of Word documents by ignoring all the text formatted as 'hidden'. This is very handy when you translate dual-language documents with two languages appearing side by side, or when you translate new text added to a previously translated document. The Hide / Unhide Text command (TransTools for Word) makes it very easy to mark text that needs to be translated or skipped during translation: highlight that text using a highlight color, then run the command and choose one of its options. You can use the same command to unhide text after you export the translated document from your CAT tool.
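The idea behind this workflow can be sketched in a few lines of Python. This is only a simplified model, not how TransTools works internally: the `Run` class and the functions `hide_runs`, `unhide_all` and `visible_text` are hypothetical, and the assumption that the command offers "hide highlighted text" and "hide everything except highlighted text" modes is inferred from the description above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Run:
    """A stretch of text with uniform formatting (hypothetical model)."""
    text: str
    highlighted: bool = False
    hidden: bool = False


def hide_runs(runs, hide_highlighted):
    """Return copies of the runs with the 'hidden' attribute set.

    hide_highlighted=True  hides the highlighted runs (text to skip).
    hide_highlighted=False hides all other runs (only highlighted
    text stays visible for translation).
    """
    return [replace(r, hidden=(r.highlighted == hide_highlighted)) for r in runs]


def unhide_all(runs):
    """Make every run visible again, e.g. after export from the CAT tool."""
    return [replace(r, hidden=False) for r in runs]


def visible_text(runs):
    """The text a CAT tool would import if it ignores hidden text."""
    return "".join(r.text for r in runs if not r.hidden)


if __name__ == "__main__":
    doc = [Run("New clause to translate. ", highlighted=True),
           Run("Previously translated clause.")]
    print(visible_text(hide_runs(doc, hide_highlighted=False)))
```

In this model the highlight only marks the selection and the hidden attribute does the actual filtering, which is why the same operation can be undone after translation without losing any text.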

Screenshot: Hide / Unhide Text dialogue

For more information, go to the tool's online reference page.

What's New in Find Spaces command

A new option in the Find Spaces command (TransTools for Word) allows you to remove all spaces at the beginning and end of paragraphs or lines [suggested by Alexander Kormanovsky]. In case you do not use it yet, the Find Spaces command makes it very easy to remove excess spaces from Word documents, making them "cleaner" and easier to translate.
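As a rough illustration of what the new option does, the following Python sketch trims spaces and tabs from both ends of every line in a plain-text string. It is an assumption-laden stand-in, not the actual command: the function name `trim_line_edges` is hypothetical, lines are taken to be separated by "\n", and formatting is ignored entirely.

```python
def trim_line_edges(text):
    """Remove spaces and tabs at the start and end of each line.

    Spaces inside a line are left untouched. Hypothetical helper for
    illustration; it works on plain text, not on a Word document.
    """
    return "\n".join(line.strip(" \t") for line in text.split("\n"))


if __name__ == "__main__":
    sample = "  First paragraph.  \n\tSecond  paragraph. "
    print(repr(trim_line_edges(sample)))
```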

Updates in TransTools installation program

The TransTools installation program has been updated. Roger Chadel provided the Brazilian Portuguese translation of the Automatic Installer interface (originally translated into Portuguese by Joao Albuquerque).

If you would like to help localise the TransTools installation interface into another language (English, Russian, Portuguese and Brazilian Portuguese are already covered), please contact me. The installation program is rather complex: besides the standard installer text, which I can easily add in all common languages, there are around 920 words (around 5500 characters) of additional custom text with about 6% repetitions. I would appreciate your help.

21st October 2012

