The commands in this section help you format Microsoft Word documents properly before translation. If you use a CAT tool, they help minimize issues that arise during translation, such as excessive tags that make translation difficult, excessive spaces that reduce translation memory (TM) matches, and incorrect breaks that prevent proper segmentation. If you do not use a CAT tool, they help you prepare well-formatted, professional-looking documents.
||Collection of tools for preparing badly formatted documents for translation
||Use this command:
- to remove excessive tags (in CAT tools) caused by bad formatting in documents produced by OCR and PDF conversion tools
- to remove unnecessary bookmarks in order to minimize tags in CAT tools
- to take text out of frames created by some OCR tools
- to remove vertical and horizontal lines created by OCR tools
- to align columns after joining multi-page tables in documents produced by OCR or PDF conversion tools
- to resolve common formatting problems in documents produced by OCR or PDF conversion tools
||Remove incorrect paragraph and/or line breaks
||This command helps you remove incorrect breaks in order to avoid segmentation problems in CAT tools. The tool is equally suited to finding occasional incorrect breaks in documents prepared by a person (manual correction mode) and to reformatting text with many incorrect breaks, such as text copied from PDF files, movie scripts, or regular text files (automatic correction mode).
|Find & Replace Excessive Spaces
||Find and remove excessive spaces
||Use this command to find excessive spaces (e.g., more than one space between words, or one or more spaces at the start or end of paragraphs or lines). You can replace them with a single space, a tab character, a line break, or a paragraph break, or remove them altogether. Doing this will improve translation memory matches in CAT tools and make the document look more professional.
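
To illustrate what "excessive spaces" means in practice, here is a minimal Python sketch that works on plain text. It is not how the command itself is implemented, and it does not touch Word documents. The function name `collapse_excessive_spaces` and its `replacement` parameter are hypothetical, chosen only for this example. It collapses runs of two or more spaces into a replacement string and removes spaces at the start and end of each line.

```python
import re


def collapse_excessive_spaces(text: str, replacement: str = " ") -> str:
    """Illustrative only: clean up excessive spaces in plain text.

    - Spaces at the start or end of each line are removed.
    - Runs of two or more spaces inside a line are replaced with
      `replacement` (a single space by default; a tab or an empty
      string are other possible choices).
    """
    cleaned_lines = []
    for line in text.split("\n"):
        # Remove spaces (and only spaces) at the line edges.
        line = line.strip(" ")
        # Replace each run of 2+ spaces; the lambda makes the
        # replacement literal, so backslashes are not reinterpreted.
        line = re.sub(r" {2,}", lambda _match: replacement, line)
        cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    sample = "  Hello   world  \nNext  line "
    print(collapse_excessive_spaces(sample))
```

Two segments that differ only in spacing become identical after this kind of cleanup, which is why removing excessive spaces tends to improve translation memory matches.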