THE HOW-TO LIBRARY



Translating Desktop Publishing Formats: Fiendish Files and Funky Formats


By Jost Oliver Zetzsche. Submitted on August 26, 2005

About the author: Jost Zetzsche is an ATA-certified English-to-German translator and a localization and translation consultant. He co-founded International Writers' Group on the Oregon coast and sends out a free, biweekly technical newsletter for translators (see www.internationalwriters.com/toolkit).



Generally, DTP programs can be categorized into two groups: those created for design-oriented publications and those intended for content-oriented publications.

In the first group are programs such as InDesign, PageMaker, and QuarkXPress.

The second group is home to applications such as FrameMaker and Ventura.


The design-oriented programs provide superior graphic and font management and processing. Text is handled in individual and independent text boxes that can be placed anywhere in the document. These programs are typically closely integrated with graphic applications, and they offer advanced methods of prepress setup.

While the content-heavy applications also offer good graphics and prepress management (albeit not as advanced as the design-oriented programs), their main focus is on the processing of text. This shows in the advanced TOC (Table of Contents) and index generation, cross-references, page break management (widow and orphan rules), independent character and paragraph setup, and the ability to output documents in a huge variety of formats. The latter is increasingly done through a tight integration with SGML and XML.

While any of these formats is, of course, directly translatable in its own environment—i.e., you can overwrite the text of a PageMaker file within PageMaker—you will have to save these formats to a non-compiled format (i.e., text-based format) to process them in a computer-assisted translation tool.

Content-Oriented Desktop Publishing

Any of the content-oriented formats (FrameMaker or Ventura) offers a fairly painless way of saving the original compiled format in an interchange format that can be easily processed. Because the emphasis for these files is on text and not on graphics, text is represented in one flow and can be saved in a simple "Save as" process for each file (which is typically synonymous with one chapter).

The very concept of these programs is that there will be as much automation in the layout as possible. This is achieved, for instance, through fairly sophisticated widow and orphan rules so that there is only a small amount of additional pagination.

In general, these programs are very well suited for translation. There is no problem with non-Western languages even in Western versions of the programs (provided that your operating system supports those languages). The size of the files tends to be relatively small because graphics are usually linked and not inserted, and all of these programs are exceptional in the ways they publish and re-publish text in a great variety of formats, including HTML, XML, PDF, or RTF.

FrameMaker

Open the .FM file within FrameMaker and select File> Save as to save the file as a text-based .MIF file. Usually you do not have just one file but a number of files that are all organized in one .book file. To avoid the individual opening and saving of each file, you can use the well-liked MifSave (see home.comcast.net/~bruce.foster/MifSave.htm) to do this as a batch process for a whole book. (And it's totally okay to ask your client to do this for you if you do not have FrameMaker on your computer.)
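
Because MIF is plain text, the translatable content can be inspected with a text editor or a few lines of script. What follows is only a minimal sketch in Python, assuming the common case where paragraph text sits in String statements that open with a backtick and close with a straight quote. The sample snippet and the function name are invented for illustration, and a real CAT filter handles far more (escaped characters, markers, variables, cross-references).

```python
import re

# Matches MIF String statements such as: <String `Some text'>
# MIF opens a string with a backtick and closes it with a straight quote.
STRING_RE = re.compile(r"<String\s+`([^']*)'>")


def extract_mif_strings(mif_text):
    """Return the contents of every String statement, in file order."""
    return STRING_RE.findall(mif_text)


# Invented sample, shaped like a fragment of a MIF paragraph.
sample = """<Para
 <PgfTag `Body'>
 <ParaLine
  <String `Open the file and '>
  <Font <FTag `Emphasis'>>
  <String `save it.'>
 >
>"""

if __name__ == "__main__":
    for text in extract_mif_strings(sample):
        print(repr(text))
```

Note that tag names such as the paragraph tag are deliberately not matched, since only String statements carry running text in this sketch.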

Once all your files are preprocessed, they are supported in any of the larger CAT tools (Trados, SDLX, Déjà Vu, Transit), most of whose representatives will tell you that their FrameMaker processing is one of their strongest features—which only goes to show that FrameMaker is a very translator-friendly format.

Trados processes these files differently from its competitors in two ways. In Trados you need to convert the MIF files with the so-called S-Tagger for FrameMaker into RTF files before you can translate them in either Word or TagEditor. Also, Trados creates an additional file called ancillary.rtf, which contains background information that is repeated in each file. The other tools process the MIF files directly and translate the information that Trados places into the ancillary file individually for each file.

Ventura

The process for translating Ventura files is very similar to translating FrameMaker files with the exception that you need to save the files to a Ventura-specific text format rather than a MIF format. Trados is the only CAT application that supports the Ventura format (with the help of the utility S-Tagger for Ventura)—but don't worry, there are very few translation projects in that format.

Design-Oriented Desktop Publishing

In these formats each text block, called a story, is saved in an individual text box from which the text has to be manually exported into an application-specific text format and later re-imported if you want to process it in a translation memory program. While this is theoretically not an issue, it is super time-consuming when you have to do this for tens or even hundreds of stories in one document. This also means that even if CAT programs claim that they can process the native export format of PageMaker, Quark, or InDesign, only a few allow the batch processing of all the text fragments involved.
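
The idea behind batch processing is simply to collect every story into one file for translation and to split that file back into stories afterwards. The following Python sketch illustrates the round trip under loud assumptions: the marker line format, the story IDs, and both function names are hypothetical and do not correspond to the export format of PageMaker, Quark, InDesign, or any CAT plug-in.

```python
import re

# Hypothetical marker line that separates stories in the combined file.
MARKER = "### STORY {id} ###"
MARKER_RE = re.compile(r"^### STORY (\S+) ###$", re.MULTILINE)


def collect_stories(stories):
    """Join a dict of {story_id: text} into one string with marker lines."""
    parts = []
    for story_id, text in stories.items():
        parts.append(MARKER.format(id=story_id))
        parts.append(text)
    return "\n".join(parts)


def split_stories(combined):
    """Reverse collect_stories(): return {story_id: text}."""
    pieces = MARKER_RE.split(combined)
    # pieces looks like ['', id1, '\ntext1\n', id2, '\ntext2', ...]
    ids = pieces[1::2]
    texts = pieces[2::2]
    return {story_id: text.strip("\n") for story_id, text in zip(ids, texts)}


if __name__ == "__main__":
    stories = {"1": "Headline", "2": "Body text\nsecond line"}
    combined = collect_stories(stories)
    print(combined)
    print(split_stories(combined))
```

One limitation of this sketch is that leading and trailing blank lines inside a story are lost on the way back, which a real tool would have to preserve.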

Another time-consuming task for any of these formats is resizing: because of text expansion, the stories will have to be resized after translation. Make sure that you take that into consideration when accepting a job or quoting for a job in any of these programs!
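
To get a rough feel for how much resizing a job may involve, the expansion of each story can be estimated by comparing character counts. This is only a sketch in Python: the 15 percent threshold, the story IDs, and the sample strings are assumptions made for illustration, and character counts are just a proxy for the space text actually takes up on the page.

```python
def expansion(source, target):
    """Return target length relative to source length (1.0 = same length)."""
    return len(target) / len(source)


def stories_to_resize(pairs, threshold=1.15):
    """Return IDs of stories whose translation grew beyond the threshold.

    pairs maps a story ID to a (source_text, translated_text) tuple.
    The default threshold of 1.15 is an arbitrary illustrative value.
    """
    flagged = []
    for story_id, (source, target) in pairs.items():
        if not source:
            continue  # nothing to compare against
        if expansion(source, target) > threshold:
            flagged.append(story_id)
    return flagged


if __name__ == "__main__":
    pairs = {
        "s1": ("Save the file", "Speichern Sie die Datei"),
        "s2": ("OK", "OK"),
    }
    print(stories_to_resize(pairs))
```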

This is not where the problems stop, though. QuarkXPress and PageMaker in particular are still very "last century" when it comes to processing multilingual text. While Unicode is a widely accepted standard and the reason why it is so easy to mix and match different writing systems on web pages and all kinds of other documents, these programs are not up to par on this. As I mentioned a few months ago, Quark has now announced that its upcoming version 7 will support Unicode, but PageMaker most likely will not because the folks at Adobe have a better choice when it comes to processing Unicode: InDesign.
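
Why Unicode matters here can be shown in a few lines of Python. The sketch below checks whether a string mixing German, Hebrew, and Japanese fits into a given encoding. The choice of code pages is illustrative only; nothing here is a statement about which encodings QuarkXPress or PageMaker use internally.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# One string that mixes three writing systems.
mixed = "Grüße, שלום, 日本語"

if __name__ == "__main__":
    for encoding in ("cp1252", "shift_jis", "utf-8"):
        print(encoding, can_encode(mixed, encoding))
```

A legacy code page covers one script family at a time, which is why a program built around such code pages needs a separate language version per script, while a Unicode-based program does not.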

But let's start from the beginning and go through each of the programs and the options that they present for translation. (This may all sound very tedious if you aren't familiar with or interested in this, but you'll be glad to have this information when you need it.)

InDesign

As the only one of these programs that supports Unicode, InDesign makes it possible to write in all languages, even in its English version. This may not sound too impressive, but wait until we talk about its competitors. . . .  

After a fairly unsuccessful version 1, InDesign really gained traction beginning with version 2. Presently you will encounter InDesign files that are created in versions 2, CS (3), or CS2 (4). To efficiently translate in InDesign you will need a program that exports all the stories (the above-mentioned text boxes) into one large file that can be processed in a computer-assisted translation tool. (Of course, it is possible to translate directly within InDesign, but the emphasis was on "efficiently.")

Trados offers small plug-ins as part of all its versions of the Workbench product that support InDesign versions 2 or CS (the plug-ins are stored under C:\Program Files\TRADOS\Txx_xx\FI\IND; follow the instructions in the help file on how to install them). Once you have installed the plug-in and opened the InDesign file, you will see a new Trados menu with all the necessary commands to export and re-import your file. The exported text can be translated either within Trados TagEditor or in any other tool that supports the InDesign export format. It works like a charm, and the import, once the translation is finished, works just as well.

SDLX and Star Transit (with a separate plug-in) also offer the option of translating InDesign files, but again just for versions 2 and CS. SDL is working on the development of products that support CS2 files for Trados and SDLX, but these are still in the pre-beta phase. And the only way to down-save an InDesign CS2 file is by exporting it into an InDesign-specific XML format (INX) and importing this into the CS version. For this to happen you would need both versions of InDesign, which is quite costly.

So, the best bet with InDesign CS2 for now will be the filter that is offered by ECM-Engineering (see www.ecm-engineering.de). This application allows you to export into an RTF format that is supported by Trados, Wordfast, SDLX, or Déjà Vu. Napsys (www.napsys.com) also offers a (rather expensive) filter, but it only supports CS2 on the Macintosh platform.

PageMaker

To translate PageMaker files (an increasingly rare occurrence as Adobe is trying to push InDesign over PageMaker), you can use either Star Transit, with a separate plug-in that supports PageMaker 6-7, or Story Collector for PageMaker, a plug-in that comes with Trados and supports versions 6.5 and 7.

To install the Trados plug-in, open the help file under C:\Program Files\TRADOS\Txx_xx\FI\PM for further instructions. Once the plug-in is installed, open the PageMaker file in PageMaker and you'll find the command Trados Story Collector under Utilities > Plug-ins. Export all the stories into one large PageMaker-specific text file, save the original PageMaker file (important!), and translate the exported text file with TagEditor or any other application that supports the PageMaker format. The import process is virtually the same as the export and should go seamlessly.

Alternatively, you can write your own macro that allows the export and re-import of all stories in and out of PageMaker. Here are some instructions: tinyurl.com/762r8.

All of the above is true for Western languages and to some degree for Eastern European languages. Any of the more complex languages, however, including the bi-directional languages (Hebrew and Arabic) or the Asian double-byte languages, are flat-out not supported in the Western versions of PageMaker. Though you can purchase language-specific versions for these languages, it would make a LOT more sense to convert to InDesign and take it from there. Because InDesign and PageMaker are both Adobe products, the upgrade path is relatively easy (both in terms of purchasing a less expensive competitive upgrade version of InDesign when you already own PageMaker and in terms of converting the files).

QuarkXPress

Here are a couple of reasons why I think that the prize for the Greatest Localization Stinker should easily go to Quark: I already mentioned that the English version of Quark does not support double-byte languages (which means you have to buy Korean, Chinese, and Japanese versions of Quark if you intend to work in these languages) or any bi-directional languages (Arabic and Hebrew). But even for the common FIGS (French, Italian, German, and Spanish) languages, QuarkXPress goes so far as to force you to buy the significantly more expensive Passport edition if you would like to use those spell-checkers and hyphenation rules. And if you dare to not do that, it will refuse to start if you use any default keyboard other than the English keyboard. (I grant you, it's easy to switch your default keyboard back to the English keyboard, but it sends me through the roof every time!)

OK. That being said . . .

Quark is the market leader in desktop publishing so it's not too surprising that there is decent support for different versions of Quark among the CAT tools. Star Transit offers a separate plug-in that supports the batch processing of the English (and Passport) versions 3-6 for both the Windows and Mac platforms; Trados offers plug-ins for versions 4.1-6 for English (and Passport) and version 4.1 for Japanese; and SDLX offers a plug-in for the English (and Passport) versions 4-6 for the Mac. All of these plug-ins were preceded by a program called CopyFlow (see www.napsys.com) which, just like these programs, allows for the batch export and import of text from Quark files. It may still be worthwhile to take a look at Napsys' website—for instance, they offer plug-ins for Asian versions for Quark that no one else does. Also, users of some programs (I know of Déjà Vu but there may be others as well) are eligible for a discount on CopyFlow products.

If you have the Passport edition of Quark and only work in and out of Western and Eastern European languages, you should be pretty well set with the help of any of the tools mentioned above. If you only have the (cheaper) English version, make sure to ask your client to save the file as a "Single Language" file if the client uses the Passport edition; otherwise you will not be able to open the file.

For Middle Eastern languages, there are plug-ins that can be used with the English or Passport versions (www.arabicsoftware.net), so you should be able to work with those languages as well. It becomes much hairier with the Asian double-byte languages. While the Japanese version 4.1 is supported by the Trados plug-in and several other versions by CopyFlow, this still means that you need several versions of Quark for different languages, plug-ins, and platforms.

So, again, the easiest would be to convert to InDesign, right? Well, not so fast, my friend. Don't forget that Quark has just been awarded the Greatest Localization Stinker award, and true to its form it makes it very difficult to convert to InDesign. Adobe obviously has tried to allow for a conversion from Quark documents into InDesign and has actually published a guide on how to do it at: www.adobe.com/products/indesign/conversion.html. The problem is that the only version where that is possible is Quark 4.1. Both Quark 5 and Quark 6 have proprietary formats that InDesign cannot get to. While it is possible to down-save from Quark 5 to Quark 4.1, it is not possible to go directly from Quark 6 to 4.1. Instead, you need to first save to 5 and then to 4.1. Unfortunately, hardly anyone owns version 5 because it was not a good version and even Quark lovers rejected it.
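
The down-save chain described above can be written out as a tiny lookup, which makes it easy to see how many intermediate versions a given file needs. This Python sketch encodes only what the paragraph states (6 saves down to 5, 5 saves down to 4.1, and only 4.1 opens in InDesign); the function and variable names are made up for illustration.

```python
# Down-save steps as described in the text: 6 -> 5 -> 4.1.
DOWN_SAVE = {"6": "5", "5": "4.1"}

# Per the text, Quark 4.1 is the only version InDesign can convert.
INDESIGN_READABLE = {"4.1"}


def conversion_path(quark_version):
    """Return the list of steps from a Quark version to InDesign."""
    path = [quark_version]
    while path[-1] not in INDESIGN_READABLE:
        next_version = DOWN_SAVE.get(path[-1])
        if next_version is None:
            raise ValueError("no known down-save path from Quark " + path[-1])
        path.append(next_version)
    path.append("InDesign")
    return path


if __name__ == "__main__":
    print(" -> ".join(conversion_path("6")))
```

Every intermediate entry in the returned path is a version of Quark that somebody has to own, which is exactly where the chain tends to break at version 5.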

Not good.

So what all this ends up meaning is that translations involving Quark and Asian languages are often done without the aid of computer-assisted translation tools and, once they are done, often saved as EPS files and then placed into an English Quark file. While to the outside spectator this may look good, it isn't, because even minor changes will cause great headaches.
TranslatorsCafé.com Copyright © ANVICA Software Development 2002—2009. All rights reserved.