Notes2Self.net

Stephen McGibbon's Web Journal

Mapping the Office Binary File Formats to OpenXML

Brian Jones provides the text of one of Ecma's responses to the comments received from National Standards Bodies on DIS 29500 that pertain to mapping from the binary formats to OpenXML:

We believe that interoperability between applications conforming to DIS 29500 is established at the Office Open XML-to-Office Open XML file construct level only.

Prescriptive guidance on, or tools to enable, transformation from Microsoft Office "binary" file formats (i.e., .doc, .xls, and .ppt) (the "Binary Formats") to Office Open XML formatted files is not the intention or in scope of DIS 29500.  As a result this request is outside the bounds of this process.

It is important to note that substantial use is being made of both the Binary Formats and Office Open XML in the marketplace today.  Many products (such as OpenOffice.org) support the Binary Formats. Microsoft has indicated that many companies and public institutions have received the documentation for the Binary Formats, and are working with it at this time, and can create mappings between the Binary Formats and Office Open XML. Translators from the Binary Formats to XML formats such as ODF have already been developed and are in wide use. For example, the Sun ODF Plug-in for Microsoft Office (http://sun.systemnews.com/articles/112/3/sw/18208) states that "The plug-in allows users the ability to seamlessly convert Microsoft Office documents to and from ODF. The ODF plug-in supports Microsoft Word, Excel and Powerpoint".

Likewise, there is widespread use of Office Open XML in the marketplace today across platforms and applications.  A few examples include the implementations released by Apple (Mac OS X Leopard, iWork 08, iPhone), Adobe (InDesign), Microsoft (Office 2007, Office 2003, Office XP, Office 2000, Office 2008 Mac OS X), Novell (Suse Open Office), Google (Search / Preview), Mindjet (MindManager), Intergen, OpenXML/ODF Translator (Open Source project on Sourceforge), Dataviz (DocumentsToGo on Palm OS, MacLinkPlus on Mac OS X Leopard), NeoOffice, Altova (XMLSpy), MarkLogic (XML Content Server), Datawatch (Monarch Pro), QuickOffice (QuickOffice Premier 5.0 on Symbian), Altsoft (XML2PDF Server 2007) and those under development by Corel (WordPerfect), AbiWord, Gnome (Gnumeric), Xandros, Linspire, Turbolinux and others.  These implementations are now available on many platforms, including Linux, the Macintosh, Windows, and handheld devices (PalmOS, Symbian, iPhone, and Windows Mobile).

The widespread use of both the Binary Formats and Office Open XML formats indicates that, at this time, third parties can use both formats and build mappings between them.

Nonetheless, Ecma International discussed this subject with Microsoft Corporation, the author of the Binary Formats.  To make third-party conversion from the Binary Formats to DIS 29500 even easier, Microsoft agreed to:

  • Initiate a Binary Format-to-ISO/IEC JTC 1 DIS 29500 Translator Project on the open source software development web site SourceForge (http://sourceforge.net/) in collaboration with independent software vendors.  The Translator Project will create software tools, plus guidance, showing how a document written using the Binary Formats can be translated to DIS 29500.  The Translator will be available under the open source Berkeley Software Distribution (BSD) license, and anyone can use the mapping, submit bugs and feedback, or contribute to the Project.  The Translator Project will start on February 15, 2008.
  • Make it even easier to get access to the Binary Formats documentation by posting it and making it available for a direct download on the Microsoft web site no later than February 15, 2008.  The Binary Formats have been under a covenant not to sue and Microsoft will also make them available under its Open Specification Promise (see www.microsoft.com/interop/osp) by the time they are posted.

We will modify DIS 29500 to include an informative reference to the SourceForge project.