Learning about Open XML on-line - OpenXML Developer - Blog - OpenXML Developer

Learning about Open XML on-line


Open XML is a new standard. So new, in fact, that the schemas are still being edited and haven't been published by Ecma yet. And there are no books out on Open XML development, although that will surely change in the next year.

So for now, the best place to learn about Open XML is on-line. This site will be a growing repository of information, and there is also some great information on blogs already. Here are some links to useful posts for Open XML developers ...

First, for .NET developers, you can use the new WinFX packaging API to read/write/create Open XML documents. Kevin Boske has a post on his blog about "Getting Started with Office Open XML and WinFX" that provides a straightforward overview of what you'll need and how to get started. If you don't have WinFX, you can get the February CTP here.

Kevin also has some other posts of interest to Open XML developers:

"Deleting a part" provides an example of removing the VBA project from any Office Open XML file. This approach can easily be generalized to removing any component from the Open XML package. Includes source code, and some interesting discussion in the comments. Here's a link to a code snippet that includes the updated ECMA namespaces.

"How to create documents programmatically" is a high-level summary of the issues and options in creating an Open XML document from scratch in your own application.

In learning the Open XML formats, most developers start with word processing documents. They're probably the easiest to understand, and certainly the most widely used in real-world applications. Brian Jones has a great post entitled Introduction to Word documents that covers the basics. He provides a detailed look at how DOCX files store all the pieces that make up a typical Word document: styles, bullets and numbering, font information, document settings, story content, tables, custom-defined XML, sections, and headers/footers. This is worth a very careful read if you're writing code that modifies or creates Word documents. The discussion in the comments also covers some good points.
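
One quick way to see those pieces for yourself is to open a DOCX as the ZIP package it is and list its parts. The sketch below is an illustration only: `list_parts` is a hypothetical helper, and the content-types namespace is assumed to be the one from the published packaging standard. It pairs each part name with the content type declared for it in `[Content_Types].xml`, using a per-part `Override` if there is one and the per-extension `Default` otherwise.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace (published standard; earlier drafts may differ).
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def list_parts(package):
    """Return {part name: declared content type} for every part in the package."""
    with zipfile.ZipFile(package) as z:
        types = ET.fromstring(z.read("[Content_Types].xml"))
        names = [
            n for n in z.namelist()
            if n != "[Content_Types].xml" and not n.endswith("/")
        ]
    # Default entries apply by file extension, Override entries by part name.
    defaults = {
        d.get("Extension", "").lower(): d.get("ContentType")
        for d in types.findall(f"{{{CT_NS}}}Default")
    }
    overrides = {
        o.get("PartName", "").lstrip("/"): o.get("ContentType")
        for o in types.findall(f"{{{CT_NS}}}Override")
    }
    parts = {}
    for name in sorted(names):
        base = name.rsplit("/", 1)[-1]
        ext = base.rsplit(".", 1)[-1].lower() if "." in base else ""
        parts[name] = overrides.get(name, defaults.get(ext))
    return parts
```

Printing the result for a document saved by Word should show separate parts for the main story, styles, numbering, settings, and so on, although the exact part names are up to the producing application.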

Another area of great interest for developers is Open XML's support for custom schemas. You can define highly customized schemas for your particular domain or application, and integrate those schemas into documents so that your code can use your own semantics to describe or access the contents of those documents. Brian has three good posts on this topic:

"Custom Defined Schemas" covers the basics of how to use custom schemas in Open XML.

"Create a rich Word document based on your own custom XML" includes a sample ZIP file that provides a great example of how content controls can be bound to a custom XML schema.

"Integrating with business data: Store custom XML in the Office XML formats" is a high-level overview of how Office 2007 will support the use of custom schemas.

Brian's blog is the most comprehensive source of Open XML technical details on the web to date. These posts also provide useful information for Open XML developers:

"Inclusion of alternate formats" discusses how to include other documents, such as a PDF representation, in an Open XML document. Some of this doesn't work yet with Office 2007 Beta 1, but Brian explains where it's all headed.

In "Answer to question on package relationships" Brian answers a reader's question about the thinking behind the design of the package relationships within an Office Open XML document.
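
To make the relationships idea concrete, here is a small Python sketch that reads a relationships part and returns what it declares. It is an illustration under assumptions, not something taken from Brian's post: `read_relationships` is a hypothetical helper, and the namespace is assumed to be the one from the published packaging standard. By default it reads the package-level `_rels/.rels`, which is where a consumer starts when locating the main document part.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace (published standard; earlier drafts may differ).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_relationships(package, rels_name="_rels/.rels"):
    """Return (Id, Type, Target) for each relationship in one .rels part."""
    with zipfile.ZipFile(package) as z:
        root = ET.fromstring(z.read(rels_name))
    return [
        (rel.get("Id"), rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    ]
```

Passing `rels_name="word/_rels/document.xml.rels"` would list the relationships of the main document part instead, assuming the package uses that part name.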

"Why Office has moved to XML formats" is interesting background information for those who might wonder why Microsoft Office is moving from proprietary binary formats to Open XML file formats in the next release, Office 2007.

Finally, if you're at the "what the heck is Open XML?" phase of learning about this topic, the "Exploring the XML File Formats" post on my blog covers the basic concepts without getting into the technical details.

  • This is again about software and NOT about the format. I do not want to use your platform-specific APIs; where is the format? How much time should I spend trying to look for them?

    I want something as clear as this:
    http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf

    Gromerdiegrom.
  • The document you're referring to is the published standard for ODF.  The Open XML Formats are not a published standard yet -- they're still going through the Ecma process, as ODF went through the Oasis process a year earlier.

    The fact the Open XML Formats standard isn't out yet is a big part of the reason we're doing this site.  Various contributors will be adding content here in the days and weeks ahead that will help document the format.  Meanwhile, if you're eager to dig into the details of the format, you can always refer to the initial draft of the Ecma spec here:
    http://www.ecma-international.org/activities/Office%20Open%20XML%20Formats/TC45_FD_XML_docform.zip

    Also, did you look at the links above?  Many of them make no reference to any software, just the formats themselves.  For example, Brian Jones wrote an overview of many aspects of the word-processing format (that doesn't use any code or tools from any platform) here:
    http://blogs.msdn.com/brian_jones/archive/2006/02/02/523469.aspx

    Most of the posts on Brian's blog about the formats, like this one, are directly covering the formats and not geared toward any particular language or platform or tool.

    - Doug
  • Has Microsoft planned any future release that will replace the features of Excel COM?
    Open XML does not provide anything to execute macros or Goal Seek-type operations using the Open XML API.
    Yes, the content and style/formatting of existing spreadsheets can be manipulated.