Open XML is a new standard. So new, in fact, that the schemas are still being edited and haven't been published by Ecma yet. And there are no books out on Open XML development, although that will surely change in the next year.
So for now, the best place to learn about Open XML is online. This site will be a growing repository of information, and there is also some great information on blogs already. Here are some links to useful posts for Open XML developers ...
First, for .NET developers, you can use the new WinFX packaging API to read/write/create Open XML documents. Kevin Boske has a post on his blog about "Getting Started with Office Open XML and WinFX" that provides a straightforward overview of what you'll need and how to get started. If you don't have WinFX, you can get the February CTP here.
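If you aren't on .NET, it may help to see that an Open XML package is an ordinary ZIP archive with a few required parts. Here's a minimal sketch in Python that writes a one-paragraph word processing document using only the standard library. The function name `build_minimal_docx` is made up for illustration, and the namespaces and content types are the ones from the published standard, which is an assumption here: the drafts and betas current as of this post may use different values.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Assumption: namespaces and content types below follow the published
# standard; pre-release drafts may differ.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

ROOT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>
</w:document>"""


def build_minimal_docx(text: str) -> bytes:
    """Return the bytes of a package holding a single paragraph of text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # Content types and the root relationship tell a consumer where
        # the main document part lives and what kind of part it is.
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", DOCUMENT.format(text=escape(text)))
    return buffer.getvalue()
```

Writing the result of `build_minimal_docx("Hello")` to a file with a `.docx` extension should give you something to experiment with. Real documents carry many more parts.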
Kevin also has some other posts of interest to Open XML developers:
"Deleting a part" provides an example of removing the VBA project from any Office Open XML file. This approach can easily be generalized to removing any component from the Open XML package. Includes source code, and some interesting dialog in the comments. Here's a link to a code snippet that includes the updated ECMA namespaces.
"How to create documents programmatically" is a high-level summary of the issues and options in creating an Open XML document from scratch in your own application.
In learning the Open XML formats, most developers start with word processing documents. They're probably the easiest to understand, and certainly the most widely used in real-world applications. Brian Jones has a great post entitled "Introduction to Word documents" that covers the basics. He provides a detailed look at how DOCX files store all the pieces that make up a typical Word document: styles, bullets and numbering, font information, document settings, story content, tables, custom-defined XML, sections, and headers/footers. This is worth a very careful read if you're writing code that modifies or creates Word documents. The discussion in the comments also covers some good points.
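To see those pieces for yourself, you can open a DOCX file as a ZIP archive and look at the part names. Here's a rough sketch in Python. The `describe_parts` helper and the `KNOWN_PARTS` table are hypothetical, and the part names are the ones Word conventionally uses in the published format, which is an assumption: the format locates parts through relationships rather than fixed names, and a given document may omit any of the optional ones.

```python
import io
import zipfile

# Assumption: conventional part names written by Word. Other producers
# may name parts differently, since relationships define the real layout.
KNOWN_PARTS = {
    "word/document.xml": "story content",
    "word/styles.xml": "styles",
    "word/numbering.xml": "bullets and numbering",
    "word/fontTable.xml": "font information",
    "word/settings.xml": "document settings",
}


def describe_parts(package_bytes: bytes) -> dict[str, str]:
    """Map each ZIP entry in the package to a short description."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        names = package.namelist()
    # Anything not in the table is reported as "other".
    return {name: KNOWN_PARTS.get(name, "other") for name in names}
```

Calling `describe_parts(open("sample.docx", "rb").read())` on a document of your own (the file name is a placeholder) returns a dictionary you can print to get a quick inventory.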
Another area of great interest for developers is Open XML's support for custom schemas. You can define highly customized schemas for your particular domain or application, and integrate those schemas into documents so that your code can use your own semantics to describe or access the contents of those documents. Brian has three good posts on this topic:
"Custom Defined Schemas" covers the basics of how to use custom schemas in Open XML.
"Create a rich Word document based on your own custom XML" includes a sample ZIP file that provides a great example of how content controls can be bound to a custom XML schema.
"Integrating with business data: Store custom XML in the Office XML formats" is a high-level overview of how Office 2007 will support the use of custom schemas.
Brian's blog is the most comprehensive source of Open XML technical details on the web to date. These posts also provide useful information for Open XML developers:
"Inclusion of alternate formats" discusses how to include other documents, such as a PDF representation, in an Open XML document. Some of this doesn't work yet with Office 2007 Beta 1, but Brian explains where it's all headed.
In "Answer to question on package relationships" Brian answers a reader's question about the thinking behind the design of the package relationships within an Office Open XML document.
"Why Office has moved to XML formats" is interesting background information for those who might wonder why Microsoft Office is moving from proprietary binary formats to Open XML file formats in the next release, Office 2007.
Finally, if you're at the "what the heck is Open XML?" phase of learning about this topic, the "Exploring the XML File Formats" post on my blog covers the basic concepts without getting into the technical details.
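One last note for those who do want the technical details: the package relationships that come up in Brian's posts are stored as small XML parts inside the package, such as `_rels/.rels` at the root. Here's a minimal sketch in Python that reads one of them. The `read_relationships` name is made up for illustration, and the namespace is the one from the published standard, which may not match the drafts current as of this post.

```python
import xml.etree.ElementTree as ET

# Assumption: relationships namespace from the published standard.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_relationships(rels_xml: bytes) -> list[tuple[str, str, str]]:
    """Return (Id, Type, Target) for each relationship in a .rels part."""
    root = ET.fromstring(rels_xml)
    return [
        (rel.get("Id"), rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    ]
```

You would typically feed it the bytes of a `.rels` entry read from the ZIP archive. Each target is relative to the part that owns the relationship, so following the chain from the root is how a consumer finds the main document and everything it refers to.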