Interview by Ryan Harte: February 24, 2010

Eric White is a Technical Writer for Microsoft who focuses on SharePoint, Office, and Open XML.  http://blogs.msdn.com/ericwhite/default.aspx

In this interview we talk to Eric about topics such as the synergies between functional programming and Open XML, as well as the transformation of Open XML documents into other forms.

 

What is the coolest implementation of Open XML you have seen?
I’m not sure of the coolest, but I see an awful lot of content generation systems. I think the content generation systems are perhaps the most compelling or useful implementations of Open XML that I’ve seen. Of course, I also particularly like content publishing systems, where people take Open XML WordprocessingML and transform it to XHTML, XPS, PDF, or a variety of other formats.

 
Tell us a bit about what you do at Microsoft?
Right now I’m a programming writer, and I specialise in Open XML, SharePoint, and Office in general. I also really like some of the more geeky subjects such as LINQ and functional programming. I think there are a lot of synergies between functional programming and Open XML, because functional programming really lends itself to querying documents and retrieving the data in them in a really effective fashion. It also really lends itself to the transformation of Open XML documents into other forms.
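As a rough illustration of the query style described above, the following Python sketch pulls paragraph text out of WordprocessingML with comprehensions rather than imperative state. It is not code from the interview: the sample markup, the helper names (`make_package`, `paragraphs`, `text_with_style`), and the assumption that the main part lives at `word/document.xml` are all illustrative. The in-memory package holds only that one part, so it is not a complete document that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical sample markup: three paragraphs, two styled as Heading1.
DOCUMENT_XML = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Overview</w:t></w:r></w:p>
    <w:p><w:r><w:t>Plain </w:t></w:r><w:r><w:t>text.</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Details</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def make_package(document_xml: str) -> bytes:
    """Zip the markup into an in-memory package (assumed part name: word/document.xml)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document_xml)
    return buf.getvalue()


def paragraphs(package: bytes):
    """Return a (style name or None, text) pair for every paragraph in the main part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    result = []
    for p in root.iterfind(".//w:p", NS):
        style = p.find("w:pPr/w:pStyle", NS)
        name = style.get(f"{{{W}}}val") if style is not None else None
        # A paragraph's text is the concatenation of all its w:t run fragments.
        text = "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        result.append((name, text))
    return result


def text_with_style(package: bytes, style_name: str):
    """Query: the text of every paragraph that carries the given style."""
    return [text for style, text in paragraphs(package) if style == style_name]


if __name__ == "__main__":
    print(text_with_style(make_package(DOCUMENT_XML), "Heading1"))
```

The query itself is a single comprehension over (style, text) pairs, which is the document-as-data view that makes functional techniques a natural fit.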

Just recently I completed the first version of a transform from Open XML WordprocessingML to HTML. It was on the order of about 3,000 lines of code, and it handles all kinds of interesting capabilities such as line numbering. Those are the kinds of areas I focus on. Another area that I own is PowerTools for Open XML, an open source project on CodePlex that has a lot of good examples of using Open XML.

So you are saying HTML can be converted to and from Office documents. Does that come up in SharePoint at all, with the migration of a whole lot of documents to web pages and stuff like that?
Sure, there are a few scenarios that are particularly compelling with this. One scenario is SharePoint wikis. You can access and replace content in SharePoint wikis using HTML, and the ability to transform an Open XML document to HTML allows you to populate, for instance, a SharePoint wiki with HTML you generate from a document.

Another really interesting use is that when people want to query documents, it’s significantly easier to query an XHTML document as XML than to query an Open XML document. So if all you’re interested in is getting the text of a set of paragraphs that have a particular style, and you first transform to XHTML, then the resulting query is exceptionally simple to write over the XHTML. And finally, developers often have the need to provide a preview of some content. They aren’t really interested in extreme accuracy of all the formatting such as fonts and so on; rather they want to provide a view into the actual words and photographs of the content.
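To give a sense of how small such a query can be, here is a minimal Python sketch. It assumes, purely for illustration, that the transform writes each paragraph's style name into the `class` attribute of an XHTML `p` element; the sample markup and the function name `text_of_class` are hypothetical and not taken from any particular tool.

```python
import xml.etree.ElementTree as ET

XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}

# Hypothetical output of a WordprocessingML-to-XHTML transform.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p class="Heading1">Overview</p>
    <p class="Normal">Some <b>bold</b> text.</p>
    <p class="Heading1">Details</p>
  </body>
</html>"""


def text_of_class(xhtml: str, css_class: str):
    """Return the text of every paragraph whose class attribute equals css_class."""
    root = ET.fromstring(xhtml)
    query = f".//h:p[@class='{css_class}']"
    # itertext() also collects text nested inside inline elements such as <b>.
    return ["".join(p.itertext()) for p in root.iterfind(query, XHTML_NS)]


if __name__ == "__main__":
    print(text_of_class(XHTML, "Heading1"))
```

The whole query is one path expression, because the XHTML already has one element per paragraph with its text inside, instead of text split across runs.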

So providing a transform of Open XML to XHTML, where you can selectively pick certain portions of the document to transform, gives developers a pretty good tool to provide either a comprehensive view or a preview of a subset of an Open XML document in their applications.
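A selective transform of this kind might be sketched in Python as follows. This is a toy under stated assumptions, not the transform discussed in the interview: it keeps only paragraph text and style names, the `keep` predicate and the function name `to_xhtml` are hypothetical, and treating unstyled paragraphs as "Normal" is a simplification.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
XHTML = "http://www.w3.org/1999/xhtml"

# Serialize XHTML elements without a namespace prefix.
ET.register_namespace("", XHTML)

# Hypothetical sample markup for the main document part.
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Overview</w:t></w:r></w:p>
    <w:p><w:r><w:t>Plain </w:t></w:r><w:r><w:t>text.</w:t></w:r></w:p>
    <w:p><w:r><w:t>Left out of the preview.</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def to_xhtml(document_xml: str, keep=lambda index, style: True) -> str:
    """Emit one XHTML <p> per body paragraph for which keep(index, style) is true."""
    root = ET.fromstring(document_xml)
    html = ET.Element(f"{{{XHTML}}}html")
    body = ET.SubElement(html, f"{{{XHTML}}}body")
    for index, p in enumerate(root.iterfind("w:body/w:p", NS)):
        style_el = p.find("w:pPr/w:pStyle", NS)
        # Assumption: paragraphs without an explicit style are labeled "Normal".
        style = style_el.get(f"{{{W}}}val") if style_el is not None else "Normal"
        if not keep(index, style):
            continue
        out = ET.SubElement(body, f"{{{XHTML}}}p", {"class": style})
        out.text = "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    # Preview: only the first two paragraphs of the document.
    print(to_xhtml(SAMPLE, keep=lambda index, style: index < 2))
```

Passing a different predicate, for example one that keeps only headings, gives an outline view from the same function, while the default keeps every paragraph.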

So how do PowerShell and PowerTools come into that?
The PowerTools are there as the examples and guidance for building a whole variety of Open XML applications. They are provided in source code form so that developers can take the code and extend it or modify it for their own particular needs.

One of the PowerShell cmdlets in PowerTools for Open XML generates an XHTML document from an Open XML document. So in the context of using Open XML within PowerShell, the cmdlets provided in PowerTools for Open XML are particularly handy. The source code in PowerTools for Open XML can also be used by C# developers to build their own applications.

So have you seen any cool uses of PowerTools? What’s been a standout one for you?
I’ve seen a couple of really interesting ones. One I was thinking of was a document generation system where the developers would design a custom template document and then generate a number of instance documents from that template using PowerTools.