Interview by Jian Sun, July 17, 2009
Xiaoyu is a content developer for the www.openxmldeveloper.org site and is based in China.  In this interview he talks about his experiences with Open XML and the www.openxmldeveloper.org site.
 
Hi, my name is Jian Sun. I’m originally from Qingdao, China. On behalf of my colleagues I’m conducting this interview for the www.openxmldeveloper.org site. Thank you for taking the time to answer my questions. We will post this recording on the website, along with the transcript in English and Chinese. Can you introduce yourself briefly, please?
My last name is Zou, I live in Chengdu, China. You can call me Xiaoyu.
 
Xiaoyu, what do you think is the coolest implementation of Open XML?
The coolest implementation … this is rather a broad question …
 
I mean, what is the coolest application that implements Open XML?
I think the format is very open. When you open the file, you immediately see the details and can start analyzing it. This is pretty cool. Prior to Office 2007, the Office file formats were comparatively complex, including the open format RTF. For small applications like mine, you had to know a lot of detail about the file format. Open XML now simplifies this process; I only need to learn the part I actually use. It’s just like working with HTML. I don’t have to parse everything. I only need to extract the part that I need. I think this is really good.
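
[Illustration] The idea of opening the package and reading only the part you need can be sketched in a few lines of Python. This is a minimal sketch and not code from Xiaoyu's software: the helper names `build_sample_package` and `extract_paragraphs` are hypothetical, and the sample package is a cut-down stand-in that holds only the parts needed for the demonstration, so it is not a complete, valid .docx file. It relies on two facts from the Open XML standard: a document is a ZIP package, and the main body of a word-processing document lives in the part `word/document.xml`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as defined by the Open XML standard.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_sample_package():
    """Build a tiny in-memory stand-in for a .docx package (hypothetical sample data)."""
    document_xml = (
        '<w:document xmlns:w="' + W_NS + '"><w:body>'
        '<w:p><w:r><w:t>Hello</w:t></w:r>'
        '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
        '<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document_xml)
        # A second part that the reader below never has to open or understand.
        package.writestr("word/styles.xml", "<placeholder/>")
    return buffer.getvalue()


def extract_paragraphs(package_bytes):
    """Return the text of each paragraph, reading only word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter("{" + W_NS + "}p"):
        # A paragraph's visible text is spread over its w:t (text) elements.
        texts = [t.text or "" for t in paragraph.iter("{" + W_NS + "}t")]
        paragraphs.append("".join(texts))
    return paragraphs


if __name__ == "__main__":
    for line in extract_paragraphs(build_sample_package()):
        print(line)
```

Everything else in the package (styles, settings, media) is ignored, which is the contrast with the older binary formats that the answer describes.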
 
What are some of the Open XML based products that you know are pretty good?
I don’t know a great deal about the products. Apart from Microsoft Office 2007, I haven’t heard a lot about the products. On the www.openxmldeveloper.org forum, products and applications are rarely mentioned as well.
 
Do you know if OpenOffice has support for Microsoft Office Open XML format?
I’m not sure about the latest version of OpenOffice, but OpenOffice primarily supports ODF (Open Document Format). I have also learned about the document file format standard in China, which is called UOF (Uniform Office Format), as well as domestic office products that support this format.
 
Do we have WPS format in China?
Yes. WPS is working with Google, and the editor is a free download from Google. I’m not sure whether it supports Microsoft Office Open XML.
 
When you do Open XML development, what else do you use apart from www.openxmldeveloper.org?
Mainly www.msdn.com. The SDK references there are very important to me when developing with Open XML.
 
Are there any other references that you use?
There are other related smaller sites and articles that can be found through search, but they are not ‘official’. I haven’t adopted a lot of what was written on them. www.openxmldeveloper.org is definitely a good resource, and I frequently post questions there.
 
Have you done any office development prior to Office 2007?
Not really. The file formats were proprietary and there were licensing implications, so I didn’t do much. Now that the new Open XML format is open and is a standard, there is more to look forward to.
 
What do you do when you are not building software? Hobbies etc…
Haha, there are a lot. If you visit my blog, you’ll find I love plants, such as succulents. I also enjoy music and the arts.
 
Do you develop for a personal hobby or do you work as a developer?
It’s purely a personal hobby, for my learning and research purposes. I work on my own projects.
 
Tell me about the open source project you are working on?
It’s not open source for the moment; it is freeware. It’s CAT (Computer Assisted Translation) software. It reads different file formats, analyzes them, assists a translator with the translation, and writes the output to different file formats too. So the Open XML file format plays a very important role here.
 
What language does your software translate to and from?
Any language that you require really. It is not an automated translation, but assists people to translate. For example, if you are a translator, it doesn’t matter what language you are translating, this software will simplify the translation process and improve your efficiency.
 
How does it simplify the translation process?
It leverages a technology called ‘Translation Memory’, along with other technologies that help with formatting documents. When translators are translating, they constantly have to adjust document formatting. This is an inconvenience. With technologies that assist this process, they no longer have to worry about adjusting and generating document formats. This makes translation work easier by improving both the quality of the work and the translators’ efficiency. ‘Translation Memory’ assists a translator by recalling similar phrases or sentences that have been translated before. It allows the translator to select from what is stored in ‘Translation Memory’, speeding up and simplifying the translation process.
 
I see, so it allows the translator to select translations from ‘Translation Memory’?
That’s right. ‘Translation Memory’ remembers already translated sentences and their sentence structure. When a new sentence has a structure similar to one stored in ‘Translation Memory’, the software reuses the stored translation. The software auto-translates the sentence structure, enabling the translator to focus on the parts that are different.
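
[Illustration] The lookup described above can be sketched as a small fuzzy-match store. This is a minimal sketch under stated assumptions, not the algorithm used in Xiaoyu's software, which the interview does not describe: the class name `TranslationMemory`, the 0.7 similarity threshold, and the English-to-French sample entries are all hypothetical, and character-level similarity from Python's standard `difflib` module stands in for whatever structural matching a real CAT tool performs.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores source/target sentence pairs and suggests similar past translations."""

    def __init__(self):
        self._entries = {}  # source sentence -> stored translation

    def add(self, source, target):
        """Remember a sentence the translator has already translated."""
        self._entries[source] = target

    def lookup(self, sentence, threshold=0.7):
        """Return (score, source, target) tuples at or above the threshold, best first.

        The threshold is an illustrative assumption, not a value from the interview.
        """
        matches = []
        for source, target in self._entries.items():
            # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
            score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
            if score >= threshold:
                matches.append((score, source, target))
        matches.sort(key=lambda match: match[0], reverse=True)
        return matches


if __name__ == "__main__":
    memory = TranslationMemory()
    memory.add("Click the Save button.", "Cliquez sur le bouton Enregistrer.")
    memory.add("Close the window.", "Fermez la fenêtre.")

    # A new sentence with the same structure as a stored one.
    for score, source, target in memory.lookup("Click the Open button."):
        print(round(score, 2), source, "->", target)
```

The translator would be offered the stored translation of the closest sentence and would only need to change the part that differs, here the name of the button.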
 
That’s very interesting. So which version of the Open XML Format SDK have you been using?
2.0
 
What are some of your experiences with SDK 2.0?
I didn’t use 1.0; I started on 2.0. One of the good things about SDK 2.0 is that it packages up all of the Open XML schemas and presents them to developers. Developers never have to worry about things like validation. I work efficiently with SDK 2.0. If I were to deal with the XML DOM directly, I would have a lot of challenges around properties and access. In addition, the integration with LINQ is great. Some of the methods that come with the IEnumerable interface are very handy.

What’s inconvenient about the SDK is that it’s far too granular. Every single XML element is a class, and I’m not sure about the original thinking behind designing the API in this way. It feels too repetitive, but it is still practically usable. There are still areas where the documentation is incomplete. It feels like the API is still very much designed around XML structures, which requires the programmer to hold a view of the XML, so it is not completely object-oriented. Maybe it needs to cater for LINQ? When coding, I need to think from an XML perspective, not so much from an object-oriented angle.