Use XSLT to transform XML to Open XML - OpenXML Developer - Blog - OpenXML Developer


by Bryce Telford

SO WHAT IS IT?

I set out to create an Open XML document from an XML file using XSLT and the Open XML SDK 2.0. While it is possible to build a complete Open XML document from XSLT and XML alone, a few shortcuts make the process a lot easier and faster.

Included in the source are some sample documents. In this case I’m using:

1.     hl7.xml – The Data – an XML-based HL7 medical file used as the data source

2.     template.docx – The Template – An existing document to model the output on

3.     transform.xslt – The Transformation – The XSLT specifies how the data will fit into the template.  

Although I’ve used an XML file on the file system for my data, any well-formed data source that XSLT can interact with could be used with only minor modifications to the application.

HOW DO YOU USE IT?

To run the application using the provided sample data and template, run the following from the source directory:

bin\Debug\OOXMLSpike.exe ..\..\SampleFiles\hl7.xml ..\..\SampleFiles\template.docx ..\..\SampleFiles\transform.xslt ..\..\SampleFiles\output.docx

This will produce an output.docx file in the SampleFiles directory.

HOW'S IT DONE?

Here we’ll step through a simple example from start to finish illustrating the different skills and techniques required.

The process I followed to create my XSLT, and then use it to create the Open XML document, involved the following steps:

1.     Design the output document (template)

2.     Remove document.xml from the template

3.     Create an XSLT using the document.xml as the base.

4.     Run the console application with the following parameters: <datasource>.xml <template>.docx <transform>.xslt <outputFilename>.docx

 

Step 1: Design the Output Document (template)

You can create the document using your chosen Open XML document program. In this case I used Microsoft Word and created various areas to be populated; an example is the title section of the document. Once all the required formatting, static content, layout and other preferences have been set on this template, save and close the document and proceed to the next step.

 

Step 2: Remove the document.xml from the Open XML Document

Locate the file created above, copy it, and change the copy's extension to .zip.

Open the zip file, open the word folder, and copy document.xml to your working directory.
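
The same extraction can be scripted instead of done by hand. The following is only a sketch in Python using the standard zipfile module; it is not part of the sample application, and the function name extract_document_xml is made up for illustration. It relies on the fact that a .docx file is already a ZIP archive, so no rename is needed.

```python
import zipfile
from pathlib import Path


def extract_document_xml(docx_path, out_dir):
    """Copy word/document.xml out of a .docx package into out_dir.

    Hypothetical helper: the name and signature are assumptions made
    for this illustration, not something the sample application provides.
    """
    out_path = Path(out_dir) / "document.xml"
    # A .docx file is a ZIP archive, so it can be opened directly.
    with zipfile.ZipFile(docx_path) as package:
        # The main document part lives at word/document.xml.
        out_path.write_bytes(package.read("word/document.xml"))
    return out_path
```

Calling extract_document_xml("template.docx", ".") would leave a copy of document.xml in the current directory, ready for Step 3.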

Step 3: Create an XSLT using the document.xml as the base

Rename document.xml to transform.xslt (or any name of your choosing) and open it in your preferred XSLT editor; I used Microsoft Visual Studio 2008. Convert the XML file into an XSLT by following these steps:

1.     Remove the <?xml version="1.0" encoding="UTF-8" standalone="yes"?> declaration

2.     Surround the entire text (<w:document>) with

<xsl:stylesheet
     version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns:n2="urn:hl7-org:v3"
     exclude-result-prefixes="n2 xs xsi xsl"
     >
     <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
     <xsl:template match="/">
----EXISTING DOCUMENT---
     </xsl:template>
</xsl:stylesheet>

3.     Now replace the sections you want populated from the data with the appropriate XSLT markup. I’ll use the title example here:

Before:

<w:p w:rsidR="00720332" w:rsidRPr="006078F7" w:rsidRDefault="00B308B8" w:rsidP="004A3019">
      <w:pPr>
            <w:jc w:val="center"/>
            <w:rPr>
                  <w:color w:val="FF0000"/>
                  <w:sz w:val="36"/>
                  <w:szCs w:val="36"/>
            </w:rPr>
      </w:pPr>
      <w:r>
            <w:rPr>
                  <w:b/>
                  <w:color w:val="FF0000"/>
                  <w:sz w:val="36"/>
                  <w:szCs w:val="36"/>
            </w:rPr>
            <w:t>Title</w:t>
      </w:r>
</w:p>

After:

<w:p w:rsidR="00720332" w:rsidRPr="006078F7" w:rsidRDefault="00971BFB" w:rsidP="004A3019">
      <w:pPr>
            <w:jc w:val="center"/>
            <w:rPr>
                  <w:color w:val="FF0000"/>
                  <w:sz w:val="36"/>
                  <w:szCs w:val="36"/>
            </w:rPr>
      </w:pPr>
      <w:r w:rsidRPr="006078F7">
            <w:rPr>
                  <w:b/>
                  <w:color w:val="FF0000"/>
                  <w:sz w:val="36"/>
                  <w:szCs w:val="36"/>
            </w:rPr>
            <w:t>
                  <xsl:value-of select="string(n2:ClinicalDocument/n2:title)"/>
            </w:t>
      </w:r>
</w:p>
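
Steps 1 and 2 of this conversion are mechanical, so they can be scripted. The sketch below is a Python illustration only: the function name wrap_as_stylesheet and the two constants are hypothetical, and the wrapper text is copied from step 2 above. It assumes document.xml has already been read into a string (reading with encoding="utf-8-sig" avoids a stray byte order mark ahead of the declaration).

```python
import re

# Wrapper text taken from step 2; the constant names are assumptions.
XSLT_HEADER = '''<xsl:stylesheet
     version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns:n2="urn:hl7-org:v3"
     exclude-result-prefixes="n2 xs xsi xsl"
     >
     <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
     <xsl:template match="/">
'''

XSLT_FOOTER = '''
     </xsl:template>
</xsl:stylesheet>
'''


def wrap_as_stylesheet(document_xml):
    """Turn the text of document.xml into a skeleton XSLT stylesheet."""
    # Step 1: drop the leading <?xml ...?> declaration, if there is one.
    body = re.sub(r'^\s*<\?xml[^>]*\?>\s*', '', document_xml, count=1)
    # Step 2: surround the remaining <w:document> element with the wrapper.
    return XSLT_HEADER + body + XSLT_FOOTER
```

The result is a stylesheet that reproduces the template unchanged; step 3, replacing static text with XSLT instructions, still has to be done by hand.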

 

Step 4: Run the Console Application

You can now run the console application with the new XSLT. This will produce the output file specified.

 

CONSOLE APPLICATION

This is a fairly simple C# application that runs the XSLT against the data source, creates a new file from the supplied template, and then calls the Open XML SDK to replace the body of the document with the processed XML. The result is then written to the file system.

This could easily be repackaged into a larger application and used to process XML generated from a custom data source.
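
To make the packaging step concrete, here is a rough sketch of the same idea in Python. It is not the sample application's code: the real application is written in C# and uses the Open XML SDK 2.0, whereas this stand-in uses only the standard zipfile module and swaps the whole word/document.xml part instead of just the body. The function name build_output_docx is hypothetical, and the XSLT transformation itself is assumed to have been run already, since Python's standard library has no XSLT processor.

```python
import zipfile


def build_output_docx(template_path, transformed_xml, output_path):
    """Copy the template package, swapping in the transformed main part.

    transformed_xml is the XSLT output as str or bytes. The name and
    signature of this helper are assumptions made for this illustration.
    """
    with zipfile.ZipFile(template_path) as src:
        with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename == "word/document.xml":
                    # Replace the main document part with the XSLT result.
                    dst.writestr(item.filename, transformed_xml)
                else:
                    # Styles, numbering, media and so on are copied unchanged.
                    dst.writestr(item, src.read(item.filename))
```

Because every other part is copied as-is, the styles, numbering definitions and relationships designed in the template carry over to the output document, which is the shortcut this approach relies on.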

DIFFERENT OPERATIONS

In the sample application I’ve used various XSLT instructions to produce the desired display of data. These include:

·         xsl:value-of (Singular)

·         xsl:for-each (Looping)

·         xsl:if (Conditional)

By combining these with the markup Word has already generated, it is easy to dynamically populate various Word constructs, including but not limited to:

·         Word Art

·         Text Boxes

·         Headings

·         Tables

·         Styles

Word Art Injection

Only two parts differ from the original Word markup (they were highlighted in the original post): the xsl:variable element and the string attribute of the second v:textpath element. Here we retrieve the organisation name from the data, assign it to a variable, and then use that variable to populate the string attribute of the WordArt, which stores the text the WordArt displays.

<w:pict>
      <v:shapetype id="_x0000_t144" coordsize="21600,21600" o:spt="144" adj="11796480" path="al10800,10800,10800,10800@2@14e">
            <v:formulas>
                  <v:f eqn="val #1"/>
                  <v:f eqn="val #0"/>
                  <v:f eqn="sum 0 0 #0"/>
                  <v:f eqn="sumangle #0 0 180"/>
                  <v:f eqn="sumangle #0 0 90"/>
                  <v:f eqn="prod @4 2 1"/>
                  <v:f eqn="sumangle #0 90 0"/>
                  <v:f eqn="prod @6 2 1"/>
                  <v:f eqn="abs #0"/>
                  <v:f eqn="sumangle @8 0 90"/>
                  <v:f eqn="if @9 @7 @5"/>
                  <v:f eqn="sumangle @10 0 360"/>
                  <v:f eqn="if @10 @11 @10"/>
                  <v:f eqn="sumangle @12 0 360"/>
                  <v:f eqn="if @12 @13 @12"/>
                  <v:f eqn="sum 0 0 @14"/>
                  <v:f eqn="val 10800"/>
                  <v:f eqn="cos 10800 #0"/>
                  <v:f eqn="sin 10800 #0"/>
                  <v:f eqn="sum @17 10800 0"/>
                  <v:f eqn="sum @18 10800 0"/>
                  <v:f eqn="sum 10800 0 @17"/>
                  <v:f eqn="if @9 0 21600"/>
                  <v:f eqn="sum 10800 0 @18"/>
            </v:formulas>
            <v:path textpathok="t" o:connecttype="custom" o:connectlocs="10800,@22;@19,@20;@21,@20"/>
            <v:textpath on="t" style="v-text-kern:t" fitpath="t"/>
            <v:handles>
                  <v:h position="@16,#0" polar="10800,10800"/>
            </v:handles>
            <o:lock v:ext="edit" text="t" shapetype="t"/>
      </v:shapetype>
      <v:shape id="_x0000_s1026" type="#_x0000_t144" style="position:absolute;margin-left:0;margin-top:.85pt;width:440.25pt;height:65.25pt;z-index:251660288;mso-position-horizontal:center" o:borderbottomcolor="this" fillcolor="black">
            <v:shadow color="#868686"/>
            <xsl:variable name="organisationName" select="string(n2:ClinicalDocument/n2:custodian/n2:assignedCustodian/n2:representedCustodianOrganization/n2:name)"/>
            <v:textpath style="font-family:&quot;Arial Black&quot;" fitshape="t" trim="t" string="{$organisationName}"/>
      </v:shape>
</w:pict>

Repeating Data

Below is a section of repeating data used in one of the text boxes. It iterates through each ‘item’ in the list node (xsl:for-each). For each ‘item’ it will print out the static formatting and also the value of that item in the text area.

<xsl:for-each select="n2:ClinicalDocument/n2:component/n2:StructuredBody/n2:component[4]/n2:section/n2:text/n2:list/n2:item">
      <w:p w:rsidR="00E05032" w:rsidRPr="00B741A8" w:rsidRDefault="00E05032" w:rsidP="00E05032">
            <w:pPr>
                  <w:pStyle w:val="ListParagraph"/>
                  <w:numPr>
                        <w:ilvl w:val="0"/>
                        <w:numId w:val="5"/>
                  </w:numPr>
            </w:pPr>
            <w:r>
                  <w:rPr>
                        <w:noProof/>
                  </w:rPr>
                  <w:t>
                        <xsl:value-of select="string(.)"/>
                  </w:t>
            </w:r>
      </w:p>
</xsl:for-each>

 

Attachment: OOXMLXSLTTransform.zip
  • I designed fleXdoc in a similar way: it also supports several xslt-constructions (value-of, for-each, if), but these instructions can be inserted in the template from within Word itself.

    You can check it out here: http://flexdoc.codeplex.com
  • I have this working and it does exactly what I want, except one frustrating issue.  I have the following in my XML file, however, when it comes through the style sheet all the new-line/carriage returns are gone.  I've tried placing ASCII code in the XML, using space preserve in the xslt, that renders the tabs, but not the carriage returns.  I've even placed the <w:t> and other code in the xml without any solution.   I've searched for solutions but have yet to find one.

    <OUTLINE>
    Topical Outline:  
       I. Reaction to Stimuli 7
    A. Visual
    B. Auditory
    C. Tactile

      II. Attending Skills 7
    A. Discussion
    B. Conversations
    C. Listening

     III. Visual Recognition 8
    A. Letters
    B. Symbols

      IV. Following Directions 8
    A. Oral
    B. Symbols
    C. Written
    </OUTLINE>
  • Jeff,

    The data from the XML file is put within a Text element (<w:t>). Newlines are supported, BUT only as Break elements (<w:br/>) placed between Text elements inside the run (<w:r>), not as a plain \n. So try to emit a <w:br/> wherever the data has a line break.

    By the way, fleXdoc translates \n automatically to break-elements. You may want to check it out: http://flexdoc.codeplex.com.

    Gr,
    Robert te Kaat
  • Hi,

    Recently I was given the task of upgrading an application that exports data to Excel, from Excel 2003 to Excel 2010. Can anyone here help me convert an XSLT for 2003 to Office Open XML? Or am I on the right track?

    Below is the sample XML content generated. If saved as .xls, it can be opened in Excel 2010. However, if I just rename the extension to xlsx, it can’t be opened. Please help. My task is to export the data to xlsx format. Thanks!



    <?xml version="1.0"?>
    <?mso-application progid="Excel.Sheet"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:x="urn:schemas-microsoft-com:office:excel"
    xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
    xmlns:html="http://www.w3.org/TR/REC-html40">
    <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
    </ExcelWorkbook>
    <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
    <Alignment ss:Vertical="Bottom"/>
    <Borders/>
    <Font/>
    <Interior/>
    <NumberFormat/>
    <Protection/>
    </Style>
    <Style ss:ID="header">
    <Font x:Family="Swiss" ss:Bold="1"/>
    </Style>
    </Styles>
  • Hi,

    I'm not really sure what you are attempting to do.  What exactly do you mean by 'an XSLT for 2003'?  The file that you show is not an XSLT style sheet.  An XLSX file is a very specific kind of file - must be a ZIP file that conforms to OPC, and contain XML with a very specific dialect.

    -Eric
    It is a great example and it was working fine for me until I had to insert dynamic images in a few rows. Is there any way we can do it? I tried many ways and finally I am giving up. Please help me.

    SAMP
