Following on from Bryce Telford’s article about using XSLT to modify an existing Open XML document in C#, I decided to create a Ruby on Rails flavour of his solution.
I have provided the source code and files, and I used the same xml, docx and xslt files as in Bryce’s article. The source files include the project folder so that the project can be opened in NetBeans. This folder has no effect on the running of the application, so it can be deleted if necessary.
Here is a brief rundown of what these files are:
1. hl7.xml – This includes the data used for populating our docx
2. template.docx – This is the docx template to be filled with our data from the xml
3. transform.xslt – This file is used to translate the xml file to the correct Office Open XML format.
To run this app you will need Ruby 1.8.6 and the zipruby, libxml, libxslt, rails and Nokogiri gems. You will also need a working Ruby on Rails install, into which you extract the required files from the source code provided.
Using the upload web page, pass in the xml data, docx template and xslt transformation file, then click translate. Once the process has run, the new docx will be downloaded.
The steps needed to use this app are as follows:
1. Design output document (template)
2. Create an XSLT using the document.xml from the template as a starting point.
3. Input files into the web page and click upload.
Steps 1 and 2 are covered in detail in Bryce Telford’s article, so I will only cover the detail of step 3.
Workings of the Ruby on Rails Application
When the upload button is clicked, the three files are uploaded to the server and saved in the public\data directory. Once this is complete, a Ruby class called OfficeOpenXML is called to translate the xml data. It is passed the locations of the three files and the desired location of the new document.
# Save files to temp area and perform translation
def self.save(upload, upload1, upload2)
  name  = sanitize_filename(upload['file'].original_filename).to_s
  name1 = sanitize_filename(upload1['file1'].original_filename).to_s
  name2 = sanitize_filename(upload2['file2'].original_filename).to_s
  directory = "public\\data\\"
  # create the file paths
  path  = File.join(directory, name).to_s
  path1 = File.join(directory, name1).to_s
  path2 = File.join(directory, name2).to_s
  # write the files
  upload_file(path, upload, 'file')
  upload_file(path1, upload1, 'file1')
  upload_file(path2, upload2, 'file2')
  # translate the data and write the result to the new document
  OfficeOpenXML.translate(path,
                          path1,
                          path2,
                          "public\\resources\\newdoc.docx")
end
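The body of sanitize_filename is not shown above. As a rough sketch only, here is one way such a helper could look. The implementation and the upload_path wrapper are hypothetical stand-ins of mine, not the code from the source files. The sketch also uses forward slashes in the directory name, which Ruby accepts on Windows as well.

```ruby
# Hypothetical stand-in for the sanitize_filename helper called in save;
# the article's own helper is not shown, so this is an assumption.
def sanitize_filename(filename)
  # Some browsers send the full client-side path; keep only the last segment.
  base = filename.to_s.split(/[\\\/]/).last.to_s
  # Replace anything other than letters, digits, underscore, dot and dash,
  # then drop leading dots so the name cannot be "." or "..".
  cleaned = base.gsub(/[^\w.\-]/, "_").sub(/\A\.+/, "")
  cleaned.empty? ? "upload" : cleaned
end

# Hypothetical wrapper: build the path a file would be written to.
def upload_path(directory, original_filename)
  File.join(directory, sanitize_filename(original_filename))
end

puts upload_path("public/data", "C:\\Users\\me\\hl7 data.xml")
```

The important point is that the file name comes from the client, so it should never be joined onto a server path without being cleaned first.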
The translate function retrieves the existing xml from the template as a Nokogiri XML document, using the zipruby library to extract document.xml from the docx archive.
existing_xml = get_from_tempate("word/document.xml")
def get_from_tempate(filename)
  # retrieve the document from the template doc
  xml = Zip::Archive.open(@template) do |zipfile|
    zipfile.fopen(filename).read
  end
  # parse the resulting file into the Nokogiri xml doc
  Nokogiri::XML.parse(xml)
end
Using XPath, the body node of the existing document is found. Since there is only one body element in the document, the first item in the returned collection can be used.
body_node = existing_xml.root.xpath("w:body", {"w" => "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}).first
All nodes attached to the existing body node are removed, effectively clearing all current xml data.
body_node.children.unlink
The xml data is transformed using the given XSLT. The body node of the result is then selected using XPath, and its children are looped through and added to the existing xml body.
def new_xml
  # transform the xml values to fit our Word document
  stylesheet_doc.transform(Nokogiri::XML.parse(File.open(@xml)))
end
def stylesheet_doc
  # parse the xslt into the Nokogiri XSLT
  Nokogiri::XSLT.parse(File.open(@xslt))
end
new_xml.xpath("*/w:body", {"w" => "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}).first.children.each do |child|
  body_node.add_child(child)
end
Once the xml has been updated, the template is copied to the new document's location and zipruby is used to replace the document.xml inside that copy with the updated xml.
compress(existing_xml)
def compress(newXML)
  # copy the template to the new document
  FileUtils.copy(@template, @newdoc)
  # open the zip archive
  Zip::Archive.open(@newdoc, Zip::CREATE) do |zipfile|
    # replace the document.xml with our new xml
    zipfile.add_or_replace_buffer('word/document.xml', newXML.to_s)
  end
end
Once the file has been created, it is sent back to the browser with the docx content type. The content type needs to be specified so that Internet Explorer knows what type of file is coming back; otherwise the browser will assume the file is a .zip.
send_file("#{RAILS_ROOT}/public/resources/newdoc.docx", :filename=> "newdoc.docx", :type=>"application/vnd.openxmlformats-officedocument.wordprocessingml.document")
In summary, the Ruby on Rails application takes an XML file, translates it to fit the document schema and replaces the body of an existing document with the result of the translation. The new document is then served by the web server as a docx with the correct MIME type.
I would like to acknowledge Kurt Preston for his assistance with the rails component of the application.
I was hoping to see this updated for the latest and greatest Ruby and Rails versions. Also, I think nokogiri provides XSLT support now. I need this for a client's project and am willing to hire someone to do the work. Contact me at bal711 at gmail to work out details. Thanks!