Removing macro from WordProcessingML document using Java - OpenXML Developer - Blog - OpenXML Developer
Removing macro from WordProcessingML document using Java


 Article by Vineela Kavoori, Sonata Software Limited

 

This article describes how to remove the macro-related parts from an existing macro-enabled Word document using core Java.

 

To remove a macro from a WordprocessingML document, follow these steps:

1.    Unzip the macro-enabled Word document.

2.    In [Content_Types].xml, change the content type of the main document part (document.xml) to “application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml”.

3.    Remove the VBA-related content type and relationship entries.

4.    Delete all the macro-related parts.

5.    Zip all the files and give the resulting file the “.docx” extension.

 

Step 1:  Unzip the existing macro-enabled Word document

To unzip the existing Word document:

1.           Get the document from the specified location.

2.           Unzip the document using the java.util.zip package.

The code snippet for this is as follows:

public static void unZipFile(String zipFileName, String toExtractFile) throws IOException
{
    ..……
    // sourceZipFile and destDirectory are presumably set up in the elided code above
    try (ZipFile zipFile = new ZipFile(sourceZipFile, ZipFile.OPEN_READ))
    {
        Enumeration<? extends ZipEntry> enumeration = zipFile.entries();
        while (enumeration.hasMoreElements())
        {
            ZipEntry zipEntry = enumeration.nextElement();
            String currName = zipEntry.getName();
            File destFile = new File(destDirectory, currName);
            // Create the folder that will hold this entry
            File destinationParent = destFile.getParentFile();
            destinationParent.mkdirs();
            if (!zipEntry.isDirectory())
            {
                // try-with-resources flushes and closes both streams for every entry
                try (BufferedInputStream is =
                             new BufferedInputStream(zipFile.getInputStream(zipEntry));
                     BufferedOutputStream dest =
                             new BufferedOutputStream(new FileOutputStream(destFile)))
                {
                    int currentByte;
                    while ((currentByte = is.read()) != -1)
                    {
                        dest.write(currentByte);
                    }
                }
            }
        }
    }
    ……
}
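
As a self-contained variant of the same step, here is a minimal sketch built on java.nio.file. The class and method names (UnzipSketch, unzip) are hypothetical and are not taken from the attached demo. The check that rejects entries resolving outside the target folder is an addition that the snippet above does not perform.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class UnzipSketch {

    /** Extracts every entry of the package at zipPath into destDir. */
    static void unzip(Path zipPath, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        try (ZipFile zipFile = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path target = root.resolve(entry.getName()).normalize();
                // Refuse entries such as "../x" that would land outside destDir.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes target folder: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                    continue;
                }
                Files.createDirectories(target.getParent());
                try (InputStream in = zipFile.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

Files.copy moves the data in blocks, which avoids the byte-at-a-time loop of the original snippet.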

Step 2:  Modify [Content_Types].xml

To modify [Content_Types].xml:

1.    Get the document element of [Content_Types].xml.

2.    Find the element whose “ContentType” attribute is set to “application/vnd.ms-word.document.macroEnabled.main+xml”.

3.    Change that “ContentType” attribute to “application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml”.

The code snippet for this is:

public String contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
public String contentTypeMacroEnabled = "application/vnd.ms-word.document.macroEnabled.main+xml";
………
for (int contentNodecount = 0; contentNodecount < overrideLst.getLength(); contentNodecount++)
{
    Node contentTypeNode = overrideLst.item(contentNodecount);
    NamedNodeMap map = contentTypeNode.getAttributes();
    Node docElement = map.getNamedItem("ContentType");
    Element thisElement = (Element) contentTypeNode;
    // Compare the attribute value itself; the output of Node.toString() is implementation-specific
    if (docElement.getNodeValue().contains(contentTypeMacroEnabled))
    {
        // Remember which part is the main document, then switch it to the macro-free type
        nameOfDocFile = thisElement.getAttribute("PartName");
        docElement.setNodeValue(contentType);
    }
…………
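
The fragment above can be tried in isolation with a sketch like the following. It assumes [Content_Types].xml has already been parsed into a DOM Document. The class, method and constant names are hypothetical, and writing the document back to disk is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class ContentTypeSketch {
    static final String MACRO_MAIN =
            "application/vnd.ms-word.document.macroEnabled.main+xml";
    static final String PLAIN_MAIN =
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    /** Switches the main part to the macro-free type; returns its PartName, or null if none matched. */
    static String switchMainContentType(Document contentTypes) {
        NodeList overrides = contentTypes.getElementsByTagName("Override");
        for (int i = 0; i < overrides.getLength(); i++) {
            Element override = (Element) overrides.item(i);
            if (MACRO_MAIN.equals(override.getAttribute("ContentType"))) {
                override.setAttribute("ContentType", PLAIN_MAIN);
                return override.getAttribute("PartName");
            }
        }
        return null;
    }

    /** Parses an XML string into a DOM document (convenience for experiments). */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Returning the PartName mirrors the nameOfDocFile variable in the snippet above, so that later steps can locate the main document's relationships file.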

Step 3:  Remove the VBA-related content type and relationship entries

1.    Remove the “Override” element in [Content_Types].xml that points to the macro-related part (i.e. vbaData.xml).

2.    Remove the “Relationship” element in “document.xml.rels” that points to the macro-related part (i.e. vbaProject.bin).

The code snippet for this is:

public String contentOfPartToBeRemoved = "application/vnd.ms-word.vbaData+xml";
public String contentOfBinFileInRels =
        "http://schemas.microsoft.com/office/2006/relationships/vbaProject";
…………..
for (int contentNodecount = 0; contentNodecount < overrideLst.getLength(); contentNodecount++)
{
    Node contentTypeNode = overrideLst.item(contentNodecount);
    NamedNodeMap map = contentTypeNode.getAttributes();
    Node docElement = map.getNamedItem("ContentType");
    …..
    // Compare the attribute value itself rather than Node.toString()
    if (docElement.getNodeValue().contains(contentOfPartToBeRemoved))
    {
        …………
        rootElement.removeChild(contentTypeNode);
        ………………
……..
for (int i = 0; i < childElements.getLength(); i++) {
    // Only element nodes carry the Target and Type attributes
    if (childElements.item(i).getNodeType() == Node.ELEMENT_NODE)
    { child = (Element) childElements.item(i); }
    if (child != null)
    {
        if (child.hasAttribute("Target")) {
            String binFileName = child.getAttribute("Target");
            String attribute = child.getAttribute("Type");
            if (attribute.equals(contentOfBinFileInRels)) {
                ………
                rootElement.removeChild(child);
                …………
………
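
Both removals follow the same pattern, so they can be sketched with one hypothetical helper (removeMatching; the name and signature are assumptions, not part of the attached demo). The list returned by getElementsByTagName is live, so this sketch walks it backwards to keep the indexes valid while elements are removed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class RemoveVbaEntriesSketch {
    static final String VBA_DATA_TYPE = "application/vnd.ms-word.vbaData+xml";
    static final String VBA_PROJECT_REL =
            "http://schemas.microsoft.com/office/2006/relationships/vbaProject";

    /** Removes every <tag> element whose attribute equals value; returns how many were removed. */
    static int removeMatching(Document doc, String tag, String attribute, String value) {
        NodeList nodes = doc.getElementsByTagName(tag);
        int removed = 0;
        // Walk backwards: the NodeList is live, so a removal shifts the later indexes.
        for (int i = nodes.getLength() - 1; i >= 0; i--) {
            Element element = (Element) nodes.item(i);
            if (value.equals(element.getAttribute(attribute))) {
                element.getParentNode().removeChild(element);
                removed++;
            }
        }
        return removed;
    }

    /** Parses an XML string into a DOM document (convenience for experiments). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

For [Content_Types].xml the call would be removeMatching(doc, "Override", "ContentType", VBA_DATA_TYPE), and for document.xml.rels it would be removeMatching(doc, "Relationship", "Type", VBA_PROJECT_REL).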


Step 4:  Delete the macro-related parts (i.e. vbaData.xml and vbaProject.bin)

To delete these files:

1.    Get the paths of the files from “[Content_Types].xml” and “document.xml.rels”.

2.    Delete the files at the paths obtained in the previous step.

The code snippet for the above steps is:

public String contentOfPartToBeRemoved = "application/vnd.ms-word.vbaData+xml";
public String contentOfBinFileInRels =
        "http://schemas.microsoft.com/office/2006/relationships/vbaProject";
public static String extractToFolder = "E:\\ooxml\\DisableMacro\\MacroDisabled";
………
for (int contentNodecount = 0; contentNodecount < overrideLst.getLength(); contentNodecount++)
{
    Node contentTypeNode = overrideLst.item(contentNodecount);
    NamedNodeMap map = contentTypeNode.getAttributes();
    Node docElement = map.getNamedItem("ContentType");
    ……….
    // Compare the attribute value itself rather than Node.toString()
    if (docElement.getNodeValue().contains(contentOfPartToBeRemoved)) {
        // PartName starts with "/", so it can be appended to the extraction folder directly
        relativePathOfPartToBeRemoved = thisElement.getAttribute("PartName");
        ……..
        File fileToBeDeleted = new File(extractToFolder + relativePathOfPartToBeRemoved);
        fileToBeDeleted.delete();
…………..
if (child.hasAttribute("Target")) {
    String binFileName = child.getAttribute("Target");
    String attribute = child.getAttribute("Type");
    if (attribute.equals(contentOfBinFileInRels)) {
        String parentFolderOfBinFile = helpObj.getImmediateParentFolderForFile(binFileName);
        String binFilePath = "";
        if (parentFolderOfBinFile.equals(""))
        {
            // The target sits directly in the extraction folder
            binFilePath = extractToFolder + "\\" + binFileName;
        }
        else
        {
            // Build the path from the location of document.xml.rels
            String locationOfBinFile =
                    locationOfDocumentRels.substring(0,
                            locationOfDocumentRels.indexOf(parentFolderOfBinFile));
            binFilePath = locationOfBinFile + parentFolderOfBinFile + "\\" + binFileName;
        }
        File binFile = new File(binFilePath);
        binFile.delete();
        break;
    }
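
The path handling can also be sketched with java.nio.file, which avoids hard-coded "\\" separators. This is an alternative to the string arithmetic above, not the demo's own method. It relies on the Open Packaging Conventions rule that a relative Target is resolved against the folder of the part that owns the relationships file (word/ for word/_rels/document.xml.rels), while a Target starting with "/" is resolved against the package root. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class DeletePartSketch {

    /** Maps a relationship Target (or an absolute PartName) to a file in the extracted folder. */
    static Path resolveTarget(Path packageRoot, Path relsFile, String target) {
        if (target.startsWith("/")) {
            // Absolute names such as "/word/vbaData.xml" are relative to the package root.
            return packageRoot.resolve(target.substring(1)).normalize();
        }
        // word/_rels/document.xml.rels describes parts relative to word/.
        Path sourcePartDir = relsFile.getParent().getParent();
        return sourcePartDir.resolve(target).normalize();
    }

    /** Deletes the part if it exists; returns true when a file was actually removed. */
    static boolean deletePart(Path packageRoot, Path relsFile, String target) throws IOException {
        return Files.deleteIfExists(resolveTarget(packageRoot, relsFile, target));
    }
}
```

Unlike File.delete(), Files.deleteIfExists throws an IOException when the deletion itself fails, so a locked file does not go unnoticed.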

 

Step 5:  Zip all the files and give the resulting file the “.docx” extension

To zip the files back into a document:

1.    Get the list of full paths of all the files to be zipped.

2.    Get the list of the corresponding paths relative to the extraction folder.

3.    Zip the files into a Word document using the java.util.zip package.

The code snippet for this is:

…….
FileOutputStream outStream = new FileOutputStream(zipFileName);
ZipOutputStream zipOutStream = new ZipOutputStream(outStream);
zipOutStream.setLevel(Deflater.BEST_COMPRESSION);
……
while (itr.hasNext())
{
    String path = (String) itr.next();
    inStream = new FileInputStream(path);
    // The entry name is the file's path relative to the package root
    String relPath = (String) relItr.next();
    zipOutStream.putNextEntry(new ZipEntry(relPath));
    int i = 0;
    while ((i = inStream.read()) != -1)
    {
        zipOutStream.write(i);
    }
    // Finish this entry and release the input file before moving on
    zipOutStream.closeEntry();
    inStream.close();
}
….
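
A compact, self-contained variant of this step might look like the sketch below. It walks the extraction folder instead of using two parallel lists of paths, which is a different design from the snippet above. The class and method names are hypothetical. The sketch assumes the output file lies outside the folder being zipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class ZipSketch {

    /** Zips every regular file under sourceDir into docxFile (assumed to be outside sourceDir). */
    static void zipFolder(Path sourceDir, Path docxFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(docxFile))) {
            zos.setLevel(Deflater.BEST_COMPRESSION);
            for (Path file : files) {
                // ZIP entry names use '/', even when the platform separator is '\'.
                String entryName = sourceDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(entryName));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

Closing the ZipOutputStream through try-with-resources writes the central directory, which readers rely on to locate the entries in the archive.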

 

This simple demo shows how to use Java to remove macros from a Word document. The demo is attached to this article as a zip file.

 

PS:  Each run creates an output file (MacroDisabled.docx) and a folder (MacroDisabled) at the specified output location. Both must be deleted before the code is run on the next input document.
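
This cleanup could be automated with a small helper along these lines. It is a sketch only: the class and method names are hypothetical, and the attached demo does not include such a helper.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; names are illustrative, not from the article's attachment.
final class CleanOutputSketch {

    /** Deletes a file, or a folder with everything below it; does nothing if the path is absent. */
    static void deleteRecursively(Path path) throws IOException {
        if (!Files.exists(path)) {
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> walk = Files.walk(path)) {
            // Reverse order places children before their parent folders.
            deepestFirst = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }
}
```

Calling deleteRecursively on both the MacroDisabled folder and MacroDisabled.docx before each run would leave the output location clean for the next document.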

Attachment: DisableMacro.zip