WordProcessingML Snippets: Part 2, basic paragraphs - OpenXML Developer - Blog - OpenXML Developer

This article is part of the "Building documents with code snippets" series.

The first two snippets I'll discuss that actually modify your documents are AddParagraph and RemoveParagraph. These snippets can be used to work with paragraphs in various places in the document. You can, for instance, add paragraphs to the main document body, headers, footers and content controls. Neither snippet is very complex; working with tables or images later in the series will require a bit more code than the samples in this article.

Let's start with a note about xml namespaces. As I mentioned before, various namespaces are used across a WordProcessingML document. Some are specific to WordProcessingML; others, like the DrawingML namespaces, also appear in the other Open XML formats.

When you examine the xml structure of a paragraph inside WordProcessingML, like the sample further below, you will notice that the 'w' prefix is used for the elements. By default, the 'w' prefix is bound to the WordProcessingML main document namespace.

While it is possible to use another prefix, the 'CreateNamespaceManager' snippet, which creates the XmlNamespaceManager needed for working with WordProcessingML, assumes this namespace is tied to the 'w' prefix. This means the new snippets also need to use this prefix when creating new WordProcessingML nodes. A more advanced implementation would first try to detect which prefix is used for each namespace, but for the sake of brevity and easy-to-read code, I decided not to go that way.
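
The prefix point can be illustrated with a small sketch. It assumes the main document namespace URI from the final ECMA-376 standard (the draft this article was written against may have used a different URI), and it builds the namespace manager inline instead of calling the 'CreateNamespaceManager' snippet. The `PrefixDemo` class and `CountParagraphs` method are made-up names for illustration only.

```csharp
using System;
using System.Xml;

static class PrefixDemo
{
    // Assumption: main document namespace URI from the final ECMA-376 standard;
    // the draft this article refers to may have used a different URI.
    public const string WordNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts the paragraphs directly under the body, always querying with 'w'.
    public static int CountParagraphs(string documentXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(documentXml);

        // The manager maps 'w' to the namespace URI for XPath queries only.
        XmlNamespaceManager namespaces = new XmlNamespaceManager(doc.NameTable);
        namespaces.AddNamespace("w", WordNamespace);

        return doc.SelectNodes("/w:document/w:body/w:p", namespaces).Count;
    }

    static void Main()
    {
        // This document binds the namespace to 'x' instead of 'w'.
        string xml =
            "<x:document xmlns:x='" + WordNamespace + "'>" +
            "<x:body><x:p/><x:p/></x:body></x:document>";
        Console.WriteLine(CountParagraphs(xml)); // prints 2
    }
}
```

XPath queries match on the namespace URI rather than on the prefix text, so reading works whatever prefix the document uses. The prefix assumption only comes into play when the snippets create new elements.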

Now the snippets. I will start with the 'AddParagraph' snippet. A really basic paragraph containing a piece of text takes the following form in Open XML:

      <w:p>
            <w:r>
                  <w:t>My paragraph's text</w:t>
            </w:r>
      </w:p>

A real world paragraph will contain more information than this, obviously. You will usually find things such as style settings for the paragraph. I can expand on those features in later samples. For now I'll keep it simple to show how this and the other snippets work code-wise.

When you use the 'AddParagraph' snippet, a method with the following signature is added to your code file:

      static void AddParagraph(
            XmlNode parentNode, 
            XmlNamespaceManager namespaces, 
            string optionalText)

The first parameter defines the area in the document where you want to insert a paragraph. A paragraph can be inserted at various locations inside the document, so I named the first parameter 'parentNode' instead of 'bodyNode' to emphasize this. The second parameter contains the now familiar namespace manager, and the third allows you to optionally add text to the new paragraph.

The implementation uses the classes in System.Xml to create the correct xml structure. New xml nodes are created through the XmlDocument class, which is available via the OwnerDocument property on each XmlNode, so it doesn't need to be passed in as a parameter to each method.

You do have to be careful about the order in which you add nodes in code. The Open XML schemas define many nodes as a sequence of child nodes, which means the order in which child elements appear matters a lot. Get the order wrong, and the resulting document will not validate against the schema.

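      string wordNamespace = namespaces.LookupNamespace("w");
      XmlNode paragraphNode = parentNode.OwnerDocument.CreateElement(
            "w", "p", wordNamespace);
      if (String.IsNullOrEmpty(optionalText) == false)
      {
            // Build the w:r/w:t children only when the caller supplied text.
            XmlNode rangeNode = parentNode.OwnerDocument.CreateElement(
                  "w", "r", wordNamespace);
            XmlNode textNode = parentNode.OwnerDocument.CreateElement(
                  "w", "t", wordNamespace);
            textNode.InnerText = optionalText;
            // Assumed completion: the lines attaching the nodes were lost.
            rangeNode.AppendChild(textNode);
            paragraphNode.AppendChild(rangeNode);
      }
      parentNode.AppendChild(paragraphNode);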

When reading the draft of the Open XML file formats, you will find that the 'p' element has no required child nodes. Thus the 'r' node, which defines a run of content, is only added when the optional text is actually specified by the caller.
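
The earlier remark about child order can be sketched with paragraph properties: inside 'w:p', the 'w:pPr' element has to come before any 'w:r' element. The helper below is hypothetical and not part of the snippet series. It assumes the namespace URI from the final ECMA-376 standard and a paragraph that has no 'w:pPr' child yet.

```csharp
using System;
using System.Xml;

static class OrderDemo
{
    // Assumption: main document namespace URI from the final ECMA-376 standard.
    public const string WordNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: centers a paragraph by adding w:pPr/w:jc.
    // Assumes the paragraph does not contain a w:pPr element yet.
    public static void CenterParagraph(XmlNode paragraphNode)
    {
        XmlDocument doc = paragraphNode.OwnerDocument;
        XmlElement properties = doc.CreateElement("w", "pPr", WordNamespace);
        XmlElement justification = doc.CreateElement("w", "jc", WordNamespace);
        XmlAttribute value = doc.CreateAttribute("w", "val", WordNamespace);
        value.Value = "center";
        justification.Attributes.Append(value);
        properties.AppendChild(justification);

        // w:pPr must precede the runs, so prepend instead of append.
        paragraphNode.PrependChild(properties);
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<w:p xmlns:w='" + WordNamespace + "'>" +
            "<w:r><w:t>Hello</w:t></w:r></w:p>");
        CenterParagraph(doc.DocumentElement);
        Console.WriteLine(doc.DocumentElement.FirstChild.Name); // prints w:pPr
    }
}
```

Using AppendChild here would place 'w:pPr' after the run, which is the kind of ordering mistake the schema does not allow.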

The second snippet creates another method, which you can use to remove paragraphs from the document. You specify the XPath predicate used to select the paragraphs to remove from a given parent node. The XPath predicate is the part between the brackets [].

If you want to remove a paragraph based on its index, just specify a number; XPath positions start at 1. It is also possible to remove all paragraphs containing a specific text by using the following predicate:

w:r/w:t[contains(text(), "someText")]
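
To get a feel for what such predicates select, here is a minimal sketch that only counts the matching paragraphs and removes nothing. The sample xml, the `PredicateDemo` class and the `CountMatches` method are invented for illustration, and the namespace URI is assumed to be the one from the final ECMA-376 standard.

```csharp
using System;
using System.Xml;

static class PredicateDemo
{
    // Assumption: main document namespace URI from the final ECMA-376 standard.
    public const string WordNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns how many w:p children of a sample body match the predicate.
    public static int CountMatches(string predicate)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<w:body xmlns:w='" + WordNamespace + "'>" +
            "<w:p><w:r><w:t>first</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>someText here</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>more someText</w:t></w:r></w:p>" +
            "</w:body>");
        XmlNamespaceManager namespaces = new XmlNamespaceManager(doc.NameTable);
        namespaces.AddNamespace("w", WordNamespace);

        // Same pattern as the snippet: the predicate goes between the brackets.
        string xpath = String.Format("w:p[{0}]", predicate);
        return doc.DocumentElement.SelectNodes(xpath, namespaces).Count;
    }

    static void Main()
    {
        // A number selects by position (1-based).
        Console.WriteLine(CountMatches("2")); // prints 1
        // A path expression selects every paragraph for which it finds a node.
        Console.WriteLine(
            CountMatches("w:r/w:t[contains(text(), 'someText')]")); // prints 2
    }
}
```

XPath accepts both single and double quotes around string literals. Single quotes are convenient here because they need no escaping inside a C# string.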


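      static void RemoveParagraph(XmlNode parentNode,
            XmlNamespaceManager namespaces,
            string selectionPredicate)
      {
            string xpath = String.Format("w:p[{0}]", selectionPredicate);
            foreach (XmlNode paragraphNode in parentNode.SelectNodes(xpath,
                  namespaces))
            {
                  // Assumed completion: the loop body was lost in the source.
                  parentNode.RemoveChild(paragraphNode);
            }
      }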

To use these snippets, add code in the same manner as described in the startup article on WordProcessingML. The following code adds a new paragraph to the main document body and then removes it again:

      XmlNamespaceManager namespaces = CreateNamespaceManager(doc);
      XmlNode body = GetToBodyNode(doc, namespaces);
      AddParagraph(body, namespaces, "My new paragraph");
      RemoveParagraph(body, namespaces, 
            "w:r/w:t[contains(text(), 'My new')]");

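For readers who want to try the round trip without a real package at hand, the following sketch runs the same calls against a small in-memory document. The 'CreateNamespaceManager' and 'GetToBodyNode' methods here are hypothetical stand-ins for the helpers from the earlier article, the namespace URI is assumed to be the one from the final ECMA-376 standard, and RemoveParagraph copies the matches into a list before removing them, which is a cautious variation rather than the original snippet.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class ParagraphRoundTrip
{
    // Assumption: main document namespace URI from the final ECMA-376 standard.
    const string WordNamespace =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical stand-in for the 'CreateNamespaceManager' snippet.
    static XmlNamespaceManager CreateNamespaceManager(XmlDocument doc)
    {
        XmlNamespaceManager namespaces = new XmlNamespaceManager(doc.NameTable);
        namespaces.AddNamespace("w", WordNamespace);
        return namespaces;
    }

    // Hypothetical stand-in for the 'GetToBodyNode' helper.
    static XmlNode GetToBodyNode(XmlDocument doc, XmlNamespaceManager namespaces)
    {
        return doc.SelectSingleNode("/w:document/w:body", namespaces);
    }

    static void AddParagraph(XmlNode parentNode,
        XmlNamespaceManager namespaces, string optionalText)
    {
        string wordNamespace = namespaces.LookupNamespace("w");
        XmlDocument doc = parentNode.OwnerDocument;
        XmlNode paragraphNode = doc.CreateElement("w", "p", wordNamespace);
        if (!String.IsNullOrEmpty(optionalText))
        {
            XmlNode runNode = doc.CreateElement("w", "r", wordNamespace);
            XmlNode textNode = doc.CreateElement("w", "t", wordNamespace);
            textNode.InnerText = optionalText;
            runNode.AppendChild(textNode);
            paragraphNode.AppendChild(runNode);
        }
        parentNode.AppendChild(paragraphNode);
    }

    static void RemoveParagraph(XmlNode parentNode,
        XmlNamespaceManager namespaces, string selectionPredicate)
    {
        string xpath = String.Format("w:p[{0}]", selectionPredicate);
        // Copy the matches first so removal cannot disturb the query result.
        List<XmlNode> matches = new List<XmlNode>();
        foreach (XmlNode node in parentNode.SelectNodes(xpath, namespaces))
            matches.Add(node);
        foreach (XmlNode node in matches)
            parentNode.RemoveChild(node);
    }

    // Returns the paragraph count before adding, after adding, after removing.
    public static int[] Run()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<w:document xmlns:w='" + WordNamespace + "'><w:body>" +
            "<w:p><w:r><w:t>Existing text</w:t></w:r></w:p>" +
            "</w:body></w:document>");
        XmlNamespaceManager namespaces = CreateNamespaceManager(doc);
        XmlNode body = GetToBodyNode(doc, namespaces);

        int before = body.SelectNodes("w:p", namespaces).Count;
        AddParagraph(body, namespaces, "My new paragraph");
        int afterAdd = body.SelectNodes("w:p", namespaces).Count;
        RemoveParagraph(body, namespaces,
            "w:r/w:t[contains(text(), 'My new')]");
        int afterRemove = body.SelectNodes("w:p", namespaces).Count;
        return new int[] { before, afterAdd, afterRemove };
    }

    static void Main()
    {
        int[] counts = Run();
        Console.WriteLine("{0}, {1}, {2}",
            counts[0], counts[1], counts[2]); // prints 1, 2, 1
    }
}
```

The existing paragraph survives because its text does not contain 'My new'; only the paragraph added a moment earlier matches the predicate.
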
That's it for working with paragraphs. The next article will go into working with tables. I'll show how to add and remove tables, columns and rows. The code gets more elaborate when working with tables, but the basics explained in this article will apply to those snippets as well. You can expect the next article in the coming few days, so stay tuned!
