Need an OpenXML SDK code for extracting slide titles and slide text

Last post 09-12-2008, 11:31 AM by jegu. 5 replies.
  •  08-02-2008, 7:48 AM 3525

    Need an OpenXML SDK code for extracting slide titles and slide text

    Hello,

    I need to know how to use the Open XML SDK (DocumentFormat.OpenXml.Packaging) to extract the title and the text of every slide, in presentation order.

    I would appreciate it if you could give me some sample code in C# or point me to a relevant tutorial website.

    Best regards,

  •  08-04-2008, 8:45 PM 3532 in reply to 3525

    Re: Need an OpenXML SDK code for extracting slide titles and slide text

    Try this:

    // Requires: using System.Collections.Generic; using System.Windows.Forms;
    // using System.Xml; using DocumentFormat.OpenXml.Packaging;
    const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";

    List<string> titles = new List<string>();

    using (PresentationDocument pptDoc = PresentationDocument.Open(@"C:\users\abhatia\desktop\test.pptx", false))
    {
        // Manage namespaces to perform XPath queries.
        NameTable nt = new NameTable();
        XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
        nsManager.AddNamespace("p", presentationmlNamespace);

        // Note: SlideParts is not guaranteed to enumerate in presentation order.
        foreach (SlidePart slidePart in pptDoc.PresentationPart.SlideParts)
        {
            XmlDocument slideDoc = new XmlDocument(nt);
            slideDoc.Load(slidePart.GetStream());

            foreach (XmlNode w in slideDoc.SelectNodes("//p:sp", nsManager))
            {
                // A placeholder may omit the type attribute, so check for null first.
                XmlNode ph = w.SelectSingleNode(".//p:ph", nsManager);
                XmlAttribute type = (ph != null) ? ph.Attributes["type"] : null;
                string hed = (type != null) ? type.Value : null;

                // To include subtitles, also check for hed == "subTitle".
                if (hed == "title" || hed == "ctrTitle")
                {
                    // This is the slide title.
                    titles.Add(w.InnerText);
                    MessageBox.Show(w.InnerText);
                }
                else
                {
                    // This is other slide text.
                    MessageBox.Show(w.InnerText);
                }
            }
        }
    }


    Ankush
  •  08-06-2008, 9:13 AM 3536 in reply to 3532

    Re: Need an OpenXML SDK code for extracting slide titles and slide text

    I was also looking for a way to get the correct order of the slides and saw this post. I tested your code, but it doesn't seem to return the slides in order.

    When you view a presentation in "normal" view, you see the slides in the left pane numbered 1 through x. How can you get the titles ordered just like that?

    Thanks

  •  08-12-2008, 5:12 AM 3557 in reply to 3536

    Re: Need an OpenXML SDK code for extracting slide titles and slide text

    Hi,

    The way I did that was to extract and sort the slides' relationship IDs (rId1, rId13, rId11, etc.):
    parse the numeric part of each ID (1, 13, 11),
    sort the numbers,
    prefix "rId" again to each sorted number,
    then process each slide in that order.

    I am not sure the slides are always in this order, but when I did a quick test in Office 2007, random addition and deletion of slides always left the slide IDs in sequence.
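
The steps above could be sketched roughly as follows. This only illustrates the numeric sort described in this post; the class and method names are hypothetical, every ID is assumed to have the form "rId" followed by a number, and, as the poster says, the result is not guaranteed to match the slide order shown in PowerPoint.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RelationshipIdSorter
{
    // Hypothetical helper: orders IDs such as "rId13" by their numeric suffix.
    // Assumes every ID starts with "rId" followed by an integer.
    public static List<string> SortRelationshipIds(IEnumerable<string> ids)
    {
        return ids
            .Select(id => int.Parse(id.Substring("rId".Length))) // "rId13" -> 13
            .OrderBy(n => n)                                      // numeric, not textual, sort
            .Select(n => "rId" + n)                               // 13 -> "rId13"
            .ToList();
    }
}
```

For example, passing "rId1", "rId13", "rId11" gives back rId1, rId11, rId13. A plain string sort would place rId11 and rId13 before rId2, which is why the numeric part is parsed first.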

    Thanks

  •  08-21-2008, 2:14 PM 3595 in reply to 3536

    Re: Need an OpenXML SDK code for extracting slide titles and slide text

    Hi,

    Try this code. It extracts each slide's title and text in presentation order.

    // Requires: using System.Windows.Forms; using System.Xml;
    // using DocumentFormat.OpenXml.Packaging;
    const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";
    const string relationshipNamespace = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Open read-only; nothing is modified. The using block closes the document.
    using (PresentationDocument presDoc = PresentationDocument.Open(@"C:\users\abhatia\desktop\test.pptx", false))
    {
        NameTable nt = new NameTable();
        XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
        nsManager.AddNamespace("p", presentationmlNamespace);

        XmlDocument presXML = new XmlDocument(nt);
        presXML.Load(presDoc.PresentationPart.GetStream());

        // Get the slide entries listed in presentation.xml, in document order.
        XmlNodeList ss1 = presXML.SelectNodes("//p:sldId", nsManager);

        // Loop through each entry and fetch its slide part by relationship ID (r:id).
        foreach (XmlNode w1 in ss1)
        {
            // Look up r:id by name rather than by attribute position.
            string rId = w1.Attributes["id", relationshipNamespace].Value;
            OpenXmlPart sldPart = presDoc.PresentationPart.GetPartById(rId);

            XmlDocument slideDoc = new XmlDocument(nt);
            slideDoc.Load(sldPart.GetStream());

            // Get the sp elements to read the slide title and text.
            foreach (XmlNode w in slideDoc.SelectNodes("//p:sp", nsManager))
            {
                // A placeholder may omit the type attribute, so check for null first.
                XmlNode ph = w.SelectSingleNode(".//p:ph", nsManager);
                XmlAttribute type = (ph != null) ? ph.Attributes["type"] : null;
                string hed = (type != null) ? type.Value : null;

                // To include subtitles, also check for hed == "subTitle".
                if (hed == "title" || hed == "ctrTitle")
                {
                    // This is the slide title.
                    MessageBox.Show(w.InnerText);
                }
                else
                {
                    // This is other slide text.
                    MessageBox.Show(w.InnerText);
                }
            }
        }
    }

    Ankush
  •  09-12-2008, 11:31 AM 3692 in reply to 3595

    Re: Need an OpenXML SDK code for extracting slide titles and slide text

    Hi,
    The slide order is determined only by the order of the p:sldId elements inside the p:sldIdLst node of presentation.xml:

    <p:sldIdLst> <p:sldId> ... </p:sldId> </p:sldIdLst>

    Try moving slides around inside the presentation and then have a look at the presentation.xml file. Nothing else (relationship IDs, slide parts, and so on) changes except the order of the p:sldId elements in the p:sldIdLst node.
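
As a rough illustration of the point above, the following sketch reads the r:id values from p:sldIdLst in document order using only System.Xml. The class and method names are hypothetical, and the stream is assumed to contain the XML of the presentation part (for example, the stream returned by PresentationPart.GetStream(), as used earlier in this thread).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class SlideOrderReader
{
    const string PresentationNs = "http://schemas.openxmlformats.org/presentationml/2006/main";
    const string RelationshipNs = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Hypothetical helper: returns the r:id of each p:sldId in the order
    // the elements appear inside p:sldIdLst, which is the slide order.
    public static List<string> GetSlideRelationshipIds(Stream presentationXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(presentationXml);

        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("p", PresentationNs);

        List<string> ids = new List<string>();
        foreach (XmlElement sldId in doc.SelectNodes("/p:presentation/p:sldIdLst/p:sldId", ns))
        {
            // GetAttribute returns an empty string when the attribute is missing.
            string rId = sldId.GetAttribute("id", RelationshipNs);
            if (rId.Length > 0)
            {
                ids.Add(rId);
            }
        }
        return ids;
    }
}
```

Each returned ID can then be passed to PresentationPart.GetPartById to load the matching slide part, so the slides are processed in the same order PowerPoint displays them.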

    Regards,
    Jegu