Re: How to convert HTML to OpenXML - C/C++ - Development Tools - OpenXML Developer



How to convert HTML to OpenXML

  • Hi,

    I am working on a task to convert HTML text (with images and formatted text) to Open XML format (i.e., docx). We need to integrate this feature into a product developed in native VC++, so I assume the feature needs to be developed in VC++ as well.

    I googled and found that there are no Microsoft libraries that support Open XML (docx) in native VC++.

    Is there any way to do it?

    Could you please help me with this issue? Please let me know if you need any other information regarding my task. Any help is appreciated.

    Cheers
  • You just need to manually translate the C# code snippets into VC++ for .NET (C++/CLI). In theory the same Open XML SDK library will work with C++ for .NET, as it does with any language that supports .NET (VB, C#, C++, IronPython, Boo, etc.).

    If you are working in unmanaged VC++ code, you can use COM interop to call a .NET library. Develop your Open XML solution in C#, build it as a strong-named assembly (use sn.exe to generate the keys or let Visual Studio do it), and make it COM-visible.

    Then write a native C++ COM wrapper to call the C# library that you have developed.
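
As a rough sketch of the native side of that approach, the function below calls a COM-visible .NET class through late-bound IDispatch. The ProgID "HtmlToDocx.Converter", the method name "Convert", and its two string parameters are hypothetical; they stand in for whatever the C# library actually exposes. The assembly is assumed to be registered for COM (for example with regasm.exe). The COM calls compile only on Windows, so the sketch simply returns false elsewhere.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <objbase.h>
#include <oleauto.h>
#endif

// Asks a COM-visible .NET class to convert HTML into a .docx file.
// "HtmlToDocx.Converter" and "Convert(html, outPath)" are assumed names,
// not part of any real library. Returns true only if the call succeeded.
bool convertViaCom(const std::wstring& html, const std::wstring& outPath) {
#ifdef _WIN32
    if (FAILED(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED))) return false;

    bool ok = false;
    CLSID clsid;
    IDispatch* disp = nullptr;

    // Resolve the registered ProgID and create the object in-process.
    if (SUCCEEDED(CLSIDFromProgID(L"HtmlToDocx.Converter", &clsid)) &&
        SUCCEEDED(CoCreateInstance(clsid, nullptr, CLSCTX_INPROC_SERVER,
                                   IID_IDispatch,
                                   reinterpret_cast<void**>(&disp)))) {
        OLECHAR method[] = L"Convert";
        LPOLESTR names[] = { method };
        DISPID dispid = 0;

        if (SUCCEEDED(disp->GetIDsOfNames(IID_NULL, names, 1,
                                          LOCALE_USER_DEFAULT, &dispid))) {
            // IDispatch expects the arguments in reverse order.
            VARIANTARG args[2];
            VariantInit(&args[0]);
            args[0].vt = VT_BSTR;
            args[0].bstrVal = SysAllocString(outPath.c_str());
            VariantInit(&args[1]);
            args[1].vt = VT_BSTR;
            args[1].bstrVal = SysAllocString(html.c_str());

            DISPPARAMS params = { args, nullptr, 2, 0 };
            ok = SUCCEEDED(disp->Invoke(dispid, IID_NULL, LOCALE_USER_DEFAULT,
                                        DISPATCH_METHOD, &params,
                                        nullptr, nullptr, nullptr));
            VariantClear(&args[0]);
            VariantClear(&args[1]);
        }
        disp->Release();
    }
    CoUninitialize();
    return ok;
#else
    // COM is not available on this platform.
    (void)html;
    (void)outPath;
    return false;
#endif
}
```

A caller would invoke it as convertViaCom(L"<p>Hello</p>", L"test.docx") and link against ole32.lib and oleaut32.lib. Early binding through #import of a type library exported from the assembly is an alternative when the interface is stable.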


  • CodePlex has a DLL that can convert simple types:

    http://notesforhtml2openxml.codeplex.com/

    HTH
  • Thanks to Mr papaburger for the help with the native C++ COM wrapper that calls the C# library.

  • Create a new console application. Add references to DocumentFormat.OpenXml.dll (shipped with the Open XML SDK 2.0) and to the HtmlToOpenXml library DLL from the CodePlex project.
    Add an HTML file and fill it with:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
         <head>
              <title></title>
         </head>
         <body>
              Looks how cool is <font size="x-large"><b>Open Xml</b></font>.
              Now with <font color="red"><u>HtmlToOpenXml</u></font>, it nevers been so easy to convert html.
              <p>
                   If you like it, add me a rating on <a href="http://notesforhtml2openxml.codeplex.com">codeplex</a>
              </p>
              <hr>
         </body>
    </html>
    

    Add a Resources.resx component and add your HTML file to it as a resource named DemoHtml (the code reads it through Properties.Resources.DemoHtml).

    In Program.cs, add these lines of code:

    using System.IO;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using NotesFor.HtmlToOpenXml;
    ...
    
    static void Main(string[] args)
    {
         const string filename = "test.docx";
         string html = Properties.Resources.DemoHtml;
    
         // Start from a clean slate if a previous run left a file behind.
         if (File.Exists(filename)) File.Delete(filename);
    
         // Build the package in memory, then write it to disk in one go.
         using (MemoryStream generatedDocument = new MemoryStream())
         {
              using (WordprocessingDocument package = WordprocessingDocument.Create(generatedDocument, WordprocessingDocumentType.Document))
              {
                   // A new package has no main part yet, so create one with an empty body.
                   MainDocumentPart mainPart = package.MainDocumentPart;
                   if (mainPart == null)
                   {
                        mainPart = package.AddMainDocumentPart();
                        new Document(new Body()).Save(mainPart);
                   }
    
                   HtmlConverter converter = new HtmlConverter(mainPart);
                   Body body = mainPart.Document.Body;
    
                   // Convert the HTML and append each resulting element to the body.
                   var paragraphs = converter.Parse(html);
                   for (int i = 0; i < paragraphs.Count; i++)
                   {
                        body.Append(paragraphs[i]);
                   }
    
                   mainPart.Document.Save();
              }
    
              // The package must be disposed before its bytes are read back.
              File.WriteAllBytes(filename, generatedDocument.ToArray());
         }
    
         // Open the generated document with its associated application.
         System.Diagnostics.Process.Start(filename);
    }
    



    Run your application and you will obtain:
    demo.jpg

     

     

     

    source: http://html2openxml.codeplex.com/documentation
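
The thread does not show what the converter actually emits. As a rough illustration of the target format only, the sketch below builds the kind of WordprocessingML paragraph (w:p, w:r, w:rPr, w:b, w:t) that a fragment such as "Looks how cool is <b>Open Xml</b>" could map to. The Run struct and both function names are hypothetical, and the output is not claimed to match HtmlToOpenXml byte for byte.

```cpp
#include <string>
#include <vector>

// One run of text with a single formatting flag (hypothetical helper type).
struct Run {
    std::string text;
    bool bold;
};

// Escapes the characters that are not allowed verbatim in XML text content.
std::string escapeXml(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Serializes runs as one WordprocessingML paragraph. Bold text is expressed
// with <w:b/> in the run properties; xml:space="preserve" keeps leading and
// trailing spaces in the text.
std::string paragraphXml(const std::vector<Run>& runs) {
    std::string xml = "<w:p>";
    for (const Run& r : runs) {
        xml += "<w:r>";
        if (r.bold) xml += "<w:rPr><w:b/></w:rPr>";
        xml += "<w:t xml:space=\"preserve\">" + escapeXml(r.text) + "</w:t>";
        xml += "</w:r>";
    }
    xml += "</w:p>";
    return xml;
}
```

In a real document these elements sit inside w:document/w:body in word/document.xml, with the w prefix bound to the WordprocessingML namespace, and the whole package is a ZIP archive.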

     


  • Isn't it easier to work with something like a copy-paste action, using the HTML code directly instead of converting it?
