Using Streams to open/edit/save documents seems to corrupt output intermittently - .Net - Development Tools - OpenXML Developer

  • Hi all,

    Can anyone point me in the direction of any guides on how to correctly use the WordprocessingDocument class with streams, please?  I'm seeing the following behaviour in a small class I created to edit a document.

    1. If I use the WordprocessingDocument.Open method passing filenames, everything works perfectly and consistently (I can't break it).
    2. If I use WordprocessingDocument.Open with a Stream, I see one of the following behaviours:
       a. It works correctly (about 20% of the time).
       b. It generates a DOCX file that appears identical (contents and checksums are identical), but Word reports the DOCX as corrupt and it's almost twice as big as a correct output file; i.e. corrupt file size is (correct file size - 1) * 2.  This happens about 40% of the time.
       c. It generates a file that is the same size, but still corrupted, about 40% of the time.

    In each case of corruption, Word can recover the content perfectly.  Testing the archive in 7-Zip reports no errors on the corrupted files.  The contents appear to be identical (I'm comparing file names, locations, sizes and ZIP CRCs).  The files are, obviously, different on a binary level, but I have no idea what's going on.  The OpenXML SDK productivity tool reports "Unable to read beyond the end of the stream".

    I'm guessing that it's something to do with a threading issue on accessing the stream.  I've tried various things:

    1. Loading the file into a MemoryStream created with the empty constructor, then writing the data to it.
    2. Seeking or not seeking to the beginning of the stream before passing it to WordprocessingDocument.Open.
    3. Trying with AutoSave on and off in the OpenSettings.
    4. Flushing the MemoryStream after writing to it.
    5. Closing the MemoryStream before calling GetBuffer.

    In each case I'm trying to write the output to a new file.

    In all cases, I'm running OpenXmlValidator.Validate on the WordprocessingDocument and it has never reported an error.

    I don't want to attempt applying locks all over the place without knowing whether it's going to work reliably or not!

    Has anyone else out there experienced the same issue?

    I'm using DocumentFormat.OpenXml v2.0.5022.0.  My app is compiled for AnyCPU and runs on Windows 7 Enterprise 64-bit.

    I didn't want to paste loads of code or sample errors in here as I'm not sure what the correct etiquette is, but I'm happy to do so if that's acceptable!  I'm guessing there's a best practice or pattern for handling streams with the OpenXML SDK, but internet searches and the SDK CHM file haven't yielded anything helpful so far.

    Cheers,

    Andy    

  • Fixed my own problem.  And, of course, having spent hours beating my head against a wall on this one, I discover that I'm an idiot approximately 10 minutes after posting this in a public forum!

    Basically, it's nothing to do with the OpenXML SDK; it's entirely down to me calling the wrong method on MemoryStream.

    MemoryStream.GetBuffer() returns the whole underlying buffer, including any allocated but unused bytes past the end of the data.  To get just the *data* I should have used MemoryStream.ToArray().

    Reference: msdn.microsoft.com/.../dev10.query
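
    To make the difference concrete, here is a minimal sketch using only System.IO.  The class name BufferVsArrayDemo and the helper Fill are made up for illustration, and the chunk sizes are arbitrary.  It writes data to an expandable MemoryStream in several chunks, as tends to happen when a package is saved to a stream, and then compares the two calls.  The exact capacity depends on the runtime's growth strategy, so the sketch prints it rather than assuming a value.

    ```csharp
    using System;
    using System.IO;

    public static class BufferVsArrayDemo
    {
        // Hypothetical helper: writes `chunks` blocks of `chunkSize` zero bytes
        // into an expandable MemoryStream and returns the (still open) stream.
        public static MemoryStream Fill(int chunks, int chunkSize)
        {
            var ms = new MemoryStream();            // empty constructor: buffer can grow
            var chunk = new byte[chunkSize];
            for (int i = 0; i < chunks; i++)
            {
                ms.Write(chunk, 0, chunk.Length);   // buffer is reallocated as needed
            }
            return ms;
        }

        public static void Main()
        {
            using (MemoryStream ms = Fill(10, 100))
            {
                byte[] whole = ms.GetBuffer();      // underlying buffer, unused tail included
                byte[] exact = ms.ToArray();        // copy of exactly ms.Length bytes

                Console.WriteLine("Length:             " + ms.Length);
                Console.WriteLine("Capacity:           " + ms.Capacity);
                Console.WriteLine("GetBuffer().Length: " + whole.Length);
                Console.WriteLine("ToArray().Length:   " + exact.Length);
            }
        }
    }
    ```

    Writing the GetBuffer() result to disk appends that unused tail after the end of the ZIP data, which would fit the symptoms above: the archive entries still check out, but the file is larger than it should be.  If the extra copy made by ToArray() is a concern, writing only the first ms.Length bytes of GetBuffer(), or calling ms.WriteTo(outputStream), should avoid the padding as well.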

    I'm going to uninstall Visual Studio now, don't worry.

    Cheers,

    Andy
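
For anyone landing here looking for the stream pattern asked about in the original post, the following is a rough sketch of an in-memory open/edit/save round trip with the fix applied.  It assumes the DocumentFormat.OpenXml assembly is referenced; the class StreamRoundTrip, the method Edit and the file paths in the usage note are hypothetical names chosen for illustration, not part of the SDK.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class StreamRoundTrip
{
    // Hypothetical helper: applies `edit` to a DOCX held in memory and
    // returns the bytes of the edited package.
    public static byte[] Edit(byte[] docx, Action<WordprocessingDocument> edit)
    {
        // Use the empty constructor so the stream can grow; a MemoryStream
        // created directly over a byte[] cannot be resized.
        using (var ms = new MemoryStream())
        {
            ms.Write(docx, 0, docx.Length);
            ms.Position = 0;

            // Dispose the document before reading the bytes back, so the
            // package is completely written to the stream.
            using (WordprocessingDocument doc = WordprocessingDocument.Open(ms, true))
            {
                edit(doc);
            }

            // ToArray() copies exactly ms.Length bytes; GetBuffer() would also
            // return the unused tail of the internal buffer.
            return ms.ToArray();
        }
    }
}
```

Usage would look something like `File.WriteAllBytes(outputPath, StreamRoundTrip.Edit(File.ReadAllBytes(inputPath), doc => { /* change the document, then save the modified part */ }));`.  Passing the edit in as a delegate keeps the stream handling in one place, so the GetBuffer()/ToArray() mistake cannot creep back in at each call site.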
