wordpress hit counter
Formatted Excel using SDK 2.0 and .NET - OpenXML Developer - Blog - OpenXML Developer

Formatted Excel using SDK 2.0 and .NET

Blog

Samples, Demos, and Reference Articles

Formatted Excel using SDK 2.0 and .NET

Rate This
  • Comments 12

By Lawrence Hodson


Introduction
 

The Open XML SDK v2.0 provides an API to Microsoft .NET developers for automating the new Office document formats (docx, xlsx etc...). Until now, I used it for manipulating Word documents. For this sample I wanted to attempt producing Excel documents using the SDK & .NET.


The Goal 

Many online banking websites let you export your transactions to CSV formats. My challenge was to take this CSV format and reproduce into a nicely formatted and branded Excel download instead.

To achieve this I built a simple ASP.NET web page with a download button. When clicked an Excel file of transactions would be generated and saved (or opened) by the end user.

I decided to use a templated approach where the resulting document is built from a pre-defined Excel spread sheet, and then filled with transaction data.

This template document will contain Title, Header rows, and appropriate formatting. Below the header will be several template cells marked up with "[ColumnName]" values. The code would replace them with the same column named values from a .NET DataTable of randomly generated transactions.

In order for the code to not have to scan the entire template spreadsheet for these template cell values, the range of cells will be defined as an Excel Named Range feature.

When the code runs it will look up this Named Range to determine the starting row, column & end column.

The end result will look something like this...


The Sample

Go ahead and download the sample code.  It is the complete solution, whereas this article will only cover snippets of particular interest. The download is a Visual Studio 2008 solution with the code written in C#.

To install the compiled binary version of the sample save the sub folder named “binary” to a Windows machine with IIS installed and create a web site pointing at that folder. You may need to edit the web.config setting to point to the TemplateDocs sub folder. To run the sample, browse to the site and click the “Generate spreadsheet” button. The data source in the spreadsheet is made up from generated random numbers. In your production application you would typically get this from a database query instead.

The sample contains both the runtime "Binary" for the website and the entire solution Source code.

The solution source is made up of the following:

  • OpenXml.Sandpit.FormattedExcel.Library
    Class library that does all the Open XML specific things
  • OpenXml.Sandpit.FormattedExcel.Web
    Simple ASP.NET Web application that uses the Library to generate the Document.
  • Lib
    The Lib folder contains the DocumentFormat.OpenXml.dll referenced by the rest of the code.


About the Code: OpenXml.Sandpit.FormattedExcel.Web

The Web application is very simple. It has just one page containing the Generate button. When clicked, the Excel file is generated and the Response is changed from normal Html content to Excel. Modern browsers should recognise this and let the end user choose to Open the document, or Save it somewhere on their file system.

A SpreadsheetWorker object is instantiated in a "using" statement to ensure any unmanaged resources (file handles etc...) are Disposed of promptly when complete. I will expand on the SpreadsheetWorker class later on.

using (SpreadsheetWorker worker = new SpreadsheetWorker())
{
           ...
}

The location of the template document is read from configuration and the document is then opened by the worker.

string filePath = ConfigurationManager.AppSettings["TemplateDocumentFilePath"];
worker.Open(filePath);

A .NET DataTable containing the transaction records are obtained. In a real world application this might be retrieved from a Database, but for this sample it is simply loaded with random data.

The transaction DataTable is then passed to FillData() along with the name of the Excel Named Range that defines where the transactions will be appended into the document.

DataTable dataTable = GetTransactionDataTable();
worker.FillData("TemplateRow", dataTable);

The resulting document is then saved to a byte array using SaveAs().

byte[] outputFileBytes = worker.SaveAs();

The byte array is then returned as an Excel response.

Response.ClearHeaders();
Response.ClearContent();
Response.ContentType = "application/vnd.ms-excel";
Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}", outputFileName));
Response.BinaryWrite(outputFileBytes);
Response.Flush();
Response.End();


About the Code: OpenXml.Sandpit.FormattedExcel.Library
The library assembly and SpreadsheetWorker class is where all the real action happens.

The approach I took is similar to that of the Headers and Footers article I wrote earlier. The SpreadsheetWorker abstracts the OpenXml SDK functionality we need and supports a simple Open -> Manipulate -> Save -> close type lifecycle.

For an explanation of how this is achieved refer to the earlier article.

The Excel specific action happens in the FillData() method.


The FillData Method

The FillData method modifies the in-memory document and appends the transaction rows to the spreadsheet as follows:

Firstly, it determines where to append the transactions using the Named Range defined in the template. The code will look for this Named Range to get the start column, row and end column.

I found a nice example of how to do this at Brian Jones' blog here.

List<DefinedNameVal> namesTable = BuildDefinedNamesTable(_inMemoryDocument.WorkbookPart);

Using this information the actual template row is retrieved

WorksheetPart worksheetPart = GetWorkSheetPart(matchedNameRange);
SheetData sheetData = GetWorkSheetData(worksheetPart, matchedNameRange);
Row templateContentRow = GetContentRow(sheetData, Convert.ToInt32(matchedNameRange.EndRow));

Next the DataTable of transactions is iterated through and a Spreadsheet row inserted for each.

For each DataRow an Excel row is created and formatting cloned from the template row.

Row contentRow = CreateContentRow(rowNumber, dataRow, templateContentRow, matchedNameRange);

The CreateContentRow() method creates a new row cloning the span and style from the template row.

//Create new row
Row r = new Row();
r.RowIndex = (UInt32)index;
r.Spans = templateRow.Spans;
r.StyleIndex = templateRow.StyleIndex;

It then populates it with cells cloned from each cell defined in the Named Range.  

cell = CreateCellFromTemplate(columnId, index, templateCell, dataValue); 

The new row is then added to the spreadsheet.

templateContentRow.InsertBeforeSelf(contentRow);

Finally, when all the transaction rows have been added the template Row is removed, and changes are saved to the in Memory document

sheetData.RemoveChild(templateContentRow);
worksheetPart.Worksheet.Save();

Conclusion

This was my first look at producing a spreadsheet using Open XML. The packaging part was easy – as with Word.

Perhaps I missed something, but it seemed to me that inserting cells in rows requires more effort to ensure a valid document at the end. As it turned out there was a lot of trial and error before I finally got it right. 

I have left it to the reader to look into how to support total row. The reader could also investigate how to treat any rows below the Named Range template row as a footer and adjust any ranges in the formula so that these work as expected. My initial attempts are not included in the sample because I always ended up with an invalid document. I suspect that Cell/Row references of trailing content might need to be updated to adjust for the inserted transactions rows.  PowerTools solves some similar issues for word documents. Perhaps similar help for spreadsheet documents would help.

In this sample I chose to assume only a single row in the Named Range. The sample could certainly be adapted to support multiple template rows in the Named Range.

To complete this sample, I referred to several other useful code samples.

Attachment: OpenXmlFormattedExcel.zip
  • Great Article!
    I am trying to wrap Open XML SDK 2.0 August 2009 CTP
    so that I can generate Excel doc in a more intuitive way.
    I can generate a excel doc with charts in it, no problem.

    But I am missing something, can not fully understand the spec.
    I am trying to add a format to a number through the SDK but don't know how.
    And header styles, etc.
    Could you shed some light on that?
  • Hi Lawrence,

    Thanks for your very interesting article!

    I'm doing some experimentations with your code and I've found something strange: if I put some content in the cells below the named range (for example at row 7), it is overwritten by the call to the method InsertBeforeSelf. Is it possible to push down the content instead?

    Regards,
    Stenio
  • I figured out how to do it by my self. Here is the code:

    after the RemoveChild in FillData put:

                   if (contentRow != null) {
                       UpdateRowIndices(contentRow, (uint)dataTable.Rows.Count);
                       UpdateNamedRanges(namedRangeToFill, int.Parse(matchedNameRange.StartRow), (uint)dataTable.Rows.Count);
                   }

    and than

    private void UpdateRowIndices(Row fromRow, uint nRows) {
               foreach (Row row in fromRow.ElementsAfter()) {
                   uint newRowIndex = row.RowIndex.Value + nRows -1;

                   UpdateCellReferences(row, newRowIndex);

                   // Update the reserved row index.
                   row.RowIndex = new UInt32Value(newRowIndex);
               }
           }

           private void UpdateCellReferences(Row row, uint newRowIndex) {
               foreach (Cell cell in row.Elements<Cell>()) {
                   // Update the references for reserved cells.
                   string cellReference = cell.CellReference.Value;
                   cell.CellReference = new StringValue(cellReference.Replace(row.RowIndex.Value.ToString(), newRowIndex.ToString()));
               }
           }

           private void UpdateNamedRanges(string namedRangeToFill, int fromRow, uint nRows) {
               foreach (DefinedName name in _inMemoryDocument.WorkbookPart.Workbook.GetFirstChild<DefinedNames>()) {
                   if(name.Name == namedRangeToFill) continue;

                   //Parse defined name string...
                   string key = name.Name;
                   string reference = name.InnerText;

                   string sheetName = reference.Split('!')[0];
                   sheetName = sheetName.Trim('\'');

                   //Assumption: None of my defined names are relative defined names (i.e. A1)
                   string range = reference.Split('!')[1];
                   string[] rangeArray = range.Split('$');

                   string startCol = rangeArray[1];
                   string startRow = rangeArray[2].TrimEnd(':');

                   if (int.Parse(startRow) < fromRow) continue;
                   startRow = (int.Parse(startRow)+nRows-1).ToString();

                   string endCol = null;
                   string endRow = null;

                   if (rangeArray.Length > 3) {
                       endCol = rangeArray[3];
                       endRow = (int.Parse(rangeArray[4])+nRows-1).ToString();
                   }

                   DefinedName newDefinedName = new DefinedName(string.Format("{0}!${1}${2}:${3}${4}", sheetName, startCol, startRow, endCol, endRow));
                   newDefinedName.Name = name.Name;
                   _inMemoryDocument.WorkbookPart.Workbook.GetFirstChild<DefinedNames>().ReplaceChild(newDefinedName, name);
               }
       }

    Hope this helps.

    Regards,
    Stenio Brunetta
  • WHERE exactly can i download the SOURCE Code
  • u hv to login
  • Ok, I'm logged in. where exactly can anyone download the source?
  • Hi Lawrence,
    Great tutorial. I need to update individual cells rather than flush a whole table on the sheet. the worker object seems a bit limited for that.
    should I not use your library in that case? if so, how does all this work with "not" using your library?
  • i have registered and trying for code i could not get it, where can i get source code
  • List<DefinedNameVal> namesTable = BuildDefinedNamesTable(_inMemoryDocument.WorkbookPart);

    The above code doe not compile:
    Error message as follows: unknown type

    Environment: VS 2010 premium and Microsoft Office Open XML SDK 2.0
    the project is referncing: DocumentFormat.OpenXml.dll
    I am using:
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;
    using DocumentFormat.OpenXml;

    What's the problem?


  • Hi ZoeLola,

    Are you attempting to download and compile the entire project, and then receiving the error?

    @reddy506, @Thorne, @amaranth,

    You have to log in, and then the code is available as an attachment to this article.  It is just below the links to Brian Jones's blog, OpenXML PowerTools, and MSDN Sample.

    -Eric
  • <plug type="shameless">
    I'm part of the team making ExcelWriter, which helps you solve exactly this problem much faster and with much less code.  Here's a quick example of how you can make a report like the one here with only a few lines of code: http://wiki.softartisans.com/display/BLOGS/2011/01/27/Using+ExcelTemplate+Grouping

    Basically, you can make a template for your report, tell our Template object what data to use with it, and then be done.  We handle all the messy work of making sure the XML is valid, and we'll automatically update formulas so that you can easily have aggregates in a footer, or a seperate analysis sheet, or even live charts.

  • Pingback from  Save Excel Document in ASP.NET | PHP Developer Resource

Page 1 of 1 (12 items)