libxlsxwriter
Working with Memory and Performance

Constant Memory Mode

By default libxlsxwriter holds all cell data in memory. This is to allow non-sequential data storage and also to allow future features where formatting is applied separately from the data.

The effect of this is that for large files libxlsxwriter can consume a lot of memory.

Fortunately, this memory usage can be reduced almost completely by using workbook_new_opt() and the lxw_workbook_options constant_memory property:

#include "xlsxwriter.h"
int main() {
lxw_row_t row;
lxw_col_t col;
lxw_row_t max_row = 1000;
lxw_col_t max_col = 50;
/* Set the worksheet options. */
lxw_workbook_options options = {.constant_memory = LXW_TRUE,
.tmpdir = NULL,
.use_zip64 = LXW_FALSE};
/* Create a new workbook with options. */
lxw_workbook *workbook = workbook_new_opt("constant_memory.xlsx", &options);
lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
for (row = 0; row < max_row; row++) {
for (col = 0; col < max_col; col++) {
worksheet_write_number(worksheet, row, col, 123.45, NULL);
}
}
return workbook_close(workbook);
}

This optimization works by flushing each row after a subsequent row is written. In this way the largest amount of data held in memory for a worksheet is the amount of data required to hold a single row of data.

If required, this memory usage can be reduced even more by setting LXW_COL_MAX in worksheet.c from 16384 down to a value that matches the maximum column that is likely to be encountered.

The trade-off when using constant_memory mode is that data must be added sequentially in row order and you won't be able to take advantage of any functions that manipulate cell data after it is written. One such feature is worksheet_set_row() which in constant_memory mode can only be used when writing data to the current row. Another is worksheet_merge_range() which can only be applied to a merged range in the current row.

Row Column Order

Since each new row flushes the previous row, data must be written in sequential row order when constant_memory mode is on:

lxw_workbook *workbook = workbook_new_opt("constant_memory.xlsx", &options);
...
// !! Don't use "column x row" order in 'constant_memory' mode. Only
// the first column of data will be written.
for (col = 0; col < max_col; col++) {
for (row = 0; row < max_row; row++) {
worksheet_write_number(worksheet, row, col, 123.45, NULL);
}
}

Inline strings

Another optimization that is used to reduce memory usage in constant_memory mode is that cell strings aren't stored in an Excel structure call "shared strings" and instead are written "in-line".

This is a documented Excel feature that is supported by most spreadsheet applications. However, it isn't supported by some some spreadsheet viewer applications. Also, the size of the output file can increase by 20%-100% depending on the amount of repeated string data.

Performance

Currently the library is optimized but not highly optimized. Also, the library is currently single threaded.

Next: Working with VBA Macros

workbook_close
lxw_error workbook_close(lxw_workbook *workbook)
Close the Workbook object and write the XLSX file.
lxw_workbook_options
Workbook options.
Definition: workbook.h:254
LXW_FALSE
@ LXW_FALSE
Definition: common.h:51
workbook_new_opt
lxw_workbook * workbook_new_opt(const char *filename, lxw_workbook_options *options)
Create a new workbook object, and set the workbook options.
lxw_worksheet
Struct to represent an Excel worksheet.
Definition: worksheet.h:1603
LXW_TRUE
@ LXW_TRUE
Definition: common.h:53
lxw_workbook
Struct to represent an Excel workbook.
Definition: workbook.h:273
lxw_row_t
uint32_t lxw_row_t
Definition: common.h:40
worksheet_write_number
lxw_error worksheet_write_number(lxw_worksheet *worksheet, lxw_row_t row, lxw_col_t col, double number, lxw_format *format)
Write a number to a worksheet cell.
workbook_add_worksheet
lxw_worksheet * workbook_add_worksheet(lxw_workbook *workbook, const char *sheetname)
Add a new worksheet to a workbook.
lxw_col_t
uint16_t lxw_col_t
Definition: common.h:46