Comparison of popular Excel libraries

21 June, 2017

Simple benchmark of popular library

1

2

3

4

5

6

7

8

9

10

11

12

13

14

|          TEST_NAME          | NUM_ROWS | NUM_COLS | TIME_IN_SECONDS |

|-----------------------------|----------|----------|-----------------|

| pyexcelerate value fastest  |     1000 |      100 |            0.47 |

| pyexcelerate value faster   |     1000 |      100 |            0.51 |

| pyexcelerate value fast     |     1000 |      100 |            1.53 |

| xlsxwriter value            |     1000 |      100 |            0.84 |

| openpyxl                    |     1000 |      100 |            2.74 |

| pyexcelerate style cheating |     1000 |      100 |            1.23 |

| pyexcelerate style fastest  |     1000 |      100 |            2.40 |

| pyexcelerate style faster   |     1000 |      100 |            2.75 |

| pyexcelerate style fast     |     1000 |      100 |            6.15 |

| xlsxwriter style cheating   |     1000 |      100 |            1.21 |

| xlsxwriter style            |     1000 |      100 |            4.85 |

| openpyxl                    |     1000 |      100 |            6.32 |

Since the data size accessing is relatively large, approx. >50 excel files, >10 sheets in each file and >1k rows in each sheet, performance become very important.

OpenPyXLLibXL can provide both write & read functions

xlrd for reading only

xlwtPyExcelerate for writing only

OpenPyXL = Free with relatively slow performance

xlwt = Free with no limit but 2007 Excel (XML format) .xlsx is not supported

LibXL (C++) = extremely fast performance but with 300 cells read limit for free trial

xlrd = Free with no limit, support xlsx reading and fast performance

PyExcelerate = Free with no limit, support xlsx writing and fast performance

After the implementation of all libraries, I have decided to read with xlrd and write with PyExcelerate, the development was smooth and clean, no collision occurs between these two libraries, however, PyExceleratebehaves slightly different comparing to xlrd.

1.      For example, accessing cell value A4, it is (3, 0) for xlrd but (4, 1) for PyExcelerate as xlrd is zero-based, corruption occurs when trying to write to col#0 or row#0.

2.      Reading error value like “#VALUE!” is relatively easy in OpenPyXL as it’s directly stored as string”#VALUE!”, however, it is stored as “15” for xlrd in my example.

(0)

相关推荐