Tree @run/433de8f3-f8d2-4c44-a0de-9f04321f9854/upstream (Download .tar.gz)

tutorial.rst @run/433de8f3-f8d2-4c44-a0de-9f04321f9854/upstream — view markup · raw · history · blame

Sheet: Data Access

Iterate a csv file

Here is the way to read the csv file and iterate through each row:

System Message: INFO/1 (<string>, line 9)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 9)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> import pyexcel
   >>> pyexcel.save_as(array=[['Name', 'Age'], ['Chu Chu', '10'], ['Mo mo', '11']],
   ...     dest_file_name='tutorial.csv')

>>> sheet = pyexcel.get_sheet(file_name='tutorial.csv')
>>> for row in sheet:
...     print("%s: %s" % (row[0], row[1]))
Name: Age
Chu Chu: 10
Mo mo: 11

Often people wanted to use csv.Dict reader to read it because it has a header. Here is how you do it with pyexcel:

System Message: ERROR/3 (<string>, line 28)

Error in "code-block" directive: unknown option: "linenos".

.. code-block:: python
   :linenos:

   >>> sheet = pyexcel.get_sheet(file_name='tutorial.csv')
   >>> sheet.name_columns_by_row(0)
   >>> for row in sheet:
   ...     print("%s: %s" % (row[0], row[1]))
   Chu Chu: 10
   Mo mo: 11

Line 2 remove the header from the actual content. The removed header can be used to access its columns using the name itself, for example:

>>> sheet.column['Age']
[10, 11]

Random access to individual cell

Top left corner of a sheet is (0, 0), meaning both row index and column index start from 0. To randomly access a cell of :class:`~pyexcel.Sheet` instance, two syntax are available:

System Message: INFO/1 (<string>, line 51)

No role entry for "class" in module "docutils.parsers.rst.languages.en". Trying "class" as canonical role name.

System Message: ERROR/3 (<string>, line 51); backlink

Unknown interpreted text role "class".

sheet[row, column]

This syntax helps you iterate the data by row and by column. If you use excel positions, the syntax below help you get the cell instantly without converting alphabet column index to integer:

sheet['A1']

Please note that with excel positions, top left corner is 'A1'.

For example: suppose you have the following data sheet,

System Message: INFO/1 (<string>, line 64)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 64)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:example data---
   Example,X,Y,Z
   a,1,2,3
   b,4,5,6
   c,7,8,9

here is the example code showing how you can randomly access a cell:

System Message: INFO/1 (<string>, line 75)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 75)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> data = [['Example', 'X', 'Y', 'Z'], ['a', 1, 2, 3],['b', 4, 5, 6],['c', 7, 8, 9]]
   >>> s = pyexcel.Sheet(data)
   >>> s.save_as("example.xls")

System Message: INFO/1 (<string>, line 82)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 82)

Unknown directive type "testcode".

.. testcode::

   >>> sheet = pyexcel.get_sheet(file_name="example.xls")
   >>> sheet.content
   +---------+---+---+---+
   | Example | X | Y | Z |
   +---------+---+---+---+
   | a       | 1 | 2 | 3 |
   +---------+---+---+---+
   | b       | 4 | 5 | 6 |
   +---------+---+---+---+
   | c       | 7 | 8 | 9 |
   +---------+---+---+---+
   >>> print(sheet[3, 2])
   8
   >>> print(sheet["D3"])
   6
   >>> sheet[2, 3] = 10
   >>> print(sheet[2, 3])
   10

Note

In order to set a value to a cell, please use sheet[row_index, column_index] = new_value

Random access to rows and columns

System Message: INFO/1 (<string>, line 111)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 111)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> sheet[1, 0] = str(sheet[1, 0])
   >>> str(sheet[1,0])
   'a'
   >>> sheet[0, 2] = str(sheet[0, 2])
   >>> sheet[0, 2]
   'Y'

Continue with previous excel file, you can access row and column separately:

>>> sheet.row[1]
['a', 1, 2, 3]
>>> sheet.column[2]
['Y', 2, 5, 8]

Use custom names instead of index

Alternatively, it is possible to use the first row to refer to each columns:

>>> sheet.name_columns_by_row(0)
>>> print(sheet[1, "Y"])
5
>>> sheet[1, "Y"] = 100
>>> print(sheet[1, "Y"])
100

You have noticed the row index has been changed. It is because first row is taken as the column names, hence all rows after the first row are shifted. Now accessing the columns are changed too:

>>> sheet.column['Y']
[2, 100, 8]

Hence access the same cell, this statement also works:

>>> sheet.column['Y'][1]
100

Further more, it is possible to use first column to refer to each rows:

>>> sheet.name_rows_by_column(0)

To access the same cell, we can use this line:

>>> sheet.row["b"][1]
100

For the same reason, the row index has been reduced by 1. Since we have named columns and rows, it is possible to access the same cell like this:

>>> print(sheet["b", "Y"])
100
>>> sheet["b", "Y"] = 200
>>> print(sheet["b", "Y"])
200

Note

When you have named your rows and columns, in order to set a value to a cell, please use sheet[row_name, column_name] = new_value

For multiple sheet file, you can regard it as three dimensional array if you use :class:`~pyexcel.Book`. So, you access each cell via this syntax:

System Message: INFO/1 (<string>, line 172)

No role entry for "class" in module "docutils.parsers.rst.languages.en". Trying "class" as canonical role name.

System Message: ERROR/3 (<string>, line 172); backlink

Unknown interpreted text role "class".

book[sheet_index][row, column]

or:

book["sheet_name"][row, column]

Suppose you have the following sheets:

System Message: INFO/1 (<string>, line 182)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 182)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:Sheet 1---
   1,2,3
   4,5,6
   7,8,9
   ---pyexcel---
   ---pyexcel:Sheet 2---
   X,Y,Z
   1,2,3
   4,5,6
   ---pyexcel---
   ---pyexcel:Sheet 3---
   O,P,Q
   3,2,1
   4,3,2

System Message: INFO/1 (<string>, line 199)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 199)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> data = {
   ...      'Sheet 1':
   ...          [
   ...              [1.0, 2.0, 3.0],
   ...              [4.0, 5.0, 6.0],
   ...              [7.0, 8.0, 9.0]
   ...          ],
   ...      'Sheet 2':
   ...          [
   ...              ['X', 'Y', 'Z'],
   ...              [1.0, 2.0, 3.0],
   ...              [4.0, 5.0, 6.0]
   ...          ],
   ...      'Sheet 3':
   ...          [
   ...              ['O', 'P', 'Q'],
   ...              [3.0, 2.0, 1.0],
   ...              [4.0, 3.0, 2.0]
   ...          ]
   ...  }
   >>> book = pyexcel.Book(data)
   >>> book.save_as("example.xls")

And you can randomly access a cell in a sheet:

>>> book = pyexcel.get_book(file_name="example.xls")
>>> print(book["Sheet 1"][0,0])
1
>>> print(book[0][0,0]) # the same cell
1

Tip

With pyexcel, you can regard single sheet reader as an two dimensional array and multi-sheet excel book reader as a ordered dictionary of two dimensional arrays.

Reading a single sheet excel file

Suppose you have a csv, xls, xlsx file as the following:

System Message: INFO/1 (<string>, line 242)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 242)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:example data---
   1,2,3
   4,5,6
   7,8,9

System Message: INFO/1 (<string>, line 249)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 249)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
   >>> s = pyexcel.Sheet(data)
   >>> s.save_as("example.xls")

The following code will give you the data in json:

>>> import json
>>> # "example.csv","example.xlsx","example.xlsm"
>>> sheet = pyexcel.get_sheet(file_name="example.xls")
>>> print(json.dumps(sheet.to_array()))
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Read the sheet as a dictionary

Suppose you have a csv, xls, xlsx file as the following:

System Message: INFO/1 (<string>, line 269)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 269)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:data with columns---
   Column 1,Column 2,Column 3
   1,4,7
   2,5,8
   3,6,9

System Message: INFO/1 (<string>, line 277)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 277)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> data = [
   ...      ["Column 1", "Column 2", "Column 3"],
   ...      [1, 2, 3],
   ...      [4, 5, 6],
   ...      [7, 8, 9]
   ...  ]
   >>> s = pyexcel.Sheet(data)
   >>> s.save_as("example_series.xls")

The following code will give you data series in a dictionary:

System Message: INFO/1 (<string>, line 292)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 292)

Unknown directive type "testcode".

.. testcode::

   >>> # "example.xls","example.xlsx","example.xlsm"
   >>> sheet = pyexcel.get_sheet(file_name="example_series.xls", name_columns_by_row=0)

System Message: INFO/1 (<string>, line 297)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 297)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> sheet.colnames = [ str(name) for name in sheet.colnames]

System Message: INFO/1 (<string>, line 302)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 302)

Unknown directive type "testcode".

.. testcode::

    >>> sheet.to_dict()
    OrderedDict([('Column 1', [1, 4, 7]), ('Column 2', [2, 5, 8]), ('Column 3', [3, 6, 9])])

Can I get an array of dictionaries per each row?

Suppose you have the following data:

System Message: INFO/1 (<string>, line 312)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 312)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:data with columns---
   X,Y,Z
   1,2,3
   4,5,6
   7,8,9

System Message: INFO/1 (<string>, line 320)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 320)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> data = [['X', 'Y', 'Z'], [1, 2, 3],[4, 5, 6],[7, 8, 9]]
   >>> s = pyexcel.Sheet(data)
   >>> s.save_as("example.xls")

The following code will produce what you want:

>>> # "example.csv","example.xlsx","example.xlsm"
>>> sheet = pyexcel.get_sheet(file_name="example.xls", name_columns_by_row=0)
>>> records = sheet.to_records()
>>> for record in records:
...     keys = sorted(record.keys())
...     print("{")
...     for key in keys:
...         print("'%s':%d" % (key, record[key]))
...     print("}")
{
'X':1
'Y':2
'Z':3
}
{
'X':4
'Y':5
'Z':6
}
{
'X':7
'Y':8
'Z':9
}

Writing a single sheet excel file

Suppose you have an array as the following:

1	2	3
4	5	6
7	8	9

The following code will write it as an excel file of your choice:

.. testcode::

>>> array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> # "output.xls" "output.xlsx" "output.ods" "output.xlsm"
>>> sheet = pyexcel.Sheet(array)
>>> sheet.save_as("output.csv")

Suppose you have a dictionary as the following:

System Message: INFO/1 (<string>, line 380)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 380)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:data with columns---
   Column 1,Column 2,Column 3
   1,4,7
   2,5,8
   3,6,9

The following code will write it as an excel file of your choice:

>>> example_dict = {"Column 1": [1, 2, 3], "Column 2": [4, 5, 6], "Column 3": [7, 8, 9]}
>>> # "output.xls" "output.xlsx" "output.ods" "output.xlsm"
>>> sheet = pyexcel.get_sheet(adict=example_dict)
>>> sheet.save_as("output.csv")

Write multiple sheet excel file

Suppose you have previous data as a dictionary and you want to save it as multiple sheet excel file:

>>> content = {
...     'Sheet 1':
...         [
...             [1.0, 2.0, 3.0],
...             [4.0, 5.0, 6.0],
...             [7.0, 8.0, 9.0]
...         ],
...     'Sheet 2':
...         [
...             ['X', 'Y', 'Z'],
...             [1.0, 2.0, 3.0],
...             [4.0, 5.0, 6.0]
...         ],
...     'Sheet 3':
...         [
...             ['O', 'P', 'Q'],
...             [3.0, 2.0, 1.0],
...             [4.0, 3.0, 2.0]
...         ]
... }
>>> book = pyexcel.get_book(bookdict=content)
>>> book.save_as("output.xls")

You shall get a xls file

Read multiple sheet excel file

Let's read the previous file back:

>>> book = pyexcel.get_book(file_name="output.xls")
>>> sheets = book.to_dict()
>>> for name in sheets.keys():
...     print(name)
Sheet 1
Sheet 2
Sheet 3

Work with data series in a single sheet

Suppose you have the following data in any of the supported excel formats again:

System Message: INFO/1 (<string>, line 446)

No directive entry for "pyexcel-table" in module "docutils.parsers.rst.languages.en". Trying "pyexcel-table" as canonical directive name.

System Message: ERROR/3 (<string>, line 446)

Unknown directive type "pyexcel-table".

.. pyexcel-table::

   ---pyexcel:data with columns---
   Column 1,Column 2,Column 3
   1,4,7
   2,5,8
   3,6,9

System Message: INFO/1 (<string>, line 454)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 454)

Unknown directive type "testcode".

.. testcode::

   >>> sheet = pyexcel.get_sheet(file_name="example_series.xls", name_columns_by_row=0)

System Message: INFO/1 (<string>, line 458)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 458)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> sheet.colnames = [ str(name) for name in sheet.colnames]

Play with data

You can get headers:

>>> print(list(sheet.colnames))
['Column 1', 'Column 2', 'Column 3']

You can use a utility function to get all in a dictionary:

>>> sheet.to_dict()
OrderedDict([('Column 1', [1, 4, 7]), ('Column 2', [2, 5, 8]), ('Column 3', [3, 6, 9])])

Maybe you want to get only the data without the column headers. You can call :meth:`~pyexcel.Sheet.rows()` instead:

System Message: INFO/1 (<string>, line 477)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 477); backlink

Unknown interpreted text role "meth".

>>> list(sheet.rows())
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

You can get data from the bottom to the top one by calling :meth:`~pyexcel.Sheet.rrows()` instead:

System Message: INFO/1 (<string>, line 482)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 482); backlink

Unknown interpreted text role "meth".

>>> list(sheet.rrows())
[[7, 8, 9], [4, 5, 6], [1, 2, 3]]

You might want the data arranged vertically. You can call :meth:`~pyexcel.Sheet.columns()` instead:

System Message: INFO/1 (<string>, line 487)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 487); backlink

Unknown interpreted text role "meth".

>>> list(sheet.columns())
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]

You can get columns in reverse sequence as well by calling :meth:`~pyexcel.Sheet.rcolumns()` instead:

System Message: INFO/1 (<string>, line 492)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 492); backlink

Unknown interpreted text role "meth".

>>> list(sheet.rcolumns())
[[3, 6, 9], [2, 5, 8], [1, 4, 7]]

Do you want to flatten the data? You can get the content in one dimensional array. If you are interested in playing with one dimensional enumeration, you can check out these functions :meth:`~pyexcel.Sheet.enumerate`, :meth:`~pyexcel.Sheet.reverse`, :meth:`~pyexcel.Sheet.vertical`, and :meth:`~pyexcel.Sheet.rvertical()`:

System Message: INFO/1 (<string>, line 497)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 497); backlink

Unknown interpreted text role "meth".

System Message: INFO/1 (<string>, line 497)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 497); backlink

Unknown interpreted text role "meth".

System Message: INFO/1 (<string>, line 497)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 497); backlink

Unknown interpreted text role "meth".

System Message: INFO/1 (<string>, line 497)

No role entry for "meth" in module "docutils.parsers.rst.languages.en". Trying "meth" as canonical role name.

System Message: ERROR/3 (<string>, line 497); backlink

Unknown interpreted text role "meth".

>>> list(sheet.enumerate())
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(sheet.reverse())
[9, 8, 7, 6, 5, 4, 3, 2, 1]
>>> list(sheet.vertical())
[1, 4, 7, 2, 5, 8, 3, 6, 9]
>>> list(sheet.rvertical())
[9, 6, 3, 8, 5, 2, 7, 4, 1]

System Message: INFO/1 (<string>, line 510)

No directive entry for "testcode" in module "docutils.parsers.rst.languages.en". Trying "testcode" as canonical directive name.

System Message: ERROR/3 (<string>, line 510)

Unknown directive type "testcode".

.. testcode::
   :hide:

   >>> import os
   >>> os.unlink("output.xls")
   >>> os.unlink("output.csv")
   >>> os.unlink("example.xls")
   >>> os.unlink("example_series.xls")
   >>> os.unlink("tutorial.csv")

Docutils System Messages

System Message: INFO/1 (<string>, line 46)

Hyperlink target "access-to-cell" is not referenced.