
Description

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Snippets

Standard import

   import numpy as np
   import pandas as pd

Read CSV file

   path = '/path/to/file(s)/'  # trailing slash so the file name can be appended
   df = pd.read_csv(path + 'name.csv', sep=';', header=0, dtype={'Force String Column': str},
                    parse_dates=['Date Column'], encoding='utf-8')

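|| '''Variable''' || '''Meaning''' ||
|| sep || separator string; a value longer than one character is treated as a regular expression ||
|| header || row number (0-based) that contains the column titles; pass header=None if the file has no header row ||
|| dtype || dict that forces the named columns to be interpreted as a specific [[https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html#basics-dtypes|datatype]] ||
|| parse_dates || list of columns to parse as datetime ||
|| encoding || file encoding, e.g. utf-8 or cp1252 (often called ANSI on Windows) ||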
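To try these parameters without a file on disk, the sketch below reads the same kind of data from an in-memory string. The sample rows and the Value column are made up for illustration; encoding is left out because the input is already a decoded string.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = (
    "Force String Column;Date Column;Value\n"
    "007;2020-04-08;1.5\n"
    "042;2020-04-09;2.5\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),                # file-like object instead of a path
    sep=';',                              # semicolon-separated values
    header=0,                             # first row holds the column titles
    dtype={'Force String Column': str},   # keep leading zeros as text
    parse_dates=['Date Column'],          # convert this column to datetime
)

print(df.dtypes)
print(df['Force String Column'].tolist())
```

Without the dtype entry, pandas would most likely read the first column as integers and drop the leading zeros.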

Renaming Columns

   df.rename(columns={'File column name 1': 'Column1', 'File column name 2': '2'}, inplace=True)
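As an alternative to inplace=True, rename can also return a new DataFrame and leave the original untouched. A minimal sketch, using a small made-up frame with the same column names:

```python
import pandas as pd

# Hypothetical frame using the column names from the snippet above.
df = pd.DataFrame({'File column name 1': [1, 2], 'File column name 2': [3, 4]})

# Without inplace=True, rename returns a renamed copy.
renamed = df.rename(columns={'File column name 1': 'Column1', 'File column name 2': '2'})

print(list(renamed.columns))
print(list(df.columns))  # the original frame keeps its old names
```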

Do Y for each value in column X

   # Y is a placeholder for an attribute or method of each value in column X
   df['X'] = df['X'].apply(lambda x: x.Y)
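A concrete instance of this pattern, assuming column X holds strings and Y is the string method upper(). The frame and the extra column N are made up for illustration:

```python
import pandas as pd

# Hypothetical frame; 'X' is the column to transform.
df = pd.DataFrame({'X': ['a', 'b', 'c'], 'N': [1, 2, 3]})

# Series.apply calls the lambda once per value in the column.
df['X'] = df['X'].apply(lambda value: value.upper())

# For string columns, the vectorized .str accessor does the same kind of job.
vectorized = df['X'].str.lower()

print(df['X'].tolist())
print(vectorized.tolist())
```

Selecting the column as df['X'] yields a Series, so the function receives one value at a time. Selecting it as df[['X']] yields a DataFrame, and apply would then pass the whole column to the function instead.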


Howto/Python3/pandas (last edited 2020-04-08 17:13:18 by Burathar)