Differences between revisions 2 and 5 (spanning 3 versions)
Revision 2 as of 2020-04-08 15:53:19
Size: 641
Editor: Burathar
Comment:
Revision 5 as of 2020-04-08 16:30:39
Size: 1511
Editor: Burathar
Comment:

Description

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Snippets

Standard import

   import numpy as np
   import pandas as pd

Read CSV file

   path = '/path/to/file(s)/'  # directory containing the file; note the trailing slash
   df = pd.read_csv(path + 'name.csv', sep=';', header=0, dtype={'Force String Column': str},
                    parse_dates=['Date Column'], encoding='utf-8')

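|| '''Variable''' || '''Meaning''' ||
|| sep || separator string; a separator longer than one character is treated as a regular expression ||
|| header || row number that contains the column titles; use header=None if the file has no title row ||
|| dtype || force any column to be interpreted as a specific [[https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html#basics-dtypes|datatype]] ||
|| parse_dates || list of columns to parse as datetime ||
|| encoding || file encoding, e.g. utf-8 or cp1252 (often called "ANSI" on Windows) ||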
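
To try these parameters without a file on disk, the following is a minimal sketch that reads from an in-memory buffer. The sample rows and the `Value` column are made up for illustration; only the parameter names come from the snippet above.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'name.csv' (not from the original page)
raw = (
    "Force String Column;Date Column;Value\n"
    "007;2020-04-08;1.5\n"
    "042;2020-04-09;2.5\n"
).encode("utf-8")

df = pd.read_csv(
    io.BytesIO(raw),                     # any path or file-like object works here
    sep=";",                             # fields are separated by semicolons
    header=0,                            # first row holds the column titles
    dtype={"Force String Column": str},  # keep leading zeros instead of parsing ints
    parse_dates=["Date Column"],         # convert this column to datetime
    encoding="utf-8",
)

print(df.dtypes)
print(df["Force String Column"].tolist())

# A file without a title row: header=None makes pandas number the columns 0, 1, ...
no_header = pd.read_csv(io.BytesIO(b"1;2\n3;4\n"), sep=";", header=None)
print(list(no_header.columns))
```

Without the `dtype` entry, pandas would most likely read `007` as the integer 7, which is why forcing `str` matters for ID-like columns.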

Renaming Columns

   df.rename(columns={'File column name 1': 'Column1', 'File column name 2': '2'}, inplace=True)
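
As an alternative to `inplace=True`, `rename` can also return a new DataFrame and leave the original untouched. The sketch below uses a small made-up frame with the same column names as the snippet above.

```python
import pandas as pd

# Hypothetical two-column frame, only to demonstrate the renaming
df = pd.DataFrame({"File column name 1": [1, 2], "File column name 2": [3, 4]})

# Without inplace=True, rename returns a renamed copy
renamed = df.rename(columns={"File column name 1": "Column1",
                             "File column name 2": "2"})

print(list(renamed.columns))
print(list(df.columns))  # the original column names are still in place
```

Assigning the result to a new name keeps both versions available, which can help when the original file column names are still needed later.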

Do Y for each value in column X

   # Y is a placeholder for any function that takes one value and returns the new value
   df['X'] = df['X'].apply(Y)
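
A concrete sketch of the pattern, assuming "Y" is a plain function of one value. The function `double` and the sample numbers are made up for illustration. For simple arithmetic, a vectorized expression gives the same result and is usually faster than `apply`.

```python
import pandas as pd


def double(value):
    """Hypothetical stand-in for Y: any function of a single value."""
    return value * 2


df = pd.DataFrame({"X": [1, 2, 3]})

# Call double() once per value in column X
applied = df["X"].apply(double)

# Same result using a vectorized operation on the whole column
vectorized = df["X"] * 2

df["X"] = applied
print(df["X"].tolist())
```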

Howto/Python3/pandas (last edited 2020-04-08 17:13:18 by Burathar)