Python Pandas DataFrames Made Simple
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with rows and columns. It has three main components: the data, the rows, and the columns.
Columns and Rows :
Columns : Represent different variables or features. Each column has a name (column label).
Rows : Contain the actual data entries. Each row is identified by an index label (by default, integers starting at 0).
Example
Create a simple Pandas DataFrame :
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# Load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Result :
   calories  duration
0       420        50
1       380        40
2       390        45
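To connect the result back to the three components described earlier, here is a minimal sketch that inspects the column labels, the row index, and the overall shape of the same DataFrame. The variable names column_labels, row_labels, n_rows, and n_cols are illustrative only.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

# Column labels come from the dictionary keys.
column_labels = list(df.columns)

# No index was passed, so pandas assigns a default integer index.
row_labels = list(df.index)

# shape is a tuple: (number of rows, number of columns).
n_rows, n_cols = df.shape

print(column_labels)
print(row_labels)
print(n_rows, n_cols)
```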
Locate Row :
Pandas uses the loc attribute to return one or more specified rows by index label.
# Return the row with index label 0:
print(df.loc[0])
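The "one or more" part can be sketched as follows, assuming the same example data. Passing a single label to loc gives back a Series, while passing a list of labels gives back a DataFrame. The names first_row and first_two are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

# A single label returns one row as a Series.
first_row = df.loc[0]

# A list of labels returns the matching rows as a DataFrame.
first_two = df.loc[[0, 1]]

print(type(first_row).__name__)
print(first_two)
```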
Column Selection :
# Return the 'calories' column
print(df['calories'])
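As a small extension of the column selection, the sketch below shows the difference between selecting with a single label and selecting with a list of labels, again assuming the same example data. The names calories and duration_only are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

# A single column label returns a Series.
calories = df["calories"]

# A list of labels (note the double brackets) returns a DataFrame,
# even when the list holds only one column.
duration_only = df[["duration"]]

print(type(calories).__name__)
print(type(duration_only).__name__)
```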
Manipulating Data :
· Adding a Column:
df['Salary'] = [60000, 80000, 75000]
· Filtering Data:
# Return rows where 'calories' is greater than 30
print(df[df['calories'] > 30])
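The two operations above can be sketched together on the example data. Here the new column is computed from existing columns instead of being typed in by hand, and the filter uses a threshold of 385 (chosen for illustration so that one row is actually excluded). The column name calories_per_minute and the variable names are assumptions, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

# Add a column derived from two existing columns (hypothetical name).
df["calories_per_minute"] = df["calories"] / df["duration"]

# Keep only the rows where 'calories' is greater than 385.
high_calories = df[df["calories"] > 385]

# Combine two conditions with & and wrap each one in parentheses.
high_and_short = df[(df["calories"] > 385) & (df["duration"] < 50)]

print(high_calories)
print(high_and_short)
```

Filtering returns a new DataFrame and leaves df unchanged, so the result needs to be assigned to a name if it will be used later.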