Python Pandas DataFrames Made Simple
A Pandas DataFrame is a two-dimensional data structure, similar to a two-dimensional array or a table with rows and columns. It has three main components: the data, the rows, and the columns.
Columns: Represent different variables or features. Each column has a name (the column label).
Rows: Contain the actual data entries. Each row is identified by an index label; by default, pandas numbers the rows starting from 0.
Create a simple Pandas DataFrame:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
# Load the data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
   calories  duration
0       420        50
1       380        40
2       390        45
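The three components described above can be inspected directly on the DataFrame. The following is a minimal sketch using the same sample data; the variable names column_labels, row_labels, n_rows, and n_cols are chosen here for illustration only.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

# Column labels come from the dictionary keys.
column_labels = list(df.columns)

# With no index supplied, pandas assigns a default integer index starting at 0.
row_labels = list(df.index)

# shape is a tuple: (number of rows, number of columns).
n_rows, n_cols = df.shape

print(column_labels)   # ['calories', 'duration']
print(row_labels)      # [0, 1, 2]
print(n_rows, n_cols)  # 3 2
```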
Pandas uses the loc attribute to return one or more rows by index label:
# Return the row with index label 0:
print(df.loc[0])
# Return the 'calories' column:
df['calories']
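The loc attribute also accepts a list of labels when more than one row is needed. The sketch below, which rebuilds the same sample data so that it runs on its own, shows what kind of object each selection returns; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

# A single label returns one row as a Series.
first_row = df.loc[0]

# A list of labels returns the matching rows as a DataFrame.
first_two = df.loc[[0, 1]]

# A single column name returns that column as a Series.
calories = df["calories"]

print(type(first_row).__name__)  # Series
print(first_two.shape)           # (2, 2)
print(calories.tolist())         # [420, 380, 390]
```

Passing a list to loc, even a list with one label, keeps the result as a DataFrame, which can be handy when later code expects a table rather than a single row.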
Manipulating Data:
· Adding a Column:
# The list must contain one value per row
df['Salary'] = [60000, 80000, 75000]
· Filtering Data:
# Return rows where 'calories' is greater than 30
df[df['calories'] > 30]
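Both operations can be tried on the sample data. Note that with these values a threshold of 30 matches every row, so the sketch below uses 385 instead, an arbitrary value picked only to make the filter visible. The column name calories_per_minute is also hypothetical, added here to show a column computed from existing ones.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

# Derive a new column from two existing columns (hypothetical column name).
df["calories_per_minute"] = df["calories"] / df["duration"]

# The comparison produces a boolean Series with one value per row.
mask = df["calories"] > 385

# Indexing with the mask keeps only the rows where the mask is True.
high_calorie = df[mask]

print(mask.tolist())                # [True, False, True]
print(high_calorie.index.tolist())  # [0, 2]
```

The filtered result keeps the original index labels (0 and 2 here) and does not modify df itself.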