Python Basics for Machine Learning Using Pandas - 1

# Importing pandas library
import pandas as pd
# 'as pd' gives the pandas library a short alias. 'pd' is the usual convention, but any valid name works.

Reading Data

titanic = pd.read_csv(r"C:\Users\Name\Desktop\codenotes\ML\titanicdataset\titanic_train.csv")
# the 'r' prefix makes the path a raw string, so each '\' is kept literally instead of being read as an escape character.
# alternatively, you can replace each '\' with '/' or '\\'.
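To try the path options without the Titanic file, here is a minimal sketch. The file name sample.csv and its two rows are made up for illustration, and the file is written to a temporary folder so the code runs on any machine.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical two-row file standing in for titanic_train.csv.
tmp_dir = Path(tempfile.mkdtemp())
csv_path = tmp_dir / "sample.csv"
csv_path.write_text("PassengerId,Name,Age\n1,Alice,29\n2,Bob,\n")

# read_csv accepts a pathlib.Path as well as a plain string.
sample = pd.read_csv(csv_path)
same_sample = pd.read_csv(str(csv_path))

# A raw string and an escaped string spell the same Windows path.
raw = r"C:\data\file.csv"
escaped = "C:\\data\\file.csv"

print(raw == escaped)  # True
print(sample.shape)    # (2, 3)
```

Using pathlib is one way to avoid thinking about backslashes at all, since it builds the path with the right separator for the operating system.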

Creating a copy of the data.

df = titanic.copy()
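Why copy at all? A plain assignment only gives the same object a second name, while copy() creates an independent DataFrame. A small sketch with made-up data (the names titanic_demo, alias and df_demo are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data.
titanic_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [29, 41]})

alias = titanic_demo           # a second name for the same object
df_demo = titanic_demo.copy()  # an independent copy

# Change the working copy only.
df_demo.loc[0, "Age"] = 30

print(titanic_demo.loc[0, "Age"])  # 29, the original is untouched
print(df_demo.loc[0, "Age"])       # 30
print(alias is titanic_demo)       # True
```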

Dimension of the dataset:

df.shape
#the first value is the number of rows and the second one is the number of columns.
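Note that shape is an attribute, not a method, so it is written without parentheses. Because it is a tuple, it can be unpacked directly. A sketch on a made-up frame with 4 rows and 3 columns:

```python
import pandas as pd

# Hypothetical data: 4 rows, 3 columns.
df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara", "Dan"],
    "Age": [29, 41, 35, 52],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

n_rows, n_cols = df_demo.shape  # unpack the (rows, columns) tuple

print(df_demo.shape)  # (4, 3)
print(n_rows)         # 4
print(n_cols)         # 3
```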

Understanding the dataset

df.describe()
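When a frame mixes numeric and text columns, describe() summarises only the numeric ones by default; passing include="all" adds the rest. A minimal sketch on made-up data:

```python
import pandas as pd

# Hypothetical data with one text column and two numeric columns.
df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [20, 30, 40],
    "Fare": [7.0, 8.0, 9.0],
})

summary = df_demo.describe()                    # numeric columns only
full_summary = df_demo.describe(include="all")  # every column

print(list(summary.columns))       # ['Age', 'Fare']
print(summary.loc["mean", "Age"])  # 30.0
print("Name" in full_summary.columns)  # True
```

The rows of the summary (count, mean, std, min, the quartiles, max) can be read with loc like any other DataFrame.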

Viewing the data frame

df.head()
df.tail()
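Both head() and tail() return 5 rows by default and accept the number of rows as an argument. A quick sketch on a made-up ten-row frame:

```python
import pandas as pd

# Hypothetical frame with ten rows numbered 0 to 9.
df_demo = pd.DataFrame({"n": range(10)})

first_five = df_demo.head()   # default: first 5 rows
first_three = df_demo.head(3)
last_two = df_demo.tail(2)

print(len(first_five))         # 5
print(list(first_three["n"]))  # [0, 1, 2]
print(list(last_two["n"]))     # [8, 9]
```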

Select a column to view.

1 --> df.Name
2 --> df['Name']
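The two styles return the same Series, but only the bracket style works when a column name contains a space, and a list of names inside the brackets selects several columns at once. A sketch with made-up data (the column "Ticket No" is hypothetical):

```python
import pandas as pd

# Hypothetical data; "Ticket No" contains a space on purpose.
df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [29, 41],
    "Ticket No": ["A1", "B2"],
})

by_attribute = df_demo.Name             # a Series
by_bracket = df_demo["Name"]            # the same Series
two_columns = df_demo[["Name", "Age"]]  # a list of names gives a DataFrame
ticket = df_demo["Ticket No"]           # not reachable with dot notation

print(by_attribute.equals(by_bracket))  # True
print(list(two_columns.columns))        # ['Name', 'Age']
print(list(ticket))                     # ['A1', 'B2']
```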

Finding null values

df.isnull().sum() 
#isnull() returns a boolean DataFrame: 'True' where a value is null, 'False' where it is not.
#sum() then counts the 'True' values, giving the number of nulls in each column.
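Here is the same pattern on a small made-up frame, so the counts are easy to check by eye. Calling sum() a second time adds the per-column counts into one total.

```python
import pandas as pd

# Hypothetical data with three missing values in total.
df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [22.0, None, 35.0],
    "Cabin": ["C85", None, None],
})

missing_per_column = df_demo.isnull().sum()  # one count per column
total_missing = missing_per_column.sum()     # a single number

print(missing_per_column["Name"])   # 0
print(missing_per_column["Age"])    # 1
print(missing_per_column["Cabin"])  # 2
print(total_missing)                # 3
```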

Accessing values in dataframe

  • iloc[] indexer (selects by integer position).
  • loc[] indexer (selects by label).

iloc[] indexer

iloc[] selects by integer position and is used with square brackets. It accepts:

  • An integer, e.g. 5.
  • A list or array of integers, e.g. [4, 3, 0].
  • A slice object with ints, e.g. 1:7.
  • A boolean array.
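df.iloc[1:4,2:4]
#it selects the rows at positions 1 to 3 (3 rows) and the columns at positions 2 and 3 (2 columns). The end of an iloc slice is excluded.
df.iloc[1:4]
#it selects the rows at positions 1 to 3 (3 rows) and all the columns.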
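To see that iloc works on positions and ignores the index labels, here is a sketch on a made-up frame whose index is deliberately labelled 10 to 50:

```python
import pandas as pd

# Hypothetical 5x4 frame; the index labels are not 0..4 on purpose.
df_demo = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [6, 7, 8, 9, 10],
        "C": [11, 12, 13, 14, 15],
        "D": [16, 17, 18, 19, 20],
    },
    index=[10, 20, 30, 40, 50],
)

block = df_demo.iloc[1:4, 2:4]   # rows at positions 1-3, columns at positions 2-3
rows_only = df_demo.iloc[1:4]    # rows at positions 1-3, every column
picked = df_demo.iloc[[4, 3, 0]] # a list of positions, in the order given

print(block.shape)          # (3, 2)
print(list(block.index))    # [20, 30, 40]
print(list(block.columns))  # ['C', 'D']
print(rows_only.shape)      # (3, 4)
print(list(picked.index))   # [50, 40, 10]
```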

loc[] indexer

loc[] selects by label and is used with square brackets. It accepts:

  • A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
  • A list or array of labels, e.g. ['a', 'b', 'c'].
  • A slice object with labels, e.g. 'a':'f'.
  • A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
df.loc[1:4]
# it selects the 4 rows with the labels from 1 to 4. Unlike iloc, a loc slice includes both endpoints.
df.loc[1:4,'Name':'Ticket']
# it selects the 4 rows with the labels from 1 to 4 and all the columns from 'Name' to 'Ticket', both included.
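The difference from iloc shows up when the same slice is used with both. A sketch on made-up data that borrows a few of the Titanic column names; the values are invented:

```python
import pandas as pd

# Hypothetical six-row frame with the default index 0..5.
df_demo = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Name": ["Alice", "Bob", "Cara", "Dan", "Eve", "Finn"],
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Ticket": ["A1", "B2", "C3", "D4", "E5", "F6"],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 8.46],
})

by_label = df_demo.loc[1:4]              # labels 1, 2, 3, 4 (end included)
by_position = df_demo.iloc[1:4]          # positions 1, 2, 3 (end excluded)
sub = df_demo.loc[1:4, "Name":"Ticket"]  # Name, Sex, Ticket
expensive = df_demo.loc[df_demo["Fare"] > 50]  # boolean mask

print(len(by_label))          # 4
print(len(by_position))       # 3
print(sub.shape)              # (4, 3)
print(list(expensive.index))  # [1, 3]
```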

Deletion of data

df.drop(0)
#this returns a new dataframe without the row whose index label is 0.
#this won't change the original dataframe, since inplace is 'False' by default. This means if we call 'df' again,
#there will be no changes done to it.
df.drop(0,inplace = True)
#inplace = True modifies 'df' itself, so the row is removed permanently.
df
df.drop('PassengerId',axis = 1)
#axis = 1 drops a column instead of a row. Without inplace = True this also returns a new dataframe and leaves 'df' unchanged.
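A common alternative to inplace is to assign the result of drop() back to a variable. The sketch below uses made-up data and also shows the columns= keyword, which does the same job as axis = 1.

```python
import pandas as pd

# Hypothetical three-row frame.
df_demo = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Name": ["Alice", "Bob", "Cara"],
    "Age": [29, 41, 35],
})

without_row = df_demo.drop(0)  # new frame; df_demo still has 3 rows
print(df_demo.shape)      # (3, 3)
print(without_row.shape)  # (2, 3)

# columns= is equivalent to passing the label with axis=1.
without_id = df_demo.drop(columns="PassengerId")
print(list(without_id.columns))  # ['Name', 'Age']

# Reassigning keeps the change without using inplace.
df_demo = df_demo.drop(0)
print(list(df_demo.index))  # [1, 2]
```

Note that the remaining rows keep their original labels 1 and 2; drop() does not renumber the index.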

Anantha Kattani
