Importing and exporting datasets in Python.

What is a Dataset?

A dataset is a collection of data. In Python it serves as the input to various algorithms, and it is the primary form in which data is stored in data science. Below I discuss how to import a dataset as a DataFrame in Python using pandas. For this post I’ll be using a public dataset called the Titanic Dataset, available on Kaggle. You can download it from Titanic Dataset

  • Importing data from CSV and Excel files as a DataFrame.

Microsoft Excel can open several file types, including CSV files and its own workbook formats. Importing some of these files in Python is illustrated below.

  1. Importing a CSV file.
import pandas as pd
# Raw string (r'...') so the backslashes in the Windows path are not read as escape sequences
my_df = pd.read_csv(r'C:\Users\geekycodesco\Downloads\titanic\test.csv')

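  2. Importing an Excel (.xls or .xlsx) file.

import pandas as pd
# read_excel needs an Excel engine installed (xlrd for .xls, openpyxl for .xlsx)
my_def = pd.read_excel(r'C:\Users\geekycodesco\Downloads\titanic\test.xls')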
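
If the Titanic file is not at hand, the same read_csv call can be tried on an in-memory buffer. This is a minimal sketch, assuming a tiny made-up sample: the rows and the name sample_df are illustrative and are not taken from the Kaggle file.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = "PassengerId,Name,Age\n1,Alice,29\n2,Bob,\n3,Carol,41\n"

# read_csv accepts a file path or any file-like object.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)             # (rows, columns)
print(sample_df.columns.tolist())  # column names taken from the first line
print(sample_df["Age"].isna())     # the empty Age field is read as NaN
```

The empty field in the second row shows how pandas marks missing values as NaN on import.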
  • Displaying the dataset imported above.
my_def.head()

By default, head() returns the first 5 rows of the DataFrame. If you want to see the first 10 rows instead, pass the number of rows as an argument:

my_def.head(10)
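
The behaviour of head() can be checked on a small frame without loading any file. A minimal sketch, assuming a made-up 12-row DataFrame (demo_df is a hypothetical name used only here); tail() is shown as the counterpart that reads from the end.

```python
import pandas as pd

# Hypothetical 12-row frame, used only to illustrate head() and tail().
demo_df = pd.DataFrame({"row": range(1, 13)})

first_five = demo_df.head()    # no argument: first 5 rows
first_ten = demo_df.head(10)   # explicit argument: first 10 rows
last_three = demo_df.tail(3)   # tail() counts from the end instead

print(len(first_five), len(first_ten), len(last_three))
```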
  • Exporting a DataFrame as a CSV file in Python.

When you want to keep a DataFrame on your local drive for later use, you can save it as a CSV file, which Excel can also open.

my_def.to_csv('file_name.csv')  # saves to the current working directory
my_def.to_csv('C:/Users/abc/Desktop/file_name.csv')  # saves to a specific directory

If you want missing (NaN) values to be written as Unknown:

my_def.to_csv('file_name.csv', na_rep='Unknown')

Want to export the column headers or not? To leave them out:

my_def.to_csv('file_name.csv',header=False)
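
To see what these options actually write, to_csv can be called without a path, in which case it returns the CSV text as a string. A minimal sketch on made-up data: demo_df and its values are hypothetical, and index=False is an extra option not used above that leaves out the row index pandas writes by default.

```python
import pandas as pd

# Hypothetical two-row frame with one missing Age value.
demo_df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [29.0, None]})

# With no path given, to_csv returns the CSV content as a string.
with_unknown = demo_df.to_csv(na_rep="Unknown", index=False)
no_header = demo_df.to_csv(header=False, index=False)

print(with_unknown)  # header line, then rows; the NaN is written as Unknown
print(no_header)     # rows only; the NaN is written as an empty field
```

Printing the text first is a quick way to confirm the output before writing it to a file.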
