What is a Dataset?
A dataset is a collection of data. In Python it serves as the data source for various algorithms, and it is the primary form of data storage in data science. Below I will discuss how to import a dataset as a DataFrame in Python. For this post I'll be using a public dataset called the Titanic Dataset, available on Kaggle. You can download it from the Titanic Dataset page.
- Importing data from various Excel sheets as a DataFrame.
Microsoft Excel supports various file types. Importing some of them in Python is illustrated below.
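1. Importing datasets from a CSV file.
import pandas as pd
# the r prefix makes this a raw string, so the backslashes in the Windows path are kept as-is
my_df = pd.read_csv(r"C:\Users\geekycodesco\Downloads\titanic\test.csv")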
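2. Importing datasets from an Excel (xls or xlsx) file.
import pandas as pd
# the r prefix makes this a raw string, so the backslashes in the Windows path are kept as-is
my_def = pd.read_excel(r"C:\Users\geekycodesco\Downloads\titanic\test.xls")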
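The paths above only exist on one machine, so here is a minimal sketch of the CSV case that runs anywhere. It assumes pandas is installed. The three rows and the name sample_df are made up for illustration and are not taken from the Titanic dataset.

```python
import io

import pandas as pd

# Hypothetical sample rows, made up for illustration (not real Titanic data).
csv_text = """PassengerId,Name,Age
1,Passenger A,22
2,Passenger B,
3,Passenger C,35
"""

# read_csv accepts a file path or a file-like object,
# so StringIO stands in here for a file on disk.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)             # (number of rows, number of columns)
print(sample_df.columns.tolist())  # column names taken from the first line
```

The empty Age field in the second row is read as NaN, which is how pandas marks missing values.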
- Displaying the dataset imported in the code above.
By default, the head() method of a DataFrame returns the first 5 rows. If you want to see 10 rows instead, pass the number to head(), for example my_df.head(10).
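As a small sketch of this, assuming pandas is installed, the snippet below builds a hypothetical 12-row DataFrame (demo_df is an illustrative name, not part of the Titanic data) and compares the default head() with head(10).

```python
import pandas as pd

# Hypothetical 12-row frame used only to demonstrate head().
demo_df = pd.DataFrame({"row_id": range(1, 13)})

first_five = demo_df.head()    # no argument: the default of 5 rows
first_ten = demo_df.head(10)   # explicit row count

print(len(first_five), len(first_ten))
print(first_ten)
```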
- Exporting a DataFrame as a spreadsheet file in Python.
When you want to save a DataFrame to your local drive for later use, you can write it to a spreadsheet file such as a CSV, which Excel can open.
my_def.to_csv('file_name.csv')  # saves to the current working directory
my_def.to_csv('C:/Users/abc/Desktop/file_name.csv')  # saves to a directory of your choice
If you want to save NaN values as Unknown, pass na_rep='Unknown' to to_csv.
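One way to do this is the na_rep argument of to_csv. The sketch below assumes pandas is installed and uses a made-up two-row DataFrame. Calling to_csv without a path returns the CSV text as a string, which makes the result easy to inspect.

```python
import pandas as pd

# Hypothetical rows for illustration; the second Age is missing (NaN).
demo_df = pd.DataFrame(
    {"Name": ["Passenger A", "Passenger B"], "Age": [22.0, float("nan")]}
)

# na_rep sets the text written in place of missing values.
# With no path given, to_csv returns the CSV content as a string.
csv_text = demo_df.to_csv(index=False, na_rep="Unknown")
print(csv_text)
```

Passing a file name as the first argument writes the same content to disk instead.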
Want to write the column headers or not? The header argument of to_csv controls this.
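I am reading this question as being about writing the header row on export, which is an assumption. Under that reading, a minimal sketch with a made-up DataFrame looks like this:

```python
import pandas as pd

# Hypothetical rows for illustration.
demo_df = pd.DataFrame({"Name": ["Passenger A", "Passenger B"], "Age": [22, 35]})

# header=True is the default, so the column names form the first line.
with_header = demo_df.to_csv(index=False)

# header=False writes only the data rows.
without_header = demo_df.to_csv(index=False, header=False)

print(with_header)
print(without_header)
```

If the question is instead about reading a file that has no header row, read_csv takes header=None for that case.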