September 25, 2021

Working With Data, JSON, Pandas DataFrame with Python – Useful Tips

In this tutorial, we will cover how to do the following:

  1. Import JSON Data to Python
  2. Building a Pandas Dataframe
  3. Adding Rows to Dataframe
  4. Displaying a Formatted Table
  5. Using Dataframe iloc[]

 

1. Import JSON Data to Python

Let’s first get a sample JSON file to work with. You can get the same JSON file from https://jsonplaceholder.typicode.com/. I have downloaded a JSON file to my local system. I assume you already have Jupyter Notebook installed, so the first thing you need to do is start a new notebook.

To import a JSON file, you need the json module and its load function. The code is shown below:

# How to import json in Python
import json

with open('/Users/kindsonmunonye/todos.json') as f:
  data = json.load(f)

The data is imported and saved in a variable called data. You can use print(data) to display it.
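As a self-contained sketch of the same steps, the snippet below first writes a small sample file so that it runs without downloading anything. The file name sample_todos.json and the two records are made up for illustration; they only mimic the shape of the todos data on jsonplaceholder.

```python
import json
import os
import tempfile

# Hypothetical sample records, shaped like the todos on jsonplaceholder
sample = [
    {"userId": 1, "id": 1, "title": "first task", "completed": False},
    {"userId": 1, "id": 2, "title": "second task", "completed": True},
]

# Write the sample to a temporary file so there is something to load
path = os.path.join(tempfile.mkdtemp(), "sample_todos.json")
with open(path, "w") as f:
    json.dump(sample, f)

# Load it back: json.load returns plain Python lists and dicts
with open(path) as f:
    data = json.load(f)

print(type(data))        # the top-level JSON array becomes a list
print(data[0]["title"])  # each JSON object becomes a dict
```

Because data is an ordinary Python list here, you can index it, loop over it, or slice it like any other list.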

 

2. Building a Pandas Dataframe

If you’ll be working with data in Python, then you’ll most likely need to have the data in a Pandas DataFrame. This is basically a tabular representation of the data with rows and columns. So let’s assume you want to create the following dataframe in Python:

Name      Age   Height
Kindson   39    185
Jadon     30    170
Solace    14    155

The code below creates the DataFrame

import pandas as pd
df = pd.DataFrame(columns=["name", "age", "height"])

But right now the dataframe is empty!
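To confirm that the dataframe really is empty, you can check its empty and shape attributes. This is just a quick sketch using standard pandas attributes:

```python
import pandas as pd

df = pd.DataFrame(columns=["name", "age", "height"])

print(df.empty)          # True, since there are no rows yet
print(df.shape)          # (0, 3): zero rows, three columns
print(list(df.columns))  # ['name', 'age', 'height']
```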

 

3. Adding Rows to Pandas Dataframe

Adding Rows to Dataframe using loc[]

To add rows, you use the loc property of the dataframe. The code below populates the dataframe

df.loc[0] = ["Kindson", 39, 185]
df.loc[1] = ["Jadon", 30, 170]
df.loc[2] = ["Solace", 14, 155]

Now if you display the dataframe, you will have:

Figure: Displaying a dataframe
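If you do not want to hard-code the row label each time, one common pattern is to use len(df) as the next label. Here is a minimal sketch using the values from the table in section 2. It assumes the dataframe keeps the default 0, 1, 2, ... labels; with any other index, len(df) might overwrite an existing row.

```python
import pandas as pd

df = pd.DataFrame(columns=["name", "age", "height"])

rows = [
    ["Kindson", 39, 185],
    ["Jadon", 30, 170],
    ["Solace", 14, 155],
]

# len(df) is the next unused label as long as the labels are 0, 1, 2, ...
for row in rows:
    df.loc[len(df)] = row

print(df)
```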

 

Adding Rows Using Dataframe.append()

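The append method provides a way to add rows to the dataframe without worrying about the row index. Note that append does not modify the dataframe in place. It returns a new dataframe, so you need to assign the result back to df. For instance, the code below adds two new rows to the dataframe:

# Note: DataFrame.append() was removed in pandas 2.0, so this needs an older version
df = df.append({"name": "McMills", "age": 14, "height": 74}, ignore_index=True)
df = df.append({"name": "Adaoma", "age": 20, "height": 89}, ignore_index=True)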

If you run this, you will see that two new rows are added to the dataframe, as shown below:

Figure: Adding row to dataframe using append
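On pandas 2.0 and later, where DataFrame.append() is no longer available, pd.concat gives the same result. Here is a minimal sketch, assuming the same column names and the table values from section 2:

```python
import pandas as pd

df = pd.DataFrame(
    [["Kindson", 39, 185], ["Jadon", 30, 170], ["Solace", 14, 155]],
    columns=["name", "age", "height"],
)

# Collect the new rows in their own dataframe first
new_rows = pd.DataFrame([
    {"name": "McMills", "age": 14, "height": 74},
    {"name": "Adaoma", "age": 20, "height": 89},
])

# concat returns a new dataframe; ignore_index=True renumbers the rows from 0
df = pd.concat([df, new_rows], ignore_index=True)

print(df)
```

Adding several rows in one concat call is also usually faster than adding them one at a time, because each call copies the whole dataframe.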

 

4. Displaying a Formatted Table

Sometimes, when you import data into Python and display it using the print() function, it is not displayed in a nice tabular format. To fix this, you need to use the display function available in the IPython.display module.

The code below imports the display function. You can then use display(data) instead of print(data) to display your data.

from IPython.display import display 
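Here is a short sketch of how this looks in practice. Inside Jupyter, display() renders a dataframe as a formatted table. The fallback to print is not part of the original approach; it is only there so the snippet also runs where IPython is not installed.

```python
import pandas as pd

try:
    from IPython.display import display
except ImportError:
    # Fallback for environments without IPython (plain text output)
    display = print

df = pd.DataFrame(
    [["Kindson", 39, 185], ["Jadon", 30, 170]],
    columns=["name", "age", "height"],
)

display(df)  # shown as a formatted table in a notebook
```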

5. How to Use iloc[]

You will need to use iloc[] for selecting subsets of some data or from a dataframe.

You need to specify the rows you want to select and the columns you want to select as well. The syntax is:

df.iloc[row_range, col_range]

Here are some examples:

  • df.iloc[0:3, 0:3] – select first 3 rows and first 3 columns
  • df.iloc[0:3] – first 3 rows (0, 1, 2) and all the columns. Same as df.iloc[0:3, :]
  • df.iloc[:, 0:3] – all the rows but only the first 3 columns
  • df.iloc[1:, 2:3] – row 1 to the last, but only column 2 (the end of the range, 3, is not included)
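To try these out, here is a small runnable sketch. The dataframe reuses the table values from section 2, and the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["Kindson", 39, 185], ["Jadon", 30, 170], ["Solace", 14, 155]],
    columns=["name", "age", "height"],
)

first_two_rows = df.iloc[0:2]     # rows 0 and 1, all columns
first_two_cols = df.iloc[:, 0:2]  # all rows, columns "name" and "age"
tail_height = df.iloc[1:, 2:3]    # rows 1 and 2, column "height" only

print(first_two_rows)
print(first_two_cols)
print(tail_height)
```

Remember that iloc selects by position, not by label, and that the end of each range is excluded, just like normal Python slicing.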

I would recommend you play around with this to see how it really works. Also watch the video on my YouTube channel for more practical examples.
