{"id":220,"date":"2021-01-25T13:19:52","date_gmt":"2021-01-25T13:19:52","guid":{"rendered":"https:\/\/www.kindsonthegenius.com\/data-science\/?p=220"},"modified":"2021-01-25T14:55:58","modified_gmt":"2021-01-25T14:55:58","slug":"working-with-data-json-pandas-dataframe-with-python-useful-tipcs","status":"publish","type":"post","link":"https:\/\/www.kindsonthegenius.com\/data-science\/working-with-data-json-pandas-dataframe-with-python-useful-tipcs\/","title":{"rendered":"Working With Data, JSON, Pandas DataFrame with Python &#8211; Useful Tips"},"content":{"rendered":"<p>In this tutorial, we will cover how to do the following:<\/p>\n<ol>\n<li><a href=\"#t1\">Import JSON Data to Python<\/a><\/li>\n<li><a href=\"#t2\">Building a Pandas Dataframe<\/a><\/li>\n<li><a href=\"#t3\">Adding Rows to Dataframe<\/a><\/li>\n<li><a href=\"#t4\">Displaying a Formatted Table<\/a><\/li>\n<li><a href=\"#t5\">Using Dataframe iloc[]<\/a><\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<h4><strong id=\"t1\">1. Import JSON Data to Python<\/strong><\/h4>\n<p>Let&#8217;s first get a sample JSON file to work with. You can get the same JSON file from <a href=\"https:\/\/jsonplaceholder.typicode.com\/\" target=\"_blank\" rel=\"noopener\">https:\/\/jsonplaceholder.typicode.com\/<\/a>. I have downloaded a JSON file to my local system. I assume you already have Jupyter Notebook installed, so the first thing you need to do is start a new notebook.<\/p>\n<p>To import a JSON file, you need the json module and its load method. 
The code is shown below:<\/p>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #888888;\"># How to import json in Python<\/span>\r\n<span style=\"color: #008800; font-weight: bold;\">import<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">json<\/span>\r\n\r\n<span style=\"color: #008800; font-weight: bold;\">with<\/span> <span style=\"color: #007020;\">open<\/span>(<span style=\"background-color: #fff0f0;\">'\/Users\/kindsonmunonye\/todos.json'<\/span>) <span style=\"color: #008800; font-weight: bold;\">as<\/span> f:\r\n    data <span style=\"color: #333333;\">=<\/span> json<span style=\"color: #333333;\">.<\/span>load(f)\r\n<\/pre>\n<p>The data is imported and saved in a variable called data. You can use <em>print(data)<\/em> to display it.<\/p>\n<p>&nbsp;<\/p>\n<h4><strong id=\"t2\">2. Building a Pandas Dataframe<\/strong><\/h4>\n<p>If you&#8217;ll be working with data in Python, then you&#8217;ll most likely need to have the data in a Pandas DataFrame. This is basically a tabular representation of the data with rows and columns. 
So let&#8217;s assume you want to create the following dataframe in Python:<\/p>\n<table style=\"width: 50%;\">\n<tbody>\n<tr style=\"background-color: #f7f6f3; font-weight: bold;\">\n<td>name<\/td>\n<td>age<\/td>\n<td>height<\/td>\n<\/tr>\n<tr>\n<td>Kindson<\/td>\n<td>43<\/td>\n<td>80<\/td>\n<\/tr>\n<tr>\n<td>Jadon<\/td>\n<td>30<\/td>\n<td>170<\/td>\n<\/tr>\n<tr>\n<td>Solace<\/td>\n<td>14<\/td>\n<td>155<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The code below creates a DataFrame with these three columns:<\/p>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008800; font-weight: bold;\">import<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">pandas<\/span> <span style=\"color: #008800; font-weight: bold;\">as<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">pd<\/span>\r\ndf <span style=\"color: #333333;\">=<\/span> pd<span style=\"color: #333333;\">.<\/span>DataFrame(columns<span style=\"color: #333333;\">=<\/span>[<span style=\"background-color: #fff0f0;\">\"name\"<\/span>, <span style=\"background-color: #fff0f0;\">\"age\"<\/span>, <span style=\"background-color: #fff0f0;\">\"height\"<\/span>])\r\n<\/pre>\n<p>But right now the dataframe is empty!<\/p>\n<p>&nbsp;<\/p>\n<h4><strong id=\"t3\">3. Adding Rows to Pandas Dataframe<\/strong><\/h4>\n<h5><strong>Adding Rows to Dataframe using loc[]<\/strong><\/h5>\n<p>To add rows, you can use the <em>loc<\/em> indexer of the dataframe. 
The code below populates the dataframe<\/p>\n<pre style=\"margin: 0; line-height: 125%;\">df<span style=\"color: #333333;\">.<\/span>loc[<span style=\"color: #0000dd; font-weight: bold;\">0<\/span>] <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">\"Kindson\"<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">43<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">80<\/span>]\r\ndf<span style=\"color: #333333;\">.<\/span>loc[<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>] <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">\"Jadon\"<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">30<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">170<\/span>]\r\ndf<span style=\"color: #333333;\">.<\/span>loc[<span style=\"color: #0000dd; font-weight: bold;\">2<\/span>] <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">\"Solace\"<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">14<\/span>, <span style=\"color: #0000dd; font-weight: bold;\">155<\/span>]\r\n<\/pre>\n<p>Now if you display the dataframe, you will have:<\/p>\n<figure id=\"attachment_222\" aria-describedby=\"caption-attachment-222\" style=\"width: 300px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-222\" src=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20-300x176.png\" alt=\"Displaying a dataframe\" width=\"300\" height=\"176\" srcset=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20-300x176.png 300w, 
https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20-1024x602.png 1024w, https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20-768x452.png 768w, https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.21.20.png 1194w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-222\" class=\"wp-caption-text\">Displaying a dataframe<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<p><strong>Adding Rows Using Dataframe.append()<\/strong><\/p>\n<p>The append method provides a way to add rows to the dataframe without worrying about the row index. It returns a new dataframe rather than modifying the original, so assign the result back to df. Note that append was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat() is used instead. For instance, the code below adds two new rows to the dataframe:<\/p>\n<pre style=\"margin: 0; line-height: 125%;\">df <span style=\"color: #333333;\">=<\/span> df<span style=\"color: #333333;\">.<\/span>append({<span style=\"background-color: #fff0f0;\">\"name\"<\/span>: <span style=\"background-color: #fff0f0;\">\"McMills\"<\/span>, <span style=\"background-color: #fff0f0;\">\"age\"<\/span>:<span style=\"color: #0000dd; font-weight: bold;\">14<\/span>, <span style=\"background-color: #fff0f0;\">\"height\"<\/span>: <span style=\"color: #0000dd; font-weight: bold;\">74<\/span>}, ignore_index<span style=\"color: #333333;\">=<\/span><span style=\"color: #007020;\">True<\/span>)\r\ndf <span style=\"color: #333333;\">=<\/span> df<span style=\"color: #333333;\">.<\/span>append({<span style=\"background-color: #fff0f0;\">\"name\"<\/span>: <span style=\"background-color: #fff0f0;\">\"Adaoma\"<\/span>, <span style=\"background-color: #fff0f0;\">\"age\"<\/span>:<span style=\"color: #0000dd; font-weight: bold;\">20<\/span>, <span style=\"background-color: #fff0f0;\">\"height\"<\/span>: <span style=\"color: #0000dd; font-weight: bold;\">89<\/span>}, ignore_index<span style=\"color: #333333;\">=<\/span><span style=\"color: #007020;\">True<\/span>)\r\n<\/pre>\n<p>If you run this and then display df, you will 
see that two new rows are added to the dataframe as shown below:<\/p>\n<figure id=\"attachment_223\" aria-describedby=\"caption-attachment-223\" style=\"width: 1024px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-223 size-large\" src=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34-1024x391.png\" alt=\"Adding row to dataframe using append\" width=\"1024\" height=\"391\" srcset=\"https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34-1024x391.png 1024w, https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34-300x114.png 300w, https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34-768x293.png 768w, https:\/\/www.kindsonthegenius.com\/data-science\/wp-content\/uploads\/sites\/12\/2021\/01\/Screenshot-2021-01-25-at-13.57.34.png 1468w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption id=\"caption-attachment-223\" class=\"wp-caption-text\">Adding row to dataframe using append<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<h4><strong id=\"t4\">4. Displaying a Formatted Table<\/strong><\/h4>\n<p>Sometimes, when you import data into Python and display it using the print() function, it is not displayed in a nice tabular format. 
To fix this, you can use the display function from the IPython.display module.<\/p>\n<p>The code below imports display; you can then call display(data) instead of print(data) to show your data.<\/p>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008800; font-weight: bold;\">from<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">IPython.display<\/span> <span style=\"color: #008800; font-weight: bold;\">import<\/span> display\r\n<\/pre>\n<h4><strong id=\"t5\">5. How to Use iloc[]<\/strong><\/h4>\n<p>Use<em> iloc[]<\/em> to select a subset of a dataframe by integer position.<\/p>\n<p>You specify the rows you want to select and the columns you want to select as well. The syntax is:<\/p>\n<pre style=\"margin: 0; line-height: 125%;\">df<span style=\"color: #333333;\">.<\/span>iloc[row_range, col_range]\r\n<\/pre>\n<p>Here are some examples:<\/p>\n<ul>\n<li><strong>df.iloc[0:3, 0:3]<\/strong> &#8211; the first 3 rows and the first 3 columns<\/li>\n<li><strong>df.iloc[0:3]<\/strong> &#8211; the first 3 rows (0, 1, 2) and all the columns. Same as df.iloc[0:3, :]<\/li>\n<li><strong>df.iloc[:, 0:3]<\/strong> &#8211; all the rows but only the first 3 columns<\/li>\n<li><strong>df.iloc[1:, 2:3]<\/strong> &#8211; row 1 to the last, and column 2 only (the end index 3 is excluded)<\/li>\n<\/ul>\n<p>I would recommend you play around with this to see how it really works. 
Also watch the video on <a href=\"https:\/\/www.youtube.com\/c\/KindsonTheTechPro\" target=\"_blank\" rel=\"noopener\">my YouTube channel<\/a> for more practical examples.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we will cover how to do the following Import JSON Data to Python Building a Pandas Dataframe Adding Rows to Dataframe Displaying &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[43],"tags":[41,42,40,38],"class_list":["post-220","post","type-post","status-publish","format-standard","hentry","category-python","tag-dataframe","tag-json","tag-pandas","tag-python"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/posts\/220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/comments?post=220"}],"version-history":[{"count":3,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/posts\/220\/revisions"}],"predecessor-version":[{"id":226,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/posts\/220\/revisions\/226"}],"wp:attachment":[{"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v
2\/media?parent=220"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/categories?post=220"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kindsonthegenius.com\/data-science\/wp-json\/wp\/v2\/tags?post=220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}