SuperStore Sales Analysis and Visualization using Python and Plotly (Part 1)

Aman Kumar
6 min read · May 9, 2021

Detailed data analysis and result-oriented storytelling with the SuperStore sales dataset.

Designed by vectorjuice / Freepik

This is an analysis report that draws insights from a dataset named “Sample Super Store”. The SuperStore dataset contains order details for customers of a superstore in the US. It includes columns such as order date, shipping date, product ordered, state, and region. The goal of this analysis is to extract insights from the data and answer business questions, for example: which product is the top seller, and which region is the most profitable? This is Part 1 of the blog series, so it covers only a basic analysis of the data.

The Python libraries used in this exploratory data analysis (EDA) are NumPy, Pandas, and Plotly.

Let’s get started.

1. Importing the required libraries for Analysis

2. Loading the data into the data frame.

Loading the data into a pandas DataFrame is one of the most important steps in EDA. My file is in CSV format, so all I have to do is read it with pandas, which returns a DataFrame. After loading, we can look at the first N observations using df.head(N) and, similarly, the last N observations using df.tail(N).
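
A minimal sketch of the loading step. The file name in the comment is a placeholder, and the three inline rows are invented sample values (not rows from the real dataset) so that the snippet runs on its own:

```python
import io

import pandas as pd

# In practice this would be something like pd.read_csv("superstore.csv");
# that file name is hypothetical. The rows below are invented for illustration.
csv_text = """Order Date,Ship Date,Region,State,City,Segment,Category,Sales
2017-01-03,2017-01-07,West,California,Los Angeles,Consumer,Technology,120.50
2017-01-04,2017-01-09,East,New York,New York City,Corporate,Furniture,80.00
2017-01-05,2017-01-08,South,Florida,Miami,Home Office,Office Supplies,15.25
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first 2 observations
print(df.tail(2))   # last 2 observations
```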

First 5 observations
Last 5 observations

3. Checking the description and types of data

Here we check the dataset’s info to get an overview of the data: data types, column names, and non-null counts. Sometimes the Sales values might be stored as strings (object dtype); in that case, we have to convert them to a numeric type before we can plot them. In this case, the data is already in the correct format.

df.info()
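
If a numeric column did arrive as text, a conversion along these lines would be needed before plotting. This is a sketch on invented values, not the dataset itself; pd.to_numeric with errors="coerce" turns entries it cannot parse into NaN instead of raising an error:

```python
import pandas as pd

# Invented example: Sales stored as strings, with one malformed entry.
df = pd.DataFrame({"Sales": ["120.50", "80", "n/a"]})

# Convert to a numeric dtype; "n/a" cannot be parsed and becomes NaN.
df["Sales"] = pd.to_numeric(df["Sales"], errors="coerce")

df.info()                 # column names, non-null counts, and dtypes
print(df["Sales"].dtype)  # float64
```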

After loading the data and confirming that there are no null values, we can move on to the data visualization part.
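
One way to run the null check explicitly is to count missing values per column. The sketch below uses invented rows with one deliberately missing Sales value, only to show what the check reports when something is absent:

```python
import pandas as pd

# Invented rows; one Sales value is missing on purpose.
df = pd.DataFrame({
    "Region": ["West", "East", "South"],
    "Sales": [120.5, None, 15.25],
})

null_counts = df.isnull().sum()   # number of missing values per column
print(null_counts)
print("Any nulls:", bool(null_counts.any()))
```

On a clean dataset every count would be zero.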

Now we can work through the objectives one by one.

4. Analysis and Visualization

4.1 Region-wise Revenue

Let’s start with regions and see which one has the highest revenue share. For this, we need two columns from the data: Sales and Region. Before plotting, the values need to be aggregated and ordered, which can be done with methods like groupby(), sum(), and sort_values(). Grouping by Region and summing Sales gives the total sales value for each region.
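
The aggregation can be sketched as follows. The rows are invented stand-ins for the real data, and sum() is used as the aggregate because revenue share is based on totals per region:

```python
import pandas as pd

# Invented rows standing in for the real dataset.
df = pd.DataFrame({
    "Region": ["West", "East", "West", "South", "East", "Central"],
    "Sales": [100.0, 50.0, 25.0, 10.0, 30.0, 40.0],
})

region_sales = (
    df.groupby("Region", as_index=False)["Sales"]
      .sum()                                  # total revenue per region
      .sort_values("Sales", ascending=False)  # largest first
      .reset_index(drop=True)
)
print(region_sales)
```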

For this analysis, I’ll use a pie chart, built with the same Plotly library. We use Plotly’s pie() function to visualize the region-wise sales: we pass the required column names, set the title using update_layout(), and finally call fig.show() to display the plot.

4.2 State-wise Revenue

After the region-wise analysis, I’ll analyze the data by state. The objective is to find which state has the highest revenue share. For this, we need two columns from the data: Sales and State. Before plotting, the values need to be aggregated and ordered, which can be done with methods like groupby(), sum(), and sort_values(). Grouping by State and summing Sales gives the revenue for each state.
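
A sketch of the state-level aggregation on invented rows. It groups on a single column and then calls reset_index() so the result is a regular two-column table that can be handed to a chart:

```python
import pandas as pd

# Invented rows standing in for the real dataset.
df = pd.DataFrame({
    "State": ["California", "New York", "California", "Texas", "New York"],
    "Sales": [200.0, 150.0, 100.0, 90.0, 20.0],
})

state_sales = (
    df.groupby("State")["Sales"]
      .sum()                          # total revenue per state
      .sort_values(ascending=False)   # highest revenue first
      .reset_index()                  # back to State / Sales columns
)
print(state_sales.head(10))           # top states (only three exist here)
```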

For this analysis, I’ll use a bar chart, again built with Plotly. We use Plotly’s bar() function to visualize the state-wise revenue: we pass the required column names, set the title using update_layout(), and finally call fig.show() to display the plot.

4.3 City-wise Revenue

After the state-wise analysis, I’ll analyze the data by city. The objective is to find which city generates the highest revenue.

Again, I’ll use a bar chart for this analysis. We use Plotly’s bar() function to visualize the city-wise revenue: we pass the required column names, set the title using update_layout(), and finally call fig.show() to display the plot.

4.4 Segment-wise Revenue

The objective of this analysis is to find which segment has the highest revenue share.

For this analysis, I’ll use a pie chart, built with the same Plotly library. We use Plotly’s pie() function to visualize the segment-wise revenue: we pass the required column names, set the title using update_layout(), and finally call fig.show() to display the plot.

4.5 Product Category-wise Sales

The objective of this analysis is to find which product category has the highest revenue share.

I’ll again use a pie chart for the visualization, built with the same Plotly library.

5. Conclusion and Insights

From the region-wise chart, we can see that the West region has the highest share of revenue with $725K, whereas the South region has the lowest with $392K.

We can see that California is the topmost state in terms of revenue with a value of $458K followed by New York with a revenue value of $311K.

We can see that New York City is the top city in terms of revenue with a value of $256K followed by Los Angeles with a revenue value of $176K.

From the segment-wise chart, we can see that the Consumer segment has the highest share of revenue with $1,161K, whereas the Home Office segment has the lowest with $430K.

From the category-wise chart, we can see that the Technology category has the highest revenue, with a 36.4% share worth $836K, whereas Office Supplies has the lowest, with a 31.3% share worth about $719K (31.3% of the roughly $2,297K total implied by the Technology figures).

6. References and Future Work

In the next part, I’ll focus on profit analysis and on how revenue and profit vary across states, segments, regions, categories, etc. We can also predict the sales for the next 7 days after the last date of the training dataset using time series analysis.

Resources:

  1. Dataset
  2. Data Analysis Course
  3. Tutorialspoint
  4. Pandas documentation
  5. Plotly documentation
  6. W3schools
Photo by Priscilla Du Preez on Unsplash
