Maven Toy Store Analysis using Python and Plotly

Detailed Data Analysis and result-oriented storytelling of Maven Toy Store.

Aman Kumar
6 min read · May 18, 2021
“image: Freepik.com”. This cover has been designed using resources from Freepik.com

This analysis report draws insights from sales and inventory data for a fictitious chain of toy stores in Mexico. The Maven Toy Store dataset contains customer order details and includes information about products, stores, and daily sales transactions. The goal of this analysis is to generate insights from the data and answer business questions, for example: which product is the top seller in terms of sales?

The Python libraries used in the exploratory data analysis include NumPy, Pandas, and Plotly.

Let’s Get Started ….

1. Importing the required libraries for Analysis

2. Loading the data into the data frame.

Loading the data into a pandas data frame is one of the most important steps in EDA. As my file is in CSV format, all I have to do is read the CSV into a data frame and pandas does the job for us. After loading, we can look over the first N observations using df.head(N) and, similarly, the last N observations using df.tail(N).
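
A rough sketch of this step is shown here. The inline CSV text and its column names are hypothetical stand-ins so the example runs on its own; with a real file you would pass its path to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file (columns and values are made up).
csv_text = """Date,Product_Name,Store_Location,Sales,Profit
2017-01-01,Toy A,Downtown,30.0,9.0
2017-01-01,Toy B,Airport,45.0,15.0
2017-01-02,Toy A,Downtown,60.0,18.0
2017-01-02,Toy C,Residential,20.0,4.0
"""

# Read the CSV into a data frame and parse the Date column as datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

first_rows = df.head(2)  # first N observations (N = 2 here)
last_rows = df.tail(2)   # last N observations
print(first_rows)
print(last_rows)
```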

3. Checking for description, types and null values

Here we check the info of the dataset to get an overview of the data: datatypes, column names, and non-null counts. Sometimes the Sales value might be stored as a string or object; in that case, we have to convert it to a numeric type before we can plot it. In this case, the data is already in the correct format.
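
The checks described above could look roughly like this. The small frame is hypothetical and deliberately stores Sales as text, to illustrate the conversion case. pd.to_numeric() is one way to do that conversion and is an assumption about the approach, not the article's own code.

```python
import pandas as pd

# Hypothetical frame in which Sales arrived as text (object dtype).
df = pd.DataFrame({
    "Product_Name": ["Toy A", "Toy B", "Toy C"],
    "Sales": ["30.0", "45.5", "20"],
})

df.info()                        # column names, dtypes, non-null counts
null_counts = df.isnull().sum()  # number of missing values per column
print(null_counts)

# Convert the text column to numbers; errors="coerce" turns any
# unparseable value into NaN instead of raising an error.
df["Sales"] = pd.to_numeric(df["Sales"], errors="coerce")
print(df.dtypes)
```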

After loading the data and confirming that there are no null values, we can move on to the data visualization part.

Now we can begin working with our data by working on the objectives.

4. Analysis and Visualization

4.1 Top 10 revenue-generating products

Let’s start with Product_Name and see which are the top 10 revenue-generating products. For this, we need two columns from the data, namely Sales and Product_Name. For plotting, the values need to be aggregated and ordered, which can be achieved with methods like groupby(), max(), sort_values(), etc. The aggregation finds the sales value for each product.
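
A minimal sketch of such an aggregation follows, on made-up data. The column names are assumptions, and sum() is used here on the assumption that revenue per product is the total of its individual sales rows; the original code may have aggregated differently.

```python
import pandas as pd

# Hypothetical transaction-level data.
df = pd.DataFrame({
    "Product_Name": ["Toy A", "Toy B", "Toy A", "Toy C", "Toy B"],
    "Sales": [30.0, 45.0, 60.0, 20.0, 5.0],
})

top_products = (
    df.groupby("Product_Name", as_index=False)["Sales"]
      .sum()                                  # total sales per product
      .sort_values("Sales", ascending=False)  # largest first
      .head(10)                               # keep at most the top 10
)
print(top_products)
```

The same pattern applies to any other numeric column: swap the column passed to the aggregation and to sort_values().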

For this analysis, I’ve used a bar chart, built with the plotly library. We use the Bar() function of plotly, pass all the required column names, set the title using update_layout(), and finally call fig.show() to show the plot.

4.2 Top 10 profitable products

Next, using Product_Name again, we will see which are the top 10 most profitable products. For this, we need two columns from the data, namely Profit and Product_Name. As before, the values need to be aggregated and ordered with methods like groupby(), max(), sort_values(), etc. The aggregation finds the profit value for each product.

For this analysis, I’ve again used a bar chart built with the plotly library.

4.3 Location-wise Revenue and Profit

Next, I’ll analyze the data on the basis of location. The objective is to find which store locations generate most of the revenue and profit.

This is a comparative analysis, so I’ve used pie charts as subplots. For this I’ve used the make_subplots() and pie() functions of plotly to create the visualization.

4.4 Profit to Sales Ratio Analysis for Store Location

The objective of this analysis is to find the profit-to-sales ratio. I created a new calculated field that holds the ratio of profit to sales for each store location.
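
A small sketch of such a calculated field follows, on made-up data. The column names (Store_Location, Sales, Profit, Profit_to_Sales) are assumptions. The ratio is computed from the per-location totals, which is one reasonable reading of the description above.

```python
import pandas as pd

# Hypothetical transaction-level data.
df = pd.DataFrame({
    "Store_Location": ["Downtown", "Airport", "Downtown", "Airport"],
    "Sales": [100.0, 40.0, 300.0, 60.0],
    "Profit": [25.0, 12.0, 75.0, 18.0],
})

# Total sales and profit per location, then the ratio of the two totals.
by_location = df.groupby("Store_Location", as_index=False)[["Sales", "Profit"]].sum()
by_location["Profit_to_Sales"] = by_location["Profit"] / by_location["Sales"]
print(by_location)
```

Dividing the totals gives the overall margin of each location, which differs from averaging the ratio of every individual transaction.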

For this analysis, I’ve used a line chart, again with the plotly library. We use the go.Scatter() function of plotly, pass all the required column names, set the title using update_layout(), and finally call fig.show() to show the plot.

5. Conclusion and Insights

From Analysis 4.1, we can see that Lego Bricks, Colorbuds, and Magic Sand are the top 3 products by revenue, worth $2.39M, $1.56M, and $0.97M respectively. From this it can be concluded that these products are in demand and we should keep surplus quantity in stock.

From Analysis 4.2, we can see that Colorbuds is the most profitable product, followed by Action Figure, with profits of $835K and $348K respectively (even though Lego Bricks was the top revenue-generating product). It can be concluded that Colorbuds and Action Figure are high-profit-margin products.

In Analysis 4.3, we see that stores located in the Downtown area generated the most revenue as well as profit, with a share of 56.9% in sales and 56% in profit (almost the same), whereas stores located in the Airport area generated the least revenue as well as profit, with a share of 8.93% in sales and 9.42% in profit.

In Analysis 4.4, we calculated the profit-to-sales ratio and found that stores located in the Airport area have the highest ratio, 0.29. This suggests that Airport stores sell high-profit-margin products, so increasing the number of stores in Airport areas might also increase our profit.

6. References and Future Work

We can also compute year-over-year (YoY) growth for the overall business or store-wise YoY growth. Quarterly analysis can also be done on this data. We can also predict sales for the 7 days following the last date in the dataset using time series analysis.
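
As a starting point for the YoY idea, a rough sketch on made-up data is shown here. The Date and Sales column names are assumptions, and pct_change() on yearly totals is just one way to express year-over-year growth.

```python
import pandas as pd

# Hypothetical sales spread over two years.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-03-01", "2017-09-15", "2018-02-10", "2018-08-20"]),
    "Sales": [100.0, 100.0, 150.0, 150.0],
})

# Total sales per calendar year, then the percentage change between years.
yearly_sales = df.groupby(df["Date"].dt.year)["Sales"].sum()
yoy_growth = yearly_sales.pct_change() * 100
print(yoy_growth)
```

The first year has no previous year to compare against, so its growth value is NaN. Grouping by store as well as by year would give the store-wise version.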

Resources:

  1. Dataset
  2. Data Analysis Course
  3. Tutorialspoint
  4. Pandas documentation
  5. Plotly documentation
  6. W3schools
Photo by Priscilla Du Preez on Unsplash

Aman Kumar

Computer science engineer with a keen interest in solving a complex business problem