Part A: Exploratory Data Analysis
Tiffahanie Seger
GM533 Applied Managerial Statistics
Professor Keyhani
Project Part A: Exploratory Data Analysis
A. Brief Introduction
AJ Davis, a popular department store chain, has many credit customers.
Because these customers are essential to its future success, the company
completed an in-depth study to learn more about them. A sample of fifty
customers was selected for the study, and data were collected on the
following five variables:
1. Location
2. Income
3. Size
4. Years
5. Credit Balance
The following report presents a statistical analysis of these fifty
credit customers of AJ Davis.
B. First Individual Variable
The first individual variable is the customer's location, which is a qualitative
(categorical) variable. Qualitative variables describe data by placing each
observation into a category.
The three categories of location are urban, suburban, and rural.
Frequency Distribution: Location
| Location | Frequency |
| Urban | 21 |
| Suburban | 15 |
| Rural | 14 |
Pie Chart: Location
Analysis:
According to the frequency distribution and the pie chart for the location
variable, the largest share of credit customers live in urban areas: 21 of the
50 customers, or 42%. Suburban areas follow with 15 customers (30%), and rural
areas with 14 customers (28%).
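The percentages behind the pie chart follow directly from the frequency counts. The following is a minimal Python sketch, assuming only the counts reported in the frequency distribution above; the names `location_counts` and `relative_frequencies` are illustrative and not part of the original study.

```python
# Frequency counts taken from the Location frequency distribution in the report.
location_counts = {"Urban": 21, "Suburban": 15, "Rural": 14}


def relative_frequencies(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


location_pct = relative_frequencies(location_counts)

# Print each category with its count and percentage share of the sample.
for category, pct in location_pct.items():
    print(f"{category}: {location_counts[category]} customers, {pct:.0f}%")
```

Because no category exceeds 50%, urban customers are the largest group in the sample rather than an outright majority.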
C. Second Individual Variable
The second individual variable is the customer's household size, which is a
quantitative variable. Household size is the number of people living in each
of the fifty customers' households, so it takes whole-number values that can
be counted.
Frequency Distribution: Size
| Size | Frequency |
| 1 | 5 |
| 2 | 15 |
| 3 | 8 |
| 4 | 9 |
| 5 | 5 |
| 6 | 5 |
| 7 | 3 |
Bar Graph: Size
Descriptive Statistics Summary: Size
| Minimum | 1 |
| Maximum | 7 |
| Range | 6 |
| Sum | 171 |
| Count | 50 |
| Mean | 3.4200 |
| Median | 3 |
| Mode | 2 |
| Standard Deviation | 1.7390 |
| Sample Variance | 3.0241 |
| Skewness | 0.5279 |
Analysis:
For the size variable, a frequency distribution, a bar graph, and a
descriptive statistics summary were completed. According to the frequency
distribution and bar graph, the most common household size is two, reported by
15 of the 50 credit customers (30%). According to the descriptive statistics
summary, the mean is 3.42, the median is 3, the mode is 2, and the standard
deviation is 1.74. These figures agree with the frequency distribution and bar
graph: the mean exceeds the median, which exceeds the mode, consistent with the
mild positive skewness of 0.53.
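The summary statistics for household size can be reproduced from the frequency distribution alone, since it lists every observed value and its count. The following is a minimal Python sketch using the standard-library `statistics` module; the variable names are illustrative, and skewness is omitted because the standard library does not provide it.

```python
import statistics

# Household size -> number of customers, from the Size frequency distribution.
size_counts = {1: 5, 2: 15, 3: 8, 4: 9, 5: 5, 6: 5, 7: 3}

# Expand the table into the 50 individual observations it represents.
sizes = [size for size, n in size_counts.items() for _ in range(n)]

summary = {
    "minimum": min(sizes),
    "maximum": max(sizes),
    "range": max(sizes) - min(sizes),
    "sum": sum(sizes),
    "count": len(sizes),
    "mean": statistics.mean(sizes),
    "median": statistics.median(sizes),
    "mode": statistics.mode(sizes),
    "stdev": statistics.stdev(sizes),        # sample standard deviation (n - 1)
    "variance": statistics.variance(sizes),  # sample variance (n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample (n - 1) formulas are used here because they match the reported sample variance of 3.0241.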
D. Third Individual Variable
The third individual variable is the customer's credit balance, which is also a
quantitative variable. It is the current store credit card balance, measured in
dollars, for each customer in the sample of fifty.
Frequency Distribution: Credit Balance
| Credit Balance ($) | Frequency | Relative Frequency |
| $1500 - $2000 | 1 | 0.0200 |
| $2000 - $2500 | 2 | 0.0400 |
| $2500 - $3000 | 6 | 0.1200 |
| $3000 - $3500 | 6 | 0.1200 |
| $3500 - $4000 | 8 | 0.1600 |
| $4000 - $4500 | 12 | 0.2400 |
| $4500 - $5000 | 7 | 0.1400 |
| $5000 - $5500 | 6 | 0.1200 |
| $5500 - $6000 | 2 | 0.0400 |
Histogram: Credit Balance
Descriptive Statistics Summary: Credit Balance
| Minimum | 1864 |
| Maximum | 5678 |
| Range | 3814 |
| Sum | 198203 |
| Count | 50 |
| Mean | 3964 |
| Median | 4090 |
| Mode | 3890 |
| Standard Deviation | 933.49 |
| Sample Variance | 871411 |
| Skewness | -0.13 |
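As a rough cross-check, the mean credit balance can be estimated from the binned frequency distribution alone by treating every balance in a class as if it sat at the class midpoint. The following Python sketch is an approximation under that midpoint assumption, not the method used to produce the summary above; `credit_bins` and `grouped_mean` are illustrative names.

```python
# (lower bound, upper bound, frequency) from the Credit Balance frequency distribution.
credit_bins = [
    (1500, 2000, 1),
    (2000, 2500, 2),
    (2500, 3000, 6),
    (3000, 3500, 6),
    (3500, 4000, 8),
    (4000, 4500, 12),
    (4500, 5000, 7),
    (5000, 5500, 6),
    (5500, 6000, 2),
]


def grouped_mean(bins):
    """Estimate the mean by weighting each class midpoint by its frequency."""
    total = sum(freq for _, _, freq in bins)
    weighted = sum((low + high) / 2 * freq for low, high, freq in bins)
    return weighted / total


estimated_mean = grouped_mean(credit_bins)

# The modal class is the bin with the highest frequency.
modal_bin = max(credit_bins, key=lambda b: b[2])

print(f"Grouped-data mean estimate: ${estimated_mean:,.0f}")
print(f"Modal class: ${modal_bin[0]} - ${modal_bin[1]} ({modal_bin[2]} customers)")
```

The midpoint estimate comes out at $3,990, close to the mean of $3,964 computed from the raw balances, which suggests the binned table and the summary describe the same data.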
Analysis:
For the credit balance variable, a frequency distribution, a histogram, and a
descriptive statistics summary were completed. According to the frequency
distribution and histogram, the most common balance range is $4,000 to $4,500,
which contains 12 of the 50 credit customers (24%). According to the
descriptive statistics summary, the mean is $3,964, the median is $4,090, the
mode is $3,890, and the standard deviation is $933.49. These figures agree with
the frequency distribution and histogram: the mean lies slightly below the
median, consistent with the slight negative skewness of -0.13.
E. First Pairing of Variables
The relationship between years and credit balance is shown in the scatter
plot below.
Scatter Plot: Years vs. Credit Balance
According to the scatter plot, there is no apparent relationship between years
and credit balance. The points show no pattern or significant clustering. In
conclusion, the years and credit balance variables do not appear to be
correlated.
F. Second Pairing of Variables
The relationship between income and credit balance is shown in the scatter
plot below.
Scatter Plot: Income vs. Credit Balance
According to the scatter plot, there is a clear relationship between income and
credit balance. The points cluster in a band running from lower left to upper
right, which indicates a positive correlation: as x increases, y tends to
increase as well. In other words, as income increases, the credit balance also
tends to increase. In conclusion, the income and credit balance variables are
positively correlated.
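The direction and strength of a relationship like this one can be summarized with the Pearson correlation coefficient, which is positive when points run from lower left to upper right. The following Python sketch shows the calculation. The report does not list the raw income and balance values, so the two short lists below are hypothetical numbers chosen only for illustration and are not the AJ Davis sample; `pearson_r` is likewise an illustrative helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL values for illustration only; NOT the study's data.
income_example = [30, 40, 50, 60, 70]
balance_example = [3100, 3600, 3900, 4500, 4800]

r = pearson_r(income_example, balance_example)
print(f"r = {r:.2f}")
```

A value of r near +1 corresponds to a tight upward band, a value near 0 to a shapeless cloud of points, and a value near -1 to a tight downward band.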
G. Third Pairing of Variables
The relationship between location and income is shown in the scatter plot
below.
Scatter Plot: Location vs. Income
According to the scatter plot, there is no apparent relationship between
location and income. The points show no pattern or significant clustering. In
conclusion, the location and income variables do not appear to be correlated.
H. Conclusion
In conclusion, some pairs of variables, such as income and credit balance, show
a relationship that matters to AJ Davis' sales. Other pairs, such as years and
credit balance or location and income, show no apparent correlation and
therefore reveal no pattern in the spending habits of AJ Davis' credit
customers.