Week 2 Project Part A
Size is the easiest data for me to interpret because it clearly shows that the most common household size among the sampled customers is 2. As the graph below shows, the bar for size 2 is more than twice as tall as most of the other bars. The mean household size is 3.42 and the standard deviation is 1.739. According to the histogram, the data is skewed to the right.
The Years data shows that most of the sampled customers have lived at their current location for more than ten years. There is a high concentration of customers in the 10-18 year range. Because 14 and 18 are tied for the most occurrences, at 5 apiece, the data is bimodal and both values are reported as the mode. The standard deviation for this data is 5.086.
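When two values tie for the most occurrences, as 14 and 18 do here, a plain "mode" function can hide one of them. The sketch below shows one way to list every tied value using Python's standard library. The `years` list is made up for illustration and is not the actual sample.

```python
from statistics import multimode

# Hypothetical years-at-location values (NOT the real sample data),
# chosen so that 14 and 18 tie for the most occurrences.
years = [3, 7, 10, 14, 14, 14, 18, 18, 18, 20]

# multimode returns every value that shares the highest count,
# in the order each value first appears in the data.
modes = multimode(years)
print(modes)
```

With data like this, both tied values are returned, which matches the choice above to report both as the mode.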
The credit balance data appears to be approximately normally distributed with little skew. The mean (3970), median (4090), and mode (3890) are all close together, and the mean sits near the peak of the chart. The remaining data points fall almost evenly on either side of the mean. The standard deviation is fairly high at 931.9.
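One rough way to check the skew claim numerically is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. This measure is not used in the original analysis; it is offered here only as a quick sanity check on the reported mean, median, and standard deviation.

```python
# Summary statistics for credit balance as reported in the text.
mean = 3970
median = 4090
std_dev = 931.9

# Pearson's second skewness coefficient: values near 0 suggest a
# roughly symmetric distribution; negative values mean the mean
# lies below the median.
pearson_skew = 3 * (mean - median) / std_dev
print(round(pearson_skew, 3))  # -0.386
```

The result is small in magnitude, which is consistent with a roughly symmetric shape. The negative sign reflects the mean sitting slightly below the median.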
The next graph shows the relationship between location and income. Rural income peaks at $50,000, while suburban and urban incomes both climb much higher, to $67,000. Although suburban and urban areas have higher incomes, they also have a much wider range than rural areas: the range for rural areas is 28, while the ranges for suburban and urban areas are 45 and 46, respectively. The same is true for standard deviation: rural areas have a standard deviation of 7.793, suburban areas 15.258, and urban areas 14.478.
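The range and standard deviation comparison by location can be sketched as follows. The income values are hypothetical (assumed to be in thousands of dollars) and are not the sample data. The name `spread_summary` is also made up for this example. The sketch uses the sample standard deviation, which is an assumption, since the text does not say which formula was used.

```python
from statistics import stdev

# Hypothetical incomes in thousands of dollars (NOT the real sample).
incomes_by_location = {
    "Rural": [22, 30, 35, 41, 50],
    "Suburban": [22, 40, 55, 60, 67],
    "Urban": [21, 38, 52, 61, 67],
}

def spread_summary(values):
    """Return (range, sample standard deviation) for a list of numbers."""
    return max(values) - min(values), stdev(values)

# Compute both measures of spread for each location.
summary = {loc: spread_summary(vals) for loc, vals in incomes_by_location.items()}

for loc, (value_range, sd) in summary.items():
    print(f"{loc}: range={value_range}, std dev={sd:.3f}")
```

Putting both measures side by side for each group makes it easy to see whether a wider range also comes with a larger standard deviation.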
Another pair of variables I believe are related is location and credit balance. According to the scatter plot below, rural customers have much lower credit balances than both suburban and urban customers. I believe this has a lot to do with the fact that rural