Data Analysis & Interpretation-ORGL3333
Major League Baseball Final Project
“I thought you said we didn't have any high priced talent.” Lou Brown, played by James Gammon, in the movie “Major League”
Major League Baseball (MLB) is a professional baseball league whose teams play in either the American League or the National League. It is one of the major professional sports leagues of the United States and Canada and is composed of 30 teams: 29 in the United States and one in Canada. In 2011, MLB had the highest season attendance of any sports league. There are approximately 1,200 players in the league.
In this report, I analyze and interpret the data set, including a discussion of the data sampling distribution, summary descriptive statistics, and data analysis and interpretation supported by tables, charts, graphs, plots, and written explanation. Using the baseball data set, which contains a random sample of 30 teams and, from those teams, 254 players with their respective “stats,” I investigate the linear relationship, if any, between baseball players’ performance and pay, and determine its statistical significance. The performance variables to be examined are batting average (AVG) and home runs (HR).
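As a rough illustration of what investigating a linear relationship and its significance involves, the sketch below computes the Pearson correlation coefficient, the least-squares line, and the t statistic for the correlation (with n - 2 degrees of freedom). It is only a sketch under stated assumptions: the function names are my own, and the home run and salary values in the example are made-up placeholders, not values from the baseball data set.

```python
import math


def _sums_of_squares(xs, ys):
    """Return (sxx, syy, sxy, mean_x, mean_y) for two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("each sample must contain at least two distinct values")
    return sxx, syy, sxy, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    sxx, syy, sxy, _, _ = _sums_of_squares(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    sxx, _, sxy, mean_x, mean_y = _sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    denom = 1 - r * r
    if denom <= 0:  # perfectly linear data (or rounding error near |r| = 1)
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT taken from the report's data set.
    home_runs = [5, 12, 18, 25, 31, 40]
    salary_millions = [0.5, 1.2, 3.0, 4.5, 9.0, 15.0]

    r = pearson_r(home_runs, salary_millions)
    slope, intercept = fit_line(home_runs, salary_millions)
    t = t_statistic(r, len(home_runs))
    print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}, t = {t:.3f}")
```

The resulting t statistic would be compared with the critical value of the t distribution for n - 2 degrees of freedom at the chosen significance level; the same steps apply with batting average (AVG) in place of home runs.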
I will not analyze the names of the players or teams, since these variables are qualitative, cross-sectional data with a nominal measurement and are used only to help analyze the other variables. The other variables are quantitative. Players’ salary is cross-sectional data, a discrete variable with a ratio measurement. Games played (G), hits (H), home runs (HR), and runs batted in (RBI) are all time series data, discrete variables with a ratio measurement. Batting average (AVG) is time series data, a continuous variable with an interval measurement.