Linear Probability Model

The linear probability model, ctd.
When Y is binary, the linear regression model
Yi = β0 + β1Xi + ui is called the linear probability model.
• The predicted value is a probability:
• E(Y|X=x) = Pr(Y=1|X=x) = prob. that Y = 1 given x
• Ŷ = the predicted probability that Yi = 1, given X
• β1 = change in probability that Y = 1 for a given ∆x:
β1 = [Pr(Y = 1 | X = x + ∆x) − Pr(Y = 1 | X = x)] / ∆x
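The interpretation of β1 as a change in probability can be checked on a small example. The sketch below uses made-up toy data (not the HMDA data) with a binary regressor, fits the least-squares line by hand, and compares the coefficients with the group shares of Y = 1. The data and the helper names `ols_simple` and `share_of_ones` are assumptions introduced for illustration only.

```python
# Hypothetical toy data: x is a binary regressor, y is a binary outcome.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def share_of_ones(x, y, value):
    """Fraction of observations with y = 1 among those with x == value."""
    group = [yi for xi, yi in zip(x, y) if xi == value]
    return sum(group) / len(group)


b0, b1 = ols_simple(x, y)
p0 = share_of_ones(x, y, 0)  # sample analogue of Pr(Y = 1 | X = 0)
p1 = share_of_ones(x, y, 1)  # sample analogue of Pr(Y = 1 | X = 1)

print("intercept:", b0, "share of ones at x = 0:", p0)
print("slope:", b1, "difference in shares:", p1 - p0)
```

With a binary regressor the fitted intercept equals the share of ones at X = 0 and the slope equals the difference in shares, which is the sample version of the formula for β1 with ∆x = 1.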

Example: linear probability model, HMDA data
Mortgage denial v. ratio of debt payments to income (P/I ratio) in the HMDA data set (subset)


Linear probability model: HMDA data, ctd.
Estimated regression (standard errors in parentheses; n = 2380):
Predicted deny = -.080 + .604 × P/I ratio
                 (.032)  (.098)

• What is the predicted value for P/I ratio = .3? Predicted Pr(deny = 1 | P/I ratio = .3) = -.080 + .604×.3 = .101
• Calculating “effects:” increase P/I ratio from .3 to .4: Predicted Pr(deny = 1 | P/I ratio = .4) = -.080 + .604×.4 = .162
The effect on the probability of denial of an increase in P/I ratio from .3 to .4 is to increase the probability by .604×.1 = .060, that is, by about 6.0 percentage points.
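The arithmetic for the predicted values and the "effect" can be reproduced directly from the reported coefficients. This is a minimal sketch: the coefficients are the ones given in the text, while the function name `predicted_denial` is an assumption made for illustration.

```python
# Coefficients reported in the text for the HMDA subset (n = 2380).
INTERCEPT = -0.080
SLOPE_PI = 0.604


def predicted_denial(pi_ratio):
    """Predicted probability of denial from the fitted linear probability model."""
    return INTERCEPT + SLOPE_PI * pi_ratio


p_03 = predicted_denial(0.3)
p_04 = predicted_denial(0.4)
effect = p_04 - p_03  # equals SLOPE_PI * 0.1 because the model is linear

print("P/I ratio = .3:", round(p_03, 3))
print("P/I ratio = .4:", round(p_04, 3))
print("effect of .3 -> .4:", round(effect, 3))
```

Because the model is linear, the effect of a .1 increase in the P/I ratio is the same at every starting value: it is always the slope times .1.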


Linear probability model: HMDA data, ctd
Next include black as a regressor (standard errors in parentheses):
Predicted deny = -.091 + .559 × P/I ratio + .177 × black
                 (.032)  (.098)             (.025)
Predicted probability of denial:
• for black applicant with P/I ratio = .3: Predicted Pr(deny = 1) = -.091 + .559×.3 + .177×1 = .254
• for white applicant, P/I ratio = .3: Predicted Pr(deny = 1) = -.091 + .559×.3 + .177×0 = .077
• difference = .177 = 17.7 percentage points
• Coefficient on black is significant at the 5% level
• Still plenty of room for omitted variable bias…
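The two predicted probabilities, their difference, and the significance claim can be checked with a few lines of code. This sketch uses the coefficients and the standard error on black as reported in the text. The function name, the dictionary layout, and the use of the large-sample critical value 1.96 for a two-sided 5% test are assumptions made for illustration.

```python
# Coefficients reported in the text for the regression that includes black.
COEF = {"intercept": -0.091, "pi_ratio": 0.559, "black": 0.177}
SE_BLACK = 0.025  # reported standard error on the black coefficient


def predicted_denial(pi_ratio, black):
    """Predicted probability of denial; black is 1 or 0."""
    return (COEF["intercept"]
            + COEF["pi_ratio"] * pi_ratio
            + COEF["black"] * black)


p_black = predicted_denial(0.3, 1)
p_white = predicted_denial(0.3, 0)
gap = p_black - p_white  # equals the coefficient on black

# t-statistic for H0: coefficient on black = 0, compared with 1.96
# (assumed large-sample normal critical value for a two-sided 5% test).
t_stat = COEF["black"] / SE_BLACK

print("black applicant:", round(p_black, 3))
print("white applicant:", round(p_white, 3))
print("difference:", round(gap, 3))
print("t-statistic:", round(t_stat, 2), "significant at 5%:", abs(t_stat) > 1.96)
```

The difference does not depend on the P/I ratio chosen, since the P/I term cancels when the two predictions are subtracted.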

The linear probability model: Summary
• Models Pr(Y=1|X) as a linear function of X
• Advantages:
• simple to estimate and to interpret
• inference is the same as for multiple regression (need heteroskedasticity-robust standard errors)
• Disadvantages:
• Does it make sense that the probability should be linear in X?
• Predicted probabilities can be <0 or >1!
• These disadvantages can be addressed by using a nonlinear probability model: probit or logit
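Two of the points in this summary can be made concrete with the single-regressor HMDA estimates reported earlier (-.080 and .604). The sketch below finds the P/I ratios at which the fitted line leaves the [0, 1] interval, and evaluates the Bernoulli variance p(1 − p) at different predicted probabilities to show why the errors are heteroskedastic. The function names and the P/I values chosen are assumptions made for illustration.

```python
# Single-regressor coefficients reported earlier in the text.
INTERCEPT = -0.080
SLOPE_PI = 0.604


def lpm_probability(pi_ratio):
    """Fitted value of the linear probability model (not bounded to [0, 1])."""
    return INTERCEPT + SLOPE_PI * pi_ratio


def bernoulli_variance(p):
    """Variance of a binary variable that equals 1 with probability p."""
    return p * (1.0 - p)


# P/I ratios at which the fitted line crosses 0 and 1.
lower_cutoff = (0.0 - INTERCEPT) / SLOPE_PI
upper_cutoff = (1.0 - INTERCEPT) / SLOPE_PI
print("fitted value < 0 below P/I =", round(lower_cutoff, 3))
print("fitted value > 1 above P/I =", round(upper_cutoff, 3))

# Illustrative P/I ratios, chosen arbitrarily.
for ratio in (0.1, 0.3, 0.5, 2.0):
    p = lpm_probability(ratio)
    if 0.0 <= p <= 1.0:
        # The implied conditional variance changes with X: heteroskedasticity.
        print(ratio, round(p, 3), "variance:", round(bernoulli_variance(p), 4))
    else:
        print(ratio, round(p, 3), "not a valid probability")
```

Since the conditional variance of a binary Y is p(1 − p) and p changes with X, the error variance cannot be constant, which is why heteroskedasticity-robust standard errors are needed.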
