Question

sales  sqft   adv_cost  inventory  distance  district_size  storecount
231    1.47   7.62      897        10.9      79.48          40
232    1.53   9.57      892        9.4       51.154         12
156    1.68   8.37      542        7.9       60.358         41
157    1.355  6.73      552        6.8       55.561         68
10     1.33   1.66      242        3.5       89.624         14
10     1.33   1.17      235        3.6       86.898         62
519    1.89   12.96     3670       18.5      108.857        56
520    1.885  12.02     3657       19.1      100.685        75
437    1.7    12.29     3345       17.4      90.138         59
487    1.86   12.5      3322       16.5      111.284        22
299    1.4    9.86      1784       11.5      75.606         26
195    1.63   7.22      1230       9.8       64.245         27
20     1.24   5.23      483        2.4       55.929         11
68     1.51   3.93      114        4.5       73.187         33
428    1.78   11.04     2829       16.4      101.192        51
429    1.725  9.43      3410       15.7      80.694         16
464    1.72   12.19     2873       15.8      105.254        84
15     1.2    1.17      289        3.2       80.937         31
65     1.47   6.56      292        3.9       80.187         97
66     1.51   5.55      312        3.8       85.897         66
98     1.24   5.79      235        6.4       90.219         75
338    1.65   3.34      1160       12.1      121.988        84
249    1.513  2.23      1184       9.7       115.277        12
161    1.4    6.95      399        7.9       50.188         14
467    1.46   13.17     2062       16.1      101.211        89
398    1.84   11.68     2103       15.9      95.406         49
497    1.68   12.11     2743       18        80.195         14
528    1.94   10.98     3779       18        110.025        58
529    1.765  11.11     3916       18.9      103.26         52
99     1.31   4.35      782        4.8       111.732        52
100    1.525  3.79      804        4.7       99.7           41
1      1.45   4.68      1116       3.4       85.882         50
347    1.65   10.08     2223       13.4      94.181         49
348    1.811  7.87      2180       12.1      95.242         50
341    1.64   10.34     1494       14.3      70.693         28
557    1.66   13.55     3522       18.5      94.329         43
508    1.698  11.53     3521       16.7      99.917         50

In the “HomeSales” dataset, the response variable, sales, depends on six potential predictor variables: sq_ft, adv_cost, inventory, distance, district_size, and storecount. Fit four simple linear regression (SLR) models corresponding to the four predictors sq_ft, adv_cost, inventory, and distance. Then, for each model, create a normal probability plot and a histogram of the residuals, together with the two residual scatterplots: residuals vs. fitted values and residuals vs. observation order.
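The fitting and plotting steps described above can be sketched in R roughly as follows. This is only an illustration, not the expert's code: the data frame name `home` and the helper `residual_plots` are made up for this example, and the column is called `sqft` here because that is how it appears in the data table (the question text writes sq_ft).

```r
# HomeSales columns needed for the four SLR fits (values copied from the table).
home <- data.frame(
  sales = c(231, 232, 156, 157, 10, 10, 519, 520, 437, 487,
            299, 195, 20, 68, 428, 429, 464, 15, 65, 66,
            98, 338, 249, 161, 467, 398, 497, 528, 529, 99,
            100, 1, 347, 348, 341, 557, 508),
  sqft = c(1.47, 1.53, 1.68, 1.355, 1.33, 1.33, 1.89, 1.885, 1.7, 1.86,
           1.4, 1.63, 1.24, 1.51, 1.78, 1.725, 1.72, 1.2, 1.47, 1.51,
           1.24, 1.65, 1.513, 1.4, 1.46, 1.84, 1.68, 1.94, 1.765, 1.31,
           1.525, 1.45, 1.65, 1.811, 1.64, 1.66, 1.698),
  adv_cost = c(7.62, 9.57, 8.37, 6.73, 1.66, 1.17, 12.96, 12.02, 12.29, 12.5,
               9.86, 7.22, 5.23, 3.93, 11.04, 9.43, 12.19, 1.17, 6.56, 5.55,
               5.79, 3.34, 2.23, 6.95, 13.17, 11.68, 12.11, 10.98, 11.11, 4.35,
               3.79, 4.68, 10.08, 7.87, 10.34, 13.55, 11.53),
  inventory = c(897, 892, 542, 552, 242, 235, 3670, 3657, 3345, 3322,
                1784, 1230, 483, 114, 2829, 3410, 2873, 289, 292, 312,
                235, 1160, 1184, 399, 2062, 2103, 2743, 3779, 3916, 782,
                804, 1116, 2223, 2180, 1494, 3522, 3521),
  distance = c(10.9, 9.4, 7.9, 6.8, 3.5, 3.6, 18.5, 19.1, 17.4, 16.5,
               11.5, 9.8, 2.4, 4.5, 16.4, 15.7, 15.8, 3.2, 3.9, 3.8,
               6.4, 12.1, 9.7, 7.9, 16.1, 15.9, 18, 18, 18.9, 4.8,
               4.7, 3.4, 13.4, 12.1, 14.3, 18.5, 16.7)
)

predictors <- c("sqft", "adv_cost", "inventory", "distance")

# Fit sales ~ predictor separately for each of the four predictors.
fits <- lapply(predictors, function(p) {
  lm(reformulate(p, response = "sales"), data = home)
})
names(fits) <- predictors

# Hypothetical helper: draw the four residual diagnostics for one fitted model.
residual_plots <- function(fit, label) {
  r <- resid(fit)
  op <- par(mfrow = c(2, 2))
  on.exit(par(op))
  qqnorm(r, main = paste("Normal probability plot:", label))
  qqline(r)
  hist(r, main = paste("Histogram of residuals:", label), xlab = "Residual")
  plot(fitted(fit), r, main = "Residuals vs. fitted values",
       xlab = "Fitted value", ylab = "Residual")
  abline(h = 0, lty = 2)
  plot(seq_along(r), r, type = "b", main = "Residuals vs. observation order",
       xlab = "Observation order", ylab = "Residual")
  abline(h = 0, lty = 2)
}

for (p in predictors) residual_plots(fits[[p]], p)
```

Each call produces one 2-by-2 panel, so the four models can be compared side by side when answering the parts below.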

(a) What do the residual plots for the model with sq_ft as the predictor indicate about the validity of this regression model and assumptions made about the errors?

(b) What do the residual plots for the model with adv_cost as the predictor indicate about the validity of this regression model and assumptions made about the errors?

(c) What do the residual plots for the model with inventory as the predictor indicate about the validity of this regression model and assumptions made about the errors?

(d) What do the residual plots for the model with distance as the predictor indicate about the validity of this regression model and assumptions made about the errors?

(e) One objective of this analysis is to obtain an appropriate simple linear regression model that can be used to estimate the average sales based on a single predictor. State your “best” choice based on your conclusions in parts (a)–(d).

(f) Complete the table below, using the regression analysis results of the four simple linear regression models considered in parts (a)–(d). Based on the table entries, would you change your “best” choice from part (e)?

Model predictor  S       R²      t-stat
sqft             110.75  66.44%  8.32
adv_cost
inventory
distance
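The three table columns can be read off a fitted model's summary. The sketch below is an assumption about how one might do that in R: the helper name `slr_row` and the data frame name `home` are hypothetical, and only the sales and sqft columns are included so that the result can be checked against the one row the table already provides.

```r
# sales and sqft columns from the HomeSales table.
home <- data.frame(
  sales = c(231, 232, 156, 157, 10, 10, 519, 520, 437, 487,
            299, 195, 20, 68, 428, 429, 464, 15, 65, 66,
            98, 338, 249, 161, 467, 398, 497, 528, 529, 99,
            100, 1, 347, 348, 341, 557, 508),
  sqft = c(1.47, 1.53, 1.68, 1.355, 1.33, 1.33, 1.89, 1.885, 1.7, 1.86,
           1.4, 1.63, 1.24, 1.51, 1.78, 1.725, 1.72, 1.2, 1.47, 1.51,
           1.24, 1.65, 1.513, 1.4, 1.46, 1.84, 1.68, 1.94, 1.765, 1.31,
           1.525, 1.45, 1.65, 1.811, 1.64, 1.66, 1.698)
)

# Hypothetical helper: one table row (S, R-squared, slope t-statistic) for an SLR.
slr_row <- function(data, predictor, response = "sales") {
  s <- summary(lm(reformulate(predictor, response = response), data = data))
  c(S = s$sigma,                                   # residual standard error
    R2 = s$r.squared,                              # coefficient of determination
    t_stat = s$coefficients[predictor, "t value"]) # t-statistic of the slope
}

row_sqft <- slr_row(home, "sqft")
print(round(row_sqft, 4))
```

With the adv_cost, inventory, and distance columns added to the data frame, the same call with a different predictor name would give the remaining rows.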

(g) A model including the predictor variable adv_cost is of specific interest. Obtain appropriate residual plots and determine if adding either district_size or storecount as an additional predictor to the SLR model with predictor adv_cost is likely to improve its fit.
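One common way to approach this is to plot the residuals of the adv_cost model against each variable left out of it: a visible trend suggests that the omitted variable explains some of the remaining variation. The R sketch below assumes that approach; the names `home`, `fit_adv`, and `res` are illustrative only, and the lowess smoother is an optional visual aid rather than something the question asks for.

```r
# Columns needed for this part, copied from the HomeSales table.
home <- data.frame(
  sales = c(231, 232, 156, 157, 10, 10, 519, 520, 437, 487,
            299, 195, 20, 68, 428, 429, 464, 15, 65, 66,
            98, 338, 249, 161, 467, 398, 497, 528, 529, 99,
            100, 1, 347, 348, 341, 557, 508),
  adv_cost = c(7.62, 9.57, 8.37, 6.73, 1.66, 1.17, 12.96, 12.02, 12.29, 12.5,
               9.86, 7.22, 5.23, 3.93, 11.04, 9.43, 12.19, 1.17, 6.56, 5.55,
               5.79, 3.34, 2.23, 6.95, 13.17, 11.68, 12.11, 10.98, 11.11, 4.35,
               3.79, 4.68, 10.08, 7.87, 10.34, 13.55, 11.53),
  district_size = c(79.48, 51.154, 60.358, 55.561, 89.624, 86.898, 108.857,
                    100.685, 90.138, 111.284, 75.606, 64.245, 55.929, 73.187,
                    101.192, 80.694, 105.254, 80.937, 80.187, 85.897, 90.219,
                    121.988, 115.277, 50.188, 101.211, 95.406, 80.195, 110.025,
                    103.26, 111.732, 99.7, 85.882, 94.181, 95.242, 70.693,
                    94.329, 99.917),
  storecount = c(40, 12, 41, 68, 14, 62, 56, 75, 59, 22,
                 26, 27, 11, 33, 51, 16, 84, 31, 97, 66,
                 75, 84, 12, 14, 89, 49, 14, 58, 52, 52,
                 41, 50, 49, 50, 28, 43, 50)
)

# SLR with adv_cost as the only predictor, and its residuals.
fit_adv <- lm(sales ~ adv_cost, data = home)
res <- resid(fit_adv)

# Residuals vs. each candidate predictor that is not yet in the model.
op <- par(mfrow = c(1, 2))
for (v in c("district_size", "storecount")) {
  plot(home[[v]], res, xlab = v, ylab = "Residual (sales ~ adv_cost)",
       main = paste("Residuals vs.", v))
  abline(h = 0, lty = 2)
  lines(lowess(home[[v]], res), col = "red")  # smoothed trend, visual aid only
}
par(op)

# Numeric companion to the plots: correlation of the residuals with each candidate.
res_cor <- sapply(home[c("district_size", "storecount")],
                  function(x) cor(res, x))
print(round(res_cor, 3))
```

A roughly flat, patternless scatter for a candidate would suggest it adds little beyond adv_cost, while a clear slope or curve would point the other way.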

Answer & Explanation

(a) Model 1: the predictor variable sqft is used to predict sales.

    Call:
    lm(formula = sales ~ sqft, data = data)

    Residuals:
         Min       1Q   Median       3Q      Max
    -200.740  -80.410    7.266   47.567  277.668

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -921.65     145.55  -6.332 2.82e-07 ***
    sqft          760.94      91.41   8.324 8.16e-10 ***

    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 110.8 on 35 degrees of freedom
    Multiple R-squared: 0.6644, Adjusted R-squared: 0.6548
    F-statistic: 69.29 on 1 and 35 DF, p-value: 8.161e-10

RESIDUAL PLOTS

In general, the residual plot should be symmetric around zero.

From plot 1 (the residuals vs. fitted plot), we can see that there is no trend and the points follow a horizontal band pattern, thus the error ...