STT 422 EXAM 1
Due Monday Apr 08, 11:59 pm, 25% of Final Grade = 80 points
There are 8 questions in total.
Each question has subparts.
Problem 1
Consider the data set bank wage.csv. Using R or otherwise, answer the following questions:
- (2 points) Plot wages versus LOS and circle the outlier with the highest value of wage. (Drop this observation for the remaining parts.)
- (1 point) Find the least squares regression line for the regression of wages on LOS.
- (4 points) Give the significance test for the slope of LOS. (Clearly state the hypotheses, test statistic, p-value, and conclusion.)
- (3 points) Give a 95% prediction interval at LOS = 55.
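The quantities asked for in Problem 1 all come from the same few sums. The following is a minimal Python sketch of that arithmetic, not a required method. The function names and the toy numbers are hypothetical and are not taken from bank wage.csv. The critical value `t_crit` is assumed to be looked up separately from a t table with n - 2 degrees of freedom.

```python
import math

def simple_linear_regression(x, y):
    """Least squares fit of y on x, plus the pieces needed for inference."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard error, based on n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_slope = s / math.sqrt(sxx)
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx,
        "slope": slope, "intercept": intercept,
        "s": s, "se_slope": se_slope,
        "t_slope": slope / se_slope,  # test statistic for H0: slope = 0
        "df": n - 2,
    }

def prediction_interval(fit, x_star, t_crit):
    """Prediction interval for one new response at x_star.

    t_crit is the t critical value for the desired level with fit["df"]
    degrees of freedom (assumed to be supplied by the caller).
    """
    y_hat = fit["intercept"] + fit["slope"] * x_star
    se_pred = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x_star - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return y_hat - t_crit * se_pred, y_hat + t_crit * se_pred

# Hypothetical toy data, for illustration only.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
toy_fit = simple_linear_regression(toy_x, toy_y)
# 3.182 is the two-sided 95% t critical value for 3 degrees of freedom.
toy_interval = prediction_interval(toy_fit, 3, 3.182)
```

The p-value for the slope is the two-sided tail area of `t_slope` under a t distribution with `df` degrees of freedom, which this sketch leaves to a table or to statistical software.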
Problem 2
Consider the data set student gpa.csv. Consider a regression model for predicting GPA using IQ, gender, and self-concept. Using R or otherwise, answer the following questions:
- (4 points) Give the F-statistic for testing
H0: β_IQ = β_gender = β_selfconcept = 0
Also provide the degrees of freedom for this F-statistic.
- (4 points) Run correlation tests to check whether GPA is correlated with
(a) IQ
(b) GENDER
Problem 3
Consider the data set biomarkers.csv. Consider a regression model for predicting VO+ using OC, TRAP, and VO-. Using R or otherwise, answer the following questions:
- (2 points) Give the statistical model for this regression, including all assumptions.
- (2 points) Give the multiple regression equation to predict VO+ from OC, TRAP, and VO-.
- (4 points) Make a table with t-statistics and p-values for all the explanatory variables. Which is the least significant variable among OC, TRAP, and VO-?
- (4 points) Consider the full model and the one without the least significant variable. Give the ANOVA table to compare these two models.
Problem 4
Do people from different cultures experience emotions differently? Here is a summary of the data:
Are the means the same across the different cultures?
- (2 points) Should you use a pooled standard deviation? If yes, what is its value?
- (4 points) Construct an ANOVA table for this problem.
- (2 points) State the hypothesis test for this problem.
- (2 points) Provide the p-value for the hypothesis test in part 3.
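When only group sizes, means, and standard deviations are available, the pooled standard deviation and the one-way ANOVA table can be built from them directly. Below is a minimal Python sketch of that computation. The function name and the summary values are hypothetical and are not the data for this problem. The check that the largest standard deviation is less than twice the smallest is a common rule of thumb, assumed here rather than stated in the exam.

```python
import math

def anova_from_summary(ns, means, sds):
    """One-way ANOVA quantities from per-group sizes, means, and SDs."""
    k = len(ns)
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-groups and within-groups sums of squares.
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    dfg, dfe = k - 1, n_total - k
    msg, mse = ssg / dfg, sse / dfe
    return {
        "SSG": ssg, "SSE": sse, "SST": ssg + sse,
        "DFG": dfg, "DFE": dfe, "DFT": n_total - 1,
        "MSG": msg, "MSE": mse,
        "F": msg / mse,
        "pooled_sd": math.sqrt(mse),
        # Rule of thumb (an assumption): pooling is reasonable if True.
        "pooling_ok": max(sds) < 2 * min(sds),
    }

# Hypothetical summary statistics for three groups, for illustration only.
toy_table = anova_from_summary(
    ns=[10, 10, 10], means=[4.0, 5.0, 6.0], sds=[1.0, 1.0, 1.0]
)
```

The p-value is the upper tail area of `F` under an F distribution with `DFG` and `DFE` degrees of freedom, which this sketch leaves to a table or to statistical software.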
Problem 5
Consider the data set price promotion.csv. Using R or otherwise, answer the following questions.
- (2 points) Construct a contrast that compares the average of promotions 1 and 7 to the average of promotions 3 and 5.
- (3 points) Give a 95% confidence interval for the contrast in part 1.
- (4 points) Use the Bonferroni or another multiple-comparisons procedure to compare the different price promotion groups.
Problem 6
Consider the data set intervene program.csv. Using R or otherwise, answer the following questions.
- (3 points) Plot the means. Do you think there is an interaction between Group and Time?
- (2 points) Give an estimate for the main effect of group 1.
- (4 points) Construct the two-way ANOVA model for this problem with group and time as the factors.
- (2 points) Can you accept the hypothesis that there is a main effect of time?
Problem 7
Consider the data set plants1.csv. Using R or otherwise, answer the following questions.
- (4 points) Find the means for each species-by-water combination. Plot these means versus water for the four species, connecting the means for each species by lines.
- (2 points) Give the interaction effect between species level 1 and water level 6.
- (4 points) Give the two-way analysis of variance with species and water as factors.
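Main effects and interaction effects in a two-way layout can be estimated from the table of cell means. The following is a minimal Python sketch of that decomposition. It assumes a balanced design with the same number of observations in every cell, so that unweighted averages of cell means give the usual estimates. The function name and the 2-by-2 table are hypothetical and are not taken from plants1.csv.

```python
def two_way_effects(cell_means):
    """Effect estimates from cell_means[i][j], where i indexes the levels
    of factor A (rows) and j indexes the levels of factor B (columns).

    Assumes a balanced design (equal cell sizes).
    """
    a = len(cell_means)
    b = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    # Main effect of a level: its marginal mean minus the grand mean.
    main_a = [r - grand for r in row_means]
    main_b = [c - grand for c in col_means]
    # Interaction effect: what the cell mean adds beyond both main effects.
    interaction = [
        [cell_means[i][j] - row_means[i] - col_means[j] + grand
         for j in range(b)]
        for i in range(a)
    ]
    return {"grand": grand, "main_a": main_a, "main_b": main_b,
            "interaction": interaction}

# Hypothetical 2-by-2 table of cell means, for illustration only.
toy_effects = two_way_effects([[10.0, 20.0], [30.0, 60.0]])
```

In this toy table the interaction effects are nonzero, which matches the fact that the two rows would not plot as parallel lines.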
Problem 8
A study of 170 franchise firms classified each firm as to whether it was successful or not. Attached is the data.
- (2 points) What proportion of exclusive territory firms are successful?
- (2 points) Find the log odds for the answer in part 1.
- (6 points) Let x = 1 for exclusive territories and x = 0 for other territories. Using R or otherwise, give:
(a) (3 points) The fit for the logistic regression model.
(b) (3 points) The odds ratio for exclusive territory versus no exclusive territory.
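With a single 0/1 explanatory variable, the logistic regression fit can be read off the two group proportions: the intercept is the log odds for x = 0, and the slope is the difference in log odds between x = 1 and x = 0. Below is a minimal Python sketch of that relationship. The function names and the counts are hypothetical and deliberately do not match the 170 firms in the attached data.

```python
import math

def log_odds(successes, total):
    """Log odds of success, log(p / (1 - p)), from a count and a total."""
    p = successes / total
    return math.log(p / (1 - p))

def logistic_fit_binary(success_x1, total_x1, success_x0, total_x0):
    """Fit of log(odds) = b0 + b1 * x when x only takes the values 0 and 1.

    Returns (b0, b1, odds_ratio), where the odds ratio compares
    x = 1 with x = 0 and equals exp(b1).
    """
    b0 = log_odds(success_x0, total_x0)
    b1 = log_odds(success_x1, total_x1) - b0
    return b0, b1, math.exp(b1)

# Hypothetical counts, for illustration only:
# 30 of 40 successes when x = 1, and 20 of 40 successes when x = 0.
toy_b0, toy_b1, toy_odds_ratio = logistic_fit_binary(30, 40, 20, 40)
```

For these toy counts the odds are 3 for x = 1 and 1 for x = 0, so the slope is log(3) and the odds ratio is 3.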