Simple task in Stata analysis
Homework 5: Non-Linear Regression
(Due: 8:50am CST, Nov 29, 2024)
Let’s suppose you are interested in how life satisfaction is associated with internet use, age, gender, and marital status (i.e., married vs. non-married) for older adults (or how older adults’ internet use, age, gender, and marital status influence life satisfaction).
You set up your research question in the following model:
(Eq. 1)
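A linear specification consistent with the variables named above would be the following. The functional form and the coefficient labels are assumptions, and the variable names are the ones created in the steps below.

```latex
\text{life\_sat}_i = \beta_0 + \beta_1\,\text{internet\_use}_i + \beta_2\,\text{age}_i
                   + \beta_3\,\text{gender}_i + \beta_4\,\text{i\_married}_i + \varepsilon_i
```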
To examine your research question empirically, you figured out that the Health and Retirement Study (HRS) is the secondary data that best suits your goal. Thus, you want to work with 2018 HRS core data.
1. Please go to the following website:
2. Download the relevant files (e.g., h18sta.zip files… ).
3. In Section LB: Leave Behind Questionnaires (Respondent), you want to take the information about life satisfaction. Use the relevant dta file and keep only three variables: hhid, pn, and QLB002C.

HHID    HOUSEHOLD IDENTIFICATION NUMBER
Section: LB    Level: Respondent    Type: Character    Width: 6    Decimals: 0
...........................................................................
17146    010003-959738. Household Identification Number

PN    RESPONDENT PERSON IDENTIFICATION NUMBER
Section: LB    Level: Respondent    Type: Character    Width: 3    Decimals: 0
...........................................................................
 9738    010. Person Identifier
  654    011. Person Identifier
   32    012. Person Identifier
    1    013. Person Identifier
 5480    020. Person Identifier
  180    021. Person Identifier
   21    022. Person Identifier
    1    023. Person Identifier
  353    030. Person Identifier
   44    031. Person Identifier
    3    032. Person Identifier
    1    033. Person Identifier
  586    040. Person Identifier
   47    041. Person Identifier
    4    042. Person Identifier
    1    043. Person Identifier

QLB002C    Q02C. SATISFIED WITH LIFE
Section: LB    Level: Respondent    Type: Numeric    Width: 1    Decimals: 0
Please say how much you agree or disagree with the following statements. (Mark (X) one box for each line.)
I am satisfied with my life.
...........................................................................
  262    1. STRONGLY DISAGREE
  323    2. SOMEWHAT DISAGREE
  348    3. SLIGHTLY DISAGREE
  362    4. NEITHER AGREE OR DISAGREE
  734    5. SLIGHTLY AGREE
 1841    6. SOMEWHAT AGREE
 1774    7. STRONGLY AGREE
11502    Blank. INAP (Inapplicable); Partial Interview

4. Rename the variable QLB002C to life_sat.
5. Drop individuals with a missing value (.) for the life_sat variable.
6. Label the values of the life_sat variable:
    1. STRONGLY DISAGREE
    2. SOMEWHAT DISAGREE
    3. SLIGHTLY DISAGREE
    4. NEITHER AGREE OR DISAGREE
    5. SLIGHTLY AGREE
    6. SOMEWHAT AGREE
    7. STRONGLY AGREE
7. Save the data, temp_lb_2018.dta.
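Steps 3 to 7 could be sketched in a do-file roughly as follows. The input file name H18LB_R.dta and the label name life_sat_lbl are assumptions made for illustration; use whatever name your converted Section LB file actually has.

```stata
* Assumed file name for the Section LB respondent file in Stata format.
use "H18LB_R.dta", clear

* HRS files may store names in upper case; lower-case them so the
* lower-case names used in the assignment work in the commands below.
rename *, lower

keep hhid pn qlb002c
rename qlb002c life_sat

* Blank (INAP / partial interview) responses are stored as missing.
drop if missing(life_sat)

* life_sat_lbl is a hypothetical label name.
label define life_sat_lbl 1 "Strongly disagree" 2 "Somewhat disagree" ///
    3 "Slightly disagree" 4 "Neither agree or disagree" ///
    5 "Slightly agree" 6 "Somewhat agree" 7 "Strongly agree"
label values life_sat life_sat_lbl

save "temp_lb_2018.dta", replace
```

Running `tabulate life_sat` before saving is a quick way to compare the remaining counts with the codebook frequencies above.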
8. In Section W: Event History, Internet Use and Social Security (Respondent), you want to take the information about internet use. Use the relevant dta file and keep only three variables: hhid, pn, and QW303.

HHID    HOUSEHOLD IDENTIFICATION NUMBER
Section: W    Level: Respondent    Type: Character    Width: 6    Decimals: 0
...........................................................................
17146    010003-959738. Household Identification Number
==========================================================================
PN    RESPONDENT PERSON IDENTIFICATION NUMBER
Section: W    Level: Respondent    Type: Character    Width: 3    Decimals: 0
...........................................................................
 9738    010. Person Identifier
  654    011. Person Identifier
   32    012. Person Identifier
    1    013. Person Identifier
 5480    020. Person Identifier
  180    021. Person Identifier
   21    022. Person Identifier
    1    023. Person Identifier
  353    030. Person Identifier
   44    031. Person Identifier
    3    032. Person Identifier
    1    033. Person Identifier
  586    040. Person Identifier
   47    041. Person Identifier
    4    042. Person Identifier
    1    043. Person Identifier

QW303    REGULAR USE OF WEB FOR EMAIL
Section: W    Level: Respondent    Type: Numeric    Width: 2    Decimals: 0
Ref: EventHistory.W303_
Do you regularly use the Internet (or the World Wide Web) for sending and receiving e-mail or for any other purpose, such as making purchases, searching for information, or making travel reservations?
User Note: Interviewer-administered item.
...........................................................................
   14    -8. Web non-response
10068    1. YES
 6911    5. NO
    9    8. DK (Don’t Know); NA (Not Ascertained)
   22    9. RF (Refused)
  122    Blank. INAP (Inapplicable); Partial Interview

9. Rename the variable QW303 to internet_use.
10. Drop individuals whose internet_use value is -8, 8, or 9.
11. Replace the value 5 (NO) with 0.
12. Save the data, temp_w_2018.dta.
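Steps 8 to 12 follow the same pattern. A minimal sketch, again assuming a hypothetical input file name (H18W_R.dta) for the Section W respondent file:

```stata
* Assumed file name for the Section W respondent file in Stata format.
use "H18W_R.dta", clear
rename *, lower            // only matters if names are stored in upper case

keep hhid pn qw303
rename qw303 internet_use

* Drop web non-response (-8), DK/NA (8), and refused (9).
drop if inlist(internet_use, -8, 8, 9)

* Recode NO from 5 to 0 so internet_use is a 0/1 indicator.
replace internet_use = 0 if internet_use == 5

save "temp_w_2018.dta", replace
```

The steps do not ask you to drop the blank (missing) responses here; Stata's estimation commands exclude observations with missing regressors automatically.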
13. Merge two data files, temp_lb_2018.dta and temp_w_2018.dta.
14. Keep individuals that are exactly matched (i.e., Keep individuals shown both in the master and using data).
15. Save the merged data, temp_merged_lb_w_2018.dta.
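One way to carry out steps 13 to 15 is a one-to-one merge on the two identifiers. This assumes hhid and pn together uniquely identify a respondent in both files; dropping _merge afterwards is an addition made here so that a later merge does not fail because _merge already exists.

```stata
use "temp_lb_2018.dta", clear

* Match respondents on household ID and person number.
merge 1:1 hhid pn using "temp_w_2018.dta"

* _merge == 3 marks observations found in both master and using data.
keep if _merge == 3
drop _merge

save "temp_merged_lb_w_2018.dta", replace
```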
16. Please download the h18_trk.dta from Canvas and keep the variables: hhid, pn, qage, gender, and qmarst.

HHID    HOUSEHOLD IDENTIFICATION NUMBER
Section: W    Level: Respondent    Type: Character    Width: 6    Decimals: 0
010003-959738. Household Identification Number
===========================================================================
PN    RESPONDENT PERSON IDENTIFICATION NUMBER
Section: W    Level: Respondent    Type: Character    Width: 3    Decimals: 0

QAGE    AGE AT 2018 INTERVIEW
Section: TR    Level: Respondent    Type: Numeric    Width: 3    Decimals: 0
999: Not applicable

GENDER    GENDER
Section: TR    Level: Respondent    Type: Numeric    Width: 1    Decimals: 0
...........................................................................
19180    1. Male
24234    2. Female
  144    Blank. Unknown

QMARST    2018 MARITAL STATUS
Section: TR    Level: Respondent    Type: Numeric    Width: 1    Decimals: 0
This variable may not be completely consistent with core data. Corrections have been made to this variable based on cross-wave information. See Section 5B1 in the tracker data description for more information.
...........................................................................
 9758    1. Married
 3625    2. Separated/Divorced
 3024    3. Widowed
 1343    4. Never Married
  260    5. Marital Status Unknown
25548    Blank. No core interview from household, or not in sample this wave

17. Rename the variable qage to age.
18. Drop individuals who reported 999 in the age variable.
19. Drop individuals who reported missing (.) in the gender variable.
20. Rename the variable qmarst to marital_st.
21. Drop individuals who reported 5 or missing (.) in the marital_st variable.
22. Generate an indicator variable for married (i.e., i_married) and code 1 if individuals reported 1 in the marital_st variable and 0 if individuals reported 2, 3, or 4 in the marital_st variable.
23. Save the data, temp_trk_2018.dta.
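Steps 16 to 23 might look like the sketch below. It assumes h18_trk.dta is in the working directory. Because everyone with marital_st equal to 5 or missing has already been dropped, the logical expression used for i_married is 1 for married respondents and 0 for categories 2, 3, and 4.

```stata
use "h18_trk.dta", clear
rename *, lower            // only matters if names are stored in upper case

keep hhid pn qage gender qmarst

rename qage age
drop if age == 999                         // 999 = not applicable

drop if missing(gender)

rename qmarst marital_st
drop if marital_st == 5 | missing(marital_st)

* 1 if married, 0 if separated/divorced, widowed, or never married.
generate i_married = (marital_st == 1)

save "temp_trk_2018.dta", replace
```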
24. Merge two data files, temp_trk_2018.dta and temp_merged_lb_w_2018.dta.
25. Keep individuals that are exactly matched (i.e., Keep individuals shown both in the master and using data).
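Steps 24 and 25 repeat the earlier merge with the tracker file as the master data. This sketch assumes the using file no longer contains a _merge variable; if it does, drop it from that file first.

```stata
use "temp_trk_2018.dta", clear

merge 1:1 hhid pn using "temp_merged_lb_w_2018.dta"

* Keep respondents present in both the tracker and the merged LB/W data.
keep if _merge == 3
drop _merge
```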
This time, instead of using the OLS method, you want to use non-linear regression models.
26. Run the ordered probit regression to estimate (Eq. 1) above.
27. Interpret your estimate for internet_use.
28. Run the ordered logit regression to estimate (Eq. 1) above, and get the estimates of odds ratio (i.e., use the “or” option in the ologit command).
29. Interpret your estimate for internet_use.
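The two estimation steps could be run roughly as follows. This is a sketch that assumes Eq. 1 includes the four regressors linearly and that the final merged data are in memory; the factor-variable prefixes (i. and c.) are a choice made here, not something the assignment specifies.

```stata
* Q26: ordered probit. Coefficients are on the latent-index scale.
oprobit life_sat i.internet_use c.age i.gender i.i_married

* Optional aid for Q27: average change in Pr(life_sat == 7)
* when internet_use goes from 0 to 1.
margins, dydx(internet_use) predict(outcome(7))

* Q28: ordered logit, reporting odds ratios instead of coefficients.
ologit life_sat i.internet_use c.age i.gender i.i_married, or
```

For the ordered probit, the sign of the internet_use coefficient gives the direction in which internet use shifts latent life satisfaction, but its size is not a change in probability, which is why the margins line is included. For the ordered logit, an odds ratio above 1 means internet users have higher odds of being in a higher satisfaction category rather than a lower one, holding the other variables constant.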
30. Your null hypothesis: Gender does not affect life satisfaction. Based on your ordered logit estimate for gender in Q28, test your hypothesis using the 95% confidence interval.
31. Visualize the conditional predicted probability of each response category (1. SDA, … 7. SA) for individuals who are female and aged between 20 and 80.
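A possible approach to Q30 and Q31 is sketched below. It assumes Stata 14 or later, where margins after ologit reports all outcome categories at once, and it re-estimates the model with the same assumed specification so the block stands alone. The 10-year age grid is an arbitrary choice.

```stata
ologit life_sat i.internet_use c.age i.gender i.i_married, or

* Q30: the 2.gender row (female vs. male) of the table above reports the
* 95% CI for the odds ratio. The null of no gender effect is OR = 1, so
* reject it at the 5% level only if the interval excludes 1.

* Q31: predicted probability of each category for women (gender == 2)
* at ages 20, 30, ..., 80, averaging over the other covariates.
margins, at(age=(20(10)80) gender=2)

* One line per response category; noci hides the confidence bands.
marginsplot, noci
```

The plot can be written to disk with `graph export` for inclusion in the Word document.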
Submit your 1) do file and 2) MS Word document.