These data are for full-time workers, defined as workers employed more than 35 hours per week for at least 48 weeks in the previous year.

FEMALE: 1 if female; 0 if male

YEAR: Year

AHE: Average Hourly Earnings (Dependent variable, Y)

BACHELOR: 1 if worker has a bachelor’s degree; 0 if worker has a high school degree

AGE: Age (Independent variable, X)

1. Draw a scatter plot between AHE (Y-axis) and AGE (X-axis).

2. Repeat #1, but create a separate scatterplot for each value of the BACHELOR variable. [You should have two scatterplots here!]

3. Run descriptive statistics for the dataset.

4. Repeat #3 but separately for males and females.

5. A researcher believes that AHE for female workers is the same irrespective of their educational status. Run a hypothesis test to verify their claim. Use a 5% level of significance (i.e. α = 0.05). What are the null and alternative hypotheses of this test? Interpret the result.

6. Calculate the correlation coefficient between AHE and AGE.

7. a. Run a regression of AHE on AGE for males and females separately.

b. Interpret the coefficients.

c. For males, what is the predicted AHE when AGE is 33? What about for a 33-year-old female?
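
For questions 5 to 7, the calculations could be sketched in Python as below, using only the standard library. This is an illustration under stated assumptions, not a required method: the values in `rows` are made-up placeholders rather than the assignment data, the helper names (`two_sample_test`, `correlation`, `ols`) are hypothetical, and the test uses a normal approximation to the unequal-variance two-sample t-test, which is only reasonable when both groups are large.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Placeholder rows (FEMALE, BACHELOR, AGE, AHE). These are invented values,
# NOT the assignment data; replace them with the real dataset.
rows = [
    (1, 0, 25, 12.0), (1, 0, 30, 14.0), (1, 0, 34, 15.0),
    (1, 1, 26, 18.0), (1, 1, 31, 21.0), (1, 1, 33, 24.0),
    (0, 0, 27, 15.0), (0, 0, 32, 17.0),
    (0, 1, 29, 22.0), (0, 1, 34, 27.0),
]


def two_sample_test(a, b):
    """Two-sided test of H0: mean(a) == mean(b), allowing unequal variances.

    The p-value uses a normal approximation (assumes large samples).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    stat = (mean(a) - mean(b)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def ols(x, y):
    """Simple least-squares fit of y on x; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Question 5: compare AHE of female workers with and without a bachelor's degree.
female_hs = [ahe for f, b, _, ahe in rows if f == 1 and b == 0]
female_ba = [ahe for f, b, _, ahe in rows if f == 1 and b == 1]
stat, p_value = two_sample_test(female_ba, female_hs)
print(f"test statistic = {stat:.3f}, p-value = {p_value:.4f}")
print("reject H0 at 5%" if p_value < 0.05 else "fail to reject H0 at 5%")

# Question 6: correlation between AGE and AHE over all workers.
ages = [age for _, _, age, _ in rows]
ahes = [ahe for _, _, _, ahe in rows]
print(f"corr(AHE, AGE) = {correlation(ages, ahes):.3f}")

# Question 7: regress AHE on AGE separately by sex, then predict at AGE = 33.
for label, flag in (("male", 0), ("female", 1)):
    x = [age for f, _, age, _ in rows if f == flag]
    y = [ahe for f, _, _, ahe in rows if f == flag]
    intercept, slope = ols(x, y)
    predicted = intercept + slope * 33
    print(f"{label}: AHE = {intercept:.2f} + {slope:.2f} * AGE; "
          f"predicted AHE at 33 = {predicted:.2f}")
```

In each regression the slope is the estimated change in AHE for one additional year of AGE, and the intercept is the fitted AHE at AGE = 0, which lies outside the range of ages in a sample of full-time workers and so has no direct economic interpretation.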

