All work to be completed


PART 1

PART 2

1.

To investigate the effect of an outlier on the correlation coefficient, r.

In order to see if the drink sales from a kiosk might be predicted from daily temperatures, a

survey recorded how many cold drinks were sold over 13 days in summer.

Temp (°C)               22  27  31  36  25  34  39  38  36  28  33  24  29
Number of cold drinks   31  41  42  51  39  50  40  57   7  44  49  36  45

(a)

Using your calculator, construct a scatterplot for the data.

Print or copy-out a good sketch of the plot and label the axes. (Make sure you put

the explanatory and response variables on the right axes!) Can you see any

outlier(s) on your scatterplot?

(b)

Using your own visual judgement (‘by inspection’), describe the association

between the variables in terms of form, strength and direction.

(c)

(i) Now use your calculator to find the value of Pearson’s correlation

coefficient, r then

(ii) interpret the value of r in terms of what it can tell us about any association

between the variables temperature and number of cold drinks.

(iii) Does the calculated correlation coefficient value from c (i) support your visual

judgement from (b) regarding the strength of the relationship between the

variables?

(iv) Calculate the coefficient of determination, r², and interpret its value: what can

the value of r² tell us about the association here?

What does the interpretation imply regarding the predictive power of the

association?

(d)

(i) Use your calculator to find the upper and lower quartiles for the temperature

data, considered as a univariate set. Now calculate upper and lower fences and

do the test for outliers, show all working.

(ii) Use your calculator to find the upper and lower quartiles for the number of

cold drinks sold. Then do the test for outliers, show your working.

(e)

Remove any data pairs for the outlier values you have found and recalculate the

Pearson’s correlation coefficient for the new set of pairs with outliers removed.

(f)

Compare the new r (without outliers) with your result for r in c (i).

Does this new value for r support or contradict your visual judgement in (b)?

(g)

What conclusions can you draw from this activity, regarding the effect of

outlier(s) on the value of Pearson’s correlation coefficient?

(h)

(i) Calculate the new value for the coefficient of determination, r², for the data set

with outliers removed.

(ii) Give an interpretation to this new coefficient of determination in terms of the

variables then comment on the implications for the predictive power of the

association.

(i)

Use your CAS to produce a scatterplot for the new set of data with outliers

removed. Perform a linear regression analysis of the plot. Write out the equation

for the regression line, in terms of the variables in this model and comment on the

reliability of this linear regression model for making predictions.
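The steps in parts (c) to (i) can also be checked away from the calculator. The following is a minimal sketch in Python, not the required method. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (the middle value left out when the count is odd), which may differ slightly from the convention your CAS uses. The function names `pearson_r`, `fences` and `least_squares` are made up for this sketch.

```python
from math import sqrt
from statistics import mean, median

# Data from the table in question 1 (13 days).
temps = [22, 27, 31, 36, 25, 34, 39, 38, 36, 28, 33, 24, 29]
drinks = [31, 41, 42, 51, 39, 50, 40, 57, 7, 44, 49, 36, 45]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from sums of squared deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fences(values):
    """Lower and upper fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle value left out when the count is odd.
    """
    s = sorted(values)
    half = len(s) // 2
    q1, q3 = median(s[:half]), median(s[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def least_squares(xs, ys):
    """Intercept a and slope b of the line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b


r_all = pearson_r(temps, drinks)
t_lo, t_hi = fences(temps)
d_lo, d_hi = fences(drinks)

# Keep only the pairs where both values lie inside their fences.
kept = [(t, d) for t, d in zip(temps, drinks)
        if t_lo <= t <= t_hi and d_lo <= d <= d_hi]
kept_temps = [t for t, _ in kept]
kept_drinks = [d for _, d in kept]

r_kept = pearson_r(kept_temps, kept_drinks)
a, b = least_squares(kept_temps, kept_drinks)

print(f"all pairs:        r = {r_all:.4f}, r^2 = {r_all ** 2:.4f}")
print(f"temperature fences: {t_lo} to {t_hi}")
print(f"drinks fences:      {d_lo} to {d_hi}")
print(f"outliers removed: r = {r_kept:.4f}, r^2 = {r_kept ** 2:.4f}")
print(f"cold drinks = {a:.2f} + {b:.2f} x temperature")
```

Comparing the two printed values of r shows how much a single point can change the coefficient. You should still show your own working for the fences, as the question asks.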

2. Researchers found a correlation of 0.86 between the number of churchgoers and the

number of burglaries committed in 50 towns. This is a strong positive correlation.

Does the result mean that attending church makes people become burglars? Or, does it

mean that burglars deliberately target the houses of churchgoers on Sunday mornings? Is

there a hidden factor that might explain the results? Discuss.

3. As part of her studies, a university student visited a school holiday childcare centre and

gathered information about the sleeping patterns of the children in the centre. The scatter

plot for the data she gathered is graphed below:

[Scatter plot: hours of sleep (vertical axis) against age in years (horizontal axis)]

(a) Copy the scatter plot to your working and draw a line of best fit through the points

‘by eye’—i.e. just by drawing a line which you think looks like it captures the

general trend of the data.

(b) Using your ‘by eye’ regression line from (a), read off from your graph:

(i) the expected number of sleeping hours of a child of age 5½ years.

(ii) the estimated age if a child sleeps for 12.5 hours.

(c) Here is the data set the student collected and put into the plot above:

{(1, 20), (2, 15), (3, 18), (4, 12), (5, 13), (6, 15), (7, 11), (8, 14), (9, 5)}

Put these values into your CAS and use them to calculate the least squares

regression line for the data. Write your equation out, using the variable names.

(d) Use the regression equation obtained in (c) to predict

(i) the expected number of sleeping hours of a child of age 5½ years

(ii) the estimated age if a child sleeps for 12.5 hours

(e) Are your answers to parts (b) and (d) the same (or at least very similar)?

Should they be the same? Why might they differ? Give reasons for your answer.
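Parts (c) and (d) can be sketched in Python as below, as a check on the CAS result. This is an illustration only. It assumes that the age for 12.5 hours of sleep is found by reading the same hours-on-age line backwards, as you would on the graph, and the variable names are made up for the sketch.

```python
# Data set from part (c): (age in years, hours of sleep).
points = [(1, 20), (2, 15), (3, 18), (4, 12), (5, 13),
          (6, 15), (7, 11), (8, 14), (9, 5)]

ages = [age for age, _ in points]
hours = [h for _, h in points]
n = len(points)
mean_age = sum(ages) / n
mean_hours = sum(hours) / n

# Least squares slope and intercept for: hours = a + b * age
b = (sum((x - mean_age) * (y - mean_hours) for x, y in points)
     / sum((x - mean_age) ** 2 for x in ages))
a = mean_hours - b * mean_age

# (d)(i): predicted hours of sleep at age 5.5 years.
hours_at_5_5 = a + b * 5.5

# (d)(ii): solve 12.5 = a + b * age for age.
# Assumption: the hours-on-age line is simply read backwards.
age_for_12_5 = (12.5 - a) / b

print(f"hours of sleep = {a:.2f} + ({b:.2f}) x age")
print(f"predicted hours at age 5.5: {hours_at_5_5:.2f}")
print(f"estimated age for 12.5 hours: {age_for_12_5:.2f}")
```

A line drawn by eye will rarely match these values exactly, which is the point of part (e).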

Exam Practice

In Further Mathematics there are two end-of-year examinations. Examination 1 consists of

multiple-choice questions covering the two CORE modules (Data Analysis & Financial

Modelling) and the two chosen modules (Geometry & Trigonometry and Matrices).

The work you have just completed is a part of the Data Analysis module.

In VCE Exam 1, there are a total of 40 questions to be completed in 90 minutes. So, on

average, you should plan to take around 2 minutes per question. One mark is given for each

correct answer.

In order to practise working under exam conditions, we suggest you attempt the five

multiple-choice questions below at this rate, i.e. you should time yourself and take no more

than 10 minutes for them all.

Restrict your time to 10 minutes only.

You should then submit your answers to these questions electronically using the week 4

online quiz. That way you will get immediate feedback on your answers.

If you get less than 4/5, this indicates that you need to spend more time reviewing the work

for the week.

It is not necessary to show your working or include these questions in your SEND

document as credit is given for correct answers only.

Circle the letter beside the correct answer.

1 The following scatter plot shows the relationship between age and the number of alcoholic

drinks consumed on the weekend by a group of people.

[Scatter plot not reproduced]

The value of the correlation coefficient is closest to:

A – 0.8

B – 0.4

C – 0.1

D 0.4

E 0.8

2 The length (in metres) and wingspan (in metres) of eight commercial aeroplanes are

displayed in the table below:

Length (m)     70.7  70.7  63.7  58.4  54.9  39.4  36.4  33.4
Wingspan (m)   64.4  59.6  60.3  60.3  47.6  35.8  28.9  28.9

Correct to four decimal places, the value of Pearson’s product moment correlation

coefficient for this data is

A 0.9371

B 0.9583

C 0.9681

D 0.9793

E 0.9839
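A question like this is normally done on the calculator, but the calculation behind it can be sketched in Python. The sketch below assumes the first row of values in the table is length and the second is wingspan. Swapping the two variables does not change r. The function name `pearson_r` is made up for the sketch.

```python
from math import sqrt

# Values from the table in question 2 (metres).
lengths = [70.7, 70.7, 63.7, 58.4, 54.9, 39.4, 36.4, 33.4]
wingspans = [64.4, 59.6, 60.3, 60.3, 47.6, 35.8, 28.9, 28.9]


def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(lengths, wingspans)
print(f"r = {r:.4f}")
```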

3 Data was collected from a group of students concerning the number of hours they spend

reading for recreation each week, and their score on an English examination. The plot for

the data shows a linear association with the correlation coefficient r = 0.72.

From this we can say that:

A 72% of the students read for leisure.

B Most of the students who read for leisure scored 72 on the English examination.

C Those students who spent more time reading for leisure tended to score lower on the

English examination.

D Those students who spent more time reading for leisure tended to score higher on the

English examination.

E There is only a very weak relationship between score on the English examination and

number of hours spent reading for leisure.

4 Researchers conduct a study in order to predict weight from height. If the correlation

between height and weight for a group of people is determined to be 0.75, then we can say

that approximately:

A 75% of the variation in weight is explained by the variation in height

B 75% of the variation in height is explained by the variation in weight

C 56% of the variation in weight is explained by the variation in height

D 87% of the variation in height is explained by the variation in weight

E height and weight are negatively correlated.

5 For which one of the following plots would it be appropriate to calculate the value of the

correlation coefficient, r?

[Plots A to E: five scatterplots, not reproduced here]

Work out the solutions to the following questions then check them by putting your answers into

the week 5 online quiz. That way you will get immediate feedback on your answers.

If you get less than 80% this indicates that you need to spend more time reviewing the work for

the last two weeks.

It is not necessary to upload your working to your teacher this week. When you have

finished putting your answers into the online quiz, you just need to go to the Work for

submission [W05] link and type in a brief reflection on your learning progress.

TRUE OR FALSE?

Are the following statements true (T) or false (F)? Type T or F for each statement.

1.

Pearson’s correlation coefficient is insensitive to outliers.

2.

A prediction within the original range of the data is called an interpretation.

3.

To calculate a residual value we subtract the observed value from the predicted value

MULTIPLE CHOICE QUESTIONS

4.

The product moment correlation coefficient (Pearson’s r) was found to be –0.3951 for

the data displayed. If the point (7, 25) was replaced with (7, 5) and Pearson’s r

recalculated, the new value of r would be:

A unchanged

B positive but closer to 1

C negative but closer to zero

D positive but closer to zero

E negative but closer to –1

[Scatterplot: vertical axis 0 to 25, horizontal axis 0 to 10]

5. The scatterplot shows the game scores achieved by a group of players in two games.

Which equation is closest to the least squares regression equation of G2 on G1?

A G2 = G1 + 15

B G2 = G1/3

C G1 + G2 = 15

D [equation garbled in source]

E [equation garbled in source]

[Scatterplot: score in second game (G2), 0 to 45, against score in first game (G1), 0 to 50]

6. Given that r = –0.873, s_x = 5.832 and s_y = 6.001, the slope, b, of the regression line

y = a + bx is closest to:

A –0.90

B –0.87

C –0.85

D 0.87

E 0.90
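The slope of a least squares line can be found from the summary statistics with the formula b = r × s_y / s_x. A minimal sketch of that arithmetic in Python, using the values given in the question:

```python
# Summary statistics given in question 6.
r = -0.873
s_x = 5.832
s_y = 6.001

# Slope of the least squares line y = a + b*x.
b = r * s_y / s_x

print(f"b = {b:.3f}")
```

The slope takes the same sign as r, which rules out the positive options before any calculation.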

7. The following least squares regression line relates the value of sales made per month, in

$’000s, at a large car yard to the number of salespersons on the staff:

Car sales (thousands of dollars) = 13 + 36 × Number of salespeople

Thus we can say:

A

on average, sales are increasing by $36 000 per month.

B

on average, sales are decreasing by $36 000 per month.

C

for each additional salesperson employed we predict an increase in sales of $36 000.

D

on average, sales are increasing by $13 000 per month.

E

for each additional salesperson employed we predict an increase in sales of $13 000.

8. The plot below shows a least squares regression line together with the data used in the

calculation of that line.

The corresponding residual plot for this least squares regression line is closest to:

A

B

C

D

E

9. A person’s weight is known to be positively associated with their height. To investigate this

association for 12 men, a scatterplot is constructed as shown. When a least squares regression

line is used to model this data, the coefficient of determination is found to be 0.3146.

The scatterplot shows a very clear outlier. If the outlier is removed from the data, and a least

squares regression line refitted to the data of the remaining 11 men, the value of the

coefficient of determination will

A

remain the same

B

increase

C

decrease

D

be halved

E

not be able to be determined

10.

The table below shows the number of years employed and annual income (in

thousands of dollars) for 12 graduates employed by a large accounting firm.

Years of service         5   0   8   1   5   4   9   6   7   9   2   6
Annual salary ($’000)   52  23  68  25  45  38  75  50  62  64  32  88

The least squares regression line which would enable annual salary to be predicted from

years of service is closest to:

A Salary = 21.715 – 5.829 × years

B Salary = –21.715 + 5.829 × years

C Salary = 5.829 + 21.715 × years

D Salary = 5.829 – 21.715 × years

E Salary = 21.715 + 5.829 × years
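As a check on the calculator, the line for question 10 can be built from the formulas b = r × s_y / s_x and a = ȳ − b × x̄. The sketch below is one way to do this in Python. It assumes the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes, and the variable names are made up for the sketch.

```python
from statistics import mean, stdev

# Values from the table in question 10.
years = [5, 0, 8, 1, 5, 4, 9, 6, 7, 9, 2, 6]
salary = [52, 23, 68, 25, 45, 38, 75, 50, 62, 64, 32, 88]

n = len(years)
x_bar, y_bar = mean(years), mean(salary)
s_x, s_y = stdev(years), stdev(salary)  # sample standard deviations

# Pearson's r from the sample covariance and the two standard deviations.
cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, salary)) / (n - 1)
r = cov / (s_x * s_y)

# Slope and intercept of: salary = a + b * years
b = r * s_y / s_x
a = y_bar - b * x_bar

print(f"Salary = {a:.3f} + {b:.3f} x years")
```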

11. A student fits a least squares line to a set of bivariate data, as shown in the scatterplot

below.

The residual plot for this least squares line would look like:

12. When the correlation coefficient, r, was calculated for the data displayed in the

scatterplot below, it was found to be − 0.3951. If the point (7, 25) was replaced with the

point (7, 5) and the correlation coefficient, r, recalculated, then the value of r would be:

A unchanged

B positive but closer to 1

C negative but closer to 0

D positive but closer to 0

E negative but closer to −1

13. A least squares regression line has been fitted to a scatterplot, as shown below.

The equation of this line is closest to:

A

y = 0.8 − 10x

B

y = 110 + 0.8x

C

y = −1.25 + 110x

D

y = 110 − 1.25x

E

y = 110 − 0.8x

PROBLEM SOLVING I : Finding Equation of the LSR line using Formula method

14.

We wish to find the equation of the least squares regression line that will enable height (in

cm) to be predicted from femur length (in cm). The femur is the thigh bone.

Complete the following sentences by filling in the missing information.

Which is the explanatory variable (EV) and which is the response variable (RV)?

(i) Explanatory variable is: ______ (height or femur length)

(ii) Response variable is: ______ (height or femur length)

15.

Use the summary statistics given below to determine the gradient (2 d.p.) and y-intercept

(2 d.p.) of the equation of the least squares regression line that will enable

height (y) to be predicted from femur length (x).

Summary statistics

r = 0.9939

x̄ = 24.246

s_x = 1.873

ȳ = 166.092

s_y = 10.086

Write the least squares equation in terms of height and femur length. Fill in the boxes

below.

[Warning: if you do not round off the gradient and y-intercept to the nearest 2 decimal places

correctly, the computer will mark your answers as incorrect!]

______ = ______ + ______ × ______

16.

Interpret the slope of the regression equation in terms of height and femur length

(fill in the blanks below with the correct words and numbers to two decimal places):

For every 1 cm increase in ______, ______ increases by ______ cm.

17.

Determine the value of the coefficient of determination (2 d.p.) and interpret this in terms

of height and femur length.

______ % of the ______ in ______ can be explained by the ______ in ______.
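The formula method used in questions 15 to 17 can be sketched in Python as follows. It assumes that the values 24.246 and 166.092 in the summary statistics are the means of femur length and height. The intercept here is computed from the unrounded slope, so check whether your course expects you to round the slope first, since that can change the last decimal place.

```python
# Summary statistics from question 15 (x = femur length, y = height).
r = 0.9939
x_mean = 24.246   # assumed to be the mean femur length (cm)
s_x = 1.873
y_mean = 166.092  # assumed to be the mean height (cm)
s_y = 10.086

# Gradient and y-intercept of: height = a + b * femur length
b = r * s_y / s_x
a = y_mean - b * x_mean

# Coefficient of determination.
r_squared = r ** 2

print(f"gradient b    = {b:.2f}")
print(f"y-intercept a = {a:.2f}")
print(f"r^2           = {r_squared:.4f} ({100 * r_squared:.2f}%)")
```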

PROBLEM SOLVING II – A Full Regression Analysis

The cost of preparing meals in a school canteen each day is assumed to be linearly associated to

the number of meals prepared. To help the caterers predict the costs, data were collected on the

cost of preparing meals over 11 days. The data are shown below.

Complete the following sentences by filling in the missing information.

18. In this situation, the explanatory variable is ______ (cost or number of meals?)

19. Use your calculator to find the equation of the least squares regression line relating the cost

of preparing meals to the number of meals produced (giving all values to one decimal

place).

cost = ______ + ______ × number of meals

20.

Use the equation to predict the cost of producing:

(i) 48 meals.

Answer: $ ______ (to nearest cent)

(ii) In making this prediction are you interpolating or extrapolating?

21.

Use the equation to predict the cost of producing:

(i) 21 meals.

Answer: $ ______ (to nearest cent)

(ii) In making this prediction are you interpolating or extrapolating?

22.

Interpret the y-intercept and gradient/slope of the regression line:

The y-intercept of the regression line predicts that the fixed costs of running the canteen

(even if no meals are prepared) are $ ______ (to nearest cent).

The slope of the regression line predicts that, for each additional meal produced, meal

preparation costs increase by $ ______ (to nearest cent).

23.

Use your calculator to find the correlation coefficient, r, and the coefficient of

determination, r². Use these values to complete the following interpretation sentences.

(i) The correlation coefficient, r, equals ______ (4 d.p.).

This suggests that there is a ______ (strength), ______ (direction)

association between the cost of preparing meals and the number of meals produced.

(ii) The coefficient of determination, r², equals ______ (4 d.p.).

This indicates that ______ % of the variation in the cost of preparing meals can be

______ by the variation in the number of meals produced.

24.

Use your calculator to plot the data, fit a regression line and plot the residuals.

Does the residual plot confirm or contradict the assumption that there is a linear

association present here?
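A residual is the observed value minus the value predicted by the regression line. The sketch below shows how the residuals behind a residual plot could be computed in Python. The canteen data table did not survive in this copy, so the numbers used here are HYPOTHETICAL placeholders for illustration only and must not be used as answers.

```python
# HYPOTHETICAL data, for illustration only: these are NOT the
# worksheet's values for number of meals and cost.
meals = [30, 35, 40, 45, 50, 55]
cost = [190.0, 212.0, 229.0, 251.0, 268.0, 291.0]

n = len(meals)
x_bar = sum(meals) / n
y_bar = sum(cost) / n

# Least squares line: cost = a + b * meals
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(meals, cost))
     / sum((x - x_bar) ** 2 for x in meals))
a = y_bar - b * x_bar

# Residual = observed value - predicted value.
residuals = [y - (a + b * x) for x, y in zip(meals, cost)]

for x, res in zip(meals, residuals):
    print(f"meals = {x:2d}, residual = {res:+.2f}")
```

If the residuals scatter randomly above and below zero with no curve or other pattern, the plot supports the assumption of a linear association.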

SEND: Work for Submission – Exam Practice

In Further Mathematics there are two end-of-year examinations. Examination 1 consists of

multiple-choice questions covering the two CORE modules (Data Analysis & Financial

Modelling) and the two chosen modules (Geometry & Trigonometry and Matrices).

The work you have just completed is a part of the Data Analysis module.

In VCE Exam 1, there are a total of 40 questions to be completed in 90 minutes. So, on average,

you should plan to take around 2 minutes per question. One mark is given for each correct

answer.

In order to practise working under exam conditions, we suggest you attempt the five multiple-choice questions below at this rate, i.e. you should time yourself and take no more than 10 minutes

for them all.

Restrict your time to 10 minutes only.

You should then submit your answers to these questions electronically using the week 6 online

quiz. That way you will get immediate feedback on your answers.

If you get less than 4/5, this indicates that you need to spend more time reviewing the work for

the week.

It is not necessary to show your working or include these questions in your SEND

document as credit is given for correct answers only.

Circle the letter beside the correct answer.

1 The relationship between the two variables x and y as shown in the scatterplot is non-linear.

Which of the following transformations could possibly linearize the scatterplot?

[Scatterplot: y (0 to 40) against x (0 to 50)]

A log x

B log y

C 1/x

D 1/y

E y²

2 A student uses the following data to construct the scatterplot shown

To linearize the scatterplot she applies an

x-squared transformation. She then fits a

least squares regression line to the

transformed data with y as the dependent

variable. The equation of this least squares

regression line is closest to

A y = 7.1 + 2.9x²

B y = 29.5 + 26.8x²

C y = 26.8 − 29.5x²

D y = 1.3 + 0.04x²

E y = 2.2 + 0.3x²

3 The plot on the right shows a least

squares regression line together with the

data used in the calculation of that line.

The plot is clearly non-linear. The plot

can be linearized by applying a log x

transformation.

When this is done, a least squares

regression line is fitted to the

transformed data giving the equation:

Life expectancy = 14.3 + 14.5 log(GNP).

Using this equation, the average life expectancy of a country with a GNP of $7000 is

predicted to be closest to

A 63

B 65

C 67

D 70

E 73
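To use a regression equation fitted to transformed data, transform the x-value first and then substitute. A minimal sketch in Python, assuming that log means log base 10 (the usual convention in this course). The function name is made up for the sketch.

```python
from math import log10


def predicted_life_expectancy(gnp):
    """Life expectancy = 14.3 + 14.5 * log(GNP), taking log as base 10."""
    return 14.3 + 14.5 * log10(gnp)


estimate = predicted_life_expectancy(7000)
print(f"predicted life expectancy: {estimate:.1f} years")
```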

The following information relates to questions 4 and 5

Cars depreciate in value over time. The table gives the average value of a car at different ages.

Table 1

When a scatterplot is drawn for this data it indicates that a logarithmic transformation of the

horizontal axis may linearize the data.

The original data has been reproduced in Table 2 with an extra row for the transformed variable

being added. Table 2 is incomplete.

Table 2

4 What value, correct to two decimal places, is needed to complete the table?

A 1.0

B 0.94

C 0.95

D 0.954

E 0.96

5 The equation of the least squares regression line fitted to the transformed data is

A value = 10800 − 18300 × log(age)

B value = 1200 + 17500 × log(age)

C value = 10800 + 18300 × log(age)

D value = 18300 − 10800 × log(age)

E value = 17500 − 1200 × log(age)

…
