Faculty of Engineering and Applied Science
2010 Fall
[Topics after Test 2]
A production process gives components whose strengths are normally distributed with a mean of 400 N and a standard deviation of 11.5 N. A modification is made to the process which cannot reduce but may increase the mean strength. It may also change the variance. The strengths of nine randomly selected components from the modified process, in Newtons, are:
396 402 409 409 414 398 394 436 418
Test, at a five per cent level of significance, the hypothesis that the mean strength has not increased.
Let X = component strength (N)
X ~ N(µ, σ²)
Test Ho: µ = 400 vs. Ha: µ > 400 at α = .05
σ² is unknown for the new process.
Data: n = 9 | Σx = 3676 | Σx² = 1502838 | α = 0.05 |
x̄ = Σx / n = 408.444...
s² = (n Σx² − (Σx)²) / (n(n − 1)) = 12566 / 72 = 174.527...
s = 13.21089617
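Method 1
c = µo + t.05, 8 × s / √n = 400 + 1.859... × 4.403... = 408.18...
x̄ = 408.44... > c
∴ Reject Ho in favour of Ha at α = 0.05
OR
Method 2
tobs = (x̄ − µo) / (s / √n) = 8.444... / 4.403... = 1.917...
t.05, 8 = 1.859... and tobs > t.05, 8
∴ Reject Ho.
OR
Method 3
From t-tables, t.05, 8 = 1.859... and t.025, 8 = 2.306...
∴ .025 < P[T > tobs] < .05 and α = .05
[From appropriate software, P[T > tobs] = .045 < α]
∴ Reject Ho.
OR
Confidence Interval Method
The one-sided 95% CI for µ has its boundary at x̄ − t.05, 8 × s / √n = 400.25...
∴ we are 95% sure that µ > 400.25...
µ = 400 is not inside this CI.
∴ Reject Ho.
At α = 5%, there is sufficient evidence to conclude that there has been an increase in mean strength.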
An Excel spreadsheet file and a Minitab project file are also available to illustrate this solution.
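The one-sample t calculation above can be checked with a short script. This is only a sketch: the variable names are illustrative, and the critical value t.05, 8 is typed in from the tables (rounded to 1.860) rather than computed.

```python
import math
import statistics

# Strengths (N) of the nine components from the modified process
strengths = [396, 402, 409, 409, 414, 398, 394, 436, 418]
mu_0 = 400.0       # mean under the null hypothesis
t_crit = 1.860     # t.05, 8 read from t-tables (an input, not computed here)

n = len(strengths)
x_bar = statistics.mean(strengths)
s = statistics.stdev(strengths)           # sample standard deviation (n - 1 divisor)
std_err = s / math.sqrt(n)
t_obs = (x_bar - mu_0) / std_err
lower_bound = x_bar - t_crit * std_err    # one-sided 95% lower confidence bound for mu

print(f"x_bar = {x_bar:.3f}, s = {s:.4f}, t_obs = {t_obs:.3f}")
print(f"reject Ho: {t_obs > t_crit}; one-sided CI: mu > {lower_bound:.2f}")
```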
A transport firm is very suspicious of the tyre company’s claim that the average lifetime of its tyres is at least 45,000 kilometres. The transport company decides to check this claim by fitting forty of these tyres to its trucks, the tyres being a random sample.
A mean lifetime of 44,164 km with a sample standard deviation of 2,106 km is observed. What may the transport firm conclude about the claim at a level of significance of one per cent?
The burden of proof is on the transport firm (to disprove the tyre company's claim).
Test: Ho: µ ≥ 45000 (the tyre company's claim) vs. Ha: µ < 45000 at α = .01
Data: n = 40 | x̄ = 44164 | s = 2106 |
Method 1
c = µo − t.01, 39 × s / √n = 45000 − 2.425... × 332.98... = 44192.2...
x̄ = 44164 < c
∴ Reject Ho.
OR
Method 2
tobs = (x̄ − µo) / (s / √n) = −836 / 332.98... = –2.51...
–t.01, 39 = –2.425... and tobs < –t.01, 39
∴ Reject Ho.
OR
Method 3
tobs = –2.51...
From t-tables, t.01, 39 = 2.425..., so P[T < tobs] < 0.01 (= α)
[From appropriate software, P[T < tobs] = 0.008... < α]
∴ Reject Ho.
OR
Confidence Interval Method
The one-sided 99% confidence interval estimate for µ is µ < 44972 km.
This CI does not include µo = 45000.
∴ Reject Ho.
∴ the transport company does have sufficient evidence to dispute the tyre company's claim.
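The lower-tail test on the tyre lifetimes uses summary statistics only, so it can be sketched in a few lines. The names are illustrative and t.01, 39 is taken from tables (rounded to 2.426), not computed.

```python
import math

# Summary statistics for the forty tyres
n, x_bar, s = 40, 44164.0, 2106.0
mu_0 = 45000.0     # boundary value of the tyre company's claim
t_crit = 2.426     # t.01, 39 read from t-tables (an input, not computed here)

std_err = s / math.sqrt(n)
t_obs = (x_bar - mu_0) / std_err
upper_bound = x_bar + t_crit * std_err    # one-sided 99% upper confidence bound for mu
reject = t_obs < -t_crit                  # lower-tail rejection region

print(f"t_obs = {t_obs:.3f}, reject Ho: {reject}")
print(f"one-sided 99% CI: mu < {upper_bound:.0f} km")
```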
A garage wants to know if a more expensive type of radial tyre has a tread life significantly more than 10,000 km beyond the tread life of a cheaper bias-ply tyre. Only if this is the case will the garage invest in the more expensive type of tyre. A random sample of forty tyres of each type is tested and the tread lives are measured. The radial tyres have a mean tread life of 36,500 km with a standard deviation of 2,200 km, while the bias ply tyres have a mean tread life of 23,800 km with a standard deviation of 1,500 km.
Based on these data, should the garage invest in the radial tyres?
Let XR = life of a randomly chosen radial tyre
and XB = life of a randomly chosen bias-ply tyre
The burden of proof is on µR > µB + 10000
Free choice of α. I choose α = 0.01
Test: Ho: µR – µB = 10000 (or less) vs. Ha: µR – µB > 10000
Data: | nR = 40 | x̄R = 36500 | sR = 2200 |
nB = 40 | x̄B = 23800 | sB = 1500 |
An unpaired two-sample t-test is appropriate; measurements are on two different sets of individuals (not pairs of measurements on the same set of individuals).
Assume that σR² = σB² = σ²
(The values of sR & sB lend some support to this assumption)
[The sample sizes are just large enough to use the two-sample z test instead, which does not require any assumption about the two population variances. The eventual decision will be the same.]
Pooled sample variance: sP² = (39 × 2200² + 39 × 1500²) / 78 = 3 545 000
Method 1
standard error = sP √(1/nR + 1/nB) = 421.0...
c = Δo + t.01, 78 × (standard error) = 10000 + 2.37... × 421.0... ≈ 11000
x̄R − x̄B = 12700 > c
∴ Reject Ho in favour of Ha.
YES, the garage should invest in the radial tyres.
OR
Method 2
tobs = (12700 − 10000) / 421.0... = 6.41... > t.01, 78 = 2.37...
∴ Reject Ho.
OR
Method 3
tobs = 6.41...
From the t-tables, P[T > tobs] << .005
[From appropriate software, P[T > tobs] ≈ 5 × 10⁻⁹]
∴ Reject Ho at any reasonable α.
OR
Confidence Interval Method
The one-sided CI for µR – µB has its boundary at:
(x̄R − x̄B) − t.01, 78 × (standard error)
= 12700 – 2.37... × 421.0...
= 11700.5...
This 99% CI for µR – µB is µR – µB > 11701 (to nearest km).
The CI does not include 10000.
∴ Reject Ho.
An Excel spreadsheet file is also available to illustrate this solution.
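The pooled-variance calculation can be sketched as a small helper that works from summary statistics. The function name `pooled_t` and its argument names are hypothetical, chosen only for this illustration.

```python
import math

def pooled_t(n1, mean1, s1, n2, mean2, s2, delta_0=0.0):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_obs = (mean1 - mean2 - delta_0) / std_err
    return pooled_var, std_err, t_obs, df

# Radial tyres vs. bias-ply tyres, testing a difference of more than 10000 km
pooled_var, std_err, t_obs, df = pooled_t(40, 36500, 2200, 40, 23800, 1500, delta_0=10000)
print(f"sP^2 = {pooled_var:.0f}, standard error = {std_err:.2f}, t_obs = {t_obs:.2f}, d.f. = {df}")
```

The observed t is then compared with t.01, 78 from the tables, exactly as in Method 2.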
A particular type of motor is known to have an output torque whose range in normal operation follows a normal distribution. Seven motors are chosen at random and are tested with the old and new methods of controlling the range of torque values. The results of the tests are as follows:
Motor: | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
New method: | 5.25 | 3.16 | 4.43 | 6.12 | 5.75 | 2.21 | 6.01 |
Old method: | 7.83 | 6.22 | 7.46 | 8.83 | 8.19 | 5.64 | 8.88 |
Justify your choice of method in (b) below.
The data are pairs of measurements on a single set of individuals (the seven motors).
∴ the appropriate method is a paired two-sample t-test.
Conduct an appropriate hypothesis test to determine whether there is sufficient evidence to conclude that the range of torques with the new method is at least 2 units less than with the old method.
Let X = torque (new method)
Y = torque (old method)
and D = X – Y
Test: Ho: µD = –2 vs. Ha: µD < –2
Free choice of α. I choose α = 0.05
Motor: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Sum | SSq |
x = New method: | 5.25 | 3.16 | 4.43 | 6.12 | 5.75 | 2.21 | 6.01 | | |
y = Old method: | 7.83 | 6.22 | 7.46 | 8.83 | 8.19 | 5.64 | 8.88 | | |
d = Difference: | –2.58 | –3.06 | –3.03 | –2.71 | –2.44 | –3.43 | –2.87 | –20.12 | 58.5004 |
Data: | n = 7 | Σd = –20.12 | Σd² = 58.5004 |
⇒ d̄ = Σd / n = –2.874...
sD² = (n Σd² − (Σd)²) / (n(n − 1)) = 4.6884 / 42 = 0.1116...
sD = 0.334108622
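Method 1
c = µDo − t.05, 6 × sD / √n = –2 − 1.943... × 0.1262... = –2.245...
d̄ = –2.874... < c
∴ Reject Ho in favour of Ha.
OR
Method 2
tobs = (d̄ − µDo) / (sD / √n) = –0.8742... / 0.1262... = –6.92... < –t.05, 6 = –1.943...
∴ Reject Ho.
OR
Method 3
tobs = –6.92...
Clearly P[T < –6.92] << .01
[From software, P[T < –6.92] ≈ 0.0002]
∴ Reject Ho at any reasonable α.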
An Excel spreadsheet file is also available to illustrate this solution.
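A paired test reduces to a one-sample test on the differences, which the following sketch makes explicit. Variable names are illustrative; the comparison with t.05, 6 from the tables is left to the reader.

```python
import math
import statistics

new = [5.25, 3.16, 4.43, 6.12, 5.75, 2.21, 6.01]
old = [7.83, 6.22, 7.46, 8.83, 8.19, 5.64, 8.88]
mu_d0 = -2.0   # null value of the mean difference (new - old)

# One difference per motor: this is what makes the test "paired"
d = [x - y for x, y in zip(new, old)]
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)
t_obs = (d_bar - mu_d0) / (s_d / math.sqrt(n))

print(f"d_bar = {d_bar:.4f}, s_D = {s_d:.6f}, t_obs = {t_obs:.2f}")
```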
Use the simple linear regression model on these data to find the equation of the line of best fit to these data.
Note that the “line of best fit” depends on which variable is the predictor and which is the response. The roles of X and Y are interchanged if the new method is the response, (which is the more natural assignment).
Summary statistics (response = Y = old method):
n = 7 | Σx = 32.93 | Σx² = 168.6941 |
Σxy = 260.7758 | Σy = 53.05 | Σy² = 411.3579 |
n Sxx = n Σx² − (Σx)² = 96.4738
n Sxy = n Σxy − Σx Σy = 78.4941
n Syy = n Σy² − (Σy)² = 65.2028
b1 = n Sxy / n Sxx = 0.8136... and b0 = (Σy − b1 Σx) / n = 3.751...
Therefore (correct to 3 s.f. in each coefficient),
y = 0.814 x + 3.75
OR
Summary statistics (response = Y = new method):
n = 7 | Σx = 53.05 | Σx² = 411.3579 |
Σxy = 260.7758 | Σy = 32.93 | Σy² = 168.6941 |
n Sxx = 65.2028
n Sxy = 78.4941
n Syy = 96.4738
b1 = n Sxy / n Sxx = 1.203... and b0 = (Σy − b1 Σx) / n = –4.419...
Therefore (correct to 3 s.f. in each coefficient),
y = 1.20 x – 4.42
An Excel spreadsheet file is available to illustrate the first version of this solution.
Find the coefficient of determination R2 and use it to comment on your answer to part (a) above.
Irrespective of which of the old or new methods is the regressor,
r² = (n Sxy)² / (n Sxx × n Syy) = (78.4941)² / (96.4738 × 65.2028) = .9794...
∴ R² = 97.9% (correct to 3 s.f.)
Very high correlation ⇒ cannot use unpaired t-test.
∴ Choice in (a) is correct.
A Minitab project file and a Minitab Report Pad RTF file are also available to illustrate this solution.
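Both versions of the line of best fit, and the common value of R², follow from the same three sums. A minimal sketch, with a hypothetical helper `fit_line` written for this illustration:

```python
def fit_line(x, y):
    """Least-squares line y = b0 + b1*x, plus the coefficient of determination."""
    n = len(x)
    n_sxx = n * sum(xi * xi for xi in x) - sum(x) ** 2
    n_sxy = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
    n_syy = n * sum(yi * yi for yi in y) - sum(y) ** 2
    b1 = n_sxy / n_sxx
    b0 = (sum(y) - b1 * sum(x)) / n
    r_squared = n_sxy ** 2 / (n_sxx * n_syy)
    return b0, b1, r_squared

new = [5.25, 3.16, 4.43, 6.12, 5.75, 2.21, 6.01]
old = [7.83, 6.22, 7.46, 8.83, 8.19, 5.64, 8.88]

b0_old, b1_old, r2 = fit_line(new, old)   # response = old method
b0_new, b1_new, _ = fit_line(old, new)    # response = new method
print(f"old = {b1_old:.3f} new + {b0_old:.2f}")
print(f"new = {b1_new:.2f} old + ({b0_new:.2f})")
print(f"R^2 = {r2:.3f}")
```

Swapping the roles of the two variables changes the slope and intercept but not R², because R² is symmetric in x and y.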
A designer claims that a new type of hull increases the average sustained speed of a speedboat by more than 2 km/h over the average sustained speed of the existing hull design. Random samples of speedboats of the two designs are tested and their sustained speeds (in km/h) are measured on a test course under identical water conditions:
New design (xA): 42 36 38 37 39 36
Old design (xB): 30 37 33 31 34
Conduct an appropriate hypothesis test at a level of significance of 5%. Is there sufficient evidence to accept the designer’s claim? What assumptions have you made?
Calculation of summary statistics:
New design speeds xA | Old design speeds xB | xA² | xB² |
---|---|---|---|
42 | 30 | 1764 | 900 |
36 | 37 | 1296 | 1369 |
38 | 33 | 1444 | 1089 |
37 | 31 | 1369 | 961 |
39 | 34 | 1521 | 1156 |
36 | | 1296 | |
Sum: 228 | Sum: 165 | 8690 | 5475 |
nA = 6 | nB = 5 |
νA = 5 | νB = 4 |
x̄A = 38 | x̄B = 33 |
sA² = 5.2 | sB² = 7.5 |
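The designer's claim is that µA – µB > 2 (which is where the burden of proof lies).
Test Ho: µA – µB = 2 vs. Ha: µA – µB > 2
We assume that the two populations are independent, nearly normal and have a common population variance σ².
Then X̄A – X̄B ~ N(µA – µB, σ² (1/nA + 1/nB)) and σ² is unknown, so use the pooled sample variance
sP² = (νA sA² + νB sB²) / (νA + νB) = (5 × 5.2 + 4 × 7.5) / 9 = 6.222...
The sample standard error is sP √(1/nA + 1/nB) = √(6.222... × 11/30) = 1.510...
The total number of degrees of freedom is ν = νA + νB = 9
Method 1
c = Δo + t.050, 9 × (standard error) = 2 + 1.833 × 1.510... = 4.768...
x̄A – x̄B = 5 > c
∴ reject Ho in favour of Ha.
OR
Method 2
tobs = (x̄A – x̄B – Δo) / (standard error) = 3 / 1.510... = 1.986
tc = tα, ν = t.050, 9 = 1.833 and tobs > tc
∴ reject Ho.
OR
Method 3
As above, tobs = 1.986. The p-value is P[T > 1.986] = .039 < α = .05
∴ reject Ho.
YES, we can accept the designer's claim.
[Note: if we had taken α = .01, then we would have reached the opposite conclusion, because the p-value is .039 (to 3 d.p.).]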
An Excel spreadsheet file and a Minitab project file are also available to illustrate this solution.
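Starting from the raw speeds rather than the tabulated sums, the pooled t statistic can be sketched as follows. The names are illustrative and the critical value t.050, 9 = 1.833 is the tabulated value quoted above.

```python
import math
import statistics

speeds_a = [42, 36, 38, 37, 39, 36]   # new design
speeds_b = [30, 37, 33, 31, 34]       # old design
delta_0 = 2.0                         # null value of mu_A - mu_B
t_crit = 1.833                        # t.050, 9 from tables

n_a, n_b = len(speeds_a), len(speeds_b)
var_a = statistics.variance(speeds_a)   # sample variances (n - 1 divisor)
var_b = statistics.variance(speeds_b)
df = (n_a - 1) + (n_b - 1)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_obs = (statistics.mean(speeds_a) - statistics.mean(speeds_b) - delta_0) / std_err

print(f"sA^2 = {var_a:.1f}, sB^2 = {var_b:.1f}, sP^2 = {pooled_var:.3f}")
print(f"t_obs = {t_obs:.3f}, reject Ho at 5%: {t_obs > t_crit}")
```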
The true mean tensile strength of a new type of lightweight cable is claimed to be more than 20 kN. The distribution of actual strengths is known to be normal to a good approximation, with a standard deviation of 1.4 kN. A random sample of five of the new cables has a mean of 21.5 kN.
Is there sufficient evidence to support the claim?
Let X = strength of a randomly chosen cable.
X ~ N(µ, σ²)
σ = 1.4 kN
n = 5
x̄ = 21.5 kN
Test: Ho: µ = 20 vs. Ha: µ > 20
σ is known, therefore use z instead of t.
Free choice of α. I choose α = 0.01
Method 1
c = µo + z.01 × σ / √n = 20 + 2.32... × 0.6260... = 21.456...
x̄ = 21.500 > c (just barely)
∴ Reject Ho in favour of Ha at α = 0.01 (or any higher α)
OR
Method 2
zobs = (x̄ − µo) / (σ / √n) = 1.5 / 0.6260... = 2.39...
z.01 = 2.32...
zobs > z.01
∴ Reject Ho.
OR
Method 3
p-value = P[Z > zobs] = Φ(–2.39...) ≈ .008 < .01
∴ Reject Ho.
OR
Confidence Interval Method
The one-sided 99% CI for µ is µ > 20.04, which does not include µ = 20.
∴ Reject Ho.
An Excel spreadsheet file is also available to illustrate this solution.
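With σ known, every quantity in the four methods comes from the standard normal distribution, so the whole solution can be sketched with Python's `statistics.NormalDist`. Variable names are illustrative.

```python
import math
from statistics import NormalDist

std_norm = NormalDist()   # standard normal, mean 0 and standard deviation 1
mu_0, sigma, n, x_bar, alpha = 20.0, 1.4, 5, 21.5, 0.01

std_err = sigma / math.sqrt(n)
z_crit = std_norm.inv_cdf(1 - alpha)      # z.01
c = mu_0 + z_crit * std_err               # Method 1: boundary for x_bar
z_obs = (x_bar - mu_0) / std_err          # Method 2: observed z
p_value = 1 - std_norm.cdf(z_obs)         # Method 3: upper-tail p-value
lower_bound = x_bar - z_crit * std_err    # one-sided 99% lower confidence bound

print(f"c = {c:.3f}, z_obs = {z_obs:.3f}, p-value = {p_value:.4f}")
print(f"one-sided 99% CI: mu > {lower_bound:.2f}")
```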
Now suppose that the standard deviation is unknown and that the measured standard deviation of the random sample is s = 1.4 kN.
Is there sufficient evidence to support the claim?
Now σ is not known and n = 5, therefore replace z by t.
Method 1
c = µo + t.01, 4 × s / √n = 20 + 3.74... × 0.6260... = 22.34...
x̄ = 21.5 < c
∴ do NOT reject Ho at α = 0.01.
[If α = 0.05 were chosen, then c = 20 + 2.13... × 0.6260... = 21.33... < x̄ ⇒ opposite conclusion]
OR
Method 2
tobs = 1.5 / 0.6260... = 2.39...
t.01, 4 = 3.74...
tobs < t.01, 4
∴ do NOT reject Ho at α = 0.01.
[t.05, 4 = 2.13...
tobs > t.05, 4
∴ Reject Ho at α = 0.05]
OR
Method 3
t.05, 4 = 2.13..., tobs = 2.39... and t.01, 4 = 3.74...
∴ 0.01 < P[T > tobs] < 0.05
∴ do NOT reject Ho at α = 0.01 [but do reject Ho at α = 0.05]
p-value (from software) is P[T > tobs] = 0.037...
OR
Confidence Interval Method
The 99% one-sided CI for µ is µ > 19.15, which includes µ = 20.
∴ do NOT reject Ho.
[The 95% one-sided CI for µ is µ > 20.17, which does not include µ = 20
∴ Reject Ho.]
An Excel spreadsheet file is available to illustrate this solution, with α = 1%.
An Excel spreadsheet file is also available for α = 5%.
Random samples are drawn from two independent populations, producing the following summary statistics:
Are these data consistent with the hypothesis that the two population means are equal?
The burden of proof is on the alternative hypothesis that the two population means are not equal. Therefore test
Ho: µX – µY = 0 vs. Ha: µX – µY ≠ 0
From the question, the random quantities X and Y are independent, but we cannot assume equality of the variances.
[In this course, we have not investigated how to test for equality of variances. See section 9.5 of the Devore textbook. As it happens, the data are consistent with σX² = σY².]
The Central Limit Theorem leads to X̄ ~ N(µX, σX²/nX) and Ȳ ~ N(µY, σY²/nY), to a good approximation.
Approximate the unknown σX² and σY² by the point estimates sX² and sY² respectively, and we may approximate tα, ν by zα.
If Ho is true, then, to an acceptable approximation,
Z = (X̄ – Ȳ) / √(sX²/nX + sY²/nY) ~ N(0, 1)
Method 1:
Therefore do not reject Ho.
OR
Method 2:
In both cases, | zobs | < zc, so do not reject Ho.
OR
Method 3:
zobs = –1.603... (as in method 2 above).
The p-value = P[| Z | > | zobs |] = 2 P[Z < –1.603...] ≈ .109
Therefore do not reject Ho at any reasonable value of α.
Answer:
YES, these data ARE consistent with equal population means.
[See also the Excel spreadsheet file for this question.]
[Note:
If a common variance had been assumed, then sX² and sY² would have been replaced by the pooled sample variance sP² = 124260/138.
More precise estimates of the boundaries of the rejection region could have been obtained by using tα/2, ν in place of zα/2.
With both changes in place, the boundaries of the rejection region for α = .05 are at ± t.025, 138 ≈ ± 1.98.
The decision to retain Ho at any reasonable value of α is unchanged.]
A study of company performance in two nearby cities was conducted to test for any significant difference between the companies in those cities. A random sample of ten companies in city A had a sample mean performance index of 74.3 with a standard deviation of 3.2 . A random sample of ten companies in city B had a sample mean performance index of 73.2 with a standard deviation of 2.9 .
Is there a significant difference in company performance index between these two cities? State carefully your assumptions and your hypotheses.
We are dealing with two different sets of individuals, (the two sets of companies), the performances of which can be assumed to be independent of each other. The sample variances are close enough to each other to support the assumption that the population variances are equal. One further assumption (that is needed because of the small sample sizes) is that the performances are normally distributed to a fair approximation.
We will therefore conduct an unpaired two-sample t-test.
XA ~ N(µA, σ²), XB ~ N(µB, σ²)
Summary statistics:
nA = 10, x̄A = 74.3, sA = 3.2
nB = 10, x̄B = 73.2, sB = 2.9
νA = νB = n – 1 = 9
Test Ho: µA – µB = 0 vs. Ha: µA – µB ≠ 0
Pooled sample variance: sP² = (9 × 3.2² + 9 × 2.9²) / 18 = 9.325
Standard error: sP √(1/nA + 1/nB) = √(9.325 × 0.2) = 1.365 65...
Total number of degrees of freedom = ν = 18.
We have a free choice of α.
Even with an unusually large choice for α of 10%:
Method 1
c = Δo ± tα/2, ν × (standard error) = 0 ± 1.73... × 1.365... = ±2.368...
[If α = .05 is chosen, then c = ±2.869]
x̄A – x̄B = 1.1 lies well inside the interval (–c, +c).
OR
Method 2
tobs = 1.1 / 1.365 65... = 0.805...
tα/2, ν = t.05, 18 = 1.73...
[or t.025, 18 = 2.10... or t.005, 18 = 2.87...]
|tobs| < tα/2, ν
tobs is nowhere near the rejection region.
[A computer package will show that the p-value exceeds .40 !]
Therefore do not reject Ho.
There is no significant difference in performance index between the two cities.
[See also the associated Excel file.]
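The two-sided version of the pooled test differs from the one-sided examples only in the rejection rule. A sketch under the same equal-variance assumption, with illustrative names and tabulated critical values typed in:

```python
import math

n_a, mean_a, s_a = 10, 74.3, 3.2   # city A
n_b, mean_b, s_b = 10, 73.2, 2.9   # city B

df = (n_a - 1) + (n_b - 1)
pooled_var = ((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / df
std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_obs = (mean_a - mean_b) / std_err

# Two-sided critical values t(alpha/2, 18) from tables, keyed by alpha
critical = {0.10: 1.734, 0.05: 2.101, 0.01: 2.878}
for alpha, t_crit in critical.items():
    print(f"alpha = {alpha}: reject Ho? {abs(t_obs) > t_crit}")
```

The null hypothesis is retained at all three levels, which matches the conclusion above.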
A study was conducted to analyze the relationship between advertising expenditure and sales. The following data were recorded:
X | Y |
---|---|
Advertising ($) | Sales ($) |
20 | 310 |
24 | 340 |
30 | 400 |
32 | 420 |
35 | 490 |
Assume a simple linear regression between sales Y and advertising X. Calculate the coefficients β0 and β1 of the line of best fit to these data and estimate the sales when $28 are spent on advertising.
Is there a significant linear association between Y and X?
Extending the table:
| x | y | x² | x y | y² |
---|---|---|---|---|---|
| 20 | 310 | 400 | 6200 | 96100 |
| 24 | 340 | 576 | 8160 | 115600 |
| 30 | 400 | 900 | 12000 | 160000 |
| 32 | 420 | 1024 | 13440 | 176400 |
| 35 | 490 | 1225 | 17150 | 240100 |
Sum: | 141 | 1960 | 4125 | 56950 | 788200 |
n Sxx = n Σx² – (Σx)² = 5 × 4125 – (141)² = 20625 – 19881 = 744
n Sxy = n Σxy – (Σx)(Σy) = 5 × 56950 – 141 × 1960 = 284750 – 276360 = 8390
n Syy = n Σy² – (Σy)² = 5 × 788200 – (1960)² = 3941000 – 3841600 = 99400
b1 = n Sxy / n Sxx = 8390 / 744 = 11.277... and b0 = (Σy – b1 Σx) / n = 73.992...
The regression line (to 4 s.f.) is
y = 11.28 x + 73.99
x = 28 ⇒
y = 11.277... × 28 + 73.992... = 389.744 623 7...
When $28 is spent on advertising, we predict $390 of sales.
There are many choices of method for a hypothesis test for linear association.
Test Ho: ρ = 0 vs. Ha: ρ ≠ 0,
or, equivalently, test Ho: β1 = 0 vs. Ha: β1 ≠ 0.
r² = (n Sxy)² / (n Sxx × n Syy) = (8390)² / (744 × 99400) = .9518...
Just over 95% of all variation in y is explained by the linear regression. The correlation (.975...) is very strong (which suggests that there is a linear association), but the sample size is very small, so we should proceed with a formal hypothesis test.
Either calculate t = r √(n – 2) / √(1 – r²) = 7.700...
or
calculate the entries in the ANOVA table as follows:
SST = Syy = 99400 / 5 = 19 880
SSR = r² × SST = 18 922.607 52
SSE = SST – SSR = 957.392 48
MSR = SSR / νR = SSR = 18 922.607 52
MSE = SSE / νE = SSE / 3 = 319.130 826 6...
f = MSR / MSE = 59.294...
| d.f. | SS | MS | f |
---|---|---|---|---|
R | 1 | 18 922.607 52... | 18 922.607 52... | 59.294... |
E | 3 | 957.392 48... | 319.130 826 6... | |
T | 4 | 19 880 | | |
Compare the observed t = √f = 7.700... to t.005, 3 = 5.841...
|t| > tα/2, 3 for any reasonable choice of α.
[The p-value is less than .0046 .]
Therefore reject Ho in favour of Ha.
YES, there is a significant linear association between Y and X.
[See also the associated Excel and Minitab project files.]
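The regression coefficients, the prediction at x = 28 and the ANOVA entries can all be reproduced from the five data pairs. A minimal sketch with illustrative variable names:

```python
import math

x = [20, 24, 30, 32, 35]        # advertising ($)
y = [310, 340, 400, 420, 490]   # sales ($)
n = len(x)

n_sxx = n * sum(xi**2 for xi in x) - sum(x) ** 2
n_sxy = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
n_syy = n * sum(yi**2 for yi in y) - sum(y) ** 2

b1 = n_sxy / n_sxx
b0 = (sum(y) - b1 * sum(x)) / n
y_hat_28 = b0 + b1 * 28

# ANOVA decomposition: SST = SSR + SSE
sst = n_syy / n
ssr = n_sxy**2 / (n * n_sxx)
sse = sst - ssr
f_obs = ssr / (sse / (n - 2))   # MSR / MSE with 1 and n - 2 degrees of freedom
t_obs = math.sqrt(f_obs)        # in simple linear regression, t^2 = f

print(f"y = {b1:.2f} x + {b0:.2f}; prediction at x = 28: {y_hat_28:.1f}")
print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, f = {f_obs:.3f}, t = {t_obs:.3f}")
```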
[Devore 6th ed., Ch. 12 p. 551 q. 73 - parts (b) & (c) are bonus questions only]
The accompanying set of data is a subset of the data that appeared in the paper Radial Tension Strength of Pipe and Other Curved Flexural Members (J. Amer. Concrete Inst., 1980, pp. 33-39). The variables are age of a pipe specimen (x in days) and load necessary to obtain a first crack (y in 1000 lb/ft).
x | 20 | 20 | 20 | 25 | 25 | 25 | 31 | 31 | 31 |
---|---|---|---|---|---|---|---|---|---|
y | 11.45 | 10.42 | 11.14 | 10.84 | 11.17 | 10.54 | 9.47 | 9.19 | 9.54 |
Summary statistics:
n = 9 | Σx = 228 | Σy = 93.76 |
Σx² = 5 958 | Σxy = 2 348.15 | Σy² = 982.293 2 |
n Sxx = 1 638 | n Sxy = –243.93 | n Syy = 49.701 2 |
The regression line (to 4 s.f.) is
y = 14.19 – 0.1489 x
A Minitab project file is also available to illustrate the solution to this part of this question.
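The slope β1 = the expected increase in load per day increase in age.
The burden of proof is on showing that the [negative] slope is steeper than –0.10.
Test Ho: β1 = –0.10 vs. Ha: β1 < –0.10 at α = .05.
A chain of calculations follows.
Sxx = 1638 / 9 = 182, Sxy = –243.93 / 9 = –27.10..., Syy = 49.7012 / 9 = 5.522...
SSE = Syy – Sxy² / Sxx = 1.486..., MSE = SSE / (n – 2) = 0.2123..., sb = √(MSE / Sxx) = 0.034 15...
tobs = (b1 – β10) / sb = (–0.1489... + 0.10) / 0.034 15... = –1.43...
But –t.05, 7 = –1.89... < tobs.
Therefore do not reject Ho.
NO, the data do not contradict the assertion that β1 ≥ –0.10.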
Note that the non-zero value of b1 in the null hypothesis renders the other versions of the t statistic incorrect. They are based on the true [null] value of the slope being zero.
For the original data, Sxx = 1638 / 9 = 182.
For the suggested new values of x ({20, 21, ..., 30, 31}),
n = 12, Σx = 306, Σx² = 7946
n Sxx = 1716
Sxx = 1716/12 = 143.
143 < 182 and we want Sxx to be as large as possible because it appears in the denominator of the expression for the variance of the slope estimator, σ² / Sxx.
A smaller value of sb leads to more precise estimates of the slope.
[OR, the width of the confidence and prediction intervals contains a term (x – x̄)² / Sxx. These intervals are narrower when the denominator is larger.]
NO, the original data set is preferable.
95% Confidence interval for E[Y|28]:
(b0 + 28 b1) ± t.025, 7 × s × √(1/n + (28 – x̄)² / Sxx)
= 10.020... ± 0.4223...
Therefore the 95% confidence interval, to 2 d.p., is
9.60 ≤ E[Y|28] ≤ 10.44
[See also the associated Excel file.]
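The slope test against a non-zero null value and the confidence interval for the mean response can be sketched from the nine data pairs. Variable names are illustrative, and the critical values t.05, 7 and t.025, 7 are typed in from tables rather than computed.

```python
import math

x = [20, 20, 20, 25, 25, 25, 31, 31, 31]   # age (days)
y = [11.45, 10.42, 11.14, 10.84, 11.17, 10.54, 9.47, 9.19, 9.54]   # load (1000 lb/ft)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
s = math.sqrt((syy - sxy**2 / sxx) / (n - 2))   # square root of MSE
s_b = s / math.sqrt(sxx)                        # standard error of the slope

beta_10 = -0.10                                 # null value of the slope
t_obs = (b1 - beta_10) / s_b
t_05_7, t_025_7 = 1.895, 2.365                  # from t-tables
reject = t_obs < -t_05_7                        # lower-tail test

x0 = 28
y_hat = b0 + b1 * x0
half_width = t_025_7 * s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)

print(f"b1 = {b1:.4f}, s_b = {s_b:.5f}, t_obs = {t_obs:.3f}, reject Ho: {reject}")
print(f"95% CI for E[Y|28]: {y_hat - half_width:.2f} to {y_hat + half_width:.2f}")
```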
[Bonus question, to provide practice in the supplementary topic of type II error probabilities.]
Find the probability of committing a type II error when the true population mean is µ = 104 and an upper-tail hypothesis test is conducted at a level of significance of five per cent with a random sample of size 25 on the null hypothesis that µ = 100. It is known that the population variance is 100.
Repeat your calculation in the case when the level of significance of the hypothesis test is one per cent.
Ho: µ = 100 vs. Ha: µ > 100
σ² = 100 ⇒ σ = 10
n = 25 ⇒ standard error = σ / √n = 2
α = .05
Use method 1.
c = µo + zα × (σ / √n) = 100 + 1.644... × 2 = 103.289...
Probability of committing type II error when µ = 104 and α = .05
= P[X̄ < c | µ = 104] = Φ((103.289... – 104) / 2) = Φ(–0.355)
β(104) = .361 (approx.)
α = .01
Use method 1.
c = µo + zα × (σ / √n)
= 100 + 2.32... × 2
= 104.652
Probability of committing type II error when µ = 104 and α = .01
= P[X̄ < c | µ = 104] = Φ((104.652 – 104) / 2) = Φ(+0.326)
β(104) = .628 (approx.)
As expected, β increases as α decreases.
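The type II error probability for an upper-tail z test is a single evaluation of the standard normal c.d.f. once the boundary c is known. A sketch, with a hypothetical helper `type_ii_error` whose default arguments are the values from this question:

```python
from statistics import NormalDist

def type_ii_error(alpha, mu_0=100.0, mu_true=104.0, sigma=10.0, n=25):
    """P[fail to reject Ho] for an upper-tail z test when the true mean is mu_true."""
    std_norm = NormalDist()
    std_err = sigma / n ** 0.5
    c = mu_0 + std_norm.inv_cdf(1 - alpha) * std_err   # rejection boundary for x_bar
    return std_norm.cdf((c - mu_true) / std_err)       # P[X_bar < c | mu = mu_true]

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: beta(104) = {type_ii_error(alpha):.3f}")
```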
[Bonus question, to provide practice in the
supplementary topic of type II error probabilities.
This question is a modification of Devore, 6th ed.,
Ch. 8.3, pp. 343-344, q. 38.]
A university library ordinarily has a complete shelf inventory done once every year. Because of new shelving rules instituted the previous year, the head librarian believes it may be possible to save money by postponing the inventory. The librarian decides to select 800 books at random from the library’s collection and to have them searched in a preliminary manner. If the evidence indicates strongly that the true proportion of misshelved or unlocatable books is less than .02, then the inventory will be postponed.
Among the 800 books searched, 12 were misshelved or unlocatable. Test the relevant hypotheses (at a level of significance of .05) and advise the librarian what to do.
Let p represent the true proportion of books that are misshelved or unlocatable.
Postpone the inventory if and only if there is strong evidence for p < .02.
Therefore we need to test Ho: p = .02 vs. Ha: p < .02
n = 800
α = .05
zα = 1.644...
Observed sample proportion p̂ = 12 / 800 = .0150
Method 1
c = po – zα √(po (1 – po) / n) = .02 – 0.008 142 334 5... = 0.011 857...
Reject Ho if and only if p̂ < c = .0119
But p̂ = .0150 > c
Therefore do not reject Ho.
In the remaining parts of this question, we need to know the value of c (the boundary of the rejection region when the null hypothesis is just barely true) in order to calculate the probability of rejecting (or not rejecting) the null hypothesis when the true value of p is something other than .02 . For this reason, we choose to use method 1 in part (a) of this question, rather than methods 2 or 3.
[In method 2, the z score is –1.010, which is closer to z = 0 than the critical value of –1.645.]
[In method 3, P[Z < –1.010] = .1562 > α.]
Therefore proceed with the inventory.
If the true proportion of misshelved and lost books is actually .01, what is the probability that the inventory will be [unnecessarily] taken?
Now we know that p = .01.
The boundary c of the rejection region is still c = 0.011 857..., so
P[inventory taken] = P[p̂ > c | p = .01] = 1 – Φ((c – .01) / √(.01 × .99 / 800)) = 1 – Φ(0.528 07)
P[inventory taken] = .299 (3 d.p.)
With linear interpolation,
Φ(0.528 07) ≈ Φ(0.52) + .807 × {Φ(0.53) – Φ(0.52)}
= .6985 + .807 × (.7019 – .6985)
= .6985 + .807 × .0034
= .6985 + .0027...
= .7012 = .701 (to 3 d.p.)
∴ P[inventory taken] = 1 – .701 = .299.
If the true proportion is .05, what is the probability that the inventory will be postponed?
Now we know that p = .05 > .02.
The correct decision now is to proceed with the inventory.
The boundary c of the rejection region is still c = 0.011 857..., so
P[inventory postponed] = P[p̂ < c | p = .05] = Φ((c – .05) / √(.05 × .95 / 800)) = Φ(–4.95...) ≈ 4 × 10⁻⁷
It is therefore very unlikely indeed that the inventory will be postponed.
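Parts (a) to (c) all depend on the same boundary c, evaluated under different true values of p. A sketch using the normal approximation to the sample proportion; the helper `std_err` and the other names are illustrative.

```python
import math
from statistics import NormalDist

std_norm = NormalDist()
n, p_0, alpha = 800, 0.02, 0.05

def std_err(p):
    """Standard error of the sample proportion when the true proportion is p."""
    return math.sqrt(p * (1 - p) / n)

# (a) boundary of the lower-tail rejection region, computed under Ho
c = p_0 - std_norm.inv_cdf(1 - alpha) * std_err(p_0)
p_hat = 12 / n
reject = p_hat < c

# (b) inventory taken (Ho retained) although p = .01
p_taken = 1 - std_norm.cdf((c - 0.01) / std_err(0.01))

# (c) inventory postponed (Ho rejected) although p = .05
p_postponed = std_norm.cdf((c - 0.05) / std_err(0.05))

print(f"c = {c:.4f}, p_hat = {p_hat:.4f}, reject Ho: {reject}")
print(f"P[inventory taken | p = .01] = {p_taken:.3f}")
print(f"P[inventory postponed | p = .05] = {p_postponed:.1e}")
```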
What types of errors are the events described in parts (b) and (c) above?
In (b), we are told that the true value of p is .01. This is less than .02, so that the alternative hypothesis is true. The correct decision is to reject the null hypothesis; that is, to postpone the inventory.
Taking the inventory = failure to reject the null hypothesis.
Failure to reject the null hypothesis when it is false is a type II error.
In (c), we are told that the true value of p is .05. This is greater than .02, so that the null hypothesis is true. The correct decision is to accept the null hypothesis; that is, to proceed with the inventory.
Postponing the inventory = rejecting the null hypothesis.
Rejecting the null hypothesis when it is true is a type I error.
[Bonus question only in 2008 Fall]
[Devore 6th ed., Ch. 9.4 p. 398 q. 55]
Two different types of alloy, A and B, have been used to manufacture experimental specimens of a small tension link to be used in a certain engineering application. The ultimate strength (in ksi) of each specimen was determined and the results are summarized in the frequency distribution tabulated below.
Alloy: | A | B |
---|---|---|
26 - under 30 | 6 | 4 |
30 - under 34 | 12 | 9 |
34 - under 38 | 15 | 19 |
38 - under 42 | 7 | 10 |
sample sizes | 40 | 42 |
Compute a 95% confidence interval for the difference between the true proportions of all specimens of alloys A and B that have an ultimate strength of at least 34 ksi.
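Let pA = proportion of alloy A that has ultimate strength ≥ 34 ksi
and pB = proportion of alloy B that has ultimate strength ≥ 34 ksi.
Then p̂A = (15 + 7) / 40 = .55 and p̂B = (19 + 10) / 42 = .690 476...
The point estimate of pA – pB is p̂A – p̂B = –0.140 476...
A 95% CI for pA – pB is
(p̂A – p̂B) ± z.025 √(p̂A (1 – p̂A) / nA + p̂B (1 – p̂B) / nB)
= –0.140 476 19 ± 0.208 129 882
The CI is –.349 < pA – pB < +.068 (to 3 d.p.)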
Can you conclude that there is a significant difference between these two population proportions?
NO, because pA = pB is included in the 95% CI.
[See also the Excel spreadsheet file for this question.]
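The large-sample confidence interval for a difference of two proportions can be reproduced directly from the frequency table. A sketch with illustrative variable names:

```python
import math
from statistics import NormalDist

# Frequencies per strength class: 26-30, 30-34, 34-38, 38-42 ksi
counts_a = [6, 12, 15, 7]
counts_b = [4, 9, 19, 10]

n_a, n_b = sum(counts_a), sum(counts_b)
p_a = sum(counts_a[2:]) / n_a   # proportion of alloy A at 34 ksi or more
p_b = sum(counts_b[2:]) / n_b   # proportion of alloy B at 34 ksi or more

diff = p_a - p_b
z = NormalDist().inv_cdf(0.975)   # z.025
half_width = z * math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
low, high = diff - half_width, diff + half_width

print(f"95% CI for pA - pB: ({low:.3f}, {high:.3f})")
print(f"zero inside the interval: {low < 0 < high}")
```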
[Devore, ex. 12.4, q. 50 modified]
An experiment to measure the macroscopic magnetic relaxation time in crystals (in microseconds, µs) as a function of the strength of the external biasing magnetic field (in kiloGauss, kG) yielded the following data (“An Optical Faraday Rotation Technique for the Determination of Magnetic Relaxation Times”, IEEE Trans. Magnetics, June 1968: 175-178, with data read from a graph that appeared in the article).
x | 11.0 | 12.5 | 15.2 | 17.2 | 19.0 | 20.8 | 22.0 | 24.2 | 25.3 | 27.0 | 29.0 |
y | 187 | 225 | 305 | 318 | 367 | 365 | 400 | 435 | 450 | 506 | 558 |
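The summary statistics are:
n = 11 | Σx = 223.2 | Σy = 4 116 |
Σx² = 4 877.50 | Σxy = 90 096.1 | Σy² = 1 666 782 |
n Sxx = 3 834.26 | n Sxy = 72 365.9 | n Syy = 1 393 146 |
Therefore the equation of the SLR line (correct to 4 s.f.) is
y = 18.87 x – 8.779
The value of R² (the coefficient of determination) is required.
r² = (n Sxy)² / (n Sxx × n Syy) = .9803...
Therefore (to 3 s.f.), the proportion of the total variation in Y that is explained by the linear regression model is
98.0%
At x = 18, the expected value of Y is b0 + 18 b1 = 330.94... µs
The point estimate s² of the unknown σ² is the mean square error (MSE): s² = SSE / (n – 2) = 276.2...
The 95% CI for E[Y | x = 18] is
(b0 + 18 b1) ± t.025, 9 × s × √(1/n + (18 – x̄)² / Sxx) = 330.94... ± 12.23...
or (correct to 2 d.p.),
[318.71, 343.18] µs
For the 95% prediction interval for Y at x = 18, the only change is an additional term (+1) under the square root:
(b0 + 18 b1) ± t.025, 9 × s × √(1 + 1/n + (18 – x̄)² / Sxx) = 330.94... ± 39.54...
or (correct to 2 d.p.),
[291.40, 370.48] µs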
[Also see the associated Excel file.]
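The fitted line, R², and both intervals at x = 18 can be recomputed from the eleven data pairs. This is a sketch only: the names are illustrative and t.025, 9 is typed in from tables (rounded to 2.262).

```python
import math

x = [11.0, 12.5, 15.2, 17.2, 19.0, 20.8, 22.0, 24.2, 25.3, 27.0, 29.0]   # field (kG)
y = [187, 225, 305, 318, 367, 365, 400, 435, 450, 506, 558]              # time (µs)
t_crit = 2.262   # t.025, 9 from t-tables (an input, not computed here)

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
r_squared = sxy**2 / (sxx * syy)
s = math.sqrt((syy - sxy**2 / sxx) / (n - 2))   # square root of MSE

x0 = 18.0
y_hat = b0 + b1 * x0
leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
ci_half = t_crit * s * math.sqrt(leverage)       # mean response at x0
pi_half = t_crit * s * math.sqrt(1 + leverage)   # single new observation at x0

print(f"y = {b1:.2f} x + ({b0:.3f}), R^2 = {r_squared:.3f}")
print(f"95% CI: [{y_hat - ci_half:.2f}, {y_hat + ci_half:.2f}]")
print(f"95% PI: [{y_hat - pi_half:.2f}, {y_hat + pi_half:.2f}]")
```

The prediction interval is wider than the confidence interval because it also accounts for the scatter of an individual observation about the line.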