GEOG 2P12 Brock University Math Hypothesis Testing Questions

GEOG 2P12 Lecture 33-36 Jeff Boggs Brock University

TESTING HYPOTHESES ABOUT SAMPLE MEANS: t-TESTS

Readings:
• De Veaux et al 2006, Chapter 23-25

I. What is a t-test?
   A. Parametric test
   B. t-sampling distribution model
   C. Used to make confidence intervals and test hypotheses about sample means
   D. Has degrees of freedom
II. When can we use a t-test?
III. Testing hypotheses about sample means...
   A. One sample mean to a population mean
   B. Two independent sample means
   C. Two paired sample means

WHAT IS A t-TEST?

t-tests are hypothesis tests of sample means. Like the Z-test, this is a parametric test.
• Parametric tests make assumptions about the shape of the underlying population from which the sample is drawn.
• Shape of Z-distribution is always Normal.
• As n increases, shape of t-distribution approaches Normal.

The t-test is based on a t-sampling distribution model.
• The t-distribution is also used to construct confidence intervals around sample means (see De Veaux et al 523, 548 and 579 for formulas).

CONFIDENCE INTERVALS USING t-DISTRIBUTIONS

One-sample t-interval (De Veaux et al 2005: 523)
   X-bar ± t*(n-1) × SE(X-bar)
where the standard error of the mean is
   SE(X-bar) = s / √n
The critical value t*(n-1) depends on the specified α-level and degrees of freedom.

Two-sample t-interval (De Veaux et al 2005: 548)
   (X-bar1 - X-bar2) ± t*(df) × SE(X-bar1 - X-bar2)
where the standard error of the difference of the means is
   SE(X-bar1 - X-bar2) = √(s1²/n1 + s2²/n2)
The critical value t*(df) depends on the specified α-level and degrees of freedom.

Paired t-interval (De Veaux et al 2005: 579)
   d-bar ± t*(n-1) × SE(d-bar)
where the standard error of the mean difference is
   SE(d-bar) = sd / √n
The critical value t*(n-1) depends on the specified α-level and degrees of freedom.

t-TESTS DIFFER FROM Z-TESTS IN THREE WAYS

1. t-tests are based on a t-distribution; Z-tests on a Z-distribution.
   • However, when n is small, t-distributions have fatter tails –> the fewer the degrees of freedom, the fatter the tails of the t-distribution.
2. t-tests are used when you do not know or cannot infer the population standard deviation, σ.
   • NB: Z-tests used w/ proportions because we can calculate the SD(p) from p.
   • t-tests used to test hypotheses about sample means. We cannot calculate σ from X-bar.
3. t-tests and t-distributions are characterized by degrees of freedom, a.k.a. d.f.

DEGREES OF FREEDOM?

• If we do not know σ, we use the standard error of the mean, aka SE(X-bar), and d.f. to
   [1] calculate appropriate critical bounds for failing to reject & rejecting null hypotheses (i.e., testing a hypothesis), and
   [2] construct confidence intervals around sample means.
• If we do not use d.f., we underestimate the standard error of the mean, SE(X-bar). An underestimate of SE(X-bar) makes it easier to reject H0. And that is bad.

WHEN CAN WE USE A t-TEST?

Must meet four conditions:
1. Random sampling condition:
2. Independence Assumption:
3. 10% Condition:
4. Nearly Normal Condition: The population from which we sample is approximately normally distributed. –> check histogram.
   • If n [...] compare X-bar1 to X-bar2 to see if both could come from same population.
• two paired sample means –> compare changes in a sample before and after a treatment.
• more than two sample means requires an F-test, something your textbook discusses in Chapter 28 (on the CD only).

ONE SAMPLE MEAN t-TEST

One sample t-test for the mean (De Veaux et al 2005: 527)
   t(n-1) = (X-bar - μ0) / SE(X-bar)
where the standard error of the sample mean is
   SE(X-bar) = s / √n

Associated Hypothesis Test:
H0: μ = μ0
HA: μ ≠ μ0 if a two-tailed test
HA: μ > μ0 or HA: μ < μ0 if a one-tailed test.

EXAMPLE:
After a 4th-year internship with the Ministry of the Environment, you get a job monitoring pollutants discharged into Twelve Mile Creek, home of the magnificent Canada Goose. If the creek is too polluted (i.e., has 100 parts per million of DDT or more), geese lay eggs with weak shells. If that happens, you must locate eggs and warm them artificially.
• Thus, if the pollutant level is 100 parts per million of DDT or more, then the eggs are weak-shelled.

Your boss wants to know if she needs to start interviewing anyone to take care of weak-shelled goose eggs. –> She wants to know if Twelve Mile Creek is too polluted or acceptable.

So you go and collect 9 random samples along Twelve Mile Creek. X-bar = 137 ppm, s = 42 ppm.

What are hypotheses?
H0: μ = 100 ppm DDT contamination
HA: μ < 100 ppm DDT contamination

What is the alpha level?
Your boss will fire you if you screw this up. You love your job, a point evident in your HA. To be safe, set α = 0.01. This is a one-tailed t-test, with critical region in the left tail.

DRAW t-DIAGRAM

What is the corresponding critical t-score?
We look at the t-table on page A-58, and realize we need to calculate degrees of freedom. Since n = 9, our d.f. = 8. The corresponding t-crit = -2.896.

What is our t-test statistic?
   SE(X-bar) = s / √n = 42 / √9 = 14
   t = (X-bar - μ0) / SE(X-bar) = (137 - 100) / 14 = 2.64

Reject or fail to reject null hypothesis?
We compare t-critical of -2.896 w/ calculated t-score of 2.64, and realize our sample mean is nowhere near the critical region. Thus, we fail to reject the null hypothesis.

What does this mean to your boss?
You tell your boss you are 99% confident that the DDT levels are at least 100 parts per million, so she had better start interviewing people to collect and take care of goose eggs. You also tell her that you could be wrong one time out of 100.
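
The arithmetic in the goose example can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name `one_sample_t` is made up for illustration, the summary statistics (137, 42, n = 9) and the critical value -2.896 are taken from the example and its t-table, and nothing here looks up critical values on its own.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t statistic, standard error, degrees of freedom).

    Hypothetical helper for illustration; it implements
    t = (X-bar - mu0) / (s / sqrt(n)) with d.f. = n - 1.
    """
    standard_error = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - hypothesized_mean) / standard_error
    return t_stat, standard_error, n - 1


# Summary statistics from the creek example in the notes.
t_stat, se, df = one_sample_t(sample_mean=137, hypothesized_mean=100,
                              sample_sd=42, n=9)

# Critical value copied from the t-table (alpha = 0.01, one tail, d.f. = 8).
T_CRIT_LEFT = -2.896

# Left-tailed test: reject H0 only if t falls below the negative critical value.
reject_h0 = t_stat < T_CRIT_LEFT

print(f"SE = {se:.2f}, d.f. = {df}, t = {t_stat:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the sample mean lies above 100 ppm, the t statistic is positive and cannot land in a left-tail critical region, which is why the example fails to reject.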

TWO SAMPLE MEAN t-TEST

Two independent sample means t-test (De Veaux et al 2005: 552)
   t = ((X-bar1 - X-bar2) - Δ0) / SE(X-bar1 - X-bar2)
X-bar1 = mean from first sample
X-bar2 = mean from second sample
Δ0 = hypothesized difference between the two means. This is often zero.

The standard error of the difference between the sample means:
   SE(X-bar1 - X-bar2) = √(s1²/n1 + s2²/n2)

Associated Hypothesis Test:
H0: μ1 - μ2 = Δ0
HA: μ1 - μ2 ≠ Δ0 if two-tailed test
HA: μ1 - μ2 > Δ0 OR HA: μ1 - μ2 < Δ0 if one-tailed test

Degrees of freedom for two independent sample means (De Veaux et al 2005: 547, 547f)
Simple way: the smaller of (n1 or n2) - 1.
Precise way:
   d.f. = (s1²/n1 + s2²/n2)² / [ (s1²/n1)² / (n1 - 1) + (s2²/n2)² / (n2 - 1) ]
The simple way usually underestimates the number of degrees of freedom you would get compared to the more complicated formula. Minitab uses the precise way. Thus, the simple way usually makes it harder to reject the null hypothesis.

EXAMPLE:
Your boss promotes you to investigate surface water pollution throughout Niagara Region.
• An abandoned refinery is located on level ground halfway between two creeks. Its storage tanks leak diesel fuel, contaminating at least one of these creeks. You wonder if only one or both creeks are contaminated.
• Ground is level; geology is uniform; thus, each creek should be equally contaminated.
• If not equally contaminated, might mean only some tanks leak, which might mean a less expensive clean-up.
• Random samples along each of these creeks within one kilometer downstream of the abandoned tank park –>
   West Creek: n = 8, w/ X-bar = 45 ppm, s = 12 ppm.
   East Creek: n = 4, w/ X-bar = 53 ppm, s = 24 ppm.

What are the hypotheses?
H0: μ1 = μ2 –> diesel contamination same in each creek
HA: μ1 ≠ μ2 –> diesel contamination differs between creeks

What are the alpha-levels?
You are sure both creeks are dreadfully contaminated; not sure if much can be done to remove contamination. You are not really sure of the consequences of being wrong. You set α = 0.1. Two-tailed test, thus each tail = α/2.

What is the corresponding critical t-value?
See t-table on page A-58 –> need to calculate degrees of freedom. Can calculate d.f. precisely or simply. We choose simple; means we are even more likely to fail to reject H0. n2 is the smaller sample size; thus, d.f. = n2 - 1 = 4 - 1 = 3.

What is the resulting t-score?
First, calculate the standard error of the difference of the two means:
   SE(X-bar1 - X-bar2) = √(12²/8 + 24²/4) = √(18 + 144) = √162 = 12.73
Then, calculate the t-score:
   t = (45 - 53) / 12.73 = -0.63

Reject or fail to reject null hypothesis?
Fail to reject the null since -0.63 falls well within t-crit of ±2.353.

What does this mean to your boss?
We cannot conclude that contamination differs between the two creeks. At α = 0.1, we accept being wrong one time in ten when we do reject a true null.

T-TEST FOR TWO PAIRED SAMPLE MEANS

Sometimes we have two samples that are not independent, but that we wish to compare.
• For instance, we evaluate a weight loss program by weighing clients before and after ten weeks of exercise. These observations are not independent: one's prior weight affects one's later weight.
• Such sampling designs are called matched pair sampling designs.
• The random sampling happens when we first select elements to measure.

If the sample sizes are not equal, then the observations cannot be matched. Matching implies that the number of pairs is constant. The formula for degrees of freedom is n - 1.

Paired sample means t-test (De Veaux et al 2005: 576)
   t(n-1) = (d-bar - Δ0) / SE(d-bar)
d-bar = the mean of the pairwise differences
Δ0 = hypothesized mean of the pairwise differences. This is usually zero.
where the standard error of the mean difference is
   SE(d-bar) = sd / √n

Associated Hypothesis Test:
H0: μd = Δ0, where Δ0 is almost always zero.
HA: μd ≠ Δ0 if a two-tailed test
HA: μd > Δ0 OR HA: μd < Δ0 if a one-tailed test.

EXAMPLE:
At the Ministry of the Environment, you become waterfowl specialist of Region Niagara. One day, a tanker truck filled with diesel fuel flips and spills its contents into a watershed. Shortly afterwards, Ducks Unlimited Canada contacts your office, and complains that the quacking of the ducks in that watershed can scarcely be heard.
• You call a biogeographer from McMaster who researches and records duck vocalizations.
• This biogeographer has tagged all the watershed's ducks with a unique ID. She has a device that simultaneously records each duck's quack and ID number.
• She notes that all the ducks in her random sample seem to be there. So the population has not changed. However, their collective quacks have become mysteriously quieter since the diesel fuel spilled into the watershed.

She shares her data on ducks' quacks from before and after the spill:

Loudest Quack in decibels (db)
Duck ID    Before Spill    After Spill    Paired Differences
0032A2     35              23             35 - 23 = 12
0036X3     46              35             46 - 35 = 11
0931GS     65              70             65 - 70 = -5
Qckr037    55              42             55 - 42 = 13
0012B2     65              36             65 - 36 = 29

Mean of the paired differences –> d-bar = (12 + 11 - 5 + 13 + 29)/5 = 60/5 = 12 decibels

Let's calculate the standard deviation of the paired differences:

Duck ID    Paired differences (di)    (di - d-bar)       (di - d-bar)²
0032A2     12                         12 - 12 = 0        0² = 0
0036X3     11                         11 - 12 = -1       (-1)² = 1
0931GS     -5                         -5 - 12 = -17      (-17)² = 289
Qckr037    13                         13 - 12 = 1        1² = 1
0012B2     29                         29 - 12 = 17       17² = 289

Σ(di - d-bar)² = 580
sd = √(Σ(di - d-bar)² / (n - 1)) = √(580/4) = √145 = 12.04 decibels

• H0 –> diesel spill had no effect on the loudness of the ducks' quacks measured in decibels. Thus, H0: μd = 0.
• HA –> diesel spill had an adverse effect on the ducks, specifically, that their loudness has decreased. Thus, HA: μd > 0.
• Remember: the mean difference will be positive if the initial values were higher, but are now lower.
• Conversely, the mean difference will be negative if the initial values were lower, but are now higher.

So let's conduct the hypothesis test:

What are hypotheses?
H0: μd = 0 –> ducks' loudness unchanged since spill
HA: μd > 0 –> ducks' loudness diminished since spill

What is alpha-level?
Ducks Unlimited Canada people are pretty upset, since they fear they might not have enough diesel-free ducks for their needs. You also like ducks, so you set α = 0.05. This is a one-tailed test, with critical region in the right-hand tail.

What is corresponding critical t-value?
We look at the t-table on page A-58, and realize we need to calculate degrees of freedom. n = 5, so d.f. = n - 1 = 5 - 1 = 4. The corresponding t-crit is 2.132.

What is the resulting t-score?
First, calculate the standard error of the difference:
   SE(d-bar) = sd / √n = √(580/4) / √5 = 12.04 / 2.236 = 5.39
Then, calculate the t-score:
   t = (d-bar - 0) / SE(d-bar) = 12 / 5.39 = 2.23

Reject or fail to reject null hypothesis?
Reject the null since 2.23 falls into the critical region which starts at t-crit of 2.132.

What does this mean?
The ducks are significantly quieter, and you are wrong one time in twenty.

An Overview of Hypothesis Testing
Jeff Boggs
Assistant Professor
Department of Geography
Brock University
St. Catharines, Ontario
15 October 2007
Revised 1 January 2009

Most introductory statistics books discuss one or two approaches to hypothesis testing: the classical approach and the p-value approach. In turn, both of these approaches combine elements of two earlier approaches to hypothesis testing: the Neyman-Pearson approach, and Fisher's approach (Armstrong 2005). The classical approach to hypothesis testing was developed before computers were in widespread use.
As a result, all you need to conduct a hypothesis test are the tables that correspond to the underlying population model, be it Z, t, and so on. The p-value approach requires that you have access to either a sophisticated scientific calculator or a computer and statistical software. I teach the classical approach. The textbook teaches the p-value approach.

The Classical Approach to Hypothesis Testing

There are a few steps here, and different sources will sort them slightly differently.

STEP 0: Become curious about something.
Ultimately, curiosity motivates us to bother to test hypotheses. This curiosity eventually manifests as a claim. Most textbooks leave out this step.

STEP 1: Make a claim.
Normally this claim is supported by some larger body of scholarly literature as well as knowledge of the topic at hand. For instance, as an economic geographer I know there is a large body of scholarly literature that examines the relationship between regional development, worker productivity, and wage rates. In general, this literature claims that places with more productive workers tend to also be places with higher wage rates. I also know something about the Niagara Region as a result of living here (and looking at Statistics Canada data). For the sake of argument, let us say that St. Catharines has a sizeable number of manufacturing employees who are relatively well paid. Let us also say that Niagara Falls has many employees in the tourism sector (e.g., casino workers, chefs, wait staff, and hotel staff) who tend to receive lower wages. However, I don't really know if these differences in wages are all that pronounced between the two locations. I claim there is probably a difference in wages.

STEP 2: Convert this claim into a null and alternate hypothesis.
I claim that there is probably a difference in wages between St. Catharines and Niagara Falls. This claim is called the alternate hypothesis.
The claim that there is no difference in wages between these two locations is the null hypothesis.

STEP 3: Express the null and alternate hypotheses in numerical terms.
There are many claims that can be made that are not measurable, and this includes claims informed by scholarly work. If you want to make systematic, rigorous claims about things relating to cultural geography, social geography and the like, then you best pay attention in GEOG 2P10.

You can express the null and alternate hypotheses either as proportions, or as means and variances. De Veaux et al adopt the position that Z-tests are used only on proportions, whereas t-tests, chi-squared tests and ANOVA are used on means and variances. To avoid confusing you, I abide by their decision, and further restrict our focus to Z- and t-tests.

For the sake of illustration, I will focus on proportions, and in turn, Z-tests. I set my null hypothesis to be that the proportion of employees whose wages exceed 100,000 CAD is the same in both St. Catharines and Niagara Falls. My alternate hypothesis is that there is a difference in the proportion of employees whose wages exceed 100,000 CAD between St. Catharines and Niagara Falls. I am not specifying a direction, and hence this is a two-tailed test. If we decided to state the alternate hypothesis as "proportion is higher in St. Catharines than in Niagara Falls..." then we would have a directional hypothesis test, which always is a one-tailed test.

Here, H0: p1 = p2 and HA: p1 ≠ p2. You might ask: "Where are the measurable bits?" They are implied in this instance. This null hypothesis implies a difference of zero, whereas this alternate hypothesis implies that the difference is not zero. Thus, these can be re-expressed as H0: p1 - p2 = 0 and HA: p1 - p2 ≠ 0, a convention that your textbook does not follow.

STEP 4: Determine the critical regions.
The critical regions help us decide if we reject the null hypothesis, or if we fail to reject the null hypothesis.
This follows on the logic of critical rationalism. Texts that are still steeped in logical positivism will state that you need critical regions to decide if you reject or accept the null hypothesis.

Customarily, we set the critical regions by establishing a boundary called the significance- or alpha-level. If our calculated test statistic (a Z-score in this example) falls into the critical region, then we reject the null hypothesis. If our test statistic does not fall into the critical region, then we fail to reject the null hypothesis.

The significance level or alpha level (sometimes written as α-level, or just α) can be thought of as the border between the land of rejection and failing to reject. It is useful to remember that the lands of rejection are always in the tail or tails of the distribution. Our example here sets both tails to be the critical region, and hence is called a two-tailed test. If our alternate hypothesis stated that one proportion was greater than another, then we would use a one-tailed test, and only one tail would be the critical region.

We select our significance level or α-level based on the cost of making what is called a Type I error. This cost is the cost of rejecting the null hypothesis when it is actually true. Customarily, these are set at α = 0.1, 0.05 or 0.01. These correspond to the probabilities of rejecting the null hypothesis when it is true. Thus, if the result of rejecting the null when it is true is not very costly (i.e., little or no money will be lost, nobody will be harmed), then we might set α = 0.1 or 0.05. Medical research, however, generally sets α = 0.05 or 0.01, as wrongly rejecting the null could be much costlier.

These significance levels correspond to critical values. Critical values are expressed in terms of your test statistic. In our example, they would be expressed in terms of Z. You might think of them as the gate or gates (depending on the number of tails) into the land of rejection.
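
To make the link between α-levels and critical values concrete for the Z example, the short sketch below computes the Z "gates" with Python's standard library instead of reading them from a table. The helper name `z_critical` is an assumption made for illustration; the α-levels are the customary ones listed above.

```python
from statistics import NormalDist


def z_critical(alpha, tails=2):
    """Return the positive critical Z value for a given alpha-level.

    Hypothetical helper: for a two-tailed test alpha is split evenly
    between the tails; for a one-tailed test it all sits in one tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # inv_cdf gives the Z value with the requested area to its left.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.10, 0.05, 0.01):
    two = z_critical(alpha, tails=2)
    one = z_critical(alpha, tails=1)
    print(f"alpha = {alpha:.2f}: two-tailed gates at ±{two:.3f}, "
          f"one-tailed gate at {one:.3f}")
```

For a left-tailed test the single gate is the negative of the one-tailed value. Smaller α-levels push the gates further into the tails, which is why a costly Type I error argues for a smaller α.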

STEP 5: Calculate your test statistic.
Here is the plug-and-chug portion of your evening. Substitute numbers for variables. Be sure to check your work.

STEP 6: Compare your test statistic to your critical value.
Is your test statistic more extreme than your critical value? In other words, does your test statistic fall into the critical region? If your test statistic falls into the critical region, then you reject your null hypothesis. If your test statistic does not fall into the critical region, then you fail to reject your null hypothesis.

STEP 7: Interpret your findings.
The last step involves telling your reader two things. First, tell the reader whether you rejected or failed to reject the null hypothesis. Then tell the reader what this means in plain English.

Bibliography

Armstrong, J. Scott (2005) "Why We Really Don't Know What 'Statistical Significance' Means: A Major Educational Failure." Unpublished paper. The Wharton School of Business, University of Pennsylvania. Accessed on 3 November 2007 at the following URL: http://marketing.wharton.upenn.edu/ideas/pdf/Armstrong/StatisticalSignificance.pdf

For an extended discussion of the procedures for the classical approach, see this description: http://wind.cc.whecn.edu/~pwildman/statnew/section_2_and_3_-_hypothesis_testing_about_the_mean_-_large_samples.htm

1. Identify the steps of hypothesis testing.
2. Explain the difference between the null and alternate hypothesis.
3. Demonstrate the semantic difference between failing to reject and accepting a null hypothesis.
4. Explain how the p-value and a test statistic are related to each other.
5. Explain how the p-value or test statistic is used to evaluate a hypothesis.
6. Explain when one uses a two- instead of a one-tailed hypothesis test.

Instructions

Scenario: Your friend told two other friends that you provide helpful explanations about statistical concepts. These three friends are confused about the steps involved in hypothesis testing.
Prompt: Your friends have five questions:
1. What are the steps involved in hypothesis testing?
2. What is the key difference between the null hypothesis and the alternate hypothesis?
3. Why do we say we reject or fail to reject the null hypothesis instead of saying we reject or accept the null hypothesis?
4. How is a p-value related to a test statistic?
5. What is the key difference between a one- and two-tailed hypothesis test?

To keep things simple, focus on the one-sample mean t-test if you need to talk about a specific procedure, though you suspect that just dealing with the ideas more generally will be fine. Although your friends have read Rogerson Statistical Methods for Geography, Chapter 5, they are still confused. If you can't find your copy of Rogerson, the Sakai site has a link to an electronic copy of the first edition.

Rubric

Levels: Level 1 (Worst) = Novice; Level 2 = Intermediate; Level 3 = Proficient; Level 4 = Advanced

Organization
• Level 1: No clear sequence
• Level 2: Material explained/prese... in a confusing manner
• Level 3: Material explained in a logical manner that relates each of questions two through five back to the steps identified in answer to question 1
• Level 4: Material explained in a logical, sophisticated, and engaging manner

Knowledge of Content
• Level 1: Displays little/no understanding of the concept
• Level 2: Displays some understanding of the concepts. Leaves important information out
• Level 3: Displays sufficient understanding of the concept
• Level 4: Displays thorough understanding of content. Extra research is evident

Use of examples
• Level 1: No useful examples provided.
• Level 2: Examples are usually not helpful in clarifying answers.
• Level 3: Most but not all examples are helpful in clarifying answers.
• Level 4: As LEVEL 3, plus examples include figures, tables or scenario that makes concepts easier to understand.

Number of questions answered
• Level 1: Fewer than two questions answered correctly.
• Level 2: Three questions answered correctly.
• Level 3: Four questions answered correctly.
• Level 4: All questions answered correctly.

Explanation & Answer

Hi! Here's the document. :) Please let me know if you have any further questions. 😁

Hypothesis Testing
The steps involved in hypothesis testing are as follows:
a. State the null hypothesis, 𝐻0.
b. State the alternate hypothesis, 𝐻𝐴.
c. Set a significance level, 𝛼.
d. Choose a statistical test and calculate the observed statistic.
e. Find the critical value/s of the test statistic.
f. Decide whether to reject or fail to reject 𝐻0 by comparing the observed test statistic with the critical value of the test statistic.
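
The final comparison step can be sketched in Python as follows. This is an illustrative helper, not part of the original answer: the name `in_critical_region` and the tail labels are assumptions, while the test statistics and table critical values are the ones quoted in the three worked examples of the attached lecture notes.

```python
def in_critical_region(t_stat, t_crit, tail):
    """Return True if the test statistic falls in the critical region.

    Hypothetical helper. `t_crit` is the table value; its sign is ignored.
    `tail` is 'left', 'right', or 'two'.
    """
    crit = abs(t_crit)
    if tail == "left":
        return t_stat < -crit
    if tail == "right":
        return t_stat > crit
    if tail == "two":
        return abs(t_stat) > crit
    raise ValueError("tail must be 'left', 'right', or 'two'")


# (label, observed t, table critical value, tail) from the lecture examples.
examples = [
    ("one-sample, creek pollution", 2.64, 2.896, "left"),
    ("two-sample, West vs East Creek", -0.63, 2.353, "two"),
    ("paired, duck quacks", 2.23, 2.132, "right"),
]

for label, t_stat, t_crit, tail in examples:
    rejected = in_critical_region(t_stat, t_crit, tail)
    decision = "reject H0" if rejected else "fail to reject H0"
    print(f"{label}: {decision}")
```

The function only ever returns "in the region" or "not in the region", which mirrors the wording of the decision: the null hypothesis is rejected or not rejected, never accepted.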

Sample Problem: One-Sample t-Test
A researcher wants to know if the mean number of weekly shopping trips in
neighborhood A of an urban area differs from 3.1, which is the corresponding mean for
the urban area as a whole.
Solution:
a. State the null hypothesis, 𝐻0 .
Let’s assume that the mean number of shopping trips in neighborhood A is equal to
the mean number of shopping trips for the entire urban area.
𝐻0 : 𝜇 = 3.1
b. State the alternate hypothesis, 𝐻𝐴 .
The true mean number of weekly shopping trips in neighborhood A differs from 3.1.
𝐻𝐴 ∶ 𝜇 ≠ 3.1
c. Set the significance level, 𝛼.
The significance level is the probability of rejecting a true null hypothesis, that is, of
making a Type I error. The value of 𝛼 is chosen by the analyst, but commonly used values of 𝛼
are 0.01, 0.05, and 0.10. For this problem, let’s use 𝛼 = 0.05.
d. Choose a statistical test and calculate the observed statistic.
In this problem, we can use a one-sample t-test to compare the mean number of
shopping ...