Module 3

The Problem of Truncated Range



In the last section, our sample was not representative of the entire population, so the correlation coefficient we obtained probably underestimated the true value of r. In the example below we again have a restricted range of the population: the individuals we are studying, students at Princeton, are a very homogeneous group on the variable of interest, namely SAT scores. Here, too, we will obtain a correlation coefficient that seriously underestimates the true value of r. Whenever our sample is a very homogeneous group, we are dealing with a restricted range.

Example: Suppose you work in the Admissions Office at Princeton University. You want to find out whether there is any correlation between students' SAT scores and their GPAs. Since you want to use students' SAT scores to predict their GPAs, "SAT scores" is called the predictor variable. The variable being predicted, here "GPA", is called the criterion variable or outcome variable. So you decide to look at the combined SAT scores on Reasoning (math) and Critical Reading (verbal) and the GPAs of 20 students you've selected at random. The combined Reasoning and Critical Reading scores can range from 400 to 1600. GPAs range up to 4.0. Below is what you find:

                Combined
Student	    SAT scores	   GPA
 #1               1420          3.7
 #2               1540          3.7
 #3               1300          3.6
 #4               1470          3.7
 #5               1390          3.6
 #6               1370          3.8
 #7               1440          3.7
 #8               1510          3.4
 #9               1310          3.5
 #10              1450          3.5
 #11              1490          3.9
 #12              1320          3.5
 #13              1380          3.5           
 #14              1400          3.5
 #15              1520          3.8
 #16              1500          3.5
 #17              1490          3.5
 #18              1520          3.7
 #19              1360          3.6
 #20              1330          3.5
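
To see how weak the relationship looks within this homogeneous group, we can compute the Pearson correlation coefficient r for the 20 students above. The following is a minimal sketch, not part of the original exercise: the helper name pearson_r and the variable names are our own, and it uses the standard deviation-score formula for r.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists.

    Hypothetical helper written for this illustration.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Combined SAT scores and GPAs of Students #1-20, copied from the table above
sat = [1420, 1540, 1300, 1470, 1390, 1370, 1440, 1510, 1310, 1450,
       1490, 1320, 1380, 1400, 1520, 1500, 1490, 1520, 1360, 1330]
gpa = [3.7, 3.7, 3.6, 3.7, 3.6, 3.8, 3.7, 3.4, 3.5, 3.5,
       3.9, 3.5, 3.5, 3.5, 3.8, 3.5, 3.5, 3.7, 3.6, 3.5]

r_princeton = pearson_r(sat, gpa)
print(f"r for the 20 Princeton students: {r_princeton:.2f}")
```

For these data r should come out at roughly 0.30, a weak positive correlation. Notice that the SAT scores span only 1300 to 1540 and the GPAs only 3.4 to 3.9.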


3. Plot the given data on the following graph.



 

Suppose now we broaden our sample to include the SAT scores and GPAs of students from colleges other than Princeton. In the table below, in addition to our original Princeton students (Students #1-20), we have the scores of an additional 40 students (Students #21-60) who are not at Princeton or any other Ivy League college.

          TABLE
Student	SAT 		GPA
#1	 	1420		3.7
#2		1540		3.7
#3		1300		3.6
#4		1470		3.7
#5		1390		3.6
#6		1370		3.8
#7		1440		3.7
#8		1510		3.4
#9		1310		3.5
#10		1450		3.5
#11		1490		3.9
#12		1320		3.5
#13		1380		3.5
#14		1400		3.5
#15		1520		3.8
#16		1500		3.5
#17		1490		3.5
#18		1520		3.7
#19		1360		3.6
#20		1330		3.5
#21		 800		1.5
#22	 	 820		2.0
#23	 	 850		2.2
#24	 	 880		2.1
#25		 880		1.7
#26	 	 900		1.5
#27	 	 910		2.0
#28		 910		2.1
#29		 920		2.8
#30		 930		2.6
#31		 940		2.8
#32		 940		2.7
#33	 	 940		2.9
#34		 960		2.5
#35	 	 990		2.5
#36		1000		3.0
#37		1020		2.9
#38		1030		3.1
#39		1030		3.1
#40		1050		3.2
#41		1050		3.0
#42		1080		3.1
#43		1080		3.3
#44		1100		3.3
#45		1120		3.3
#46		1150		3.2
#47		1170		3.4
#48		1200		3.3
#49		1220		3.3
#50		1220		3.4
#51		1230		3.3
#52		1240		3.6
#53		1250		3.5
#54		1250		3.4
#55		1270		3.7
#56		1300		3.4
#57		1300		3.3
#58		1300		3.7
#59		1300		3.7
#60		1310		3.4
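
To compare the restricted sample with the broadened one, we can compute r twice: once for Students #1-20 alone and once for all 60 students. This is again only a sketch under our own naming (pearson_r, r_restricted, and r_all are hypothetical names, not part of the original exercise).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r (hypothetical helper for this sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Students #1-20 (Princeton), copied from the table above
sat_princeton = [1420, 1540, 1300, 1470, 1390, 1370, 1440, 1510, 1310, 1450,
                 1490, 1320, 1380, 1400, 1520, 1500, 1490, 1520, 1360, 1330]
gpa_princeton = [3.7, 3.7, 3.6, 3.7, 3.6, 3.8, 3.7, 3.4, 3.5, 3.5,
                 3.9, 3.5, 3.5, 3.5, 3.8, 3.5, 3.5, 3.7, 3.6, 3.5]

# Students #21-60 (other colleges), copied from the table above
sat_other = [800, 820, 850, 880, 880, 900, 910, 910, 920, 930,
             940, 940, 940, 960, 990, 1000, 1020, 1030, 1030, 1050,
             1050, 1080, 1080, 1100, 1120, 1150, 1170, 1200, 1220, 1220,
             1230, 1240, 1250, 1250, 1270, 1300, 1300, 1300, 1300, 1310]
gpa_other = [1.5, 2.0, 2.2, 2.1, 1.7, 1.5, 2.0, 2.1, 2.8, 2.6,
             2.8, 2.7, 2.9, 2.5, 2.5, 3.0, 2.9, 3.1, 3.1, 3.2,
             3.0, 3.1, 3.3, 3.3, 3.3, 3.2, 3.4, 3.3, 3.3, 3.4,
             3.3, 3.6, 3.5, 3.4, 3.7, 3.4, 3.3, 3.7, 3.7, 3.4]

sat_all = sat_princeton + sat_other
gpa_all = gpa_princeton + gpa_other

r_restricted = pearson_r(sat_princeton, gpa_princeton)
r_all = pearson_r(sat_all, gpa_all)

print(f"SAT range, Students #1-20: {min(sat_princeton)}-{max(sat_princeton)}")
print(f"SAT range, Students #1-60: {min(sat_all)}-{max(sat_all)}")
print(f"r, Students #1-20: {r_restricted:.2f}")
print(f"r, Students #1-60: {r_all:.2f}")
```

With the full range of SAT scores (800 to 1540), r should rise to roughly 0.87, compared with roughly 0.30 for the Princeton students alone. The data for Students #1-20 have not changed; only the range covered by the sample has.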

Plot these scores on the graph below.


Question about this website?
Please email: Dr. Barbara Rumain - barbara.rumain@touro.edu
Copyright © 2007-2017, Touro College and University System.
