Movie Theater Survey
Ever wonder how many people still go to movie theaters despite rising ticket prices and the convenience of Netflix at home? This survey was created to find out how often people still go to the movie theater, how much they spend, which movie categories they prefer, and whether they prefer 2D or 3D movies. It was created by a group of six people who used social media (primarily Facebook) to obtain a sample from a population of men and women over the age of 17 living in the United States. A total of 139 people responded. All responses were voluntary (a voluntary response sample), and because we had no systematic way of reaching out to the population, this is a convenience sample rather than a random sample: the people in it were simply easy to reach.
The following questions were included in the survey:
1. How many times a year, on average, do you go to a movie theater to watch a movie?
2. How much, per person, do you spend in total each time you go to the movie theater?
3. Do you prefer to see the movie in 2D or 3D?
4. Which movie category is your favorite?
a. Drama
b. Action/Thriller
c. Comedy/Rom Com
d. Animated/Children/Family
e. SciFi/Horror
f. Other
Looking at a Categorical Variable
The pie chart below shows the results for the question “Which movie category is your favorite?”
<results 1>
This pie chart shows that the favorite movie category was Comedy/Rom Com at 27.34%. Action/Thriller was the second favorite at 25.18%. Drama and SciFi/Horror tied for third at 20.86% each.
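As a rough check, the percentages above can be converted back into respondent counts using the 139 responses reported in the introduction. The sketch below is illustrative only: the variable names and the rounding to whole respondents are assumptions, and the combined share for Animated/Children/Family and Other is inferred as the remainder rather than read from the chart.

```python
# Sketch: convert the reported pie chart percentages into approximate counts.
N_RESPONDENTS = 139  # total responses reported in the survey

# Percentages as reported in the text for the four largest categories.
reported_pct = {
    "Comedy/Rom Com": 27.34,
    "Action/Thriller": 25.18,
    "Drama": 20.86,
    "SciFi/Horror": 20.86,
}

# ASSUMPTION: each percentage corresponds to a whole number of respondents.
counts = {cat: round(pct / 100 * N_RESPONDENTS) for cat, pct in reported_pct.items()}

# Whatever is left belongs to the two categories the text does not report.
remaining = N_RESPONDENTS - sum(counts.values())
remaining_pct = round(100 * remaining / N_RESPONDENTS, 2)

for cat, count in counts.items():
    print(f"{cat}: about {count} respondents")
print(f"Animated/Children/Family and Other combined: about {remaining} ({remaining_pct}%)")
```

Under these assumptions the four reported categories account for 131 of the 139 respondents, which leaves a small remainder for the two categories not mentioned in the text.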
The bar plot below compares the preference for 2D or 3D across all of the movie categories.
<results 2>
The bar plot shows that people clearly prefer 2D movies over 3D movies. The people who chose “Other” preferred 2D only. Among those whose favorite category was Comedy/Rom Com, more than three times as many preferred 2D as preferred 3D. Although Comedy/Rom Com had the highest frequency in 2D, Action/Thriller had the highest frequency in 3D. Drama also showed a large difference in favor of 2D over 3D.
Looking at a Numerical Variable
A histogram and a boxplot were created, along with summary statistics, in response to the question below:
How much, per person, do you spend in total each time you go to the movie theater?
<results 3>
<results 4>
Summary statistics:
Column  n    Mean       Variance   Std. dev.  Std. err.   Median  Range  Min  Max  Q1  Q3
N2      139  19.467626  68.873944  8.2990327  0.70391477  20      60     0    60   15  25
<results 5>
The histogram is skewed to the right. The tallest bar is at $20 spent per person, and the majority of people spent $15 to $25 at the movies. There is a gap in the histogram between $50 and $60. The mean is 19.467626, the median is 20, the range is 60, and the standard deviation is 8.2990327. The IQR (Q3 − Q1) is 10, which is the range of the middle half of the data. The median and mean are both about 20, which matches the peak bar of the histogram; the bars slope downward after 20 and the tail extends toward the high values, which gives the skew to the right.
The range rule of thumb gives range/4 = 60/4 = 15, which is not a good approximation of the standard deviation because there is a large difference between 15 and 8.2990327. The outlier at 60 stretches the range, so the rule of thumb overestimates the standard deviation. The midrange, (60 + 0)/2 = 30, is not a good representation of center for this histogram either, because the tallest bar is at 20, so it overestimates the center. The values 45 and 60 are outliers in the data set. I don’t think these outliers are errors, because some people may spend more on drinks and snacks at the theater.
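The arithmetic in this discussion can be reproduced from the summary statistics alone. The sketch below is a minimal illustration, not the method used in the survey: it assumes the common 1.5 × IQR rule for flagging outliers (the report does not say how the outliers were identified), and the function and variable names are made up for the example.

```python
# Summary statistics copied from the table above (dollars per person).
q1, q3 = 15, 25
minimum, maximum = 0, 60
std_dev = 8.2990327

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

# ASSUMPTION: outliers are flagged with the usual 1.5 * IQR fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """Return True if value lies outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence


# Range rule of thumb and midrange, as discussed in the text.
range_rule_estimate = (maximum - minimum) / 4
midrange = (maximum + minimum) / 2

print(f"IQR = {iqr}, fences = [{lower_fence}, {upper_fence}]")
print(f"45 is an outlier: {is_outlier(45)}; 60 is an outlier: {is_outlier(60)}")
print(f"Range rule estimate = {range_rule_estimate} vs. actual std. dev. = {std_dev}")
print(f"Midrange = {midrange}")
```

With these fences, any amount above $40 is flagged, which is consistent with 45 and 60 being treated as outliers, while the minimum of 0 sits exactly on the lower fence and is not flagged.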
Looking for a Relationship between Two Numerical Variables
<result 6>
The scatter plot at first seems to trend in a positive direction, but at about 20 it starts to trend in a negative direction. Overall, I would describe the scatter plot as following more of a linear path going straight up. In regard to scatter, the points stay close to a defined path.
Simple linear regression results:
Dependent Variable: N2
Independent Variable: N1
N2 = 18.648697 + 0.17539466 N1
Sample size: 139
R (correlation coefficient) = 0.14671873
R-sq = 0.021526385
Estimate of error standard deviation: 8.2391289
Parameter estimates:
Parameter  Estimate    Std. Err.   Alternative  DF   T-Stat     P-value
Intercept  18.648697   0.84313635  ≠ 0          137  22.118246  <0.0001
Slope      0.17539466  0.10102878  ≠ 0          137  1.7360861  0.0848
Analysis of variance table for regression model:
Source  DF   SS         MS         F-stat     P-value
Model   1    204.59977  204.59977  3.0139951  0.0848
Error   137  9300.0045  67.883245
Total   138  9504.6043
<results 7>
Statistically, the information above shows that there is no significant correlation between the two variables N1 (number of times per year a person goes to a movie theater) and N2 (amount in dollars spent per person). For the correlation to be significant, the absolute value of r had to be greater than the critical value of 0.167, but the correlation coefficient was only 0.14671873.
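The numbers in the regression output can be cross-checked against each other, and the critical value of 0.167 can be approximately recovered. The sketch below is only an illustration under stated assumptions: it assumes a two-tailed test at the 0.05 significance level (the report does not state the level), and the critical t value of 1.977 for 137 degrees of freedom is an approximate table value supplied for the example, not a figure from the survey output.

```python
import math

# Values copied from the regression output above.
n = 139
r = 0.14671873
ss_model, ss_total = 204.59977, 9504.6043
ms_model, ms_error = 204.59977, 67.883245

df_error = n - 2  # 137 degrees of freedom for simple linear regression

# Internal consistency checks on the reported output.
r_squared = ss_model / ss_total   # should match the reported R-sq
f_stat = ms_model / ms_error      # should match the reported F-stat
t_stat = r * math.sqrt(df_error) / math.sqrt(1 - r**2)  # slope T-Stat from r

# ASSUMPTION: two-tailed test at alpha = 0.05; 1.977 is an approximate
# t-table value for 137 degrees of freedom, not a number from the source.
T_CRITICAL = 1.977
r_critical = T_CRITICAL / math.sqrt(T_CRITICAL**2 + df_error)

significant = abs(r) > r_critical

print(f"R-sq from ANOVA table: {r_squared:.6f}")
print(f"F-stat from mean squares: {f_stat:.4f}")
print(f"T-Stat from r: {t_stat:.4f}")
print(f"Approximate critical r: {r_critical:.3f}")
print(f"Correlation significant: {significant}")
```

Under these assumptions the recomputed values agree with the reported output, and the correlation falls short of the critical value, which is in line with the slope P-value of 0.0848 being larger than 0.05.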