PUB550: Application and Interpretation of Public Health Data
Topic 1: Data Management and Descriptive Statistics
 Evaluate methods of data organization.
 Compare characteristics of correlational, experimental, and quasi-experimental (observational) statistics variables.
 Identify the four levels of measurement.
 Differentiate between a population and a sample, and a parameter and a statistic (descriptive and inferential).
 Explain the role of quantitative and qualitative methods and sciences in describing and assessing a population’s health.
 Evaluate public health data sources.
 Apply methods to calculate and communicate descriptive statistics.
Data Management 
A local community organization was interested in learning about general health behaviors in the area and the relationships between health behaviors and environmental and social determinants. They decided to conduct a brief survey based on a convenience sample of people visiting the local shopping mall. They offered a $5 incentive for completing the survey. The Topic 1 Example dataset includes 30 observations from this survey. Use this data to complete the relevant assignments in this course.
Education Level  
1  Less than High School  
2  Graduated High School  
3  Graduated College  
Annual Income = US Dollars  
ID  Sex  Smoker  Education_Level***  Minutes_Exercise  Age  Employed  Annual_Income*  Neighborhood 
101  Female  No  2  90  45  Yes  51000  B 
102  Male  No  2  50  58  No  23000  C 
103  Female  Yes  3  65  31  Yes  35000  B 
104  Male  No  1  20  54  No  10000  C 
105  Female  Yes  1  50  30  Yes  28000  B 
106  Female  Yes  2  25  18  No  5000  C 
107  Female  No  3  110  39  Yes  46000  A 
108  Male  Yes  1  50  37  Yes  36000  B 
109  Female  Yes  2  40  44  Yes  51000  C 
110  Male  No  2  80  24  No  12000  A 
111  Female  No  3  120  42  Yes  78000  A 
112  Male  No  1  80  50  Yes  34000  D 
113  Female  Yes  1  60  20  No  15000  B 
114  Male  No  3  150  35  Yes  28000  B 
115  Male  No  2  75  61  Yes  28000  A 
116  Male  No  1  80  59  No  24000  B 
117  Female  No  2  110  36  Yes  55000  D 
118  Male  Yes  3  80  35  Yes  62000  B 
119  Male  Yes  2  100  29  No  32000  D 
120  Female  No  1  0  32  No  7000  C 
121  Female  Yes  2  50  26  No  17000  B 
122  Female  No  3  200  42  Yes  64000  D 
123  Male  No  2  60  52  No  5000  A 
124  Male  No  1  65  49  No  14000  D 
125  Female  No  1  40  21  No  20000  C 
126  Male  Yes  3  65  48  Yes  72000  A 
127  Female  Yes  3  70  40  Yes  85000  A 
128  Female  No  1  45  53  No  15000  B 
129  Male  No  3  75  46  Yes  64000  C 
130  Male  Yes  3  50  42  Yes  27000  B 
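As a minimal sketch of the "calculate and communicate descriptive statistics" objective, the following Python snippet summarizes two columns of the table above: Minutes_Exercise (a ratio-level variable) and Smoker (a nominal variable). The values are copied from the table; the function name `describe` and the choice of Python (rather than the Excel or SPSS used in the course) are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Minutes_Exercise for IDs 101-130, copied in order from the table above.
minutes_exercise = [
    90, 50, 65, 20, 50, 25, 110, 50, 40, 80,
    120, 80, 60, 150, 75, 80, 110, 80, 100, 0,
    50, 200, 60, 65, 40, 65, 70, 45, 75, 50,
]

# Smoker status for the same 30 respondents, in the same order.
smoker = [
    "No", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "No",
    "No", "No", "Yes", "No", "No", "No", "No", "Yes", "Yes", "No",
    "Yes", "No", "No", "No", "No", "Yes", "Yes", "No", "No", "Yes",
]


def describe(values):
    """Return common descriptive statistics for a numeric variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


summary = describe(minutes_exercise)
smoker_counts = Counter(smoker)  # frequency table for a nominal variable

print(summary)
print(smoker_counts)
```

Means and standard deviations suit interval and ratio variables such as minutes of exercise, while a frequency count is the appropriate summary for a nominal variable such as smoking status.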
Topic 1 DQ 1 
Mixed methods research is becoming an important approach in generating public health evidence. Based on the resources supplied, discuss the benefits of a mixed methods approach. Include an explanation of the differences between qualitative and quantitative research and the purpose of each.
Mixed methods research has become increasingly popular; however, its definition has yet to be agreed upon (Ozawa & Pongpirul, 2014). Essentially, mixed methods studies incorporate quantitative and qualitative data to draw on the strengths of both types of research methods (Ozawa & Pongpirul, 2014). In health systems, mixed methods research is critical because it allows researchers to see issues from various perspectives, contextualize information, gain a better understanding of the issue, form results, quantify difficult measures, illustrate trends, and examine processes (Ozawa & Pongpirul, 2014).
Mixed methods research designs fall into four categories: the triangulation design, the embedded design, the explanatory design, and the exploratory design (Almalki, 2016). The triangulation design is practical because it gathers data from different sources using different methods, which work together as a well-organized design (Almalki, 2016). The embedded design requires fewer resources and produces less data, making it easier for researchers to grasp (Almalki, 2016). The explanatory design is easy to implement and keeps the research focused (Almalki, 2016). With the exploratory design, the separate stages are easy to apply, and the qualitative information is acceptable to quantitative researchers (Almalki, 2016).
Quantitative researchers regard the world as existing outside of themselves. The purpose is to gain an understanding about the social world (Almalki, 2016). The qualitative approach gains a perspective on issues by investigating them in their own specific setting. The purpose is to observe occurrences and bring meaning to them (Almalki, 2016). The differences between quantitative and qualitative research are as follows:
Quantitative Approach  Qualitative Approach 
Deductive  Inductive, with underlying assumptions reality is a social construct 
Subdivides reality into smaller, manageable pieces  Places emphasis on exploring and understanding 
Observations are made and hypotheses can be tested among variables  Variables are difficult to measure 
Primacy of subject matter  
Conclusions are made with regard to the hypothesis, following a series of observations and analysis of data  Data collected will consist of an insider’s viewpoint 
(Almalki, 2016).
References
Almalki, S. (2016). Integrating Quantitative and Qualitative Data in Mixed Methods Research – Challenges and Benefits. Journal of Education and Learning. doi:10.5539/jel.v5n3p288. Retrieved from https://files.eric.ed.gov/fulltext/EJ1110464.pdf
Ozawa, S. & Pongpirul, K. (2014). 10 best resources on…mixed methods research in health systems. Health Policy and Planning. Retrieved from https://academic.oup.com/heapol/article/29/3/323/581455
The delivery of healthcare is becoming more complex, as evidenced by the rising number of individuals with comorbidities and the shift toward quality of care over quantity. Addressing the challenges generated by this complex system requires research that not only produces statistical data but also understands a population’s natural setting and provides insight into how the research can be applied to that setting. Mixed methods research is becoming an important approach in generating public health evidence because it combines qualitative and quantitative research. Qualitative research answers clinical questions regarding meaning and quality improvement and provides descriptive data, while quantitative research answers clinical questions regarding therapy, etiology, diagnosis, prevention, and prognosis and produces numerical data (Winona State University, 2014). Favorable characteristics of mixed methods research include consistency between the research question, purpose, and methodological choices; verifiable and transparent techniques that demonstrate trustworthiness; potential for replicability; opportunity for self-correction; and ability to explain the phenomena under investigation (Hitchcock & Newman, 2012). Furthermore, mixed methods research answers questions that qualitative or quantitative research cannot answer alone; provides a better understanding of connections or contradictions between qualitative and quantitative data; gives participants an opportunity to have a voice and share their experience across the research process, which is important within public health; and facilitates different avenues of exploration that enhance the quality of evidence and enable questions to be answered more deeply (Shorten & Smith, 2017). A mixed methods approach uses the combined strengths of qualitative and quantitative data. Its unique design is appropriate for addressing complex public health issues.
References
Hitchcock, J. H., & Newman, I. (2012). Applying an Interactive Quantitative-Qualitative Framework. Human Resource Development Review, 12(1), 36–52. https://doi.org/10.1177/1534484312462127
Shorten, A., & Smith, J. (2017). Mixed methods research: expanding the evidence base. Evidence Based Nursing, 20(3), 74–75. https://doi.org/10.1136/eb2017102699
Winona State University. (2014). Research Hub: Evidence Based Practice Toolkit: Levels of Evidence. Retrieved from Winona.edu website: https://libguides.winona.edu/c.php?g=11614&p=61584
Topic One, Discussion Question 2
Statistics are ways to summarize data so as to answer a specific question (Corty, 2016). Several key terms help define statistics: population, sample, parameter, and statistic.
In research studies, investigators look for subjects to study. These subjects form a large group called a population (Corty, 2016). If the researcher wanted to look at only a small group from this population, that group would be called a sample (Corty, 2016). For example, if I were to do a research study on obesity, I could use the state of Kentucky as my population. If I wanted to look only at Shelbyville, Kentucky, that would be a sample of Kentucky.
Data from either the sample or the population can be reduced to a single number, such as an average, that summarizes the group (Corty, 2016). If it characterizes the sample, it is called a statistic; if it characterizes the population, it is called a parameter. Sample statistics use Latin letters as their symbols, and population parameters use Greek letters (Corty, 2016).
There are also descriptive and inferential statistics. A descriptive statistic is a summary statement about a set of cases (Corty, 2016). It reduces a set of data to a meaningful value that describes the characteristics of the group being observed; for example, 63% of the class were female. Inferential statistics use a sample of cases to draw a conclusion about the larger population, reducing the data to a single value from which inferences about the population are made (a generalization from the sample to the population); for example, female students at GCU have a 15% higher GPA on average than male students (Corty, 2016).
Public health researchers often limit their analyses to descriptive statistics, reporting frequencies, means, and standard deviations (Guetterman, 2019). This leads to missed opportunities for more advanced analyses.
“For example, knowing that patients have favorable attitudes about a treatment may be important and can be addressed with descriptive statistics. On the other hand, finding that attitudes are different (or not) between men and women and that difference is statistically significant may give even more actionable information to healthcare professionals” (Guetterman, 2019). This missing piece about differences can be addressed through inferential statistical tests (Guetterman, 2019). Therefore, both are extremely important to public health research.
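The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is entirely hypothetical (the integers 1 to 1000 standing in for some measured value), and the variable names `mu` and `x_bar` are chosen only to mirror the Greek-versus-Latin naming convention described above.

```python
import random
from statistics import mean

# Hypothetical population: every member is known, so its mean is a parameter.
population = list(range(1, 1001))
mu = mean(population)  # population parameter (Greek letter by convention)

# Draw a simple random sample of 50 members without replacement.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 50)
x_bar = mean(sample)  # sample statistic (Latin letter by convention)

print(f"parameter mu = {mu}, statistic x_bar = {x_bar}")
```

Reporting `x_bar` on its own is descriptive; using `x_bar` to say something about `mu` is inferential.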
References:
Corty, E. (2016). Using and interpreting statistics: A practical text for the behavioral, social, and health sciences (3rd ed.). Retrieved from https://viewer.gcu.edu/GGdEcj
Guetterman, T. (2019). Basics of statistics for primary care research. Family Medicine and Community Health, 7(2). Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6583801/
Practicing Application of Descriptive Statistics in Excel and SPSS 
Calculating Confidence Intervals 
Topic 2 DQ 1 
P-values and confidence intervals are both used in hypothesis testing. Explain three reasons why it may be preferable to report a confidence interval over a P-value. Provide a specific example to justify your reasons.
Topic 2 DQ 1
du Prel et al. (2009) state the following: P-values in scientific studies are used to determine whether a null hypothesis formulated before the performance of the study is to be accepted or rejected. In exploratory studies, p-values enable the recognition of any statistically noteworthy findings. Confidence intervals provide information about a range in which the true value lies with a certain degree of probability, as well as about the direction and strength of the demonstrated effect. This enables conclusions to be drawn about the statistical plausibility and clinical relevance of the study findings. It is often useful for both statistical measures to be reported in scientific articles, because they provide complementary types of information (p. 335).
According to du Prel et al. (2009), “For example, there might be no difference between two antihypertensives with respect to their ability to reduce blood pressure. The alternative hypothesis (H1) then states that there is a difference between the two treatments. This can either be formulated as a two-tailed hypothesis (any difference) or as a one-tailed hypothesis (positive or negative effect). In this case, the expression ‘one-tailed’ means that the direction of the expected effect is laid down when the alternative hypothesis is formulated” (p. 335).
Reference
du Prel, J. B., Hommel, G., Röhrig, B., & Blettner, M. (2009). Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications. Deutsches Arzteblatt international, 106(19), 335–339. doi:10.3238/arztebl.2009.0335
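To make the contrast between the two measures concrete, here is a minimal sketch that reports both a two-sided p-value and a 95% confidence interval for the same result. The numbers are hypothetical: a 4.0 mmHg mean difference in blood pressure reduction between two antihypertensives with a standard error of 1.5 mmHg. A normal approximation is assumed, and the function name `z_interval_and_p` is an illustration only.

```python
from statistics import NormalDist


def z_interval_and_p(diff, se, confidence=0.95):
    """Return (confidence interval, two-sided p-value) under a normal approximation."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    interval = (diff - z_crit * se, diff + z_crit * se)
    z = diff / se  # test statistic for H0: true difference = 0
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return interval, p_value


# Hypothetical difference in mean blood pressure reduction (mmHg) and its standard error.
ci, p_value = z_interval_and_p(diff=4.0, se=1.5)
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f}) mmHg, p = {p_value:.4f}")
```

The p-value only says whether the difference is distinguishable from zero. The interval also shows the direction of the effect and the range of plausible sizes in clinical units, which is the complementary information the quoted passage describes.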
Topic 2 DQ 2
The Central Limit Theorem is often described as a fundamental theorem of statistics. In a nutshell, it says that for independent and identically distributed data whose variance is finite, the sampling distribution of the mean becomes more nearly normal (i.e., Gaussian) as the sample size grows (Chang, Wu, Ho, & Chen, 2008). The sample mean $\bar{x}_n$ will then approach the population mean $\mu$, in distribution. More formally,
$$\frac{\bar{x}_n - \mu}{\sigma_n} \xrightarrow{d} N(0, 1), \qquad \sigma_n = \frac{\sigma}{\sqrt{n}},$$
where $N(0, 1)$ is the standard normal distribution and the symbol “d” means convergence in distribution. $\sigma_n$ is the standard deviation of the sampling distribution, $\sigma$ is the standard deviation of the entire population under study (which is often not known), and $n$ is the sample size. So, sample means vary less than individual measurements. (The square of the standard deviation is the variance.) The sampling distribution is a notional (imaginary) distribution from a very large number of samples, each one of size $n$, which approaches a normal distribution in the limit of large $n$. In practice, the Central Limit Theorem is commonly treated as holding for $n$ as low as 30, unless there are exceptional circumstances (e.g., when the population distribution is highly skewed), in which case higher values are needed. So, $\sigma_n$ measures how widely the sample means of size $n$ vary around the population mean $\mu$. As expected, the results suggest that the distribution of the sample mean better approximates the normal distribution as the sample size increases. The results indicate that the true distribution of the sample mean when the sample is taken from a highly skewed distribution better approximates the normal distribution as the thickness of the tail of the population distribution increases.
Chang, H. J., Wu, C. H., Ho, J. F., & Chen, P. (2008). On sample size in using central limit theorem for gamma distribution
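The claim that sample means vary less than individual measurements, with a spread of roughly σ/√n, can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (a skewed distribution whose mean and standard deviation are both 1), and the function name `simulate_sample_means`, the seed, and the number of replications are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n, num_samples=2000, seed=0):
    """Draw num_samples samples of size n from Exponential(1) and return their means."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


for n in (5, 30, 120):
    means = simulate_sample_means(n)
    print(
        f"n={n:>3}: mean of sample means={mean(means):.3f}, "
        f"SD of sample means={stdev(means):.3f}, "
        f"sigma/sqrt(n)={1 / sqrt(n):.3f}"
    )
```

The simulated standard deviation of the sample means should track σ/√n as n grows, even though the underlying population is far from normal.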
Topic 3: Hypothesis Testing
 Evaluate the importance of hypothesis testing in statistics and public health research.
Hypothesis Testing 
Topic 3 DQ 1 
Discuss the four potential outcomes of hypothesis testing and describe what is meant by type 1 and type 2 errors. Provide an example of when these errors might occur.
Topic 3 DQ 1
Banerjee et al. (2009) state the following: Hypothesis testing is an important activity of empirical research and evidence-based medicine. A well worked up hypothesis is half the answer to the research question. For this, both knowledge of the subject derived from extensive review of the literature and working knowledge of basic statistical concepts are desirable. The present paper discusses the methods of working up a good hypothesis and statistical concepts of hypothesis testing (p. 127).
Banerjee et al. (2009) continue: Just like a judge’s conclusion, an investigator’s conclusion may be wrong. Sometimes, by chance alone, a sample is not representative of the population. Thus, the results in the sample do not reflect reality in the population, and the random error leads to an erroneous inference. A type I error (false-positive) occurs if an investigator rejects a null hypothesis that is actually true in the population; a type II error (false-negative) occurs if the investigator fails to reject a null hypothesis that is actually false in the population. Although type I and type II errors can never be avoided entirely, the investigator can reduce their likelihood by increasing the sample size (the larger the sample, the lesser is the likelihood that it will differ substantially from the population) (p. 127).
Banerjee et al. (2009) add: False-positive and false-negative results can also occur because of bias (observer, instrument, recall, etc.). (Errors due to bias, however, are not referred to as type I and type II errors.) Such errors are troublesome, since they may be difficult to detect and cannot usually be quantified (p. 127).
Reference
Banerjee, A., Chitnis, U. B., Jadhav, S. L., Bhawalkar, J. S., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial psychiatry journal, 18(2), 127–131. doi:10.4103/09726748.62274
Topic 3 DQ 1
The four potential outcomes of hypothesis testing are:
 Correct inference: Conclude that there is an association when one does exist in the population.
 Correct inference: Conclude that there is no association when one does not exist in the population.
 Incorrect inference (type 1): Conclude that there is an association when there actually is none (false positive).
 Incorrect inference (type 2): Conclude that there is no association when there is one (false negative) (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009).
When the sample is not representative of the population, this leads to an erroneous inference and a type 1 or type 2 error. A type 1 error is a false positive: the investigator rejects a null hypothesis that is actually true (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009). A type 2 error is the opposite, a false negative: the investigator fails to reject a null hypothesis that is actually false in the population. These errors are impossible to avoid completely, but their likelihood can be decreased by increasing the sample size (Banerjee, Chitnis, Jadhav, Bhawalkar, & Chaudhury, 2009).
Bibliography
Banerjee, A., Chitnis, U., Jadhav, S., Bhawalkar, J., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial Psychiatry Journal, 18(2), 127–131.
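The four outcomes listed above form a two-by-two decision table: the true state of the null hypothesis crossed with the investigator's decision. A minimal sketch, with a hypothetical function name `classify_outcome`, makes the mapping explicit.

```python
def classify_outcome(null_is_true, reject_null):
    """Map (truth about H0, decision about H0) to one of the four outcomes."""
    if null_is_true and reject_null:
        return "type I error (false positive)"
    if not null_is_true and not reject_null:
        return "type II error (false negative)"
    if null_is_true and not reject_null:
        return "correct inference (no association, none concluded)"
    return "correct inference (association exists, association concluded)"


# Enumerate all four combinations of truth and decision.
for null_is_true in (True, False):
    for reject_null in (True, False):
        print(null_is_true, reject_null, "->", classify_outcome(null_is_true, reject_null))
```

Note that a type II error corresponds to failing to reject a false null hypothesis, not to rejecting it.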
Topic 3 DQ 2 
Review the Healthy People 2020 website. Identify one of the health issues and propose a scenario that would use a z-test as the first step in the six steps of hypothesis testing. Discuss the remaining five steps based on your scenario, including clearly articulating the null and alternative hypotheses for your scenario.
Topic 3 DQ 2
Sphweb (n.d.) presents the following example: The Centers for Disease Control (CDC) reported on trends in weight, height and body mass index from the 1960’s through 2002. The general trend was that Americans were much heavier and slightly taller in 2002 as compared to 1960; both men and women gained approximately 24 pounds, on average, between 1960 and 2002. In 2002, the mean weight for men was reported at 191 pounds. Suppose that an investigator hypothesizes that weights are even higher in 2006 (i.e., that the trend continued over the subsequent 4 years). The research hypothesis is that the mean weight in men in 2006 is more than 191 pounds. The null hypothesis is that there is no change in weight, and therefore the mean weight is still 191 pounds in 2006 (n.d.).
Sphweb (n.d.) continues: In order to test the hypotheses, we select a random sample of American males in 2006 and measure their weights. Suppose we have resources available to recruit n=100 men into our sample. We weigh each participant and compute summary statistics on the sample data. Suppose in the sample we determine the following:
 n=100
 x̄=197.1
 s=25.6
Sphweb (n.d.) then asks: Do the sample data support the null or research hypothesis? The sample mean of 197.1 is numerically higher than 191. However, is this difference more than would be expected by chance? In hypothesis testing, we assume that the null hypothesis holds until proven otherwise. We therefore need to determine the likelihood of observing a sample mean of 197.1 or higher when the true population mean is 191 (i.e., if the null hypothesis is true or under the null hypothesis). We can compute this probability using the Central Limit Theorem. Specifically,
$$P(\bar{X} \ge 197.1) = P\left(Z \ge \frac{197.1 - 191}{25.6/\sqrt{100}}\right) = P(Z \ge 2.38) \approx 0.0087$$
(n.d.).
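The probability described above can be computed directly from the figures in the example (null mean 191, sample mean 197.1, s = 25.6, n = 100). The sketch below assumes the sample standard deviation stands in for the population value, as the example does; the function name `one_sided_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, null_mean, sd, n):
    """Return (z, p) for H1: mean > null_mean, using the normal approximation."""
    standard_error = sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


z, p_value = one_sided_z_test(sample_mean=197.1, null_mean=191, sd=25.6, n=100)
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

Because the p-value falls well below a conventional 0.05 significance level, the null hypothesis of no change in mean weight would be rejected in this scenario.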
Review of the “Nutrition and Weight Status” topic on the Healthy People 2020 website
Obesity in Adults (NWS-9)
 Healthy People 2020 objective NWS-9 tracks the proportion of adults with obesity (BMI ≥ 30).
 HP2020 Baseline: In 2005–2008, the rate of obesity was 33.9% among adults aged 20 years and over (age adjusted).
 HP2020 Target: 30.5%, a 10% improvement over the baseline.
 Most Recent: In 2013–2016, the rate of obesity was 38.6% among adults aged 20 years and over (age adjusted).
 Males aged 20 years and over had a lower rate of obesity than females (36.5% versus 40.5%, age adjusted) in 2013–2016. The rate for females was 11.0% higher than that for males.
 Among racial and ethnic groups, the non-Hispanic Asian population had the lowest (best) rate of obesity, 12.5% of adults aged 20 years and over (age adjusted) in 2013–2016. Rates (age adjusted) for other racial and ethnic groups were:
 0% among the non-Hispanic black population; more than 3.5 times the best group rate
 9% among the Hispanic population; more than 3.5 times the best group rate
 1% among the non-Hispanic white population; 3 times the best group rate
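The relative comparisons quoted in this list are simple ratios, and a short sketch shows how they follow from the reported rates. Only figures stated above are used (33.9%, 36.5%, 40.5%, 12.5%); the variable names are hypothetical.

```python
# Target as a 10% relative improvement over the 2005-2008 baseline.
baseline_rate = 33.9
target_rate = baseline_rate * (1 - 0.10)

# Relative (not absolute) difference between female and male obesity rates.
male_rate, female_rate = 36.5, 40.5
relative_excess = (female_rate - male_rate) / male_rate * 100

# Rate a group must exceed to be "more than 3.5 times the best group rate".
best_group_rate = 12.5
threshold_3_5_times = 3.5 * best_group_rate

print(f"target = {target_rate:.1f}%")
print(f"female rate is {relative_excess:.1f}% higher than male rate")
print(f"3.5 x best group rate = {threshold_3_5_times}%")
```

Distinguishing relative differences (11.0% higher) from absolute differences (4.0 percentage points) matters when communicating these results.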
Reference
Explore the Healthy People 2020 website.
URL:
https://www.healthypeople.gov/
http://www.realstatistics.com/hypothesistesting/nullhypothesis/
Topic 4: The t-Test
Objectives:
 Differentiate the use of three types of t-tests.
 Explain the assumptions of the t-test.
 Interpret t-test results to determine the difference in means.
Application of the t-Test 
Topic 4 DQ 1 
Compare the three types of t-tests by discussing when each is most appropriate to use and which types of questions each type of t-test best answers. Include specific examples to illustrate the appropriate use of each test.
Topic 4 DQ 1
In statistics, t-tests are a type of hypothesis test that allows you to compare means. They are called t-tests because each t-test boils your sample data down to one number, the t-value. If you understand how t-tests calculate t-values, you’re well on your way to understanding how these tests work.
In this series of posts, I’m focusing on concepts rather than equations to show how t-tests work. However, this post includes two simple equations that I’ll work through using the analogy of a signal-to-noise ratio (Editor, n.d.).
Both the signal and noise values are in the units of your data. If your signal is 6 and the noise is 2, your t-value is 3. This t-value indicates that the difference is 3 times the size of the standard error. However, if there is a difference of the same size but your data have more variability (6), your t-value is only 1. The signal is at the same scale as the noise (Editor, n.d.).
In this manner, t-values allow you to see how distinguishable your signal is from the noise. Relatively large signals and low levels of noise produce larger t-values. If the signal does not stand out from the noise, it’s likely that the observed difference between the sample estimate and the null hypothesis value is due to random error in the sample rather than a true difference at the population level (Editor, n.d.).
Many people are confused about when to use a paired t-test and how it works. I’ll let you in on a little secret. The paired t-test and the 1-sample t-test are actually the same test in disguise! As we saw above, a 1-sample t-test compares one sample mean to a null hypothesis value. A paired t-test simply calculates the difference between paired observations (e.g., before and after) and then performs a 1-sample t-test on the differences (Editor, n.d.).
Understanding that the paired t-test simply performs a 1-sample t-test on the paired differences can really help you understand how the paired t-test works and when to use it. You just need to figure out whether it makes sense to calculate the difference between each pair of observations (Editor, n.d.).
Reference
Editor, M. B. (n.d.). Understanding t-Tests: 1-sample, 2-sample, and Paired t-Tests. Retrieved September 27, 2019, from https://blog.minitab.com/blog/adventuresinstatistics2/understandingttests1sample2sampleandpairedttests.
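The signal-to-noise description and the claim that a paired t-test is a 1-sample t-test on the differences can both be sketched in a few lines. The blood pressure readings below are hypothetical, and the function names are illustrations only.

```python
from math import sqrt
from statistics import mean, stdev


def t_value(signal, noise):
    """t is the signal (difference from the null value) divided by the noise (standard error)."""
    return signal / noise


def one_sample_t(values, null_value=0.0):
    """1-sample t statistic comparing the sample mean to a null hypothesis value."""
    signal = mean(values) - null_value
    noise = stdev(values) / sqrt(len(values))  # standard error of the mean
    return t_value(signal, noise)


def paired_t(before, after):
    """Paired t statistic: a 1-sample t-test on the paired differences."""
    differences = [b - a for b, a in zip(before, after)]
    return one_sample_t(differences, null_value=0.0)


# Hypothetical systolic blood pressure for five people before and after a program.
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 133, 131]

print(t_value(6, 2), t_value(6, 6))  # the two signal-to-noise cases from the text
print(round(paired_t(before, after), 3))
```

An independent (2-sample) t-test would instead compare the means of two unrelated groups, so taking differences within pairs would not make sense there.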
Topic 4 DQ 2 
Step 2 of hypothesis testing involves reviewing the assumptions of the test you selected. Discuss the three assumptions of the t-test. Provide an example of the assumption that is not robust to violations and a situation when the assumption is violated.
Hoekstra et al. (2012) state the following: Using a statistical test is one of the frequently mentioned methods of checking for violations of assumptions (for an overview of statistical methodology textbooks that directly or indirectly advocate this method, see e.g., Hayes and Cai, 2007). However, it has also been argued that it is not appropriate to check assumptions by means of tests (such as Levene’s test) carried out before deciding on which statistical analysis technique to use because such tests compound the probability of making a Type I error (e.g., Schucany and Ng, 2006). Even if one desires to check whether or not an assumption is met, two problems stand in the way. First, assumptions are usually about the population, and in a sample the population is by definition not known. For example, it is usually not possible to determine the exact variance of the population in a sample-based study, and therefore it is also impossible to determine that two population variances are equal, as is required for the assumption of equal variances (also referred to as the assumption of homogeneity of variances) to be satisfied. Second, because assumptions are usually defined in a very strict way (e.g., all groups have equal variances in the population, or the variable is normally distributed in the population), the assumptions cannot reasonably be expected to be satisfied (p. 1).
Hoekstra et al. (2012) also state: The assumptions of normality and of homogeneity of variances are required to be met for the t-test for independent group means, one of the most widely used statistical tests (Hayes and Cai, 2007), as well as for the frequently used techniques ANOVA and regression (Kashy et al., 2009). The assumption of normality is that the scores in the population in case of a t-test or ANOVA, and the population residuals in case of regression, be normally distributed. The assumption of homogeneity of variance requires equal population variances per group in case of a t-test or ANOVA, and equal population variances for every value of the independent variable for regression. Although researchers might be tempted to think that most statistical procedures are relatively robust against most violations, several studies have shown that this is often not the case, and that in the case of one-way ANOVA, unequal group sizes can have a negative impact on the technique’s robustness (e.g., Havlicek and Peterson, 1977; Wilcox, 1987; Lix et al., 1996) (p. 2).
Reference
Hoekstra, R., Kiers, H. A., & Johnson, A. (2012). Are assumptions of well-known statistical techniques checked, and why (not)? Frontiers in Psychology, 3, 137. doi:10.3389/fpsyg.2012.00137
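To see why the homogeneity of variance assumption matters for an independent-groups t-test, the sketch below compares two ways of computing the standard error of the difference in means for two hypothetical groups with unequal sizes and very different spreads. The pooled form assumes equal population variances; the unpooled form (the one used in Welch's t-test, which is brought in here for contrast and is not discussed in the quoted source) does not. All data and function names are assumptions for illustration.

```python
from math import sqrt
from statistics import variance


def pooled_se(a, b):
    """Standard error of the mean difference assuming equal population variances."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return sqrt(pooled_var * (1 / n_a + 1 / n_b))


def unpooled_se(a, b):
    """Standard error of the mean difference without assuming equal variances."""
    return sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical groups: a small, tightly clustered group and a larger, widely spread one.
group_a = [10, 12, 11, 13, 14]
group_b = [5, 20, 8, 25, 2, 30, 12, 18, 9, 21]

print(f"variance ratio = {variance(group_b) / variance(group_a):.1f}")
print(f"pooled SE = {pooled_se(group_a, group_b):.3f}")
print(f"unpooled SE = {unpooled_se(group_a, group_b):.3f}")
```

When the two standard errors differ this much, the t-value (and therefore the conclusion) depends heavily on whether equal variances are assumed, which is the situation in which the assumption is not robust to violation.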
Topic 5: ANOVA Testing
 Compare and contrast the types of ANOVA tests and their application.
 Apply the results of an ANOVA to determine statistical difference between means and potential interactions.
Application of ANOVA 
Topic 5 DQ 1 
Compare the various types of ANOVA by discussing when each is most appropriate for use and which types of research questions each best answers. Include specific examples to illustrate the appropriate use of each test and how interaction is assessed using ANOVA.
Topic 5 DQ 1
Analysis of variance (ANOVA) is a family of statistical tests used to compare the means of two or more groups (Corty, 2016). Two types are discussed here: the between-subjects, one-way ANOVA and the between-subjects, two-way ANOVA. ‘Between subjects’ means independent samples, and ‘way’ refers to an explanatory variable, which may be a grouping variable or an independent variable (Corty, 2016).
The one-way ANOVA is a statistical test used when comparing the means of two or more independent samples when there is only one explanatory variable (Corty, 2016). A one-way ANOVA is most appropriate for assessing differences in one continuous variable across the levels of one grouping variable (Statistics Solutions, 2020). A one-way ANOVA allows more groups to be compared at once, allowing more complex questions to be addressed (Corty, 2016). For example, a one-way ANOVA would be appropriate if the goal of the research is to assess differences in job satisfaction levels between ethnicities (Statistics Solutions, 2020). This example involves one dependent variable, job satisfaction, and one independent variable, ethnicity.
The two-way ANOVA allows researchers to examine the impact of two explanatory variables at one time (Corty, 2016). It is most appropriate when two influencing factors are studied at the same time, and it answers more complex questions, including whether the factors interact (Corty, 2016). For example, a researcher studying factors that influence altruism is interested both in how children are reared and in what their nervous systems are like, that is, nurture vs. nature (Corty, 2016). The study was performed using adopted children. Below is the study design, with the levels of altruism in adoptive and birth parents as the two influencing factors:
Columns (adoptive parents): High on Altruism  Medium on Altruism  Low on Altruism
Rows (birth parents): High Altruism  Low Altruism
(Corty, 2016)
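For the one-way case, the F ratio compares variability between group means with variability within groups. The following is a minimal sketch with hypothetical job satisfaction scores for three groups; the function name `one_way_anova_f` is an illustration, and in practice SPSS or a statistics library would also report the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F ratio (mean square between / mean square within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)

    # Between-groups sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Within-groups sum of squares: how far each score sits from its own group mean.
    ss_within = sum((value - mean(group)) ** 2 for group in groups for value in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical job satisfaction scores for three independent groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio = one_way_anova_f(groups)
print(f"F = {f_ratio}")
```

A two-way ANOVA extends this idea by partitioning the between-groups variability into two main effects and an interaction term, which is how interaction between the two factors is assessed.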
References
Corty, E. W. (2016). Using and Interpreting Statistics: A Practical Text for the Behavioral, Social, and Health Sciences (3rd ed.). New York, NY: Worth Publishers.
Statistics Solutions. (2020). The Various Forms of ANOVA. Retrieved from https://www.statisticssolutions.com/thevariousformsofanova/
Topic 5 DQ 2 
Different types of software can be used for data management. Compare Excel and SPSS and discuss specific SPSS software features that make it preferable to Excel for data management. Provide examples illustrating when electing to use SPSS could be preferable to Excel and vice versa.
Topic 5 DQ 2
When it comes to statistical analysis there are a few different types of software that can be used for data management. There are two types used in the PUB 550 course at Grand Canyon University. The first is Excel, which is a spreadsheet software that can also be used for statistical analysis. The other is SPSS, Statistical Package for Social Sciences, is an actual statistical analysis software (Statistics Solutions, 2019). Excel is an easy to use software that allows researchers to format data into a table format or spreadsheets, with rows and columns, and then filter the data using formulas (Mittermeier, 2019). The primary purpose of Excel is to create records of data along with manipulation of the data into visual analysis in preparation for formal presentations and reports. SPSS is specifically made for statistical analysis. The software has been used by researchers for decades to perform quantitative analysis of data, allowing for import of statistical packages from other databases and spreadsheets (Statistics Solutions, 2019). Excel utilizes formulas to perform analyses, that the user is expected to be knowledgeable of, whereas SPSS has specific tools to recode and transform variables without additional knowledge of the user required. SPSS is specific to the social sciences as it allows for comparative studies and statistical techniques at a large scale, although limited as it is unable to perform analyses for large data sets from the medical field for clinical data (Statistics Solutions, 2019). PUB550: Application and Interpretation of Public Health Data. Although both software are capable of aiding researchers in performing statistical analysis on data they have collected, one is significantly more useful in the type of analysis researchers in public health and other social sciences need. 
SPSS is designed for data analysis in the social sciences and is, overall, the more user-friendly and forgiving of the two, because errors that can occur in Excel worksheets, such as accidental overwrites or mis-sorted data, can be avoided (Mittermeier, 2019).
References
Mittermeier, E. (2019). Why you should move from Excel to SPSS. Retrieved from https://www.2×4.de/2019/06/11/whyyoushouldmovefromexceltospss/
Statistics Solutions. (2019). SPSS statistics help. Retrieved from https://www.statisticssolutions.com/spssstatisticshelp/
Topic 6: Regression
Objectives:
 Apply the steps of a regression analysis to determine the linear regression equation and its appropriateness based on the data.
 Interpret regression output to predict changes in a dependent variable based on changes in one or more predictor variables.
Application of the Pearson Correlation Coefficient and the Chi-Square Test
Topic 7 DQ 1
Correlation between two variables shows only that there is an association; it does not guarantee that one causes the other (Corty, 2016). If two variables vary together systematically, a cause-and-effect relationship may exist, but it does not have to; a third variable may be responsible for the association (Corty, 2016). When an association is demonstrated, further research is needed to assess the strength of the relationship, and this is where the Pearson test comes in. The Pearson test calculates a correlation coefficient that summarizes the strength and direction of the linear relationship between two variables. Even a strong relationship does not by itself establish causation (Corty, 2016).
Data used to calculate a Pearson correlation coefficient must be interval and/or ratio. For ordinal variables, the Spearman rank-order test can be used, and for two nominal variables, the chi-square test can be used (Corty, 2016).
A study that would be appropriate for the Pearson correlation coefficient would be a study of the relationship between age and the need for prescription glasses, provided that need is measured on an interval or ratio scale (for example, lens prescription strength) rather than as a yes/no category.
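To make the calculation concrete, the following is a minimal Python sketch of the Pearson correlation coefficient. It is an illustration only: the function name pearson_r and the age and lens-strength values are hypothetical and do not come from Corty (2016) or from any study, and the sketch assumes the need for glasses is quantified as lens strength in diopters so that both variables are ratio-level.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical illustration data (assumption, not real measurements):
# age in years and lens prescription strength in diopters.
age = [25, 32, 41, 47, 55, 63]
diopters = [0.25, 0.50, 0.75, 1.50, 1.75, 2.50]

r = pearson_r(age, diopters)
print(round(r, 3))  # closer to +1 or -1 means a stronger linear relationship
```

The coefficient always falls between -1 and +1. A value near either extreme describes how tightly the points follow a straight line, not whether one variable causes the other.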
References
Corty, E. W. (2016). Using and interpreting statistics: A practical text for the behavioral, social, and health sciences. New York, NY: Worth Publishers.
Topic 7 DQ 2 
Describe the conditions in which a nonparametric test would be a better selection than a parametric test. Illustrate your ideas with a specific example of when you would use each type of test using similar variables for each example.
Topic 7 DQ 2
Parametric tests should only be used when assumptions about the parameters are met (Corty, 2016). Nonparametric tests do not have to meet these same assumptions. There are two circumstances in which a nonparametric test should be used:
 The outcome variable is ordinal or nominal (Corty, 2016).
 If a nonrobust assumption of a parametric test is violated during an experiment, the researcher can fall back on a nonparametric test (Corty, 2016).
Nonparametric tests are less restricted by assumptions and relatively simple to conduct, which makes them desirable. However, they are often less powerful than parametric tests, meaning they are less likely to reject a false null hypothesis (Corty, 2016). The reason is that they use only nominal or ordinal information rather than interval/ratio data. Nominal and ordinal numbers contain less information, which gives nonparametric tests less power (Corty, 2016). Generally, researchers prefer parametric tests, but when the assumptions are not met, nonparametric tests are used.
When the outcome is an ordinal variable or a rank, it is appropriate to use a nonparametric test. For example, a clinical trial is performed in which study participants, each assigned to a specific treatment, are asked to rate the severity of their illness symptoms over six weeks. Symptom severity is measured on a 5-point ordinal scale with the following response options:
 Symptoms got much worse
 Symptoms are slightly worse
 No change
 Slightly improved
 Much improved
(Sullivan, 2016).
Outcomes that are ordinal, ranked, subject to outliers, or measured imprecisely are difficult to analyze with parametric tests without making major assumptions (Sullivan, 2016). A nonparametric test is therefore the most appropriate choice for the example above.
Parametric tests are appropriate when the outcome variable is continuous and measured on an interval or ratio scale. For example, an experiment analyzing the weight and height of firefighters could use a parametric test because both variables are continuous ratio measurements rather than ordinal ratings.
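As an illustration of how a rank-based test treats ordinal ratings like those in the symptom example above, here is a minimal Python sketch of the Mann-Whitney U statistic for two independent groups. The choice of this particular test, the function names, and the ratings are assumptions made for illustration (1 = much worse through 5 = much improved); they are hypothetical and not data from Sullivan (2016).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which group A's value exceeds group B's,
    with tied pairs counted as one half.
    """
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical ratings on the 5-point scale (assumption, not real trial data)
treatment = [4, 5, 3, 4, 5, 4]
placebo = [2, 3, 3, 1, 4, 2]

u_a, u_b = mann_whitney_u(treatment, placebo)
print(min(u_a, u_b))  # the smaller U is compared with a critical value
```

Only the ordering of the ratings enters the calculation, which is why the test suits ordinal outcomes. By contrast, a parametric test such as the independent-samples t test would use the actual weight or height values in the firefighter example.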
References
Corty, E. W. (2016). Using and interpreting statistics: A practical text for the behavioral, social, and health sciences (3rd ed.). New York, NY: Worth Publishers.
Sullivan, L. (2016). Nonparametric tests. Retrieved from http://sphweb.bumc.bu.edu/otlt/MPHModules/BS/BS704_Nonparametric/BS704_Nonparametric_print.html
Topic 8: Analyzing and Reporting Results
 Apply hypothesis testing steps to a data set.
 Communicate scientific information for public health practice.
 Select quantitative and qualitative data collection methods appropriate for a given public health context.
 Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming, and software, as appropriate.
 Interpret results of data analysis for public health research, policy, or practice.
Benchmark – Analyzing and Reporting Data 
Analyzing and Reporting Data – Overview
The purpose of this assignment is to give you experience conducting a basic secondary data analysis using real-world surveillance data. Secondary data analysis is faster and cheaper to conduct than primary data collection. However, there are also significant limitations. The data were likely collected for a different purpose and may not include the specific variables required to answer your question. The sampling strategy might not be random and may not be representative of your target population. These are examples of the limitations you should be aware of as you work with existing data.
A key question is whether the data should determine the research question, or if the research question should determine the type of data you use. In practice, you would want your research question or hypothesis to determine the dataset you select. In this assignment, you are limited to three datasets and may need to adjust your initial research question to accommodate one of the three datasets. Avoid “mining” for significant results and stick to your initial research question as much as possible. For this project, you will select one of the three example datasets to complete a basic analysis and communicate your findings through a scientific poster presentation.
These steps will help you get started:
 Review the websites for each of the three datasets listed below. Be sure to understand the purpose of the survey, the sample used in the survey, and the main focus areas of each survey. Review the documentation provided on the websites to get to know the story behind the data and understand the population before reviewing the data.
 Select the dataset that is most appropriate for your interest area.
 Open the data in SPSS and get to know the data by reviewing the variables in “Variable View” mode. This view will allow you to read the variable labels and response labels for each variable.
 Based on your research interest and question, select variables that will help increase your understanding about that topic.
 Arrange the data as needed to organize and clean data, allowing you to focus on your specific question. Remember to save your analytic data file as a new file in case you need to go back to the original file. It is good practice to continually save new versions of the data file as you work with and manipulate the data.
 Follow the hypothesis testing steps to carry out your secondary data analysis.
For additional information on conducting a secondary data analysis, read the Topic Material, “Conducting High-Value Secondary Dataset Analysis: An Introductory Guide and Resources.”
Dataset Documents
 Demographic and Health Survey
The Demographic and Health Survey is a global monitoring survey administered by USAID. The sample dataset is the model dataset put together by USAID to explore DHS data. The sample data is not from a specific country or year, but it gives you an idea of what can be obtained from various countries through these datasets. The datasets are free and publicly available once you register with USAID to access the DHS data. For the purpose of this assignment, treat this dataset as coming from a country of your choice. Access to the Model Questionnaire, Recode Manual, and Data Video Tutorials, including a video on the sampling strategy, is found at http://dhsprogram.com/data/modeldatasets.cfm.
Note: You do not need to worry about weighting strategies for this assignment.
Use the http://dhsprogram.com/data/UsingDataSetsforAnalysis.cfm link to review the “Step-by-Step Introduction to Analyzing DHS Data” for tips on how to access your own dataset for future use and to see what resources are available to help you navigate the model dataset for this assignment.
 Youth Risk Behavior Surveillance System (YRBSS)
The Youth Risk Behavior Survey is a national survey monitoring health behaviors among youth and young adults. It is administered by the Centers for Disease Control and Prevention. The example dataset for this assignment comes from the National Survey (not combined) dataset for 2015. General information about the survey is found at https://www.cdc.gov/healthyYouth/data/yrbs/index.htm.
Documentation and questionnaires can be found by accessing the “YRBSS Data and Documentation” website at https://www.cdc.gov/healthyyouth/data/yrbs/data.htm.
Please read the 2015 YRBS Data User’s Guide, listed in the “National YRBS Datasets and Documentation” page at https://www.cdc.gov/healthyyouth/data/yrbs/pdf/2015/2015_yrbsdatausers_guide_smy_combined.pdf.
The dataset includes calculated variables not found in the questionnaire that you might find helpful in determining your analysis for this assignment. The crosswalk to match the questions with the dataset can be found by viewing the “YRBS Questionnaire Content – 1991-2017” found at https://www.cdc.gov/healthyyouth/data/yrbs/pdf/2017/yrbs_questionnaire_content_19912017.pdf.
 National Health Interview Survey (NHIS)
The NHIS began in 1957 and has been used to monitor the health of the United States ever since. It is a household-level survey administered by the U.S. Census Bureau. Key topics in the survey include doctor’s visits, medical conditions, health insurance, and health behaviors. General information about the survey, including the sample design and data collection procedures, can be found at https://www.cdc.gov/nchs/nhis/about_nhis.htm.
A Survey Description of the 2015 National Health Interview Survey can be found at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/srvydesc.pdf.
The sample dataset is from the 2015 adult survey at https://www.cdc.gov/nchs/nhis/nhis_2015_data_release.htm.
Some of the variables have been deleted to decrease the size of the file, but none of the observations have been dropped. Please review the “2015 National Health Interview Survey (NHIS) Public Use Data Release” document at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/readme.pdf.
Review the “2015 NHIS Public Use Variable Summary” at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_summary.pdf.
After you identify a few variables you are interested in, review the complete description of the variable in the variable layout document at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_layout.pdf.
Checking the variable frequencies will help you determine the range of answers for each variable of interest, including the number of missing observations. If the number missing is high, consider using another variable. Variable frequencies can be found at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2015/samadult_freq.pdf.
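The frequency check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the course's required workflow, which uses SPSS. The function name frequency_table, the response codes (1 = current smoker, 2 = former smoker, 3 = never smoked, 9 = unknown), and the values are hypothetical and are not taken from the NHIS documentation.

```python
from collections import Counter


def frequency_table(values, missing_codes=(None, "")):
    """Count each valid response and report how many observations are missing."""
    counts = Counter()
    n_missing = 0
    for v in values:
        if v in missing_codes:
            n_missing += 1  # treat this code as a missing observation
        else:
            counts[v] += 1
    return counts, n_missing


# Hypothetical coded responses for one variable (assumption, not NHIS data)
responses = [1, 3, 3, 2, 9, 3, None, 1, 3, 9]

counts, n_missing = frequency_table(responses, missing_codes=(None, 9))
pct_missing = 100 * n_missing / len(responses)

for code in sorted(counts):
    print(code, counts[code])
print("missing:", n_missing, f"({pct_missing:.0f}%)")
```

If the percentage missing is high for a variable of interest, that is the signal to consider a different variable before moving on to hypothesis testing.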
Topic 8 DQ 1 
Reporting public health information requires a clear understanding of the various statistical methods used to draw conclusions. These methods are then communicated within the larger story surrounding the public health issue. Identify a public health report or article and discuss what you would do differently to improve understanding and application if you were the author. Post the permalink to your article or report in the Main Forum.
Topic 8 DQ 1
The world today does not have a shortage of public health issues. Researchers are constantly gathering more data to develop prevention and protection programs. To accomplish this properly, it is necessary to have a clear understanding of statistical methods in order to draw accurate conclusions. It is possible to find statistics on nearly all public health issues, which is why it is critical for scientists and professionals to have a grasp of the statistical methods available.
One public health issue that is relevant today is obesity. The general population is aware that poor nutrition, lack of physical activity, and obesity cause a number of health-related issues; however, obesity is getting worse all over the world. An article that addresses the issue, published by the World Health Organization, is ‘Obesity and Overweight’. The link to the article is https://www.who.int/newsroom/factsheets/detail/obesityandoverweight.
Key facts from the article:
 Obesity has tripled worldwide since 1975
 In 2016, more than 1.9 billion adults aged 18 and older were overweight; of these, over 650 million were obese
 40 million children under the age of 5 were overweight or obese
 In 2016, over 340 million children and adolescents aged 5-19 were obese or overweight
(WHO, 2020).
The article references multiple statistics regarding population obesity. The article also defines obesity and overweight in order to understand what determines who falls into the obese category and who falls into the overweight category. WHO also shared recent global estimates:
 In 2016, more than 1.9 billion adults aged 18 years and older were overweight. Of these over 650 million adults were obese.
 In 2016, 39% of adults aged 18 years and over (39% of men and 40% of women) were overweight.
 Overall, about 13% of the world’s adult population (11% of men and 15% of women) were obese in 2016.
 The worldwide prevalence of obesity nearly tripled between 1975 and 2016
(WHO, 2020).
While the information is interesting and useful, it would be helpful for the reader if WHO discussed the methods used to develop the statistical data. WHO also shared the causes of obesity and prevention tactics. It would be helpful if WHO developed predictions for the next five years comparing what would happen if people followed obesity prevention guidelines versus if they did not, while also describing the methods used to create those estimates. This would help the reader better understand the importance of the public health issue. The sample size and surveillance methods used to obtain the obesity data should also be shared.
Reference
World Health Organization (WHO). (2020). Obesity and overweight. Retrieved from https://www.who.int/newsroom/factsheets/detail/obesityandoverweight