The report is attached below.
“The Effect of Sitagliptin on Carotid Artery Atherosclerosis in Type 2 Diabetes”
A way to better understand biostatistics through the running of statistics on real life experiments.
Here’s my paper about how ocean acidification and ocean temperature increases are affecting a species of sea urchins:
Biostatistics research project
Statistical analysis of physical factors of patients who underwent a pulmonary bronchoscopy
Physical characteristics were recorded for 304 healthcare workers who perform pulmonary bronchoscopies and are suspected to have contracted pulmonary tuberculosis (TB) through improper precautions during the procedures. Three questions are examined. The first is whether there is any correlation or linear relationship between Body Mass Index (BMI) and the age of the workers. The second is whether TB and Smoking are independent of each other. The third is whether the population mean BMI differs among the three levels of Smoking History.
Healthcare workers who perform, or are present during, a pulmonary bronchoscopy are advised to take precautions. Pulmonary tuberculosis (PTB) is a highly contagious disease, and the procedure can leave contagious particulates airborne. Face masks and equipment that can filter out these particulates should be worn during the procedure, and this precaution is especially important when the patient has pulmonary tuberculosis. However, if neither the patient nor the doctor knows that the patient has TB, the patient may be unexpectedly diagnosed later, which means the healthcare workers who performed the procedure may have been exposed to the disease (Na et al., 2016). The data used in this study come from a retrospective study of 1,954 healthcare workers for whom CT and bronchoscopy information was available from the Pusan National University Hospital in Busan, South Korea (Na et al., 2016). South Korea has a particularly high incidence rate of PTB, so determining the risk of exposure is particularly important. Of the people in that study, 304 are thought to have been exposed to PTB through improper precautions. The paper states that there were no significant differences in either age or body mass index in the study population (Na et al., 2016). Smoking history was recorded as never smoked (0), past smoker (1), or current smoker (2). A later PTB diagnosis was determined from hospital records and recorded as either diagnosed (1) or undiagnosed (0) (Na et al., 2016).
Descriptive Statistics for Numeric Variables
|Variable||N||N Miss||Minimum||Mean||Median||Maximum||Std Dev|
For both continuous variables, age and BMI, the mean is close to the median, which suggests that each distribution is approximately symmetric, as a normal bell curve would be. The BMI data has a range of 29.3 and the Age data has a range of 69.
Correlation and Linear Regression model of BMI and Age
To determine whether Age is correlated with, and linearly related to, BMI, a correlation analysis and a linear regression model were used. These methods were chosen because both BMI and Age are continuous variables. As shown in the correlation table below, the Pearson correlation coefficient, r, is only -0.004, which means that age and BMI are very weakly correlated. A strong positive correlation would be indicated by a coefficient between 0.7 and 1, and a strong negative correlation by a coefficient between -0.7 and -1. The coefficient of -0.004 lies in neither interval and is close to 0, which indicates essentially no linear association between BMI and Age.
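As a rough illustration of how the Pearson coefficient is calculated, here is a minimal Python sketch. The report's analysis was run in SAS; the function name `pearson_r` and the small age and BMI lists are hypothetical and are not the study data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages and BMIs for illustration only (NOT the study data)
ages = [25, 34, 41, 52, 60, 73]
bmis = [22.1, 20.4, 23.0, 21.2, 22.6, 20.9]

r = pearson_r(ages, bmis)
print(f"r = {r:.3f}")
```

A value near 0, like the -0.004 reported for the study, means the points show almost no linear trend; values near 1 or -1 mean the points fall close to a straight line.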
The linear regression analysis supports this. The reported r-squared value is -0.003. Because an ordinary r-squared cannot be negative, this is presumably the adjusted r-squared, and the unadjusted value implied by r = -0.004 is r² ≈ 0.00002. Either way, the model explains essentially none of the variance in BMI. The slope of the linear regression model is -0.0007, which suggests that BMI decreases by 0.0007 for each additional year of age, starting from the y-intercept of 21.89 at age 0. However, because the correlation is so weak, this model does not explain the variance in the data well.
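A quick arithmetic check of the r-squared value, sketched in Python. It assumes the rounded coefficient r = -0.004 and N = 304 reported here, a single predictor (Age), and the standard adjusted r-squared formula; the function name is hypothetical.

```python
def adjusted_r_squared(r, n, predictors=1):
    """Adjusted R^2 for a linear regression, from the Pearson r and sample size."""
    r2 = r ** 2
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)

r = -0.004   # Pearson coefficient as reported (rounded)
n = 304      # sample size as reported

r2 = r ** 2                       # ordinary R^2, never negative
adj = adjusted_r_squared(r, n)    # can fall below 0 when r is near 0

print(f"R^2          = {r2:.6f}")
print(f"adjusted R^2 = {adj:.4f}")
```

Under these assumptions the ordinary r-squared is about 0.00002 and the adjusted value is about -0.003, so a reported value of -0.003 is consistent with an adjusted r-squared.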
|Pearson Correlation Coefficients, N = 304|
|t Value||Pr > |t||
Chi-Square Test: TB Versus Smoking
H0: Tuberculosis diagnosis and history of smoking are independent.
HA: Tuberculosis diagnosis and history of smoking are not independent.
To test whether the two categorical variables, TB and Smoking, are independent of each other, a chi-square test was conducted. SAS calculated the test statistic χ2 = 0.990 and the P-value, P(χ2 > 0.990) = 0.6068. At the 0.05 significance level, one should not reject the null hypothesis (as 0.6068 > 0.05). In conclusion, the chi-square test provides no evidence that tuberculosis diagnosis and smoking history are associated, so the data are consistent with the two being independent.
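The mechanics of a chi-square test of independence can be sketched in Python as follows. This is only an illustration: the counts in `table` are hypothetical (the study's cell counts are not reproduced here), the function names are made up, and the P-value shortcut applies only when there are exactly 2 degrees of freedom, as with a table of three smoking levels by two TB outcomes.

```python
import math

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df2(stat):
    """Upper-tail chi-square probability; valid only for exactly 2 df."""
    return math.exp(-stat / 2)

# Hypothetical counts (NOT the study data)
# rows: never, past, current smoker; columns: undiagnosed, diagnosed
table = [[150, 30],
         [50, 12],
         [48, 14]]

stat, df = chi_square_independence(table)
p = p_value_df2(stat)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

A P-value above 0.05 means the observed counts are close enough to the expected counts that the null hypothesis of independence is not rejected.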
Statistics for Table of Smoking by TB
ANOVA Test: Smoking Versus BMI
H0: The mean BMI is equal across all levels of smoking history (never, past, and current).
HA: At least two of the mean BMIs for the levels of smoking history (never, past, and current) are not equal.
To test whether the population mean BMI differs among the three levels of smoking history, an ANOVA test was used. SAS calculated a test statistic of F = 0.60 and a P-value of P(F > 0.60) = 0.5492. At the 0.05 significance level, one would not reject the null hypothesis (because 0.5492 > 0.05). Thus, one cannot reject the hypothesis that the mean BMI is equal across all levels of smoking history (never, past, and current). The data show no statistically significant difference among the population means.
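For illustration, a one-way ANOVA F statistic can be computed by hand as in the Python sketch below. The three small groups are hypothetical, not the study data, and the function names are made up. The P-value shortcut is exact only when the numerator has 2 degrees of freedom, which is the case with three groups. The last lines apply it to the reported F = 0.60, assuming all 304 workers have a BMI value so that the within-group degrees of freedom are 304 - 3 = 301.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

def p_value_three_groups(f, df_within):
    """Upper tail of F(2, df_within); valid only when numerator df is 2."""
    return (1 + 2 * f / df_within) ** (-df_within / 2)

# Hypothetical BMI-like values for three groups (NOT the study data)
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f:.2f}, df = ({df_b}, {df_w}), "
      f"p = {p_value_three_groups(f, df_w):.3f}")

# Reported F = 0.60 with an assumed 301 within-group degrees of freedom
p_reported = p_value_three_groups(0.60, 301)
print(f"p for F = 0.60 with df (2, 301): {p_reported:.3f}")
```

Under the stated assumption about degrees of freedom, the second calculation gives a P-value of about 0.549, in line with the value of 0.5492 used in the decision above.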
|Source||DF||Sum of Squares||Mean Square||F Value||Pr > F|
By testing for correlation, linear relationships, independence, and difference in means, one can begin to make inferences about this data set. From Pearson's correlation coefficient, it was concluded that Age and BMI are very weakly correlated. The chi-square hypothesis test found no evidence that Smoking History and TB are associated, so in this sample smoking history shows no significant relationship with a TB diagnosis. The ANOVA found no statistically significant difference in mean BMI among the levels of smoking history (never, past, and current). Thus, one can better understand how the three variables of Age, BMI, and Smoking History interact with TB and with each other.
This website is a pretty good comprehensive resource.
The following is a series of videos about statistics in Excel
How to get a bunch of descriptive statistics:
More details on how to do Chi-square: http://www.real-statistics.com/chi-square-and-f-distributions/independence-testing/
Gapminder has interactive graphs that add another dynamic to relaying information by allowing the viewer to manipulate certain variables.