  1. The dataset I found includes information about over 300 patients with diabetes, such as their age, gender, height, weight, glucose stability, cholesterol, and frame build. This information is available through the open datasets of Vanderbilt University’s Department of Biostatistics.

  2. This is a dataset from a past research project I participated in. Water samples were collected from Home Water Treatment Systems (essentially free-standing filters that use chlorine, a charcoal filter, and gravity to disinfect non-potable water) in rural areas of the Dominican Republic and tested for several chemical parameters as well as microbial presence, particularly that of E. coli. The continuous variable is the most probable number (MPN) of E. coli, and the categorical variable could be pH range or free chlorine range.

  3. My dataset is from a study of over 9000 patients whose BMIs were measured to determine their degree of obesity. Age, sex, and race were recorded as categorical variables. The continuous variable was BMI (body mass index).

  4. I obtained a public health dataset from the WHO (2012) that treats insufficient water supplies in low- and middle-income countries across the globe as a disease-like issue that leads to higher death rates and shorter life spans. The categorical variable is the name of the country, and the continuous variables are the total deaths (adults, children under the age of 5, and per 100,000 people) and disability-adjusted life years (DALYs; adults and children under the age of 5).

  5. I obtained my data set from the WHO. It looks at the immunization of students in both private and public schools. The type of school a student attends (private or public) is the categorical variable, whereas the percent immunized is continuous.

  6. I found this data set associated with a journal article addressing the question of whether “age-related social comparisons impair older people’s hand grip strength and persistence.”

  7. I found this data set, which was used for a journal article evaluating inflammatory tissue in the lungs and the presence of Lactobacillus. I obtained it through Dryad, and the article is published online at BMJ.

  8. This data set was obtained from a study conducted by the University of Virginia School of Medicine. The study collected data for 14 variables on 403 subjects, who were interviewed to understand the prevalence of obesity, diabetes, and other cardiovascular risk factors among African Americans in central Virginia.

  9. This data was obtained from WHO’s Global Health Observatory Data Repository and shows the prevalence of HIV among adults aged 15 to 49, with the estimates categorized by country.

  10. The dataset I have chosen contains health statistics for 205 countries. The categorical variable is country. The numeric variables are use of improved drinking water sources, use of improved sanitation facilities, immunization coverage, and pneumonia, diarrhea, and malaria data.

