Data Analysis Paper

 

Mixing of specific Arabidopsis thaliana genotypes stabilizes yield in diseased and non-diseased sample populations

Taylor Schilling

BIO 342

27 November 2016

Dr. Zeynep

 

 

On my honor, I have not given, nor received, nor witnessed any unauthorized assistance on this work. -Taylor Schilling

Objective

Statistical data analysis was performed on the raw data from the study Impact of disease on diversity and productivity of plant populations (Creissen et al. 2016). The purpose of this analysis is to better understand the overall spread of the data from this experiment as well as the relationships between the variables used. The productivity of Arabidopsis thaliana can be shown by seed mass; to find out whether vegetative production can also be measured by rosette size, the linear relationship between rosette size and seed mass was assessed using linear regression. Further, ANOVA tests were used to find out whether population means of rosette size differ by genotype and by the number of genotypes per pot. This shows whether the specific genotype and the number of genotypes present actually affect the population means of rosette size and, more importantly, whether the study was successful in finding significant data regarding its purpose. Last, a chi-square test was done using the number of days to flower (separated into low, medium and high categories) and the type of genotypic mix. This shows whether the genotype mix affects the number of days A. thaliana takes to flower and further reinforces the ability of this study to produce results that are applicable to the overall population.

Lit Review

In Impact of disease on diversity and productivity of plant populations, Creissen et al. (2016) studied the effects that diversity in plant genotypes had on stabilizing plant productivity in Arabidopsis thaliana under attack by the pathogen Hyaloperonospora arabidopsidis (Hpa). They focused on plant competition and its effects on plant production when the pathogen is introduced, as well as the effect of biodiversity on the system's ability to buffer against the disease.

The research shows that pathogens promote plant biodiversity and prevent competitive exclusion – at least when a resistant genotype is present. Biodiversity is reduced when less competitive species are diseased. Additionally, species richness lessens the effect of disease and increases plant productivity. Four specific genotypes (Van-0, Ga-0, NFA-10 and NFA-8) of A. thaliana were chosen based on their fitness and planted in pots. There were four plants per pot – 20 pots of each of the 11 monocultures and mixtures in each pathogen treatment (220 pots total). The researchers measured the diseased leaf area after six and ten days, rosette leaf size, plant height and flowering time.

The results of this study show that, when diseased, the yield ultimately depended on the number and combination of certain genotypes. Hpa reduced seed production in all mixes with the most susceptible genotypes, NFA-8 and NFA-10. This is shown in the decrease in rosette diameter. There was an increased competitive ability in the resistant genotypes, Ga-0 and Van-0, in the 2-way and 4-way mixes. It is also important to note that Ga-0 is the most competitive with or without Hpa, while NFA-8 is highly competitive without Hpa, though less so in the presence of the pathogen. Without the pathogen, the pots with only highly competitive genotypes have the lowest yield while pots with less competitive genotypes have the highest yield. With the disease, the combination of the somewhat susceptible NFA-10 and the fully resistant Van-0 had the highest yield in monoculture and 2-way mix. Additionally, the study found that 2-way genotypic mixes had overall higher yields than monoculture and 4-way genotype mixed pots. In fact, 4-way mixes had the lowest yield without Hpa and the same yield as monoculture with Hpa. This shows that not only the combination of genotypes matters in plant yield, but the number of different genotypes in the mix matters as well.

This research supports the ability of resistant genotypes to maintain productivity, stability and diversity. Plants show a greater ability to resist change (known as ecological resistance) and to buffer negative effects during events such as the introduction of a pathogen, which maintains their well-being and ensures their survival. With a pathogen, a high yield of a resistant genotype results, especially when there is a mixture of resistant and susceptible genotypes. Disease helps to maintain genotypic diversity, which in turn enhances productivity because disease pressure leads to compensatory actions: in this case, the over-yielding of one genotype (for example, Van-0) compensating for the loss of another (NFA-10). These compensatory interactions are highest when the genotypes have different competitive abilities. They compensate depending on their specific response to disease, which ultimately leads to more production. Therefore, pathogens promote biodiversity by inhibiting competitive exclusion and supporting complementation. Mixtures, then, may reduce the effect of pathogens as well as the competitiveness between plants, as seen with the 2-way mix between Van-0 and NFA-10.

This research is applicable to those who work in agriculture because it helps them decide which plants and plant genotypes to plant to get the best yield and yield stability possible. It also highlights the importance of the choice of genotypes and the number of genotypes in a mix for buffering the effect of a pathogen and achieving the highest productivity possible.

Data Analysis

Descriptive Statistics: Days to Flower

Minimum    Quartile 1    Median    Quartile 3    Maximum
42         49            52        62            90

Mean    Standard Deviation    Interquartile Range    Variance
55.4    8.7                   13.0                   74.8

The number of days that plants took to flower ranges from 42 to 90 days. The middle 50% of plants flowered between 49 and 62 days, with a median of 52 days. The mean is 55.4 days, somewhat higher than the median. The interquartile range (quartile 3 minus quartile 1) is 13 days, which is the width of the middle half of the data; extending 13 days below quartile 1 and 13 days above quartile 3 gives 36 to 75 days. The standard deviation, which measures the dispersion of days around the mean, is 8.7 days.

Descriptive Statistics: Rosette Size

Minimum    Quartile 1    Median    Quartile 3    Maximum
25         55            67        82            138

Mean    Standard Deviation    Interquartile Range    Variance
69.4    18.1                  27.0                   328.8

The rosette size ranges from 25 to 138 mm. The middle 50% of rosette sizes is between 55 and 82 mm, with a median of 67 mm. The mean is 69.4 mm. The interquartile range (quartile 3 minus quartile 1) is 27 mm; because it is based on the quartiles, it is less sensitive to outliers than the full range. Extending 27 mm below quartile 1 and above quartile 3 gives 28-109 mm. The standard deviation, which measures the dispersion of rosette sizes around the mean, is 18.1 mm.

Descriptive Statistics: Seed Mass

Minimum    Quartile 1    Median    Quartile 3    Maximum
0.01       0.20          0.28      0.35          0.68

Mean    Standard Deviation    Interquartile Range    Variance
0.28    0.11                  0.15                   0.01

The seed mass ranges from 0.01 g to 0.68 g. The middle 50% of seed masses is between 0.20 and 0.35 g, with a median of approximately 0.28 g. The mean is also 0.28 g. The interquartile range (quartile 3 minus quartile 1) is 0.15 g; because it is based on the quartiles, it is less sensitive to outliers than the full range. Extending 0.15 g below quartile 1 and above quartile 3 gives 0.05-0.50 g. The standard deviation, which measures the dispersion of seed mass around the mean, is 0.11 g.
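The interquartile range for each variable can be recomputed directly from the reported quartiles as quartile 3 minus quartile 1. The sketch below does this and also computes outlier fences at 1.5 × IQR beyond the quartiles. The 1.5 multiplier is a common convention and an assumption here; the analysis does not state which outlier rule, if any, was used, and the names `quartiles` and `iqr_and_fences` are illustrative only.

```python
# Reported quartiles (Q1, Q3) copied from the descriptive statistics tables.
quartiles = {
    "days_to_flower": (49, 62),
    "rosette_size_mm": (55, 82),
    "seed_mass_g": (0.20, 0.35),
}


def iqr_and_fences(q1, q3, k=1.5):
    """Return the IQR and the (lower, upper) fences k * IQR beyond the quartiles.

    k=1.5 is the usual boxplot convention, assumed here for illustration.
    """
    iqr = q3 - q1
    return iqr, (q1 - k * iqr, q3 + k * iqr)


summary = {name: iqr_and_fences(q1, q3) for name, (q1, q3) in quartiles.items()}

for name, (iqr, (lower, upper)) in summary.items():
    print(f"{name}: IQR = {iqr:.2f}, fences = ({lower:.2f}, {upper:.2f})")
```

Values falling outside the fences would be flagged as potential outliers under this convention; checking that would require the raw data.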

Research question: Is there a linear relationship between the overall seed mass and rosette size in the plants?

Correlation

                     Seed Mass (g)
Rosette Size (mm)    r = 0.49041
                     P < 0.0001


Null hypothesis: There is no relationship between the overall seed mass and rosette size in these plants.

Alternative hypothesis: There is a relationship between the overall seed mass and rosette size in these plants.

The correlation between seed mass and rosette size is approximately 0.49. This is a moderate, positive correlation: as rosette size increases, seed mass tends to increase. Further, the P-value is less than 0.0001, which is below alpha (0.05), so the correlation is statistically significant and the null hypothesis is rejected. Thus, there is statistical evidence to support a relationship between seed mass and rosette size, and it is reasonable to fit a linear regression.

Regression

Seed Mass = 0.06989 + 0.00302*Rosette Size

P-value    R-square
<0.0001    0.2405

                Parameter Estimate    P-value
Intercept       0.06989               <0.0001
Rosette Size    0.00302               <0.0001


Null hypothesis: There is no linear relationship between the overall seed mass and rosette size in these plants.

Alternative hypothesis: There is a linear relationship between the overall seed mass and rosette size in these plants.

The linear regression P-value (<0.0001) is less than the alpha value (0.05), meaning that the null hypothesis is rejected and there is a statistically significant linear relationship between seed mass and rosette size. The regression line shows that with every millimeter increase in rosette size, the predicted seed mass increases by 0.00302 g. The intercept (0.06989 g) is the predicted seed mass at a rosette size of 0 mm, which lies outside the observed range (25-138 mm) and has no biological interpretation. The R-square value, however, is 0.2405, which means that only 24.05% of the variation in seed mass is explained by rosette size and 75.95% is unexplained. This regression line is therefore a weak predictive model of seed mass, even though the relationship is statistically significant, because the majority of the variation is unexplained.
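As a quick check on these numbers, the sketch below uses the reported coefficients to predict seed mass at a given rosette size and confirms that the reported R-square equals the square of the reported correlation, which holds for simple linear regression with one predictor. R-square is the share of variation in seed mass explained by the model, so one minus R-square is the unexplained share. The function name and the choice of the median rosette size (67 mm) as the example input are illustrative.

```python
INTERCEPT = 0.06989  # g, reported intercept
SLOPE = 0.00302      # g per mm of rosette size, reported slope
R = 0.49041          # reported correlation between seed mass and rosette size


def predict_seed_mass(rosette_mm):
    """Predicted seed mass (g) from the fitted line for a rosette size in mm."""
    return INTERCEPT + SLOPE * rosette_mm


# With a single predictor, R-square is the squared correlation coefficient.
r_squared = R ** 2
unexplained_share = 1 - r_squared

# Prediction at the median rosette size from the descriptive statistics.
predicted_at_median = predict_seed_mass(67)

print(f"R-square from r: {r_squared:.4f}")
print(f"Unexplained share of variation: {unexplained_share:.4f}")
print(f"Predicted seed mass at 67 mm: {predicted_at_median:.3f} g")
```

The prediction at the median rosette size comes out close to the median seed mass of 0.28 g, which is what a line fitted through the center of the data should give.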

Research question: Are the population means of rosette size significantly different for each genotype?

Null hypothesis: The population means of rosette size are the same for each genotype.

Alternative hypothesis: The population means of rosette size are different for each genotype.

Because the P-value (<0.0001) is less than alpha (0.05), the null hypothesis is rejected. There is statistically significant evidence that the population means of rosette size are not all equal across the four genotypes (Ga-0, NFA-10, NFA-8, Van-0). The ANOVA alone does not show which genotypes differ from one another.

Research question: Are the population means of rosette size significantly different for each number of genotypes per pot?

Null hypothesis: The population means of rosette size are the same for each number of genotypes per pot.

Alternative hypothesis: The population means of rosette size are different for each number of genotypes per pot.

The P-value (<0.0001) is less than the alpha value (0.05), meaning that the null hypothesis is rejected and that there is statistically significant evidence that the population means of rosette size are not all equal across the numbers of genotypes per pot (1 genotype/pot, 2 genotypes/pot and 4 genotypes/pot). The ANOVA alone does not show which of the three groups differ from one another.
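Both comparisons above are one-way ANOVAs, which compare the variation between group means with the variation within groups. Since the raw measurements and the F statistics are not reproduced in this paper, the sketch below uses small hypothetical rosette sizes purely to illustrate how the F statistic is built; the data values and the function name are invented for illustration and are not from Creissen et al. (2016).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on lists of values."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2 for group, mean in zip(groups, means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - mean) ** 2 for group, mean in zip(groups, means) for x in group
    )

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL rosette sizes (mm) by number of genotypes per pot.
hypothetical_rosette_mm = {
    "1 genotype/pot": [60, 65, 70],
    "2 genotypes/pot": [70, 75, 80],
    "4 genotypes/pot": [55, 60, 65],
}

f_stat, df_between, df_within = one_way_anova_f(list(hypothetical_rosette_mm.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F relative to the F distribution with these degrees of freedom gives a small P-value. A follow-up comparison of pairs of groups would be needed to say which groups differ.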

Research question: Do the plants take different numbers of days to flower between pots that are monocultures, 2-way genotypic mixes, and 4-way genotypic mixes?

Low 42-49 days to flower (lower 33%)
Medium 50-62 days to flower (middle 33%)
High 63-90 days to flower (upper 33%)

(Taken from 5-number summary of days to flower)

Observed and expected values (expected in parentheses)

          Mono         2-way mix    4-way mix    Total
Low       86 (91)      263 (274)    97 (82)      446
Medium    106 (113)    392 (343)    59 (102)     558
High      63 (51)      116 (155)    74 (46)      252
Total     255          771          230          1256


Null hypothesis: The number of days it takes to flower and genotypic mix are independent.

Alternative hypothesis: The number of days it takes to flower depends on the genotypic mix.

By comparing the observed and expected values of days to flower (categories: low, medium and high number of days to flower) and type of genotype mix (categories: mono, 2-way mix, 4-way mix), the P-value (5.66 × 10^-12) is found to be less than alpha (0.05). Therefore, the null hypothesis is rejected and there is statistically significant evidence that days to flower and genotype mix are associated.
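The chi-square test of independence can be sketched from the nine observed counts in the table. Each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. This is a minimal sketch, not the original computation: it recomputes the margins from the cell counts, so its expected values and P-value may differ slightly from the rounded figures reported above. The function names are illustrative, and the P-value helper uses the closed-form upper tail of the chi-square distribution that applies only to 4 degrees of freedom, as in this 3 × 3 table.

```python
import math

# Observed counts; rows are low, medium, high days to flower,
# columns are mono, 2-way mix, 4-way mix.
observed = [
    [86, 263, 97],
    [106, 392, 59],
    [63, 116, 74],
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


def chi2_upper_tail_df4(x):
    """Upper-tail probability of a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


stat, df, expected = chi_square_independence(observed)
p_value = chi2_upper_tail_df4(stat)  # valid here because df == 4

print(f"chi-square = {stat:.2f}, df = {df}, P = {p_value:.3g}")
```

The largest contributions to the statistic come from the 4-way mix column, where fewer pots than expected fall in the medium category and more than expected fall in the high category.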

Conclusion

To better understand the raw data gathered by Creissen et al. (2016) during their study, correlation, regression, ANOVA and chi-square tests were performed. The conclusions were that there is a linear relationship between seed mass and rosette size; population means of rosette size differ among genotypes; population means of rosette size differ among numbers of genotypes per pot; and the number of days plants take to flower depends on whether they are in a monoculture, 2-way mix or 4-way mix. The study was therefore successful at gathering significant results that can be related to overall populations.

 

Creissen, H. E., Jorgensen, T. H., and Brown, J. K. M. (2016). Impact of disease on diversity and productivity of plant populations. Functional Ecology 30, 649-657.
