This chapter covered Monte Carlo, parametric, and Bayesian approaches to statistical analysis. Having only really worked with parametric statistics, it was interesting to read about Monte Carlo and Bayesian analysis, and I'm looking forward to hearing Nicole lecture on the latter. I'd never even heard of Bayesian statistics until last semester, when a student gave a presentation on them. As she studied systematics she was very enthusiastic about this approach; our professor, however, was a lot more sceptical about its usefulness and shared the reservations that many ecologists have, mentioned in one of the footnotes: namely, that specifying a prior reflects subjectivity and is considered unscientific. Bayesians, however, argue that specifying a prior makes explicit all the hidden assumptions of an investigation, and so is a more honest and objective approach to doing science. This makes sense to me, and I can't really see how it differs that much from a meta-analysis, where you conduct your analyses on the data from many other studies. Couldn't you use that information to construct priors? If your priors come from published, peer-reviewed articles and you use these as your starting point, that seems a lot more appropriate than starting with a null that states there are no differences, and I can see why Bayesians think we would make more progress using this approach. It is interesting that this argument has lasted for centuries; perhaps I am not fully understanding the frequentist side of it. I was also surprised to read G&E's impression of non-parametric statistics, which reflected the views of my last stats teacher: avoid them at all costs.
Now that I know about Monte Carlo, I wonder why people use non-parametric analyses at all, especially as G&E state that by ranking data you may lose a lot of information in your dataset, and perhaps some of the subtleties that could be biologically meaningful to your system. Perhaps this is simply due to the difficulty of finding appropriate software?
Summary:)
Monte Carlo analysis:
1. Makes minimal assumptions about the underlying distribution of the data
2. Uses randomizations of the observed data as a basis for inference.
Assumptions:
1. The data collected represent random independent samples (common to all statistical tests)
2. The test statistic describes the pattern of interest.
3. The randomization creates an appropriate null distribution for the question.
Advantages:
1. Does not require that data are sampled from a specified probability distribution
2. You can tailor your statistical test to particular questions and datasets, so you are not forced to use conventional tests that may not be the most powerful for your analysis.
Disadvantages:
1. It is computer intensive and is not included in most traditional statistical packages.
2. Different randomization runs on the same dataset can yield slightly different results, which does not occur with parametric analyses.
3. Whereas a parametric analysis assumes a specified distribution and allows inferences about the underlying parent population from which the data were sampled, with Monte Carlo the inferences are limited to the specific data that have been collected. If the sample is representative of the parent population, then the results can be generalized, but with caution.
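To make the randomization idea concrete, here is a minimal sketch of a two-sample permutation test, one of the simplest Monte Carlo procedures. The numbers are invented for illustration (not from the chapter): the group labels are repeatedly shuffled, and the observed difference in means is compared against the differences produced under shuffling.

```python
import random

# Invented example data: measurements for two hypothetical groups
group_a = [4.1, 5.3, 4.8, 5.9, 5.1]
group_b = [6.2, 5.8, 6.9, 6.4, 5.7]

def mean(xs):
    return sum(xs) / len(xs)

# Test statistic: absolute difference in group means
observed = abs(mean(group_a) - mean(group_b))

pooled = group_a + group_b
n_a = len(group_a)
random.seed(42)  # fixed seed so the run is repeatable

# Shuffle the pooled data many times; each shuffle is one draw from the
# null distribution in which group labels carry no information.
count = 0
n_iter = 10_000
for _ in range(n_iter):
    random.shuffle(pooled)
    diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    if diff >= observed:
        count += 1

# Add-one correction keeps the estimated p-value strictly above zero
p_value = (count + 1) / (n_iter + 1)
print(p_value)
```

Note that rerunning with a different seed gives a slightly different p-value, which is exactly disadvantage 2 above.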
Parametric analysis:
1. Assumes that the data were sampled from a distribution of known form
2. Estimates the parameters of the distribution from the data
3. Estimates probabilities from observed frequencies of events
4. Uses these probabilities as a basis for inference (frequentist inference).
Advantages:
Uses a powerful framework based on known probability distributions.
Greater power in making general inferences to the parent population.
Disadvantages:
May not be as powerful as sophisticated Monte Carlo models that are tailored to particular questions or data. In contrast to Bayesian analysis, parametric analysis rarely incorporates a priori information or results from other experiments. Also, parametric analyses are often robust to violations of the distributional assumption thanks to the Central Limit Theorem??? But why?? I'm not really sure I understand this, although it's probably obvious!
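One way to see what the Central Limit Theorem is doing here is a small simulation (my own toy example, not from the chapter): even when individual observations come from a strongly skewed distribution, the means of repeated samples are approximately normally distributed around the true mean, which is why tests on means often behave well despite non-normal data.

```python
import random
import statistics

random.seed(1)

# Draw many samples from a skewed distribution (exponential, true mean = 1.0)
# and record each sample's mean.
true_mean = 1.0
sample_size = 50
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(sample_size))
    for _ in range(5_000)
]

# The sample means cluster tightly and symmetrically around the true mean,
# even though single exponential draws are highly skewed.
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```

The spread of the sample means shrinks with the square root of the sample size, so larger samples make the normal approximation (and hence the parametric test) more reliable.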
Bayesian analysis:
1. Assumes that the data were sampled from a distribution of known form
2. Estimates parameters not only from the data but also from prior knowledge
3. Assigns probabilities to these parameters
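These three steps can be sketched with the simplest conjugate example, a beta-binomial update. The prior and data values are invented for illustration: a Beta(a, b) prior on, say, a survival probability is combined with observed successes and failures, and the posterior mean visibly blends the prior belief with the frequentist estimate from the data alone.

```python
# Hypothetical prior belief: survival probability around 0.5,
# expressed as a Beta(2, 2) distribution.
prior_a, prior_b = 2.0, 2.0

# Hypothetical new data: 18 survivors out of 25 individuals.
successes, failures = 18, 7

# Conjugate update: posterior is Beta(a + successes, b + failures).
post_a = prior_a + successes
post_b = prior_b + failures
posterior_mean = post_a / (post_a + post_b)

# Frequentist estimate from the data alone, for comparison.
data_only = successes / (successes + failures)

print(round(posterior_mean, 3))  # pulled slightly toward the prior's 0.5
print(round(data_only, 3))
```

With more data the posterior mean converges toward the data-only estimate, so the influence of the prior fades as evidence accumulates.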
Monday, March 5, 2007
3 comments:
Good summary of the various statistical techniques!
I'm with you - I don't totally understand the controversy surrounding the use of Bayesian stats (though I'm working on trying to figure it out!). You already take into consideration prior information when choosing and designing your hypotheses and experiment. Who would ever begin a study without delving into the literature first?? Why not incorporate that information explicitly rather than implicitly? And regarding the potential for subjectivity - well, scientists can (and have been known to) influence their experiments or interpret their data (whether consciously or subconsciously) in such a manner as to produce the results they desire even with frequentist statistics. There has to be a presumption of honesty and objectivity for science to work.
Cheers,
Nicole
Just out of curiosity, what kind of statistics do you plan to use in your independent project?? Thanks for the thoughtful summary. I have definitely noticed that different scientists have their biases and preferences!
Hi Busy,
I am looking at the effects of several different diets on the development of caterpillars. However, because the caterpillars are not all the same weight at the beginning of the experiment, due to natural variation, this could affect their final weight when they reach pupation. To correct for this, I will use ANCOVA to determine whether the differences in pupal weights between the treatments are significant once the caterpillars' differing starting weights are taken into account.
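The adjustment idea behind that ANCOVA can be sketched with invented numbers (this is only the intuition, not a full ANCOVA, which would fit diet and covariate together and test the diet term with an F-test): regress pupal weight on initial weight, then compare the residuals between the two diets, so the diet comparison is made after removing the effect of starting weight.

```python
import random
import statistics

random.seed(3)

# Simulated caterpillars: 10 per diet, with naturally varying initial
# weights and a built-in diet effect of +0.4 on pupal weight.
n_per_diet = 10
initial = [random.uniform(0.8, 1.2) for _ in range(2 * n_per_diet)]
diet = [0] * n_per_diet + [1] * n_per_diet
pupal = [2.0 + 1.5 * x + 0.4 * d + random.gauss(0, 0.05)
         for x, d in zip(initial, diet)]

# Simple linear regression of pupal weight on initial weight (the covariate)
mx = statistics.mean(initial)
my = statistics.mean(pupal)
slope = (sum((x - mx) * (y - my) for x, y in zip(initial, pupal))
         / sum((x - mx) ** 2 for x in initial))

# Residuals: pupal weight with the initial-weight trend removed
resid = [y - (my + slope * (x - mx)) for x, y in zip(initial, pupal)]

# Diet comparison on the adjusted (residual) scale
adjusted_diff = (statistics.mean(resid[n_per_diet:])
                 - statistics.mean(resid[:n_per_diet]))
print(round(adjusted_diff, 2))  # recovers something near the simulated 0.4
```
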