class: left, bottom, inverted, title-slide

.title[
# The Bayes factor
]
.subtitle[
## Symposium
‘Emerging Alternatives to Traditional Null Hypothesis Significance Testing’
]
.author[
### Jorge Tendeiro
Hiroshima University
24 May 2024
https://statsedge.org/APS2024
]

---
## My talk

Null hypothesis Bayesian testing (<span style='color: #cf7112;'>NHBT</span>) and its Bayes factor in a nutshell.

---
class: mysep, middle, center
background-image: url(Figures/SanFrancisco_SEP.png)
background-size: cover

# Null hypothesis significance testing (NHST)

---
## Null hypothesis significance testing (NHST)

The <span style='color: #cf7112;'>most used</span> testing paradigm in science.

--

<br>
NHST is _riddled_ with problems:

- It can't be used to draw support for `\(\mathcal{H}_0\)`.
- <span style='color: #cf7112;'>Non-significant</span> results are hardly interpretable:
  > We _fail to reject_ `\(\mathcal{H}_0...\)`
- <span style='color: #cf7112;'>Significant</span> results can be misleading:
  - Any difference can be deemed significant ('just' increase `\(N\)`).
  - `\(P(\text{data}|\mathcal{H}_1)\)` is totally ignored.

---
background-image: url(Figures/Bayes_background.png)
background-size: cover

# An alternative: Bayesian inference

Let's see how to test hypotheses in the Bayesian realm.

The framework is known as **Null hypothesis Bayesian testing (NHBT)**.

---
## Running example

From a paper on psychological resilience in _Frontiers in Psychiatry_.

<br>
<!-- https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2020.608588/full -->
<center>
<img src="Figures/resilienceJASP.png" alt="Resilience." width="90%" />
</center>

<br>
`$$\mathcal{H}_0: \mu_\text{Male}-\mu_\text{Female}=0\qquad\text{vs}\qquad\mathcal{H}_1: \mu_\text{Male}-\mu_\text{Female}\not=0$$`

---
## Null hypothesis Bayesian testing (NHBT)

In NHBT, we can test similar hypotheses:

`$$\mathcal{H}_0: \mu_\text{Male}-\mu_\text{Female}=0\qquad\text{vs}\qquad\mathcal{H}_1: \mu_\text{Male}-\mu_\text{Female}\not=0$$`

--

<br><br>
For simplicity, we test the <span style='color: #cf7112;'>standardized difference</span>:

`$$\mathcal{H}_0: \delta=0\qquad\text{vs}\qquad\mathcal{H}_1: \delta\not=0,$$`

where `\(\delta=\frac{\mu_\text{Male}-\mu_\text{Female}}{\sigma}\)`.
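
---
## Null hypothesis Bayesian testing (NHBT)

A minimal sketch of how the standardized difference `\(\delta\)` could be estimated from two samples. The scores are made up for illustration (they are **not** the resilience data), and estimating `\(\sigma\)` by the pooled standard deviation is an assumption made here, not something stated in the paper.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(x, y):
    """Sample estimate of delta = (mu_x - mu_y) / sigma, using the pooled SD."""
    nx, ny = len(x), len(y)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


# Hypothetical scores, for illustration only.
male = [3.1, 2.8, 3.5, 3.0, 3.3]
female = [3.0, 3.2, 2.9, 3.4, 3.1]

d = standardized_difference(male, female)
print(round(d, 3))
```

The result is unit-free: it expresses the mean difference in standard deviations, which is why one prior on `\(\delta\)` can be reused across measurement scales.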

---
## Null hypothesis Bayesian testing (NHBT)

`$$\boxed{\mathcal{H}_0: \delta=0\qquad\text{vs}\qquad\mathcal{H}_1: \delta\not=0}$$`

<br>
In NHBT, we need the so-called <span style='color: #cf7112;'>prior distributions</span> for all parameters, under each hypothesis.

<br>
Prior distributions encapsulate our relative belief about the possible values of `\(\delta\)`, before looking at the data.

---
## Null hypothesis Bayesian testing (NHBT)

.pull-left[
<img src="Figures/priorH0.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="Figures/priorH1.png" width="100%" style="display: block; margin: auto;" />
]

---
## Bayes factor in JASP

<center>
<img src="Figures/resilienceJASP.png" alt="JASP" width="80%" />
</center>

--

<br>
.pull-left-60[
<iframe width="640" height="277" src="https://player.vimeo.com/video/934019299?share=copy" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
]

--

.pull-right-30[
<br><br>
<span style='color: #cf7112;'>Conclusion:</span>
<br>
`\(\begin{aligned} & BF_{01} \\ &= \color{#CF7112}{\frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)}} \\ &=4.2 \\ \end{aligned}\)`
]

---
## Bayes factor: Interpretation 1

`$$\boxed{BF_{01}=\frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)} = 4.2}$$`

<br>

> _The observed data are 4.2 times more likely under `\(\mathcal{H}_0\)` than under `\(\mathcal{H}_1\)`._

--

<br>
This is a statement about the relative probability of the <span style='color: #cf7112;'>observed data</span>!
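
---
## Bayes factor: Interpretation 1

A rough sketch of the ratio `\(p(\text{data}|\mathcal{H}_0)/p(\text{data}|\mathcal{H}_1)\)` in a simplified setting. The assumptions are ours, not JASP's: the observed standardized difference is treated as normally distributed around `\(\delta\)` with a known standard error, and `\(\mathcal{H}_1\)` uses a normal prior on `\(\delta\)` so that both marginal likelihoods have a closed form. All inputs are hypothetical.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) evaluated at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def bf01_normal(d_obs, se, prior_sd):
    """BF01 for H0: delta = 0 vs H1: delta ~ Normal(0, prior_sd^2).

    Assumes d_obs ~ Normal(delta, se^2) with known se (a simplification).
    """
    # p(data | H0): delta is fixed at 0.
    m0 = normal_pdf(d_obs, 0.0, se)
    # p(data | H1): averaging over the prior widens the predictive distribution.
    m1 = normal_pdf(d_obs, 0.0, sqrt(se**2 + prior_sd**2))
    return m0 / m1


# Hypothetical inputs, for illustration only.
bf01 = bf01_normal(d_obs=0.05, se=0.10, prior_sd=0.5)
print(round(bf01, 2))
```

With these made-up inputs the ratio is above 1: the observed difference is better predicted by `\(\mathcal{H}_0\)` than by this particular `\(\mathcal{H}_1\)`. Changing `prior_sd` changes the answer.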

---
## Bayes factor: Interpretation 2

Alternatively, knowing that

`$$\underbrace{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_1)}}_{\color{#CF7112}{\text{prior}} \text{ odds}}\times \underbrace{BF_{01}\vphantom{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_0)}}}_{\color{#CF7112}{4.2}}=\underbrace{\frac{p(\mathcal{H}_0|\text{data})}{p(\mathcal{H}_1|\text{data})}}_{\color{#CF7112}{\text{posterior}} \text{ odds}}$$`

we can say this:

> _We update our relative initial belief by a factor of 4.2-to-1 in favor of `\(\mathcal{H}_0\)`._

--

<br>
Yes, in favor of `\(\mathcal{H}_0\)`!

<br>
This is quite different from the `\(p\)`-value, which cannot express support for `\(\mathcal{H}_0\)`.

---
class: mysep, middle, center

# Keep in mind

---
## [_The Bayes factor_ app](https://statsedge.org/shiny/LearnBF/) (https://statsedge.org/shiny/LearnBF/)

<iframe width="1920" height="470" src="https://statsedge.org/shiny/LearnBF/" frameborder="10" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---
## To keep in mind

1. Bayes factors are **not** posterior odds!
2. <span style='color: #cf7112;'>Priors</span> matter!
3. Bayes factors provide <span style='color: #cf7112;'>relative</span> assessments!
4. Bayes factors **cannot** establish the <span style='color: #cf7112;'>presence</span> or <span style='color: #cf7112;'>absence</span> of an effect!
5. Bayes factors are **not** effect sizes!
6. <span style='color: #cf7112;'>Inconclusive</span> evidence is **not** evidence of <span style='color: #cf7112;'>absence</span>!
7. Using <span style='color: #cf7112;'>description labels</span> is **not** problem-free!

---
## Bayes factors are **not** posterior odds!

`$$\boxed{\underbrace{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_1)}}_{\color{#CF7112}{\text{prior}} \text{ odds}}\times \underbrace{BF_{01}\vphantom{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_0)}}}_{\color{#CF7112}{4.2}}=\underbrace{\frac{p(\mathcal{H}_0|\text{data})}{p(\mathcal{H}_1|\text{data})}}_{\color{#CF7112}{\text{posterior}} \text{ odds}}}$$`

--

<br>
Do **not** say:

> _`\(\mathcal{H}_0\)` is 4.2 times more likely than `\(\mathcal{H}_1\)`, after observing the data._

--

<br>
This would be a 'posterior odds'-like interpretation, which holds only when the prior odds equal 1.

---
## <span style='color: #cf7112;'>Priors</span> matter!

`$$\boxed{\mathcal{H}_0: \delta=0\qquad\text{vs} \qquad \mathcal{H}_1:\left\{\begin{array}{l} \delta\not=0 \\ \color{#cf7112}{\text{under prior }p(\delta)} \end{array}\right.}$$`

--

<br>
<center>
<img src="Figures/priors.png" alt="Priors." width="100%" />
</center>

--

Always report the priors used.

---
## Bayes factors provide <span style='color: #cf7112;'>relative</span> assessments!

`$$\boxed{BF_{01}=\frac{p(\text{data}|\color{#cf7112}{\mathcal{H}_0})}{p(\text{data}|\color{#cf7112}{\mathcal{H}_1})} = 4.2}$$`

<br>
Do **not** say:

> _There is evidence in favor of `\(\mathcal{H}_0\)`._

--

<br>
Instead, say:

> _There is evidence in favor of `\(\mathcal{H}_0\)` <span style='color: #cf7112;'>relative to (this) `\(\mathcal{H}_1\)`</span>._

---
## Bayes factors **cannot** establish the <span style='color: #cf7112;'>presence</span> or <span style='color: #cf7112;'>absence</span> of an effect!

Imagine that `\(BF_{01} = 1000\)`.

--

<br>
Do **not** say:

> _There is no effect._

Bayes factors cannot establish the absence, or presence, of any effect!

--

<br>
Instead, say:

> _There is very strong evidence in favor of `\(\mathcal{H}_0\)`, relative to (this) `\(\mathcal{H}_1\)`._

---
## Bayes factors are **not** effect sizes!

Imagine that `\(BF_{01} = 1000\)`.

--

Do **not** say:

> _The effect is extremely small._

--

<br>
Bayes factors have this property (_consistency_):

`$$\text{effect } \not=0 \quad\Longrightarrow\quad BF_{10}\underset{n\rightarrow\infty}{\longrightarrow}\infty$$`

`$$\text{effect } =0 \quad\Longrightarrow\quad BF_{01}\underset{n\rightarrow\infty}{\longrightarrow}\infty$$`

--

This holds even though the effect size is fixed in the population: only `\(n\)` changes.

<br>
Bayes factors are therefore **not** effect size measures.

---
## <span style='color: #cf7112;'>Inconclusive</span> evidence is **not** evidence of <span style='color: #cf7112;'>absence</span>!

Imagine that `\(BF_{01} = \frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)} = 1\)`.

--

Do **not** say:

> _There is (evidence of) no effect._

--

<br>
Instead, say:

> _The evidence is ambiguous._

or

> _The observed data are equally predictable under either hypothesis._

---
## Using <span style='color: #cf7112;'>description labels</span> is **not** problem-free!

Imagine that `\(BF_{01} = 9.5\)`.

--

Different sources in the literature attach different labels to this level of evidence:

<center>
<img src="Figures/labels.png" alt="Bayes factor labels." width="50%" />
</center>

--

Instead:

- Explain the amount of evidence in the <span style='color: #cf7112;'>context</span> of the research being conducted.
- Look at <span style='color: #cf7112;'>belief</span> (posterior probabilities).

---
class: mysep, middle, center
background-image: url(Figures/SanFrancisco_SEP.png)
background-size: cover

## Questions?