class: left, bottom, inverted, title-slide

.title[
# The Bayes factor
]
.subtitle[
## Symposium ‘Emerging Alternatives to Traditional Null Hypothesis Significance Testing’
]
.author[
### Jorge Tendeiro<br>Hiroshima University<br>24 May 2024<br><a href="https://statsedge.org/APS2024">https://statsedge.org/APS2024</a>
]

---

## My talk

Null hypothesis Bayesian testing (NHBT) and its Bayes factor in a nutshell.

---
class: mysep, middle, center
background-image: url(Figures/SanFrancisco_SEP.png)
background-size:  cover
# Null hypothesis significance testing (NHST)

---
## Null hypothesis significance testing (NHST)
The most widely used testing paradigm in science.

NHST is _riddled_ with problems:

- It can't be used to draw support for `$\mathcal{H}_0$`.

- Non-significant results are hardly interpretable:
> We _fail to reject_ `$\mathcal{H}_0...$`

- Significant results can be misleading:
 - Any nonzero difference, however small, can be deemed significant ('just' increase `$N$`).
 - `$p(\text{data}|\mathcal{H}_1)$` is completely ignored.
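
The sample-size point can be illustrated with a minimal sketch. It assumes a two-sample z-test with equal group sizes and known `$\sigma$`; the function name `two_sided_p` and the observed difference `d = 0.02` are hypothetical choices for illustration, not values from the running example.

```python
import math

def two_sided_p(d, n_per_group):
    """Two-sided p-value of a two-sample z-test (equal group sizes, known sigma).

    d is the observed standardized mean difference, so z = d * sqrt(n / 2).
    """
    z = d * math.sqrt(n_per_group / 2)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical, practically negligible observed difference.
d = 0.02
for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_sided_p(d, n):.4f}")
```

With the observed difference held fixed, the p-value shrinks as the group size grows.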

---
background-image: url(Figures/Bayes_background.png)
background-size:  cover
# An alternative &mdash; Bayesian inference
Let's see how to test hypotheses in the Bayesian realm.

The framework is known as **Null hypothesis Bayesian testing (NHBT)**.

---
## Running example
From a paper on psychological resilience in _Frontiers in Psychiatry_.

`$$\mathcal{H}_0: \mu_\text{Male}-\mu_\text{Female}=0\qquad\text{vs}\qquad\mathcal{H}_1: \mu_\text{Male}-\mu_\text{Female}\not=0$$`

---
## Null hypothesis Bayesian testing (NHBT)
In NHBT, we can test similar hypotheses:
`$$\mathcal{H}_0: \mu_\text{Male}-\mu_\text{Female}=0\qquad\text{vs}\qquad\mathcal{H}_1: \mu_\text{Male}-\mu_\text{Female}\not=0$$`

--

For simplicity, we test the standardized difference:
`$$\mathcal{H}_0: \delta=0\qquad\text{vs}\qquad\mathcal{H}_1: \delta\not=0,$$`
where `$\delta=\frac{\mu_\text{Male}-\mu_\text{Female}}{\sigma}$`.

---
## Null hypothesis Bayesian testing (NHBT)
`$$\boxed{\mathcal{H}_0: \delta=0\qquad\text{vs}\qquad\mathcal{H}_1: \delta\not=0}$$`

In NHBT, we need prior distributions for all parameters under each hypothesis.

Prior distributions encapsulate our relative belief about the possible values of `$\delta$`, before looking at the data.

---
## Null hypothesis Bayesian testing (NHBT)
.pull-left[

<img src="Figures/priorH0.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[

<img src="Figures/priorH1.png" width="100%" style="display: block; margin: auto;" />
]

---
## Bayes factor in JASP

.pull-left-60[
<iframe width="640" height="277" src="https://player.vimeo.com/video/934019299?share=copy" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
]

.pull-right-30[

Conclusion:

`$\begin{aligned} & BF_{01} \\ &= \color{#CF7112}{\frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)}} \\ &=4.2 \\ \end{aligned}$`
]

---
## Bayes factor &mdash; Interpretation 1
`$$\boxed{BF_{01}=\frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)} = 4.2}$$`

> _The observed data are 4.2 times more likely under `$\mathcal{H}_0$` than under `$\mathcal{H}_1$`._

This is a statement about the relative probability of the observed data!
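
As a rough sketch of what this ratio computes, the block below evaluates `$BF_{01}$` in a deliberately simple setting. It assumes the observed standardized difference is approximately normal around `$\delta$` with a known standard error, and it places a normal prior on `$\delta$` under `$\mathcal{H}_1$`. These are assumptions made here for tractability and need not match the prior used in the JASP analysis; the names `normal_pdf` and `bf01` and the example inputs are hypothetical.

```python
import math

def normal_pdf(x, sd):
    """Density at x of a normal distribution with mean 0 and standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def bf01(d_obs, se, prior_sd):
    """BF01 for H0: delta = 0 vs H1: delta ~ Normal(0, prior_sd^2).

    Assumes d_obs ~ Normal(delta, se^2). Under H1, integrating delta out
    gives the marginal distribution d_obs ~ Normal(0, se^2 + prior_sd^2).
    """
    m0 = normal_pdf(d_obs, se)                                # p(data | H0)
    m1 = normal_pdf(d_obs, math.sqrt(se**2 + prior_sd**2))    # p(data | H1)
    return m0 / m1

# Hypothetical inputs: small observed difference, moderately precise estimate.
print(bf01(d_obs=0.05, se=0.15, prior_sd=1.0))
```

A value above 1 means the observed data were better predicted by `$\mathcal{H}_0$` than by this particular `$\mathcal{H}_1$`.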

---
## Bayes factor &mdash; Interpretation 2
Alternatively, knowing that
`$$\underbrace{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_1)}}_{\color{#CF7112}{\text{prior}} \text{ odds}}\times \underbrace{BF_{01}\vphantom{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_0)}}}_{\color{#CF7112}{4.2}}=\underbrace{\frac{p(\mathcal{H}_0|\text{data})}{p(\mathcal{H}_1|\text{data})}}_{\color{#CF7112}{\text{posterior}} \text{ odds}}$$`
we can say this:

> _We update our relative initial belief by a factor of 4.2-to-1 in favor of `$\mathcal{H}_0$`._

Yes, in favor of `$\mathcal{H}_0$`!
 
This is quite different from the `$p$`-value.
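
The updating rule can be sketched in a few lines. The function name `posterior_prob_h0` is hypothetical, and the sketch assumes that `$\mathcal{H}_0$` and `$\mathcal{H}_1$` are the only two hypotheses under consideration, so that their probabilities sum to 1.

```python
def posterior_prob_h0(bf01, prior_prob_h0):
    """Posterior probability of H0, assuming H0 and H1 are the only candidates."""
    prior_odds = prior_prob_h0 / (1 - prior_prob_h0)
    posterior_odds = prior_odds * bf01        # prior odds x BF01 = posterior odds
    return posterior_odds / (1 + posterior_odds)

# Same Bayes factor (4.2), different initial beliefs about H0.
for prior in (0.2, 0.5, 0.8):
    post = posterior_prob_h0(4.2, prior)
    print(f"P(H0) = {prior:.1f} -> P(H0 | data) = {post:.3f}")
```

The Bayes factor is the same in every row; the posterior probability differs because the prior odds differ.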

---
class: mysep, middle, center
# Keep in mind

---
## [_The Bayes factor_ app](https://statsedge.org/shiny/LearnBF/) (https://statsedge.org/shiny/LearnBF/)
<iframe width="1920" height="470" src="https://statsedge.org/shiny/LearnBF/" frameborder="10" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---
## To keep in mind
1. Bayes factors are **not** posterior odds!

2. Priors matter!

3. Bayes factors provide relative assessments!

4. Bayes factors **cannot** establish the presence or absence of an effect!

5. Bayes factors are **not** effect sizes!

6. Inconclusive evidence is **not** evidence of absence!

7. Using descriptive labels is **not** problem-free!

---
## Bayes factors are **not** posterior odds!

`$$\boxed{\underbrace{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_1)}}_{\color{#CF7112}{\text{prior}} \text{ odds}}\times \underbrace{BF_{01}\vphantom{\frac{p(\mathcal{H}_0)}{p(\mathcal{H}_0)}}}_{\color{#CF7112}{4.2}}=\underbrace{\frac{p(\mathcal{H}_0|\text{data})}{p(\mathcal{H}_1|\text{data})}}_{\color{#CF7112}{\text{posterior}} \text{ odds}}}$$`

Do **not** say:

> _`$\mathcal{H}_0$` is 4.2 times more likely than `$\mathcal{H}_1$`, after observing the data._

This would be a 'posterior odds'-like interpretation, which is only valid when the prior odds equal 1.

---
## Priors matter!

`$$\boxed{\mathcal{H}_0: \delta=0\qquad\text{vs} \qquad \mathcal{H}_1:\left\{\begin{array}{l} \delta\not=0 \\ \color{#cf7112}{\text{under prior }p(\delta)} \end{array}\right.}$$`

Always report the priors used.
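
To see why the prior must be reported, here is a minimal sketch that recomputes `$BF_{01}$` for the same data under several priors. It assumes a normal approximation for the observed standardized difference and a normal prior on `$\delta$` under `$\mathcal{H}_1$`; the function `bf01`, the data values, and the prior scales are all hypothetical and chosen only for illustration.

```python
import math

def bf01(d_obs, se, prior_sd):
    """BF01 for H0: delta = 0 vs H1: delta ~ Normal(0, prior_sd^2),
    assuming d_obs ~ Normal(delta, se^2)."""
    var0 = se**2                  # variance of d_obs under H0
    var1 = se**2 + prior_sd**2    # marginal variance of d_obs under H1
    # Ratio of the two zero-mean normal densities evaluated at d_obs.
    return math.sqrt(var1 / var0) * math.exp(-0.5 * d_obs**2 * (1 / var0 - 1 / var1))

# Same hypothetical data, four different priors under H1.
d_obs, se = 0.25, 0.15
for prior_sd in (0.2, 0.5, 1.0, 5.0):
    print(f"prior sd = {prior_sd:>3}: BF01 = {bf01(d_obs, se, prior_sd):.2f}")
```

In this sketch the data never change, yet the Bayes factor moves from favoring `$\mathcal{H}_1$` to favoring `$\mathcal{H}_0$` as the prior under `$\mathcal{H}_1$` becomes wider.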

---
## Bayes factors provide relative assessments!

`$$\boxed{BF_{01}=\frac{p(\text{data}|\color{#cf7112}{\mathcal{H}_0})}{p(\text{data}|\color{#cf7112}{\mathcal{H}_1})} = 4.2}$$`

Do **not** say:

> _There is evidence in favor of `$\mathcal{H}_0$`._

Instead, say:

> _There is evidence in favor of `$\mathcal{H}_0$` relative to (this) `$\mathcal{H}_1$`._

---
## Bayes factors **cannot** establish the presence or absence of an effect!
Imagine that `$BF_{01} = 1000$`.

Do **not** say:

> _There is no effect._

Bayes factors cannot establish the absence, or presence, of any effect!

Instead, say:

> _There is very strong evidence in favor of `$\mathcal{H}_0$`, relative to (this) `$\mathcal{H}_1$`._

---
## Bayes factors are **not** effect sizes!
Imagine that `$BF_{01} = 1000$`.

Do **not** say:

> _The effect is extremely small._

Bayes factors have this property (_consistency_):
`$$\text{effect } \not=0 \quad\Longrightarrow\quad BF_{10}\underset{n\rightarrow\infty}{\longrightarrow}\infty$$`
`$$\text{effect } =0 \quad\Longrightarrow\quad BF_{01}\underset{n\rightarrow\infty}{\longrightarrow}\infty$$`

This holds even though the population effect size stays fixed: only `$n$` changes.
 
Bayes factors are therefore **not** effect size measures.
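
Consistency can be illustrated with a minimal sketch under simplifying assumptions: a normal approximation with standard error `$\sqrt{2/n}$` for two equal groups, a normal prior on `$\delta$` under `$\mathcal{H}_1$`, and an observed difference held exactly at a fixed value as `$n$` grows (an idealization). The function name `log_bf10` and all numeric inputs are hypothetical. The Bayes factor is returned on the log scale to avoid overflow.

```python
import math

def log_bf10(d_obs, n_per_group, prior_sd=1.0):
    """Natural log of BF10 for H0: delta = 0 vs H1: delta ~ Normal(0, prior_sd^2).

    Assumes d_obs ~ Normal(delta, 2 / n_per_group).
    """
    var0 = 2 / n_per_group          # variance of d_obs under H0
    var1 = var0 + prior_sd**2       # marginal variance of d_obs under H1
    log_bf01 = 0.5 * math.log(var1 / var0) - 0.5 * d_obs**2 * (1 / var0 - 1 / var1)
    return -log_bf01

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: log BF10 = {log_bf10(0.2, n):8.2f} (d = 0.2), "
          f"log BF01 = {-log_bf10(0.0, n):5.2f} (d = 0)")
```

The observed difference is identical in every row of each column; only the sample size changes, and the evidence grows with it.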

---
## Inconclusive evidence is **not** evidence of absence!
Imagine that `$BF_{01} = \frac{p(\text{data}|\mathcal{H}_0)}{p(\text{data}|\mathcal{H}_1)} = 1$`.

Do **not** say:

> _There is (evidence of) no effect._

Instead, say:

> _The evidence is ambiguous._

> _The observed data are equally predictable under either hypothesis._

---
## Using descriptive labels is **not** problem-free!
Imagine that `$BF_{01} = 9.5$`.

The literature offers several labeling schemes, and they qualify this level of evidence in different ways.

Instead of relying on labels:

- Explain the amount of evidence in the context of the research being conducted.

- Look at posterior beliefs (the posterior probabilities of the hypotheses).

---
class: mysep, middle, center
background-image: url(Figures/SanFrancisco_SEP.png)
background-size:  cover
## Questions?