Sequential marginal and conditional node monitors for a vertex of a Bayesian network.
Arguments
- dag
an object of class bn from the bnlearn package
- df
a base R style dataframe
- node.name
node over which to compute the monitor
Details
Consider a Bayesian network over variables \(Y_1,\dots,Y_m\) and suppose a dataset \((\boldsymbol{y}_1,\dots,\boldsymbol{y}_n)\) has been observed, where \(\boldsymbol{y}_i=(y_{i1},\dots,y_{im})\) and \(y_{ij}\) is the i-th observation of the j-th variable. Let \(p_i\) denote the marginal density of \(Y_j\) after the first \(i-1\) observations have been processed. Define $$E_i = -\sum_{k=1}^Kp_i(d_k)\log(p_i(d_k)),$$ $$V_i = \sum_{k=1}^K p_i(d_k)\log^2(p_i(d_k))-E_i^2,$$ where \((d_1,\dots,d_K)\) are the possible values of \(Y_j\), so that \(E_i\) and \(V_i\) are the mean and variance of \(-\log(p_i(Y_j))\) under \(p_i\). The sequential marginal node monitor for the vertex \(Y_j\) is defined as $$Z_{ij}=\frac{-\sum_{k=1}^i\log(p_k(y_{kj}))-\sum_{k=1}^i E_k}{\sqrt{\sum_{k=1}^iV_k}}.$$ If the model is adequate, \(Z_{ij}\) is approximately standard normal, so values with \(|Z_{ij}|> 1.96\) (the two-sided 5% threshold) can indicate a poor model fit for the vertex \(Y_j\) after the first i observations have been processed.
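The standardisation above can be sketched for a single categorical variable. This is a minimal illustration in Python, not the package's implementation: the function name `seq_marginal_monitor`, the parameter `alpha`, and the way \(p_i\) is obtained (a symmetric Dirichlet prior updated with the counts of the first \(i-1\) observations) are all assumptions made for the example. \(E_k\) is taken as the expected value of \(-\log(p_k(Y_j))\) under \(p_k\), so that the numerator is centred at zero when the model fits.

```python
import math
from collections import Counter


def seq_marginal_monitor(observations, levels, alpha=1.0):
    """Return Z_1, ..., Z_n for one categorical variable.

    Assumption (not taken from the package): p_i is the posterior
    predictive of a symmetric Dirichlet(alpha) prior after the first
    i-1 observations.
    """
    counts = Counter()   # counts of each level seen so far
    cum_score = 0.0      # running sum of -log p_k(y_k)
    cum_mean = 0.0       # running sum of E_k
    cum_var = 0.0        # running sum of V_k
    z = []
    for i, y in enumerate(observations):
        # Predictive distribution p_i, built from the i observations seen so far
        total = alpha * len(levels) + i
        p = {d: (alpha + counts[d]) / total for d in levels}
        logs = {d: math.log(q) for d, q in p.items()}

        # Mean and variance of -log p_i(Y) under p_i
        e = -sum(p[d] * logs[d] for d in levels)
        v = sum(p[d] * logs[d] ** 2 for d in levels) - e ** 2

        cum_score += -logs[y]
        cum_mean += e
        cum_var += v

        # A uniform p_i has zero variance, so Z is undefined until
        # some variance has accumulated.
        if cum_var > 1e-12:
            z.append((cum_score - cum_mean) / math.sqrt(cum_var))
        else:
            z.append(float("nan"))

        counts[y] += 1   # update only after scoring observation i
    return z


z = seq_marginal_monitor(["a", "b", "a", "a", "b"], levels=["a", "b"])
flagged = [i + 1 for i, value in enumerate(z) if abs(value) > 1.96]
```

Each observation is scored before it is used to update the counts, which is what makes the monitor sequential (prequential). With a uniform starting distribution the first value is undefined because \(V_1 = 0\); the sketch returns NaN there rather than dividing by zero.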
The sequential conditional node monitor for the vertex \(Y_j\) is defined as $$Z_{ij}=\frac{-\sum_{k=1}^i\log(p_k(y_{kj}|y_{k1},\dots,y_{k(j-1)},y_{k(j+1)},\dots,y_{km}))-\sum_{k=1}^i E_k}{\sqrt{\sum_{k=1}^iV_k}},$$ where \(E_k\) and \(V_k\) are computed with respect to \(p_k(y_{kj}|y_{k1},\dots,y_{k(j-1)},y_{k(j+1)},\dots,y_{km})\). Again, values of \(Z_{ij}\) such that \(|Z_{ij}|> 1.96\) can give an indication of a poor model fit for the vertex \(Y_j\).
References
Cowell, R. G., Dawid, P., Lauritzen, S. L., & Spiegelhalter, D. J. (2006). Probabilistic networks and expert systems: Exact computational methods for Bayesian networks. Springer Science & Business Media.
Cowell, R. G., Verrall, R. J., & Yoon, Y. K. (2007). Modeling operational risk with Bayesian networks. Journal of Risk and Insurance, 74(4), 795-827.
See also
influential_obs, node_monitor, seq_node_monitor, seq_pa_ch_monitor