Influence of a single observation on the global monitor
Arguments
- dag
an object of class bn from the bnlearn package
- data
a base R style dataframe
- alpha
single integer. By default, the maximum number of levels of the variables in data
Details
Consider a Bayesian network over variables \(Y_1,\dots,Y_m\) and suppose a dataset \((\boldsymbol{y}_1,\dots,\boldsymbol{y}_n)\) has been observed, where \(\boldsymbol{y}_i=(y_{i1},\dots,y_{im})\) and \(y_{ij}\) is the i-th observation of the j-th variable. Define \(\boldsymbol{y}_{-i}=(\boldsymbol{y}_1,\dots,\boldsymbol{y}_{i-1},\boldsymbol{y}_{i+1},\dots,\boldsymbol{y}_n)\), the dataset with the i-th observation removed. The influence of the i-th observation on the global monitor is defined as $$|\log(p(\boldsymbol{y}_1,\dots,\boldsymbol{y}_n)) - \log(p(\boldsymbol{y}_{-i}))|.$$ High values of this index denote observations that contribute strongly to the likelihood of the model.
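The leave-one-out definition above can be sketched directly in R. This is a minimal illustration, not the internals of influential_obs: it assumes that \(p\) is the BDeu marginal likelihood returned by bnlearn's score function, and it uses the learning.test data and a network structure from the bnlearn documentation rather than the chds example. The helper log_ml, the subset of 50 rows, and the choice iss = 2 are all arbitrary assumptions made for the sketch.

```r
library(bnlearn)

# Toy data and structure shipped with bnlearn (assumption: not the chds data).
data(learning.test)
dag <- model2network("[A][C][F][B|A][D|A:C][E|B:F]")
dat <- learning.test[1:50, ]

# Hypothetical helper: log marginal likelihood of a dataset under a BDeu prior.
# iss = 2 is an arbitrary imaginary sample size chosen for illustration.
log_ml <- function(d) bnlearn::score(dag, d, type = "bde", iss = 2)

# log p(y_1, ..., y_n) on the full dataset.
full <- log_ml(dat)

# For each row i, drop it and take |log p(y_1, ..., y_n) - log p(y_{-i})|.
influence <- vapply(
  seq_len(nrow(dat)),
  function(i) abs(full - log_ml(dat[-i, , drop = FALSE])),
  numeric(1)
)

# Rows with the largest influence first.
head(cbind(dat, influence = influence)[order(-influence), ])
```

By the chain rule, \(\log p(\boldsymbol{y}_1,\dots,\boldsymbol{y}_n) - \log p(\boldsymbol{y}_{-i}) = \log p(\boldsymbol{y}_i \mid \boldsymbol{y}_{-i})\), so each value in this sketch measures how surprising row i is given all the other rows.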
See also
influential_obs, node_monitor, seq_node_monitor, seq_pa_ch_monitor
Examples
influential_obs(chds_bn, chds[1:100,], 3)
#> Social Economic Events Admission score
#> 1 High Low Low No 2.109706
#> 2 High Low Low Yes 3.914204
#> 4 High High High No 3.350580
#> 5 Low Low High No 1.993258
#> 7 Low Low Low No 2.628451
#> 8 Low Low Average No 2.061860
#> 15 High High Low No 1.816719
#> 22 High High Low Yes 3.621217
#> 23 High Low Average Yes 4.302875
#> 26 Low Low Low Yes 4.432949
#> 34 High High Average No 3.062507
#> 35 Low Low High Yes 3.445511
#> 40 Low High Low Yes 6.545914
#> 45 High Low Average No 3.355494
#> 46 Low High Low No 4.741415
#> 52 High Low High No 3.643567
#> 59 Low Low Average Yes 3.009242
#> 69 High High Average Yes 4.009888
#> 73 Low High High No 4.106223
#> 83 High Low High Yes 5.095819
#> 85 High High High Yes 4.802832