R/diff.R
get_avg_activity_diff_based_on_mcc_clustering.Rd
This function splits the models into 'good' and 'bad' groups based on an MCC
value clustering method: class.id.high denotes the group id with the higher MCC
values (good model group), whereas class.id.low denotes the group id with
the lower MCC values (bad model group). Then, for each network node, the function
finds the node's average activity in each of the two classes (a value in
the [0,1] interval) and subtracts the bad class average activity value from
the good one, taking into account the given penalty factor and the
number of models in each respective model group.
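The computation described above can be sketched in base R on toy data. This is a minimal illustration, not the package's implementation: the stable-state matrix, the class assignment and the node names are made up, and the weighting formula `(min(m1, m2) / max(m1, m2))^penalty` is an assumption chosen only so that a penalty of 0 leaves the difference unchanged and a penalty of 1 scales it the most.

```r
# Toy stable-state matrix (hypothetical data): rows are models, columns are
# nodes, 1 = active, 0 = inhibited.
models.stable.state <- matrix(
  c(1, 0, 1,
    1, 0, 0,
    0, 1, 1,
    0, 1, 0,
    0, 1, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("model", 1:5), c("A", "B", "C"))
)

# Hypothetical class id per model: class 2 holds the 'good' models,
# class 1 the 'bad' ones.
models.cluster.ids <- c(model1 = 2, model2 = 2, model3 = 1, model4 = 1, model5 = 1)

class.id.low  <- 1
class.id.high <- 2
penalty       <- 0.5

# Split the models into the two groups (drop = FALSE keeps the matrix shape
# even if a group has a single model).
good <- models.stable.state[models.cluster.ids == class.id.high, , drop = FALSE]
bad  <- models.stable.state[models.cluster.ids == class.id.low, , drop = FALSE]

# Average activity of every node in each group, a value in [0, 1].
good.avg <- colMeans(good)
bad.avg  <- colMeans(bad)

# ASSUMPTION: weight that shrinks the difference when the group sizes differ.
# penalty = 0 gives weight 1 (no penalty).
m.good <- nrow(good)
m.bad  <- nrow(bad)
weight <- (min(m.good, m.bad) / max(m.good, m.bad))^penalty

# Good minus bad, scaled by the weight; one named value per node in [-1, 1].
activity.diff <- (good.avg - bad.avg) * weight
print(round(activity.diff, 3))
```

With these toy values, node A is active only in the 'good' group and node B only in the 'bad' group, so they end up with equal and opposite differences, while node C stays close to 0.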
get_avg_activity_diff_based_on_mcc_clustering(
  models.mcc,
  models.stable.state,
  mcc.class.ids,
  models.cluster.ids,
  class.id.low,
  class.id.high,
  penalty = 0
)
Argument | Description |
---|---|
models.mcc | a numeric vector of Matthews Correlation Coefficient (MCC) scores, one for each model. The names attribute holds the models' names. Can be the result of using the function |
models.stable.state | a |
mcc.class.ids | a numeric vector of group/class ids starting from 1, e.g. |
models.cluster.ids | a numeric vector of cluster ids assigned to each model. It is the result of using |
class.id.low | integer. This number specifies the MCC class id of the 'bad' models. |
class.id.high | integer. This number specifies the MCC class id of the 'good' models and needs to be strictly higher than |
penalty | value between 0 and 1 (inclusive). A value of 0 means no penalty and a value of 1 is the strictest possible penalty. Default value is 0. This penalty is used as part of a weighted term applied to the difference in a value of interest (e.g. activity or link operator difference) between two groups of models, to account for the difference in the number of models in each respective model group. |
a numeric vector with values in the [-1,1] interval (minimum and maximum possible average difference), whose names attribute holds the names of the nodes.
So, if a node has a value close to -1, it means that on average this node is more inhibited in the 'good' models than in the 'bad' ones, while a value close to 1 means that the node is more activated in the 'good' models. A value close to 0 indicates that the activity of that node differs little between the 'good' and 'bad' models, so it will not be a node of interest when searching for indicators of better performance (higher MCC score/class) in the good models.
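As a small illustration of how such a result vector might be read, the sketch below picks out the nodes whose value lies near either end of the [-1,1] interval. The vector, the node names and the 0.7 cutoff are all hypothetical; the function itself does not define or apply any threshold.

```r
# Hypothetical result vector: one average activity difference per node.
activity.diff <- c(A = 0.82, B = -0.82, C = -0.14, D = 0.05)

# Hypothetical cutoff for calling a node 'of interest'.
threshold <- 0.7

# Nodes that are on average more activated in the 'good' models.
more.activated <- names(activity.diff)[activity.diff >= threshold]

# Nodes that are on average more inhibited in the 'good' models.
more.inhibited <- names(activity.diff)[activity.diff <= -threshold]

# Nodes whose activity differs little between the two groups.
not.of.interest <- names(activity.diff)[abs(activity.diff) < threshold]

print(more.activated)
print(more.inhibited)
print(not.of.interest)
```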
Other average data difference functions:
get_avg_activity_diff_based_on_specific_synergy_prediction(),
get_avg_activity_diff_based_on_synergy_set_cmp(),
get_avg_activity_diff_based_on_tp_predictions(),
get_avg_activity_diff_mat_based_on_mcc_clustering(),
get_avg_activity_diff_mat_based_on_specific_synergy_prediction(),
get_avg_activity_diff_mat_based_on_tp_predictions(),
get_avg_link_operator_diff_based_on_synergy_set_cmp(),
get_avg_link_operator_diff_mat_based_on_mcc_clustering(),
get_avg_link_operator_diff_mat_based_on_specific_synergy_prediction(),
get_avg_link_operator_diff_mat_based_on_tp_predictions()