scilpy.stats package

scilpy.stats.matrix_stats module

scilpy.stats.matrix_stats.omega_sigma(matrix)[source]

Returns the small-world coefficients (omega and sigma) of a graph. Omega ranges between -1 and 1. Values close to 0 mean the matrix features small-world characteristics. Values close to -1 mean the network has a lattice structure, and values close to 1 mean it is a random network.

A network is commonly classified as small-world if sigma > 1.

Parameters:

matrix (numpy.ndarray) – A weighted undirected graph.

Returns:

smallworld – The small-world coefficients (omega & sigma).

Return type:

tuple of float

Notes

The implementation is adapted from the algorithm by Telesford et al. [1].
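
As a rough illustration of what the two coefficients measure, the sketch below computes them from precomputed graph statistics using the standard definitions sigma = (C / C_rand) / (L / L_rand) and omega = L_rand / L - C / C_latt, where C is the clustering coefficient and L the characteristic path length. The function name and its arguments are hypothetical and are not part of scilpy; how omega_sigma itself obtains the random and lattice reference values is not described here.

```python
def small_world_coefficients(C, L, C_rand, L_rand, C_latt):
    """Compute (omega, sigma) from precomputed graph statistics.

    Hypothetical helper for illustration only, not the scilpy implementation.
    C, L           : clustering coefficient and path length of the network
    C_rand, L_rand : same quantities for an equivalent random network
    C_latt         : clustering coefficient of an equivalent lattice network
    """
    sigma = (C / C_rand) / (L / L_rand)
    omega = L_rand / L - C / C_latt
    return omega, sigma


# A network clustered like a lattice but with paths as short as a random graph:
# omega lands at 0 and sigma is well above 1, i.e. small-world on both criteria.
omega, sigma = small_world_coefficients(
    C=0.5, L=2.0, C_rand=0.125, L_rand=2.0, C_latt=0.5
)
print(omega, sigma)
```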

References

scilpy.stats.matrix_stats.ttest_two_matrices(matrices_g1, matrices_g2, paired, tail, fdr, bonferroni)[source]
Parameters:
  • matrices_g1 (np.ndarray of shape ? (toDO))

  • matrices_g2 (np.ndarray of shape ?)

  • paired (bool) – Use paired sample t-test instead of population t-test. The two matrices must be ordered the same way.

  • tail (str) – One of [‘left’, ‘right’, ‘both’].

  • fdr (bool) – Perform a false discovery rate (FDR) correction for the p-values. Uses the number of non-zero edges as number of tests (value between 0.01 and 0.1).

  • bonferroni (bool) – Perform a Bonferroni correction for the p-values. Uses the number of non-zero edges as number of tests.
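
The two corrections can be sketched in plain Python as follows. This is a minimal sketch, assuming the p-values passed in are those of the non-zero edges only (so the number of tests equals the list length) and assuming the FDR correction is the Benjamini-Hochberg procedure; the documentation above does not say which FDR procedure ttest_two_matrices actually uses. Both function names are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values; the number of tests is len(p_values)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    n_tests = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n_tests), key=lambda i: p_values[i])
    adjusted = [0.0] * n_tests
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n_tests, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n_tests / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per non-zero edge.
edge_p_values = [0.01, 0.04, 0.03, 0.5]
p_bonf = bonferroni(edge_p_values)
p_fdr = benjamini_hochberg(edge_p_values)
print(p_bonf)
print(p_fdr)
```

Bonferroni multiplies every p-value by the number of tests, while Benjamini-Hochberg scales each p-value by the number of tests divided by its rank, which makes it the less conservative of the two.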

scilpy.stats.stats module

scilpy.stats.stats.verify_group_difference(data_by_group, normality=False, homoscedasticity=False, alpha=0.05)[source]
Parameters:
  • data_by_group (list of array_like) – The sample data separated by groups. Possibly of different group size.

  • normality (bool) – Whether or not the sample data of each group can be considered normal.

  • homoscedasticity (bool) – Whether or not the equality of variance across groups can be assumed.

  • alpha (float) – Type 1 error of the group difference test. Probability of false positive or rejecting null hypothesis when it is true.

Returns:

  • test (string) – Name of the test done to verify group difference.

  • difference (bool) – Whether or not the grouping variable has an effect on the current measurement.

  • p_value (float) – Probability of obtaining an effect at least as extreme as the one in the current sample, assuming the null hypothesis. We reject the null hypothesis when this value is lower than alpha.
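
The role of the normality and homoscedasticity flags can be illustrated with a conventional textbook decision scheme. This is only a sketch under stated assumptions: the test names below are common choices, not a statement of which tests verify_group_difference actually runs, and both helper functions are hypothetical.

```python
def choose_group_test(n_groups, normality=False, homoscedasticity=False):
    """Pick a group-difference test name (textbook scheme, assumed)."""
    if n_groups < 2:
        raise ValueError("At least two groups are needed.")
    if n_groups == 2:
        if normality and homoscedasticity:
            return "Student t-test"
        if normality:
            return "Welch t-test"
        return "Mann-Whitney U test"
    if normality and homoscedasticity:
        return "One-way ANOVA"
    if normality:
        return "Welch ANOVA"
    return "Kruskal-Wallis H test"


def is_difference(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is lower than alpha."""
    return p_value < alpha


print(choose_group_test(3, normality=True, homoscedasticity=True))
print(is_difference(0.03), is_difference(0.2))
```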

scilpy.stats.stats.verify_homoscedasticity(data_by_group, normality=False, alpha=0.05)[source]
Parameters:
  • data_by_group (list of array_like) – The sample data separated by groups. Possibly of different group size.

  • normality (bool) – Whether or not the sample data of each group can be considered normal.

  • alpha (float) – Type 1 error of the equality of variance test. Probability of false positive or rejecting null hypothesis when it is true.

Returns:

  • test (string) – Name of the test done to verify homoscedasticity.

  • homoscedasticity (bool) – Whether or not the equality of variance across groups can be assumed.

  • p_value (float) – Probability of obtaining an effect at least as extreme as the one in the current sample, assuming the null hypothesis. We reject the null hypothesis when this value is lower than alpha.

scilpy.stats.stats.verify_normality(data, alpha=0.05)[source]
Parameters:
  • data (array_like) – Array of sample data to test normality on. Should be of 1 dimension.

  • alpha (float) – Type 1 error of the normality test. Probability of false positive or rejecting null hypothesis when it is true.

Returns:

  • normality (bool) – Whether or not the sample can be considered normal.

  • p_value (float) – Probability of obtaining an effect at least as extreme as the one in the current sample, assuming the null hypothesis. We reject the null hypothesis when this value is lower than alpha.

scilpy.stats.stats.verify_post_hoc(data_by_group, groups_list, test, correction=True, alpha=0.05)[source]
Parameters:
  • data_by_group (list of array_like) – The sample data separated by groups. Possibly of different group sizes.

  • groups_list (list of string) – The names of each group in the same order as data_by_group.

  • test (string) – The name of the post-hoc analysis test to do. Post-hoc analysis is the analysis of pairwise difference a posteriori of the fact that there is a difference across groups.

  • correction (bool) – Whether or not to do a Bonferroni correction on the alpha threshold. Used to have a more stable type 1 error across multiple comparisons.

  • alpha (float) – Type 1 error of the pairwise difference tests. Probability of false positive or rejecting null hypothesis when it is true.

Returns:

  • differences (list of (string, string, bool, float)) – The result of the post-hoc test for every pairwise combination of groups.

    • 1st, 2nd dimension: Names of the groups chosen.

    • 3rd: Whether or not we detect a pairwise difference on the current measurement.

    • 4th: P-value of the pairwise difference test.

  • test (string) – Name of the test done to verify group difference.
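
The shape of the differences list and the effect of the Bonferroni correction on the alpha threshold can be sketched as follows. This is a minimal sketch, assuming the pairwise p-values have already been computed by some test; the function name and the pair_p_values mapping are hypothetical and not part of scilpy.

```python
from itertools import combinations


def pairwise_decisions(groups_list, pair_p_values, correction=True, alpha=0.05):
    """Build (group_a, group_b, difference, p_value) tuples for every pair.

    pair_p_values is a hypothetical dict mapping (group_a, group_b) to a
    precomputed p-value.
    """
    pairs = list(combinations(groups_list, 2))
    # Bonferroni: divide alpha by the number of pairwise comparisons.
    threshold = alpha / len(pairs) if correction else alpha
    return [(a, b, pair_p_values[(a, b)] < threshold, pair_p_values[(a, b)])
            for a, b in pairs]


p_values = {("A", "B"): 0.01, ("A", "C"): 0.03, ("B", "C"): 0.5}
corrected = pairwise_decisions(["A", "B", "C"], p_values)
uncorrected = pairwise_decisions(["A", "B", "C"], p_values, correction=False)
print(corrected)
print(uncorrected)
```

With three groups there are three comparisons, so the corrected threshold is 0.05 / 3; the A versus C pair (p = 0.03) is flagged only when the correction is turned off.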

scilpy.stats.utils module

class scilpy.stats.utils.data_for_stat(json_file, participants)[source]

Bases: object

Methods with ‘init’ in their name initialise attributes of the object. Methods with ‘get’ in their name return a value generated from the object.

data_dictionnary

Open the JSON and TSV files and store their contents in a dictionary.

get_bundles_list()[source]
get_data_sample(bundle, metric, value)[source]
Parameters:
  • bundle (string) – The specific bundle with which we generate our sample.

  • metric (string) – The specific metric with which we generate our sample.

  • value (string) – The specific value with which we generate our sample.

Returns:

data_sample – The sample array associated with the parameters.

Return type:

array of float

get_first_bundle(participant)[source]
get_first_metric(participant, bundle)[source]
get_first_participant()[source]
get_groups_dictionnary(group_by)[source]
Parameters:

group_by (string) – The attribute with which we generate our groups.

Returns:

group_dict – keys: group id generated by group_by. values: dictionary of participants of that specific group.

Return type:

dictionary of groups

get_groups_list(group_by)[source]
Parameters:

group_by (string) – The attribute with which we generate our groups.

Returns:

group_list – List of group ids generated by the group_by attribute.

Return type:

list of string
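
The relationship between the groups dictionary and the groups list can be sketched as below. This is a minimal sketch under assumptions: participants are represented as a list of dictionaries with a "participant_id" key, which is a hypothetical layout chosen for illustration and may differ from how data_for_stat stores them internally.

```python
def group_participants(participants, group_by):
    """Map each group id to a dictionary of its participants.

    participants is a hypothetical list of dicts, one per participant.
    """
    groups = {}
    for participant in participants:
        group_id = str(participant[group_by])
        groups.setdefault(group_id, {})[participant["participant_id"]] = participant
    return groups


participants = [
    {"participant_id": "sub-01", "sex": "F"},
    {"participant_id": "sub-02", "sex": "M"},
    {"participant_id": "sub-03", "sex": "F"},
]
group_dict = group_participants(participants, "sex")
group_list = list(group_dict)  # the group ids, in order of first appearance
print(group_list)
```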

get_metrics_list()[source]
get_participant_attributes_list()[source]
get_participants_list()[source]
get_values_list()[source]
validation_participant_id(json_info, participants_info)[source]

Verify that the JSON and TSV files contain the same participant ids.

scilpy.stats.utils.get_group_data_sample(group_dict, group_id, bundle, metric, value)[source]
Parameters:
  • group_dict (dictionary of groups) – keys: group id generated by group_by. values: dictionary of participants of that specific group.

  • group_id (string) – The name of the group with which we generate our sample.

  • bundle (string) – The specific bundle with which we generate our sample.

  • metric (string) – The specific metric with which we generate our sample.

  • value (string) – The specific value with which we generate our sample.

Returns:

data_sample – The sample array associated with the parameters.

Return type:

array of float
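
How a sample might be pulled out of a groups dictionary is sketched below. The nesting participant → bundle → metric → value is an assumption made for illustration, as are the function name, the bundle name "AF_L", and the numbers; the actual layout used by scilpy is not described in this documentation.

```python
def group_sample(group_dict, group_id, bundle, metric, value):
    """Collect one number per participant of a group (assumed nested layout)."""
    return [float(participant[bundle][metric][value])
            for participant in group_dict[group_id].values()]


# Hypothetical data: two groups, one bundle, one metric, one value.
group_dict = {
    "control": {
        "sub-01": {"AF_L": {"fa": {"mean": 0.45}}},
        "sub-02": {"AF_L": {"fa": {"mean": 0.47}}},
    },
    "patient": {
        "sub-03": {"AF_L": {"fa": {"mean": 0.39}}},
    },
}
control_sample = group_sample(group_dict, "control", "AF_L", "fa", "mean")
print(control_sample)
```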

scilpy.stats.utils.visualise_distribution(data_by_group, participants_id, bundle, metric, value, oFolder, groups_list)[source]
Parameters:
  • data_by_group (list of array_like) – The sample data separated by groups. Possibly of different lengths per group.

  • participants_id (list of string) – The ids of the participants.

  • metric (string) – The name of the metric whose distribution you want to examine across groups.

  • oFolder (path-like object) – Location in which to save the graph of the distribution of the measurement across groups.

  • groups_list (list of string) – The names of each group.

Returns:

outliers – The list of participants considered outliers for their group (participant_id, group_id).

Return type:

list of (string, string)
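
The form of the returned outliers list can be illustrated with the usual box-plot rule, where a value is an outlier if it lies more than 1.5 times the interquartile range beyond the quartiles. This is a sketch under assumptions: the documentation does not state which outlier criterion visualise_distribution applies, and the function name and the ids_by_group argument (participant ids arranged per group, parallel to data_by_group) are hypothetical.

```python
from statistics import quantiles


def find_outliers(data_by_group, ids_by_group, groups_list, k=1.5):
    """Return (participant_id, group_id) pairs flagged by the k * IQR rule."""
    outliers = []
    for values, ids, group_id in zip(data_by_group, ids_by_group, groups_list):
        q1, _, q3 = quantiles(values, n=4)  # quartiles of this group
        low = q1 - k * (q3 - q1)
        high = q3 + k * (q3 - q1)
        outliers.extend((pid, group_id)
                        for pid, v in zip(ids, values)
                        if v < low or v > high)
    return outliers


# Hypothetical measurements for two groups of eight participants.
data_by_group = [
    [10, 11, 12, 13, 14, 15, 16, 50],
    [20, 21, 22, 23, 24, 25, 26, 27],
]
ids_by_group = [
    ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"],
    ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"],
]
outliers = find_outliers(data_by_group, ids_by_group, ["control", "patient"])
print(outliers)
```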

scilpy.stats.utils.write_csv_from_json(writer, json_dict)[source]
scilpy.stats.utils.write_current_dictionnary(metric, normality, variance_equality, diff_result, diff_2_by_2)[source]
Parameters:
  • metric (string) – The name of the metric on which the group comparison was made.

  • normality (dictionary of groups) – keys: group id. values: (result, p-value).

  • variance_equality ((string, bool)) –

    The result of the equality of variance test.

    • 1st dimension:

      Name of the equal variance test done.

    • 2nd dimension:

      Whether or not equality of variance can be assumed.

  • diff_result ((string, bool, float)) –

    The result of the groups difference analysis on the metric.

    • 1st dimension:

      Name of the test done.

    • 2nd dimension:

      Whether or not we detect a group difference on the metric.

    • 3rd dimension:

      p-value result

  • diff_2_by_2 ((list of (string, string, bool, float), string)) –

    The result of the pairwise groups difference a posteriori analysis.

    • 1st dimension:

      Name of the test done.

    • 2nd dimension:

      The result of every pairwise combinations of the groups (name of first group, name of second group, result, p-value).

Returns:

curr_dict – keys: The category of test done (Normality, Homoscedasticity, …). values: The results of those tests.

Return type:

dictionary of tests