scil_connectivity_compute_pca.py

usage: scil_connectivity_compute_pca.py [-h] --metrics METRICS [METRICS ...]
                                        --list_ids FILE [--all_edges]
                                        [--input_connectoflow] [--show]
                                        [-v [{DEBUG,INFO,WARNING}]] [-f]
                                        in_folder out_folder

Script to compute a principal component analysis (PCA) on a set of
connectivity matrices. The output is all significant principal components,
saved in a connectivity matrix format.
The script can take into account all edges from every subject in a population,
or only the edges that are non-zero across all subjects.
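
The difference between the two edge-selection modes can be sketched with NumPy.
This is a minimal illustration under assumptions, not the script's actual
implementation: the helper name select_edges and the toy matrices are
hypothetical, and "all edges" is assumed to mean every entry of the matrix.

```python
import numpy as np


def select_edges(matrices, all_edges=False):
    """Return a boolean mask of the edges to keep (hypothetical helper).

    matrices: array of shape (n_subjects, n_nodes, n_nodes).
    all_edges=False keeps only edges that are non-zero in every subject;
    all_edges=True keeps every edge.
    """
    matrices = np.asarray(matrices)
    if all_edges:
        return np.ones(matrices.shape[1:], dtype=bool)
    return np.all(matrices != 0, axis=0)


# Toy population: 2 subjects, 2x2 matrices.
population = np.array([
    [[0.0, 0.5], [0.5, 0.0]],
    [[0.0, 0.0], [0.3, 0.0]],
])
common_mask = select_edges(population)               # common edges only
full_mask = select_edges(population, all_edges=True)  # every edge
```

In this toy case only the edge at position (1, 0) is non-zero in both subjects,
so it is the only one kept by the common-edge mask.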

Interpretation of the resulting principal components can be done by evaluating
the loading values for each metric. A value near 0 means that the metric
doesn't contribute to that component, whereas high positive or negative values
mean a larger contribution. Components can then be labeled based on which
metric contributes the most. For example, a principal component showing a
high loading for afd_fixel and near-0 loadings for all other metrics can be
interpreted as axonal density (see Gagnon et al. 2022 [2] for this specific
example, or ref [3] for an introduction to PCA).
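
As an illustration of how loadings are read, the sketch below runs a PCA
through a plain SVD in NumPy on synthetic data. It is a toy under stated
assumptions, not the script's code: the metric names are only labels, the data
is random, and loadings are defined here as eigenvectors scaled by the square
root of their eigenvalues (one common convention).

```python
import numpy as np

rng = np.random.default_rng(0)
metric_names = ["ad", "fa", "md", "rd"]  # labels only, for illustration

# Synthetic observations: one row per observation, one column per metric.
# "ad" and "md" share a latent factor; "fa" and "rd" are independent noise.
n_obs = 500
latent = rng.normal(size=n_obs)
data = np.column_stack([
    latent + 0.1 * rng.normal(size=n_obs),
    rng.normal(size=n_obs),
    latent + 0.1 * rng.normal(size=n_obs),
    rng.normal(size=n_obs),
])

# Standardize each metric, then decompose with SVD.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
_, singular_values, vt = np.linalg.svd(z, full_matrices=False)

eigenvalues = singular_values ** 2 / (n_obs - 1)
explained_ratio = eigenvalues / eigenvalues.sum()
# Loadings: one row per metric, one column per component.
loadings = vt.T * np.sqrt(eigenvalues)

for name, value in zip(metric_names, loadings[:, 0]):
    print(f"{name}: PC1 loading = {value:+.2f}")
```

With this construction, the first component should show loadings of large
magnitude for "ad" and "md" and loadings near 0 for "fa" and "rd", so it would
be labeled after the factor shared by the first two. The sign of a component is
arbitrary; only the magnitude and relative signs of the loadings matter.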

The script can directly take a connectoflow output folder as input. Simply use
the --input_connectoflow flag. Otherwise, the script expects a single folder
containing all matrices for all subjects. Those matrices can be obtained, for
instance, with scil_connectivity_compute_matrices.py.
Default input:
        [in_folder]
        |--- sub-01_ad.npy
        |--- sub-01_md.npy
        |--- sub-02_ad.npy
        |--- sub-02_md.npy
        |--- ...
Connectoflow input:
        [in_folder]
              [sub-01]
                   [Compute_Connectivity]
                       |--- ad.npy
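
The two layouts translate into different file paths for the same
(subject, metric) pair. The helper below is a hypothetical sketch of that
mapping, assuming the file naming shown above; matrix_path is not part of
scilpy.

```python
from pathlib import Path


def matrix_path(in_folder, subject_id, metric, input_connectoflow=False):
    """Build the expected path of one matrix (hypothetical helper).

    Default layout:      <in_folder>/<subject_id>_<metric>.npy
    Connectoflow layout: <in_folder>/<subject_id>/Compute_Connectivity/<metric>.npy
    """
    in_folder = Path(in_folder)
    if input_connectoflow:
        return in_folder / subject_id / "Compute_Connectivity" / f"{metric}.npy"
    return in_folder / f"{subject_id}_{metric}.npy"


default_path = matrix_path("in_folder", "sub-01", "ad")
connectoflow_path = matrix_path("in_folder", "sub-01", "ad",
                                input_connectoflow=True)
```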

The plots, tables and principal component matrices will be saved in the
folder given by the <out_folder> argument. To copy the principal component
matrices back into your connectoflow output, you can use a bash loop similar
to this one, repeated for each principal component:
for sub in $(cat list_ids.txt);
do
    cp "out_folder/${sub}_PC1.npy" "connectoflow_output/${sub}/Compute_Connectivity/"
done
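
The same copy step can be sketched in Python so that every component is
handled in one pass. This is only an illustration: copy_components is a
hypothetical helper, and it assumes that the output files follow the
<id>_PC<n>.npy naming used in the loop above and that the ids file is
whitespace-separated.

```python
import shutil
from pathlib import Path


def copy_components(out_folder, connectoflow_output, list_ids_file):
    """Copy every <id>_PC*.npy back to its subject folder (hypothetical helper).

    Returns the list of destination paths. Subjects without an existing
    Compute_Connectivity folder are skipped.
    """
    copied = []
    subject_ids = Path(list_ids_file).read_text().split()
    for sub in subject_ids:
        target_dir = Path(connectoflow_output) / sub / "Compute_Connectivity"
        if not target_dir.is_dir():
            continue
        for pc_file in sorted(Path(out_folder).glob(f"{sub}_PC*.npy")):
            copied.append(Path(shutil.copy(pc_file, target_dir)))
    return copied
```

It could be called as
copy_components("out_folder", "connectoflow_output", "list_ids.txt").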

EXAMPLE USAGE:
scil_connectivity_compute_pca.py input_folder/ output_folder/ \
    --metrics ad fa md rd [...] --list_ids list_ids.txt

-------------------------------------------------------------------------------
References:
[1] Chamberland M, Raven EP, Genc S, Duffy K, Descoteaux M, Parker GD, Tax CMW,
    Jones DK. Dimensionality reduction of diffusion MRI measures for improved
    tractometry of the human brain. Neuroimage. 2019 Oct 15;200:89-100.
    doi: 10.1016/j.neuroimage.2019.06.020. Epub 2019 Jun 20. PMID: 31228638;
    PMCID: PMC6711466.
[2] Gagnon A., Grenier G., Bocti C., Gillet V., Lepage J.-F., Baccarelli A. A.,
    Posner J., Descoteaux M., Takser L. (2022). White matter microstructural
    variability linked to differential attentional skills and impulsive behavior
    in a pediatric population. Cerebral Cortex.
    https://doi.org/10.1093/cercor/bhac180
[3] https://towardsdatascience.com/what-are-pca-loadings-and-biplots-9a7897f2e559
-------------------------------------------------------------------------------

positional arguments:
  in_folder             Path to the input folder. See explanation above for its expected organization.
  out_folder            Path to the output folder to export graphs, tables and principal
                        component matrices.

options:
  -h, --help            show this help message and exit
  --metrics METRICS [METRICS ...]
                        Suffixes of all metrics to include in the PCA (e.g. ad md fa rd).
                        They must be immediately followed by the .npy extension.
  --list_ids FILE       Path to a .txt file containing a list of all ids.
  --all_edges           If set, include all edges from all subjects and not only
                        common edges (not recommended).
  --input_connectoflow  If set, the script will assume the input folder is a Connectoflow output.
  --show                If set, show matplotlib figures. Otherwise, they are only saved in the output folder.
  -v [{DEBUG,INFO,WARNING}]
                        Produces verbose output depending on the provided level.
                        Default level is warning, default when using -v is info.
  -f                    Force overwriting of the output files.

Scilpy version: 2.0.2