.. _scil_connectivity_compute_pca:

scil_connectivity_compute_pca
=============================

::

    usage: __main__.py [-h] --metrics METRICS [METRICS ...] --list_ids FILE
                       [--all_edges] [--input_connectoflow] [--show]
                       [-v [{DEBUG,INFO,WARNING,ERROR}]] [-f]
                       in_folder out_dir

    Script to compute PCA analysis on a set of connectivity matrices. The output
    is all significant principal components in a connectivity matrix format.
    This script can take into account all edges from every subject in a
    population or only non-zero edges across all subjects.

    Interpretation of resulting principal components can be done by evaluating
    the loadings values for each metrics. A value near 0 means that this metric
    doesn't contribute to this specific component whereas high positive or
    negative values mean a larger contribution. Components can then be labeled
    based on which metric contributes the highest. For example, a principal
    component showing a high loading for afd_fixel and near 0 loading for all
    other metrics can be interpreted as axonal density (see Gagnon et al. 2022
    for this specific example or ref [3] for an introduction to PCA).

    The script can take directly as input a connectoflow output folder. Simply
    use the --input_connectoflow flag. Else, the script expects a single folder
    containing all matrices for all subjects. Those matrices can be obtained,
    for instance, by scil_connectivity_compute_matrices.

    Example: Default input
            [in_folder]
            |--- sub-01_ad.npy
            |--- sub-01_md.npy
            |--- sub-02_ad.npy
            |--- sub-02_md.npy
            |--- ...

    Connectoflow input:
            [in_folder]
                [subj-01]
                    [Compute_Connectivity]
                    |--- ad.npy

    The plots, tables and principal components matrices will be saved in the
    designated folder from the argument.
    If you want to move back your principal components matrices in your
    connectoflow output, you can use a similar bash command for all principal
    components:

    for sub in `cat list_id.txt`;
    do
        cp out_dir/${sub}_PC1.npy connectoflow_output/$sub/Compute_Connectivity/
    done

    EXAMPLE USAGE:
    scil_connectivity_compute_pca input_folder/ output_folder/
        --metrics ad fa md rd [...] --list_ids list_ids.txt

    -------------------------------------------------------------------------------
    References:
    [1] Chamberland M, Raven EP, Genc S, Duffy K, Descoteaux M, Parker GD,
        Tax CMW, Jones DK. Dimensionality reduction of diffusion MRI measures
        for improved tractometry of the human brain. Neuroimage. 2019 Oct 15;
        200:89-100. doi: 10.1016/j.neuroimage.2019.06.020. Epub 2019 Jun 20.
        PMID: 31228638; PMCID: PMC6711466.
    [2] Gagnon A., Grenier G., Bocti C., Gillet V., Lepage J.-F.,
        Baccarelli A. A., Posner J., Descoteaux M., Takser L. (2022). White
        matter microstructural variability linked to differential attentional
        skills and impulsive behavior in a pediatric population. Cerebral
        Cortex. https://doi.org/10.1093/cercor/bhac180
    [3] https://towardsdatascience.com/what-are-pca-loadings-and-biplots-9a7897f2e559
    -------------------------------------------------------------------------------

    positional arguments:
      in_folder             Path to the input folder. See explanation above for
                            its expected organization.
      out_dir               Path to the output folder to export graphs, tables
                            and principal components matrices.

    options:
      -h, --help            show this help message and exit
      --metrics METRICS [METRICS ...]
                            Suffixes of all metrics to include in PCA analysis
                            (ex: ad md fa rd). They must be immediately followed
                            by the .npy extension.
      --list_ids FILE       Path to a .txt file containing a list of all ids.
      --all_edges           If true, will include all edges from all subjects
                            and not only common edges (Not recommended)
      --input_connectoflow  If true, script will assume the input folder is a
                            Connectoflow output.
      --show                If set, show matplotlib figures. Else, they are only
                            saved in the output folder.
      -v [{DEBUG,INFO,WARNING,ERROR}]
                            Produces verbose output depending on the provided
                            level. Default level is warning, default when using
                            -v is info.
      -f                    Force overwriting of the output files.

    2.2.2
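
The two input layouts described in the help text can be checked before launching the script. The following is a minimal sketch, not part of scilpy: ``expected_matrices`` and ``missing_matrices`` are hypothetical helper names, and the file naming is inferred only from the layouts shown in the help text (``<id>_<metric>.npy`` for the default input, ``<id>/Compute_Connectivity/<metric>.npy`` for a Connectoflow output). How the script itself resolves files may differ.

```python
from pathlib import Path


def expected_matrices(in_folder, subject_ids, metrics, connectoflow=False):
    """List the .npy paths implied by the layouts shown in the help text.

    Hypothetical helper: the naming rules are an assumption drawn from the
    documented folder examples, not from the script's source code.
    """
    root = Path(in_folder)
    paths = []
    for sub in subject_ids:
        for metric in metrics:
            if connectoflow:
                # Connectoflow layout: in_folder/<id>/Compute_Connectivity/<metric>.npy
                paths.append(root / sub / "Compute_Connectivity" / f"{metric}.npy")
            else:
                # Default layout: in_folder/<id>_<metric>.npy
                paths.append(root / f"{sub}_{metric}.npy")
    return paths


def missing_matrices(in_folder, subject_ids, metrics, connectoflow=False):
    """Return the expected paths that do not exist on disk."""
    candidates = expected_matrices(in_folder, subject_ids, metrics, connectoflow)
    return [path for path in candidates if not path.is_file()]


if __name__ == "__main__":
    # Illustrative ids and metrics taken from the help text examples.
    for path in missing_matrices("input_folder", ["sub-01", "sub-02"], ["ad", "md"]):
        print("missing:", path)
```

Running such a check first makes it easier to spot a subject with an absent metric before the PCA is computed. The ids passed in would normally come from the file given to ``--list_ids``.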