
Group connectometry analysis

Introduction


What is connectometry?
Diffusion MRI connectometry uses a permutation test to identify associations between white matter pathways and a study factor [1]. It can be combined with a linear regression model that includes all relevant variables (e.g. sex, site difference). The results show tracks that exhibit a positive or negative correlation with the variable of interest.

Why is connectometry a powerful test?
Most studies used either track-based or region-based analysis to compare diffusion data along the tracks or within a given region. These approaches require users to assign pre-defined tracks or regions to sample diffusion indices for further analysis. The assigned regions or tracks, however, will inevitably include regions that have low SNR (near gray matter) or show no correlation at all (irrelevant branches), consequently bringing high variance and making results insignificant.

Connectometry adopts an entirely different mindset to avoid including irrelevant regions. The paradigm behind it can be summarized as "tracking the difference". This is fundamentally different from the conventional paradigm of "finding the difference in tracks or regions". The analysis first identifies voxels that have strong correlations and then tracks along axonal fiber directions to identify consecutive fiber segments that show continuous positive or negative correlation. The tracking is limited to regions with high correlation, thus minimizing the noise contributed by non-correlated regions.

What is the biophysics behind connectometry?

Connectometry compares the spin distribution function (SDF), a density-based measurement of diffusion at different orientations, between subjects. SDF is different from diffusivity measurements such as FA, ADC, and RD. Diffusivity measures how fast water diffuses, whereas SDF measures the density of diffusing water. A recent study [4] showed that SDF provides a unique structural characterization that can reliably identify individuals (termed the local connectome fingerprint). Its reproducibility and uniqueness are higher than those of diffusivity-based measurements. It is noteworthy that since SDF reveals high individuality, the inter-subject variance of SDF can be very high (this does not mean that SDF is unreliable). As a result, connectometry is most suitable for longitudinal studies, though cross-sectional studies can also benefit from connectometry analysis.

What can connectometry do in a brain study? 
Two types of connectometry analysis are available in DSI Studio:

One is group connectometry [pdf], which identifies tracks associated with a group difference or correlated with a study variable. It can also be used in a longitudinal study to examine the difference between paired data (e.g. pre- and post-treatment). Group connectometry uses a linear regression model and can include other variables in the regression.

Another is individual connectometry (see Individual Connectometry for instructions), which identifies damaged/enhanced pathways of an individual by comparing one subject with a group of normal subjects.

You may refer to the publication at the bottom of this page to see how connectometry can be used in other studies.

The following documentation shows how to do group connectometry in DSI Studio:

STEP 1: Data Reconstruction


Create a connectometry database containing ALL subjects. At the end of this step, you will have a connectometry database, which is a file with the extension *.db.fib.gz.

*You can add more subjects or remove subjects from the database later using the [Edit a Connectometry Database] function provided in the main window tab.

*If you are going to study the change in a longitudinal study, make sure that you place the baseline scan and follow-up scan together for each subject (e.g. base of s#1, follow-up of s#1, base of s#2, follow-up of s#2, etc.).

*Quality control: please check the R2 value in the connectometry database using the "source data" tab shown in the following figure. A substantially lower R2 indicates reconstruction error. You may contact me at frank.yeh@gmail.com about fixing the error.

STEP 2: Handle Longitudinal Scans (if applicable)

Skip STEP 2 if your study is cross-sectional (has no repeat scans of the same subjects).

If your study is longitudinal (e.g. comparing pre- and post-treatment scans of the same subject), then the scans of the same individual need to be matched to calculate their difference. The steps are as follows:

a) Click "Open a Connectometry Database" and select the database created in STEP1:

b) In the top menu, select [Tools][Longitudinal scans...] to bring up the subject match window:


Click on the "Match consecutive scans" button to match scan 0 (base) and scan 1 (study), 2 and 3, 4 and 5, and so on throughout the entire database.

Alternatively, you can load your own matching table from a text file. The text file should list the scan numbers of the matching pairs consecutively. The first number in each pair is used as the baseline, and the second number as the study scan. For example, a text file with "0 1 2 3 4 5" will match scan 0 (base) and scan 1 (study), scan 2 and scan 3, and scan 4 and scan 5. Please note that the first scan in the database is labeled 0, not 1.
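
As a minimal sketch, the content of such a matching table could be generated with a few lines of Python. The helper name format_matching_table is hypothetical and not part of DSI Studio. The only format details taken from the text are that scan numbers start at 0 and are listed as consecutive (baseline, study) pairs separated by spaces. The check that each scan appears in only one pair is an added assumption.

```python
def format_matching_table(pairs):
    """Flatten (baseline, study) index pairs into one line of 0-based scan numbers."""
    flat = []
    for baseline, study in pairs:
        if baseline < 0 or study < 0:
            raise ValueError("scan indices are 0-based and cannot be negative")
        flat.extend([baseline, study])
    # Assumption: a scan should not be reused across pairs
    if len(set(flat)) != len(flat):
        raise ValueError("each scan should appear in only one pair")
    return " ".join(str(index) for index in flat)


# Three subjects stored in the order base, follow-up, base, follow-up, ...
pairs = [(0, 1), (2, 3), (4, 5)]
table_text = format_matching_table(pairs)
print(table_text)  # 0 1 2 3 4 5
```

The printed line can be saved to a plain text file and loaded as the matching table.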


The "metric" group box allows you to define how the difference is calculated. "a-b" calculates the absolute difference by "the study scan (a) - the baseline (b)", whereas "(a-b)/(a+b)" calculates the difference as a ratio. "a-b" is recommended here.
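
The two metrics can be illustrated on a single pair of values. This is only an arithmetic sketch with hypothetical numbers and a hypothetical function name (scan_difference). How DSI Studio applies the metric internally is not described here.

```python
def scan_difference(a, b, metric="a-b"):
    """a is the study scan value, b is the baseline value."""
    if metric == "a-b":
        return a - b
    if metric == "(a-b)/(a+b)":
        return (a - b) / (a + b)
    raise ValueError("unknown metric: " + metric)


study, baseline = 1.0, 0.5  # hypothetical values
print(scan_difference(study, baseline))                           # 0.5
print(round(scan_difference(study, baseline, "(a-b)/(a+b)"), 3))  # 0.333
```

A positive value means the study scan is higher than the baseline under either metric. The ratio form scales the difference by the sum of the two values.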

Click "Ok" to calculate the difference between scans and go back to the database window. 

Use [File][Save DB as..]  to save the differences as a "modified connectometry db".

The new connectometry db will contain only the difference between the paired scans for either paired longitudinal analysis or multiple regression analysis.

STEP 3: Group Connectometry


Open the db.fib.gz file using the "Group Connectometry Analysis" button in the main window (Under the connectometry tab) to bring up the following window. 

There are two analysis models available: "multiple regression" and "paired longitudinal change".

Multiple regression finds tracks correlated with a given study variable while the other variables are included in the regression model.

Paired longitudinal change studies whether connectivity increases or decreases between the pre- and post-treatment scans. To run this analysis, you need to open the "modified connectometry db" saved in STEP 2.


If the dialog pops up a warning about poor image quality (low R2), you may need to check the data in the "Source Data" tab to find out which subject has a registration problem or motion artifact. The possible solutions are mentioned in the documentation for creating a connectometry database.

STEP 3.1: Load demographics (for Multiple Regression only)

Prepare a text file that records any scalar values you would like to include in the regression model. For example, you can copy the values in Microsoft Excel and paste them into Notepad (Windows) or any text editor.

An example of the data format is shown in the following figure. The first row is the name of the scalar values. The second row is the value for the first subject, and the rest follows. If there is any missing data, assign 9999 and check the "missing data labeled by" checkbox (as shown in the interface). DSI Studio will ignore subjects with missing data.
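
A minimal sketch of generating such a file is shown here. The column names, the values, and the helper demographics_text are hypothetical. The tab delimiter is an assumption (it is what copying cells from Excel into a text editor produces), so check the result against the example figure before use. Rows must follow the order of the subjects in the database.

```python
import csv
import io

MISSING = 9999  # label for missing data, as described above


def demographics_text(columns, subjects):
    """Build the demographics table: one header row, then one row per subject."""
    buffer = io.StringIO()
    # Assumption: tab-separated values
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in subjects:
        if len(row) != len(columns):
            raise ValueError("every subject needs one value per column")
        writer.writerow([MISSING if value is None else value for value in row])
    return buffer.getvalue()


columns = ["age", "sex", "bmi"]  # hypothetical variables
subjects = [
    [25, 0, 22.1],
    [31, 1, None],  # bmi is missing for the second subject
    [47, 1, 27.4],
]
text = demographics_text(columns, subjects)
print(text)
```

Saving the printed text to a .txt file gives a table that can be loaded in the next step.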


Click on the button labeled "open subjects demographics". Load the txt file that records all the scalar values that will be included in the regression model. In the demographic table, check that the values match the subject IDs.

Choose the variables to be included in the regression model. These can include confounding factors such as sex and handedness. Here are several tips for choosing the variables in the model.

TIPS:
1. If you have multiple study variables, including too many of them in the model may lead to over-fitting unless your sample size is large enough. If it is not, include basic variables such as sex and age together with one study variable at a time.

2. If your study variables are highly correlated with each other, consider using PCA to obtain the principal components.
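
The PCA suggestion in tip 2 can be sketched with NumPy as follows. The data and the function name principal_components are hypothetical, and standardizing each variable before the decomposition is a design choice made here, not something prescribed by DSI Studio. The resulting component scores could then be written into the demographics file in place of the original correlated variables.

```python
import numpy as np


def principal_components(values):
    """values: subjects x variables. Returns (scores, explained variance ratio)."""
    X = np.asarray(values, dtype=float)
    # z-score each column so variables with different units are comparable
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA through the singular value decomposition of the standardized data
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S  # one column per principal component
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained


# Hypothetical example: body weight (kg) and BMI are highly correlated
data = [[60, 20.1], [72, 24.0], [80, 26.5], [95, 31.2], [68, 22.8]]
scores, explained = principal_components(data)
print(explained)   # fraction of variance carried by each component
pc1 = scores[:, 0]  # candidate replacement for the two correlated variables
```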

STEP 3.2: Set up parameters

Choose the study variable to run the connectometry analysis. DSI Studio will find the tracks correlated with this study variable. 

T threshold
Assign a T threshold to run the analysis. Higher values give more specific results, whereas lower values give more sensitive results. You can run separate analyses with different thresholds (e.g., t=1, 2, and 3) to map different levels of correlation. Each threshold is viewed as a different hypothesis and will receive its own FDR. The FDR value gives an idea of how reliable the findings are, and a higher T threshold does not necessarily reduce the FDR.

My experience is that an FDR below 0.05 is highly confirmative, while 0.05~0.2 suggests a high possibility of positive findings. Any finding with an FDR lower than 0.5 may be worth reporting since the results have more "true positive" findings than "false positive" findings.
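
These rules of thumb can be written down as a small helper, for example to label results from several T thresholds consistently. The function name interpret_fdr is hypothetical, and the handling of the exact boundary values (0.05, 0.2, 0.5) is an assumption, since the text only gives approximate ranges.

```python
def interpret_fdr(fdr):
    """Map an FDR value to the rule-of-thumb categories described above."""
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("FDR must be between 0 and 1")
    if fdr < 0.05:
        return "highly confirmative"
    if fdr <= 0.2:
        return "high possibility of positive findings"
    if fdr < 0.5:
        return "may be worth reporting (more true than false positives)"
    return "inconclusive"


# Hypothetical FDR values obtained at three T thresholds
for t_threshold, fdr in [(1, 0.35), (2, 0.12), (3, 0.03)]:
    print(t_threshold, fdr, interpret_fdr(fdr))
```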

Select tracks by length or FDR
The result can be selected using length or FDR. 

For studies confirming a hypothesis, use "length" as the threshold, specify a predefined length (20~40 voxel distance or more), and report the FDR value as the significance index. Different length thresholds correspond to different null hypotheses, and a longer length tends to achieve a lower FDR (with fewer findings).

For studies aiming to find which tracks are correlated with the study variable, use "FDR" as the threshold. DSI Studio will capture tracks with an FDR lower than the predefined threshold value (e.g. 0.05).
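
To give an intuition for why a longer length threshold tends to lower the FDR while keeping fewer findings, here is a toy sketch. It is not DSI Studio's implementation. It assumes that the FDR at a length threshold is estimated as the average number of tracks reaching that length under permutation (the null case) divided by the number of tracks reaching it in the real analysis. All track lengths are made up.

```python
def fdr_at_length(observed_lengths, permuted_lengths, threshold):
    """Toy FDR estimate: null tracks per permutation / observed tracks, at >= threshold."""
    n_observed = sum(1 for length in observed_lengths if length >= threshold)
    if n_observed == 0:
        return None  # no findings at this threshold
    n_null = sum(
        sum(1 for length in permutation if length >= threshold)
        for permutation in permuted_lengths
    ) / len(permuted_lengths)
    return min(1.0, n_null / n_observed)


# Hypothetical track lengths (voxel distance)
observed = [10, 15, 22, 25, 30, 41, 45, 52]
permuted = [[8, 12, 21], [9, 11, 14, 26], [10, 13], [7, 22, 43]]

for threshold in (20, 40):
    kept = sum(1 for length in observed if length >= threshold)
    print(threshold, kept, round(fdr_at_length(observed, permuted, threshold), 3))
```

In this made-up example, raising the threshold from 20 to 40 keeps 3 tracks instead of 6 and lowers the estimated FDR from about 0.167 to about 0.083.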


Advanced options

Permutation count determines the total number of permutations applied in the FDR analysis. Higher values give a smoother and more accurate estimation of the FDR curves but require more computation time.
Seed count specifies the number of seeds for each permutation. Increase this number if you have assigned a tracking region (see below).
Pruning helps remove noisy findings. A higher value may need a higher seed count. Track trimming helps to improve FDR.
Normalize SDF: It is recommended to leave this unchecked unless the directionality of the results is wrong. SDF normalization reduces systematic error and difference, but it assumes that the maximum anisotropy is the same for all subjects to stabilize the quality. My suggestions for this parameter are the following:

(1) If the direction of the correlation is wrong (this requires your prior assumption for judgement), switch "normalize SDF" to the other state (on->off, off->on)
(2) If normalize SDF does not affect the results or FDR, turn it off.
(3) If normalize SDF improves the FDR, leave it on.
(4) Make a consistent choice for the same data set. (i.e. do not turn it on when correlating with certain variables and off for the others)

Tracking region: The default study region for connectometry is "whole brain", which means that connectometry will examine the whole brain. You can choose to use an ROI in the MNI space by loading a NIFTI file of the mask. You can also choose a region from the atlas. The region type can be ROI (select tracks passing through the region), ROA (discard any track passing through the region), End (select tracks ending in the region), or Seed (only start the seeds from the region). "Whole brain" is most suitable for exploratory purposes. If you have a specific region of interest, limiting the computation to an ROI can give more specific results (e.g. assigning the cerebral peduncle as the ROI can investigate whether the CST is affected). Using only one ROI is good for most cases unless you have a specific need in the tracking routine.


STEP 3.3: Run connectometry

Click on the "STEP 4:Run" button and wait until the computation is finished.

You may click on "View Results in 3D" button to view the tracks with a substantial difference. 

DSI Studio will output the following files:

demo.txt is the demographics text file prepared in STEP 3.1.
demo.txt.length40.bmi.t20.dist_value.txt records the distribution of track length.
demo.txt.length40.bmi.t20.fdr_value.txt records the FDR with respect to length.
demo.txt.length40.bmi.t20.greater.dist.bmp shows the histogram of track length. "greater" here means positive correlation in multiple regression, or group 0 greater than group 1 if group analysis was selected.
demo.txt.length40.bmi.t20.greater.fdr.bmp shows the FDR with respect to track length.
demo.txt.length40.bmi.t20.greater.fib.gz is a FIB file that can be opened in "STEP3 fiber tracking" to observe how the local connectome correlates with the study variable.
demo.txt.length40.bmi.t20.greater.trk.gz is a track file of the connectometry finding.
demo.txt.length40.bmi.t20.report.txt is a method report.
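
When many analyses are run, it can help to sort the output files by their settings. The sketch below parses the file names listed above. The naming pattern (demographics file, "length" plus a number, variable name, "t" plus a number, then the output type) is inferred only from these example names and may differ between DSI Studio versions. The function name parse_output_name is hypothetical. The number after "t" is kept as raw text because its exact encoding is not stated here.

```python
import re

# Pattern inferred from the example names above (an assumption)
OUTPUT_NAME = re.compile(
    r"^(?P<demo>.+)\.length(?P<length>\d+)\.(?P<variable>[^.]+)"
    r"\.t(?P<t>\d+)\.(?P<output>.+)$"
)


def parse_output_name(filename):
    """Return the settings encoded in an output file name, or None if it does not match."""
    match = OUTPUT_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["length"] = int(fields["length"])
    return fields


print(parse_output_name("demo.txt.length40.bmi.t20.greater.fib.gz"))
print(parse_output_name("demo.txt.length40.bmi.t20.report.txt"))
print(parse_output_name("demo.txt"))  # None: the demographics file itself
```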

You may also check out the report to find out the FDR of the finding. DSI Studio will try to recognize the name of the pathways but may not be accurate. You may need to check the neuroanatomical name of the results. 

Troubleshooting (Important):
1. If the direction of the correlation is wrong, try checking "normalize SDF" to fix the problem.
2. If you have very different FDR results at different permutation counts, increase the permutation count until the FDR converges. 
3. If the finding only shows very short tracks, try lowering the T threshold.
4. If the finding only contains very few tracks, try increasing the track density in the advanced options (may need more computation time).
5. If the finding has many short fragments, consider increasing the length threshold.
6. There are several ways to improve the FDR at the expense of sensitivity. You may increase the length threshold (e.g. 50 mm), increase the T-score threshold, or increase track trimming (e.g. 2 rounds, in the advanced options).

How to interpret and report the result?

Strategy 1: Report FDR at different T-scores to give an overview of findings from high sensitivity (low T) to high specificity (high T)
You can visualize the results at different T thresholds (see the next section about visualization). The FDR value will be different for different track lengths. An FDR < 0.05 indicates a highly confirmative finding. An FDR > 0.5 suggests that there are more false findings than true findings; the result is thus inconclusive and not reliable. An FDR between 0.05 and 0.5 covers a wide spectrum of reliability.

Strategy 2: Report FDR at different length thresholds (e.g. 20, 40, 60 voxel distance)
Different length thresholds will capture findings at different pathways. Longer pathways can be better demonstrated using a higher length threshold.

Strategy 3: Convert the finding to ROI using [Tracts][Tract to ROI] to track the entire pathways. This allows you to confirm the anatomical structure of the findings.

STEP 4: Visualize results and report findings


Tracks with surface rendering


1) Click on the "View Results in 3D" button to view the tracks picked up by the connectometry analysis. Two sets of tracks will be presented: one for positive correlation and another for negative correlation in multiple regression.

Another way to show the results is to first open the connectometry database (the *.db.fib.gz file) in "STEP3: fiber tracking" and then load the trk files under the [Tracts] menu. The trk files are placed in the same directory as the demographic file.

2) Switch to "wm" as shown in the figure and click "+isosurface". You may choose full, left half, right half, etc. to visualize the full white matter or only part of it. The cutoff is defined by the slice position. You can switch the slice image to "T1w" to see the T1-weighted images.



3) Change the opacity of the surface in the Options window (right upper corner) under the "Surface Rendering" node.




Tracks in mosaic T1w

The finding can be visualized in slices using the following steps:

1) Switch slice image to "T1w".
2) In the main menu, add tracks to ROI by [Tracks][Tracks to ROI]
3) In the Options window (right upper corner), under the "Region Window" node, change the slice layout to Mosaic 2.
4) Change the contrast of the slices using the slider at the top of the region window to enhance the track regions.


Network Property Analysis for Connectometry Findings

The connectometry findings usually present a group of fiber bundles, and how these bundles affect the network topology can be further analyzed to better present the results.


The following are steps to carry out this analysis:


1. In the tractography window (after clicking the "View Results in 3D" button or opening the trk file in STEP3 Fiber Tracking), run whole-brain fiber tracking (no ROI or seed) to get a total of 100,000 tracks using the default tracking settings.

2. With the finding tracks unchecked and the whole-brain tracks checked, click [Tracts][Connectivity matrix] to bring up the connectivity dialog. Switch to the "Network measures" tab and copy the content (e.g. density, etc.) to an Excel sheet.

3. With the finding tracks "checked" and whole-brain track also checked, repeat the same steps to copy the network measure in a different Excel sheet.

4. Calculate the value differences (finding checked-finding unchecked) as a % change in 

    a. clustering_coeff_average(binary)
    b. network_characteristic_path_length(binary)
    c. small-worldness(binary)
    d. global_efficiency(binary)
    e. local_efficiency(binary).

5. If the change in clustering_coeff_average(binary) is +5% and the finding is from a positive/negative correlation with a study factor X, then you may report: "The network topology analysis of the connectometry results shows that X increases/decreases the clustering coefficient of the network by 5%." Repeat the same reporting format for the other network measures.

6. Provide a table summarizing the results using the following:

Table: Effect of X, Y, Z on network topology measures

| Study Variable | Clustering Coefficient (%) | Network Characteristic Path Length (%) | Global Efficiency (%) | Local Efficiency (%) | Small Worldness (%) |
|---|---|---|---|---|---|
| X | −0.30 | −5.38 | 3.35 | 4.60 | 5.08 |
| Y | 62.27 | −7.95 | 6.88 | 57.90 | 66.91 |
| Z | −18.65 | −10.57 | 8.11 | −7.47 | −8.24 |
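
The percentage change in step 4 can be computed with a short script instead of a spreadsheet. This is a sketch under two assumptions: the change is expressed relative to the values obtained with the finding tracks unchecked, and the network measures have already been copied by hand into two dictionaries. All numbers are hypothetical, and percent_change is not a DSI Studio function.

```python
def percent_change(finding_checked, finding_unchecked):
    """Change from the 'unchecked' value to the 'checked' value, in percent."""
    return 100.0 * (finding_checked - finding_unchecked) / finding_unchecked


# Hypothetical values copied from the "Network measures" tab
unchecked = {
    "clustering_coeff_average(binary)": 0.40,
    "network_characteristic_path_length(binary)": 2.00,
    "global_efficiency(binary)": 0.50,
}
checked = {
    "clustering_coeff_average(binary)": 0.42,
    "network_characteristic_path_length(binary)": 1.90,
    "global_efficiency(binary)": 0.52,
}

changes = {name: percent_change(checked[name], unchecked[name]) for name in unchecked}
for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```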



Get SDF values along the track

Once you get tracks that show a significant difference or correlation, you can open the connectometry db in [STEP 3 Fiber tracking] and load the resulting tracks. Click on track statistics and you will get SDF values along the tracks for each subject. This allows you to plot the difference between the groups.

Reference:

[1] F.-C. Yeh, D. Badre, and T. Verstynen, "Connectometry: a statistical approach harnessing the analytical potential of the local connectome", Neuroimage, accepted, 2015.
[2] F.-C. Yeh, P.-F. Tang, and W.-Y. I. Tseng, "Diffusion MRI Connectometry Automatically Reveals Affected Fiber Pathways in Individuals with Chronic Stroke", Neuroimage: Clinical 2, 912-921 (2013).
[3] F.-C. Yeh and W.-Y. Tseng, "NTU-90: a high angular resolution brain atlas constructed by q-space diffeomorphic reconstruction", Neuroimage 58, 91-99 (2011).
[4] F.-C. Yeh, J. M. Vettel, A. Singh, B. Poczos, S. T. Grafton, K. I. Erickson, et al., "Quantifying Differences and Similarities in Whole-Brain White Matter Architecture Using Local Connectome Fingerprints", PLoS Comput Biol 12(11): e1005203 (2016). doi:10.1371/journal.pcbi.1005203

Fang-Cheng Yeh,
Oct 22, 2015, 6:38 PM