Patch2Self: Self-Supervised Denoising via Statistical Independence

Patch2Self [1], [2] is a self-supervised learning method for denoising diffusion-weighted imaging (DWI) data. It uses the entire volume to learn a full-rank, locally linear denoiser for that volume. By taking advantage of the oversampled q-space of DWI data, Patch2Self can separate structure from noise without requiring an explicit model for either.

Classical denoising algorithms such as Local PCA [3], [4], Non-local Means [5], and Total Variation Norm [6] assume certain properties of the signal structure. Patch2Self makes no such assumption about the signal. Instead, it relies on the fact that the noise across different 3D volumes of the DWI signal originates from random fluctuations in the acquired signal.

Since Patch2Self relies only on the randomness of the noise, it can be applied at any step in the pre-processing pipeline. Patch2Self is designed to work on any type of diffusion data and any body part, without requiring noise estimation or assumptions about the type of noise (such as its distribution).

The Patch2Self Framework:

https://github.com/dipy/dipy_data/blob/master/Patch2Self_Framework.PNG?raw=true

The above figure illustrates how Patch2Self works. The idea is to build a new regressor for denoising each 3D volume of the 4D diffusion data. This is done in the following 2 phases:

(A) Self-supervised training: First, we extract 3D patches from all n volumes and hold out the target volume to be denoised. Each patch from the remaining (n-1) volumes is used to predict the center voxel of the corresponding patch in the target volume.

This is done by using the self-supervised loss:

\[\mathcal{L}\left(\Phi_{J}\right)=\mathbb{E}\left\|\Phi_{J}\ \left(Y_{*, *,-j}\right)-Y_{*, 0, j}\right\|^{2}\]

(B) Prediction: The same (n-1) volumes that were used in training are now fed into the regressor \(\Phi\) built in phase (A). The prediction is a denoised version of the held-out volume.

Note: The volume to be denoised is used only as the target in the training phase. It is neither part of the training inputs in (A) nor used as an input when predicting the denoised output in (B).
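To make the two phases concrete, here is a minimal NumPy sketch of the hold-out regression idea. It is not DIPY's implementation: it assumes a patch radius of zero (each "patch" is a single voxel), uses synthetic low-rank data in place of a real DWI scan, and the helper name holdout_denoise is hypothetical.

```python
import numpy as np


def holdout_denoise(data_2d):
    """Denoise each column j by regressing it on all the other columns.

    data_2d has shape (n_voxels, n_volumes). This is a simplified,
    hypothetical stand-in for Patch2Self with single-voxel patches.
    """
    n_samples, n_vols = data_2d.shape
    denoised = np.empty_like(data_2d, dtype=float)
    for j in range(n_vols):
        # Phase (A): features are the other n-1 volumes, target is volume j.
        features = np.delete(data_2d, j, axis=1)
        features = np.column_stack([features, np.ones(n_samples)])  # intercept
        target = data_2d[:, j]
        coef, *_ = np.linalg.lstsq(features, target, rcond=None)
        # Phase (B): predict volume j from the same n-1 volumes.
        denoised[:, j] = features @ coef
    return denoised


rng = np.random.default_rng(0)
n_voxels, n_vols = 5000, 20

# Synthetic "signal": three latent components shared across all volumes.
latent = rng.normal(size=(n_voxels, 3))
mixing = rng.normal(size=(3, n_vols))
clean = latent @ mixing

# Noise is drawn independently for every volume.
noisy = clean + 0.5 * rng.normal(size=clean.shape)

denoised = holdout_denoise(noisy)
err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE before: {err_noisy:.3f}, after: {err_denoised:.3f}")
```

Because the noise in the held-out volume is independent of the noise in the other volumes, the regressor can only learn the shared structure, which is why the prediction ends up closer to the clean signal in this toy setting.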

Let’s load the necessary modules:

import matplotlib.pyplot as plt
import numpy as np

from dipy.data import get_fnames
from dipy.denoise.patch2self import patch2self
from dipy.io.image import load_nifti, save_nifti

Now let’s load an example dataset and denoise it with Patch2Self. Patch2Self does not require noise estimation and should work with any kind of diffusion data.

hardi_fname, hardi_bval_fname, hardi_bvec_fname = get_fnames(name="stanford_hardi")
data, affine = load_nifti(hardi_fname)
bvals = np.loadtxt(hardi_bval_fname)
denoised_arr = patch2self(
    data,
    bvals,
    model="ols",
    shift_intensity=True,
    clip_negative_vals=False,
    b0_threshold=50,
)
Fitting and Denoising: 100%|██████████| 160/160 [04:17<00:00,  1.74s/it]

The above parameters should give optimal denoising performance for Patch2Self. Ordinary least squares regression (model='ols') can be slow, depending on the size of the data. In that case, consider switching to ridge regression (model='ridge').

Note that ridge regression can sometimes degrade the denoising performance of Patch2Self. If it does, use model='ols'.
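As a generic illustration of how the two regression models differ, the sketch below fits the same synthetic data with and without an L2 penalty using the closed-form solution. It does not reproduce how DIPY fits its models; the function fit_linear and the penalty value alpha=50.0 are assumptions chosen for the example.

```python
import numpy as np


def fit_linear(X, y, alpha=0.0):
    """Closed-form least squares with an optional L2 penalty (no intercept).

    alpha=0 gives ordinary least squares; alpha>0 gives ridge regression.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ y)


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=200)

w_ols = fit_linear(X, y, alpha=0.0)
w_ridge = fit_linear(X, y, alpha=50.0)

# Ridge shrinks the coefficients towards zero relative to OLS.
print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The shrinkage introduced by the penalty biases the fit, which is one plausible reason ridge regression could change the denoised output compared to OLS.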

The array denoised_arr contains the denoised output obtained from Patch2Self.

Note

Depending on the acquisition, b0 may exhibit signal attenuation or other artefacts that are not ideal for any denoising algorithm. We therefore provide an option to skip denoising b0 volumes in the data. This can be done by using the option b0_denoising=False within Patch2Self.

Set shift_intensity=True and clip_negative_vals=False, as in the call above, to avoid negative values in the denoised output.
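The two general strategies for handling negative values can be illustrated on a toy array. This is only a sketch of the idea of shifting versus clipping: the helper names are hypothetical, and the exact adjustment DIPY applies internally is not shown here.

```python
import numpy as np


def shift_to_original_min(denoised_vol, original_vol):
    """Add a constant so the denoised minimum matches the original minimum."""
    return denoised_vol + (original_vol.min() - denoised_vol.min())


def clip_negatives(denoised_vol):
    """Set every negative value to zero and leave the rest untouched."""
    return np.clip(denoised_vol, 0, None)


original = np.array([0.0, 2.0, 10.0, 40.0])
denoised = np.array([-1.5, 1.0, 9.0, 38.0])  # regression output went negative

shifted = shift_to_original_min(denoised, original)
clipped = clip_negatives(denoised)

print(shifted)  # differences between voxels are preserved
print(clipped)  # only the negative voxel is changed
```

Shifting keeps the contrast between voxels intact, whereas clipping changes only the offending voxels and so alters their relation to their neighbours.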

The b0_threshold is used to separate the b0 volumes from the DWI volumes. It needs to be changed if the b0 volumes in the bval file have a b-value greater than the default b0_threshold.

The default value of b0_threshold in DIPY is 50. In some data, such as HCP 7T, the b0 volumes tend to have a higher b-value (>= 50) associated with them in the bval file. Check the b-values of the b0 volumes and adjust b0_threshold accordingly.
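A quick way to check the threshold is to count how many volumes fall at or below it. The sketch below uses made-up b-values and a hypothetical helper count_b0; it assumes that a volume counts as b0 when its b-value is less than or equal to the threshold.

```python
import numpy as np

# Made-up b-values: the nominal b0 volumes are stored as 5 and 60.
bvals = np.array([5.0, 60.0, 1000.0, 1000.0, 2000.0, 60.0])


def count_b0(bvals, b0_threshold=50):
    """Count volumes treated as b0, assuming b-value <= threshold."""
    return int(np.sum(bvals <= b0_threshold))


n_default = count_b0(bvals)                   # threshold of 50
n_raised = count_b0(bvals, b0_threshold=60)   # threshold raised to 60

print(n_default, n_raised)
```

With the threshold of 50, the volumes stored with b=60 would be treated as diffusion-weighted; raising the threshold to 60 groups them with the other b0 volume.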

Now let’s visualize the output and the residuals obtained from the denoising.

# Get the center slice along the third axis and pick one volume of the 4D data.
sli = data.shape[2] // 2
gra = 60  # an arbitrary volume, corresponding to one gradient direction

orig = data[:, :, sli, gra]
den = denoised_arr[:, :, sli, gra]

# Compute the residuals (the square root of the squared difference is the
# absolute difference per voxel)
rms_diff = np.sqrt((orig - den) ** 2)

fig1, ax = plt.subplots(1, 3, figsize=(12, 6), subplot_kw={"xticks": [], "yticks": []})

fig1.subplots_adjust(hspace=0.3, wspace=0.05)

ax.flat[0].imshow(orig.T, cmap="gray", interpolation="none", origin="lower")
ax.flat[0].set_title("Original")
ax.flat[1].imshow(den.T, cmap="gray", interpolation="none", origin="lower")
ax.flat[1].set_title("Denoised Output")
ax.flat[2].imshow(rms_diff.T, cmap="gray", interpolation="none", origin="lower")
ax.flat[2].set_title("Residuals")

fig1.savefig("denoised_patch2self.png")
Figure: the original slice, the Patch2Self denoised output, and the residuals.

Patch2Self preserved anatomical detail. This can be verified visually by inspecting the residuals obtained above. Since the residuals show no visible anatomical structure, the denoiser appears to have preserved the underlying signal and removed mostly stochastic noise.
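Visual inspection can be complemented with a rough numerical check. The sketch below is an assumption of this rewrite rather than part of the tutorial's workflow: it computes the correlation between an image and a signed residual on synthetic data. On real data one could pass den and orig - den instead; the function name residual_signal_correlation is hypothetical.

```python
import numpy as np


def residual_signal_correlation(image_slice, residual_slice):
    """Pearson correlation between an image and a signed residual."""
    a = image_slice.ravel().astype(float)
    b = residual_slice.ravel().astype(float)
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(2)
yy, xx = np.mgrid[0:64, 0:64]

# A smooth blob stands in for anatomical structure.
structure = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / 200.0)

# Residual containing only noise versus one that also removed some structure.
noise_only = 0.1 * rng.normal(size=structure.shape)
leaky = noise_only + 0.3 * structure

r_noise = residual_signal_correlation(structure, noise_only)
r_leaky = residual_signal_correlation(structure, leaky)
print(f"noise-only residual: {r_noise:.3f}, leaky residual: {r_leaky:.3f}")
```

A correlation near zero is consistent with a residual that contains no structure, while a clearly non-zero value suggests that some signal was removed along with the noise.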

Below we show how the denoised data can be saved.

save_nifti("denoised_patch2self.nii.gz", denoised_arr, affine)

You can also denoise the data with version 1 of Patch2Self by passing the version argument. The default version is 3. To use version 1, call patch2self as follows:

patch2self(data, bvals, version=1)

Lastly, Patch2Self can be run in batches if the number of gradient directions is very high (>= 200 volumes). For instance, if the data has 300 volumes, it can be split into 2 batches of 150 directions each and still give the same denoising performance. The batches can be denoised as follows:

denoised_batch1 = patch2self(data[..., :150], bvals[:150])
denoised_batch2 = patch2self(data[..., 150:], bvals[150:])

After doing this, the 2 denoised batches can be merged as follows:

denoised_p2s = np.concatenate((denoised_batch1, denoised_batch2), axis=3)

One can also consider using the above batching approach to denoise each shell separately if working with multi-shell data.
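The per-shell variant can be sketched as follows. The helper denoise_per_shell, its round_to parameter, and the stand-in denoiser are all hypothetical: shells are identified by rounding b-values to the nearest 100, volumes at or below b0_threshold are grouped together, and whether each batch should also include b0 volumes is left open. In real use, patch2self could be passed as denoise_fn.

```python
import numpy as np


def denoise_per_shell(data, bvals, denoise_fn, b0_threshold=50, round_to=100):
    """Apply denoise_fn to each shell separately and reassemble in order."""
    bvals = np.asarray(bvals, dtype=float)
    # Group b0 volumes as shell 0 and round the rest to the nearest shell.
    shells = np.where(
        bvals <= b0_threshold, 0.0, np.round(bvals / round_to) * round_to
    )
    out = np.empty(data.shape, dtype=float)
    for shell in np.unique(shells):
        idx = np.flatnonzero(shells == shell)
        # Denoise only the volumes of this shell, then put them back in place.
        out[..., idx] = denoise_fn(data[..., idx], bvals[idx])
    return out


# Stand-in denoiser that records batch sizes and returns its input unchanged.
batch_sizes = []


def fake_denoiser(batch, batch_bvals):
    batch_sizes.append(len(batch_bvals))
    return batch.astype(float)


rng = np.random.default_rng(3)
toy_data = rng.normal(size=(2, 2, 2, 7))
toy_bvals = [0, 995, 1005, 2000, 5, 1990, 1000]

result = denoise_per_shell(toy_data, toy_bvals, fake_denoiser)
print(sorted(batch_sizes))
```

Writing the results back by index keeps the volumes in their original order, so the output still lines up with the original bval and bvec files.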

References

Total running time of the script: (4 minutes 25.996 seconds)
