stats¶

Module: stats.analysis¶
- AveragePointwiseEuclideanMetric: Computes the average of pointwise Euclidean distances between two sequential data.
- QuickBundles: Clusters streamlines using QuickBundles [Garyfallidis12].
- Streamlines: alias of nibabel.streamlines.array_sequence.ArraySequence
- cKDTree: kd-tree for quick nearest-neighbor lookup.
- afq_profile: Calculates a summarized profile of data for a bundle or tract along its length.
- bundle_analysis: Applies statistical analysis on bundles and saves the results in a directory specified by out_dir.
- dti_measures: Calculates DTI measures (e.g. FA, MD) per point on streamlines and saves them in an HDF5 file.
- gaussian_weights: Calculate weights for each streamline/node in a bundle, based on a Mahalanobis distance from the core of the bundle at that node (mean, by default).
- load_nifti: Load data (and optionally the image, voxel size, and coordinates) from a NIfTI file.
- load_peaks: Load a PeaksAndMetrics HDF5 file (PAM5).
- load_tractogram: Load the stateful tractogram from any format (trk, tck, fib, dpy).
- mahalanobis: Compute the Mahalanobis distance between two 1-D arrays.
- map_coordinates: Map the input array to new coordinates by interpolation.
- optional_package: Return package-like thing and module setup for package name.
- orient_by_streamline: Orient a bundle of streamlines to a standard streamline.
- peak_values: Finds the peak direction and peak value at each point on a streamline and saves them in an HDF5 file.
- set_number_of_points: Change the number of points of streamlines (either by downsampling or upsampling).
- transform_streamlines: Apply affine transformation to streamlines.
- values_from_volume: Extract values of a scalar/vector along each streamline from a volume.
AveragePointwiseEuclideanMetric¶

class dipy.stats.analysis.AveragePointwiseEuclideanMetric¶

Bases: dipy.segment.metricspeed.SumPointwiseEuclideanMetric
Computes the average of pointwise Euclidean distances between two sequential data.
A sequence of N-dimensional points is represented as a 2D array with shape (nb_points, nb_dimensions). A feature object can be specified in order to calculate the distance between the features, rather than directly between the sequential data.
- Parameters
- feature : Feature object, optional
It is used to extract features before computing the distance.
Notes
The distance between two 2D sequential data:
s1         s2
0*----a----*0
1*----b----*1
2*----c----*2
is equal to \((a+b+c)/3\) where \(a\) is the Euclidean distance between s1[0] and s2[0], \(b\) between s1[1] and s2[1] and \(c\) between s1[2] and s2[2].
- Attributes
- feature
Feature object used to extract features from sequential data
- is_order_invariant
Is this metric invariant to the sequence’s ordering
Methods
are_compatible
Checks if features can be used by metric.dist based on their shape.
dist
Computes a distance between two data points based on their features.
__init__($self, /, *args, **kwargs)¶

Initialize self. See help(type(self)) for accurate signature.
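A minimal usage sketch (illustrative values; with the default identity feature, dist compares the point arrays directly):
>>> import numpy as np
>>> from dipy.segment.metric import AveragePointwiseEuclideanMetric
>>> metric = AveragePointwiseEuclideanMetric()
>>> s1 = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=np.float32)
>>> s2 = np.array([[0, 1, 0], [1, 1, 0], [2, 1, 0]], dtype=np.float32)
>>> # Each pair of corresponding points is 1.0 apart, so the average is 1.0.
>>> metric.dist(metric.feature.extract(s1), metric.feature.extract(s2))
1.0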
QuickBundles¶

class dipy.stats.analysis.QuickBundles(threshold, metric='MDF_12points', max_nb_clusters=2147483647)¶

Bases: dipy.segment.clustering.Clustering
Clusters streamlines using QuickBundles [Garyfallidis12].
Given a list of streamlines, the QuickBundles algorithm sequentially assigns each streamline to its closest bundle in \(\mathcal{O}(Nk)\) where \(N\) is the number of streamlines and \(k\) is the final number of bundles. If for a given streamline its closest bundle is farther than threshold, a new bundle is created and the streamline is assigned to it except if the number of bundles has already exceeded max_nb_clusters.
- Parameters
- threshold : float
The maximum distance from a bundle for a streamline to be still considered as part of it.
- metric : str or Metric object, optional
The distance metric to use when comparing two streamlines. By default, the Minimum average Direct-Flip (MDF) distance [Garyfallidis12] is used and streamlines are automatically resampled so they have 12 points.
- max_nb_clusters : int
Limits the creation of bundles.
References
- [Garyfallidis12]
Garyfallidis E. et al., “QuickBundles, a method for tractography simplification”, Frontiers in Neuroscience, vol 6, no 175, 2012.
Examples
>>> from dipy.segment.clustering import QuickBundles
>>> from dipy.data import get_fnames
>>> from dipy.io.streamline import load_tractogram
>>> from dipy.tracking.streamline import Streamlines
>>> fname = get_fnames('fornix')
>>> fornix = load_tractogram(fname, 'same',
...                          bbox_valid_check=False).streamlines
>>> streamlines = Streamlines(fornix)
>>> # Segment fornix with a threshold of 10mm and streamlines resampled
>>> # to 12 points.
>>> qb = QuickBundles(threshold=10.)
>>> clusters = qb.cluster(streamlines)
>>> len(clusters)
4
>>> list(map(len, clusters))
[61, 191, 47, 1]
>>> # Resampling streamlines differently is done explicitly as follows.
>>> # Note this has an impact on the speed and the accuracy (tradeoff).
>>> from dipy.segment.metric import ResampleFeature
>>> from dipy.segment.metric import AveragePointwiseEuclideanMetric
>>> feature = ResampleFeature(nb_points=2)
>>> metric = AveragePointwiseEuclideanMetric(feature)
>>> qb = QuickBundles(threshold=10., metric=metric)
>>> clusters = qb.cluster(streamlines)
>>> len(clusters)
4
>>> list(map(len, clusters))
[58, 142, 72, 28]
Methods
cluster(streamlines[, ordering])
Clusters streamlines into bundles.

__init__(threshold, metric='MDF_12points', max_nb_clusters=2147483647)¶

Initialize self. See help(type(self)) for accurate signature.
cluster(streamlines, ordering=None)¶

Clusters streamlines into bundles.
Performs the QuickBundles algorithm using the predefined metric and threshold.
- Parameters
- streamlines : list of 2D arrays
Each 2D array represents a sequence of 3D points (points, 3).
- ordering : iterable of indices
Specifies the order in which data points will be clustered.
- Returns
- `ClusterMapCentroid` object
Result of the clustering.
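A short follow-on sketch (continuing the fornix example above, with the default 12-point MDF metric): each cluster exposes its centroid and the indices of its member streamlines.
>>> qb = QuickBundles(threshold=10.)
>>> clusters = qb.cluster(streamlines)
>>> largest = max(clusters, key=len)
>>> largest.centroid.shape  # centroids are resampled to 12 points by default
(12, 3)
>>> len(largest.indices) == len(largest)  # indices refer back into `streamlines`
True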
Streamlines¶

dipy.stats.analysis.Streamlines¶

alias of nibabel.streamlines.array_sequence.ArraySequence
cKDTree¶

class dipy.stats.analysis.cKDTree(data, leafsize=16, compact_nodes=True, copy_data=False, balanced_tree=True, boxsize=None)¶

Bases: object
kd-tree for quick nearest-neighbor lookup
This class provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point.
The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kd-tree is a binary trie, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.
During construction, the axis and splitting point are chosen by the “sliding midpoint” rule, which ensures that the cells do not all become long and thin.
The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.
For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science.
- Parameters
- data : array_like, shape (n,m)
The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles, and so modifying this data will result in bogus results. The data are also copied if the kd-tree is built with copy_data=True.
- leafsize : positive int, optional
The number of points at which the algorithm switches over to brute-force. Default: 16.
- compact_nodes : bool, optional
If True, the kd-tree is built to shrink the hyperrectangles to the actual data range. This usually gives a more compact tree that is robust against degenerated input data and gives faster queries at the expense of longer build time. Default: True.
- copy_data : bool, optional
If True the data is always copied to protect the kd-tree against data corruption. Default: False.
- balanced_tree : bool, optional
If True, the median is used to split the hyperrectangles instead of the midpoint. This usually gives a more compact tree and faster queries at the expense of longer build time. Default: True.
- boxsize : array_like or scalar, optional
Apply a m-d toroidal topology to the KDTree. The topology is generated by \(x_i + n_i L_i\) where \(n_i\) are integers and \(L_i\) is the boxsize along the i-th dimension. The input data shall be wrapped into \([0, L_i)\). A ValueError is raised if any of the data is outside of this bound.
See also
KDTree
Implementation of cKDTree in pure Python
- Attributes
- data : ndarray, shape (n,m)
The n data points of dimension m to be indexed. This array is not copied unless this is necessary to produce a contiguous array of doubles. The data are also copied if the kd-tree is built with copy_data=True.
- leafsize : positive int
The number of points at which the algorithm switches over to brute-force.
- m : int
The dimension of a single data-point.
- n : int
The number of data points.
- maxes : ndarray, shape (m,)
The maximum value in each dimension of the n data points.
- mins : ndarray, shape (m,)
The minimum value in each dimension of the n data points.
- tree : object, class cKDTreeNode
This class exposes a Python view of the root node in the cKDTree object.
- size : int
The number of nodes in the tree.
Methods
count_neighbors(self, other, r[, p, …])
Count how many nearby pairs can be formed.
query(self, x[, k, eps, p, …])
Query the kd-tree for nearest neighbors.
query_ball_point(self, x, r[, p, eps])
Find all points within distance r of point(s) x.
query_ball_tree(self, other, r[, p, eps])
Find all pairs of points whose distance is at most r.
query_pairs(self, r[, p, eps])
Find all pairs of points whose distance is at most r.
sparse_distance_matrix(self, other, max_distance)
Compute a sparse distance matrix.
__init__($self, /, *args, **kwargs)¶

Initialize self. See help(type(self)) for accurate signature.

boxsize¶
count_neighbors(self, other, r, p=2., weights=None, cumulative=True)¶

Count how many nearby pairs can be formed (pair-counting).
Count the number of pairs (x1, x2) that can be formed, with x1 drawn from self and x2 drawn from other, and where distance(x1, x2, p) <= r.
Data points on self and other are optionally weighted by the weights argument (see below).
The algorithm we implement here is based on [1]. See notes for further discussion.
- Parameters
- other : cKDTree instance
The other tree to draw points from; can be the same tree as self.
- r : float or one-dimensional array of floats
The radius to produce a count for. Multiple radii are searched with a single tree traversal. If the count is non-cumulative (cumulative=False), r defines the edges of the bins, and must be non-decreasing.
- p : float, optional
1<=p<=infinity. Which Minkowski p-norm to use. Default 2.0.
- weights : tuple, array_like, or None, optional
If None, the pair-counting is unweighted. If given as a tuple, weights[0] is the weights of points in self, and weights[1] is the weights of points in other; either can be None to indicate the points are unweighted. If given as an array_like, weights is the weights of points in self and other. For this to make sense, self and other must be the same tree. If self and other are two different trees, a ValueError is raised. Default: None
- cumulative : bool, optional
Whether the returned counts are cumulative. When cumulative is set to False the algorithm is optimized to work with a large number of bins (>10) specified by r. When cumulative is set to True, the algorithm is optimized to work with a small number of r. Default: True
- Returns
- result : scalar or 1-D array
The number of pairs. For unweighted counts, the result is integer. For weighted counts, the result is float. If cumulative is False, result[i] contains the counts with (-inf if i == 0 else r[i-1]) < R <= r[i].
Notes
Pair-counting is the basic operation used to calculate the two point correlation functions from a data set composed of position of objects.
The two point correlation function measures the clustering of objects and is widely used in cosmology to quantify the large scale structure in our Universe, but it may be useful for data analysis in other fields where self-similar assembly of objects also occurs.
The Landy-Szalay estimator for the two point correlation function of D measures the clustering signal in D. [2]
For example, given the positions of two sets of objects, objects D (data) containing the clustering signal and objects R (random) containing no signal,
\[\xi(r) = \frac{\langle D, D \rangle - 2 f \langle D, R \rangle + f^2 \langle R, R \rangle}{f^2 \langle R, R \rangle},\]
where the brackets represent counting pairs between two data sets in a finite bin around r (distance), corresponding to setting cumulative=False, and f = float(len(D)) / float(len(R)) is the ratio between the number of objects from data and random.
The algorithm implemented here is loosely based on the dual-tree algorithm described in [1]. We switch between two different pair-cumulation schemes depending on the setting of cumulative. The computing time of the method we use when cumulative == False does not scale with the total number of bins. The algorithm for cumulative == True scales linearly with the number of bins, though it is slightly faster when only 1 or 2 bins are used. [5]
As an extension to the naive pair-counting, weighted pair-counting counts the product of weights instead of the number of pairs. Weighted pair-counting is used to estimate marked correlation functions ([3], section 2.2), or to properly calculate the average of data per distance bin (e.g. [4], section 2.1 on redshift).
- [1]
Gray and Moore, “N-body problems in statistical learning”, Mining the Sky, 2000, https://arxiv.org/abs/astro-ph/0012333
- [2]
Landy and Szalay, “Bias and variance of angular correlation functions”, The Astrophysical Journal, 1993, http://adsabs.harvard.edu/abs/1993ApJ...412...64L
- [3]
Sheth, Connolly and Skibba, “Marked correlations in galaxy formation models”, arXiv e-print, 2005, https://arxiv.org/abs/astro-ph/0511773
- [4]
Hawkins, et al., “The 2dF Galaxy Redshift Survey: correlation functions, peculiar velocities and the matter density of the Universe”, Monthly Notices of the Royal Astronomical Society, 2002, http://adsabs.harvard.edu/abs/2003MNRAS.346...78H
- [5]
https://github.com/scipy/scipy/pull/5647#issuecomment-168474926
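A quick illustrative sketch (synthetic random points; the exact counts depend on the seed):
>>> import numpy as np
>>> from scipy.spatial import cKDTree
>>> rng = np.random.RandomState(0)
>>> tree1 = cKDTree(rng.random_sample((100, 3)))
>>> tree2 = cKDTree(rng.random_sample((100, 3)))
>>> # Number of pairs (x1, x2), x1 from tree1 and x2 from tree2, within r=0.2.
>>> count = tree1.count_neighbors(tree2, 0.2)
>>> # Non-cumulative counts in bins with edges 0.1, 0.2, 0.3:
>>> bins = tree1.count_neighbors(tree2, [0.1, 0.2, 0.3], cumulative=False)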
data¶

indices¶

leafsize¶

m¶

maxes¶

mins¶

n¶

query(self, x, k=1, eps=0, p=2, distance_upper_bound=np.inf, n_jobs=1)¶

Query the kd-tree for nearest neighbors.
- Parameters
- x : array_like, last dimension self.m
An array of points to query.
- k : list of integer or integer
The list of k-th nearest neighbors to return. If k is an integer it is treated as a list of [1, … k] (range(1, k+1)). Note that the counting starts from 1.
- eps : non-negative float
Return approximate nearest neighbors; the k-th returned value is guaranteed to be no further than (1+eps) times the distance to the real k-th nearest neighbor.
- p : float, 1<=p<=infinity
Which Minkowski p-norm to use. 1 is the sum-of-absolute-values “Manhattan” distance, 2 is the usual Euclidean distance, infinity is the maximum-coordinate-difference distance.
- distance_upper_bound : nonnegative float
Return only neighbors within this distance. This is used to prune tree searches, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.
- n_jobs : int, optional
Number of jobs to schedule for parallel processing. If -1 is given all processors are used. Default: 1.
- Returns
- d : array of floats
The distances to the nearest neighbors. If x has shape tuple+(self.m,), then d has shape tuple+(k,). When k == 1, the last dimension of the output is squeezed. Missing neighbors are indicated with infinite distances.
- i : ndarray of ints
The locations of the neighbors in self.data. If x has shape tuple+(self.m,), then i has shape tuple+(k,). When k == 1, the last dimension of the output is squeezed. Missing neighbors are indicated with self.n.
Notes
If the KD-Tree is periodic, the position x is wrapped into the box.
When the input k is a list, a query for arange(max(k)) is performed, but only columns that store the requested values of k are preserved. This is implemented in a manner that reduces memory usage.
Examples
>>> import numpy as np
>>> from scipy.spatial import cKDTree
>>> x, y = np.mgrid[0:5, 2:8]
>>> tree = cKDTree(np.c_[x.ravel(), y.ravel()])
To query the nearest neighbours and return squeezed result, use
>>> dd, ii = tree.query([[0, 0], [2.1, 2.9]], k=1)
>>> print(dd, ii)
[2. 0.14142136] [ 0 13]
To query the nearest neighbours and return unsqueezed result, use
>>> dd, ii = tree.query([[0, 0], [2.1, 2.9]], k=[1])
>>> print(dd, ii)
[[2.        ]
 [0.14142136]] [[ 0]
 [13]]
To query the second nearest neighbours and return unsqueezed result, use
>>> dd, ii = tree.query([[0, 0], [2.1, 2.9]], k=[2])
>>> print(dd, ii)
[[2.23606798]
 [0.90553851]] [[ 6]
 [12]]
To query the first and second nearest neighbours, use
>>> dd, ii = tree.query([[0, 0], [2.1, 2.9]], k=2)
>>> print(dd, ii)
[[2.         2.23606798]
 [0.14142136 0.90553851]] [[ 0  6]
 [13 12]]
or, be more specific
>>> dd, ii = tree.query([[0, 0], [2.1, 2.9]], k=[1, 2])
>>> print(dd, ii)
[[2.         2.23606798]
 [0.14142136 0.90553851]] [[ 0  6]
 [13 12]]
query_ball_point(self, x, r, p=2., eps=0)¶

Find all points within distance r of point(s) x.
- Parameters
- x : array_like, shape tuple + (self.m,)
The point or points to search for neighbors of.
- r : positive float
The radius of points to return.
- p : float, optional
Which Minkowski p-norm to use. Should be in the range [1, inf].
- eps : nonnegative float, optional
Approximate search. Branches of the tree are not explored if their nearest points are further than r / (1 + eps), and branches are added in bulk if their furthest points are nearer than r * (1 + eps).
- n_jobs : int, optional
Number of jobs to schedule for parallel processing. If -1 is given all processors are used. Default: 1.
- Returns
- results : list or array of lists
If x is a single point, returns a list of the indices of the neighbors of x. If x is an array of points, returns an object array of shape tuple containing lists of neighbors.
Notes
If you have many points whose neighbors you want to find, you may save substantial amounts of time by putting them in a cKDTree and using query_ball_tree.
Examples
>>> import numpy as np
>>> from scipy import spatial
>>> x, y = np.mgrid[0:4, 0:4]
>>> points = np.c_[x.ravel(), y.ravel()]
>>> tree = spatial.cKDTree(points)
>>> tree.query_ball_point([2, 0], 1)
[4, 8, 9, 12]
query_ball_tree(self, other, r, p=2., eps=0)¶

Find all pairs of points whose distance is at most r.
- Parameters
- other : cKDTree instance
The tree containing points to search against.
- r : float
The maximum distance, has to be positive.
- p : float, optional
Which Minkowski norm to use. p has to meet the condition 1 <= p <= infinity.
- eps : float, optional
Approximate search. Branches of the tree are not explored if their nearest points are further than r/(1+eps), and branches are added in bulk if their furthest points are nearer than r * (1+eps). eps has to be non-negative.
- Returns
- results : list of lists
For each element self.data[i] of this tree, results[i] is a list of the indices of its neighbors in other.data.
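A small deterministic sketch (illustrative coordinates):
>>> from scipy.spatial import cKDTree
>>> tree_a = cKDTree([[0., 0.], [1., 0.]])
>>> tree_b = cKDTree([[0., 0.5], [5., 5.]])
>>> # For each point of tree_a, the tree_b indices within distance 1.2.
>>> tree_a.query_ball_tree(tree_b, 1.2)
[[0], [0]]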
query_pairs(self, r, p=2., eps=0, output_type='set')¶

Find all pairs of points whose distance is at most r.
- Parameters
- r : positive float
The maximum distance.
- p : float, optional
Which Minkowski norm to use. p has to meet the condition 1 <= p <= infinity.
- eps : float, optional
Approximate search. Branches of the tree are not explored if their nearest points are further than r/(1+eps), and branches are added in bulk if their furthest points are nearer than r * (1+eps). eps has to be non-negative.
- output_type : string, optional
Choose the output container, ‘set’ or ‘ndarray’. Default: ‘set’
- Returns
- results : set or ndarray
Set of pairs (i,j), with i < j, for which the corresponding positions are close. If output_type is ‘ndarray’, an ndarray is returned instead of a set.
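A small deterministic sketch (illustrative coordinates):
>>> from scipy.spatial import cKDTree
>>> tree = cKDTree([[0., 0.], [0., 1.], [5., 5.]])
>>> # Only points 0 and 1 lie within distance 1.5 of each other.
>>> sorted(tree.query_pairs(1.5))
[(0, 1)]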
size¶

sparse_distance_matrix(self, other, max_distance, p=2.)¶

Compute a sparse distance matrix.
Computes a distance matrix between two cKDTrees, leaving as zero any distance greater than max_distance.
- Parameters
- other : cKDTree
- max_distance : positive float
- p : float, 1<=p<=infinity
Which Minkowski p-norm to use.
- output_type : string, optional
Which container to use for output data. Options: ‘dok_matrix’, ‘coo_matrix’, ‘dict’, or ‘ndarray’. Default: ‘dok_matrix’.
- Returns
- result : dok_matrix, coo_matrix, dict or ndarray
Sparse matrix representing the results in “dictionary of keys” format. If a dict is returned the keys are (i,j) tuples of indices. If output_type is ‘ndarray’ a record array with fields ‘i’, ‘j’, and ‘v’ is returned.
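A small deterministic sketch (illustrative coordinates):
>>> from scipy.spatial import cKDTree
>>> t1 = cKDTree([[0., 0.], [1., 0.]])
>>> t2 = cKDTree([[0., 0.5]])
>>> sdm = t1.sparse_distance_matrix(t2, 1.0)
>>> sdm[0, 0]  # distance from t1 point 0 to t2 point 0; other pairs exceed 1.0
0.5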
tree¶

afq_profile¶

dipy.stats.analysis.afq_profile(data, bundle, affine, n_points=100, orient_by=None, weights=None, **weights_kwarg)¶

Calculates a summarized profile of data for a bundle or tract along its length.
Follows the approach outlined in [Yeatman2012].
- Parameters
- data : 3D volume
The statistic to sample with the streamlines.
- bundle : StreamLines class instance
The collection of streamlines (possibly already resampled into an array for each to have the same length) with which we are resampling. See Note below about orienting the streamlines.
- affine : array_like (4, 4)
The mapping from voxel coordinates to streamline points. The voxel_to_rasmm matrix, typically from a NIFTI file.
- n_points : int, optional
The number of points to sample along the bundle. Default: 100.
- orient_by : streamline, optional
A streamline to use as a standard to orient all of the streamlines in the bundle according to.
- weights : 1D array or 2D array or callable, optional
Weight each streamline (1D) or each node (2D) when calculating the tract-profiles. Must sum to 1 across streamlines (in each node if relevant). If callable, this is a function that calculates weights.
- weights_kwarg : key-word arguments
Additional key-word arguments to pass to the weight-calculating function. Only to be used if weights is a callable.
- Returns
- ndarray : a 1D array with the profile of data along the length of bundle
Notes
Before providing a bundle as input to this function, you will need to make sure that the streamlines in the bundle are all oriented in the same orientation relative to the bundle (use orient_by_streamline()).
References
- [Yeatman2012]
Yeatman, J.D., Dougherty, R.F., Myall, N.J., Wandell, B.A., Feldman, H.M., 2012. Tract profiles of white matter properties: automating fiber-tract quantification. PLoS ONE, 7(11), e49790.
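A minimal synthetic sketch (illustrative array values; a real call would use a scalar map such as FA together with its NIfTI affine):
>>> import numpy as np
>>> from dipy.stats.analysis import afq_profile
>>> data = np.ones((10, 10, 10))  # constant synthetic "metric" volume
>>> bundle = [np.array([[1., 1., 1.], [3., 3., 3.], [5., 5., 5.]]),
...           np.array([[1., 1., 2.], [3., 3., 4.], [5., 5., 6.]])]
>>> profile = afq_profile(data, bundle, affine=np.eye(4), n_points=20)
>>> profile.shape
(20,)
>>> np.allclose(profile, 1.0)  # a constant volume yields a flat profile
True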
bundle_analysis¶

dipy.stats.analysis.bundle_analysis(model_bundle_folder, bundle_folder, orig_bundle_folder, metric_folder, group, subject, no_disks=100, out_dir='')¶

Applies statistical analysis on bundles and saves the results in a directory specified by out_dir.
- Parameters
- model_bundle_folder : string
Path to the input model bundle files. This path may contain wildcards to process multiple inputs at once.
- bundle_folder : string
Path to the input bundle files in common space. This path may contain wildcards to process multiple inputs at once.
- orig_bundle_folder : string
Path to the input bundle files in native space. This path may contain wildcards to process multiple inputs at once.
- metric_folder : string
Path to the input dti metric and/or peak files. It will be used as the metric for statistical analysis of bundles.
- group : string
What group the subject belongs to, e.g. control or patient.
- subject : string
Subject id, e.g. 10001.
- no_disks : integer, optional
Number of disks used for dividing bundle into disks. (Default 100)
- out_dir : string, optional
Output directory. (Default: input file directory)
References
- [Chandio19]
Chandio, B.Q., Koudoro, S., Reagan, D., Harezlak, J., Garyfallidis, E., “Bundle Analytics: a computational and statistical analyses framework for tractometric studies”, Proceedings of: International Society of Magnetic Resonance in Medicine (ISMRM), Montreal, Canada, 2019.
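A hedged call sketch (all folder paths are hypothetical placeholders for outputs of a bundle-recognition workflow):
>>> from dipy.stats.analysis import bundle_analysis
>>> bundle_analysis('model_bundles/', 'rec_bundles/', 'org_bundles/',
...                 'metrics/', group='patient', subject='10001',
...                 no_disks=100, out_dir='results/')  # doctest: +SKIP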
dti_measures¶

dipy.stats.analysis.dti_measures(bundle, metric, dt, pname, bname, subject, group, ind, dir)¶

Calculates DTI measures (e.g. FA, MD) per point on streamlines and saves them in an HDF5 file.
- Parameters
- bundle : string
Name of bundle being analyzed.
- metric : matrix of float values
DTI metric, e.g. FA, MD.
- dt : DataFrame
DataFrame to be populated.
- pname : string
Name of the DTI metric.
- bname : string
Name of bundle being analyzed.
- subject : string
Subject number as a string (e.g. 10001).
- group : string
Which group the subject belongs to (e.g. patient or control).
- ind : integer list
Tells which disk number a point belongs to.
- dir : string
Path of output directory.
gaussian_weights¶

dipy.stats.analysis.gaussian_weights(bundle, n_points=100, return_mahalnobis=False, stat=<function mean>)¶

Calculate weights for each streamline/node in a bundle, based on a Mahalanobis distance from the core of the bundle at that node (mean, by default).
- Parameters
- bundle : Streamlines
The streamlines to weight.
- n_points : int, optional
The number of points to resample to. If the bundle is an array, this input is ignored. Default: 100.
- Returns
- w : array of shape (n_streamlines, n_points)
Weights for each node in each streamline, calculated as the relative inverse of that node's Mahalanobis distance with respect to the distribution of coordinates at that node position across streamlines.
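A synthetic sketch (randomly jittered copies of one streamline; the exact weights depend on the seed):
>>> import numpy as np
>>> from dipy.stats.analysis import gaussian_weights
>>> rng = np.random.RandomState(0)
>>> core = np.array([[0., 0., 0.], [1., 1., 1.], [2., 2., 2.]])
>>> bundle = [core + rng.randn(3, 3) * 0.1 for _ in range(20)]
>>> w = gaussian_weights(bundle, n_points=3)
>>> w.shape
(20, 3)
>>> np.allclose(w.sum(axis=0), 1.0)  # weights are normalized at each node
True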
load_nifti¶

dipy.stats.analysis.load_nifti(fname, return_img=False, return_voxsize=False, return_coords=False)¶
load_peaks¶

dipy.stats.analysis.load_peaks(fname, verbose=False)¶

Load a PeaksAndMetrics HDF5 file (PAM5)
- Parameters
- fname : string
Filename of PAM5 file.
- verbose : bool
Print summary information about the loaded file.
- Returns
- pam : PeaksAndMetrics object
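A hedged usage sketch (the filename is a hypothetical placeholder):
>>> from dipy.io.peaks import load_peaks
>>> pam = load_peaks('peaks.pam5', verbose=True)  # doctest: +SKIP
>>> pam.peak_dirs.shape  # expected (X, Y, Z, n_peaks, 3)  # doctest: +SKIP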
load_tractogram¶

dipy.stats.analysis.load_tractogram(filename, reference, to_space=<Space.RASMM: 'rasmm'>, shifted_origin=False, bbox_valid_check=True, trk_header_check=True)¶

Load the stateful tractogram from any format (trk, tck, fib, dpy)
- Parameters
- filename : string
Filename with valid extension.
- reference : Nifti or Trk filename, Nifti1Image or TrkFile, Nifti1Header or trk.header (dict), or ‘same’ if the input is a trk file
Reference that provides the spatial attributes. Typically a nifti-related object from the native diffusion used for streamlines generation.
- to_space : string
Space in which the streamlines will be transformed after loading (vox, voxmm or rasmm).
- shifted_origin : bool
Information on the position of the origin: False is the Trackvis standard, default (center of the voxel); True is the NIFTI standard (corner of the voxel).
- Returns
- output : StatefulTractogram
The tractogram to load (must have been saved properly)
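A hedged sketch (hypothetical filename; 'same' works here because the reference is taken from the trk header):
>>> from dipy.io.streamline import load_tractogram
>>> sft = load_tractogram('bundle.trk', 'same',
...                       bbox_valid_check=False)  # doctest: +SKIP
>>> streamlines = sft.streamlines  # doctest: +SKIP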
mahalanobis¶

dipy.stats.analysis.mahalanobis(u, v, VI)¶

Compute the Mahalanobis distance between two 1-D arrays.
The Mahalanobis distance between 1-D arrays u and v is defined as
\[\sqrt{ (u-v) V^{-1} (u-v)^T }\]
where V is the covariance matrix. Note that the argument VI is the inverse of V.
- Parameters
- u : (N,) array_like
Input array.
- v : (N,) array_like
Input array.
- VI : ndarray
The inverse of the covariance matrix.
- Returns
- mahalanobis : double
The Mahalanobis distance between vectors u and v.
Examples
>>> from scipy.spatial import distance
>>> iv = [[1, 0.5, 0.5], [0.5, 1, 0.5], [0.5, 0.5, 1]]
>>> distance.mahalanobis([1, 0, 0], [0, 1, 0], iv)
1.0
>>> distance.mahalanobis([0, 2, 0], [0, 1, 0], iv)
1.0
>>> distance.mahalanobis([2, 0, 0], [0, 1, 0], iv)
1.7320508075688772
map_coordinates¶

dipy.stats.analysis.map_coordinates(input, coordinates, output=None, order=3, mode='constant', cval=0.0, prefilter=True)¶

Map the input array to new coordinates by interpolation.
The array of coordinates is used to find, for each point in the output, the corresponding coordinates in the input. The value of the input at those coordinates is determined by spline interpolation of the requested order.
The shape of the output is derived from that of the coordinate array by dropping the first axis. The values of the array along the first axis are the coordinates in the input array at which the output value is found.
- Parameters
- input : array_like
The input array.
- coordinates : array_like
The coordinates at which input is evaluated.
- output : array or dtype, optional
The array in which to place the output, or the dtype of the returned array. By default an array of the same dtype as input will be created.
- order : int, optional
The order of the spline interpolation, default is 3. The order has to be in the range 0-5.
- mode : {‘reflect’, ‘constant’, ‘nearest’, ‘mirror’, ‘wrap’}, optional
The mode parameter determines how the input array is extended when the filter overlaps a border. Default is ‘constant’. Behavior for each valid value is as follows:
- ‘reflect’ (d c b a | a b c d | d c b a)
The input is extended by reflecting about the edge of the last pixel.
- ‘constant’ (k k k k | a b c d | k k k k)
The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
- ‘nearest’ (a a a a | a b c d | d d d d)
The input is extended by replicating the last pixel.
- ‘mirror’ (d c b | a b c d | c b a)
The input is extended by reflecting about the center of the last pixel.
- ‘wrap’ (a b c d | a b c d | a b c d)
The input is extended by wrapping around to the opposite edge.
- cval : scalar, optional
Value to fill past edges of input if mode is ‘constant’. Default is 0.0.
- prefilter : bool, optional
Determines if the input array is prefiltered with spline_filter before interpolation. The default is True, which will create a temporary float64 array of filtered values if order > 1. If setting this to False, the output will be slightly blurred if order > 1, unless the input is prefiltered, i.e. it is the result of calling spline_filter on the original input.
- Returns
- map_coordinates : ndarray
The result of transforming the input. The shape of the output is derived from that of coordinates by dropping the first axis.
See also
spline_filter, geometric_transform, scipy.interpolate
Examples
>>> from scipy import ndimage
>>> import numpy as np
>>> a = np.arange(12.).reshape((4, 3))
>>> a
array([[  0.,   1.,   2.],
       [  3.,   4.,   5.],
       [  6.,   7.,   8.],
       [  9.,  10.,  11.]])
>>> ndimage.map_coordinates(a, [[0.5, 2], [0.5, 1]], order=1)
array([ 2.,  7.])
Above, the interpolated value of a[0.5, 0.5] gives output[0], while a[2, 1] is output[1].
>>> inds = np.array([[0.5, 2], [0.5, 4]])
>>> ndimage.map_coordinates(a, inds, order=1, cval=-33.3)
array([  2. , -33.3])
>>> ndimage.map_coordinates(a, inds, order=1, mode='nearest')
array([ 2.,  8.])
>>> ndimage.map_coordinates(a, inds, order=1, cval=0, output=bool)
array([ True, False], dtype=bool)
optional_package¶

dipy.stats.analysis.optional_package(name, trip_msg=None)¶

Return package-like thing and module setup for package name
- Parameters
- name : str
package name
- trip_msg : None or str
message to give when someone tries to use the return package, but we could not import it, and have returned a TripWire object instead. Default message if None.
- Returns
- pkg_like : module or TripWire instance
If we can import the package, return it. Otherwise return an object raising an error when accessed.
- have_pkg : bool
True if import for package was successful, false otherwise
- module_setup : function
callable usually set as setup_module in calling namespace, to allow skipping tests.
Examples
Typical use would be something like this at the top of a module using an optional package:
>>> from dipy.utils.optpkg import optional_package
>>> pkg, have_pkg, setup_module = optional_package('not_a_package')
Of course in this case the package doesn’t exist, and so, in the module:
>>> have_pkg False
and
>>> pkg.some_function() #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
    ...
TripWireError: We need package not_a_package for these functions, but ``import not_a_package`` raised an ImportError
If the module does exist - we get the module
>>> pkg, _, _ = optional_package('os')
>>> hasattr(pkg, 'path')
True
Or a submodule if that’s what we asked for
>>> subpkg, _, _ = optional_package('os.path')
>>> hasattr(subpkg, 'dirname')
True
orient_by_streamline¶

dipy.stats.analysis.orient_by_streamline(streamlines, standard, n_points=12, in_place=False, as_generator=False)¶

Orient a bundle of streamlines to a standard streamline.
- Parameters
- streamlines : Streamlines, list
The input streamlines to orient.
- standard : Streamlines, list, or ndarray
This provides the standard orientation according to which the streamlines in the provided bundle should be reoriented.
- n_points : int, optional
The number of samples to apply to each of the streamlines.
- in_place : bool
Whether to make the change in-place in the original input (and return a reference), or to make a copy of the list and return this copy, with the relevant streamlines reoriented. Default: False.
- as_generator : bool
Whether to return a generator as output. Default: False
- Returns
- Streamlines : with each individual array oriented to be as similar as possible to the standard.
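A small deterministic sketch: a reversed copy of the standard gets flipped back.
>>> import numpy as np
>>> from dipy.tracking.streamline import orient_by_streamline
>>> standard = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.]])
>>> flipped = standard[::-1].copy()
>>> oriented = orient_by_streamline([flipped], standard)
>>> np.allclose(oriented[0], standard)
True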
peak_values¶

dipy.stats.analysis.peak_values(bundle, peaks, dt, pname, bname, subject, group, ind, dir)¶

Finds the peak direction and peak value of each point on a streamline used while tracking (generating the tractogram) and saves them in an HDF5 file.
- Parameters
- bundle : string
Name of bundle being analyzed.
- peaks : peaks
Contains peak directions and values.
- dt : DataFrame
DataFrame to be populated.
- pname : string
Name of the dti metric.
- bname : string
Name of bundle being analyzed.
- subject : string
Subject number as a string (e.g. 10001).
- group : string
Which group the subject belongs to (e.g. patient or control).
- ind : integer list
Tells which disk number a point belongs to.
- dir : string
Path of output directory.
set_number_of_points¶

dipy.stats.analysis.set_number_of_points()¶

Change the number of points of streamlines (either by downsampling or upsampling).
Change the number of points of streamlines in order to obtain nb_points-1 segments of equal length. Points of streamlines will be modified along the curve.
- Parameters
- streamlines : ndarray or a list or dipy.tracking.Streamlines
If ndarray, must have shape (N,3) where N is the number of points of the streamline. If list, each item must be ndarray of shape (Ni,3) where Ni is the number of points of streamline i. If dipy.tracking.Streamlines, its common_shape must be 3.
- nb_points : int
Integer representing the number of points wanted along the curve.
- Returns
- new_streamlines : ndarray or a list or dipy.tracking.Streamlines
Results of the downsampling or upsampling process.
Examples
>>> from dipy.tracking.streamline import set_number_of_points
>>> import numpy as np
One streamline, a semi-circle:
>>> theta = np.pi*np.linspace(0, 1, 100)
>>> x = np.cos(theta)
>>> y = np.sin(theta)
>>> z = 0 * x
>>> streamline = np.vstack((x, y, z)).T
>>> modified_streamline = set_number_of_points(streamline, 3)
>>> len(modified_streamline)
3
Multiple streamlines:
>>> streamlines = [streamline, streamline[::2]]
>>> new_streamlines = set_number_of_points(streamlines, 10)
>>> [len(s) for s in streamlines]
[100, 50]
>>> [len(s) for s in new_streamlines]
[10, 10]
transform_streamlines¶

dipy.stats.analysis.transform_streamlines(streamlines, mat, in_place=False)¶

Apply affine transformation to streamlines
- Parameters
- streamlines : Streamlines
Streamlines object
- mat : array, (4, 4)
Transformation matrix.
- in_place : bool
If True then change data in place. Be careful: this changes the input streamlines.
- Returns
- new_streamlines : Streamlines
Sequence of transformed 2D ndarrays with shape[-1] == 3.
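A small deterministic sketch (translation by 10 along x):
>>> import numpy as np
>>> from dipy.tracking.streamline import transform_streamlines
>>> streamlines = [np.array([[0., 0., 0.], [1., 0., 0.]])]
>>> mat = np.eye(4)
>>> mat[:3, 3] = [10., 0., 0.]  # affine that translates by 10 along x
>>> moved = transform_streamlines(streamlines, mat)
>>> np.allclose(moved[0], [[10., 0., 0.], [11., 0., 0.]])
True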
values_from_volume¶

dipy.stats.analysis.values_from_volume(data, streamlines, affine)¶

Extract values of a scalar/vector along each streamline from a volume.
- Parameters
- data : 3D or 4D array
Scalar (for 3D) and vector (for 4D) values to be extracted. For 4D data, interpolation will be done on the 3 spatial dimensions in each volume.
- streamlines : ndarray or list
If array, of shape (n_streamlines, n_nodes, 3). If list, len(n_streamlines) with (n_nodes, 3) array in each element of the list.
- affine : array_like (4, 4)
The mapping from voxel coordinates to streamline points. The voxel_to_rasmm matrix, typically from a NIFTI file.
- Returns
- array or list (depending on the input)
Values interpolated at each coordinate along the length of each streamline.
Notes
Values are extracted from the image based on the 3D coordinates of the nodes that comprise the points in the streamline, without any interpolation into segments between the nodes. Using this function with streamlines that have been resampled into a very small number of nodes will result in very few values.
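A small deterministic sketch (identity affine, so streamline coordinates are voxel coordinates; values at integer grid points are exact):
>>> import numpy as np
>>> from dipy.tracking.streamline import values_from_volume
>>> data = np.arange(1000, dtype=float).reshape((10, 10, 10))
>>> streamlines = [np.array([[1., 1., 1.], [2., 2., 2.]])]
>>> vals = values_from_volume(data, streamlines, affine=np.eye(4))
>>> np.allclose(vals[0], [111., 222.])  # data[1, 1, 1] and data[2, 2, 2]
True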