[UPDATE]

1. If you really want to demonstrate performance, consider training your algorithm on only 1/10 (or an even smaller fraction) of the original data and testing on another 1/10 of the data.

2. An additional set of track data from an individual subject is available at https://www.dropbox.com/s/xqciq76juauib29/test_ind.zip?dl=0 This data set is very noisy and has a lot of distortion. Please use it as testing data to report the classification error of your algorithm. Note that there are far fewer clusters in this data set, because the poorer quality of the individual data causes tracking to fail for certain tracks.

*Please note that the evaluation approaches (10-fold CV, reducing the training data, or using the individual data) are mainly meant to help you sell your work. They are not an absolute guideline. Please feel free to come up with a novel way to demonstrate the performance of your algorithm!

Synopsis

Diffusion MRI fiber tracking has been widely used to map the trajectories of axonal connections in the human brain, but identifying or classifying these trajectories still requires inspection by experienced experts because of the high complexity of neuroanatomy. Here we used the Human Connectome Project data and mapped the trajectories on a population atlas of 842 subjects. These trajectories were classified by experienced neuroanatomists, which allows a supervised learning classifier to be trained to identify the name of an unknown trajectory. An ideal classifier is expected to suggest possible classifications and to identify false pathways created by noise in the MRI data.

Introduction

Sporns, Olaf, and Patric Hagmann (2009). "Science Maps for Scholars," Places & Spaces: Mapping Science, http://scimaps.org.

Diffusion MRI and the human connectome

The human brain has billions of axonal connections between neurons.
These axonal connections give rise to the human connectome, which describes how brain regions are connected to each other. The wiring diagram of all axonal connections, also known as the structural connectome, maps the neuronal pathways and has been a major research topic in neuroscience, as it is key to understanding how the brain functions and malfunctions. Here we use a technique called diffusion MRI fiber tracking to map the neuronal pathways of the human brain. Diffusion MRI detects the microscopic movement of water molecules, which allows us to calculate the local orientation of brain connections. The local orientations can be used to map the neuronal connections and store their trajectory coordinates as "tracks". We analyzed brain imaging data from the NIH-funded "Human Connectome Project", with a total of 842 subjects, and averaged the data into a single brain to obtain representative "tracks" for a general population.

Research question

Deciphering the complex wiring diagram of the human connectome first requires labeling the tracks by their anatomical names. This helps answer questions like "What is the function of this track?" and "How does this track contribute to cognition and behaviour?". Recognizing tracks is a challenging task because of the high complexity of the wiring diagram and the existence of false tracks, which are due to noise, artifacts, or computation error. Here we are seeking a supervised learning solution to recognize tracks. The training data come from manual labeling conducted by neuroanatomists at the University of Pittsburgh Medical Center. The goal is to develop a supervised classifier that learns how these experts recognize meaningful tracks in the human connectome.

Data

Figure: 3D visualization of the human brain connectome, which contains millions of trajectories colored by their local orientation.

There are a total of 129,414 tracks, categorized into 56 clusters.
The largest cluster is the "F" cluster (false tracks), which has 26,253 tracks. The next largest is the "CC" cluster, with 21,953 tracks. The smallest cluster has only 25 tracks. Each track contains a sequence of 3D points (x, y, z in mm) recording the coordinates from one end to the other. The distance between consecutive points is around 0.5 mm. The average track length is 93 mm. Longer tracks have more points, and shorter tracks have fewer.

Figure: visualization of a single track (colored green) in the brain space.

File Format

The track coordinates are stored as an ASCII file. The coordinates of a track are stored as x1 y1 z1 x2 y2 z2 .... xn yn zn, separated by spaces. More than one track can be stored in a single text file, with one track per line (tracks are separated by line breaks). A track does not have "directionality", which means that a track can also be stored in the reverse order: xn yn zn xn-1 yn-1 zn-1 ... x1 y1 z1. The following is example C++ code to load a text file that contains multiple tracks.

    #include <algorithm>
    #include <fstream>
    #include <iterator>
    #include <sstream>
    #include <string>
    #include <vector>

    // Fragment: place inside a function that returns bool.
    std::ifstream in(file_name);
    if (!in)
        return false;
    std::string line;
    std::vector<std::vector<float> > track_data;
    while (std::getline(in, line))
    {
        // Each line holds one track: x1 y1 z1 x2 y2 z2 ... xn yn zn
        track_data.push_back(std::vector<float>());
        std::istringstream values(line);
        std::copy(std::istream_iterator<float>(values),
                  std::istream_iterator<float>(),
                  std::back_inserter(track_data.back()));
    }

There are a total of 56 clusters, and the tracks of each cluster are stored in a single ASCII file (the file name is the label name). Your classifier should be trained to classify a single track. Note that there is a "False Track" cluster (file name F.txt). A robust classifier should be able to distinguish false tracks from real tracks.

Evaluation

Supervised learning

Please use 10-fold cross validation to demonstrate the testing error of your algorithm. Please also highlight the accuracy of the algorithm in distinguishing the "False Tracks" cluster from the other clusters.
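The evaluation described above could be organized roughly as in the following sketch. It is only one possible arrangement, not a required method: the names `stratified_folds`, `evaluate`, and `Report` are hypothetical, and the label "F" for false tracks is assumed from the file name F.txt. Folds are assigned per cluster so that even the smallest cluster (25 tracks) contributes 2 or 3 tracks to every fold, and the report separates the overall classification error from how well false tracks are recognized.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: assign track i to fold fold_of[i] in [0, folds).
// Tracks with the same label are dealt round-robin, so every cluster is
// spread as evenly as possible across the folds.
std::vector<int> stratified_folds(const std::vector<std::string>& labels,
                                  int folds = 10)
{
    assert(folds > 0);
    std::map<std::string, int> seen; // tracks seen so far, per label
    std::vector<int> fold_of(labels.size());
    for (std::size_t i = 0; i < labels.size(); ++i)
        fold_of[i] = seen[labels[i]]++ % folds;
    return fold_of;
}

struct Report
{
    double error;           // fraction of tracks given the wrong label
    double false_recall;    // fraction of true false-tracks predicted as false
    double false_precision; // fraction of predicted false-tracks that are truly false
};

// Hypothetical helper: compare predicted labels with the expert labels.
// A ratio whose denominator is zero is reported as 0.
Report evaluate(const std::vector<std::string>& truth,
                const std::vector<std::string>& predicted,
                const std::string& false_label = "F") // assumed from F.txt
{
    assert(truth.size() == predicted.size());
    std::size_t wrong = 0, true_f = 0, pred_f = 0, hit_f = 0;
    for (std::size_t i = 0; i < truth.size(); ++i)
    {
        const bool is_f = (truth[i] == false_label);
        const bool says_f = (predicted[i] == false_label);
        if (truth[i] != predicted[i]) ++wrong;
        if (is_f) ++true_f;
        if (says_f) ++pred_f;
        if (is_f && says_f) ++hit_f;
    }
    auto ratio = [](std::size_t a, std::size_t b) {
        return b == 0 ? 0.0 : static_cast<double>(a) / static_cast<double>(b);
    };
    Report r;
    r.error = ratio(wrong, truth.size());
    r.false_recall = ratio(hit_f, true_f);
    r.false_precision = ratio(hit_f, pred_f);
    return r;
}
```

For a 10-fold run, one would train on the tracks whose fold is not k, predict the tracks in fold k, repeat for k = 0 to 9, and pass the pooled predictions to `evaluate`. Because the round-robin assignment follows file order, shuffling the tracks within each cluster beforehand may be advisable if neighbouring lines in a file turn out to be similar tracks.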
A good algorithm should achieve less than 10% classification error.

Download Link

Google Drive:

Cluster Abbreviations