A Radiologist’s Exploration of the Stanford ML Group’s MRNet data

Data exploration through domain knowledge of medical imaging

Walter Wiggins, Apr 19

This post reviews the recently released Stanford MRNet knee MRI data set and competition. As I am a senior radiology resident, I will focus on exploring the data through basic domain knowledge — addressing aspects of the data distribution that non-physicians may find perplexing. I’ll also include some Python code that interested parties may find useful in exploring the data set on their own.


The Stanford ML Group recently released their third public data set of medical imaging examinations, called MRNet, which can be found here. From the website:

The MRNet dataset consists of 1,370 knee MRI exams performed at Stanford University Medical Center. The dataset contains 1,104 (80.6%) abnormal exams, with 319 (23.3%) ACL tears and 508 (37.1%) meniscal tears; labels were obtained through manual extraction from clinical reports.
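The percentages in the quote can be recomputed from the counts as a quick sanity check. This is only a small sketch; the variable names are illustrative and do not come from the MRNet site. Because one exam can carry more than one label, the three shares are not expected to sum to 100%.

```python
# Counts quoted from the MRNet website (see the passage above).
total_exams = 1370
counts = {"abnormal": 1104, "acl_tear": 319, "meniscal_tear": 508}

# Share of all exams carrying each label, in percent, rounded to one decimal.
percentages = {label: round(100 * n / total_exams, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```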

The data set accompanies the publication of the Stanford ML Group’s work, which can be found here. Once again, they are hosting a competition to drive innovation in automated analysis of medical imaging.

Entries into the competition will be evaluated on a private test data set with the following metric:

The leaderboard reports the average AUC of the abnormality detection, ACL tear, and Meniscal tear tasks.
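To make the metric concrete, here is a minimal sketch of an average AUC over the three tasks. It assumes "average" means a plain unweighted mean, which the quote suggests but does not spell out. The `auc` helper is a simple pairwise implementation written for this example, and the labels and scores are made-up toy values. In practice, a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(labels_by_task, scores_by_task):
    """Unweighted mean of the per-task AUCs, plus the per-task values."""
    per_task = {t: auc(labels_by_task[t], scores_by_task[t]) for t in labels_by_task}
    return sum(per_task.values()) / len(per_task), per_task


# Toy labels and predicted scores, invented purely for illustration.
labels = {
    "abnormal": [1, 1, 0, 1],
    "acl": [0, 1, 0, 0],
    "meniscus": [1, 0, 0, 1],
}
scores = {
    "abnormal": [0.9, 0.6, 0.7, 0.8],
    "acl": [0.2, 0.9, 0.1, 0.3],
    "meniscus": [0.4, 0.5, 0.1, 0.8],
}

mean_auc, per_task = average_auc(labels, scores)
print(per_task, mean_auc)
```

Because the three tasks are averaged, a model that does well on abnormality detection but poorly on ACL tears is penalized as much as the reverse.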

You can find more details about the competition on the website. My goals for this post are as follows.


  • Basic medical imaging terminology
  • Overview of the data set & tasks for the competition
  • Basic knee anatomy
  • Examples of the target knee pathology
  • Examples of potentially “bad” data points
  • Python code for reviewing the data

Basic medical imaging terminology

Magnetic resonance imaging (MRI) is a cross-sectional imaging modality, meaning that 2D images are acquired more-or-less sequentially in different imaging planes. The standard planes of imaging included in the MRNet data set are: axial, coronal and sagittal. MR images can be acquired in any plane, but that is beyond the scope of this post.

MR images are acquired by sequences of radiofrequency (RF) pulses (pulse sequences or — simply — sequences). Different sequences are designed to produce a signal that can be acquired and processed to reveal different patterns of signal intensity in biologic tissues. T1-weighting, T2-weighting, and proton density (PD)-weighting are the 3 core pulse sequence types used in musculoskeletal imaging.

It is helpful to know the basic signal intensity patterns of fat, water, and muscle on these 3 sequences (table below). It also helps to know that cortical bone and fibrous structures (e.g. ligaments, menisci) should be dark on all sequences. Because fat and water are both intermediate-to-bright on T2 and PD sequences, fat saturation (fat-sat) can be applied to suppress the signal from fat, which makes water stand out on these sequences.

Abnormal fluid signal (either in amount or location) is often a key indicator of musculoskeletal pathology — or any pathology, for that matter — so the fluid-sensitive images are really the workhorse of the knee MRI.

Source: https://www.fifamedicinediploma.com/lessons/radiology-mri/

Additional terminology:

  • Slice: an individual image is often referred to as a slice
  • Series: a full stack of images acquired with a given pulse sequence in a given plane
  • Field-of-view (FOV): the spatial coverage of the image

Overview of the data set & tasks for the competition

Caveat: I have yet to review all of the data provided. These comments are based on a limited initial review of the data.

Again, the data can be obtained through the links above. You must first register an account with the Stanford ML Group and then you will be emailed a link to the data. Sharing the data is expressly forbidden, even with a teammate in the competition.

Once you’ve downloaded and unzipped the data, you should have this directory tree:

├── train
│ ├── axial
│ ├── coronal
│ └── sagittal
├── train-abnormal.csv
├── train-acl.csv
├── train-meniscus.csv
├── valid
│ ├── axial
│ ├── coronal
│ └── sagittal
├── valid-abnormal.csv
├── valid-acl.csv
└── valid-meniscus.csv

The *.csv files contain the labels for the cases. The *.npy files contained in the subdirectories of train and valid are NumPy arrays of dimension (slices, x, y). The x and y dimensions are consistently 256 x 256 across all exams with int values ranging from 0 to 255. This implies that the pixel data has already been normalized by the Stanford ML Group.
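The format described above is easy to verify on any stack you load. The sketch below uses a hypothetical helper, `check_stack`, and runs it on a synthetic array so that it is self-contained. The `uint8` dtype of the synthetic array is an assumption made for the demo; I have only stated that the real files hold int values from 0 to 255.

```python
import numpy as np


def check_stack(stack):
    """Check the (slices, 256, 256) layout and 0-255 range; return a summary."""
    if stack.ndim != 3:
        raise ValueError("expected a 3D array of shape (slices, x, y)")
    n_slices, height, width = stack.shape
    if (height, width) != (256, 256):
        raise ValueError("expected 256 x 256 slices")
    lo, hi = int(stack.min()), int(stack.max())
    if lo < 0 or hi > 255:
        raise ValueError("expected pixel values in [0, 255]")
    return {"slices": n_slices, "dtype": str(stack.dtype), "min": lo, "max": hi}


# Synthetic stand-in for np.load(<path to one .npy file>).
rng = np.random.default_rng(0)
fake_stack = rng.integers(0, 256, size=(30, 256, 256), dtype=np.uint8)

info = check_stack(fake_stack)
print(info)
```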

Note: The number of slices varies from exam to exam and, within a single exam, from plane to plane. This is completely normal for medical imaging data.
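To see how much the slice counts vary, you can read just the array shapes. A minimal sketch, assuming the directory layout shown above: `slice_counts` is a hypothetical helper, and the demo builds a throwaway directory of synthetic arrays (with a made-up case name, `0000`) instead of touching the real data. Passing `mmap_mode='r'` to `np.load` memory-maps the file, so the pixel data is not read into memory just to get the shape.

```python
import tempfile
from pathlib import Path

import numpy as np

PLANES = ("axial", "coronal", "sagittal")


def slice_counts(split_dir, case):
    """Number of slices per plane for one case, without loading pixel data."""
    counts = {}
    for plane in PLANES:
        # Memory-map the file: the shape comes from the header alone.
        stack = np.load(Path(split_dir) / plane / f"{case}.npy", mmap_mode="r")
        counts[plane] = stack.shape[0]
    return counts


# Demo on a temporary directory that mimics train/ with synthetic arrays.
with tempfile.TemporaryDirectory() as tmp:
    fake_sizes = {"axial": 5, "coronal": 3, "sagittal": 4}  # made-up slice counts
    for plane, n in fake_sizes.items():
        (Path(tmp) / plane).mkdir()
        np.save(Path(tmp) / plane / "0000.npy", np.zeros((n, 256, 256), dtype=np.uint8))
    demo_counts = slice_counts(tmp, "0000")

print(demo_counts)
```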

So as not to run afoul of the competition rules, I will refrain from posting summary data tables of the overlap in labels. However, it is pertinent to note the following:

1. There are many abnormal exams that don’t contain an ACL or meniscal tear.

2. There are far more exams that contain both an ACL tear and a meniscal tear than there are cases with just an ACL tear.

These findings suggest that the data set is likely fairly realistic for what would be encountered in the real world.

The reasons for this are as follows:

  • Knee MRIs are not typically ordered on asymptomatic patients.
  • The most common abnormalities on knee MRI exams are within the cartilage and underlying bone.
  • Due to the mechanisms of injury and forces required to tear the ACL, the menisci are very often injured as well.
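If you want to tabulate the label overlap on your own copy of the data, the three label files can be joined on the case ID. The sketch below assumes each CSV is headerless with two columns (case, label), as in the `train-abnormal.csv` loading code later in this post. The helper names are hypothetical, and the rows are made-up stand-ins, not real MRNet labels.

```python
import csv
import io
from collections import Counter


def read_labels(handle):
    """Read a headerless two-column CSV (case, label) into {case: int label}."""
    return {case: int(label) for case, label in csv.reader(handle)}


def label_overlap(abnormal, acl, meniscus):
    """Count cases for each (abnormal, acl, meniscus) label combination."""
    return Counter((abnormal[c], acl[c], meniscus[c]) for c in abnormal)


# Made-up rows standing in for the three train-*.csv files.
abnormal = read_labels(io.StringIO("0000,1\n0001,1\n0002,0\n0003,1\n"))
acl = read_labels(io.StringIO("0000,0\n0001,1\n0002,0\n0003,1\n"))
meniscus = read_labels(io.StringIO("0000,0\n0001,1\n0002,0\n0003,0\n"))

overlap = label_overlap(abnormal, acl, meniscus)
for combo, n in sorted(overlap.items()):
    print(combo, n)
```

With the real files, replace each `io.StringIO(...)` with a file handle opened as `open(path, newline='')`.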

Preview of the image data

Though the website states that the standard protocol for the MRI knee exams in the data set includes a variety of sequences, the data that I’ve reviewed thus far contains three sequences per case:

  • Coronal T1-weighted images without fat saturation
  • Sagittal T2-weighted images with fat-sat (fluid-sensitive)
  • Axial PD-weighted images with fat-sat (fluid-sensitive)

Basic knee anatomy

I’ll include an image here for a basic overview of knee anatomy. For those interested in learning more about the appearance of knee anatomy on MRI, this website (the source for the below image) might be helpful.

Source: http://www.freitasrad.net/pages/atlas/Knee/Knee.html

Examples of the target knee pathology

1. Meniscal tear: The sagittal T2 fat-sat image below shows a normal anterior horn of the medial meniscus (green arrow — dark triangle on the left side of the image) and a tear in the posterior horn of the medial meniscus (red arrow — little bits of bright signal within the dark triangle on the right side of the image). Note: there are several other abnormalities on this image, but these are beyond the scope of this post.

Sagittal T2-weighted image with fat saturation

2. Anterior cruciate ligament (ACL) tear:

First, I’ll show a sagittal T2 fat-sat image of a normal ACL. Note: the ripples of increased signal in the image just above the ACL represent pulsation artifact from the popliteal vessels.

Normal ACL

And now, a torn ACL. In the following image, the red arrow points to an oblong structure of relatively high signal intensity and the normal dark band of fibrous ligament (seen in the above image) is absent.

Torn ACL

Examples of potentially “bad” data points

Again, at the time of this writing, I haven’t yet explored the data in its entirety. However, I have observed a few potentially “bad” data points that add to the challenge of this competition. I say potentially bad, because some of these may be handled fairly well by a deep learning model. Here, I show a couple of examples and give brief explanations for why these represent troublesome data points.

One issue I observed was a sagittal image stack where the majority of the knee is out of the FOV, resulting in an essentially useless stack of images for the competition tasks, as none of the relevant anatomic structures are visible. This may have been due to an error in image preprocessing for curation into the MRNet data set or, rather, due to an issue with the source data. Thus far, it is an infrequently encountered issue.
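One crude way to shortlist stacks like this for manual review is to measure how much of the stack contains any signal. This is a rough heuristic sketched for illustration, not something used to find the example above. The function names and both cut-off values are arbitrary assumptions, and it would only catch cases where the missing anatomy leaves mostly dark background. The final call still needs a human looking at the images.

```python
import numpy as np


def foreground_fraction(stack, threshold=20):
    """Fraction of pixels brighter than `threshold` (an arbitrary cut-off)."""
    return float((stack > threshold).mean())


def needs_review(stack, min_fraction=0.1, threshold=20):
    """Flag a stack whose bright-pixel fraction falls below `min_fraction`."""
    return foreground_fraction(stack, threshold) < min_fraction


# Synthetic examples: one nearly empty stack and one filled with signal.
mostly_empty = np.zeros((10, 256, 256), dtype=np.uint8)
mostly_empty[:, :20, :20] = 200  # small bright patch in one corner
well_filled = np.full((10, 256, 256), 120, dtype=np.uint8)

print(needs_review(mostly_empty), needs_review(well_filled))
```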

Another cause of potentially “bad” data is fairly typical for medical imaging. In a few cases, I’ve seen image artifact bad enough to make the images very challenging for a human to read. However, it’s possible that a deep learning algorithm could “read through” the artifact, just as we radiologists try to do in the interest of patient care. Here, the pulsation of the popliteal artery results in aliasing artifact throughout the image at the level of the knee joint.

Extensive pulsation artifact can make images hard to read/analyze

Python code for viewing the data

The following code will load the data from one case into a dict of NumPy arrays, which is then used by the KneePlot class to generate the interactive plot shown in the *.gif below.

from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from ipywidgets import interact, Dropdown, IntSlider

# Jupyter magic: enables interactive figures inside the notebook
%matplotlib notebook

# Assumption: data_path points at the unzipped MRNet directory; adjust as needed
data_path = Path('data')
train_path = data_path/'train'

# labels for the abnormality task (the file has no header row)
train_abnl = pd.read_csv(data_path/'train-abnormal.csv', header=None,
                         names=['Case', 'Abnormal'],
                         dtype={'Case': str, 'Abnormal': np.int64})

# data loading functions
def load_one_stack(case, data_path=train_path, plane='coronal'):
    """Load the (slices, x, y) image stack for one case in one plane."""
    fpath = data_path/plane/'{}.npy'.format(case)
    return np.load(fpath)

def load_stacks(case, data_path=train_path):
    """Load all three planes for one case into a dict keyed by plane."""
    planes = ['coronal', 'sagittal', 'axial']
    # pass data_path through so that other splits (e.g. valid) can be loaded
    return {plane: load_one_stack(case, data_path=data_path, plane=plane)
            for plane in planes}

# interactive viewer
class KneePlot():
    def __init__(self, x, figsize=(10, 10)):
        self.x = x
        self.planes = list(x.keys())
        # number of slices available in each plane
        self.slice_nums = {plane: self.x[plane].shape[0] for plane in self.planes}
        self.figsize = figsize
    def _plot_slices(self, plane, im_slice):
        # draw a single slice from the selected plane
        fig, ax = plt.subplots(1, 1, figsize=self.figsize)
        ax.imshow(self.x[plane][im_slice, :, :])
    def draw(self):
        planes_widget = Dropdown(options=self.planes)
        plane_init = self.planes[0]
        slice_init = self.slice_nums[plane_init] - 1
        slices_widget = IntSlider(min=0, max=slice_init, value=slice_init//2)
        def update_slices_widget(*args):
            # reset the slider range and position when the plane changes
            slices_widget.max = self.slice_nums[planes_widget.value] - 1
            slices_widget.value = slices_widget.max // 2
        planes_widget.observe(update_slices_widget, 'value')
        interact(self._plot_slices, plane=planes_widget, im_slice=slices_widget)
    def resize(self, figsize): self.figsize = figsize

# example usage
case = train_abnl.Case[0]
x = load_stacks(case)
plot = KneePlot(x, figsize=(8, 8))
plot.draw()  # display the widgets and the first image


I hope this post gives you a feel for the MRNet data set. Perhaps more importantly, I hope you’ve learned a little about knee MRI. Though I’ve yet to explore it in its entirety, I think this data set will be a valuable resource for the ML community. And I look forward to reading about the models developed through the competition. Thank you for reading!

Thanks to Pierre Guillou.

Walter Wiggins

BWH/Harvard radiology resident, MGH-BWH Center for Clinical Data Science researcher, neuroscience PhD, and writer. Continually striving toward self-improvement.

Source: Towards Data Science
