Name that lesion: Physicians face off against an algorithm

Paul Basilio, MDLinx | February 26, 2018

Another chess match recently took place between humans and artificial intelligence (AI), except the pawns and rooks were swapped with birthmarks and melanomas.


In a study published in Nature, a deep neural network featuring a “brain” full of almost 130,000 clinical images was itching to recognize patterns. On the other side sat 21 board-certified dermatologists, dermatoscopes in hand. Researchers then pitted the two sides against each other to see which could better identify the lesions requiring further medical testing.

While the actual day-to-day proceedings of the study were perhaps less dramatic than the famous chess match between Garry Kasparov and Deep Blue, the results could have important implications for the daily practice of many physicians and for patient outcomes.

In this study, the team led by Andre Esteva and Brett Kuprel (both PhD students in the Department of Electrical Engineering, Stanford University, Stanford, CA) employed a deep convolutional neural network (CNN). This type of network, which has shown potential for handling highly variable tasks across many categories, was “trained” with 129,450 clinical images encompassing 2,032 diseases. This training set was two orders of magnitude larger than those in previous studies of this kind.

In these types of neural networks, the machine is shown images along with their correct diagnoses and learns to identify them. As the network is fed more images, it fine-tunes its accuracy in identifying future images. This form of AI is currently gaining traction as an adjunct to a clinician’s expertise.

The algorithm’s performance was measured with a sensitivity-specificity curve: sensitivity was the proportion of malignant lesions the algorithm correctly identified, and specificity was the proportion of benign lesions it correctly identified.
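
To make the two measures concrete, here is a minimal Python sketch of how a sensitivity-specificity curve can be traced. It is not the study's code: the function names, the toy labels, and the malignancy scores are all made up for illustration. It assumes the algorithm outputs a score between 0 and 1 for each lesion and that any lesion scoring at or above a chosen threshold is flagged as malignant.

```python
from typing import List, Tuple


def sensitivity_specificity(truth: List[bool], flagged: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one set of calls.

    truth[i]   -- True if lesion i is malignant (e.g., biopsy-proven)
    flagged[i] -- True if lesion i was called malignant
    Assumes at least one malignant and one benign lesion are present.
    """
    tp = sum(t and f for t, f in zip(truth, flagged))          # malignant, flagged
    fn = sum(t and not f for t, f in zip(truth, flagged))      # malignant, missed
    tn = sum(not t and not f for t, f in zip(truth, flagged))  # benign, cleared
    fp = sum(not t and f for t, f in zip(truth, flagged))      # benign, flagged
    return tp / (tp + fn), tn / (tn + fp)


def curve(truth: List[bool], scores: List[float],
          thresholds: List[float]) -> List[Tuple[float, float, float]]:
    """Return one (threshold, sensitivity, specificity) point per threshold."""
    points = []
    for th in thresholds:
        flagged = [s >= th for s in scores]
        sens, spec = sensitivity_specificity(truth, flagged)
        points.append((th, sens, spec))
    return points


# Hypothetical toy data: 3 malignant and 5 benign lesions with invented scores.
TRUTH = [True, True, True, False, False, False, False, False]
SCORES = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

if __name__ == "__main__":
    for th, sens, spec in curve(TRUTH, SCORES, [0.35, 0.5, 0.65]):
        print(f"threshold={th:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Lowering the threshold catches more malignant lesions at the cost of flagging more benign ones, which is why performance is reported as a curve rather than as a single number. A single dermatologist's biopsy-or-reassure decisions, by contrast, yield one sensitivity-specificity point.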

The CNN and the dermatologists were presented with a previously unseen set of digital images with diagnoses that had been verified by biopsy. The images fell into two categories, each containing both benign and malignant lesions.

One category, melanocytic lesions, included moles and melanoma. The other category was predominantly keratinocytic lesions, such as nonmelanocytic carcinoma and benign seborrheic keratosis. For melanocytic lesions, both a standard photograph and a dermoscopic image were shown, reflecting two steps a dermatologist might take during a clinical examination.

After viewing each image, the dermatologists were asked whether they would proceed with a biopsy or treatment, or whether they would reassure the patient that the lesion was benign. Success was determined by how accurately the dermatologists distinguished the cancerous lesions from the noncancerous ones.

In the end, the computer algorithm was at least as successful as the board-certified dermatologists, and in some cases more successful.

“This fast, scalable method is deployable on mobile devices and holds the potential for substantial clinical impact, including broadening the scope of primary care practice and augmenting clinical decision-making for dermatology specialists,” the authors wrote. “Further research is necessary to evaluate performance in a real-world, clinical setting, in order to validate this technique across the full distribution and spectrum of lesions encountered in typical practice.”

It is important to note that the result is not proof that dermatologists can be replaced with an algorithm and a good camera. Clinical judgment, patient interaction, and a career’s worth of experience are still paramount.

However, an algorithm that augments a physician’s diagnostic abilities, marrying artificial intelligence with live intelligence, is an attractive commodity, and the data it provides could be adapted to multiple specialties.
