Lung cancer is one of the leading causes of cancer death worldwide. Early detection can enable more effective treatment and improve survival rates. Deep learning techniques such as Convolutional Neural Networks (CNNs) have shown promising results in medical image analysis. In this article, we will write a Python program for lung cancer detection using a CNN.
Step 1: Importing necessary libraries
First, we import the libraries used in the program. We will use Keras, a high-level neural network API, to build our CNN model. We will also use NumPy and Matplotlib for data manipulation and visualization, respectively.
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
Step 2: Preparing data
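The next step is to prepare the data for the CNN model. We will use a dataset from Kaggle that contains CT scan images of lungs. The dataset is divided into two classes: one for patients with lung cancer and one for patients without. We will read the training images from dataset/train and the validation images from dataset/test. The training images are rescaled and randomly augmented (shear, zoom, and horizontal flip), while the validation images are only rescaled.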
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_set = train_datagen.flow_from_directory('dataset/train',
                                              target_size=(64, 64),
                                              batch_size=32,
                                              class_mode='binary')
test_set = test_datagen.flow_from_directory('dataset/test',
                                            target_size=(64, 64),
                                            batch_size=32,
                                            class_mode='binary')
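flow_from_directory infers the labels from the folder layout: each class needs its own subdirectory under dataset/train and dataset/test, and class indices are assigned in alphanumeric order of the subdirectory names. The sketch below uses only the standard library to mirror that mapping and count the images per class, so the layout can be checked before training. The function name summarize_image_dir, the list of accepted file extensions, and any class folder names are assumptions for illustration, not part of the dataset or of Keras.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the files in the dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def summarize_image_dir(root):
    """Return (class_indices, counts) for a directory with one subfolder per class.

    class_indices maps each subfolder name to an integer label, assigned in
    sorted order of the folder names. counts maps each subfolder name to the
    number of image files found directly inside it.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_indices = {name: index for index, name in enumerate(class_names)}
    counts = {
        name: sum(
            1
            for f in (root / name).iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
        for name in class_names
    }
    return class_indices, counts
```

For example, with hypothetical subfolders named cancer and normal, summarize_image_dir('dataset/train') would map cancer to 0 and normal to 1. The mapping Keras actually used can be read from train_set.class_indices.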
Step 3: Building CNN model
After preparing the data, we can build our CNN model. The model consists of two convolutional layers, each followed by max pooling, and then two fully connected layers with a dropout layer between them to reduce overfitting. We will use the rectified linear unit (ReLU) activation function for the hidden layers and the sigmoid activation function for the output layer.
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# Recent Keras versions expect learning_rate; the older lr argument is deprecated.
model.compile(optimizer=SGD(learning_rate=0.01),
              loss='binary_crossentropy',
              metrics=['accuracy'])
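To see how the 64x64 input shrinks through the network, we can trace the layer shapes by hand. The sketch below is plain Python and assumes the Keras defaults used above: 'valid' padding and stride 1 for Conv2D, and a pooling stride equal to the pool size for MaxPooling2D. The helper names are made up for this illustration. The result should agree with what model.summary() reports.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


def count_parameters(input_size=64, channels=3):
    """Trace the layer shapes of the model above and count trainable parameters."""
    total = 0
    size = input_size
    for filters in (32, 64):
        # Each filter has 3*3*channels weights plus one bias.
        total += (3 * 3 * channels + 1) * filters
        size = pool_output_size(conv_output_size(size, 3), 2)
        channels = filters
    flat = size * size * channels   # length of the Flatten output
    total += (flat + 1) * 64        # Dense(64): weights plus biases
    total += (64 + 1) * 1           # Dense(1): weights plus bias
    return flat, total


flat, total = count_parameters()
print(flat, total)  # 12544 822337
```

Almost all of the parameters sit in the first fully connected layer, which is one reason dropout is applied right after it.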
Step 4: Training model
Next, we train the model on the training set. We train for 10 epochs and evaluate the model on the validation set after each epoch.
history = model.fit(train_set,
                    steps_per_epoch=len(train_set),
                    epochs=10,
                    validation_data=test_set,
                    validation_steps=len(test_set))
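The value len(train_set) is the number of batches needed to pass over every training image once, which is the image count divided by the batch size, rounded up. A minimal sketch of that arithmetic follows; the function name and the sample count of 1000 are hypothetical, since the article does not state the dataset size.

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to cover every sample once."""
    return math.ceil(num_samples / batch_size)


# Hypothetical example: 1000 images with the batch size of 32 used above.
print(steps_per_epoch(1000))  # 32
```

The last batch is smaller than the others whenever the image count is not a multiple of the batch size.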
Step 5: Evaluating model
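Finally, we evaluate the model on the test set using the evaluate() method. Note that this is the same data used for validation during training, so the score is not an independent estimate. We also plot the accuracy and loss curves for both the training and validation sets.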
score = model.evaluate(test_set, steps=len(test_set))
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# Plotting accuracy and loss curves
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.show()
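To interpret the two numbers printed by the evaluation step, it helps to compute them by hand on a few predictions. The sketch below is plain Python and assumes the usual definitions: binary cross-entropy averaged over samples, and accuracy obtained by thresholding the sigmoid output at 0.5. The function names, labels, and probabilities are made up for illustration and are not results from the model.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    correct = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


# Hypothetical labels and sigmoid outputs for four scans.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.1]
print(binary_accuracy(labels, probs))  # 0.75
print(round(binary_cross_entropy(labels, probs), 3))
```

The third prediction is wrong (0.4 for a positive case), which lowers the accuracy and also contributes most of the loss.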
Here are some useful links related to lung cancer detection and the tools used in the Python program:
- Keras documentation: https://keras.io/
- TensorFlow documentation: https://www.tensorflow.org/
- OpenCV documentation: https://docs.opencv.org/
- Lung cancer dataset used in this article: https://www.kaggle.com/kmader/lung-cancer-detection
- ImageDataGenerator documentation: https://keras.io/api/preprocessing/image/
- Google Colab: https://colab.research.google.com/