TensorFlow is an open-source machine learning library developed by Google. It is used for building and training neural networks, performing complex math computations, and developing deep learning models. One of its key features is GPU acceleration support.
Matplotlib is a comprehensive plotting and visualization library, used for creating static, animated, and interactive visualizations and for plotting data in various formats (line plots, scatter plots, histograms, etc.).
NumPy is a package for numerical computing in Python. Its core features include multi-dimensional array objects, advanced mathematical functions, and linear algebra operations.
Keras Datasets is part of the Keras deep learning library (now integrated into TensorFlow). It provides a collection of preloaded datasets for machine learning, such as CIFAR-10, which this post uses.
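import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import datasets, layers, models

# Load CIFAR-10 as (images, labels) pairs for training and testing
(training_images, training_labels), (testing_images, testing_labels) = datasets.cifar10.load_data()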
training_images, testing_images = training_images / 255, testing_images / 255
class_names = ['Plane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']
# Show the first 16 training images in a 4x4 grid with their class names
for i in range(16):
    plt.subplot(4, 4, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(training_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[training_labels[i][0]])
plt.show()
# Reducing the Dataset Size (for Quick Training)
training_images = training_images[:30000]
training_labels = training_labels[:30000]
testing_images = testing_images[:5000]
testing_labels = testing_labels[:5000]
# Train the CNN
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=10, validation_data=(testing_images, testing_labels))
loss, accuracy = model.evaluate(testing_images, testing_labels)
print(f'Loss: {loss}')
print(f'Accuracy: {accuracy}')
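# Note: Keras 3 (TensorFlow 2.16+) only accepts a '.keras' or '.h5' extension here;
# if saving fails, adjust the filename here and wherever the model is loaded.
model.save('image_classifier.model')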
datasets.cifar10.load_data() loads the CIFAR-10 dataset, containing 60,000 images across 10 classes.
training_images, training_labels: Arrays of 50,000 training images and their labels.
testing_images, testing_labels: Arrays of 10,000 test images and their labels.
Each image pixel value is divided by 255 to normalize the data into a 0-1 range, which helps improve model training.
class_names lists the names of the CIFAR-10 classes.
The code plots the first 16 images from the training set in a 4x4 grid with plt.imshow and uses plt.xlabel to show each image's class name beneath it.
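To make the normalization and the label lookup concrete, here is a minimal sketch that uses a small random array as a stand-in for the CIFAR-10 data, so nothing is downloaded. The names fake_images and fake_labels are made up for illustration; only the shapes and dtypes are meant to mirror what load_data() returns.

```python
import numpy as np

class_names = ['Plane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

# Hypothetical stand-in data with the same layout as CIFAR-10:
# images are uint8 with shape (N, 32, 32, 3), labels are integers with shape (N, 1).
rng = np.random.default_rng(seed=0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(4, 1))

# Dividing by 255 converts the integer pixels to floats in the range [0, 1].
scaled = fake_images / 255

# Each label is a one-element array, hence the extra [0] when indexing class_names.
names = [class_names[label[0]] for label in fake_labels]

print(scaled.dtype, scaled.min(), scaled.max())
print(names)
```

The extra [0] in the lookup is the same one that appears in class_names[training_labels[i][0]] in the plotting loop.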
models.Sequential()
Creates a linear stack of layers. Each layer is added one by one in sequence, and the output of one layer is fed as the input to the next.
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
32: The number of filters (kernels) used, each generating a 2D activation map.
(3, 3): The size of each filter, a 3x3 grid.
activation='relu': Uses the Rectified Linear Unit (ReLU) activation function, which introduces non-linearity by setting any negative values to zero.
input_shape=(32, 32, 3): Specifies the shape of each input image (32x32 pixels, with 3 color channels - RGB).
model.add(layers.MaxPooling2D((2, 2)))
Purpose: Reduces the spatial dimensions of the feature maps by a factor of 2 (in both height and width).
Parameters:
(2, 2): The pooling size, which takes the maximum value in each 2x2 block, effectively downsampling the feature map.
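It can help to trace how the convolution and pooling layers shrink a 32x32 image as it moves through the model. The sketch below does that arithmetic by hand, assuming the Keras defaults that the model code does not override (no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size). The helper names conv_output_size and pool_output_size are made up for this example.

```python
def conv_output_size(size, kernel):
    """Output size of a convolution with stride 1 and no padding."""
    return size - kernel + 1


def pool_output_size(size, pool):
    """Output size of max pooling with stride equal to the pool size and no padding."""
    return (size - pool) // pool + 1


size = 32
trace = [size]
# Layer order follows the model: conv, pool, conv, pool, conv.
for layer in ['conv', 'pool', 'conv', 'pool', 'conv']:
    if layer == 'conv':
        size = conv_output_size(size, 3)
    else:
        size = pool_output_size(size, 2)
    trace.append(size)

# The last Conv2D layer has 64 filters, so Flatten sees size * size * 64 values.
flattened = size * size * 64

print(trace)      # [32, 30, 15, 13, 6, 4]
print(flattened)  # 1024
```

Under these assumptions the final feature maps are 4x4 with 64 channels, so the Flatten layer would pass 1,024 values to the first Dense layer.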
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
10: The number of neurons, which corresponds to the number of target classes.
activation='softmax': The softmax activation function converts the output scores into probabilities for each class. The class with the highest probability is selected as the model’s prediction.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
optimizer='adam': Uses the Adam optimizer, a popular choice because it adapts the learning rate during training.
loss='sparse_categorical_crossentropy': The loss function used here is suitable for multi-class classification with integer labels.
metrics=['accuracy']: Tracks accuracy during training and evaluation.
model.fit(training_images, training_labels, epochs=10, validation_data=(testing_images, testing_labels))
training_images and training_labels: The input images and their corresponding labels for training.
epochs=10: The number of complete passes through the training dataset.
validation_data=(testing_images, testing_labels): Specifies the validation dataset used to monitor the model's performance on unseen data during training.
model.save('image_classifier.model')
Saves the trained model to disk so it can be loaded later without retraining.
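Looking back at the compile step, the loss and accuracy values that training reports can be reproduced by hand on a tiny example. The sketch below uses made-up probabilities for 3 images and 4 classes; it mirrors the idea behind sparse categorical crossentropy and accuracy and is not Keras's exact implementation (which, for instance, also guards against taking the log of zero).

```python
import numpy as np

# Hypothetical softmax outputs for 3 images over 4 classes (each row sums to 1).
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
# Integer labels, as expected by a "sparse" loss (no one-hot encoding needed).
labels = np.array([0, 1, 2])

# Loss: mean negative log of the probability assigned to the true class.
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.log(true_class_probs).mean()

# Accuracy: fraction of images whose highest-probability class matches the label.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(f'Loss: {loss:.4f}')
print(f'Accuracy: {accuracy:.4f}')
```

In this toy batch the third image is misclassified (class 3 has the highest probability, but the label is 2), so the accuracy is two out of three while the loss still reflects how confident each prediction was.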
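# Analyze the image
def checkImage(image):
    # image: a 32x32 RGB array with pixel values in the 0-255 range
    class_names = ['Plane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']
    model = models.load_model('image_classifier.model')
    # Add a batch dimension and scale pixels to 0-1, matching the training data
    prediction = model.predict(np.array([image]) / 255)
    # Pick the class with the highest predicted probability
    index = np.argmax(prediction)
    return class_names[index]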
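The last two steps of checkImage rely on the softmax output of the final Dense layer: ten probabilities, one per class, from which np.argmax picks the largest. The following sketch shows that step in isolation with NumPy. The softmax helper and the raw scores are made up for illustration and stand in for what the trained model would produce.

```python
import numpy as np

class_names = ['Plane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()


# Hypothetical raw scores for one image, one value per class.
scores = np.array([0.5, 1.2, -0.3, 3.1, 0.0, 2.4, -1.0, 0.7, 0.2, -0.5])
probabilities = softmax(scores)

# The predicted class is the one with the highest probability.
index = np.argmax(probabilities)
print(class_names[index])  # Cat
```

Because softmax preserves the ordering of the scores, the largest score (3.1 at position 3) becomes the largest probability, which maps to 'Cat' in class_names.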
In this blog, we've walked through implementing Convolutional Neural Networks using Python and essential libraries like TensorFlow, NumPy, and Matplotlib. We covered everything from data preparation to model training and evaluation. CNNs are powerful tools for image processing tasks, and with the right implementation strategy, they can achieve remarkable results. Whether you're working on image classification, object detection, or other computer vision tasks, these fundamental concepts will serve as your building blocks for more advanced applications.