|
588 | 588 | " with open(filepath, \"rb\") as f:\n", |
589 | 589 | " return pickle.load(f)\n", |
590 | 590 | "\n", |
591 | | - "def plot_image(img, figsize = (2,3)):\n", |
| 591 | + "def plot_image(image, figsize = (2,3)):\n", |
592 | 592 | " plt.figure(figsize = figsize)\n", |
593 | | - " plt.imshow(img)\n", |
| 593 | + " plt.imshow(image)\n", |
594 | 594 | " plt.axis(\"off\")\n", |
595 | 595 | " \n", |
596 | 596 | "def plot_multiple_images(*images_titles, figsize = (2, 3)):\n", |
|
750 | 750 | "\n", |
751 | 751 | "import cv2\n", |
752 | 752 | "import numpy as np\n", |
753 | | - "def solution_scale_image(img: np.ndarray, scale_factor: float):\n", |
| 753 | + "def solution_scale_image(image: np.ndarray, scale_factor: float):\n", |
754 | 754 | " \"\"\"\n", |
755 | 755 | " The function takes an image as input and rescales it to a new dimension.\n", |
756 | 756 | " For that, you need to compute the new dimensions of the image using the scale factor \n", |
757 | | - " and then use OpenCV's `cv2.resize` function to resize the image, e.g., `cv2.resize(img, (new_width, new_height))`.\n", |
| 757 | + " and then use OpenCV's `cv2.resize` function to resize the image, e.g., `cv2.resize(image, (new_width, new_height))`.\n", |
758 | 758 | "\n", |
759 | 759 | " Args:\n", |
760 | | - " img (np.ndarray): The input image.\n", |
| 760 | + " image (np.ndarray): The input image.\n", |
761 | 761 | " scale_factor (float): The factor by which to scale the image.\n", |
762 | 762 | "\n", |
763 | 763 | " Returns:\n", |
|
812 | 812 | "\n", |
813 | 813 | "import cv2\n", |
814 | 814 | "import numpy as np\n", |
815 | | - "def solution_crop_image(img: np.ndarray, x: int, y: int, width: int, height: int):\n", |
| 815 | + "def solution_crop_image(image: np.ndarray, x: int, y: int, width: int, height: int):\n", |
816 | 816 | " \"\"\"\n", |
817 | 817 | " The function takes an image as input and crops it to a specified rectangular region.\n", |
818 | 818 | " In OpenCV, images are represented as NumPy arrays. You can crop the image by \n", |
819 | 819 | " using array slicing, keeping in mind that the first dimension is the y-axis (rows) \n", |
820 | | - " and the second dimension is the x-axis (columns): `img[y_start:y_end, x_start:x_end]`. \n", |
| 820 | + " and the second dimension is the x-axis (columns): `image[y_start:y_end, x_start:x_end]`. \n", |
821 | 821 | " You will need to calculate these start and end coordinates using the provided \n", |
822 | 822 | " x, y, width, and height parameters.\n", |
823 | 823 | "\n", |
824 | 824 | " Args:\n", |
825 | | - " img (np.ndarray): The input image.\n", |
| 825 | + " image (np.ndarray): The input image.\n", |
826 | 826 | " x (int): The starting x-coordinate (column) of the crop.\n", |
827 | 827 | " y (int): The starting y-coordinate (row) of the crop.\n", |
828 | 828 | " width (int): The width of the cropped region.\n", |
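A sketch of the cropping approach the docstring describes: pure NumPy slicing, with rows (the first axis) indexed by `y` and columns (the second axis) by `x` - no OpenCV call is needed.

```python
import numpy as np

def solution_crop_image(image: np.ndarray, x: int, y: int, width: int, height: int):
    # First axis = rows (y direction), second axis = columns (x direction)
    return image[y:y + height, x:x + width]
```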
|
880 | 880 | "\n", |
881 | 881 | "import cv2\n", |
882 | 882 | "import numpy as np\n", |
883 | | - "def solution_horizontal_flip_image(img: np.ndarray):\n", |
| 883 | + "def solution_horizontal_flip_image(image: np.ndarray):\n", |
884 | 884 | " \"\"\"\n", |
885 | 885 | " The function takes an image as input and flips it horizontally (creating a mirror image).\n", |
886 | 886 | " To do this, use OpenCV's `cv2.flip(src, flipCode)` function. The `flipCode` integer \n", |
|
890 | 890 | " - Use -1 for both axes.\n", |
891 | 891 | "\n", |
892 | 892 | " Args:\n", |
893 | | - " img (np.ndarray): The input image.\n", |
| 893 | + " image (np.ndarray): The input image.\n", |
894 | 894 | "\n", |
895 | 895 | " Returns:\n", |
896 | 896 | " np.ndarray: The horizontally flipped image.\n", |
|
944 | 944 | "\n", |
945 | 945 | "import cv2\n", |
946 | 946 | "import numpy as np\n", |
947 | | - "def solution_vertical_flip_image(img: np.ndarray):\n", |
| 947 | + "def solution_vertical_flip_image(image: np.ndarray):\n", |
948 | 948 | " \"\"\"\n", |
949 | 949 | " The function takes an image as input and flips it vertically.\n", |
950 | 950 | " To do this, use OpenCV's `cv2.flip(src, flipCode)` function. The `flipCode` integer \n", |
|
954 | 954 | " - Use -1 for both axes.\n", |
955 | 955 | "\n", |
956 | 956 | " Args:\n", |
957 | | - " img (np.ndarray): The input image.\n", |
| 957 | + " image (np.ndarray): The input image.\n", |
958 | 958 | "\n", |
959 | 959 | " Returns:\n", |
960 | 960 | " np.ndarray: The vertically flipped image.\n", |
|
1008 | 1008 | "\n", |
1009 | 1009 | "import cv2\n", |
1010 | 1010 | "import numpy as np\n", |
1011 | | - "def solution_rotate_image(img: np.ndarray, angle: float):\n", |
| 1011 | + "def solution_rotate_image(image: np.ndarray, angle: float):\n", |
1012 | 1012 | " \"\"\"\n", |
1013 | 1013 | " The function takes an image as input and rotates it by a specified angle. \n", |
1014 | 1014 | " To ensure the corners of the image are not cropped after rotation, you must \n", |
|
1029 | 1029 | " mat[0, 2] += (new_w / 2) - center[0]\n", |
1030 | 1030 | " mat[1, 2] += (new_h / 2) - center[1]\n", |
1031 | 1031 | " \n", |
1032 | | - " 6. Return the final rotated image using: `cv2.warpAffine(img, mat, (new_w, new_h))`\n", |
| 1032 | + " 6. Return the final rotated image using: `cv2.warpAffine(image, mat, (new_w, new_h))`\n", |
1033 | 1033 | "\n", |
1034 | 1034 | " Args:\n", |
1035 | | - " img (np.ndarray): The input image.\n", |
| 1035 | + " image (np.ndarray): The input image.\n", |
1036 | 1036 | " angle (float): The angle of rotation in degrees.\n", |
1037 | 1037 | "\n", |
1038 | 1038 | " Returns:\n", |
|
1102 | 1102 | "\n", |
1103 | 1103 | "import cv2\n", |
1104 | 1104 | "import numpy as np\n", |
1105 | | - "def solution_average_filter(img: np.ndarray, kernel_size: tuple = (5, 5)):\n", |
| 1105 | + "def solution_average_filter(image: np.ndarray, kernel_size: tuple = (5, 5)):\n", |
1106 | 1106 | " \"\"\"\n", |
1107 | 1107 | " Applies an average filter to blur the image using a specific kernel size.\n", |
1108 | 1108 | "    For that, use OpenCV's `cv2.blur()` function, which requires two arguments:\n", |
1109 | 1109 | "    the image and the size of the kernel.\n", |
1110 | 1110 | " \n", |
1111 | 1111 | " Args:\n", |
1112 | | - " img (np.ndarray): The input image.\n", |
| 1112 | + " image (np.ndarray): The input image.\n", |
1113 | 1113 | " kernel_size (tuple): The width and height of the blurring window. Default is (5, 5).\n", |
1114 | 1114 | "\n", |
1115 | 1115 | " Returns:\n", |
|
1164 | 1164 | "\n", |
1165 | 1165 | "import cv2\n", |
1166 | 1166 | "import numpy as np\n", |
1167 | | - "def solution_median_filter(img: np.ndarray, ksize: int):\n", |
| 1167 | + "def solution_median_filter(image: np.ndarray, ksize: int):\n", |
1168 | 1168 | " \"\"\"\n", |
1169 | 1169 | " Applies a median filter to the image using a specific kernel size.\n", |
1170 | 1170 | "    For that, use OpenCV's `cv2.medianBlur()` function, which requires two arguments:\n", |
1171 | 1171 | "    the image and the size of the kernel.\n", |
1172 | 1172 | " \n", |
1173 | 1173 | " Args:\n", |
1174 | | - " img (np.ndarray): The input image.\n", |
| 1174 | + " image (np.ndarray): The input image.\n", |
1175 | 1175 | " ksize (int): The size of the median filter kernel. Must be a positive odd integer.\n", |
1176 | 1176 | "\n", |
1177 | 1177 | " Returns:\n", |
|
1226 | 1226 | "\n", |
1227 | 1227 | "import cv2\n", |
1228 | 1228 | "import numpy as np\n", |
1229 | | - "def solution_gaussian_filter(img: np.ndarray, kernel_size: tuple = (5, 5), sigma: float = 0):\n", |
| 1229 | + "def solution_gaussian_filter(image: np.ndarray, kernel_size: tuple = (5, 5), sigma: float = 0):\n", |
1230 | 1230 | " \"\"\"\n", |
1231 | 1231 | " Applies a Gaussian filter to the image using a specific kernel size and sigma value.\n", |
1232 | 1232 | "    For that, use OpenCV's `cv2.GaussianBlur()` function, which requires three arguments:\n", |
1233 | 1233 | "    the image, the size of the kernel, and the sigma value.\n", |
1234 | 1234 | "\n", |
1235 | 1235 | " Args:\n", |
1236 | | - " img (np.ndarray): The input image.\n", |
| 1236 | + " image (np.ndarray): The input image.\n", |
1237 | 1237 | " kernel_size (tuple): The width and height of the Gaussian kernel. Default is (5, 5).\n", |
1238 | 1238 | " sigma (float): The standard deviation of the Gaussian kernel. Default is 0.\n", |
1239 | 1239 | "\n", |
|
1303 | 1303 | "\n", |
1304 | 1304 | "import cv2\n", |
1305 | 1305 | "import numpy as np\n", |
1306 | | - "def solution_adjust_brightness(img: np.ndarray, brightness_value: float):\n", |
| 1306 | + "def solution_adjust_brightness(image: np.ndarray, brightness_value: float):\n", |
1307 | 1307 | " \"\"\"\n", |
1308 | 1308 | " Adjusts the brightness of the image by adding a specified value to all pixel intensities.\n", |
1309 | 1309 | " To adjust the brightness, you can use OpenCV's `cv2.convertScaleAbs()` function, which scales, calculates absolute values, and converts the result to 8-bit.\n", |
1310 | 1310 | " `cv2.convertScaleAbs()` requires three arguments: the image, the alpha value (which is 1 for no scaling), and the beta value (which is the brightness adjustment value).\n", |
1311 | 1311 | " Args:\n", |
1312 | | - " img (np.ndarray): The input image.\n", |
| 1312 | + " image (np.ndarray): The input image.\n", |
1313 | 1313 | " brightness_value (float): The value to add to the pixel intensities. Positive values increase brightness, while negative values decrease it.\n", |
1314 | 1314 | " Returns:\n", |
1315 | 1315 | " np.ndarray: The brightness-adjusted image.\n", |
|
1366 | 1366 | "\n", |
1367 | 1367 | "import cv2\n", |
1368 | 1368 | "import numpy as np\n", |
1369 | | - "def solution_adjust_contrast(img: np.ndarray, contrast_value: float):\n", |
| 1369 | + "def solution_adjust_contrast(image: np.ndarray, contrast_value: float):\n", |
1370 | 1370 | " \"\"\"\n", |
1371 | 1371 | " Adjusts the contrast of the image by scaling the pixel intensities.\n", |
1372 | 1372 | " To adjust the contrast, you can use OpenCV's `cv2.convertScaleAbs()` function, which scales, calculates absolute values, and converts the result to 8-bit.\n", |
1373 | 1373 | " `cv2.convertScaleAbs()` requires three arguments: the image, the alpha value (which is the contrast adjustment value), and the beta value (which is 0 for no additional brightness adjustment).\n", |
1374 | 1374 | " Args:\n", |
1375 | | - " img (np.ndarray): The input image.\n", |
| 1375 | + " image (np.ndarray): The input image.\n", |
1376 | 1376 | " contrast_value (float): The value to scale the pixel intensities. Values greater than 1 increase contrast, while values between 0 and 1 decrease it.\n", |
1377 | 1377 | " Returns:\n", |
1378 | 1378 | " np.ndarray: The contrast-adjusted image.\n", |
|
1429 | 1429 | "\n", |
1430 | 1430 | "import cv2\n", |
1431 | 1431 | "import numpy as np\n", |
1432 | | - "def solution_adjust_saturation(img: np.ndarray, saturation_factor: float):\n", |
| 1432 | + "def solution_adjust_saturation(image: np.ndarray, saturation_factor: float):\n", |
1433 | 1433 | " \"\"\"\n", |
1434 | 1434 | " Adjusts the saturation of the image by modifying the saturation channel in the HSV color space.\n", |
1435 | 1435 | " To do that you need to convert the image from RGB to HSV, adjust the saturation channel, and then convert it back to RGB.\n", |
1436 | 1436 | " Follow these steps:\n", |
1437 | 1437 | " 1. It is very hard to change saturation in standard RGB format because the colors are mixed. \n", |
1438 | | - " First, convert the image to HSV format (`cv2.COLOR_RGB2HSV`) format using: `hsv_img = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)`.\n", |
1439 | | - " 2. Now that the image is in HSV, you can isolate the Saturation. Split the image into its three separate channels using: `cv2.split(hsv_img)`.\n", |
| 1438 | + "    First, convert the image to HSV format (`cv2.COLOR_RGB2HSV`) using: `hsv_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)`.\n", |
| 1439 | + " 2. Now that the image is in HSV, you can isolate the Saturation. Split the image into its three separate channels using: `cv2.split(hsv_image)`.\n", |
1440 | 1440 | " 3. Multiply the saturation channel by the `saturation_factor`. \n", |
1441 | 1441 | " 4. Pixel values cannot go above 255 or below 0. Use NumPy's clip function (`np.clip`) to enforce this limit.\n", |
1442 | 1442 | " 5. Math operations can change the data type. Force the new saturation channel back into standard image format `uint8`.\n", |
1443 | 1443 | " 6. Put the three channels back together in the correct order using `cv2.merge()`.\n", |
1444 | 1444 | " 7. Finally, convert the image back to normal RGB format (`cv2.COLOR_HSV2RGB`).\n", |
1445 | 1445 | "\n", |
1446 | 1446 | " Args:\n", |
1447 | | - " img (np.ndarray): The input image in RGB format.\n", |
| 1447 | + " image (np.ndarray): The input image in RGB format.\n", |
1448 | 1448 | " saturation_factor (float): The multiplier for the saturation channel. Values greater than 1 increase saturation, while values between 0 and 1 decrease it.\n", |
1449 | 1449 | "\n", |
1450 | 1450 | " Returns:\n", |
|
2136 | 2136 | "\n", |
2137 | 2137 | "To prepare the image for Grad-CAM visualization:\n", |
2138 | 2138 | "\n", |
2139 | | - "- First, convert it to (Height, Width, Channels) format using ```img_np = np.transpose(img, (1, 2, 0)) # shape: (H, W, C)```, and normalize its values to the [0, 1] range with ```img_np = (img_np - img_np.min()) / (img_np.max() - img_np.min())```.\n", |
| 2139 | + "- First, convert it to (Height, Width, Channels) format using ```image_np = np.transpose(image, (1, 2, 0)) # shape: (H, W, C)```, and normalize its values to the [0, 1] range with ```image_np = (image_np - image_np.min()) / (image_np.max() - image_np.min())```.\n", |
2140 | 2140 | " This processed image is used only for visualization, as expected by the PyTorch-GradCAM library.\n", |
2141 | | - "- Next, modify the original image for model inference by adding a batch dimension: ```img = np.expand_dims(img, axis=0)```, then convert it to a PyTorch tensor: ```img = torch.from_numpy(img)```, and move it to the appropriate computation device using ```img = img.to(DEVICE)```.\n", |
| 2141 | + "- Next, modify the original image for model inference by adding a batch dimension: ```image = np.expand_dims(image, axis=0)```, then convert it to a PyTorch tensor: ```image = torch.from_numpy(image)```, and move it to the appropriate computation device using ```image = image.to(DEVICE)```.\n", |
2142 | 2142 | "- Finally, retrieve the predicted and true labels, as both are required for computing and visualizing the Grad-CAM output." |
2143 | 2143 | ] |
2144 | 2144 | }, |
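The preparation steps above can be sketched as follows. `prepare_for_gradcam` is a hypothetical helper name (not from the notebook), and the inference-side steps are shown only as comments since they require `torch` and a `DEVICE`:

```python
import numpy as np

def prepare_for_gradcam(image: np.ndarray):
    """Visualization-side prep: (C, H, W) -> (H, W, C), values scaled to [0, 1]."""
    image_np = np.transpose(image, (1, 2, 0))  # shape: (H, W, C)
    image_np = (image_np - image_np.min()) / (image_np.max() - image_np.min())
    # Inference-side prep (requires torch):
    # image = np.expand_dims(image, axis=0)        # add batch dimension
    # image = torch.from_numpy(image).to(DEVICE)   # tensor on the compute device
    return image_np
```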
|
2154 | 2154 | "\n", |
2155 | 2155 | "# Get a batch of images\n", |
2156 | 2156 | "idx = 1\n", |
2157 | | - "img = original_images[idx]\n", |
| 2157 | + "image = original_images[idx]\n", |
2158 | 2158 | "pred_label = predicted_labels[idx]\n", |
2159 | 2159 | "true_label = true_labels[idx]" |
2160 | 2160 | ] |
|
2168 | 2168 | "\n", |
2169 | 2169 | "To compute the Grad-CAM heatmap:\n", |
2170 | 2170 | "\n", |
2171 | | - "- First, ensure that the `requires_grad` attribute of the input image tensor is set to `True` by using `img.requires_grad = True`.\n", |
| 2171 | + "- First, ensure that the `requires_grad` attribute of the input image tensor is set to `True` by using `image.requires_grad = True`.\n", |
2172 | 2172 | " This enables gradient computation with respect to the image, which is necessary for generating class activation maps.\n", |
2173 | 2173 | "- Next, specify the layer to inspect using ```target_layers = [model.conv3]```.\n", |
2174 | 2174 | "- Typically, the last convolutional layer of the image classifier is chosen because it preserves spatial information, which is crucial for identifying the regions of the input image that most strongly influence the model's prediction. \n", |
|
2177 | 2177 | "```python\n", |
2178 | 2178 | "# Create CAM object\n", |
2179 | 2179 | "with GradCAM(model=model, target_layers=target_layers) as cam:\n", |
2180 | | - " grad_cam_matrix = cam(input_tensor=img, targets=targets)\n", |
| 2180 | + " grad_cam_matrix = cam(input_tensor=image, targets=targets)\n", |
2181 | 2181 | " grad_cam_matrix = grad_cam_matrix[0, :]\n", |
2182 | 2182 | "```" |
2183 | 2183 | ] |
|
2208 | 2208 | "source": [ |
2209 | 2209 | "#### Visualise Grad-CAM heatmap with the image\n", |
2210 | 2210 | "\n", |
2211 | | - "After obtaining the Grad-CAM heatmap, we overlay it on the input image to visualise the regions that contributed most to the model’s prediction (```visualisation = show_cam_on_image(img_np, grad_cam_matrix, use_rgb=True)```).\n", |
| 2211 | + "After obtaining the Grad-CAM heatmap, we overlay it on the input image to visualise the regions that contributed most to the model’s prediction (```visualisation = show_cam_on_image(image_np, grad_cam_matrix, use_rgb=True)```).\n", |
2212 | 2212 | "This helps identify which pixels the model focused on when predicting the class." |
2213 | 2213 | ] |
2214 | 2214 | }, |
|
2227 | 2227 | "# Plot image with GradCAM output\n", |
2228 | 2228 | "true_class = CIFAR_10_CLASSES[true_label]\n", |
2229 | 2229 | "pred_class = CIFAR_10_CLASSES[pred_label]\n", |
2230 | | - "plot_multiple_images((img_np, f\"Original - {true_class}\"), (visualisation, f\"Grad-CAM - {pred_class}\"), figsize = (5,6))" |
| 2230 | + "plot_multiple_images((image_np, f\"Original - {true_class}\"), (visualisation, f\"Grad-CAM - {pred_class}\"), figsize = (5,6))" |
2231 | 2231 | ] |
2232 | 2232 | } |
2233 | 2233 | ], |
|