Commit 23d1840

Merge pull request #54078 from Orin-Thomas/orthomas-1apr26b
Rework of Introduction to Computer Vision with TensorFlow module
2 parents 4b013f5 + 5aba732 commit 23d1840

27 files changed

Lines changed: 999 additions & 2923 deletions

learn-pr/tensorflow/intro-computer-vision-tensorflow/1-introduction.yml

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,10 @@ metadata:
66
description: Introduction
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 1
1614
content: |
1715
[!include[](includes/1-introduction.md)]

learn-pr/tensorflow/intro-computer-vision-tensorflow/2-image-data.yml

Lines changed: 6 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -2,19 +2,17 @@
22
uid: learn.tensorflow.intro-computer-vision.image-data
33
title: Introduction to image data
44
metadata:
5-
title: Introduction to image data
5+
title: Introduction to Image Data
66
description: Learn how image data can be represented as tensors for neural network training
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 10
16-
sandbox: true
17-
notebook: notebooks/2-image-data.ipynb
14+
content: |
15+
[!include[](includes/2-image-data.md)]
1816
quiz:
1917
title: Check your knowledge
2018
questions:
@@ -25,7 +23,7 @@ quiz:
2523
explanation: "Don't forget to take into account that the image has color"
2624
- content: "32x32x3 tensor of floats in the range 0..1"
2725
isCorrect: true
28-
explanation: "Correct. We have 3 color channels, and each channel is normalized to the range of 0..1"
26+
explanation: "Correct. We have three color channels, and each channel is normalized to the range of 0..1"
2927
- content: "32x32x3 tensor of integers in the range 0..255"
3028
isCorrect: false
3129
explanation: "Neural networks are trained with float numbers, and they should be normalized in the range 0..1"

learn-pr/tensorflow/intro-computer-vision-tensorflow/3-train-dense-neural-networks.yml

Lines changed: 11 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2,19 +2,17 @@
22
uid: learn.tensorflow.intro-computer-vision.train-dense-neural-networks
33
title: Training a dense neural network
44
metadata:
5-
title: Training a dense neural network
5+
title: Training a Dense Neural Network
66
description: Learn how to classify images using one-layer dense neural network
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 10
16-
sandbox: true
17-
notebook: notebooks/3-train-dense-neural-networks.ipynb
14+
content: |
15+
[!include[](includes/3-train-dense-neural-networks.md)]
1816
quiz:
1917
title: Check your knowledge
2018
questions:
@@ -42,19 +40,19 @@ quiz:
4240
explanation: "Correct, both of those solutions are valid."
4341
- content: "We want to monitor the accuracy of the model on validation dataset during training. What do we need to do?"
4442
choices:
45-
- content: "Specify `metrics=['acc']` in a call to `model.compile`"
43+
- content: "Specify `metrics=['accuracy']` in a call to `model.compile`"
4644
isCorrect: false
4745
explanation: "This isn't all"
48-
- content: "Specify `metrics=['acc']` in a call to `model.fit`"
46+
- content: "Specify `metrics=['accuracy']` in a call to `model.fit`"
4947
isCorrect: false
5048
explanation: "Metrics are specified during model compilation phase"
5149
- content: "Provide validation dataset using `validation_data` parameter in `model.fit`"
5250
isCorrect: false
5351
explanation: "This isn't all"
54-
- content: "Options (a) and (c)"
52+
- content: "Specify `metrics=['accuracy']` in a call to `model.compile`, and provide validation dataset using `validation_data` parameter in `model.fit`"
5553
isCorrect: true
56-
explanation: "Correct, you need to make sure that metric is specified and validation dataset is provided"
57-
- content: "Options (b) and (c)"
54+
explanation: "Correct, you need to make sure that metric is specified during compilation and validation dataset is provided during fitting"
55+
- content: "Specify `metrics=['accuracy']` in a call to `model.fit`, and provide validation dataset using `validation_data` parameter in `model.fit`"
5856
isCorrect: false
59-
explanation: "Metrics are specified during model compilation phase"
57+
explanation: "Metrics are specified during model compilation phase, not in the fit call"
6058

learn-pr/tensorflow/intro-computer-vision-tensorflow/4-multilayer-dense-neural-networks.yml

Lines changed: 5 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2,19 +2,17 @@
22
uid: learn.tensorflow.intro-computer-vision.multilayer-dense-neural-networks
33
title: Multi-layer networks
44
metadata:
5-
title: Multi-layer networks
5+
title: Multi-Layer Networks
66
description: Learn how multi-layer neural networks can be constructed to improve model accuracy
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 10
16-
sandbox: true
17-
notebook: notebooks/4-multilayer-dense-neural-networks.ipynb
14+
content: |
15+
[!include[](includes/4-multilayer-dense-neural-networks.md)]
1816
quiz:
1917
title: Check your knowledge
2018
questions:

learn-pr/tensorflow/intro-computer-vision-tensorflow/5-convolutional-networks.yml renamed to learn-pr/tensorflow/intro-computer-vision-tensorflow/5-convolution-network.yml

Lines changed: 6 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,18 @@
11
### YamlMime:ModuleUnit
2-
uid: learn.tensorflow.intro-computer-vision.convolutional-networks
2+
uid: learn.tensorflow.intro-computer-vision.convolution-network
33
title: Convolutional neural networks
44
metadata:
5-
title: Convolutional neural networks
5+
title: Convolutional Neural Networks
66
description: Learn special convolutional architecture of neural networks for image recognition tasks
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 15
16-
sandbox: true
17-
notebook: notebooks/5-convolutional-networks.ipynb
14+
content: |
15+
[!include[](includes/5-convolution-network.md)]
1816
quiz:
1917
title: Check your knowledge
2018
questions:

learn-pr/tensorflow/intro-computer-vision-tensorflow/6-transfer-learning.yml

Lines changed: 6 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -2,23 +2,21 @@
22
uid: learn.tensorflow.intro-computer-vision.transfer-learning
33
title: Pretrained models and transfer learning
44
metadata:
5-
title: Pretrained models and transfer learning
5+
title: Pretrained Models and Transfer Learning
66
description: Learn how to use models pretrained on large datasets, and how to train our own models using them.
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 10
16-
sandbox: true
17-
notebook: notebooks/6-transfer-learning.ipynb
14+
content: |
15+
[!include[](includes/6-transfer-learning.md)]
1816
quiz:
1917
title: Check your knowledge
2018
questions:
21-
- content: "For transfer learning, we're using a VGG-16 network pretrained on 1000 classes. What is the number of classes we can have in our network?"
19+
- content: "For transfer learning, we're using a VGG-16 network pretrained on 1,000 classes. What is the number of classes we can have in our network?"
2220
choices:
2321
- content: "Any"
2422
isCorrect: true

learn-pr/tensorflow/intro-computer-vision-tensorflow/7-summary.yml

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3,15 +3,13 @@ uid: learn.tensorflow.intro-computer-vision.summary
33
title: Summary
44
metadata:
55
title: Summary
6-
description: In this last unit, we briefly recap what we have learned, and set directions for further learning.
6+
description: In this last unit, we briefly recap the module.
77
author: Orin-Thomas
88
ms.author: orthomas
9-
ms.date: 07/07/2021
9+
ms.date: 04/01/2026
1010
ms.update-cycle: 180-days
1111
ms.topic: unit
12-
ms.collection:
13-
- ce-advocates-ai-copilot
14-
ms.custom: team=nextgen
12+
ms.collection: ce-advocates-ai-copilot
1513
durationInMinutes: 1
1614
content: |
1715
[!include[](includes/7-summary.md)]
Lines changed: 7 additions & 9 deletions
@@ -1,11 +1,9 @@
1-
In this module we will learn how to perform different computer vision tasks using [TensorFlow](https://www.tensorflow.org/) and Keras. We will start with an introduction to a small Dense Neural Network (DNNs), then build on that knowledge to dive into Convolutional Neural Networks (CNNs). Lastly, we will look at how we can use pre-trained models in transfer learning to improve our results with less data.
1+
In this module, we learn how to perform different computer vision tasks using [TensorFlow](https://www.tensorflow.org/) and Keras. We start with an introduction to dense neural networks, then build on that knowledge to dive into Convolutional Neural Networks (CNNs). Lastly, we look at how we can use pretrained models in transfer learning to improve our results with less data.
22

3-
## Learning objectives
4-
- Introduction to how to build computer vision machine learning models
5-
- Intro to training with Dense Neural Networks (DNNs)
6-
- Intro to Convolutional Neural Networks (CNNs)
3+
In this module you will:
4+
5+
- Build computer vision machine learning models
6+
- Train image classifiers with dense neural networks
7+
- Apply Convolutional Neural Networks (CNNs) to extract spatial patterns
8+
- Use transfer learning with pretrained models to classify custom images
79

8-
## Prerequisites
9-
- Knowledge of Python
10-
- Basic knowledge about how to use Jupyter Notebooks
11-
- Basic understand of machine learning
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
In computer vision, we normally solve one of the following problems:
2+
3+
- **Image Classification** is the simplest task: we assign an image to one of several predefined categories, for example, distinguishing a cat from a dog in a photograph or recognizing a handwritten digit.
4+
- **Object Detection** is a somewhat harder task, in which we need to find known objects in the picture and localize them, that is, return the **bounding box** for each recognized object.
5+
- **Segmentation** is similar to object detection, but instead of a bounding box, we need to return an exact pixel map outlining each recognized object.
6+
7+
![An image showing how computer vision object detection can be performed with cats, dogs, and ducks.](../media/cat-dog-duck.png)
8+
9+
*Image from [CS231n Stanford Course](https://cs231n.stanford.edu/)*
10+
11+
## Images as tensors
12+
13+
Computer vision works with images. As you probably know, images consist of pixels, so an image can be thought of as a rectangular array of pixels.
14+
15+
In the first part of this module, we'll deal with handwritten digit recognition. We'll use the MNIST dataset, which consists of 28×28-pixel grayscale images of handwritten digits. Each image can be represented as a 28×28 array whose elements denote the intensity of the corresponding pixel, either as floating-point numbers in the range 0 to 1 or as integers in the range 0 to 255. The Python library NumPy (`numpy`) is often used for computer vision tasks because it operates on multidimensional arrays efficiently.
16+
17+
To deal with color images, we need some way to represent colors. In most cases, we represent each pixel by three intensity values, corresponding to the Red (R), Green (G), and Blue (B) components. This color encoding is called RGB, so a color image of size W×H is represented as an array of size H×W×3 (the order of components sometimes differs, but the idea is the same). In the array representation, the height (number of rows) comes before the width (number of columns), which is the opposite of the common W×H image convention.
18+
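The H×W×3 layout described above can be sketched with NumPy alone. The 640×480 image size and the variable names are assumptions chosen for illustration; they don't come from the module's dataset.

```python
import numpy as np

# Hypothetical image dimensions, chosen only for illustration
width, height = 640, 480

# Array shape is (rows, columns, channels), that is (H, W, 3)
image = np.zeros((height, width, 3), dtype=np.uint8)

# Paint the top-left pixel pure red: R=255, G=0, B=0
image[0, 0] = [255, 0, 0]

print(image.shape)  # (480, 640, 3)

# Selecting a single channel gives a 2-D array of shape (H, W)
red_channel = image[:, :, 0]
print(red_channel.shape)  # (480, 640)
```

Note that the first index selects the row (vertical position) and the second selects the column, so `image[y, x]` addresses the pixel at horizontal position `x` and vertical position `y`.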
19+
Multidimensional arrays are also called **tensors**. Representing images as tensors has another advantage: we can use an extra dimension to store a sequence of images. For example, to represent a video fragment of 200 frames, each 800×600 pixels (width × height), we can use a tensor of size 200×600×800×3. Remember that tensor dimensions use H×W (row-major) order, not the W×H convention commonly seen in image editors, so the order here is frames × height (600) × width (800) × channels. This ordering is known as `channels_last` and is the default in TensorFlow; some other frameworks place channels before height and width (`channels_first`).
20+
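As a rough sketch of the two layouts, the following NumPy snippet builds a small video tensor in `channels_last` order and rearranges it to `channels_first`. It assumes a clip of only 4 frames (rather than 200) to keep memory use low; the variable names are illustrative.

```python
import numpy as np

# Hypothetical short clip: 4 frames instead of 200 to keep memory use small
frames, height, width, channels = 4, 600, 800, 3

# channels_last layout: frames x height x width x channels
video = np.zeros((frames, height, width, channels), dtype=np.uint8)
print(video.shape)  # (4, 600, 800, 3)

# channels_first layout: move the channel axis in front of height and width
video_cf = np.transpose(video, (0, 3, 1, 2))
print(video_cf.shape)  # (4, 3, 600, 800)
```

`np.transpose` returns a view with reordered axes rather than copying the pixel data, so converting between layouts this way is cheap until a contiguous copy is needed.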
21+
```python
22+
import tensorflow as tf
23+
import keras
24+
import matplotlib.pyplot as plt
25+
import numpy as np
26+
27+
# Prints the installed TensorFlow version
28+
print(tf.__version__)
29+
```
30+
31+
We're going to use the [Keras](https://keras.io/) framework for our experiments. Throughout this module, we use `import keras` (the standalone Keras 3 import style), which requires **TensorFlow 2.16 or later** (or a standalone installation via `pip install "keras>=3.0"`). If you're using an older TensorFlow 2.x version, replace `import keras` with `from tensorflow import keras`.
32+
33+
```python
34+
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
35+
36+
# Output: (60000, 28, 28) (60000,)
37+
print(x_train.shape, y_train.shape)
38+
# Output: (10000, 28, 28) (10000,)
39+
print(x_test.shape, y_test.shape)
40+
```
41+
42+
## Visualize the digits dataset
43+
44+
Now that we've downloaded the dataset, we can visualize some of the digits:
45+
46+
```python
47+
fig, ax = plt.subplots(1, 7)
48+
for i in range(7):
49+
    ax[i].imshow(x_train[i])
50+
    ax[i].set_title(y_train[i])
51+
    ax[i].axis('off')
52+
# Displays a row of seven handwritten digit images with their labels
53+
```
54+
55+
## Dataset structure
56+
57+
We have a total of 60,000 training images and 10,000 test images, and each image has a size of 28×28 pixels:
58+
59+
```python
60+
print('Training samples:', len(x_train))
61+
print('Test samples:', len(x_test))
62+
print('Tensor size:', x_train[0].shape)
63+
print('First 10 digits are:', y_train[:10])
64+
print('Type of data is', type(x_train))
65+
# Output:
66+
# Training samples: 60000
67+
# Test samples: 10000
68+
# Tensor size: (28, 28)
69+
# First 10 digits are: [5 0 4 1 9 2 1 3 1 4]
70+
# Type of data is <class 'numpy.ndarray'>
71+
```
72+
73+
As you can see, the data is stored as a NumPy array (`numpy.ndarray`). Each pixel intensity is represented by an integer value between 0 and 255:
74+
75+
```python
76+
print('Min intensity value:', x_train.min())
77+
print('Max intensity value:', x_train.max())
78+
# Output:
79+
# Min intensity value: 0
80+
# Max intensity value: 255
81+
```
82+
83+
The values range from 0 to 255 because each pixel is stored as an 8-bit unsigned integer. In many cases, especially when working with neural networks, it's more convenient to scale all values to the range [0, 1] by dividing by 255. This process is called **normalization**:
84+
85+
```python
86+
x_train = x_train.astype(np.float32) / 255.0
87+
x_test = x_test.astype(np.float32) / 255.0
88+
# Pixel values are now floating point numbers in the range [0, 1]
89+
```
90+
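To check that normalization behaves as expected without downloading MNIST, the same operation can be tried on synthetic data. This is a minimal sketch: the `fake_images` array is a random stand-in with the same per-image shape and dtype as the MNIST arrays, not the real dataset.

```python
import numpy as np

# Synthetic stand-in for a batch of MNIST images: 5 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Same normalization as above: cast to float32, then divide by 255
normalized = fake_images.astype(np.float32) / 255.0

print(normalized.dtype)  # float32
print(normalized.min() >= 0.0, normalized.max() <= 1.0)  # True True
```

Casting to `float32` before dividing keeps the result in single precision, which is the floating-point type neural network frameworks typically work with by default.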
91+
Now we have the data, and we're ready to start training our first neural network!
