
Commit 3321474

Merge pull request #54021 from Orin-Thomas/orthomas-30Mar26a
Rework of NLP tensorflow model
2 parents 574e538 + c6c0126 commit 3321474

40 files changed

Lines changed: 1451 additions & 2828 deletions

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/1-introduction.yml

Lines changed: 1 addition & 2 deletions
@@ -6,9 +6,8 @@ metadata:
   description: Introduction to natural language processing with TensorFlow
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 1
 content: |
   [!include[](includes/1-introduction.md)]

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/2-represent-text-as-tensors.yml

Lines changed: 4 additions & 5 deletions
@@ -3,12 +3,11 @@ uid: learn.tensorflow.intro-natural-language-processing.representing-text
 title: Representing text as Tensors
 metadata:
   title: Representing text as Tensors
-  description: In this unit, we discuss different ways text can be represented as tensors and solve a simple text classification problem.
+  description: In this unit, we discuss different ways text can be represented as tensors and solve a text classification problem.
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 10
-sandbox: true
-notebook: notebooks/2-represent-text-as-tensors.ipynb
+content: |
+  [!include[](includes/2-represent-text-as-tensors.md)]
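The unit above covers representing text as tensors. As a minimal plain-Python illustration of one such representation (a sketch, not the module's TensorFlow notebook code; the toy corpus and helper names are invented), a bag-of-words encoding turns each sentence into a fixed-length count vector over the vocabulary:

```python
from collections import Counter

# Toy corpus, invented for illustration.
corpus = ["i love this movie", "i hate this movie"]

# Assign each distinct token an integer index, in first-seen order.
vocab = {}
for sentence in corpus:
    for token in sentence.split():
        vocab.setdefault(token, len(vocab))

def to_bow(sentence, vocab):
    """Encode a sentence as a bag-of-words count vector over the vocabulary."""
    counts = Counter(sentence.split())
    return [counts.get(token, 0) for token in vocab]

bow = to_bow("i love this movie", vocab)
```

Each position of `bow` counts one vocabulary word; words outside the vocabulary are simply dropped, which is one reason this representation loses word order and context.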

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/3-embeddings.yml

Lines changed: 3 additions & 4 deletions
@@ -6,9 +6,8 @@ metadata:
   description: Embeddings are a way to represent words using some vector representation that has nice semantic properties. We discuss different embeddings, and how using embeddings can improve classification accuracy.
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 15
-sandbox: true
-notebook: notebooks/3-embeddings.ipynb
+content: |
+  [!include[](includes/3-embeddings.md)]
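The description above says embeddings represent words as dense vectors with useful semantic properties. As a hedged sketch of the mechanics (pure Python; the table values here are invented, whereas a real embedding layer learns them during training), an embedding is just a lookup table from token index to vector, and a whole sentence can be summarized by averaging its word vectors:

```python
# Invented 3-dimensional embedding table: one dense vector per vocabulary index.
embedding_table = [
    [0.1, 0.2, 0.3],  # vector for token index 0
    [0.4, 0.1, 0.0],  # vector for token index 1
    [0.2, 0.2, 0.2],  # vector for token index 2
]

def embed_mean(token_indices, table):
    """Look up each token's vector and average them into one sentence vector."""
    vectors = [table[i] for i in token_indices]
    dim = len(table[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

sentence_vector = embed_mean([0, 1], embedding_table)
```

Unlike an 80,000-dimensional one-hot input, the classifier after this lookup only sees a small dense vector, which is the dimensionality reduction the module's quiz refers to.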

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/4-recurrent-networks.yml

Lines changed: 3 additions & 4 deletions
@@ -6,9 +6,8 @@ metadata:
   description: While traditional fully connected networks don't allow us to capture word order, RNN is a mechanism that can capture patterns in sequences. We show how to use RNN for text classification, and discuss different RNN architectures, such as LSTM and GRU.
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 15
-sandbox: true
-notebook: notebooks/4-recurrent-networks.ipynb
+content: |
+  [!include[](includes/4-recurrent-networks.md)]
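The description explains that an RNN captures patterns in sequences by passing state between steps. A minimal sketch of a single recurrent cell (scalar weights for readability; real RNN cells such as LSTM and GRU use weight matrices and gates, and the weight values here are invented):

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new hidden state depends on the current
    input and on the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# Invented scalar weights; a trained network would learn these.
w_x, w_h, b = 0.5, 0.8, 0.0

h = 0.0  # initial hidden state
for x in [1.0, 0.0, 1.0]:  # a tiny input sequence
    h = rnn_step(x, h, w_x, w_h, b)
```

Because `h` is threaded through every step, the final state depends on the whole sequence and its order, which is exactly what a fully connected network over a bag-of-words cannot capture.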

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/5-generative-networks.yml

Lines changed: 3 additions & 4 deletions
@@ -6,9 +6,8 @@ metadata:
   description: In this unit, we learn how to use recurrent networks to generate text, by using the output at each network layer.
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 15
-sandbox: true
-notebook: notebooks/5-generative-networks.ipynb
+content: |
+  [!include[](includes/5-generative-networks.md)]
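The unit describes generating text with recurrent networks by using the network's output at each step. A generation loop typically samples the next token from the model's output distribution; here is a hedged pure-Python sketch of that sampling step using a temperature-scaled softmax (the function name is invented, and in practice the logits would come from a trained model):

```python
import math
import random

def sample_next(logits, temperature=1.0, rng=random):
    """Sample a token index from a temperature-scaled softmax over logits.

    Lower temperature sharpens the distribution (more greedy);
    higher temperature flattens it (more random).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                     # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1               # guard against floating-point rounding

next_token = sample_next([2.0, 0.5, 0.1], temperature=0.8, rng=random.Random(0))
```

At very low temperature this behaves like greedy argmax decoding; at higher temperatures the generated text becomes more varied but less coherent.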

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/6-knowledge-check.yml

Lines changed: 27 additions & 19 deletions
@@ -4,23 +4,22 @@ title: Module assessment
 metadata:
   title: Module assessment
   description: Check your knowledge
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   author: Orin-Thomas
   ms.author: orthomas
   ms.topic: unit
-  ms.custom: team=nextgen
 module_assessment: true
 durationInMinutes: 5
 quiz:
   questions:
-  - content: "Suppose your text corpus contains 80,000 different words. Which of the below would you complete reducing the dimensionality of the input vector to neural classifier?"
+  - content: "Suppose your text corpus contains 80,000 different words. Which of the following would help reduce the dimensionality of the input vector to a neural classifier?"
     choices:
     - content: "Randomly select 10% of the words and ignore the rest."
       isCorrect: false
       explanation: "It's definitely not a good idea, especially because you risk omitting semantically important words"
     - content: "Use convolutional layer before fully connected classifier layer"
       isCorrect: false
-      explanation: "Convolutional layers don't reduce the dimensionality of input vectors"
+      explanation: "Convolutional layers extract spatial features from an already-encoded input, but they don't reduce the vocabulary dimension itself. An embedding layer is the standard approach to map sparse high-dimensional word representations into dense low-dimensional vectors."
     - content: "Use embedding layer before fully connected classifier layer"
       isCorrect: true
       explanation: "This is correct"
@@ -45,32 +44,41 @@ quiz:
     choices:
     - content: "A network is applied for each input element and output from the previous application is passed to the next one"
       isCorrect: true
-      explanation: "This is correct."
+      explanation: "This is correct. The same network weights are applied at each time step, and the hidden state from the previous step is passed as input to the next, creating a recurrence."
     - content: "It's trained by a recurrent process"
       isCorrect: false
-      explanation: "Recurrent neural network is trained in the same manner as any other neural network"
-    - content: "It consists of layers which include other subnetworks"
+      explanation: "Recurrent neural networks are trained using backpropagation through time, but that is the training algorithm, not the reason they're called recurrent."
+    - content: "It consists of layers, which include other subnetworks"
       isCorrect: false
-      explanation: "While you can consider recurrent block to be a combination of two linear layers, it has nothing to do with recurrence"
+      explanation: "While you can consider a recurrent block to be a combination of two linear layers, nesting subnetworks has nothing to do with recurrence."
+    - content: "The network processes the entire input multiple times in repeated passes"
+      isCorrect: false
+      explanation: "An RNN processes the input sequence once, stepping through it one token at a time. The recurrence refers to passing state between steps, not revisiting the entire input."
   - content: "What is the main idea behind LSTM network architecture?"
     choices:
     - content: "Fixed number of LSTM blocks for the whole dataset"
       isCorrect: false
-      explanation: "Number of LSTM blocks depend on the sequence length in the minibatch"
+      explanation: "The number of LSTM blocks depends on the sequence length in the minibatch, not on the dataset as a whole."
     - content: "It contains many layers of recurrent neural networks"
      isCorrect: false
-      explanation: "LSTM can consist of one or more levels"
-    - content: "Explicit state management with forgetting and state triggering"
+      explanation: "An LSTM can consist of one or more layers. The defining feature of an LSTM is its gating mechanism, not the number of layers."
+    - content: "LSTMs use gating mechanisms (forget, input, and output gates) that explicitly control which information is retained or discarded across time steps"
       isCorrect: true
-      explanation: "In LSTM, each block receives and outputs a state, which is manipulated upon inside the block depending on input and previous state."
-  - content: "What is the main idea of attention?"
+      explanation: "Correct. LSTM gates solve the vanishing gradient problem found in simple RNNs by allowing the network to selectively retain or discard information in its cell state across many time steps."
+    - content: "LSTMs use a larger hidden state vector than simple RNNs"
+      isCorrect: false
+      explanation: "The hidden state size is a hyperparameter that can be set to any value for both simple RNNs and LSTMs. The key innovation of LSTMs is the gating mechanism, not the size of the hidden state."
+  - content: "What is the main advantage of using TF-IDF representation over a simple bag-of-words representation?"
     choices:
-    - content: "Attention assigns a weight coefficient to each word in the vocabulary to show how important it's"
+    - content: "TF-IDF captures the order of words in a sentence"
       isCorrect: false
-      explanation: "Not correct. Attention works inside each sentence, and reflects relative importance between words."
-    - content: "Attention is a network layer that uses attention matrix to see how much input states from each step affect the final result."
+      explanation: "Neither bag-of-words nor TF-IDF captures word order. Both represent documents as unordered collections of word weights."
+    - content: "TF-IDF gives higher weight to words that are more important for distinguishing documents, by down-weighting common words"
       isCorrect: true
-      explanation: "Correct. By looking at attention matrix we can visually estimate which words play more important role in different parts of the sentence."
-    - content: "Attention builds global correlation matrix between all words in vocabulary, showing their cooccurrence"
+      explanation: "Correct. TF-IDF reduces the weight of frequently occurring words (like 'the' and 'a') and increases the weight of words that are distinctive to specific documents."
+    - content: "TF-IDF uses neural networks to learn word importance"
+      isCorrect: false
+      explanation: "TF-IDF is a purely statistical method based on term frequency and document frequency. It doesn't involve any neural network training."
+    - content: "TF-IDF produces lower-dimensional vectors than bag-of-words"
       isCorrect: false
-      explanation: "This isn't correct, attention computer relative importance of words inside each sentence."
+      explanation: "TF-IDF vectors have the same dimensionality as bag-of-words vectors (one element per vocabulary term). The difference is that TF-IDF assigns floating-point weights instead of simple counts."
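The new TF-IDF question states that TF-IDF keeps the bag-of-words dimensionality while down-weighting common words. A small pure-Python sketch of the computation, consistent with those quiz explanations (the toy documents are invented; real pipelines usually rely on a library implementation):

```python
import math
from collections import Counter

# Invented toy documents; "the" appears in every document.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the birds and the bees",
]
tokenized = [d.split() for d in docs]

# One dimension per distinct term -- the same dimensionality as bag-of-words.
vocab = sorted({t for doc in tokenized for t in doc})

def tf_idf(doc, all_docs, vocab):
    """TF-IDF vector: term frequency scaled by inverse document frequency."""
    counts = Counter(doc)
    n = len(all_docs)
    vec = []
    for term in vocab:
        tf = counts[term] / len(doc)
        df = sum(1 for d in all_docs if term in d)
        idf = math.log(n / df) if df else 0.0
        vec.append(tf * idf)
    return vec

v = tf_idf(tokenized[0], tokenized, vocab)
```

Because "the" occurs in all three documents, its idf is log(3/3) = 0 and its weight drops to zero, while distinctive words such as "cat" keep a positive weight; the vector length still equals the vocabulary size.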

learn-pr/tensorflow/intro-natural-language-processing-tensorflow/7-summary.yml

Lines changed: 1 addition & 2 deletions
@@ -6,9 +6,8 @@ metadata:
   description: In this final unit, we summarize what we have learned and what should you focus on next if you want to continue your journey into NLP.
   author: Orin-Thomas
   ms.author: orthomas
-  ms.date: 07/07/2021
+  ms.date: 03/29/2026
   ms.topic: unit
-  ms.custom: team=nextgen
 durationInMinutes: 2
 content: |
   [!include[](includes/7-summary.md)]
Lines changed: 10 additions & 20 deletions
@@ -1,26 +1,16 @@
-In this module, we will explore different neural network architectures for dealing with natural language text. In recent years, **Natural Language Processing** (NLP) has experienced fast growth as a field, both because of improvements to the language model architectures and because they've been trained on increasingly large text corpora. As a result, their ability to "understand" text has vastly improved, and large pre-trained models such as BERT have become widely used.
+In this module, we explore different neural network architectures for dealing with natural language text. In recent years, **Natural Language Processing** (NLP) has experienced fast growth as a field, both because of improvements to the language model architectures and because they've been trained on increasingly large text corpora. As a result, their ability to "understand" text has vastly improved.
 
-We will focus on the fundamental aspects of representing NLP as tensors in TensorFlow, and on classical NLP architectures, such as using bag-of-words, embeddings and recurrent neural networks.
+We focus on the fundamental aspects of representing NLP as tensors in TensorFlow, and on classical NLP architectures, such as using bag-of-words, embeddings, and recurrent neural networks.
 
-## Natural Language Tasks
+## Natural language tasks
 
 There are several NLP tasks that we can solve using neural networks:
-* **Text Classification** is used when we need to classify a text fragment into one of several predefined classes. Examples include e-mail spam detection, news categorization, assigning a support request to a category, and more.
-* **Intent Classification** is one specific case of text classification, where we want to map an input utterance in the conversational AI system into one of the intents that represent the actual meaning of the phrase, or intent of the user.
-* **Sentiment Analysis** is a regression task, where we want to understand the degree of positivity of a given piece of text. We may want to label text in a dataset from most negative (-1) to most positive (+1), and train a model that will output a number representing the positivity of the input text.
-* **Named Entity Recognition** (NER) is the task of extracting entities from text, such as dates, addresses, people names, etc. Together with intent classification, NER is often used in dialog systems to extract parameters from the user's utterance.
-* A similar task of **Keyword Extraction** can be used to find the most meaningful words inside a text, which can then be used as tags.
-* **Text Summarization** extracts the most meaningful pieces of text, giving the user a compressed version of the original text.
+* **Text Classification** is used when we need to classify a text fragment into one of several predefined classes. Examples include e-mail spam detection, news categorization, assigning a support request to a category, and more.
+* **Intent Classification** is one specific case of text classification, where we want to map an input utterance in the conversational AI system into one of the intents that represent the actual meaning of the phrase, or intent of the user.
+* **Sentiment Analysis** is the task of understanding the degree of positivity of a given piece of text. It can be approached as a classification task (for example, labeling text as positive, negative, or neutral) or as a regression task, where we label text from most negative (-1) to most positive (+1) and train a model that outputs a number representing the positivity of the input text.
+* **Named Entity Recognition** (NER) is the task of extracting entities from text, such as dates, addresses, people names, etc. Together with intent classification, NER is often used in dialog systems to extract parameters from the user's utterance.
+* A similar task of **Keyword Extraction** can be used to find the most meaningful words inside a text, which can then be used as tags.
+* **Text Summarization** extracts the most meaningful pieces of text, giving the user a compressed version of the original text.
 * **Question Answering** is the task of extracting an answer from a piece of text. This model takes a text fragment and a question as input, and finds the exact place within the text that contains the answer. For example, the text "*John is a 22 year old student who loves to use Microsoft Learn*", and the question *How old is John* should provide us with the answer *22*.
 
-In this module, we will mostly focus on the **Text Classification** task. However, we will learn all the important concepts that we need to handle more difficult tasks in the future.
-
-## Learning objectives
-- Understand how text is processed for NLP tasks
-- Learn about Recurrent Neural Networks (RNNs) and Generative Neural Networks (GNNs)
-- Learn about Attention Mechanisms
-- Learn how to build text classification models
-
-## Prerequisites
-- Knowledge of Python
-- Basic understanding of machine learning
+In this module, we'll mostly focus on the **Text Classification** task. However, we'll learn all the important concepts that we need to handle more difficult tasks in the future.
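The revised sentiment analysis bullet describes a regression framing, where a model outputs a number from most negative (-1) to most positive (+1). As a toy sketch of producing that output range (a hand-written lexicon, invented for illustration; a real model learns word weights from labeled data):

```python
import math

# Invented sentiment lexicon; real models learn these weights from data.
lexicon = {"love": 2.0, "great": 1.5, "hate": -2.0, "boring": -1.0}

def sentiment_score(text):
    """Sum per-word scores, then squash into the [-1, +1] range with tanh."""
    raw = sum(lexicon.get(token, 0.0) for token in text.lower().split())
    return math.tanh(raw)
```

The tanh squashing guarantees the output stays in [-1, +1] no matter how many sentiment-bearing words the text contains, mirroring the label range described above.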
