Merge pull request #54079 from GraemeMalcolm/main

prmerger-automator[bot] · web-flow · commit 165c172dabec · 2026-04-01T20:15:36.000Z
Updates to include language detection and PII
diff --git a/learn-pr/wwl-data-ai/get-started-ai-fundamentals/includes/5-natural-language-processing.md b/learn-pr/wwl-data-ai/get-started-ai-fundamentals/includes/5-natural-language-processing.md
@@ -14,12 +14,13 @@
 
 Natural language processing (NLP) is a broad term that covers AI models and techniques for making sense of language. NLP is the foundation on which generative AI large language models (LLMs) are built.
 
-While many natural language processing scenarios are handled by generative AI models today, there are common text analysis use cases where simpler NLP language models can be more cost-effective.
+While many natural language processing scenarios are handled by generative AI models today, there are common text analysis use cases where specialist NLP tools are used to produce predictable results or apply custom rules.
 
 ![Diagram of text being analyzed for sentiment, keywords, and summarization.](../media/text-analysis.png)
 
+- *Language detection* - determining which language (or languages) a document is written in. Language detection is often the first step in a multi-stage text processing workflow.
 - *Text classification* - assigning a document to a specific category, including *sentiment analysis* to determine whether a body of text is positive, negative, or neutral.
-- *Key-term extraction* and *entity detection* - identifying key words or phrases in a document, and finding mentions of entities like people, places, organizations.
+- *Key-term extraction* and *entity detection* - identifying key words or phrases in a document, and finding mentions of entities like people, places, and organizations. A specialized form of entity detection is detecting and redacting *personally identifiable information (PII)*, such as names, addresses, telephone numbers, and other private details.
 - *Summarization* - reducing the volume of text while still encapsulating the main points.
 
 ## Text analysis scenarios
@@ -29,5 +30,6 @@ Common uses of NLP technologies for text analysis include:
 - Analyzing documents or transcripts of calls and meetings to determine key subjects and identify specific mentions of people, places, organizations, products, or other entities.
 - Analyzing social media posts, product reviews, or articles to evaluate sentiment and opinion.
 - Implementing chatbots that can answer frequently asked questions or orchestrate predictable conversational dialogs that don't require the complexity of generative AI.
+- Redacting PII before sharing or analyzing data to comply with privacy policies and legislation.
 
 ::: zone-end
diff --git a/learn-pr/wwl-data-ai/introduction-language/includes/1-introduction.md b/learn-pr/wwl-data-ai/introduction-language/includes/1-introduction.md
@@ -10,8 +10,10 @@ Within artificial intelligence (AI), text analysis is a subset of natural langua
 
 Techniques to process and analyze text evolved over many years, from simple statistical calculations based on term-frequency to vector-based language models that encapsulate semantic meaning. Some common use cases for text analysis include:
 
+- **Language detection**: Determining the language (or languages) in which text is written - often as the first step in a multi-step text processing workflow.
 - **Key term extraction**: Identifying important words and phrases in text, to help determine the topics and themes it discusses.
 - **Entity detection**: Identifying named entities mentioned in text; for example, places, people, dates, and organizations.
+- **Personally identifiable information (PII) detection**: Identifying and redacting personal details in text, such as names, addresses, telephone numbers, financial account details, and other sensitive information.
 - **Text classification**: Categorizing text documents based on their contents. For example, filtering email as *spam* or *not spam*.
 - **Sentiment analysis**: A particular form of text classification that predicts the *sentiment* of text - for example, categorizing social media posts as *positive*, *neutral*, or *negative*.
 - **Text summarization**: Reducing the volume of text while retaining its salient points. For example, generating a short one-paragraph summary from a multi-page document.