title: Use ai.fix_grammar with PySpark
description: Learn how to correct the spelling, grammar, and punctuation of input text by using the ai.fix_grammar function with PySpark.
ms.reviewer: vimeland
ms.topic: how-to
ms.date: 11/13/2025
ms.search.form: AI functions

Use ai.fix_grammar with PySpark

The ai.fix_grammar function uses generative AI to correct the spelling, grammar, and punctuation of input text in a single line of code.

Overview

The ai.fix_grammar function is available for Spark DataFrames. You must specify the name of an existing input column as a parameter.

The function returns a new DataFrame that includes corrected text for each input text row, stored in an output column.

Syntax

df.ai.fix_grammar(input_col="input", output_col="corrections")

Parameters

| Name | Description |
|---|---|
| input_col (Required) | A string that contains the name of an existing column with input text values to correct for spelling, grammar, and punctuation. |
| output_col (Optional) | A string that contains the name of a new column to store corrected text for each row of input text. If you don't set this parameter, a default name is generated for the output column. |
| error_col (Optional) | A string that contains the name of a new column to store any OpenAI errors that result from processing each row of input text. If you don't set this parameter, a default name is generated for the error column. If there are no errors for a row of input, the value in this column is null. |

Returns

The function returns a Spark DataFrame that includes a new column that contains corrected text for each row of text in the input column. If the input text is null, the result is null.
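Because the error column is null for rows that processed without errors, one way to triage results is to split the returned DataFrame on that column. The following is a minimal sketch, not part of the AI functions library: `split_by_error` is a hypothetical helper, and `"fix_errors"` is an illustrative column name that you would pass as `error_col`. It relies only on the standard `DataFrame.filter` method with SQL expression strings.

```python
def split_by_error(results, error_col):
    """Split a DataFrame returned by ai.fix_grammar into (ok_rows, failed_rows).

    `results` is a Spark DataFrame that contains an error column, and
    `error_col` is the name that was passed as error_col= in the call.
    """
    # Backticks keep column names that contain spaces or dots valid in SQL.
    quoted = f"`{error_col}`"
    ok_rows = results.filter(f"{quoted} IS NULL")          # no error recorded
    failed_rows = results.filter(f"{quoted} IS NOT NULL")  # error text present
    return ok_rows, failed_rows


# In a notebook, assuming `df` has a "text" column:
# results = df.ai.fix_grammar(
#     input_col="text", output_col="corrections", error_col="fix_errors"
# )
# ok_rows, failed_rows = split_by_error(results, "fix_errors")
```

Naming the error column explicitly avoids having to look up the generated default name before filtering.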

Example

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
    ("There are an error here.",),
    ("She and me go weigh back. We used to hang out every weeks.",),
    ("The big picture are right, but you're details is all wrong.",),
], ["text"])

results = df.ai.fix_grammar(input_col="text", output_col="corrections")
display(results)

This example code cell provides the following output:

:::image type="content" source="../../media/ai-functions/fix-grammar-example-output.png" alt-text="Screenshot showing a data frame with a 'text' column and a 'corrections' column, which has the text from the text column with corrected grammar." lightbox="../../media/ai-functions/fix-grammar-example-output.png":::
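Since AI output should always be reviewed, it can help to narrow the review to rows where the corrected text differs from the input. The sketch below assumes the column names from the example (`text` and `corrections`). `changed_rows` is a hypothetical helper, not part of the AI functions library, and it uses Spark SQL's null-safe equality operator `<=>` through the standard `DataFrame.filter` method.

```python
def changed_rows(results, input_col, output_col):
    """Return only the rows whose corrected text differs from the input text.

    `<=>` is Spark SQL's null-safe equality: two nulls compare as equal, and
    a null compared with a non-null value compares as not equal. Rows with a
    null correction for non-null input are therefore kept for review.
    """
    return results.filter(f"NOT (`{input_col}` <=> `{output_col}`)")


# In a notebook, after running the example above:
# display(changed_rows(results, "text", "corrections"))
```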

Multimodal input

The ai.fix_grammar function supports file-based multimodal input. You can fix grammar in the content of PDFs and text files by setting input_col_type="path". For more information about supported file types and setup, see Use multimodal input with AI functions.

# This code uses AI. Always review output for mistakes.

# custom_df is assumed to be an existing DataFrame whose "file_path" column holds file paths.
results = custom_df.ai.fix_grammar(
    input_col="file_path",
    input_col_type="path",
    output_col="corrections",
)
display(results)
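The snippet above does not show how `custom_df` is built. As a rough sketch, the rows for such a DataFrame could be collected from a folder as shown here. Everything in this block is an assumption for illustration: `list_input_files` is a hypothetical helper, the `.pdf` and `.txt` suffixes stand in for "PDFs and text files", and the folder path is a placeholder. The path format that the function expects is covered in the multimodal input article, not here.

```python
from pathlib import Path

# Assumption: these suffixes stand in for "PDFs and text files". Check the
# multimodal input article for the file types the function actually accepts.
SUPPORTED_SUFFIXES = {".pdf", ".txt"}


def list_input_files(folder):
    """Return one-element tuples of file paths in `folder`, usable as rows."""
    return [
        (str(path),)
        for path in sorted(Path(folder).iterdir())
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    ]


# In a notebook (the folder below is a hypothetical placeholder):
# rows = list_input_files("/path/to/your/files")
# custom_df = spark.createDataFrame(rows, ["file_path"])
```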
