Skip to content

Latest commit

 

History

History
211 lines (141 loc) · 14.4 KB

File metadata and controls

211 lines (141 loc) · 14.4 KB
title Analyze Fraudulent Call Data with Azure Stream Analytics
description Learn how to analyze fraudulent call data using Azure Stream Analytics and visualize results in a Power BI dashboard with this step-by-step tutorial.
author ajetasin
ms.author ajetasi
ms.reviewer spelluru
ms.service azure-stream-analytics
ms.topic tutorial
ms.date 02/19/2026

Tutorial: Analyze fraudulent call data with Stream Analytics and visualize results in Power BI dashboard

This tutorial shows you how to analyze phone call data using Azure Stream Analytics. The phone call data, generated by a client application, contains fraudulent calls, which the Stream Analytics job detects. You can use techniques from this tutorial for other types of fraud detection, such as credit card fraud or identity theft.

In this tutorial, you perform the following tasks:

[!div class="checklist"]

  • Generate sample phone call data and send it to Azure Event Hubs.
  • Create a Stream Analytics job.
  • Configure job input and output.
  • Define queries to filter fraudulent calls.
  • Test and start the job.
  • Visualize results in Power BI.

Prerequisites

Before you start, make sure you complete the following steps:

  • If you don't have an Azure subscription, create a free account.
  • Download the phone call event generator app, TelcoGenerator.zip from the Microsoft Download Center or get the source code from GitHub.
  • You need a Power BI account.

[!INCLUDE event-generator-app]

Configure job output

The last step is to define an output sink where the job can write the transformed data. In this tutorial, you output and visualize data by using Power BI.

  1. From the Azure portal, open All resources, and select the ASATutorial Stream Analytics job.

  2. In the Job Topology section of the Stream Analytics job, select the Outputs option.

  3. Select + Add output > Power BI.

    :::image type="content" source="./media/stream-analytics-real-time-fraud-detection/select-output-type.png" alt-text="Screenshot that shows the Outputs page with Add output -> Power BI menu selected." lightbox="./media/stream-analytics-real-time-fraud-detection/select-output-type.png":::

  4. Fill the output form with the following details:

    Setting Suggested value
    Output alias MyPBIoutput
    Group workspace My workspace
    Dataset name ASAdataset
    Table name ASATable
    Authentication mode User token
  5. Select Authorize and follow the prompts to authenticate Power BI.

    :::image type="content" source="media/stream-analytics-real-time-fraud-detection/configure-stream-analytics-output.png" alt-text="Screenshot of configuring Stream Analytics output." lightbox="media/stream-analytics-real-time-fraud-detection/configure-stream-analytics-output.png":::

  6. Select Save at the bottom of the Power BI page.

    This tutorial uses the User token authentication mode. To use Managed Identity, see Use Managed Identity to authenticate your Azure Stream Analytics job to Power BI.

Create queries to transform real-time data

At this point, you set up a Stream Analytics job to read an incoming data stream. The next step is to create a query that analyzes the data in real time. The queries use a SQL-like language that has some extensions specific to Stream Analytics.

In this section of the tutorial, you create and test several queries to learn a few ways you can transform an input stream for analysis.

The queries you create here just display the transformed data to the screen. In a later section, you write the transformed data to Power BI.

For more information about the language, see the Azure Stream Analytics Query Language Reference.

Test by using a pass-through query

If you want to archive every event, use a pass-through query to read all the fields in the payload of the event.

  1. Go to your Stream Analytics job in the Azure portal and select Query under Job topology in the left menu.

  2. In the query window, enter this query:

    SELECT 
        *
    FROM 
        CallStream

    [!NOTE] As with SQL, keywords aren't case-sensitive, and whitespace isn't significant.

    In this query, CallStream is the alias that you specified when you created the input. If you used a different alias, use that name instead.

  3. Select Test query.

    The Stream Analytics job runs the query against the sample data from the input and displays the output at the bottom of the window. The results indicate that the Event Hubs and the Streaming Analytics job are configured correctly.

    :::image type="content" source="media/stream-analytics-real-time-fraud-detection/sample-output-passthrough.png" alt-text="Screenshot of sample output from test query." lightbox="media/stream-analytics-real-time-fraud-detection/sample-output-passthrough.png":::

    The exact number of records you see depends on how many records were captured in the sample.

Reduce the number of fields by using a column projection

In many cases, your analysis doesn't need all the columns from the input stream. Use a query to project a smaller set of returned fields than in the pass-through query.

Run the following query and notice the output.

SELECT CallRecTime, SwitchNum, CallingIMSI, CallingNum, CalledNum 
INTO
    [MyPBIoutput]
FROM 
    CallStream

Count incoming calls by region: Tumbling window with aggregation

Suppose you want to count the number of incoming calls per region. In streaming data, when you want to perform aggregate functions like counting, you need to segment the stream into temporal units, since the data stream itself is effectively endless. You do this segmentation by using a Streaming Analytics window function. You can then work with the data inside that window as a unit.

For this transformation, you want a sequence of temporal windows that don't overlap - each window has a discrete set of data that you can group and aggregate. This type of window is referred to as a Tumbling window. Within the Tumbling window, you can get a count of the incoming calls grouped by SwitchNum, which represents the region where the call originated.

  1. Paste the following query in the query editor:

    SELECT 
        System.Timestamp as WindowEnd, SwitchNum, COUNT(*) as CallCount 
    FROM
        CallStream TIMESTAMP BY CallRecTime 
    GROUP BY TUMBLINGWINDOW(s, 5), SwitchNum

    This query uses the Timestamp By keyword in the FROM clause to specify which timestamp field in the input stream to use to define the Tumbling window. In this case, the window divides the data into segments by the CallRecTime field in each record. If you don't specify a field, the windowing operation uses the time that each event arrives at the event hub. For more information, see "Arrival Time vs Application Time" in Stream Analytics Query Language Reference.

    The projection includes System.Timestamp, which returns a timestamp for the end of each window.

    To specify that you want to use a Tumbling window, use the TUMBLINGWINDOW function in the GROUP BY clause. In the function, you specify a time unit (anywhere from a microsecond to a day) and a window size (how many units). In this example, the Tumbling window consists of 5-second intervals, so you get a count by region for every 5 seconds' worth of calls.

  2. Select Test query. In the results, notice that the timestamps under WindowEnd are in 5-second increments.

Detect SIM fraud using a self-join

In this example, consider fraudulent usage to be calls that originate from the same user but in different locations within five seconds of one another. For example, the same user can't legitimately make a call from the US and Australia at the same time.

To check for these cases, use a self-join of the streaming data to join the stream to itself based on the CallRecTime value. You can then look for call records where the CallingIMSI value (the originating number) is the same, but the SwitchNum value (region of origin) isn't the same.

When you use a join with streaming data, you must provide some limits on how far the matching rows can be separated in time. As noted earlier, the streaming data is effectively endless. You specify the time bounds for the relationship inside the ON clause of the join, using the DATEDIFF function. In this case, the join is based on a five-second interval of call data.

  1. Paste the following query in the query editor:

    SELECT System.Timestamp AS WindowEnd, COUNT(*) AS FraudulentCalls
        INTO "MyPBIoutput"
        FROM "CallStream" CS1 TIMESTAMP BY CallRecTime
        JOIN "CallStream" CS2 TIMESTAMP BY CallRecTime
        ON CS1.CallingIMSI = CS2.CallingIMSI
        AND DATEDIFF(ss, CS1, CS2) BETWEEN 1 AND 5
        WHERE CS1.SwitchNum != CS2.SwitchNum
        GROUP BY TumblingWindow(Duration(second, 1))

    This query is like any SQL join except for the DATEDIFF function in the join. This version of DATEDIFF is specific to Streaming Analytics, and it must appear in the ON...BETWEEN clause. The parameters are a time unit (seconds in this example) and the aliases of the two sources for the join. This function is different from the standard SQL DATEDIFF function.

    The WHERE clause includes the condition that flags the fraudulent call: the originating switches aren't the same.

  2. Select Test query. Review the output, and then select Save query.

Start the job and visualize output

  1. To start the job, go to the job Overview and select Start.

  2. Select Now for job output start time and select Start. You can view the job status in the notification bar.

  3. After the job succeeds, go to Power BI and sign in with your work or school account. If the Stream Analytics job query outputs results, the ASAdataset dataset you created appears under the Datasets tab.

  4. From your Power BI workspace, select + Create to create a new dashboard named Fraudulent Calls.

  5. At the top of the window, select Edit and Add tile.

  6. In the Add tile window, select Custom Streaming Data and Next.

  7. Choose the ASAdataset under Your Datasets, and select Next.

  8. Select Card from the Visualization type dropdown, add fraudulent calls to Fields, and then select Next.

    :::image type="content" source="./media/stream-analytics-real-time-fraud-detection/chart-settings.png" alt-text="Screenshot that shows the chart settings for a Power BI dashboard." lightbox="./media/stream-analytics-real-time-fraud-detection/chart-settings.png":::

  9. Enter a name for the tile (for example, Fraudulent calls), and then select Apply to create the tile.

    :::image type="content" source="./media/stream-analytics-real-time-fraud-detection/tile-details.png" alt-text="Screenshot that shows the Tile details page." lightbox="./media/stream-analytics-real-time-fraud-detection/tile-details.png":::

  10. Follow step 5 again with the following options:

    • When you get to Visualization Type, select Line chart.
    • Add an axis and select windowend.
    • Add a value and select fraudulent calls.
    • For Time window to display, select the last 10 minutes.

    :::image type="content" source="./media/stream-analytics-real-time-fraud-detection/line-chart-settings.png" alt-text="Screenshot that shows settings for a line chart on the dashboard." lightbox="./media/stream-analytics-real-time-fraud-detection/line-chart-settings.png":::

  11. Your dashboard looks like the following example once both tiles are added. If your event hub sender application and Streaming Analytics application are running, your Power BI dashboard periodically updates as new data arrives.

    :::image type="content" source="media/stream-analytics-real-time-fraud-detection/power-bi-results-dashboard.png" alt-text="Screenshot that shows the Power BI dashboard." lightbox="media/stream-analytics-real-time-fraud-detection/power-bi-results-dashboard.png":::

Embedding your Power BI Dashboard in a web application

For this part of the tutorial, use a sample ASP.NET web application created by the Power BI team to embed your dashboard. For more information about embedding dashboards, see the embedding with Power BI article.

To set up the application, go to the Power BI-Developer-Samples GitHub repository and follow the instructions under the User Owns Data section (use the redirect and homepage URLs under the integrate-web-app subsection). Since you're using the Dashboard example, use the integrate-web-app sample code located in the GitHub repository. After you run the application in your browser, follow these steps to embed the dashboard you created earlier into the web page:

  1. Select Sign in to Power BI, which grants the application access to the dashboards in your Power BI account.

  2. Select the Get Dashboards button, which displays your account's Dashboards in a table. Find the name of the dashboard you created earlier, powerbi-embedded-dashboard, and copy the corresponding EmbedUrl.

  3. Finally, paste the EmbedUrl into the corresponding text field and select Embed Dashboard. You can now view the same dashboard embedded within a web application.

Next steps

In this tutorial, you created a simple Stream Analytics job, analyzed the incoming data, and presented results in a Power BI dashboard. To learn more about Stream Analytics jobs, continue to the next tutorial:

[!div class="nextstepaction"] Run Azure Functions within Stream Analytics jobs