|
| 1 | +--- |
| 2 | +title: What to ingest into the data lake |
| 3 | +description: How to choose which log sources to ingest into your Microsoft Sentinel data lake. |
| 4 | +ms.topic: conceptual |
| 5 | +ms.date: 01/29/2026 |
| 6 | +author: EdB-MSFT |
| 7 | +ms.author: edbaynash |
| 8 | +ms.service: microsoft-sentinel |
| 9 | +ms.subservice: sentinel-graph |
| 10 | +--- |
| 11 | + |
| 12 | + |
| 13 | + |
| 14 | +# What to ingest into the data lake |
| 15 | + |
| 16 | +After you onboard to the Microsoft Sentinel data lake, decide which logs to ingest into it. |
| 17 | + |
| 18 | +The analytics tier in Sentinel provides real-time analysis and alerting capabilities using log data ingested into Sentinel workspaces. The analytics tier supports the following use cases: |
| 19 | ++ **Real-time detection and correlation**: Immediate alerting on critical events from sources such as endpoints, identity, cloud security, and perimeter controls. |
| 20 | ++ **Rapid investigation**: Live searches for active incidents and threat responses. |
| 21 | ++ **High-fidelity, actionable logs**: Focus on sources with direct security value, such as EDR signals, privileged access, authentication, and threat alerts. |
| 22 | + |
| 23 | +The data lake tier in Sentinel provides large-scale, long-term storage and advanced analytics capabilities. The data lake supports the following use cases: |
| 24 | ++ **High-volume, lower-priority logs**: Sources that are valuable for deep forensics, analysis of past incidents to understand attack vectors and impacts, or periodic hunts, but are costly to keep in the analytics tier. |
| 25 | ++ **Analytics and threat hunting**: Cross-log searching, long-term trend analysis, and proactive exploration of historical data to identify hidden threats and patterns. |
| 26 | ++ **Batch analytics and summarization**: Use Spark, KQL, or similar tools to enrich, correlate, or summarize data before forwarding only the high-risk signals to the analytics tier for active monitoring. |
| 27 | ++ **Advanced analytics and machine learning**: Use big data techniques to uncover complex relationships and trends. |
| 28 | + |
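The batch summarization use case above can be illustrated with a minimal sketch in plain Python. It isn't a Sentinel API: the event fields (`src_ip`, `action`), the `summarize_high_risk` function, and the threshold of three denied connections are all hypothetical, chosen only to show raw events being collapsed into a few high-risk summary records. In practice, a job like this would typically run in Spark or KQL against data lake tables.

```python
from collections import Counter
from typing import Iterable


def summarize_high_risk(events: Iterable[dict], threshold: int = 3) -> list[dict]:
    """Return one summary record per source IP with at least `threshold` denied events."""
    # Count only denied connections, grouped by source IP (hypothetical field names).
    denied = Counter(e["src_ip"] for e in events if e.get("action") == "deny")
    return [
        {"src_ip": ip, "denied_count": count}
        for ip, count in sorted(denied.items())
        if count >= threshold
    ]


# Illustrative raw events as they might sit in the data lake (made-up values).
raw_events = [
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "10.0.0.7", "action": "allow"},
    {"src_ip": "10.0.0.9", "action": "deny"},
]

# Only the summaries, not the raw events, would be forwarded to the analytics tier.
high_risk = summarize_high_risk(raw_events)
print(high_risk)
```

Here five raw events reduce to a single summary record for `10.0.0.5`, which is the kind of volume reduction that makes forwarding to the analytics tier affordable.
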
| 29 | +Depending on your organization's security needs, you might choose to ingest different log sources into the data |
| 30 | +lake. Store high-volume logs that are less critical for real-time detection but valuable for deep analysis and forensics in the data lake, and retain only high-value logs in the analytics tier. |
| 31 | + |
| 32 | +The following table provides guidance on common log source types, their typical log volume, and their value for different security use cases. Use this information to help determine which log sources to ingest into your data lake based on your organization's specific needs and priorities. |
| 33 | + |
| 34 | +| Log source type | Typical log volume | Value for real-time threat detection/alerting | Value for threat hunting | Value for incident investigation/forensics | Ingest to data lake | |
| 35 | +|-------------------------------------------------|--------------------|-------------------------------------|----------------|-----------------------------------|-----------------------| |
| 36 | +| AAA (TACACS/RADIUS) | Medium | High | High | High | Yes | |
| 37 | +| Active Directory (on-premises) | High | High | High | High | No | |
| 38 | +| Application Logs | High | Medium | Medium | High | Yes | |
| 39 | +| AV Logs (Windows Events 5000s & 3rd party) | Medium | High | High | High | No | |
| 40 | +| Azure Activity | Medium | High | High | High | No | |
| 41 | +| Biometric Access System Logs | Low | Medium | Low | High | Yes | |
| 42 | +| Building Security System Logs | Low | Low | Low | Medium | Yes | |
| 43 | +| Call Center/VoIP Logs | Medium | Low | Low | Medium | Yes | |
| 44 | +| CASB | High | High | High | High | Yes | |
| 45 | +| Citrix/Horizon/ALBs | Medium | Medium | Medium | High | Yes | |
| 46 | +| Cloud IAM | Medium | High | High | High | No | |
| 47 | +| Cloud PaaS | High | High | High | High | Yes | |
| 48 | +| Cloud Security Controls | Medium | High | Medium | High | No | |
| 49 | +| Cloud Storage (S3, Blob, etc.) Logs | High | High | High | High | No | |
| 50 | +| CRM Audit Logs | Low-Medium | Low | Low | Medium | Yes | |
| 51 | +| Database Audit Tools | Medium | High | High | High | Yes | |
| 52 | +| DHCP Logs | Medium | Medium | Medium | High | Yes | |
| 53 | +| DLP Alerts | Low | High | High | High | Yes | |
| 54 | +| DNS Logs | High | High | High | High | Yes | |
| 55 | +| Endpoint Detection and Response (EDR) (Alerts) | Medium | High | High | High | No | |
| 56 | +| Endpoint Detection and Response (EDR) (Raw) | High | High | High | High | Yes | |
| 57 | +| Email Security (3rd party alerts) | Medium | High | Medium | High | No | |
| 58 | +| ERP Audit Logs | Low-Medium | Low | Low | Medium | Yes | |
| 59 | +| File Integrity | Low | Medium | Medium | High | Yes | |
| 60 | +| Firewall Threat/Malware/IPS/IDS | High | High | High | High | No | |
| 61 | +| Firewall Traffic Logs | High | High | High | High | Yes | |
| 62 | +| GitHub/GitLab/Code Repo Logs | Low-Medium | Medium | Medium | High | Yes | |
| 63 | +| Google Workspace Logs | Medium | Medium | Medium | High | Yes | |
| 64 | +| Identity (Entra ID, Okta, LDAP) | Medium | High | High | High | No | |
| 65 | +| IIS/Apache Logs | Medium | High | High | High | Yes | |
| 66 | +| IoT Device Logs | High | Medium | Medium | Medium | Yes | |
| 67 | +| Kubernetes/Container Logs (alerts, critical) | High | High | High | High | No | |
| 68 | +| Kubernetes/Container Logs (raw logs) | High | High | High | High | Yes | |
| 69 | +| LAN/WAN Router Switch | High | Medium | Medium | Medium | Yes | |
| 70 | +| Linux Server AuditD | Medium | High | High | High | No | |
| 71 | +| Mobile Device Management (Intune) | Medium | Medium | Medium | Medium | Yes | |
| 72 | +| Microsoft Office Logs (Teams, Office, SharePoint)| Medium | Medium | Medium | High | No | |
| 73 | +| Microsoft XDR Alerts (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | No | |
| 74 | +| Multifactor authentication (MFA) | Medium | High | Medium | High | No | |
| 75 | +| Netflow | High | Medium | High | Medium | Yes | |
| 76 | +| Network Detection (Corelight, Vectra, Darktrace)| High | High | High | High | No | |
| 77 | +| OT/ICS System Logs | Medium | High | High | High | Yes | |
| 78 | +| PAM (Privileged Access Management) | Low | High | High | High | No | |
| 79 | +| PIM (Privileged Identity Management) | Low | High | High | High | No | |
| 80 | +| POS System Logs | High | High | High | High | Yes | |
| 81 | +| Proxy Logging (URL filtering) | High | High | High | High | Yes | |
| 82 | +| Salesforce Audit Logs | Medium | Medium | Medium | High | Yes | |
| 83 | +| SD-WAN | Medium | Medium | Medium | Medium | Yes | |
| 84 | +| ServiceNow Audit Logs | Low | Low | Low | Medium | Yes | |
| 85 | +| SIEM/SOAR Platform Logs | Medium | High | High | High | No | |
| 86 | +| Slack/Teams Collaboration Logs | Medium | Low | Medium | Medium | Yes | |
| 87 | +| Sysmon (Endpoint, for EDR complement) | Medium | High | High | High | Yes | |
| 88 | +| Threat Intelligence Indicators | Low | High | High | High | No | |
| 89 | +| VDI Logs | Medium | Medium | Medium | High | Yes | |
| 90 | +| VPN | Medium | High | High | High | No | |
| 91 | +| Vulnerability Scanning | Low | Medium | Medium | Medium | Yes | |
| 92 | +| Web Application Firewall (WAF) Logs | Medium | High | High | High | Yes | |
| 93 | +| Windows Server Events | High | High | High | High | No | |
| 94 | +| XDR Source Logs (Defender: Office, Identity, Endpoint, CloudApp) | Medium | High | High | High | No | |
| 95 | +| Zoom Meeting Logs | Low-Medium | Low | Low | Medium | Yes | |
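
One way to use the table in planning is to encode its rows as data and filter them. The following sketch copies five rows from the table. The `GUIDANCE` list and the `data_lake_candidates` helper are hypothetical names used only for illustration, and reading "Yes" as "route this source to the data lake" is an assumption about how you apply the guidance.

```python
# A few rows copied from the table above:
# (log source type, typical log volume, ingest to data lake)
GUIDANCE = [
    ("DNS Logs", "High", "Yes"),
    ("Firewall Traffic Logs", "High", "Yes"),
    ("Netflow", "High", "Yes"),
    ("VPN", "Medium", "No"),
    ("Windows Server Events", "High", "No"),
]


def data_lake_candidates(rows, volume=None):
    """List sources marked "Yes" for the data lake, optionally filtered by typical volume."""
    return [
        name
        for name, vol, ingest in rows
        if ingest == "Yes" and (volume is None or vol == volume)
    ]


print(data_lake_candidates(GUIDANCE))
print(data_lake_candidates(GUIDANCE, volume="Medium"))
```

Keeping the guidance as data rather than hard-coded rules makes it easy to adjust individual rows to your organization's priorities.
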