Question 1

What gets logged in a log management program?

Accepted Answer

Effective log management covers:

- **Authentication events** — successful and failed login attempts, password changes, MFA challenges
- **Authorization events** — access grants, denials, privilege escalations
- **System events** — configuration changes, service starts and stops, errors
- **Network events** — firewall decisions, DNS queries, connection attempts
- **Application events** — user actions, API calls, data access patterns
- **Security events** — malware detections, vulnerability scan results, intrusion alerts
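
As a rough illustration of the categories above, the sketch below emits each event as one JSON line with a UTC timestamp, so every category shares the same basic shape. The `Category` enum, the `make_event` helper, and the field names (`action`, `outcome`, `user`, `source_ip`) are assumptions made for this example, not a standard schema.

```python
import json
from datetime import datetime, timezone
from enum import Enum


class Category(str, Enum):
    """The six event categories listed above (names are illustrative)."""
    AUTHENTICATION = "authentication"
    AUTHORIZATION = "authorization"
    SYSTEM = "system"
    NETWORK = "network"
    APPLICATION = "application"
    SECURITY = "security"


def make_event(category, action, outcome, when=None, **fields):
    """Serialize one event as a single JSON line with a UTC timestamp.

    `when` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    when = when or datetime.now(timezone.utc)
    record = dict(fields)  # extra context such as user or source_ip
    # Core fields are written last so extra fields cannot overwrite them.
    record.update({
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "category": Category(category).value,  # ValueError if unknown
        "action": action,
        "outcome": outcome,
    })
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(make_event("authentication", "login", "failure",
                     user="alice", source_ip="203.0.113.7"))
```

Rejecting unknown categories at write time is one way to keep downstream parsing predictable; a real deployment would more likely adopt an established schema than this ad hoc one.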

Question 2

What is log management architecture?

Accepted Answer

A mature log management program combines multiple components into a pipeline that moves raw event data from source to searchable, retained storage.

Log sources

Logs originate from every layer of the technology stack:

- **Servers and operating systems** — Linux auth logs, Windows Event Log, macOS Unified Log
- **Cloud platforms** — AWS CloudTrail, Azure Activity Log, GCP Admin Activity audit logs
- **SaaS applications** — Microsoft 365 Unified Audit Log, Google Workspace audit logs, Salesforce event monitoring
- **Endpoints** — EDR telemetry, local application logs, mobile device management events
- **Network devices** — firewalls, routers, switches, load balancers, VPN concentrators
- **Security tools** — IDS/IPS alerts, vulnerability scanners, DLP engines, email gateways

Collection methods

Getting logs from source to a central platform requires reliable collection mechanisms:

- **Agents** — lightweight forwarders installed on hosts (Fluentd, Filebeat, NXLog, Splunk Universal Forwarder) that ship logs in near real time
- **Syslog** — the long-established standard (RFC 5424, which superseded the older BSD format described in RFC 3164), still widely used by network devices; syslog-ng and rsyslog add filtering and reliable delivery
- **API polling** — scheduled calls to SaaS and cloud provider APIs to pull audit logs (e.g., Microsoft Graph API, AWS CloudTrail Lake queries)
- **Cloud-native streams** — managed pipelines like Amazon Data Firehose (formerly Kinesis Data Firehose), Azure Event Hubs, or GCP Pub/Sub that deliver logs without managing agents
- **Webhooks** — event-driven push from SaaS applications that support real-time notification (Slack audit API, GitHub audit log streaming)
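
To make the syslog option more concrete, here is a minimal sketch of how an RFC 5424 message line is assembled. The priority value is `facility * 8 + severity`, and `-` stands for an empty field. The function name `format_rfc5424` and its parameters are assumptions for illustration; the sketch only builds the string and does not send it anywhere.

```python
from datetime import datetime, timezone

# A few facility and severity codes from RFC 5424.
FACILITY_AUTH = 4      # security/authorization messages
FACILITY_LOCAL0 = 16
SEVERITY_WARNING = 4
SEVERITY_INFO = 6


def format_rfc5424(facility, severity, hostname, app_name, message,
                   when=None, procid="-", msgid="-"):
    """Build one RFC 5424 line with empty structured data ("-")."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    pri = facility * 8 + severity
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # RFC 3339 timestamp in UTC, trimmed to millisecond precision.
    timestamp = when.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return (f"<{pri}>1 {timestamp} {hostname} {app_name} "
            f"{procid} {msgid} - {message}")


if __name__ == "__main__":
    print(format_rfc5424(FACILITY_AUTH, SEVERITY_WARNING,
                         "fw01", "sshd", "Failed password for alice"))
```

In practice an agent or a daemon such as rsyslog or syslog-ng handles formatting and delivery; the sketch is only meant to show what travels over the wire.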

Centralization

Logs are only useful when they are searchable in one place:

- **Commercial SIEM** — Splunk Enterprise Security, Microsoft Sentinel, IBM QRadar provide correlation, detection rules, and case management
- **Cloud-native logging** — AWS CloudWatch Logs, Azure Monitor, Google Cloud Logging offer tight integration with their respective platforms
- **Open-source stacks** — the Elastic Stack (Elasticsearch, Logstash, Kibana), Grafana Loki, and OpenSearch provide cost-effective alternatives with community-driven detection content
- **Security data lakes** — Snowflake, Amazon Security Lake, and similar platforms store massive volumes at low cost using the Open Cybersecurity Schema Framework (OCSF) for normalization

Storage tiers

Log storage strategies balance search speed against cost and compliance retention:

- **Hot storage** — fully indexed, real-time searchable data for active investigations and alerting (typically 30–90 days)
- **Warm storage** — recent history available for on-demand search with slightly slower query times (typically 90 days to 12 months)
- **Cold storage** — compressed, archived logs in object storage (S3, Azure Blob, GCS) retained for compliance and forensic purposes (1–7 years depending on framework requirements)
- **Immutable storage** — write-once, read-many storage that prevents tampering, critical for audit trail integrity and legal hold requirements
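
The tiering described above can be sketched as a simple age-based lookup. The thresholds below (90 days hot, 12 months warm, 7 years cold) are taken from the upper ends of the typical ranges in the list and are assumptions, as are the `storage_tier` name and the `"expired"` label for logs past retention.

```python
from datetime import date

# Upper age bound, in days, for each tier (illustrative values).
HOT_DAYS = 90
WARM_DAYS = 365
COLD_DAYS = 7 * 365


def storage_tier(log_date, today):
    """Return the tier a log written on log_date belongs to today."""
    age_days = (today - log_date).days
    if age_days < 0:
        raise ValueError("log_date is in the future")
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    if age_days <= COLD_DAYS:
        return "cold"
    return "expired"  # past retention; eligible for deletion review


if __name__ == "__main__":
    print(storage_tier(date(2024, 1, 15), today=date(2024, 6, 1)))
```

Real platforms usually express this as index lifecycle or object lifecycle policies rather than application code, and immutability is a property applied on top of a tier rather than a separate age band.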

Question 3

What are the log retention requirements?

Accepted Answer

Different compliance frameworks set varying expectations for how long logs must be kept. The table below summarizes key requirements:

| Framework | Minimum retention | Key requirements |
| --- | --- | --- |
| PCI DSS | 12 months (most recent 3 months immediately available) | Req 10.5.1 in v4.0 (Req 10.7 in v3.2.1) — retain audit log history |
| SOC 2 | Based on risk assessment | CC7.2 — monitor system components |
| ISO 27001 | Based on risk assessment | A.8.15 — log retention policy required |
| HIPAA | 6 years for policies and related documentation; no explicit log retention period, though one is implied | § 164.312(b) audit controls for ePHI access |
| NIST CSF | Based on organizational needs | DE.CM — continuous monitoring |

Organizations subject to multiple frameworks should align retention to the most stringent requirement. For a company that handles payment card data alongside health information, for example, 12 months of hot/warm retention plus a 6-year cold archive generally provides adequate coverage.
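
The "most stringent requirement" rule can be sketched as taking the maximum over the applicable frameworks. The values below come from the table above; treating HIPAA as 72 months for logs and falling back to a 12-month default for risk-based frameworks are assumptions of this example, not requirements stated by those frameworks.

```python
# Minimum retention in months. None means the framework leaves the
# period to a risk assessment or to organizational needs.
MINIMUM_RETENTION_MONTHS = {
    "PCI DSS": 12,
    "SOC 2": None,
    "ISO 27001": None,
    "HIPAA": 72,  # assumption: apply the 6-year documentation period to logs
    "NIST CSF": None,
}


def required_retention_months(frameworks, risk_based_months=12):
    """Return the longest retention period across the given frameworks.

    risk_based_months is a placeholder for whatever period the
    organization's own risk assessment produces.
    """
    periods = []
    for name in frameworks:
        months = MINIMUM_RETENTION_MONTHS[name]  # KeyError if unknown
        periods.append(risk_based_months if months is None else months)
    if not periods:
        raise ValueError("at least one framework is required")
    return max(periods)


if __name__ == "__main__":
    print(required_retention_months(["PCI DSS", "HIPAA"]))
```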

Question 4

What should you alert on in log management?

Accepted Answer

Collecting logs without monitoring them defeats the purpose. Effective alerting focuses on high-fidelity signals across several categories:

Authentication anomalies

- Brute-force attempts — multiple failed logins against the same account within a short window
- Impossible travel — successful logins from geographically distant locations within an implausible time frame
- New device or location — first-time access from an unrecognized device, IP range, or country
- Credential stuffing patterns — failed logins across many accounts from a small set of source IPs
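
As one possible shape for the brute-force rule above, the sketch below counts failed logins per account inside a sliding time window. The event tuple layout `(timestamp, account, success)`, the default threshold of 5 failures, and the 5-minute window are assumptions chosen for illustration and would need tuning per environment.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def detect_brute_force(events, threshold=5, window=timedelta(minutes=5)):
    """Return (account, timestamp) pairs where failures hit the threshold.

    events must be sorted by timestamp; each item is
    (timestamp, account, success).
    """
    recent_failures = defaultdict(deque)
    alerts = []
    for timestamp, account, success in events:
        if success:
            continue
        failures = recent_failures[account]
        failures.append(timestamp)
        # Drop failures that have aged out of the window.
        while timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((account, timestamp))
            failures.clear()  # avoid re-alerting on every later failure
    return alerts


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    burst = [(start + timedelta(seconds=20 * i), "alice", False)
             for i in range(6)]
    print(detect_brute_force(burst))
```

Swapping the grouping key from account to source IP, and counting distinct accounts instead of raw failures, would give a rough starting point for the credential stuffing pattern.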

Privilege escalation

- Sudo or run-as usage outside of expected maintenance windows
- Admin role assignments or membership changes in identity providers (Microsoft Entra ID, formerly Azure AD; Okta; Google Workspace)
- Permission changes on sensitive resources — S3 bucket policies, database grants, file share ACLs
- Service account creation or key generation
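
The first rule in this list could be approximated as follows. Each window is a `(weekday, start, end)` tuple with Monday as 0, and events are dictionaries with a `timestamp` key; both shapes, along with the function names, are hypothetical and chosen only for this sketch.

```python
from datetime import datetime, time


def in_maintenance_window(timestamp, windows):
    """True if timestamp falls inside any (weekday, start, end) window."""
    return any(
        timestamp.weekday() == weekday and start <= timestamp.time() <= end
        for weekday, start, end in windows
    )


def flag_privileged_commands(events, windows):
    """Return the sudo/run-as events that happened outside all windows."""
    return [e for e in events
            if not in_maintenance_window(e["timestamp"], windows)]


if __name__ == "__main__":
    sunday_early = [(6, time(1, 0), time(5, 0))]  # Sundays 01:00 to 05:00
    events = [
        {"timestamp": datetime(2024, 1, 7, 2, 0), "user": "ops", "command": "sudo apt upgrade"},
        {"timestamp": datetime(2024, 1, 8, 14, 0), "user": "dev", "command": "sudo su -"},
    ]
    for event in flag_privileged_commands(events, sunday_early):
        print(event["user"], event["command"])
```

Windows that cross midnight would need to be split into two entries under this simple representation.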

Data exfiltration signals

- Unusual download volumes — user downloading significantly more data than their baseline
- Access outside business hours — especially to sensitive repositories, databases, or file shares
- Mass file access — sequential reads across large numbers of records in short succession
- Outbound data transfers to uncommon destinations — cloud storage services, personal email, file-sharing sites
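
For the "unusual download volumes" signal, one simple way to compare a user against their own baseline is a z-score over recent daily totals. This is a stand-in technique chosen for the example, not a method prescribed above; the threshold of 3 standard deviations and the 7-day minimum history are assumptions.

```python
from statistics import mean, stdev


def is_unusual_volume(history_mb, today_mb, z_threshold=3.0, min_days=7):
    """Flag today's download volume if it sits far above the baseline.

    history_mb is a list of the user's previous daily totals in MB.
    """
    if len(history_mb) < min_days:
        return False  # not enough history to form a baseline
    baseline = mean(history_mb)
    spread = stdev(history_mb)
    if spread == 0:
        # Perfectly flat history: any increase is treated as unusual.
        return today_mb > baseline
    return (today_mb - baseline) / spread > z_threshold


if __name__ == "__main__":
    history = [100, 120, 90, 110, 105, 95, 115]
    print(is_unusual_volume(history, today_mb=5000))
```

Download volumes are often heavily skewed, so a percentile or median-based baseline may behave better in practice than a mean and standard deviation.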

Configuration changes

- Firewall rule modifications — new allow rules, disabled security groups, removed deny entries
- Security group changes in cloud environments — opening ports, widening IP ranges
- IAM policy changes — new inline policies, permission boundary modifications, role trust policy updates
- DNS changes — new records, zone transfers, nameserver modifications
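
A check for the "opening ports, widening IP ranges" case might look like the sketch below, which uses Python's standard `ipaddress` module. The set of sensitive ports, the prefix-length cutoff, and the `is_risky_ingress` name are assumptions for illustration; how a rule change arrives (for example, from a cloud audit log) is left out.

```python
import ipaddress

# Ports treated as sensitive in this example: SSH, RDP, MySQL, PostgreSQL.
SENSITIVE_PORTS = frozenset({22, 3389, 3306, 5432})


def is_risky_ingress(cidr, port, sensitive_ports=SENSITIVE_PORTS,
                     min_prefix=16):
    """True if a sensitive port is opened to a very broad source range.

    A prefix shorter than min_prefix (for example /0 or /8) counts as
    broad. The cutoff is tuned for IPv4; IPv6 ranges would normally
    need their own threshold.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    return port in sensitive_ports and network.prefixlen < min_prefix


if __name__ == "__main__":
    print(is_risky_ingress("0.0.0.0/0", 22))
    print(is_risky_ingress("10.1.2.0/24", 22))
```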

Compliance-specific events

- Access to [cardholder data](/glossary/pci-dss) environments — any read, write, or copy operation
- PHI access in [HIPAA](/glossary/hipaa)-regulated systems — views, exports, or modifications of protected health information
- Encryption key operations — key creation, rotation, deletion, or export
- Audit log access or modification attempts — anyone trying to read, delete, or alter the logs themselves

Question 5

What are common log management mistakes?

Accepted Answer

Even organizations that invest in logging often fall into patterns that undermine the value of their program:

- **Logging too much** — capturing every debug-level event creates massive storage costs and drowns analysts in noise. Focus on security-relevant events and tune verbosity by source.
- **Logging too little** — the opposite problem is equally dangerous. Missing authentication events, not capturing cloud control plane activity, or skipping DNS logs leaves blind spots that attackers exploit.
- **Not protecting log integrity** — if an attacker can delete or modify logs, they can cover their tracks. Logs should be forwarded to a separate system with immutable storage, and access to log management platforms should be tightly controlled.
- **No correlation across sources** — reviewing logs from individual systems in isolation misses the bigger picture. A failed VPN login followed by a successful cloud console login from the same IP tells a story that neither log tells alone.
- **Alert fatigue from untuned rules** — deploying default SIEM detection rules without tuning them to the environment generates hundreds of false positives per day. Analysts stop investigating, and real incidents get buried.
- **Not testing log pipeline reliability** — log collection silently fails more often than most teams realize. Agents crash, API tokens expire, syslog forwarding breaks after a network change. Regularly validate that expected log sources are still delivering data.
- **Ignoring time synchronization** — logs from systems with drifting clocks are nearly impossible to correlate during incident response. Enforce NTP across all log sources and normalize timestamps to UTC.
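
The pipeline reliability check described above can be sketched as a comparison between the sources that are expected to report and the time each one was last seen. The source names, the one-hour silence limit, and the `find_silent_sources` helper are hypothetical; in practice `last_seen` would come from a query against the central platform.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, expected_sources, now,
                        max_silence=timedelta(hours=1)):
    """Return expected sources that never reported or have gone quiet.

    last_seen maps source name to the timestamp of its newest event.
    """
    silent = []
    for source in expected_sources:
        seen = last_seen.get(source)
        if seen is None or now - seen > max_silence:
            silent.append(source)
    return sorted(silent)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_seen = {
        "firewall": now - timedelta(minutes=5),
        "okta": now - timedelta(hours=3),
    }
    print(find_silent_sources(last_seen, ["firewall", "okta", "cloudtrail"], now))
```

Keeping all timestamps timezone-aware and in UTC, as the last item in the list recommends, also keeps the subtraction here from raising on mixed naive and aware values.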

Question 6

How do compliance frameworks address log management?

Accepted Answer

- **SOC 2** — CC7.1 through CC7.4 require monitoring, detection, and response capabilities that depend on logging
- **ISO 27001** — A.8.15 (logging) and A.8.16 (monitoring activities) address log collection and analysis
- **HIPAA** — the Security Rule requires audit controls to record and examine activity in systems containing ePHI
- **PCI DSS** — Requirement 10 mandates logging and monitoring all access to network resources and cardholder data
- **NIST CSF** — DE.CM (continuous monitoring) and DE.AE (anomalies and events in CSF 1.1, adverse event analysis in CSF 2.0) rely on log data

Question 7

What are best practices for log management?

Accepted Answer

- Centralize logs in a SIEM or log aggregation platform for correlation and analysis
- Set retention periods that meet both compliance requirements and operational needs (typically 90 days to one year of searchable retention, with longer archival where a framework requires it)
- Protect log integrity with immutable storage or tamper-evident mechanisms
- Establish alerting rules for high-risk events like failed authentication spikes or unauthorized access attempts
- Regularly review and tune logging to ensure coverage without excessive noise
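
One tamper-evident mechanism of the kind mentioned above is a hash chain, where each entry's digest covers the previous digest plus the current line. The sketch below uses SHA-256 from the standard library; the function names and the all-zero starting value are assumptions for illustration, and this is not a substitute for write-once storage.

```python
import hashlib

GENESIS = "0" * 64  # starting value for the first entry in the chain


def chain_logs(lines):
    """Return (line, digest) pairs where each digest covers the prior one."""
    previous = GENESIS
    chained = []
    for line in lines:
        digest = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        chained.append((line, digest))
        previous = digest
    return chained


def verify_chain(chained):
    """Return the index of the first bad entry, or None if intact."""
    previous = GENESIS
    for index, (line, digest) in enumerate(chained):
        expected = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        if expected != digest:
            return index
        previous = digest
    return None


if __name__ == "__main__":
    chained = chain_logs(["login alice ok", "sudo alice", "logout alice"])
    chained[1] = ("sudo mallory", chained[1][1])  # simulate tampering
    print(verify_chain(chained))
```

Editing or removing an entry in the middle breaks every later digest, but truncating the tail does not, so the most recent digest would need to be stored somewhere the log writer cannot alter.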

Question 8

How does episki help with log management?

Accepted Answer

episki documents log management policies, tracks retention schedules, and links logging controls to evidence for audit readiness. Learn more on our [compliance platform](/frameworks).
