TL;DR #
While most teams believe “if it’s in the logs, we’ll catch it,” attackers have become incredibly skilled at ensuring malicious actions never make it into logs—or worse, at hiding in plain sight. This post explores how poor logging hygiene provides adversaries with a backdoor, and how tools like Elastic Security and Grafana Loki can help—if used correctly.
The False Comfort of Logs #
In most organizations’ observability stacks, logs serve as forensic oracles:
“If it happened, it’s in the logs.”
But reality is messier. Logs are only as good as:
- What you choose to log
- How you store and parse them
- What you alert on
- What attackers can manipulate
Attackers know this—and they exploit your assumptions.
Entry Point: The API Layer #
APIs are among the most exposed parts of modern architectures. They interact with services, users, mobile apps, and attackers. Here's how an attack can slip through because of poor log hygiene:
Undetected API Breach Timeline #
- 2025-08-01T12:00:00Z: Reconnaissance begins—attacker probes with crafted requests
- 2025-08-01T12:05:00Z: Authentication bypass attempted via logic flaw
- 2025-08-01T12:06:00Z: Attack succeeds—attacker enters as a normal user
- 2025-08-01T12:10:00Z: Privilege escalation via unlogged internal API
- 2025-08-01T12:12:00Z: SQL injection performed—payload is Base64-encoded
- 2025-08-01T12:13:00Z: Data exfiltration using large paginated responses
- 2025-08-01T12:15:00Z: API logs show “200 OK,” no alerts triggered
- 2025-08-02T08:00:00Z: SOC reviews logs—no anomalies detected
Where Logging Failed #
- Reconnaissance Not Correlated: The WAF flagged the strange requests, but WAF logs aren't correlated with app logs, so the probing was never connected to what followed.
- Auth Bypass Returned 200 OK: Endpoint returned success because the logic flaw wasn’t caught. Logs lack reason codes or trace IDs for context.
- Internal API Didn’t Log User Context: Privilege escalation used an undocumented API which logs only success/failure, not caller identity.
- Payload Obfuscation: SQLi was Base64-encoded; app logs show a placeholder query, with no decoded payload.
- No Anomaly Alerts: No thresholds on pagination abuse, no alerting on burst traffic, no session monitoring. Exfiltration looks like a customer binge.
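To make the last gap concrete, here is a minimal sketch of a sliding-window check for pagination abuse. It assumes you can extract `(timestamp, session_id)` pairs for successful paginated responses from your logs. The function name and both thresholds are hypothetical and would need tuning against your own traffic.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical thresholds; tune them against your own traffic baseline.
WINDOW = timedelta(minutes=5)
MAX_PAGES_PER_WINDOW = 50


def find_pagination_bursts(events, window=WINDOW, max_pages=MAX_PAGES_PER_WINDOW):
    """Return session IDs that fetched more than max_pages pages within any window.

    events: iterable of (timestamp, session_id) tuples, one per paginated 200 response.
    """
    recent = defaultdict(deque)  # session_id -> timestamps still inside the window
    flagged = set()
    for ts, session_id in sorted(events):
        timestamps = recent[session_id]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) > max_pages:
            flagged.add(session_id)
    return flagged
```

A session that pulls 60 pages in a minute gets flagged, while one that pulls 60 pages over several hours does not. A fixed threshold is the simplest option; a per-customer baseline would likely produce fewer false positives.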
Common Log Hygiene Issues #
Problem | Real-World Consequence |
---|---|
Log Injection (CRLF, special chars) | Attacker spoofs entries or breaks parsers |
Sensitive Data in Logs | PII/credentials exposed in plaintext |
Inconsistent Logging Across Services | Gaps in incident timelines |
No User Attribution | Hard to trace attacker movement |
Logs Not Centralized | Forensic blind spots |
Over-Reliance on 200/500 | Attackers get in via 200s too! |
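The log injection row is the easiest to demonstrate. The sketch below escapes control characters in untrusted input so a username containing CRLF cannot forge a second log line. `sanitize_for_log` and the 512-character cap are illustrative choices, not a standard.

```python
import re

# C0 controls (including CR and LF), DEL, C1 controls, and Unicode line separators.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f-\x9f\u2028\u2029]")


def sanitize_for_log(value: str, max_len: int = 512) -> str:
    """Escape control characters so untrusted input cannot start a new log line."""
    escaped = _CONTROL_CHARS.sub(lambda m: f"\\u{ord(m.group()):04x}", value)
    # Truncate so a single oversized field cannot flood the log.
    return escaped[:max_len]


if __name__ == "__main__":
    # An attacker-supplied username that tries to forge a "login ok" entry.
    forged = "alice\r\n2025-08-01T12:06:00Z INFO login ok user=admin"
    print(sanitize_for_log(forged))
```

Escaping rather than stripping keeps the evidence: an analyst can still see that someone tried to inject a line break.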
Tools That Can Help—If Used Right #
Elastic Security (ELK Stack)
- Filebeat + Logstash pipelines sanitize, parse, and enrich logs
- Use runtime fields to decode payloads on the fly
- Combine with Elastic Security detection rules (formerly Elastic SIEM) for threshold and anomaly detection
Grafana Loki
- Fast, scalable log aggregation using labels (app, method, status)
- Integrates with Prometheus for correlated metrics
- Great for live-tailing queries that watch for suspicious activity in real time
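🔐 Tip: Both tools centralize logs, which gets them off the hosts an attacker controls. Immutability is not automatic, though: it depends on how you lock down the storage backend and write access, so configure it explicitly to prevent tampering.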
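Whichever tool you pick, the idea behind "decode payloads" is the same: inspect what the parameter means, not just what it looks like. The following is a tool-agnostic Python sketch of that idea, not Elastic runtime field syntax. The function names are hypothetical, and the regex is a deliberately rough illustration rather than a real SQL injection detector.

```python
import base64
import binascii
import re

# Rough indicator patterns, for illustration only; not a complete SQLi detector.
_SQLI_HINTS = re.compile(
    r"\b(union\s+select|or\s+1\s*=\s*1|drop\s+table)\b", re.IGNORECASE
)


def try_decode_base64(value: str):
    """Return the decoded text if value is valid Base64 holding UTF-8, else None."""
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None


def inspect_param(value: str) -> dict:
    """Build log fields for a request parameter, including its decoded form."""
    decoded = try_decode_base64(value)
    candidate = decoded if decoded is not None else value
    return {
        "raw": value,
        "decoded": decoded,
        "sqli_suspected": bool(_SQLI_HINTS.search(candidate)),
    }
```

Logging both `raw` and `decoded` means the analyst reviewing the 12:12 request sees the actual query fragment instead of an opaque string.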
Logging Security Checklist #
Category | Best Practices |
---|---|
What to Log | Auth attempts, headers, session IDs, API inputs (sanitized), payload size |
Log Hygiene | Sanitize logs for CRLF, control characters, and injection |
Attribution | Always log user_id, session_id, IP, user-agent |
Normalization | Use structured logging (JSON preferred) |
Redaction | Mask PII, secrets, tokens |
Retention | Encrypt logs at rest, set retention per regulation (GDPR, HIPAA) |
Alerts | Set up alerts for abnormal behavior (excessive 200s, large payloads, unexpected) |
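Several rows of this checklist (attribution, normalization, redaction) can be handled in one place: the log formatter. Here is a minimal sketch using only Python's standard `logging` module. The `SENSITIVE_KEYS` set, the `fields` attribute name, and the `JsonFormatter` class are assumptions made for this example.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical set of sensitive field names; extend it for your own schema.
SENSITIVE_KEYS = {"password", "token", "authorization", "api_key", "ssn"}


def redact(fields: dict) -> dict:
    """Mask the values of sensitive keys, recursing into nested dicts."""
    out = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            out[key] = "[REDACTED]"
        elif isinstance(value, dict):
            out[key] = redact(value)
        else:
            out[key] = value
    return out


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Context passed via logger.info(..., extra={"fields": {...}}).
        entry.update(redact(getattr(record, "fields", {})))
        return json.dumps(entry)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("api")
    logger.addHandler(handler)
    logger.warning(
        "login attempt",
        extra={"fields": {"user_id": "u-123", "ip": "203.0.113.7", "token": "secret"}},
    )
```

Because `json.dumps` escapes newlines inside string values, this formatter also keeps each event on a single line, which helps with the CRLF injection problem.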
Best Practices to Keep Logs Honest #
- Sanitize Inputs: Escape line breaks, control characters, and any untrusted input before logging.
- Centralize Logging: Don’t log only to local disk. Use agents like Filebeat, Fluent Bit, or Promtail to forward logs.
- Use Structured Logging: Stick to JSON logs and avoid free-form strings. This reduces parsing errors and makes the logs queryable for analytics.
- Protect the Log Pipeline: Use IAM or token-based authentication for all log-forwarding agents.
- Implement Trace Correlation IDs: Every API request should have a trace ID logged across services.
- Detect Absence of Logs: Silence can be an attack signal. Alert if a critical service hasn’t logged anything within its expected interval.
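The last practice is the one most often skipped, so here is a small sketch of it. It assumes you can query the timestamp of the most recent log line per service from your log store. `find_silent_services` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def find_silent_services(last_seen, expected, max_silence, now=None):
    """Return the expected services that have not logged within max_silence.

    last_seen: dict mapping service name -> datetime of its most recent log line
    expected:  iterable of service names that should be logging
    """
    now = now or datetime.now(timezone.utc)
    silent = []
    for service in expected:
        seen = last_seen.get(service)
        # A service that has never logged is treated as silent too.
        if seen is None or now - seen > max_silence:
            silent.append(service)
    return sorted(silent)
```

Run it on a schedule and page someone when the list is non-empty. The right `max_silence` differs per service: an auth service that goes quiet for five minutes is suspicious, while a nightly batch job is not.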
💡 Pro Tip: Don’t Just Log. Trace. Modern systems use distributed tracing (e.g., OpenTelemetry). By tracing requests end-to-end, you catch behaviors that logs alone can’t reveal:
- Add trace IDs to every log
- Correlate user sessions across services
- Detect latency anomalies, unauthorized access, or lateral movement
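As a starting point before adopting a full tracing stack, the first bullet can be sketched with the standard library alone. This is not the OpenTelemetry API. `start_trace` and `TraceIdFilter` are hypothetical names, and how the incoming ID reaches your service (for example, a request header) is left as an assumption.

```python
import contextvars
import logging
import uuid

# Holds the trace ID for the request currently being handled.
_trace_id = contextvars.ContextVar("trace_id", default="-")


def start_trace(incoming_id=None) -> str:
    """Reuse the caller's trace ID if one was sent, otherwise create a new one."""
    trace_id = incoming_id or uuid.uuid4().hex
    _trace_id.set(trace_id)
    return trace_id


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = _trace_id.get()
        return True


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s trace=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger = logging.getLogger("orders")
    logger.addHandler(handler)
    start_trace()  # call once at the start of each request
    logger.warning("order lookup")
```

Using `contextvars` keeps the ID isolated per request in both threaded and asyncio servers, so concurrent requests do not overwrite each other's trace IDs.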
Final Thoughts #
Attackers love your logs—because they know what isn’t in them.
Good logging is not just about visibility; it’s about accountability and auditability. An API attack that doesn’t get logged is not just a security risk—it’s a compliance and trust nightmare.
Start treating logs as attack surfaces—because in many organizations, that’s exactly what they are.