Application logging #
TLDR: Don't log PII, and don't log other sensitive data. If it feels _icky_, it probably is.
Rationale: Application-level logging is crucial for monitoring and troubleshooting, but it poses significant data privacy and security challenges. This document outlines guidelines for effective and compliant logging practices that strictly prohibit the logging of Personally Identifiable Information (PII).
In short #
- Event Selection: Log critical events like input/output validation failures, authentication, authorization, session management issues, and application errors. Do not log PII or unnecessary details.
- Structured Logging: Use formats like JSON for clarity and ease of analysis. Consistency in log format is key.
- Log Levels: Appropriately categorize logs as DEBUG, INFO, WARN, or ERROR to reflect their severity and impact.
- Contextual Information: Include necessary context like user IDs, transaction IDs, action performed and reasons for failures. Do not include sensitive data like encryption keys or personal information.
- Compliance and Security: Ensure logging practices comply with legal standards and company policies. Regularly review and audit logs for security and operational efficiency.
- Use a logging framework: Logging frameworks offer flexible configuration and performance optimizations, and they abstract away the underlying implementation.
Defining events to log #
It’s important to log events that provide a clear picture of your application’s behavior. This includes:
- Application errors
- Input and output validation failures
- Authentication successes and failures
- Authorization failures
- Session management failures
- Privilege elevation successes and failures
- Other high-risk events like data import/export
- Opt-ins like terms of service acceptance
- External service interactions
- Performance metrics
These events are crucial for troubleshooting, monitoring performance, understanding user behavior, and meeting security and auditing requirements.
Including pertinent details #
Logs should include contextual information that aids in understanding the events. This includes:
- The action performed
- The logged-in user performing the action, identified by user ID
- Transaction IDs if applicable
- The entity performing the action
- Reasons for failures
- Remediation information for warnings and errors
- HTTP request IDs, if applicable
Such details provide enough context to understand the events without compromising security or privacy.
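The details listed above can be sketched as a single structured log entry. This is a minimal illustration, not a prescribed schema: the helper `buildLogEntry` and the field names (`action`, `userId`, `transactionId`, `requestId`, `reason`, `remediation`) are assumptions made for this example.

```javascript
// Hypothetical helper: builds one structured log entry with contextual fields.
function buildLogEntry(level, action, context) {
  return {
    timestamp: new Date().toISOString(), // ISO 8601, always in UTC
    level,
    action,
    userId: context.userId, // an opaque ID, never a name or email
    transactionId: context.transactionId,
    requestId: context.requestId,
    reason: context.reason,
    remediation: context.remediation,
  };
}

const entry = buildLogEntry("ERROR", "PaymentAuthorization", {
  userId: "1234567890",
  transactionId: "ABC",
  requestId: "req-42",
  reason: "Card issuer timed out",
  remediation: "Retry with backoff and check the issuer status",
});

console.log(JSON.stringify(entry));
```

Fields that are not applicable can simply be left out of `context`; `JSON.stringify` omits properties whose value is `undefined`.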
Excluding sensitive information #
In line with the zero PII logging policy:
- Never log PII, encryption keys, or secrets.
- Ensure compliance with your company’s privacy policy and data residency requirements.
This is crucial for maintaining security and meeting compliance needs.
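One way to back this rule up in code is to strip known sensitive fields from an object before it reaches the logger. The sketch below assumes a deny-list of field names (`SENSITIVE_KEYS`) and a helper called `redact`; both are hypothetical and would need to match your own data model. It handles plain JSON-like data only, and it is a safety net, not a substitute for keeping PII out of log calls in the first place.

```javascript
// Assumed deny-list of field names (compared in lower case); extend as needed.
const SENSITIVE_KEYS = new Set([
  "password", "secret", "token", "apikey",
  "full_name", "national_id", "address", "email",
]);

// Returns a copy of `value` with sensitive fields removed,
// recursing into nested objects and arrays.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const clean = {};
    for (const [key, inner] of Object.entries(value)) {
      if (SENSITIVE_KEYS.has(key.toLowerCase())) continue; // drop the field entirely
      clean[key] = redact(inner);
    }
    return clean;
  }
  return value; // strings, numbers, booleans, null
}

const safe = redact({
  action: "Profile Update",
  userId: "1234567890",
  full_name: "Alice Johnson",
  credentials: { token: "abc", scheme: "bearer" },
});

console.log(JSON.stringify(safe));
// prints {"action":"Profile Update","userId":"1234567890","credentials":{"scheme":"bearer"}}
```

An allow-list of fields that may be logged is stricter than a deny-list and may be the better choice where the set of loggable fields is small and known.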
Using structured logging #
Structured logging in JSON format is recommended because it makes logs easier to parse, query, and process for analytics. The benefit grows as logs become more complex or request throughput increases.
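To illustrate why structured logs are easier to query, the sketch below parses a few JSON log lines and filters them on fields instead of matching free text. The sample lines and field names (`level`, `event`, `userId`) are made up for this example.

```javascript
// Sample log lines, one JSON object per line.
const lines = [
  '{"level":"INFO","event":"AuthenticationSuccess","userId":"1234567890"}',
  '{"level":"ERROR","event":"InputValidationFailure","userId":"1234567890"}',
  '{"level":"ERROR","event":"SessionManagementFailure","userId":"987"}',
];

// Parse every line, then filter on exact field values.
const errorsForUser = lines
  .map((line) => JSON.parse(line))
  .filter((e) => e.level === "ERROR" && e.userId === "1234567890");

console.log(JSON.stringify(errorsForUser.map((e) => e.event)));
// prints ["InputValidationFailure"]
```

In practice a log platform does this filtering for you, but it relies on the same property: every entry exposes the same named fields.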
Logging at the correct level #
Use the appropriate log levels:
- DEBUG for verbose information. This level is not recommended for production environments.
- INFO for informational messages
- WARN for potential problems without user experience impact
- ERROR for serious problems impacting user experience
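A logging framework normally handles level filtering through configuration. To show the idea, here is a minimal sketch of a logger that drops entries below a configured minimum level, for example to keep DEBUG output out of production. The factory `createLogger` and the numeric severities are assumptions for this example, not any framework's API.

```javascript
// Numeric severity per level, lowest to highest.
const LEVELS = { DEBUG: 10, INFO: 20, WARN: 30, ERROR: 40 };

// Hypothetical factory: returns a log function that drops entries below `minLevel`.
function createLogger(minLevel, write = console.log) {
  const threshold = LEVELS[minLevel];
  if (threshold === undefined) throw new Error(`Unknown log level: ${minLevel}`);

  return function log(level, message) {
    const severity = LEVELS[level];
    if (severity === undefined) throw new Error(`Unknown log level: ${level}`);
    if (severity < threshold) return false; // filtered out
    write(JSON.stringify({ timestamp: new Date().toISOString(), level, message }));
    return true; // written
  };
}

const log = createLogger("INFO");
log("DEBUG", "Cache lookup details"); // suppressed
log("WARN", "Retrying external call"); // written
```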
Best practices for logging #
- Log Level Management: Implement distinct log levels (DEBUG, INFO, WARN, ERROR) to categorize and prioritize log entries.
- Consistent Log Format: Use a uniform log format, like JSON, across applications to facilitate analysis and parsing.
- Timestamps and Time Zones: Ensure logs have accurate timestamps, preferably in UTC, to maintain consistency across different time zones.
- Log Rotation and Retention: Use log rotation to manage file sizes and define retention policies that balance historical data needs and storage constraints. In the case of cloud-based projects using Kubernetes, Sumologic will handle this automatically.
- Security and Access Control: Secure log files against unauthorized access and maintain robust access control mechanisms.
- Performance Optimization: Ensure logging processes do not adversely impact application performance, using asynchronous logging where necessary.
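As an illustration of the asynchronous logging mentioned above, the sketch below queues entries in memory and writes them in batches, so the calling code does not wait on every write. `createBufferedLogger` is a hypothetical helper written for this example; real logging frameworks provide their own asynchronous appenders or transports, which should be preferred. Timestamps come from `toISOString()`, which always returns UTC.

```javascript
// Hypothetical buffered logger: entries are queued and written in batches.
function createBufferedLogger(write, maxBatch = 100) {
  const queue = [];
  let scheduled = false;

  // Writes everything currently queued as one chunk.
  function flush() {
    scheduled = false;
    if (queue.length === 0) return;
    const batch = queue.splice(0, queue.length);
    write(batch.join("\n") + "\n");
  }

  function log(level, message) {
    queue.push(JSON.stringify({ timestamp: new Date().toISOString(), level, message }));
    if (queue.length >= maxBatch) {
      flush(); // batch is full, write now
    } else if (!scheduled) {
      scheduled = true;
      setImmediate(flush); // write after the current work has finished
    }
  }

  return { log, flush };
}

const logger = createBufferedLogger((chunk) => process.stdout.write(chunk));
logger.log("INFO", "Starting application");
logger.log("INFO", "Shutting down application");
logger.flush(); // flush explicitly before exit so nothing is lost
```

The trade-off is that entries still in the queue are lost if the process crashes before a flush, so error paths should flush explicitly.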
Query parameters and path parameters #
The full URL of an API call will often be logged. Therefore, avoid putting PII in path and query parameters. Instead of a GET with path and query parameters, use a POST call with the values in the body, or pass them in a request header.
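A minimal sketch of both ideas follows: moving an identifier from the URL into a POST body, and stripping the query string before an application logs a URL itself. The endpoint `https://api.example.com/customers/search` and the helpers `buildSearchRequest` and `urlForLogging` are hypothetical.

```javascript
// Prefer: the identifier travels in the POST body, not in the URL.
function buildSearchRequest(email) {
  return {
    url: "https://api.example.com/customers/search", // hypothetical endpoint
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ email }),
    },
  };
}

// Returns the URL without query string and fragment, for use in log entries.
function urlForLogging(rawUrl) {
  const parsed = new URL(rawUrl);
  parsed.search = "";
  parsed.hash = "";
  return parsed.toString();
}

const { url, options } = buildSearchRequest("alice@example.com");
// fetch(url, options) would send the request; the URL itself carries no PII.

console.log(urlForLogging("https://api.example.com/customers?email=alice%40example.com"));
// prints https://api.example.com/customers
```

Note that `urlForLogging` does not help with PII in path parameters, and it only covers logs your own code writes. Proxies and gateways in front of the service may still log the full URL, which is why keeping PII out of the URL is the more robust fix.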
Zero PII Logging Policy #
- Absolute Prohibition of PII: Under no circumstances should PII be logged. This includes indirect identifiers that can be linked to specific individuals.
- Data Anonymization: If user data must be logged for analysis, ensure it is anonymized and cannot be reverse-engineered to identify individuals.
- GDPR Compliance: Adhere to GDPR principles by maintaining data privacy and security, even without logging PII.
- Regular Audits for PII Detection: Conduct regular audits to ensure no PII is being logged inadvertently.
- Strict Access Controls: Implement stringent access controls and monitoring to safeguard log data.
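Part of a regular PII audit can be automated by scanning log output for patterns that look like personal data. The sketch below is an illustration only: the two patterns (an email address and an 11-digit number, the length of a Norwegian national identity number) and the helper `findPii` are assumptions, and pattern matching will both miss real PII and flag harmless values.

```javascript
// Assumed patterns, for illustration only; tune them to your own domain.
const PII_PATTERNS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  elevenDigitNumber: /\b\d{11}\b/,
};

// Returns one finding per (line, pattern) match; line numbers are 1-based.
function findPii(logLines) {
  const findings = [];
  logLines.forEach((line, index) => {
    for (const [kind, pattern] of Object.entries(PII_PATTERNS)) {
      if (pattern.test(line)) findings.push({ line: index + 1, kind });
    }
  });
  return findings;
}

const findings = findPii([
  '{"action":"Profile Update","userId":"1234567890"}',
  '{"action":"Profile Update","email":"alice@example.com"}',
  "Lookup for national id 12345678901 failed",
]);

console.log(JSON.stringify(findings));
// prints [{"line":2,"kind":"email"},{"line":3,"kind":"elevenDigitNumber"}]
```

A scan like this complements manual review of what each log statement emits; it does not replace it.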
Do’s and don’ts in application logging #
Do #
- Regularly review logs: Monitor logs for anomalies or errors indicative of security or operational issues.
- Use automated monitoring tools: Employ tools for real-time monitoring and alerting of specific log events.
Don't #
- Do not log sensitive data: Refrain from logging any sensitive or potentially identifying information.
- Don’t ignore error logs: Pay attention to error logs as they are critical for identifying potential issues early on.
- Never compromise on log security: Ensure robust security measures are in place to protect log integrity and confidentiality.
Code examples #
The logging format, such as JSON, should be set in the logging framework's configuration.
Java (SLF4J with MDC):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class ExampleLoggingWithMDC {
    private static final Logger logger = LoggerFactory.getLogger(ExampleLoggingWithMDC.class);

    public static void main(String[] args) {
        // Put some MDC information
        MDC.put("userId", "123");
        MDC.put("transactionId", "ABC");
        try {
            // Example of different log levels
            logger.info("Starting application");
            logger.warn("User is about to perform a critical operation");
            logger.error("Critical operation failed");
            logger.info("Shutting down application");
        } finally {
            // Clear MDC information when it's no longer needed
            MDC.clear();
        }
    }
}

C# (Newtonsoft.Json):

using System;
using Newtonsoft.Json;

public class LogEntry {
    public static void Main() {
        // Build the log entry as an anonymous object with a UTC timestamp
        var log = new {
            Level = "error",
            Event = "InputValidationFailure",
            Timestamp = DateTime.UtcNow,
            Description = "Invalid input format"
        };
        // Serialize to JSON and write to standard output
        string jsonLog = JsonConvert.SerializeObject(log);
        Console.WriteLine(jsonLog);
    }
}

JavaScript:

const log = {
  level: "error",
  event: "SessionManagementFailure",
  timestamp: new Date().toISOString(),
  description: "Session timeout"
};
console.log(JSON.stringify(log));
console.log(log); // also works, but prints the runtime's object representation, not strict JSON
NOTE: These examples were created with AI and should be quality controlled by users of the respective languages.
Examples #
Example 1: Customer Profile Update #
The customer’s full name or other personal details must not appear in the log.
Don’t do:
{
  "action": "Profile Update",
  "full_name": "Alice Johnson",
  "status": "Completed",
  "timestamp": "2023-11-10T15:30:00Z"
}
Do:
{
  "action": "Profile Update",
  "userId": "1234567890",
  "status": "Completed",
  "timestamp": "2023-11-10T15:30:00Z"
}
Example 2: Loan Application Processing #
The applicant’s national ID or other sensitive identity information must not appear in the log.
Don’t do:
{
  "application_id": "APP789456",
  "national_id": "1234567890",
  "status": "Under Review",
  "timestamp": "2023-11-10T16:00:00Z"
}
Do:
{
  "application_id": "APP789456",
  "userId": "1234567890",
  "status": "Under Review",
  "timestamp": "2023-11-10T16:00:00Z"
}
Example 3: Online Account Registration #
The new user’s physical address must not appear in the log.
Don’t do:
{
  "status": "Success",
  "address": "Portveien 2, 2010, BarneTV",
  "timestamp": "2023-11-10T17:10:00Z"
}
Do:
{
  "status": "Success",
  "userId": "1234567890",
  "timestamp": "2023-11-10T17:10:00Z"
}
Example 4: Secrets and other sensitive data #
Sensitive data such as secrets must not appear in the log.
Don’t do:
[2023-11-10T17:10:00Z] Calling 3rd party API with secret: LS0tLS1CR...LQo=
If you want to verify that a service has the correct credentials, set up another way of validating this.
If you have to log the value, hash it with SHA-256 or stronger before logging it.
[2023-11-10T17:10:00Z] Calling 3rd party API with hashed secret: 0c2693d000b151680f501757ce669fda815b899032c5c5986949621a1440fc06
That way you can apply the same hashing algorithm to the value you want to validate against and filter the logs for the resulting hash.
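A minimal sketch of the hashing step, using the `createHash` function from Node.js's built-in `crypto` module. The helper name `sha256Hex` and the placeholder secret are assumptions for this example.

```javascript
const { createHash } = require("node:crypto");

// Returns the SHA-256 digest of `value` as a lowercase hex string.
function sha256Hex(value) {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

// Placeholder value for illustration; a real secret would come from configuration.
const secret = "example-secret-value";

console.log(
  `[${new Date().toISOString()}] Calling 3rd party API with hashed secret: ${sha256Hex(secret)}`
);
```

To check which credential a service used, hash the expected value with the same function and search the logs for that digest.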