How to Configure Grafana 12 and Loki 3.0 for 1M Log Lines per Second Ingestion
High-volume logging is critical for modern distributed systems, and ingesting 1 million log lines per second (1M LPS) requires careful tuning of your observability stack. Grafana 12 and Loki 3.0 introduce performance improvements, native scaling support, and streamlined configuration that make this throughput achievable with the right setup. This guide walks through every step to deploy, configure, and validate a 1M LPS Loki ingestion pipeline with Grafana 12 for visualization and alerting.
Prerequisites
Before starting, ensure you have:
- Kubernetes cluster (v1.28+) with at least 10 nodes, each with 16 vCPU, 64GB RAM, and 1Gbps network bandwidth. For on-prem bare metal, equivalent hardware is required.
- Object storage: S3, GCS, Azure Blob Storage, or MinIO (for on-prem) to store Loki log chunks and indexes.
- Helm v3.12+ installed for deploying Loki and Grafana.
- Basic knowledge of Loki architecture and Grafana data source configuration.
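A quick sanity check of the cluster and tooling before deploying (output will differ by environment):
kubectl version                      # Kubernetes server version should be 1.28+
kubectl get nodes -o wide            # expect 10+ nodes sized at 16 vCPU / 64GB RAM
helm version --short                 # Helm should be v3.12 or newer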
Step 1: Deploy Distributed Loki 3.0
Loki 3.0’s distributed mode splits components into separate, independently scalable microservices, which is effectively required for 1M LPS throughput. Use the official Loki Helm chart (grafana/loki) to deploy:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki -f loki-values.yaml
Create a loki-values.yaml with initial distributed configuration:
loki:
  image:
    tag: 3.0.0  # Pin to Loki 3.0 stable
  structuredConfig:
    schema_config:
      configs:
        - from: 2024-01-01
          store: tsdb
          object_store: s3
          schema: v13
          index:
            prefix: loki_index_
            period: 24h
    storage_config:
      s3:
        endpoint: minio.example.com:9000
        region: us-east-1
        bucketnames: loki-chunks
        access_key_id: minioadmin
        secret_access_key: minioadmin
        insecure: true
      tsdb_shipper:
        active_index_directory: /loki/tsdb
    limits_config:
      ingestion_rate_mb: 1000          # Per-tenant ingestion rate in MB/s
      ingestion_burst_size_mb: 2000
      max_streams_per_user: 100000
      max_line_size: 1048576           # 1MB per log line
      ingestion_rate_strategy: local   # Enforce the limit per distributor instead of dividing it cluster-wide while tuning
    ingester:
      lifecycler:
        num_tokens: 512
        heartbeat_period: 15s
        join_after: 10s
      chunk_idle_period: 30m
      chunk_block_size: 262144
      chunk_target_size: 1536000
    ingester_client:
      remote_timeout: 10s              # Distributor-to-ingester write timeout
ingester:
  replicas: 10    # 10 ingester replicas for 1M LPS
distributor:
  replicas: 5     # Start with 5 distributor replicas
querier:
  replicas: 3
queryFrontend:
  replicas: 2
compactor:
  replicas: 1
indexGateway:
  replicas: 2
Scale components horizontally as needed: distributors handle incoming write requests, ingesters process and batch log chunks, and queriers handle read traffic without impacting ingestion.
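If the initial replica counts prove insufficient, one way to scale is to override the replica values at upgrade time; a rough sketch, assuming your chart version exposes the same keys used in loki-values.yaml above:
# Raise ingester and distributor counts and re-apply the release.
helm upgrade loki grafana/loki -f loki-values.yaml \
  --set ingester.replicas=12 \
  --set distributor.replicas=8
# Watch the rollout; pod labels can differ slightly between chart versions.
kubectl get pods -l app.kubernetes.io/name=loki -w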
Step 2: Tune Loki 3.0 for High Ingestion Throughput
Loki 3.0 includes a rewritten write path with improved batching and reduced GC overhead. Apply these additional tuning tweaks to loki-values.yaml:
- Increase ingestion_rate_mb per tenant to 1000 MB/s (equivalent to ~1M LPS assuming a 1KB average log line).
- Set chunk_idle_period to 30 minutes to reduce the overhead of creating many small chunks.
- Use the tsdb store with schema v13 for faster index writes, a Loki 3.0 feature that improves ingestion performance by roughly 40% over prior TSDB versions.
- Keep distributor.replicas in the Helm values high enough that incoming write requests stay evenly balanced across distributor pods.
Apply the updated values with helm upgrade loki grafana/loki -f loki-values.yaml, then restart the Loki pods so the new configuration takes effect (labels vary slightly between chart versions): kubectl rollout restart statefulset,deployment -l app.kubernetes.io/name=loki
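To confirm the new limits were actually picked up, Loki exposes its effective configuration on the /config endpoint; a quick check against the distributor (the service name here assumes the chart's default naming):
kubectl port-forward svc/loki-distributor 3100:3100 &
curl -s http://localhost:3100/config | grep -A2 ingestion_rate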
Step 3: Configure Log Shippers (Grafana Agent)
Use Grafana Agent 0.37+ (compatible with Loki 3.0) to ship logs. The Agent’s optimized write path reduces overhead compared to Promtail. Deploy the Agent via Helm:
helm install grafana-agent grafana/grafana-agent -f agent-values.yaml
Sample agent-values.yaml for high-throughput shipping:
agent:
  image:
    tag: 0.37.0
  config:
    server:
      log_level: info
    logs:
      positions_directory: /tmp/positions
      configs:
        - name: loki
          clients:
            - url: http://loki-distributor:3100/loki/api/v1/push
              batchwait: 5s       # Wait up to 5s to fill a batch before sending
              batchsize: 100000   # Maximum batch size in bytes (~100KB per push), not lines
              timeout: 10s
          positions:
            filename: /tmp/positions/positions.yaml
          scrape_configs:
            - job_name: system
              static_configs:
                - targets: [localhost]
                  labels:
                    job: system
                    __path__: /var/log/*.log
Adjust batchsize (the maximum bytes per push request) and batchwait to match your log volume: larger batches reduce request overhead but increase latency for near-real-time logs.
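Once the Agent is running, a quick way to confirm it is tailing files and not hitting push errors (the label selector assumes the chart's default labels; adjust if yours differ):
kubectl get pods -l app.kubernetes.io/name=grafana-agent
kubectl logs -l app.kubernetes.io/name=grafana-agent --tail=50 | grep -i error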
Step 4: Deploy Grafana 12 and Connect to Loki
Deploy Grafana 12 using the official Helm chart, then configure Loki as a data source:
helm install grafana grafana/grafana -f grafana-values.yaml
Sample grafana-values.yaml:
image:
  tag: 12.0.0  # Pin to Grafana 12 stable
adminUser: admin
adminPassword: securepassword
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        url: http://loki-query-frontend:3100
        access: proxy
        isDefault: true
Grafana 12 includes a new Loki query editor with syntax highlighting, auto-complete for log labels, and native support for Loki 3.0’s TSDB index features. You can also create alerts for ingestion rate drops or error spikes directly in Grafana 12’s unified alerting UI.
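Before building dashboards, it is worth confirming Loki answers queries end to end. A minimal check with curl against the query-frontend’s HTTP API, using the service name from the data source URL above and the job label from the Agent scrape config:
kubectl port-forward svc/loki-query-frontend 3100:3100 &
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={job="system"}' \
  --data-urlencode 'limit=10'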
Step 5: Performance Tuning and Scaling
To reach 1M LPS, scale Loki components horizontally and tune OS-level settings:
- Scale distributors: Add replicas until CPU utilization per distributor is ~70%. For 1M LPS, 5-8 distributor replicas are typically sufficient.
- Scale ingesters: Each ingester can handle ~100k LPS with 16 vCPU and 64GB RAM. For 1M LPS, 10-12 ingester replicas are recommended.
- OS tuning: Increase file descriptor limits to 65535, set net.core.somaxconn=4096, and net.ipv4.tcp_max_syn_backlog=4096 on all Loki nodes (applied in the example after this list).
- Monitor Loki metrics: Scrape Loki’s /metrics endpoint with Prometheus, and use the pre-built Loki dashboards in Grafana 12 to track loki_distributor_bytes_received_total, loki_ingester_ingested_lines_total, and loki_distributor_errors_total.
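A sketch of applying those OS-level settings on a Loki node (how you persist them varies by distro; the values mirror the list above):
# Raise kernel connection backlog limits.
sudo sysctl -w net.core.somaxconn=4096
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=4096
# Persist them across reboots.
echo "net.core.somaxconn=4096" | sudo tee -a /etc/sysctl.d/99-loki.conf
echo "net.ipv4.tcp_max_syn_backlog=4096" | sudo tee -a /etc/sysctl.d/99-loki.conf
# Raise the open file descriptor limit for the current session; containerized Loki
# pods usually inherit this from the container runtime defaults instead.
ulimit -n 65535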
Step 6: Validate 1M LPS Ingestion
Use a log generator to test throughput. A single shell loop cannot emit anywhere near 1M LPS by itself, so run enough generator replicas (or a purpose-built load tool) that the aggregate rate reaches the target, and make sure the Agent’s scrape_configs cover the node path where container logs land (for example /var/log/containers/*.log) so the generated lines are actually shipped. A minimal generator Pod that writes JSON log lines to stdout:
kubectl run log-generator --image=busybox --restart=Never -- \
  /bin/sh -c 'while true; do echo "{\"level\":\"info\",\"msg\":\"test log line\",\"ts\":\"$(date +%s)\"}"; done'
Check the loki_ingester_ingested_lines_total metric in Prometheus: if the per-second delta is ~1M, your setup is working.
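The same check expressed as a PromQL rate query, run here through the Prometheus HTTP API; the Prometheus service name is an assumption for your environment, and the metric name follows the text above:
kubectl port-forward svc/prometheus-server 9090:9090 &
curl -G -s "http://localhost:9090/api/v1/query" \
  --data-urlencode 'query=sum(rate(loki_ingester_ingested_lines_total[1m]))'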
Common issues to troubleshoot:
- rate limit exceeded errors: Increase limits_config.ingestion_rate_mb in Loki.
- High ingester CPU: Add more ingester replicas or increase their resource requests.
- Storage latency: Use a high-performance object storage, or enable Loki’s chunk caching for frequently accessed data.
Conclusion
Configuring Grafana 12 and Loki 3.0 for 1M log lines per second ingestion requires a distributed Loki deployment, careful component tuning, and horizontal scaling. Loki 3.0’s improved write path and TSDB v13 schema reduce the hardware required for high throughput, while Grafana 12’s enhanced Loki integration simplifies monitoring and alerting. Follow the steps above, validate with load testing, and scale components as your log volume grows.
For more details, refer to the Loki 3.0 Documentation and Grafana 12 Documentation.