Log Analysis System: Boost Performance with Apache Doris
A log analysis system is essential for modern cybersecurity operations. One provider recently upgraded its system and achieved significant improvements: 3X faster data ingestion, 7X faster query execution, and intuitive visual management. By optimizing both storage and analysis, they enhanced overall security monitoring and operational efficiency.
Understanding a Log Analysis System
The system collects logs from enterprise users, scans files for malware, and tracks events. An MD5 hash is computed for each file and sent to a cloud engine for threat assessment. The engine returns logs containing information such as file name, size, risk level, and event time. These logs enter an Apache Kafka topic, are normalized in a real-time data warehouse, and are backed up to an offline warehouse. High-risk data is further analyzed using an Extended Detection and Response (XDR) engine.
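The hashing step is easy to picture in code. The following is a minimal Python sketch, not the provider's implementation: the function names `md5_of_chunks` and `md5_of_file` and the 1 MiB chunk size are assumptions made for illustration. It reads a file in chunks so that large files never have to fit in memory.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks (chunk size is an arbitrary choice)."""
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        return md5_of_chunks(iter(lambda: fh.read(chunk_size), b""))
```

Only the resulting 32-character hex string would need to travel to the cloud engine, not the file itself, which keeps the request small regardless of file size.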

Challenges in Legacy Log Systems
Despite the robust design, the previous system faced two major issues:
Slow Data Writing
With tens of millions of endpoints generating over 100 billion logs daily, the StarRocks-based system struggled to keep up with writes. Scaling the cluster from 3 to 13 nodes offered limited improvement. Backlogs during peak hours affected stability and delayed real-time monitoring.
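To put that volume in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes exactly 100 billion logs per day (the article says "over"), spread evenly across the day and across nodes. Neither assumption holds in practice, so peak-hour load would be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(logs_per_day, nodes=1):
    """Average logs per second, per node, assuming a perfectly even spread."""
    return logs_per_day / SECONDS_PER_DAY / nodes


daily_logs = 100_000_000_000  # lower bound taken from the article
cluster_rate = average_rate(daily_logs)
per_node_rate = average_rate(daily_logs, nodes=13)

print(f"cluster average: {cluster_rate:,.0f} logs/s")
print(f"per node (13 nodes): {per_node_rate:,.0f} logs/s")
```

That works out to roughly 1.16 million logs per second for the whole cluster, or about 89,000 per node on 13 nodes, before accounting for peaks.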
Slow Query Execution
Keyword-based searches using SQL LIKE operators required full scans of massive datasets. Even filtered queries took seconds to minutes, and concurrent requests further slowed performance.
Architectural Upgrade with Apache Doris
To overcome these limitations, the provider evaluated Apache Doris 2.0. It offers an inverted index for fast text search and an NGram BloomFilter to accelerate LIKE operations. Although StarRocks originated as a fork of Apache Doris, Doris 2.0 adds optimizations aimed specifically at log analysis.
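The idea behind an inverted index can be sketched in a few lines of Python. This is a conceptual illustration only and says nothing about how Doris stores or builds its index: the tokenizer, the function names, and the sample log lines are all made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(logs):
    """Map each token to the set of row ids whose log line contains it."""
    index = defaultdict(set)
    for row_id, line in enumerate(logs):
        for token in tokenize(line):
            index[token].add(row_id)
    return index


def search(index, query):
    """Return row ids containing every token in the query (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical log lines, for illustration only.
logs = [
    "trojan detected in invoice.exe",
    "scan clean report.pdf",
    "high risk trojan in setup.exe",
]
index = build_inverted_index(logs)
matches = search(index, "trojan exe")  # rows 0 and 2
```

A keyword lookup becomes a dictionary access plus a set intersection, so the cost depends on the number of matching rows rather than on the size of the whole table. That is the property that lets a keyword search avoid the full scan a LIKE query needs.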
300% Faster Data Writing
Tests on a three-server cluster connected to Apache Kafka showed that Doris could handle the daily ingestion volume with only 30% CPU usage. Enabling the inverted index speeds up queries but adds write overhead: disabling it could increase write speed by another 50%.
Reduced Storage Costs
The system also reduced storage by 60%. Columnar storage and ZStandard compression kept the footprint small, with the index files nearly the same size as the data files, an improvement over the old StarRocks setup.
690% Faster Queries
Query tests across 79 common SQL statements showed a 7X average improvement. Specific optimizations included:
- Inverted Index: Speeds up keyword searches, in some cases by more than 88X.
- NGram BloomFilter: Accelerates LIKE operations by splitting text into fixed-length substrings (n-grams), so data that cannot match is filtered out quickly.
- Top-N Query Optimization: Dynamic ranking and predicate pushdown reduce scanned data, improving performance over large datasets.
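The n-gram filtering idea from the list above can be sketched in Python as follows. This is a toy model under stated assumptions, not Doris's implementation: the class name, the 1024-bit filter size, the three SHA-256-derived hash positions, and the trigram length are all arbitrary choices for illustration.

```python
import hashlib


def ngrams(text, n=3):
    """Return the set of all length-n substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class NGramBloomFilter:
    """Toy Bloom filter over the n-grams of the stored text."""

    def __init__(self, size_bits=1024, num_hashes=3, n=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.n = n
        self.bits = 0  # a Python int used as a bitmap

    def _positions(self, gram):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            h = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size_bits

    def add_text(self, text):
        for gram in ngrams(text, self.n):
            for pos in self._positions(gram):
                self.bits |= 1 << pos

    def might_contain(self, pattern):
        """False means the pattern is definitely absent; True means maybe.

        Patterns shorter than n have no n-grams and cannot be pruned,
        so they always return True.
        """
        return all(
            (self.bits >> pos) & 1
            for gram in ngrams(pattern, self.n)
            for pos in self._positions(gram)
        )


bloom = NGramBloomFilter()
bloom.add_text("trojan.win32.agent")  # hypothetical stored value
```

A query such as `LIKE '%win32%'` would first check `bloom.might_contain("win32")`. A `False` answer lets the engine skip that block of data entirely, while a `True` answer still requires checking the actual rows, because Bloom filters can produce false positives but never false negatives.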
Visualized Operations and Maintenance for the Log Analysis System
Apache Doris offers the Doris Manager tool for cluster monitoring, configuration changes, scaling, and upgrades. It also provides a WebUI for interactive log analysis, trend charts, keyword search, and filtering. Teams familiar with the ELK Stack will find it intuitive, which simplifies operational workflows.
Holistic Support from ZippyOPS
ZippyOPS provides consulting, implementation, and managed services for DevOps, DevSecOps, DataOps, Cloud, Automated Ops, AIOps, MLOps, Microservices, Infrastructure, and Security. They assist organizations in deploying optimized log analysis systems, integrating real-time monitoring, and training teams. Explore services, solutions, and products, or watch demos on YouTube.
For authoritative guidance on big data and log management, see the Apache Doris project documentation.
Conclusion
Upgrading to a modern log analysis system like Apache Doris dramatically improves ingestion speed, query performance, and maintainability. Organizations that combine advanced tools, structured processes, and expert guidance from ZippyOPS can handle more than 100 billion logs per day efficiently and securely.
For professional consulting and managed services, contact sales@zippyops.com.



