
TDWI Articles

Data Silos and Breaches: Building a Long-term Security Operations Platform with Elasticsearch

A trio of open source products may help you secure your data environment.

The cost of the recent Marriott data breach will likely run into the hundreds of millions of dollars, and class action lawyers are trying to turn it into billions. Unfortunately, the headline is all too familiar. Organized crime or nation-state actors burrow their way into IT networks, lie dormant, and slowly drain precious data through a back-door communications channel from systems that were previously thought to be "protected."

Although many security vendors offer various solutions to address these threats, the truth is there is no simple answer. Every corporate IT environment is like a snowflake -- consisting of hundreds if not thousands of applications, legacy infrastructure and databases. There's no simple code-patch to deploy.

The only way to protect your environment is to collect, detect, correlate, track, analyze, and act on the digital exhaust left by the intruders. Our FBI and CIA do this by spending billions on centralized supercomputers that provide correlation across digital tracks such as credit card numbers, phone numbers, social media, license plates, etc. Most enterprises don't have billion-dollar IT security budgets.

An Open Source Solution

Fortunately, an open-source tool is available to help enterprises start storing the digital footprints of activity across their systems. With over 350 million downloads, Elasticsearch is now a viable option for enterprises to build a scalable, centralized datastore for all of the logs being generated across their IT estate. This is crucial for the "collect, detect, correlate, track, and analyze" part of your defense.

Elasticsearch provides a powerful search engine based on Apache Lucene with a distributed scale-out architecture that can be used from the smallest IT shop with a few dozen firewalls and servers to the largest banks on Wall Street with 10,000 servers. It's the datastore used in apps such as Lyft and Netflix and for tech giants including Verizon and eBay. It supports petabytes of data. It's free and open-source, with a community of thousands of developers and optional paid support available.

Elasticsearch isn't a panacea, though it provides the foundation for building your own security operations backbone. It's a free-form data-crunching machine that enables you to collect logs from everything from your Oracle server and Apache web server to your key-card reader and NetApp filer. By collecting and storing this data centrally, you can start to build real-time intelligence. As data streams in, it is typically available for querying in less than one second. Thus, searching for failed password attempts, foreign IP address ranges, or off-hours access by employees becomes immediate. You now have a centralized, searchable datastore for building a world-class security operations center.
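As a rough illustration of what "searching for failed password attempts" can look like, here is a minimal sketch that builds a search request body in Elasticsearch's JSON query DSL (a `bool` query with `term` and `range` filters). The field names `event.category`, `event.outcome`, and `@timestamp` are assumptions about how your logs were indexed, and the helper `build_failed_login_query` is hypothetical; adjust both to your own mappings.

```python
import json


def build_failed_login_query(window_minutes=15, max_hits=100):
    """Build a search body for failed logins in the last `window_minutes`.

    Field names are assumed, not taken from any particular deployment.
    """
    return {
        "size": max_hits,
        # Newest events first
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Filters match exactly and do not affect relevance scoring
                "filter": [
                    {"term": {"event.category": "authentication"}},
                    {"term": {"event.outcome": "failure"}},
                    {"range": {"@timestamp": {"gte": f"now-{window_minutes}m"}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to an index's _search endpoint
    print(json.dumps(build_failed_login_query(), indent=2))
```

The same pattern extends to the other searches mentioned above: a filter on a source-address field for foreign IP ranges, or a different `range` clause for off-hours access.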

New machine learning capabilities can also be added to Elasticsearch. These algorithms can constantly scan your data streams and send alerts when they detect unusual patterns, such as a firewall seeing traffic routed to an IP address it has not encountered before, or employees accessing files in folders outside their normal access patterns.
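To make the "new IP address" example concrete, here is a deliberately simple sketch of first-seen detection. It is not Elasticsearch's machine learning feature, only a plain rule that conveys the idea: remember which destinations each device has contacted and flag any destination that appears for the first time. The class name `FirstSeenDetector` and the sample device names and addresses are hypothetical.

```python
from collections import defaultdict


class FirstSeenDetector:
    """Flag destinations a device has never contacted before."""

    def __init__(self):
        # Maps each device name to the set of destination IPs seen so far
        self._seen = defaultdict(set)

    def observe(self, device, dest_ip):
        """Record the event; return True if dest_ip is new for this device."""
        is_new = dest_ip not in self._seen[device]
        self._seen[device].add(dest_ip)
        return is_new


if __name__ == "__main__":
    detector = FirstSeenDetector()

    # Warm up on historical traffic so known destinations do not alert
    for device, ip in [("fw-1", "10.0.0.5"), ("fw-1", "10.0.0.9")]:
        detector.observe(device, ip)

    # Score live traffic: only the unseen destination is reported
    for device, ip in [("fw-1", "10.0.0.5"), ("fw-1", "203.0.113.77")]:
        if detector.observe(device, ip):
            print(f"ALERT: {device} contacted new destination {ip}")
```

A real anomaly detector would also weigh frequency, time of day, and volume; this rule alone would be noisy on a busy network.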

Accelerating Log Adoption

To speed the ingestion of logs into Elasticsearch, there is Logstash. Logstash provides a pipeline for reading the logs from multiple types of servers, normalizing them, and feeding them into the Elasticsearch engine where they can be queried.
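The normalizing step is the heart of that pipeline. The sketch below is plain Python rather than a Logstash configuration, and it only illustrates the idea: two differently shaped log lines are mapped into one common record with a UTC timestamp. The output field names, the key-card line format, and both helper functions are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Apache common log format: host ident user [time] "request" status size
APACHE_RE = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def normalize_apache(line):
    """Map an Apache access-log line to the common record, or None."""
    m = APACHE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "@timestamp": ts.astimezone(timezone.utc).isoformat(),
        "source": "apache",
        "actor": m["ip"],
        "action": m["request"],
        "outcome": "success" if int(m["status"]) < 400 else "failure",
    }


def normalize_keycard(line):
    """Map a hypothetical 'time,door,badge,result' line to the common record."""
    ts_text, door, badge, result = line.strip().split(",")
    ts = datetime.strptime(ts_text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "@timestamp": ts.isoformat(),
        "source": "keycard",
        "actor": badge,
        "action": f"open {door}",
        "outcome": "success" if result == "GRANTED" else "failure",
    }


if __name__ == "__main__":
    print(normalize_apache(
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326'
    ))
    print(normalize_keycard("2019-01-15T02:14:00Z,door-7,badge-1234,DENIED"))
```

Once every source shares the same field names, a single query can span web servers, databases, and door readers alike.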

Visualization is a critical point of any data analytics platform. Fortunately, there's an app for that, too. Kibana is a dashboard visualization tool that runs as a separate application but connects to Elasticsearch. This enables a data analyst to build custom dashboards with graphs and tiles that look clean and polished. With Kibana, analysts don't have to write code to run queries, search logs, and create clean graphical dashboards for reporting.

Elasticsearch, Logstash, and Kibana are collectively referred to as the ELK stack. This open-source stack provides the data ingestion pipeline, the data store, and the data visualization tool for building an enterprise security operations search engine. Because it runs on commodity servers on-premises or in the cloud, the enterprise doesn't have to pay for expensive proprietary hardware or exorbitant software licenses. This leaves the enterprise with more savings to invest in the cybersecurity personnel it needs.

A Final Word

We can't predict when the next Marriott-size data breach will happen, but we know it is in progress somewhere in the world right now. With the right toolsets, such as Elasticsearch, in place, perhaps it could have been detected and prevented.

About the Author

Geoff Tudor is VP and general manager of cloud services at Panzura. Tudor has over 22 years of experience in storage, broadband, and networking. As chief cloud strategist at Hewlett Packard Enterprise, Geoff led CxO engagements for Fortune 100 private cloud opportunities while positioning HPE as a top private cloud infrastructure supplier globally. He was nominated for the Ernst and Young Entrepreneur of the Year award and recognized by Discover Magazine with their Innovator award. Geoff holds an MBA from The University of Texas at Austin, a BA from Tulane University, and is a patent-holder in satellite communications.