Data Optimization enables “Big Data” Analytics

By Wayne Salpietro, Director of Marketing, Permabit Technology Corporation.


It’s hard to avoid the buzzword “Big Data.” Within the last year or two, it’s been on the lips of nearly every technology vendor, analyst, journalist and enterprise CIO.

Data mining and analytics provide understanding and insight into business behaviors through the analysis of data stores. Usually, the bigger the data sample, the more accurate the predictions will be. The most famously absurd example, revealed in the New York Times Bits Blog “Bizarre Insights from Big Data,” is that vegetarians are unlikely to miss their flights. So I am probably destined to miss a flight on occasion! (More relevant information might be whether your particular flight tends to land early, on time, or late, regardless of the meal.)

More importantly, businesses that employ Big Data analytics see data as a business asset that can predict and forecast behaviors. Applied to customer sales records, analytics reveals who bought what, when, where, why, and for how much; it can predict repeat buying patterns, identify buyer types or groups that are more likely to buy, and even suggest complementary product purchases. It can predict the ideal weekend for a studio to release its next blockbuster film or the right time to open stores on Black Friday.

Needless to say, Big Data means bigger dollars. How well companies harness this phenomenon depends largely on their ability to house enough data to reach a critical mass of information that delivers actionable insight and accurate predictions. But maintaining large data stores costs money, and IT budgets are still squeezed tightly.

Enter data optimization technology, including data deduplication and compression. Data optimization can be the game-changer that makes data stores affordable enough to turn information into a competitive advantage.

Data optimization across the entire storage environment is critical for easing storage capacity consumption and reducing management and operations costs. It delivers IT efficiency and budget relief while improving performance, resource utilization and scalability. These benefits weren’t available or realistic with first-generation data deduplication technology, which was intended for backup and archive. Today, thanks to advances in deduplication technology, they are available and increasingly a required feature of any data storage offering.

Deduplication reduces the redundancy that is known to exist in typical corporate data stores (as much as 75 percent by some estimates, depending on the type of data). It works across multiple files in a data store: if two files or parts of files (data chunks) are exactly the same, the duplicate is replaced with a simple pointer back to the first instance of the data chunk. Working at the sub-file level, deduplication breaks data down into chunks as small as 4KB, which increases data and operational efficiency, as sketched below.
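To make the mechanics concrete, here is a minimal sketch in Python of sub-file deduplication, assuming fixed-size 4KB chunks, SHA-256 fingerprints as the pointers, and a simple in-memory chunk store; the names and structure are illustrative assumptions, not a description of any particular product.

```python
import hashlib

CHUNK_SIZE = 4096  # 4KB chunks, the granularity mentioned above

def deduplicate(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks and keep each unique chunk once.

    Returns a list of chunk fingerprints ("pointers") from which the
    original data can be rebuilt out of the shared store.
    """
    pointers = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in store:      # first time this chunk is seen
            store[fingerprint] = chunk    # keep exactly one physical copy
        pointers.append(fingerprint)      # duplicates become references
    return pointers

def reconstruct(pointers: list, store: dict) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[p] for p in pointers)
```

In this model, writing the same file a second time adds nothing to the store; only a new list of pointers is recorded, which is where the capacity savings come from.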

Current-generation deduplication technology is fast, highly resource-efficient and scalable enough to be used at the application and/or operating system level, for data stored on local servers, across the IT storage stack, and even into the cloud.

Compression works at the file level by squeezing out the redundancy within a particular file. It compacts redundant data strings and is particularly useful for reducing the storage capacity required by databases, backups, user files and log files. Compression has been used for many years and works well within a file. Recent advances have yielded “smart compression,” which performs a quick analysis to determine whether applying compression will be beneficial. This approach lets smart compression save CPU cycles, reduce storage consumption and increase compression throughput efficiency.
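The “quick analysis” step can be illustrated with a simplified sketch: compress a small sample first and skip the full file if the sample doesn’t shrink enough. The zlib codec, the sample size and the savings threshold below are arbitrary assumptions for illustration, not any vendor’s actual heuristic.

```python
import zlib

SAMPLE_SIZE = 4096   # bytes to test before spending CPU on the whole file
MIN_SAVINGS = 0.10   # skip compression unless the sample shrinks by at least 10%

def smart_compress(data: bytes) -> tuple:
    """Compress data only when a quick sample test suggests it will pay off.

    Returns (payload, was_compressed). Already-compressed or random data
    (JPEGs, encrypted files, etc.) fails the sample test and is stored as-is,
    saving CPU cycles for data that actually benefits.
    """
    sample = data[:SAMPLE_SIZE]
    if sample and len(zlib.compress(sample)) < len(sample) * (1 - MIN_SAVINGS):
        return zlib.compress(data), True
    return data, False
```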

When smart compression is combined with next-generation deduplication technology, it can reduce data storage requirements, optimize system performance and shrink storage consumption to less than three percent of its previous level (a 35X reduction), with related costs falling accordingly. These data optimization technologies deliver enormous business value by reducing storage consumption, optimizing system performance and shrinking the data footprint (floor space, power and cooling). These savings drop straight to the business bottom line by reducing operating and capital expenses.
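As a rough worked example of that figure (the capacity numbers here are hypothetical), a 35X reduction leaves roughly 1/35, or under three percent, of the original footprint:

```python
# Illustrative arithmetic only: a 35X reduction means storing 1/35 of the original.
original_tb = 100                 # hypothetical 100 TB of raw data
reduction_factor = 35             # combined dedup + compression, per the figure above
optimized_tb = original_tb / reduction_factor
print(f"{optimized_tb:.1f} TB stored ({optimized_tb / original_tb:.1%} of original)")
# -> 2.9 TB stored (2.9% of original)
```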
As the industry shifts to broader data optimization use, data growth issues can be mitigated to a point where companies can keep pace affordably while mining and analyzing data for strategic purposes. With more efficient use of resources, information is once again an asset that can be used for greater profitability and competitive differentiation.
 
