How to manage your Big Data

By Alex McMullan, Field CTO, Pure Storage

The right knowledge at the right time has always been crucial to business success, so it is no surprise that companies are beginning to think about Big Data and Business Intelligence as something they should have and use as a matter of course.


We hear plenty about the various methods of extracting value from data, but what’s less understood is what the best infrastructure to house and manage big data for the next ten years should look like.


The problem is that ten years is a very long time – and when it comes to big data, the landscape is still evolving and isn’t yet fully formed. Unsurprisingly, people misunderstand what they need to do to turn this technological innovation into a business tool.


Big Data can be used to spot correlations – but, as Tim Harford, the FT correspondent who coined the phrase Digital Exhaust, said, it doesn’t provide the theory to test against the pattern, and it can be easy to mistake correlation for causation.


The last moving part
The stumbling block for many organisations when it comes to managing big data lies with the last bit of the data centre that still physically moves – the hard disks. To even come close to matching the speed of flash, you have to bunch several together into a costly disk array that eats up room, energy and budget. Those disks could spend more than 95% of their time and energy seeking and rotating, and less than 5% actually doing the job of writing, erasing and reading data. This can account for more than 40% of a data centre’s power budget, which is crippling for any business.
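
To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes that the 40% refers to the disk share of the power budget and that the 95/5 split applies to energy as well as time; the function name and its parameters are illustrative only, and the inputs are the article's round numbers rather than measurements.

```python
def split_power_share(disk_share: float, overhead_fraction: float) -> tuple[float, float]:
    """Split the disk share of a power budget into overhead and useful I/O.

    disk_share: fraction of the total power budget consumed by disks (assumed 0.40).
    overhead_fraction: fraction of disk energy spent seeking and rotating (assumed 0.95).
    """
    if not 0.0 <= disk_share <= 1.0 or not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("shares must be between 0 and 1")
    overhead = disk_share * overhead_fraction  # energy spent moving, not serving data
    useful = disk_share - overhead             # energy spent reading, writing, erasing
    return overhead, useful


# Illustrative inputs taken from the figures quoted above.
overhead, useful = split_power_share(disk_share=0.40, overhead_fraction=0.95)
print(f"seeking and rotating: {overhead:.0%} of the power budget")
print(f"reading and writing:  {useful:.0%} of the power budget")
```

On those assumptions, roughly 38 points of the total power budget go on mechanical overhead and only about 2 points on the data operations themselves.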


Flash can reduce this because it uses a fraction of the electricity, which reduces running costs, generates far less heat, and requires far less energy for cooling, all for less space, too. And because of the sheer speed of flash, it makes regular (even real-time) analysis and insight into big chunks of data both possible and affordable.


Knowing before or during, not after
As organisations gather more granular data on what they do, the potential to gain understanding and plan accordingly becomes a more profitable undertaking. Retailers we have helped with our All Flash Arrays have had this problem repeatedly: they know that valuable insight lies inside the sales and distribution data for each day’s trading. But because it takes a prohibitively long time to process batches, they can’t get that insight in time to act on it. This is a standard problem across all industries; the scale of the data they handle is increasing rapidly, possibly far faster than their IT budgets can cope with. On top of increased scale is significant complexity. The variety of data points collected by store card programmes, for example, offers the chance to understand shopper habits and overall buying patterns – but only if there is enough data processing power to sift insight from the data.


In short: the big benefit of flash is that you can ask more questions of the data you hold, more frequently, for the same cost as a big data solution that relies on a hard disk array. If you're a retailer, like Picard or Kiabi in France - both of whom are Pure Storage customers - you can analyse till receipts, inventory and buying patterns on a far more frequent basis than before. That could make the difference between having enough of a popular item of clothing, or a fast-selling food or beverage brand, to meet demand, and selling out of those items and losing revenue as a result.


Don’t be put off by old perceptions
Some people think that flash is not affordable for mainstream adoption. As a result, some companies take cautious steps towards an all-flash solution and choose the comfort of a legacy array strapped with flash or the promise of a hybrid. But both of these will only help in the short term. Hybrid storage appeals to companies that want to combine the speed of flash with the cost of disk. And a disk-based storage array strapped with some flash doesn’t work quite the same as an all-flash array with hardware and software built from the ground up for flash specifically. Each of these systems combines the weaknesses as well as the strengths of flash and disk, which ultimately makes them fall short. This is because flash stores and serves data in a manner that is fundamentally different from disk. Read activity is lightning fast, and too many writes can wear out the medium, but only if you are trying to make flash behave like disk.


Up to speed in no time
There is no longer any need for long and costly integration work to get the benefit of flash. If you buy the right all-flash array, you can generally take out the old hard disk setup and have a flash replacement up and running within an hour, with minimal disruption to the business and a far faster big data solution in place.
