Do You Really Want to Send Your Data FedEx?
Jinfo Blog
23rd April 2013
By Matt Benati
Abstract
Moving data, and big data in particular, is a specialist task. Matt Benati explains why it's essential that large datasets be transferred speedily and efficiently - and how analytics relies on the freshest data for competitive advantage.
Item
In this age of always-on, cloud-based connections, most of us never think much about how data gets from Point A to Point B. But movement of data is the core of what Attunity does, and when the data is actually "big data", Attunity's expertise is essential for realising the value of computing power.
FreePint conducted an interview with Matt Benati, VP Global Marketing for Attunity, to understand the impact of improved data movement on the success of big data projects. This extract focuses on the nub of the question: "Why does it matter how the data moves?"
FreePint: When I read about big data, the movement of data isn’t something I run into much. Most of the case studies and articles look at analytics and output. Tell me why data movement is an important part of the equation.
Matt Benati: If you think about the principles of big data, you’ll see that they rely on data of all kinds streaming continuously into the system - it might be HR data, financial data, inventory data. Before you can even get to the analytics, you have to think about how that data is going to get there efficiently and in a timely manner.
Analytics relies on three steps: (1) build a model, (2) train the model and (3) score the data. You’ve got to be bringing fresh data in all the time, moving it through an iterative process to train the model. And when the model’s trained, you have to keep bringing in that data to get the benefit.
Do you know how most big data is moved today?
FreePint: No...
MB: FedEx. Seriously. Data is transferred to a set of CDs, because there’s too much data to put on a single disk. The CDs are shipped to a data warehouse centre to upload the data to the cloud, and THEN you’re ready to train the model, score the data, do the analysis, and so on. Even if you do this as quickly as possible, the data is already a day old.
We call this the big data bottleneck.
FreePint: How does Attunity address this?
MB: Attunity’s heritage is in data availability. Without going into too much of the technical detail, we streamline the process of moving data. Traditionally, data transfer systems are an “all or nothing” proposition - you transfer the whole dataset, or none of the dataset.
What Attunity focuses on, however, is that only a small fraction of the data might have changed. After the initial data load, if you can transfer just the changed data rather than the whole dataset every time, it has a huge impact on the speed and efficiency of transfer. If traditional transfer is a commuter train, making several stops along the way, Attunity has created a bullet express train that goes from Point A to Point B rapidly, without stops.
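To make the contrast concrete, here is a minimal Python sketch of the two approaches described above: copying the whole dataset every time versus sending only the rows that changed since the last load. It is an illustration only and not Attunity's implementation. The function names (`full_transfer`, `compute_changes`, `apply_changes`) and the in-memory tables are hypothetical, and comparing two snapshots is simply the shortest way to show the idea; production tools may detect changes by other means.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Table = Dict[int, Row]  # hypothetical in-memory table: primary key -> row


def full_transfer(source: Table) -> Table:
    """'All or nothing' transfer: copy every row, changed or not."""
    return {key: dict(row) for key, row in source.items()}


def compute_changes(previous: Table, current: Table) -> Tuple[Table, List[int]]:
    """Find what differs between the last transferred snapshot and now.

    Returns the rows that are new or modified (upserts) and the keys
    of rows that no longer exist in the source (deletes).
    """
    upserts = {
        key: dict(row)
        for key, row in current.items()
        if previous.get(key) != row  # new key, or same key with new values
    }
    deletes = [key for key in previous if key not in current]
    return upserts, deletes


def apply_changes(target: Table, upserts: Table, deletes: List[int]) -> None:
    """Bring the target up to date using only the changed rows."""
    for key, row in upserts.items():
        target[key] = dict(row)
    for key in deletes:
        target.pop(key, None)


# Initial load: everything has to move once.
source_before: Table = {i: {"id": i, "qty": 10} for i in range(1000)}
target = full_transfer(source_before)

# Later, a handful of rows change at the source.
source_now: Table = {key: dict(row) for key, row in source_before.items()}
source_now[5]["qty"] = 7                    # modified row
source_now[1000] = {"id": 1000, "qty": 1}   # new row
del source_now[42]                          # deleted row

upserts, deletes = compute_changes(source_before, source_now)
apply_changes(target, upserts, deletes)

print(f"rows in source: {len(source_now)}")
print(f"rows sent after the initial load: {len(upserts) + len(deletes)}")
```

In this toy run only the modified, added and deleted rows cross the wire after the initial load, while the target still ends up identical to the source. That gap between "rows in source" and "rows sent" is the saving the interview is pointing at.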
FreePint Subscribers can read the full interview by logging in to view Transferring Big Data by Bullet Express Train.
Editor's Note: Big Data in Action
This article is part of the FreePint Topic Series: Big Data in Action, which includes articles, reports, webinars and resources published between April and June 2013. Learn more about the series here.