How to Gear Up for Big Data
Added 30th Nov 2012
You will need to think about big data.
Big data analysis got its start with large Web service providers such as Google, Yahoo and Twitter, which all needed to make the most of their user-generated data. But enterprises, too, will use big data analysis to stay competitive and relevant.
You could be a really small company and have a lot of data. A small hedge fund may have terabytes of data, says Jo Maitland, GigaOm research director for big data. In the next couple of years, a wide range of industries, including healthcare, the public sector, retail, and manufacturing, will benefit financially by analyzing more of their data, consulting firm McKinsey and Company anticipated in a recent report.
There is an air of inevitability with Hadoop and big data implementations, says Eric Baldeschwieler, chief technology officer of Hortonworks, a Yahoo spinoff company that offers a Hadoop distribution. It’s applicable to a huge variety of customers. Collecting and analyzing transactional data will give organizations more insight into their customers’ preferences. It can be used to better inform the creation of new products and services, and allow organizations to remedy emerging problems more quickly.
Useful data can come from anywhere (and everywhere).
You may not think you have petabytes of data worth analyzing, but you will, if you don’t already. Big data is collected data that used to be “dropped on the floor,” says Baldeschwieler.
Big data could be your server’s log files, for instance. A server keeps track of everyone who visits a site and which pages they view while they are there. Analyzing this data can offer insights into what your customers are looking for. While log data analysis is nothing new, it can now be done at dizzying new levels of granularity.
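To make the log file idea concrete, here is a minimal sketch in Python. It assumes the server writes entries in the widely used Common Log Format; the sample lines, the `summarize` function and the field names are hypothetical and for illustration only, since real logs vary with server configuration. It counts how often each page was requested and records which pages each visitor viewed.

```python
import re
from collections import Counter

# Hypothetical sample entries in Common Log Format (addresses are documentation ranges).
SAMPLE_LOG = """\
203.0.113.5 - - [30/Nov/2012:10:00:01 +0000] "GET /products HTTP/1.1" 200 5120
203.0.113.5 - - [30/Nov/2012:10:00:09 +0000] "GET /products/widget HTTP/1.1" 200 2048
198.51.100.7 - - [30/Nov/2012:10:01:15 +0000] "GET /products HTTP/1.1" 200 5120
198.51.100.7 - - [30/Nov/2012:10:02:42 +0000] "GET /support HTTP/1.1" 404 312
"""

# Assumed layout: host, two ignored fields, timestamp, request line, status, size.
LINE_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def summarize(log_text):
    """Return (hits per page, pages viewed per visitor) for the given log text."""
    page_hits = Counter()
    pages_by_visitor = {}
    for line in log_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        page_hits[match["path"]] += 1
        pages_by_visitor.setdefault(match["host"], []).append(match["path"])
    return page_hits, pages_by_visitor


page_hits, pages_by_visitor = summarize(SAMPLE_LOG)
print(page_hits.most_common(1))
print(pages_by_visitor)
```

At big data scale the same counting logic would be spread across many machines rather than run in a single script, but the questions being asked of the logs are the same.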
Another source of data will be sensor data. For years now, analysts have been speaking of the Internet of Things, in which cheap sensors are connected to the Internet, offering continual streams of data about their usage. They could come from cars, or bridges, or soda machines. “The real value around the devices is their ability to capture the data, analyze that information and drive business efficiencies,” says Microsoft Windows Embedded General Manager Kevin Dallas.
You will need new expertise for big data.
When setting up a big data analysis system, your biggest hurdle will be finding the right talent who knows how to work the tools to analyze the data, according to Forrester Research analyst James Kobielus.
Big data relies on solid data modeling. Organizations will have to focus on data science, says Kobielus. They have to hire statistical modelers, text mining professionals, and people who specialize in sentiment analysis. This is not necessarily the skill set that today’s analysts, versed in business intelligence tools, already have.
Such people may be in short supply. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions, McKinsey and Company estimates.
Another skill you will need to have on hand is the ability to wrangle the large amounts of hardware needed to store and parse the data. Managing 100 servers is a fundamentally different problem than handling 10 servers, Maitland points out. You may need to hire a few supercomputer administrators from the local university or research lab.
Big Data doesn’t require organization beforehand.
CIOs who are used to rigorously planning out every sort of data that would go into an Enterprise Data Warehouse (EDW) can breathe a little easier with big data setups. Here, the rule is, collect the data first, and then worry about how you will use it later.
With a data warehouse, you have to lay out the data schema before you can start laying in the data itself. “This basically means you have to know what you are looking for beforehand,” says Jack Norris, vice president of marketing for MapR. As a result, “you are flattening the data and losing some of the granularity,” he says. “Later on, if you change your mind, or want to do a historical analysis, you’ve limited yourself.”
“You can use a [big data repository] as a dumping ground, and run the analysis on top of it, and discover the relationships later,” says Norris. Many organizations may not know what they are looking for until after they’ve culled the data, so this kind of freedom “is kind of a big deal,” he says.
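The "collect first, decide later" approach Norris describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the raw events, the field names and the `read_with_schema` helper are all hypothetical. Records are kept exactly as they arrived, and each analysis picks out only the fields it needs at read time.

```python
import json
from collections import defaultdict

# Hypothetical raw events, stored as they arrived with no schema enforced up front.
RAW_EVENTS = [
    '{"user": "a", "action": "view", "item": "widget", "ms": 120}',
    '{"user": "b", "action": "buy", "item": "widget", "price": 9.5}',
    '{"user": "a", "action": "buy", "item": "gadget", "price": 20.0, "coupon": "X1"}',
]


def read_with_schema(raw_lines, fields):
    """Project each raw record onto the fields one particular analysis needs."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields missing from a record come back as None instead of failing.
        yield {name: record.get(name) for name in fields}


# A question asked today: revenue per item.
revenue = defaultdict(float)
for row in read_with_schema(RAW_EVENTS, ["action", "item", "price"]):
    if row["action"] == "buy":
        revenue[row["item"]] += row["price"]

# A question thought of later: who used a coupon? The raw data still has the answer.
coupon_users = [
    row["user"]
    for row in read_with_schema(RAW_EVENTS, ["user", "coupon"])
    if row["coupon"]
]

print(dict(revenue))
print(coupon_users)
```

Had the events been flattened into a fixed table of user, item and price when they were loaded, the coupon field would have been lost before anyone thought to ask about it.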
Big Data is not only about Hadoop.
When people talk about big data, most of the time they are referring to the Hadoop data analysis platform. “Hadoop is a hot button initiative, with budgets and people being assigned to it,” in many organizations, Kobielus points out. Ultimately, however, you may go with other software.
Recently, legal research giant LexisNexis, no slouch at big data analysis itself, open sourced its own platform for analysis, HPCC Systems. MarkLogic has also outfitted its own database for unstructured data, the MarkLogic Server, for Big Data style jobs. Another tool gaining favor in the US is the Splunk search engine, which can be used to search and analyze data generated by machines, such as the log files from a server. “Whatever data you can extract from your logs, there is a good chance that Splunk can help,” notes Curt Monash of Monash Research.