Key challenges of developing a big data lake

More than three quarters of executives around the world say they would like to incorporate AI into their business practices over the next three years, but there is no reason to wait: the technology is already transforming every aspect of the way an organization operates.

Aashish Kalra Sep 14th 2018

For years, experts have spoken about the future of AI in making a tangible difference to businesses, and we believe that the future is now. More than three quarters of executives around the world say they would like to incorporate AI into their business practices over the next three years, but there is no reason to wait: the technology is already transforming every aspect of the way an organization operates.

The size of a business does not matter, as there are AI services and solutions that can be integrated into any business model to improve the way the business operates. The ultimate goal of most businesses is to help their clients analyze their business needs and challenges, develop a unique plan to meet them, and put that plan into action.

Access to large volumes of unstructured and structured data is changing the information landscape at our disposal. Enterprises that are at the forefront of Big Data and Cloud adoption are the ones that are going to have a competitive advantage because of the innate capabilities of AI to create measurable impact from datasets.
Big Data is transforming the way businesses function, shifting them from a transactional to a relationship basis. Exponential growth in unstructured data and in real-time data from IoT networks, together with the economics of digital storage and cloud computing, is driving ML adoption across industries. According to Gartner, 80 percent of enterprise data is unstructured and critical for strategic business decisions, which adds to the inevitable need for AI and ML implementation.

The old ways of data warehousing are not sufficient to deal with real-time data and to provide the right information at the right time in the right format. This is leading to disruption and innovation in technologies that can create data lakes capable of storing data in its native format, with flexible access, while remaining future-proof.
Organizations should consider building a data lake to overcome limitations in data consolidation and standardization, single-source data availability, and agility in managing changes to data and processes. Unlike data warehouses, data lakes have immense potential to present businesses with new insights, but businesses need to take calculated steps and a pragmatic approach when implementing them.

Data lakes are highly agile, and less time is required to retrieve data for any event or decision, while data warehouses are more suitable for data with a defined use that has been utilized in the past for already known decisions. Thus, the agility of data lakes can help in getting answers to new problems, but handling varying data formats and cleansing them can be a formidable task, as can building the lake for the right usage by every department.
Data lakes need to be business centric rather than IT centric. Every organization needs to focus on data consciousness by recruiting people who understand the role data can play in the business.

Data lakes and data warehouses each have their own advantages and disadvantages, and whether to choose one or both depends on enterprise requirements, data ingestion and other factors. Compared to data warehousing, data lake techniques save the time required to define a schema up front, enabling a more agile and flexible architecture that helps businesses get a panoramic view of cross-functional data sets.
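To make the point about predefined schemas concrete, here is a minimal sketch of the "schema on read" idea commonly associated with data lakes: raw records are kept in their native form, and each consumer applies its own schema only when it reads them. The sample records, the field names and the helper `read_with_schema` are hypothetical and used purely for illustration; a real lake would sit on distributed storage and a query engine rather than a Python list.

```python
import json

# Hypothetical raw records, kept exactly as they arrived (native format).
RAW_EVENTS = [
    '{"customer": "a1", "amount": "19.90", "channel": "web"}',
    '{"customer": "b2", "amount": 5, "device": {"type": "sensor"}}',
    '{"customer": "a1", "note": "no amount recorded"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    Keeps only the fields named in `schema`, casts each present value with
    the given type, and uses None where a record lacks the field.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# Two departments project the same raw data differently,
# without anyone having agreed on a schema before the data was stored.
finance_view = read_with_schema(RAW_EVENTS, {"customer": str, "amount": float})
marketing_view = read_with_schema(RAW_EVENTS, {"customer": str, "channel": str})
```

The flexibility has a cost: missing or inconsistent fields surface at read time (here as `None`), so every consumer has to handle them, which is the cleansing burden described earlier.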

Summing up, businesses are going to end up as ‘haves and have-nots’ in the AI space. There will be businesses with the resources to implement data lakes and use AI to reach decisions, and there will be businesses standing still in the have-not space. As we approach the future, we will see startups and enterprises alike take advantage of data lakes to create a valuable funnel of insights.

Aashish Kalra is Chairman, Cambridge Technology.

Disclaimer: This article is published as part of the IDG Contributor Network. The views expressed in this article are solely those of the contributing authors and not of IDG Media and its editor(s).