Topics: Data Lakes, Analyze, Data Center
Adam Kocoloski, CTO of IBM Cloud Databases, talks about data lakes: what they are, how people use them, and the kinds of things they will be thinking about as they set one up to power their applications. A data lake is a centralized data storage area that houses both structured and unstructured data from a single source or many sources, and it exists because organizations are awash in data. They have systems of record, systems of engagement, streaming data, batch data, and internal and external data constantly flowing in. It is the combination of these different kinds of data sources that lets people draw powerful insights about what users are doing, how the world is working around them, and how they are interacting with technology, which ultimately helps companies develop more intelligent applications.

Data lakes start by collecting all of those different, individualized types of data sources through a common ingestion framework. That ingestion framework typically needs to support a diverse array of data types, and it aims to standardize and centralize all of a user's data into a common storage repository. That centralization isn't strictly required, but people typically don't want to analyze the source data directly. They want to take a copy of it, so that they have the flexibility to do the kinds of things they need to do with that data.
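The ingestion pattern described above can be sketched in a few lines of Python. This is a hypothetical, minimal illustration (the function and source names are invented, not from any IBM product): source-specific readers turn heterogeneous inputs into plain records, and an ingest step copies them, tagged by origin, into a shared landing zone so the source systems are never analyzed or modified directly.

```python
import csv
import io
import json

def read_csv_source(text):
    """Yield records from a CSV source (one kind of structured data)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row

def read_json_lines_source(text):
    """Yield records from a JSON-lines source (semi-structured data)."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)

def ingest(sources, landing_zone):
    """Standardize records from diverse sources into one repository.

    Each record is tagged with its origin and stored as a copy,
    leaving the source data untouched.
    """
    for name, records in sources.items():
        for record in records:
            landing_zone.append({"source": name, "record": dict(record)})
    return landing_zone

# Two toy sources: application events (batch CSV) and a sensor stream (JSON lines).
csv_data = "user,action\nalice,login\nbob,purchase\n"
json_data = '{"sensor": "t1", "temp": 21.5}\n{"sensor": "t2", "temp": 19.0}\n'

lake = ingest(
    {
        "app_events": read_csv_source(csv_data),
        "iot_stream": read_json_lines_source(json_data),
    },
    landing_zone=[],
)
print(len(lake))  # 4 records, each a tagged copy of a source record
```

A real ingestion framework would write to durable object storage and handle schema evolution and late-arriving data, but the core idea is the same: diverse inputs, one common repository of copies.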