Open source changed everything about how we write code. Can open-sourcing data do the same for big data?

The experts say yes.

Free software has been with computing since day one, but proprietary software ruled businesses. It took open source and its licenses to transform how we coded our programs. Today, even Microsoft has embraced open source. Now, The Linux Foundation has created a new open license framework, the Community Data License Agreement (CDLA), which may do for data what open source did for programming.

In Prague, at Open Source Summit Europe, The Linux Foundation announced a new family of open-data licenses. The CDLA licenses are an effort to define a licensing framework to support collaborative communities built around curating and sharing “open” data.

Specifically, the CDLA licenses enable individuals and organizations to share data as easily as they share open-source code. These licensing models are meant to help people form communities to assemble, curate, and maintain big data. This will bring new value to data-based communities and businesses and power new data-based applications.

Thanks to open-source programs such as Hadoop, Spark, and MongoDB, big data has enabled us to transform unstructured data into useful information. Today, the challenge is to assemble the critical mass of data for those tools to analyze. The CDLA licenses are designed to help governments, academic institutions, businesses, and other organizations open and share data, with the goal of creating communities that curate and share data openly.

For example, the Foundation stated, “If automakers, suppliers and civil infrastructure services can share data, they may be able to improve safety, decrease energy consumption and improve predictive maintenance. Self-driving cars are heavily dependent on AI systems for navigation, and need massive volumes of data to function properly. Once on the road, they can generate nearly a gigabyte of data every second. For the average car, that means two petabytes of sensor, audio, video and other data each year.”
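
As a rough sanity check on those figures, the sketch below works out how much driving time two petabytes a year implies at one gigabyte per second. It assumes decimal units (1 PB = 1,000,000 GB) and a constant data rate, neither of which the Foundation's statement specifies, and the function name is purely illustrative.

```python
# Assumptions (not from the article): decimal units, so 1 PB = 1,000,000 GB,
# and a steady rate of exactly 1 GB per second whenever the car is driving.
GB_PER_SECOND = 1.0
PB_PER_YEAR = 2.0
GB_PER_PB = 1_000_000


def implied_driving_hours_per_year(pb_per_year=PB_PER_YEAR,
                                   gb_per_second=GB_PER_SECOND):
    """Hours of driving needed to produce pb_per_year at gb_per_second."""
    seconds = pb_per_year * GB_PER_PB / gb_per_second
    return seconds / 3600


hours_per_year = implied_driving_hours_per_year()
hours_per_day = hours_per_year / 365
print(f"{hours_per_year:.0f} hours per year, "
      f"about {hours_per_day:.1f} hours per day")
```

Under these assumptions the two numbers line up: roughly 556 hours of driving a year, or about an hour and a half a day.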

But how do you legally share valuable data? Until now, there has been no standard way to legally manage data sharing. Each data-sharing agreement is unique. That’s where the CDLA licenses come in.

“Data is the oil of the 21st century,” said Mark Radcliffe, partner and global chair of the FOSS Practice Group at global legal powerhouse DLA Piper. “Yet, the legal protection for and licensing of data is in its infancy. Many current licenses take a variety of inconsistent (and frequently incomplete) approaches to the use and licensing of data. The CDLA provides a valuable tool for companies and lawyers in managing the use and licensing of data. In the best tradition of the open source community, The Linux Foundation used a collaborative process to get the best possible agreement. I will be using the CDLA for many of my clients.”

By Stephen J. Vaughan-Nichols

Source: ZDNet