On Tuesday, MapR introduced its Streams real-time service based on Kafka. Another up-and-coming player in the big data, open source space has made a related announcement that is just as important. Confluent has updated its own real-time messaging system based on Apache Kafka, the Apache Software Foundation project that underlies the social network LinkedIn, popular industry disruptor Uber, and many others.
Also on Tuesday, Confluent, founded by the engineers behind Apache Kafka, announced the release of open source Confluent Platform 2.0, which is based on an updated Apache Kafka 0.9 core.
Apache Kafka is a software project that was incubated inside LinkedIn and then donated to the Apache Software Foundation. The engineers who worked on the Kafka project at LinkedIn went on to start the company Confluent, a major committer to the Apache Kafka project.
“Confluent’s mission is to help companies make the transition to using real-time data,” cofounder and CEO Jay Kreps told InformationWeek in an interview. While at LinkedIn, Kreps and his fellow engineers Jun Rao and Neha Narkhede saw that most companies operated with continuous streams of data, but enterprise systems weren’t designed to take advantage of them.
In 2008, the trio started creating Kafka, a distributed, real-time messaging system to work with all the data at LinkedIn. But they eventually realized that bringing this system to the masses couldn’t happen from within the business-oriented social networking site, so they formed Confluent.
Still in start-up mode, the one-year-old company closed a $24 million B round of venture capital funding in July 2015, which it said it plans to use to expand its platform. But Kafka itself is already being used by some of the giants that have pioneered the use of real-time data to drive market disruption, such as Uber, Airbnb, and Pinterest. Kreps said that Netflix is also swapping out its proprietary home-built system for Kafka-based technology.
The company describes its technology as a central stream data pipeline that turns an organization’s data into readily available low-latency streams and acts as a buffer between systems that might produce or consume such data at different rates.
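To make the buffering idea concrete, here is a minimal sketch in Python. It is a toy, in-memory illustration and not Confluent's or Kafka's actual implementation or API: the `StreamLog` class, its method names, and the consumer names are all hypothetical. It assumes the pipeline can be modeled as an append-only log in which each consumer keeps its own read position, so a slow reader never holds up a fast writer.

```python
from collections import defaultdict


class StreamLog:
    """Toy in-memory stand-in for a stream pipeline (hypothetical, not Kafka's API):
    an append-only log that tracks a separate read position for each consumer."""

    def __init__(self):
        self._records = []                 # append-only record log
        self._offsets = defaultdict(int)   # consumer name -> next index to read

    def append(self, record):
        """Producer side: add a record and return its position in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=1):
        """Consumer side: return up to max_records unread records."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def lag(self, consumer):
        """Number of records this consumer has not read yet."""
        return len(self._records) - self._offsets[consumer]


log = StreamLog()
for i in range(5):                                 # a fast producer writes five events
    log.append(f"event-{i}")

fast = log.poll("fast-consumer", max_records=5)    # reads everything at once
slow = log.poll("slow-consumer", max_records=2)    # reads only two for now
```

In this sketch the slow consumer is simply three records behind after its first read, and it can catch up later without the producer or the fast consumer ever waiting on it.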
Confluent Platform 2.0 adds a framework for management of connections to other systems. Kreps said it is now easier for organizations to plug Kafka into the other systems that run in enterprises with Kafka Connect, a connector-driven data integration feature for large-scale real-time data import and export for Kafka.
The platform also adds features for large enterprises that deal with highly sensitive personal financial or medical data, and that operate multi-tenant environments with strict quality of service standards, according to the company.
New security features include encryption over-the-wire using SSL, authentication, and authorization that can be set on a per-user or per-application basis, and configurable quotas that enable throttling of reads and writes, according to the company.
In addition, Confluent Platform 2.0 lets developers integrate data sources without the need for writing code, and it offers a 24/7 production environment with automatic fault-tolerance, transparent, high-capacity scale-out, and centralized management, the company said. It also adds clients for different languages, including ones for C programmers, Kreps said.
Last week, Confluent also announced plans to host the first-ever Apache Kafka Summit. Scheduled for April 26, 2016, in San Francisco, the event will bring together people who are using the core Kafka technology to talk about best practices and use cases, Kreps said.
“There hasn’t been a gathering of these people other than on mailing lists,” Kreps said. “This will be the first event of its kind, and everybody is really excited about that.”