Apache Spark vs Apache Flink
A while back, at the Flink Forward conference, I demonstrated how easy it is to design a simple performance experiment that pits Apache Spark against Apache Flink. Right after that, my two master's students, Shelan and Ashansa, and I started a more extensive study to compare Flink and Spark for both batch and stream processing. For the first phase of this study we chose two other experiments and attempted to make them easily reproducible on Amazon EC2 with Karamel. For batch processing we reused the TeraSort experiment by Dongwon Kim; you can read about our results here. For stream processing we used the Yahoo Streaming Benchmark; our results can be found here.
OpenStack added, 1000 commits reached!!
On Feb 2, support for OpenStack Nova was added to Karamel, which is great news. I need to thank our friend and one of the oldest contributors to the Karamel project, Alberto Lorente Leal, for making this happen. Now you can use Karamel to run any distributed system on your own OpenStack-managed clusters.
With the addition of OpenStack, the number of Karamel commits reached 1000, with 87 fixed/closed issues. Congratulations to all of you who have helped make this a success!!
Two Master Theses: 1. Support for Elastic Computing 2. Orchestration of Container-Based Virtualization in Karamel
On January 15, our two new master's students, Shelan Perera and Ashansa Perera, started their theses with us. Shelan and Ashansa are part of the EMDC Distributed Computing program 2014, a joint program between KTH-Sweden and IST-Lisbon. Both have worked on open source enterprise middleware, and Shelan has also been an Apache committer. They are passionate about contributing to open source projects, and they have good industrial experience in cloud computing. Welcome Shelan and Ashansa, I hope that you will enjoy the journey and that we will do nice work together.
Degree Fair 2015 @ Kista Electrum
On October 14, the day after I arrived from Berlin, there was a thesis fair for students looking for a degree thesis. Many research groups from the ICT and Physics departments were there, along with plenty of companies. Hooman and I managed to prepare this colourful poster for Karamel in a short time, as well as some handouts for our visitors. Among other things, they mentioned a few exciting ideas that are currently needed for extending Karamel. Many students stopped by and discussed some of them, while we tried to clarify the ideas by showing some recorded screencasts of the project.
Karamel went live at Flink-Forward 2015
The first Apache Flink conference took place in Berlin, where Karamel was officially presented for the first time. Personally, I found the conference quite rewarding, with its growing atmosphere and ambitious people. Plenty of brilliant talks were presented around this stream processing engine, and despite the rumor that Flink is just a European version of Apache Spark, the incubation of new ideas beyond what Spark has was obvious.
Among others, Karamel had also been selected for a presentation. I gave a short talk about Reproducing Distributed Experiments, with a demo of designing an experiment in Karamel. The experiment measured the performance of word count in Flink on 4 AWS VMs. The demo session went very well and Karamel got very good feedback from the audience.
Even though the original idea was to compare Flink and Spark with this simple demo, due to the time limit we decided to do it just on Flink. Just before my talk I decided to run that experiment while scaling out the number of nodes (taskmanagers). It was amazing that in less than one hour I could run the scale-out experiment for 2, 4, 8 and 16 taskmanagers and measure the performance of all of Karamel's phases and the Flink word count. It led me to the following plot.
SAASFEE @ VLDB 2015
On Sep 4, Jörgen Brandt presented the SAASFEE paper at the VLDB conference in Hawaii. SAASFEE is a Scientific Workflow Execution Engine by Marc Bux - I am glad to be among the co-authors. Marc used Karamel for installation on EC2 and bare metal - video tutorials are available here.
Apache Flink Meetup @ Bay Area
On August 27, a Flink Meetup took place in the Bay Area, hosted by MapR. At this meetup Paris Carbone presented Karamel. Karamel is mentioned in the MapR blog.
Master Thesis: Cost Effective Recommendation System for Inter-Cloud Clusters
I am glad to announce that today, August 1, our new master's student, Sina Molazemhosseini, has joined Karamel. He is going to work on a unified cost model and recommendation system for multi-cloud clusters. Welcome aboard Sina, and I hope that you will enjoy the challenge :)