Thoughts about Strata, Big Data, Data Science, D3.js and Data Visualization
„Create more value than you capture.“
From the 19th to the 21st of November, Una, Miroslav and I, from Kaywa Zurich and Belgrade, were in Barcelona. This was our first time at an O’Reilly Strata + Hadoop World conference, which is mainly about big data and data science. Strata takes place four times per year: twice in Europe and twice in the US.
In Barcelona, the weather was impressively sunny and warm, the huge conference location was within view of the Mediterranean, the food was delicious and the programme was packed. It took me over two weeks to digest it all and write about it.
A main takeaway was: data science, data visualization and big data are still nascent, and the US dominates them so far (at least at this conference). The finance and insurance sectors are well acquainted with big data (some since the 1980s), as are retargeting and real-time bidding advertising companies. One could call them the Data Doers, in comparison to the Data Hoarders from IT, the Data Mixers from the creative sector and the Data Deniers, found mainly in manufacturing and retail. Maybe this is why Tim O’Reilly was so insistent in exhorting the audience once again: „Create more value than you capture“.
Wednesday was tutorial day, and I chose Communicating Data Clearly by Naomi Robbins and the D3.js Tutorial – D3 For Everyone! by Sebastian Gutierrez from DashingD3js.com.
Prior to the conference, we already wanted to switch from Google Charts to D3.js to create the data visualizations (line and doughnut charts) for our QR Management. There were two reasons for this: first, Google Charts are not as flexible as Mike Bostock’s D3; second, Google Charts do not work in Hong Kong or Mainland China. We had only discovered this recently, when some of our Hong Kong clients told us they could not see the visualization of the data and asked if we were using Google’s products. At that time we were, so we saw the D3.js workshop as an opportunity to learn and then to embrace D3.js.
Before D3.js, we went to Communicating Data Clearly, where Naomi Robbins showed us some bad chart examples and then offered some good common-sense advice: „stacked bar charts are difficult to read, grouped bar charts are almost always better“, „be aware of pre-attentive processing and use it to your benefit“, „reading length charts is easier than area charts and area charts are easier than volume charts“, and I could go on. After the first one and a half hours, however, her talk became hard to follow, as my brain was saturated by the amount of data she was throwing at us. In retrospect, I also regret not having gone to the Spark workshop by Paco Nathan.
Sebastian told us repeatedly: „data visualization is a super power“, and this might well be true. The bigger and more complex the data, the more essential visualization becomes, so that we can understand it, discover patterns, detect anomalies, etc. And this holds true equally for exploring and explaining data.
Sebastian’s tutorial, together with Scott Murray’s Interactive Data Visualization for the Web, gave us the knowledge to accomplish our goal: replacing our Google Charts with D3.js. Our new charts are now online at http://qrcode.kaywa.com. And we are looking forward to doing more creative stuff. We are also open to suggestions, so let us know if you have any ideas, hints or tips regarding data visualization.
All photos courtesy of Miroslav Mitrovic.
The photos are from Strata, but not from the workshops in the text.