On 20 and 21 November 2019 the time has come again: the Digitale Welt Convention (DIGICON) takes place in Munich. This time the event runs under the exciting motto „Artificial Intelligence – Mit kognitiven Technologien zu autonomen Systemen“ (roughly: using cognitive technologies to reach autonomous systems).

This year is already the 4th edition of DIGICON. Over these two days, digital experts and decision-makers from business, politics and science share their experiences and give insight into what we can expect from a digitalized world. DIGICON follows the approach "science meets business".

We are therefore thrilled to use our own stand (stand number to follow) to bridge the gap from data-driven marketing to artificial intelligence and, in this context, to show how the intelligence approach can be used for "Happy Customer, Happy Business".

Tickets and further information can be found here.

Hi, I'm Linda. I am part of the Data Science team at FELD M and was excited to participate in this year's useR!2019 conference, which took place in Toulouse.

That meant four days full of great

  • 3 h tutorials
  • keynotes
  • 30 min blocks of 6 × 5 min lightning talks
  • 1.5 h blocks of 5 × 18 min talks
  • sponsor talks
  • poster session
  • social events

… on up to six parallel tracks!

The complete list of talks, including slides, can be found here: http://www.user2019.fr/talk_schedule/. Video recordings of the keynotes and of all other talks are uploaded here: https://www.youtube.com/channel/UC_R5smHVXRYGhZYDJsnXTwg/videos.

Let me share what I took away from the conference by guiding you through a typical project's timeline. I took a nice machine learning workflow diagram made of hexagons and added a sixth hexagon for the "Communication" of projects.

Let's go through the second, third and sixth hexagons to give some examples of what I took with me from useR! and where we are now taking some deep dives to improve our workflow.

 

  • {tidyr} by the famous Hadley Wickham has been updated (a must-read for everyone advancing in R is the recent second edition of his book “Advanced R”: https://adv-r.hadley.nz/index.html). In the area of web analytics we at FELD M receive raw data in which all touchpoints of all visitors/customers are recorded in rows. In order to analyse customer journeys, we need to reshape the data so that we have the customers in rows and all touchpoints per customer, i.e. the customer journey, in another column. Reshaping data from long format to wide format is therefore a regularly used transformation in Data Science projects. The current functions for reshaping data are spread() and gather(), whose logic many R users have struggled with. So Hadley Wickham showed us the work-in-progress functions pivot_longer() and pivot_wider(), which have more intuitive function and argument names. https://tidyr.tidyverse.org/
  • When working with large data sets we usually use either data.table or SparkR (which we currently prefer over sparklyr because its syntax is closer to PySpark, which makes it easier to switch between Python and R). Both rely on RAM for their performance. Since our datasets often no longer fit into RAM but are still below real big data (where calculations can no longer be handled by a single machine), the newly developed package {disk.frame} (https://rpubs.com/xiaodai/intro-disk-frame) offers an interesting way to store and process medium-sized datasets. Larger-than-RAM data is split up and stored in chunks on the hard drive, and {disk.frame} provides an API for manipulating these chunks. Unlike Spark, {disk.frame} does not require a cluster and can use any function in R.
  • Before we build a model, we first analyse the data on a descriptive level to decide which assumptions the model should make. Visualizing high-dimensional data can be a cumbersome task at this stage. In a tutorial, Di Cook showed us her packages, such as {tourr} https://github.com/ggobi/tourr, which visualizes higher-dimensional (>3) data in an animated rotation. You can take a variable, rotate it out of the projection and see whether the structure persists or disappears. The package {nullabor} https://github.com/dicook/nullabor is a tool for graphical inference. Your data plot is displayed among several random null plots (plots representing your null hypothesis). If the difference is visible, the structure in the plot is probably statistically significant.
  • Due to the individual advantages of Python and R, Data/Software Engineering at FELD M is mainly done in Python, while the analysis (building models, statistical tests) by the Data Science team is more focused on R. Our Data/Software Engineering and Data Science teams already work very closely together on Advanced Analytics projects to take advantage of both areas of expertise and both languages (Python and R). Of course, our general goal is to build our (data) products in one programming language. Nevertheless, we sometimes build prototypes that have to live in both worlds and require both languages. The {reticulate} package https://rstudio.github.io/reticulate/ makes it possible to call Python from within RStudio. Rounded off by the GUI developments around knitting R Markdown, this will make it easier to bridge language silos.
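The long-to-wide reshaping described in the first bullet can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or mimic the {tidyr} API; the column layout (customer ID, timestamp, touchpoint) and the sample rows are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical raw web-analytics rows in long format:
# one row per touchpoint as (customer_id, timestamp, touchpoint).
long_rows = [
    ("c1", 1, "search"),
    ("c2", 1, "email"),
    ("c1", 2, "newsletter"),
    ("c1", 3, "purchase"),
    ("c2", 2, "purchase"),
]


def to_customer_journeys(rows):
    """Collapse long-format touchpoint rows into one journey per customer."""
    journeys = defaultdict(list)
    # Sorting the tuples orders them by customer and then by timestamp,
    # so each journey ends up in chronological order.
    for customer_id, _timestamp, touchpoint in sorted(rows):
        journeys[customer_id].append(touchpoint)
    return dict(journeys)


journeys = to_customer_journeys(long_rows)
for customer_id, journey in journeys.items():
    print(customer_id, " > ".join(journey))
```

Each customer now occupies a single entry, with the ordered list of touchpoints as the customer journey.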

 

  • When it comes to building a model, it is always important to know what causes a variable; as we all know, “correlation != causation”. Under the assumption that a causal relationship leaves a structure in the data, there are many procedures that try to detect this causation. Causaldisco summarizes the causal discovery procedures available in R and filters the procedures appropriate for your data once you choose its properties. http://biostatistics.dk/causaldisco/.
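To make “correlation != causation” concrete, here is a small simulation sketch in plain Python (not one of the procedures listed on Causaldisco). It assumes a hypothetical confounder z that drives both x and y. The two variables correlate strongly, but the correlation largely vanishes once z is regressed out. This kind of conditional-independence pattern is one example of the structure that a causal relationship can leave in data.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """Residuals of a simple least-squares regression of ys on zs."""
    mz, my = mean(zs), mean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]


rng = random.Random(42)  # fixed seed so the run is reproducible
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]      # hypothetical confounder
x = [2 * zi + rng.gauss(0, 1) for zi in z]   # x depends only on z
y = [2 * zi + rng.gauss(0, 1) for zi in z]   # y depends only on z

raw_corr = pearson(x, y)
partial_corr = pearson(residuals(x, z), residuals(y, z))
print(f"corr(x, y)     = {raw_corr:.2f}")
print(f"corr(x, y | z) = {partial_corr:.2f}")
```

Here x does not cause y, nor the other way round. The raw correlation comes entirely from the shared cause z.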

 

All in all, the success of a project depends not only on the methods, such as those mentioned above, but also on the environment you create in your company. Julia Lowndes showed us in her keynote (https://www.youtube.com/watch?v=Z8PqwFPqn6Y&t=2806s) how she and her team work by embracing open data science, openness and the power of welcome.

FELD M is looking forward to taking some deep dives into the learnings listed above and to putting them into practice to improve our workflow and smooth the journey for our customers.

If you are interested in our work, come and check out our portfolio: https://www.feld-m.de/service/data-strategy-advanced-analytics/.

Or if you are an NGO/NPO, come and check out our contribution to Data Science for good with our “Data Ambulance”: https://www.feld-m.de/datenambulanz/

TC Europe 2019 is less than two weeks away, and we are looking forward to the upcoming breakout sessions and keynotes. From 17 to 19 June, three of our colleagues will be among the hundreds of data enthusiasts flocking to Berlin to learn about the latest developments in Tableau and to deepen their skills with the tools. The Tableau conferences we have attended in recent years have more than once laid the foundations for new concepts and ideas. We were often able to apply these directly in projects we worked on with our clients. That is why we can hardly wait to see what comes out of this year's conference!

What we are looking forward to most

Since Embedded Analytics is a topic our clients ask about more and more often, and our first projects are already in development or close to completion, we are very curious about the breakout session Turn Data into Products | The Whys, Whens, and Hows of Embedding Tableau. We are hoping for ideas that could help us develop our existing concepts further and serve even more analytical use cases in the future. The chance to gain insight into the embedding of further use cases in the breakout session Embedded Portal Applications at Deutsche Bahn AG will surely inspire even more ideas.

Since Data Science for Social Good is a topic of growing interest to us, we were very pleased to find the session Viz For Social Good in this year's conference programme. We are really looking forward to meeting like-minded people who try to use their analytical expertise for a good cause! Here you can learn all about the Datenambulanz, one of our first offerings in the area of Data Science for Social Good: https://www.feld-m.de/datenambulanz/

The addition of visualizations to tooltips with the release of Tableau 10.5 was a real game changer. Our dashboards have never been so clean and lean (with secondary charts moved into the tooltip). Never before have our clients had so much information right at their fingertips (or their mouse pointer). The Jedi-level session Next-Level Viz in Tooltip will show ways to get even more analytical value out of this feature, for example by making it even more interactive or "drillable". We are excited to hear all about it.

Since our clients and, in the case of Tableau projects, our dashboard users are always at the centre of our attention, we are interested in finding even better ways to make data more accessible and meaningful for them, and thus in creating solutions that enable them to gain valuable insights and drive their business forward. We are therefore looking forward to attending a variety of sessions that focus on dashboard adoption, the balance between data governance and self-service, and much more: The Secret to Getting People to Use Dashboards, Avoiding the Flatline | Building an Analytical Culture at Your Organisation and How to Build Your User Community and Boost Adoption (to name just a few) will certainly offer valuable inspiration.

Are you attending TC Europe too?

We would be very happy to have a short chat between sessions and talk about your use cases in Visual Analytics! Are you using the Tableau Conference app? Simply search for attendees from FELD M and send us a short message.

Not attending TC Europe? Here you can find out about our services around Tableau and other BI tools: https://www.feld-m.de/service/dashboards-visualisation/

Superweek 2019, a five-day digital analytics conference on a mountain top in Hungary, was quite a unique and certainly very intense experience. I’m still a bit overwhelmed by the number of great lectures and the chats in between them. It is not easy to put all my thoughts on paper, but let’s give it a try. If you want to know how not to “puke with data”, what’s new in Google solutions and how to dance the GTM Boogie, then please bear with me :).

Big topics: BigQuery, Machine Learning and automation

After these few days, I had the feeling that nobody uses the Google Analytics interface any more. Sending data to BigQuery, playing around with BigQuery ML, visualizing with Data Studio: this is now not only common but the obvious approach. It provides data quality and flexibility for advanced users who don’t want to be limited by the tool. The traditional interface remains to keep data easily accessible, but we should still aim to go beyond it. Machine learning and automation came up numerous times, and they definitely are the future of analytics and digital marketing. Predicting conversions in order to define audiences, personalizing the website and gaining new insights were among the solutions presented. My favourite case was presented by Zoran Arsovski and Ivaylo Shipochky. They used a business data feed to automatically create and update ads for sports events, and then used trained models to run the campaign, minimizing workload and boosting revenue on quite an impressive scale! Still, Mark Edmondson showed us that machine learning alone cannot provide the same value as an expert in the field. Yet, as ML is getting more and more accessible (thanks to solutions like Cloud AutoML), combining the two can now offer a whole new level of quality.

Simo Ahava loved the GTM Panel drinks. Photo: superweek.hu

Business Consulting, Cooperation and Processes

Ivan Rečević said, “we prefer to stay in the friend zone of technology, as we are afraid of serious business relationship”. “We are puking with data for 4500 years now”, Sayf Sharif followed, while showing the first monthly report from ancient Egypt. Even though our work will always have its roots in proper implementation (and as Brian Clifton showed, there’s still a lot to do: https://verified-data.com/study), digital analytics has reached the point where we should finally shift the focus from implementation and producing data to supporting our clients in their business goals by providing them with valuable insights. This topic was mentioned in at least every second talk and panel. The idea seems pretty obvious, yet I observe that we are often stuck on reporting (even good) KPIs instead of providing actionable conclusions and recommendations for business purposes. Even Simo Ahava presented only one GTM hack (http://bit.ly/2FQZhyF) and focused on the importance of working in multidisciplinary teams to avoid silo thinking and of including analytics in all stages of a project.

And indeed, some great business insights were presented during Superweek. Lucia Hrašková talked about the importance of identifying customers who are killing the business and produce costs rather than revenue. Similarly, two other talks showed that, thanks to machine learning, we can reduce our marketing efforts not only for the customers who are unlikely to convert, but also for the ones who will convert for sure (without us spending money on them).
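The idea of sparing marketing budget at both ends of the scale can be sketched in a few lines of Python. This is only an illustration, not the speakers' implementation: the customer IDs, the predicted conversion probabilities and the thresholds of 0.2 and 0.8 are hypothetical, and in practice the scores would come from a trained model.

```python
def select_for_campaign(propensities, low=0.2, high=0.8):
    """Return the IDs whose predicted conversion probability lies in the
    middle band, skipping unlikely and near-certain converters.

    `low` and `high` are hypothetical thresholds chosen for illustration.
    """
    return [cid for cid, p in propensities.items() if low <= p <= high]


# Made-up scores standing in for the output of a propensity model.
scores = {"a": 0.05, "b": 0.35, "c": 0.60, "d": 0.95}
targets = select_for_campaign(scores)
print(targets)
```

With these made-up scores, only customers "b" and "c" would receive the campaign. Where to set the thresholds is a business decision that depends on campaign cost and margin.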

Furthermore, I truly enjoyed Erik Driessen’s talk on Lean Measurement and his experiences with working in an agile way in analytics. He presented the concept of delivering a Minimum Viable Measurement Product (basic tracking) and building up custom tracking around it (instead of preparing bulky implementation guidelines). That way valuable data could be used faster and the cooperation with IT improved. I absolutely loved the concept of the “failure wall”, a wall where his team puts up sticky notes describing everything that has gone wrong. Once in a while they meet to “celebrate the failure” and learn from their mistakes by identifying the strengths, mistakes, challenges and annoyances that led to those failures.

News from Google

As always, everybody wants to know what’s coming next in Google solutions, and this time was no different. Three talks from Google’s representatives gave us some answers and, unsurprisingly, left us with a lot of uncertainties.

SEO

Gary Illyes presented methods for improving your SEO for Google Images. Besides structured data, which was mentioned around 6384 times, he emphasized the importance of meaningful, informative context: mostly the alt attribute of the image tag, but also captions under images, page text in general, meta tags, titles and site maps. At the evening fireside Q&A session on organic search, he successfully avoided giving “easy to implement” tips ;). We were again reminded that websites are built for people and not for robots, and that instead of trying to meet mysterious SEO requirements, we should above all ensure that the user experience is good and that the brand/product is trusted and likely to be recommended. He advised us to follow John Mueller (@JohnMu) and to be cautious with the results of SEO “experiments” found in blogs (which are often of very poor quality). So sorry guys, no breaking news this time!

Q&A session with Gary Illyes Photo: superweek.hu

GTM

Scott H. Herman and Brian Kuhn talked about what is coming next for Google Tag Manager. The ultimate goal for the future is to minimize the amount of custom HTML in GTM. The solution? Custom Templates! Soon we will be able to easily develop our own custom tags and variables, with the same user-friendly interface for populating IDs or any kind of web data. Debugging seems to be very easy. If the first question popping into your mind is “will we have the possibility to share templates?”, the answer is yes :).

At one point they were asked about their solution to Firefox's third-party domain blocking and about supporting workarounds related to it. They stated that, to avoid an “arms race”, they are currently talking with the teams of the biggest browsers to ensure GTM will not be blocked. Let’s keep our fingers crossed for that!

Firebase

Krista Seiden presented Google Analytics for Firebase and confirmed that the old SDK for mobile will not be supported in the future. Not everybody is happy about that: even though BigQuery integration for free users is a great benefit of Firebase, switching to a new data model is not easy for many. For me, working mostly with Adobe products, these 50 custom events with 25 custom dimensions (besides the bunch of automatically collected events) are much more appealing than the simple event category-action-label structure. It is also worth mentioning that Krista said “we strongly believe in this data model”. Does that mean we should get ready for Firebase for web? After Superweek I have a feeling the answer is yes, but let’s see what the future brings.

Data Privacy

Data privacy is a topic we will be facing more and more (Aurélie Pols assured us of that in her GDPR talk). An interesting vision of possible solutions was presented by Kristoffer Ewald. In times when “data is the new oil” and everybody knows its value, users could allow user-centred analytics… but not for free! Instead of giving your data to data providers in an uncontrolled manner and for free, you could exchange it for benefits like discounts, vouchers and so on. That could be a win-win situation for all (well, except for data vendors).

Stephane Hamel now on the safe side. Photo: superweek.hu

Ethics

For me, one of the most thought-provoking topics of this conference was ethics in digital analytics, introduced by Steen Rasmussen and followed up by Stéphane Hamel. Since data protection has become such a big thing, we are fighting for any piece of data that we CAN track… often without asking ourselves if we SHOULD track it. Is it really OK to use data the way we want to? Where is the thin line between conversion rate optimization for mutual benefit and manipulation? Are we ready to take responsibility for the data we collect and use it wisely, or are we just infants with guns? I guess there is no easy answer to these questions, but I’m happy we have started to discuss them.

The most interesting solutions shared

Superweek is a conference that focuses more on inspiration, vision for the future and higher-level topics. Fortunately, there were also some goodies for lovers of the technical side of analytics. Here are some solutions that you can play around with :).

Attribution

Zorian Radovančević, this year’s Golden Punchcard (Superweek’s award) winner, presented an attribution analysis that connects the data from the multi-channel reports with core reporting and makes it possible to split attribution reports by product, product category, device, and so on. Everything in plain, free GA: https://bit.ly/2B9LD5z (open source)

Visual interface for R

Hussain Mehmood showed how to approach data science without coding, by using a visual interface for R: https://exploratory.io (free trial).

Propensity Modelling in BigQueryML

Ken Williams presented a machine learning solution that calculates the probability of conversion in the next 30 days based on any given event: http://goo.gl/KJHFKZ (open source).

R and Google Analytics

A bit of Google Analytics data science in R was presented by Tim Wilson: http://bit.ly/ga-and-r (open source).

Chrome Extension for Google Analytics interface

Stéphane Hamel presented Da Vinci Tools, his Chrome extension that adds a lot of cool features directly to the GA and GTM interfaces and is already well known among many analysts. More cool features are coming! http://bit.ly/DaVinciTools (free).

The story of one t-shirt

Erik Driessen used the Google Natural Language API to analyze the sentiment of Avicii's songs. He turned one of the charts into a graphic, which he printed on a t-shirt. He was wearing it when he received the Silver Punchcard award for this creative analysis. http://www.edriessen.com/avicii/ (the graphic is available for download).

Why so serious?

What is unique about Superweek is not only the inspiring talks that give you a boost for the new year, but also the laid-back atmosphere. I could not imagine keeping the energy high and staying focused for five days without the awesome Doug Hall and Yehoshua Coren dancing and singing. So, this is how the speakers were introduced:

…and what’s more, dancing alone was not enough for Yehoshua. Here’s the world’s first Google Tag Manager Boogie! Not bad, right?

And one more thing: Superweek should actually be called “an analytics retreat”. In the mountains, two hours away from Budapest, there is no place to escape from digital analysts and data lovers. Restaurant, bar, bonfire or even the saunas: these people were everywhere. And great talks were there with them.

Fred Pike, Tim Wilson, Robert Petković and Ivaylo Shipochky have definitely chosen the wrong career path, but at least they stayed on stage giving great lectures 🙂 Photo: superweek.hu

Thanks for the great inspiration, atmosphere and chats everyone! Hope to see you on the mountain top next year! Photo: superweek.hu

Some presentations available online:

Simo Ahava – You Can’t Spell „Measure“ Without „Customization“: https://www.slideshare.net/SimoAhava/you-cant-spell-measure-without-customization

Mark Edmondson – Man vs. Machine?: https://www.dropbox.com/s/8mnw5mhficv6k0k/superweek2019.pdf?dl=0

Brian Clifton – The State of Google Analytics Data: https://www.slideshare.net/omegadm/the-poor-state-of-google-analytics-data

Tim Wilson – Digital Analytics Meets Data Science: Use Cases for Google Analytics: https://www.slideshare.net/mobile/tgwilson/superweek-2019-digital-analytics-meets-data-science

Matt Gershoff – Surprise! Its Entropy, the Theory of Information: https://www.slideshare.net/mgershoff/entropy-an-end-to-the-data-love-affair-130390036

Kseniya Anikeeva – Analytics for Publishers: Pains and Aspirations: https://yadi.sk/i/VzCb6Bs-48Lkfg

Danny Mawani Olsen – Repetitive Tasks Are Slowly Killing Us from Within: https://documentcloud.adobe.com/link/track?uri=urn%3Aaaid%3Ascds%3AUS%3A5989cac4-1afb-47dc-9608-6ad59914f877

Doug Hall – The Joy of Data V2: https://docs.google.com/presentation/d/e/2PACX-1vTzN5HT7OzSioku0aZNPF_VG9WU93CqMeiCLoVDwXS8OQG98qaPMpUth0uE2Ems5bp1jXiGUY8rcD1W/pub

Doug Hall – Closing Keynote: https://docs.google.com/presentation/d/e/2PACX-1vSVxKxZQdDFiFwgRKc6eTwDSUtmk4rshWQgd_GxMa6FWIdT6DuLUuSplNUc43b-hv47ybzpTafMN_6O/pub

 

Photos: superweek.hu