Update after 3 days on the Microsoft Azure Hack

Ryan Purvis
4 min readMay 17, 2018

The Microsoft Azure AI Hack was held in London at the new(ish) Microsoft Reactor venue. It's a great venue, really easy to get to (compared to Reading or Paddington) and not overstated.

The event itself has been well organised and run at a good pace, with lots of opportunities to learn by doing. It has been, in a phrase, 'full on'.

To set the scene: there are 18 teams of 3–4 members, each tackling their own business problem. The approach, as with every hack, is to try to address the problem quickly using the tools that Microsoft offers. It's not an overnight exercise but rather spread over three days, which is just better.

The true benefit of being in this environment (besides the coffee and the snacks: sugar, SUGAR everywhere) is access to really excellent product experts from Microsoft.

Microsoft Azure Databricks

To use an overused cliché: this is a game changer.

Why do I think this, and why should you care?

Overall it's a blended environment for data analysis, data science experimentation and business analysis. This matters because the underlying data pipeline (covering event, stream and batch ingestion) can be led by worked-out use cases that prioritise which data to bring in, how it should be managed and where it should be delivered.

What will likely be key is the ability to publish ML models into these pipeline flows. In the slide deck I received, this image covered the various pieces quite well:

Overview of Azure Databricks

The service has been designed and built with the founders of Apache Spark to remove the headache of setting up your own infrastructure. This means that in a few clicks you have the underlying framework in place to start doing the important 'brain' work.

1. First, you'll need to create a Databricks workspace from within the Azure Portal:
Azure Portal Databricks provisioned service

2. The workspace looks like this:

Azure Databricks Workspace

3. Clusters are used to process the data through the rules and models. These can be dedicated or shared (the latter will auto-scale based on usage). In both cases the cluster will shut down if unused for a period of time (the default is 120 minutes).
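The auto-scale and auto-termination behaviour described above maps onto a handful of settings in the Databricks cluster spec. A minimal sketch of such a spec as a Python dict follows; the `cluster_name`, `spark_version` and `node_type_id` values are illustrative assumptions, not values from the hack:

```python
import json

# Hypothetical cluster spec, modelled on the fields the Databricks
# cluster-creation UI/API exposes. Names and sizes below are
# illustrative assumptions only.
cluster_config = {
    "cluster_name": "hack-shared-cluster",
    "spark_version": "4.0.x-scala2.11",   # assumption: a then-current runtime
    "node_type_id": "Standard_DS3_v2",    # assumption: a common Azure VM size
    # Shared clusters auto-scale between a worker range:
    "autoscale": {"min_workers": 2, "max_workers": 8},
    # Idle shutdown after 120 minutes, the default mentioned above:
    "autotermination_minutes": 120,
}

print(json.dumps(cluster_config, indent=2))
```

Setting `autotermination_minutes` is worth doing deliberately: an idle cluster that never shuts down is the easiest way to burn through Azure credit.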

4. The notebook is a collaborative environment in which to map out the rules and models and test them.

Notebook example

At this point it's better to watch the video from the Build conference where it was introduced:

The experience so far is that it has been tricky to get this up and running, even with the SMEs involved, which is typical of something new. The three main pain points:

  1. there are documentation gaps, so it's not as easy to get up and running as it should be
  2. Databricks can only be used with a paid Azure account; trial accounts don't work (which is fine for most customers)
  3. the learning curve will be high initially

That said, I love the way it's put together, and instinctively it feels like the right place to invest time to gain skills. I had that nostalgic 'drink the Kool-Aid' feeling I last had when SharePoint was up and coming (almost 15 years ago!).


I enjoy learning and sharing what I learn. I’m actively involved in AI, Machine Learning and Bot projects. Opinions are my own.