How #AzureML and #MachineLearning are helping managing my Inbox–Part I

In this post, how to build an AzureML model to auto suggest folders/labels for email classification & archiving. How to consume AzureML web service directly on Outlook.

note: sample AzureML experiments now published on the brand new AzureML gallery. Email Classifier-Training & Email Classifier – Scoring

Context: for years I’ve opted for an archive by folders strategy to handle my Inbox (not obsessed with inbox zero but  I try to keep it below ~50 inbox mails, and instead use inbox like a buffer – since my first contact with David Allen GTD years ago). true: once in a while I try to test the single  archive folder, but until now…always reverted back to my folders to organize so many parallel projects and threads going on

For this to work, I think one thing  is absolutely mandatory, very fast folder archiving and switching. Although have tried a few tools for this  I ended up coding some Outlook macros over the years to fit this… “peculiar” way of working and unproductive task switching…. 

So to archive a mail I just press alt-4, this window popups up, I type my search terms, enter, and it’s done. mail Archived

image

Or to switch context to a specific project or fast search, just press alt-3, search, enter and I’m there with the latest thread mails immediately available.

image

(that can admit, I obsess over searching…. “as you type” kind of search :) )

Fast forward to “the present”, #MachineLearning #DataIntelligence #AzureML era. I now have a few thousand “labeled” mails (to use machine learning terms) on several active folders. My macros were needing an improvement….  :)

So I exported my archive mails in the form:   |from | to | subject | time offset since fixed day|  format to a csv (tsv in fact)

image

and let AzureML do the heavy lifting of building my very personal email classifier suggestion web service, putting some multiclass classification models & also text handling /feature hashing AzureML features to work in a more useful scenario than classifying flowers :)

 

image

Sticking with the neural network model for now (default params), created and adjusted the scoring experiment & published the web service.

image

image

Used the VBA code sample from the new cool AzureML generated Excel files,

image

few lines of code added on my Outlook macros, based on AzureML Excel VBA code/macros

image

and we get a pretty impressive auto classifier ready to use & help manage our inbox, suggesting the folders where the message belongs when archiving.(note: also triggered when sending messages, archiving both sent and original message if needed.)

So for example,  if I’m disturbing Joana with another annoying mail :) about SmartCharts new features AzureML advises me:

image

On the other hand, if it were an help request to Romano on some stream analytics samples AzureML would opt for:

image

press Enter, for now it’s needed…:) ,  and that’s done. How cool is that? :)

(have to say that the accuracy is not 100% obviously , but pretty damn useful already)

All this running on a free AzureML workspace. Up & running & minutes, from training to online web service

(excluding the time to slightly adapt AzureML VBA code to call the web service & get my mail properly exported)

Training & scoring experiments are available in AzureML gallery fo you to test drive. Off course I loaded a small sample of my mail in those, so you will have to load your own to really see how it works.

…now, to be close to perfection this would benefit from automated data update & model retraining, and that will be the case for use the new AzureML training APIs.

But that will have to wait! :) (true, reduced some time processing my inbox, but not that much…!)

btw- hope to get feedback on this scenario, ex: feature engineering tips and model tuning suggestions to improve the model results

Cheers

Rui

Advertisements

AzureML Web Service Scoring with Excel and Power Query

While drafting some samples & labs for this saturday SQLPort AzureML workshop decided to dig a little bit trying to use some regular AzureML scoring web services (request/response) directly in Excel, using Power Query. :)

So… can it be done? Sure! Advisable? Far,far from it…. but for a quick sample/test scenario it might be handy. :)

Saying again, what I’ll show here is far from a recommended web service usage for AzureML in Excel. It will be slow, probably insecure if you need to distribute your file, and there will be I’m sure better ways to do it with the batch web service mode, soon I hope (without the need to pass through the Azure marketplace),

Side notes:

    How it works

    To the point, you can use the Power Query Web.Contents function to invoke the AzureML web service api. As long as you are careful on: 1) using post and not get, 2)setting the authorization header expected by azureml, 3)setting the content type to json  4) building the json request like shown in the api help page.
    PowerQuery will handle the json response automatically..We encapsulate the web request in a function, we can then use it to build our scored table passing the score input column values and getting the score in return.

Step by Step-From AzureML to Excel Power Query Scoring

You’ll need an AzureML  scoring web service, I used a simple two class log regression model to classify the well known Fisher Iris Dataset (with only two classes, as the original as three):

image

The AzureML Experiment & Web Service Experiment, can’t get much simpler than this.

Training Experiment

image

Web Service Experiment

image

Publishing the trained model as a Web Service

AzureML-Iris Classifier Web Service API Help page

image

this JSON request will be needed for Power Query, same thing as the service key:

image

In Power Query you’ll then create a function like this (text version below):

image

image

image

Invoking the Power Query scoring function

Then you just import the original table to Power Query and add a custom column invoking the new scoring  function

image

Note: to test this it is advisable to filter the top 1/2 rows first, it is not lightning fast…. Smile

image

The result should now be visible, a little bit more Power Query expanding and renaming and that’s it!

image

image

To check that I’m getting the expected score, just used another small tool (more on this later)

image

Power Query results

image

Scoring the whole 100 rows iris dataset

The problem using the regular Power Query web.contents function  is that you will be receiving api usage threshold errors from the AzureML API very, very soon. So using another trick inspired (again) on this post by Chris Webb (amazing tip by Alejandro Lopez-Lago), to add some delay between requests.

Just remove the top filter and refresh the query. Wait between 30 secs-60 secs (this will depend on the kind of endpoint you are using, AzureML is changing this whole “stag/production web service model “ right now from what I’m seeing)

image

note: think the latest Power Query update now shows the row count progress, neat! perfect for this Smile

image

image

image

Some misses there Smile guess the model needs some additional training!

image

Download the sample here, (note: will need your own AzureML endpoint & key, you do know that AzureML has a free tier right? :) ) the full Power Query score function for this model is below.

Take care!

Rui

let

     Score= (sepal_length as any,sepal_width,petal_length,petal_width) =>

let

serviceUri="*yourserviceuri_note:ends with score*",
serviceKey="yourservicekey”,

PostContents= "

{
  ""Id"": ""score00001"",
  ""Instance"": {
    ""FeatureVector"": {
      ""sepal_length"": """&sepal_length&""",
      ""sepal_width"": """&sepal_width&""",
      ""petal_length"": """&petal_length&""",
      ""petal_width"": """&petal_width&"""
    },
    ""GlobalParameters"": {}
  }
}
        
                   
",
      //Define Wait function
    Wait = (seconds as number, action as function) =>
            if (List.Count(
             List.Generate(
              () => DateTimeZone.LocalNow() + #duration(0,0,0,seconds),
              (x) => DateTimeZone.LocalNow() < x,
              (x) => x)
              ) = 0)
               then null else action()
    //Call Wait function to wait 5 seconds
    //then return the current time again
   
,
    Source=Wait ( 0.05, () => Web.Contents(serviceUri,
    [Content=Text.ToBinary(PostContents),
    Headers=[Authorization="Bearer "&serviceKey,#"Content-Type"="application/json; charset=utf-8"]])),
    #"Imported JSON" = Json.Document(Source),
    #"Table from List" = Table.FromList(#"Imported JSON", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Transposed Table" = Table.Transpose(#"Table from List")
   in
     #"Transposed Table"
    in Score