February 25, 2015
In this post: how to build an AzureML model that auto-suggests folders/labels for email classification & archiving, and how to consume the AzureML web service directly from Outlook.
Context: for years I’ve opted for an archive-by-folders strategy to handle my Inbox (I’m not obsessed with inbox zero, but I try to keep it below ~50 mails and use the inbox as a buffer, ever since my first contact with David Allen’s GTD years ago). True, once in a while I try the single archive folder approach, but until now I’ve always reverted to my folders to organize so many parallel projects and threads.
For this to work, I think one thing is absolutely mandatory: very fast folder archiving and switching. Although I have tried a few tools for this, I ended up coding some Outlook macros over the years to fit this… “peculiar” way of working and unproductive task switching…
So to archive a mail I just press alt-4, this window pops up, I type my search terms, press enter, and it’s done. Mail archived.
Or to switch context to a specific project or fast search, just press alt-3, search, enter and I’m there with the latest thread mails immediately available.
(that I can admit: I obsess over searching… the “as you type” kind of search :) )
Fast forward to “the present”, the #MachineLearning #DataIntelligence #AzureML era. I now have a few thousand “labeled” mails (to use machine learning terms) in several active folders. My macros needed an improvement… :)
So I exported my archived mails to a csv (tsv in fact) with the columns: from | to | subject | time offset since a fixed day.
Then I let AzureML do the heavy lifting of building my very personal email classifier suggestion web service, putting some multiclass classification models and AzureML’s text handling / feature hashing features to work in a more useful scenario than classifying flowers :)
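To make the export format concrete, here is a minimal sketch in Python (not the Outlook/VBA code actually used). Several things are assumptions for illustration only: the reference day `FIXED_DAY`, the choice of whole days as the offset unit, the `;` separator for multiple recipients, and the extra `folder` column holding the label the classifier has to learn.

```python
import csv
import io
from datetime import datetime

# Assumption: the post only says "fixed day", so this date is made up.
FIXED_DAY = datetime(2015, 1, 1)


def mail_to_row(sender, recipients, subject, received, folder):
    """Turn one mail into a row: from, to, subject, day offset, folder label."""
    # Assumption: the offset is counted in whole days since FIXED_DAY.
    offset_days = (received - FIXED_DAY).days
    # Collapse tabs/newlines in the subject so they cannot break the TSV layout.
    clean_subject = " ".join(subject.split())
    return [sender, ";".join(recipients), clean_subject, offset_days, folder]


def write_tsv(rows, out):
    """Write a header line plus one tab-separated line per mail."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["from", "to", "subject", "offset_days", "folder"])
    writer.writerows(rows)


# Usage with a made-up mail, written to an in-memory buffer instead of a file.
example_row = mail_to_row(
    "me@example.com",
    ["joana@example.com"],
    "SmartCharts\tnew features",
    datetime(2015, 1, 11, 9, 0),
    "SmartCharts",
)
buffer = io.StringIO()
write_tsv([example_row], buffer)
```

The `folder` column is what turns the archive into “labeled” data: each mail's current folder is the class to predict.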
Sticking with the neural network model for now (default params), I created and adjusted the scoring experiment & published the web service.
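For readers new to the feature hashing mentioned above, the general idea can be sketched in a few lines of Python. This is only an illustration of the hashing trick under my own assumptions (whitespace tokenization, an MD5-based hash, 1024 buckets); AzureML's feature hashing module has its own hash function and options, and this is not a reimplementation of it.

```python
import hashlib


def hash_features(text, n_buckets=1024):
    """Map free text to a fixed-length count vector using the hashing trick."""
    vector = [0] * n_buckets
    for token in text.lower().split():
        # A stable hash of the token decides which bucket it falls into.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vector[index] += 1
    return vector


# Usage: any subject line becomes a vector of the same length,
# which is what lets a classifier consume arbitrary text.
subject_vector = hash_features("SmartCharts new features")
```

Different words can collide in the same bucket, which is the price paid for not having to build and store a vocabulary.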
I used the VBA code sample from the cool new AzureML-generated Excel files: a few lines of code added to my Outlook macros, based on the AzureML Excel VBA code/macros,
and we get a pretty impressive auto classifier, ready to use & help manage our inbox, suggesting the folders where the message belongs when archiving. (Note: it is also triggered when sending messages, archiving both the sent and the original message if needed.)
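The call to the web service can be sketched roughly as follows, in Python rather than the VBA used in the macros. Treat everything specific here as an assumption: `SERVICE_URL` and `API_KEY` are placeholders, the column names are the hypothetical ones from the export format, and the JSON shape (`Inputs` / `input1` / `ColumnNames` / `Values`) is what I would expect from a request-response AzureML web service of that era. The API help page of your own published service is the authoritative reference.

```python
import json
import urllib.request

# Placeholders, not real values: copy yours from the web service dashboard.
SERVICE_URL = "https://example.invalid/your-azureml-endpoint"
API_KEY = "your-api-key"


def build_request(sender, recipients, subject, offset_days):
    """Build (but do not send) the HTTP POST request that scores one mail."""
    payload = {
        "Inputs": {
            "input1": {
                # Assumed column names; they must match the scoring experiment.
                "ColumnNames": ["from", "to", "subject", "offset_days"],
                "Values": [[sender, recipients, subject, str(offset_days)]],
            }
        },
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    }
    # Supplying data makes urllib issue a POST.
    return urllib.request.Request(SERVICE_URL, data=body, headers=headers)


def suggest_folder(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage: build the request for a made-up mail (nothing is sent here).
request = build_request(
    "me@example.com", "joana@example.com", "SmartCharts new features", 55
)
```

Keeping request building separate from sending makes it easy to inspect the payload before pointing it at a real endpoint.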
So for example, if I’m disturbing Joana with another annoying mail :) about SmartCharts new features, AzureML advises me:
On the other hand, if it were a help request to Romano on some stream analytics samples, AzureML would opt for:
Press Enter (for now it’s needed… :) ), and that’s done. How cool is that? :)
(I have to say that the accuracy is obviously not 100%, but it’s pretty damn useful already.)
All this is running on a free AzureML workspace. Up & running in minutes, from training to online web service
(excluding the time to slightly adapt AzureML VBA code to call the web service & get my mail properly exported)
The training & scoring experiments are available in the AzureML gallery for you to test drive. Of course, I loaded only a small sample of my mail in those, so you will have to load your own to really see how it works.
…now, to be close to perfection this would benefit from automated data updates & model retraining, and that would be a good use case for the new AzureML training APIs.
But that will have to wait! :) (true, I reduced the time spent processing my inbox, but not by that much…!)
Btw, I hope to get feedback on this scenario, e.g. feature engineering tips and model tuning suggestions to improve the model results.