November 1, 2011
What To Do With MEF Composition Container After Application Start-Up

Anyone who has played with or used MEF has probably written code in their App.xaml.cs file that looks something like this:

            var catalog = new AggregateCatalog();
            catalog.Catalogs.Add(new DirectoryCatalog("."));
            catalog.Catalogs.Add(new AssemblyCatalog(Assembly.GetExecutingAssembly()));

            _container = new CompositionContainer(catalog);

            try
            {
                _container.ComposeParts(this);
                var a = this.VoidObjects;
            }
            catch (CompositionException compositionException)
            {
                MessageBox.Show(compositionException.ToString());
                return false;
            }
and the code has worked, MEF is great, and nobody really thinks twice about it. Well, no one has thought twice about it until now.

While I see code like that shown above in lots of applications that use MEF it always brings two questions to my mind:

1.) Why isn’t the composition container disposed of after composition is complete?

2.) What could we do if we made the composition container available to more components throughout an application’s lifecycle instead of just at start-up?

The answer to (1.) remains, “I don’t know”. When MEF composition is baked so directly into the OnStartup method, there is no way to access the container from anywhere else in the application, so you might as well dispose of it right after composition is complete, since you aren’t going to use it (or even be able to). So why wait until application shutdown to dispose of it? I have no idea.

Having said all that, I would like to move on to the second question, which inspires a much more interesting and useful discussion, because there are lots of advantages to being able to trigger MEF composition from (almost) anywhere in the application. Let’s consider a concrete example.

Let’s say that you have a menu option that should create and display a new instance of a plug-in every time it’s clicked. Then we can simply use a factory pattern as previously discussed here and everything is great, right? Well, yes and no. Everything is great until an additional dependency is added to our class after we have finished writing the factory method. Unless you remember to update the factory method to give each new instance a reference to the new dependency, you will eventually run into a null reference exception when an instance created by the factory tries to call a method on that dependency.

The factory pattern as described above is brittle, yet it is used by lots of people, and it has cost at least two teams I have worked with entire DAYS to find and fix the resulting bugs. I call it brittle because every time a new dependency is added, our memories (and/or knowledge) are not always going to be good enough to remember, “Oh yeah, now I need to go change the factory method so that my instances all come out right”. Furthermore, our class’s new dependencies might have nested dependencies of their own that we would have to know how to new-up. That’s not even the worst of it. Now, whenever the dependencies of our dependencies change, we will probably need to go back and change our factory method even more. DI without MEF can get very messy very quickly if we are not careful.

Instead of worrying about all of that, we’d like to be able to add a MEF import attribute and have the factory method continue to work without relying on our (at least my) shoddy memories about what other things need to be changed. In fact, we’d like there to be no other things that need to be changed when adding a new import, right?

Thanks to MEF, we can do exactly that if we can de-couple composition from the application start-up. My approach has been to move the composition code from the application start-up to its own CompositionService class in the SoapBox.Core.Contracts project. I then create a CompositionServiceLocator class (also in the SoapBox.Core.Contracts project) that can be used by any plug-in to gain read-only access to the application-level composition container and compose any object at any time.

More concretely, I moved the typical composition code that lives directly inside the OnStartup method to its own class that looks like this:

public class DesktopCompositionService : ICompositionService
    {

        private static object[] _compositionRoot;

        private static CompositionContainer _container;

        public bool Compose(params object[] o)
        {

            if (_compositionRoot == null)
            {
                _compositionRoot = o;
            }

            if (_container == null)
            {
                var catalog = GetCompositionCatalog();
                _container = new CompositionContainer(catalog);
            }

            try
            {
                _container.ComposeParts(o);
            }
            catch (CompositionException compositionException)
            {
                Console.WriteLine(compositionException.Message);
                return false;
            }
            return true;
        }

        private ComposablePartCatalog GetCompositionCatalog()
        {
            var catalog = new AggregateCatalog();
            catalog.Catalogs.Add(new DirectoryCatalog("."));
            catalog.Catalogs.Add(new AssemblyCatalog(Assembly.GetEntryAssembly()));
            return catalog;
        }

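        public void DisposeOfContainer()
        {
            //Guard against being called before Compose() has created the container.
            if (_container != null)
            {
                _container.Dispose();
                _container = null;
            }
        }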

    }

 

I then created this separate class through which other application components can gain access to the composition class:

public static class CompositionServiceLocator
    {

        static ICompositionService _defaultInstance = null;
        public static ICompositionService DefaultInstance
        {
            get
            {
                if (_defaultInstance == null)
                {
                    _defaultInstance = new DesktopCompositionService();
                }
                return _defaultInstance;
            }
        }

    }
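One caveat about the locator above: the null check in DefaultInstance is not thread-safe, so two threads that hit it at the same moment could each construct their own service. Here is a minimal sketch of one way around that, assuming .NET 4's Lazy<T> is available. The post never shows ICompositionService itself, so the interface shape below is an assumption, and StubCompositionService is a hypothetical stand-in for DesktopCompositionService that exists only so the snippet compiles on its own.

```csharp
using System;

// Assumed shape of the post's interface (it is not shown in the original).
public interface ICompositionService
{
    bool Compose(params object[] parts);
}

// Hypothetical stand-in for DesktopCompositionService so this sketch is self-contained.
public class StubCompositionService : ICompositionService
{
    public bool Compose(params object[] parts)
    {
        return true;
    }
}

public static class ThreadSafeCompositionServiceLocator
{
    // Lazy<T> with a factory delegate defaults to thread-safe initialization,
    // so the service is constructed at most once even under concurrent access.
    private static readonly Lazy<ICompositionService> _defaultInstance =
        new Lazy<ICompositionService>(() => new StubCompositionService());

    public static ICompositionService DefaultInstance
    {
        get { return _defaultInstance.Value; }
    }
}
```

In the real application the factory delegate would presumably construct DesktopCompositionService instead of the stub.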

Let’s see how this simple change of moving the composition code from the App.xaml.cs file into its own more accessible class has improved the plight of our factory method. Before, it may have looked like this (or worse):

public Widget CreateNewInstance()
        {
            var newWidget = new Widget();

            newWidget.Log = this.Log;
            newWidget.Dal = this.Dal;
            newWidget.MiscDependency =
                this.MiscDependency.CreateNewInstance(new DependencyOfOtherDependency());

            //.
            //.
            //.
            //plus maybe more stuff that is brittle
            //plus all the configuration once everything is connected
            //and maybe not everything should be a shared instance.
            //etc.
            //.
            //.
            //.

            return newWidget;
        }

 

and as we discussed above, it was brittle, hard to maintain, and error-prone. Now, with our new CompositionService and CompositionServiceLocator our factory method simply looks like this:

public Widget CreateNewInstance()
        {
            var newWidget = new Widget();
            CompositionServiceLocator.DefaultInstance.Compose(newWidget);
            return newWidget;
        }

There, isn’t that better? Now, we have effectively off-loaded all of the work done by our factory method to a single-line call to the CompositionService, which in turn makes MEF do all the “factory-ing” for us.
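To make the idea concrete without the SoapBox types, here is a small self-contained sketch of a factory that lets MEF fill in the imports. ILog, ConsoleLog, Widget and WidgetFactory are hypothetical names invented for illustration, and the static container stands in for the application-level container owned by the composition service. One deliberate difference from the post: the sketch calls SatisfyImportsOnce rather than ComposeParts, because ComposeParts adds each composed object to the container, which then keeps a reference to it. For a factory that is called on every menu click, that may or may not be what you want.

```csharp
using System;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
using System.Reflection;

// Hypothetical dependency contract, for illustration only.
public interface ILog
{
    void Write(string message);
}

[Export(typeof(ILog))]
public class ConsoleLog : ILog
{
    public void Write(string message)
    {
        Console.WriteLine(message);
    }
}

// Hypothetical plug-in class: every dependency is declared as an [Import].
public class Widget
{
    [Import]
    public ILog Log { get; set; }

    // Adding another [Import] property here later needs no change to the factory below.
}

public static class WidgetFactory
{
    // Stands in for the application-level container owned by the composition service.
    private static readonly CompositionContainer Container =
        new CompositionContainer(new AssemblyCatalog(Assembly.GetExecutingAssembly()));

    public static Widget CreateNewInstance()
    {
        var widget = new Widget();

        // Fills the widget's imports once, without registering the widget in the container.
        Container.SatisfyImportsOnce(widget);
        return widget;
    }
}
```

The trade-off, as far as I understand it, is that objects composed with SatisfyImportsOnce do not take part in recomposition if the catalog changes later, whereas objects added through ComposeParts do.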

This approach effectively changes a factory method into a composition of an object. It has worked nicely in the custom-tailored case shown above, but composition is typically only part of what a true factory does. Though there are other steps we can take to get MEF to help us with the configuration aspect of a factory method, I think we have already come a long way with a minimal amount of effort.

July 7, 2011
Using The Microsoft Kinect Speech Recognition Features To Control SoapBox Add-Ins

Last weekend I presented the following three sessions at the Southern California Code Camp in San Diego:

1.) Managed Extensibility Framework (MEF)

2.) Soapbox Core

3.) XBox Kinect

I met some great people and got a lot of positive feedback (the SoapBox session went well, if I do say so myself. I’m sorry I didn’t record it; I will for sure record it the next time I give that talk).

After my final talk, one of the brave souls who had sat through all three one-hour sessions of my babblings asked the very interesting question, “So when are you going to create a Kinect add-in for SoapBox?” Oddly enough, I had never thought of putting the two together until I got that question. I don’t think it was supposed to be a challenge, but I took it as one, so on my 2 hour drive back to LA I created the Kinect add-in in my head, then got home and created it last night.

THE BIG IDEA

The goal: I want to be able to open the PinBallTable Add-In that comes with the SoapBox Core Demo download via a simple verbal command received through an XBox Kinect sensor.

Here were my self-imposed design constraints:

1.) To make using the Kinect add-in as easy as possible, hooking into the Kinect Add-in should require ZERO source code changes to the existing PinBallTable add-in.

2.) The Kinect add-in should use as little memory as possible, so lazy loading is a must.

With these constraints in mind, the design became clear. I should create my own custom metadata attribute that holds the text of the verbal command to which the exported class responds via the well-supported command pattern in SoapBox. This custom metadata could then be parsed in the OnImportsSatisfied() method of my Kinect add-in to build a grammar that could be used by the Microsoft SpeechRecognitionEngine hooked up to the Kinect audio stream. Once this was all set up, all I’d have to do is create a SpeechRecognized event handler attached to the SpeechRecognitionEngine and fire the command associated with the recognized speech.

As it turns out, the Kinect SDK code that I needed was practically already made for me in the Audio Fundamentals Quickstart.

There are, of course, several prerequisites needed to make this Kinect add-in work. The obvious one is a Kinect sensor. In terms of software, the Kinect add-in requires all the same packages as the Audio Quickstart and nothing more.

Below is a description of how I created a SoapBox add-in that allows users to issue voice commands to any SoapBox Core add-in via an XBox Kinect. First we will look at the new code created for this add-in, then we will look at the extremely minimal changes we needed to make to the existing PinBallTable add-in and finally we will discuss a few changes you can make on your own to make this Kinect add-in even better.

THE NEW CODE

- Custom Metadata

First, we need to add the following custom metadata attribute definition to a Kinect folder in the SoapBox.Core.Contracts project of SoapBox Core:

using System; 
using System.Collections.Generic; 
using System.ComponentModel.Composition; 
using System.Windows.Input; 

namespace SoapBox.Core 
{
    [MetadataAttribute] 
    [AttributeUsage(AttributeTargets.Class, AllowMultiple = false)] 
    public class AudioCommandMetadata : ExportAttribute 
    { 
        public AudioCommandMetadata() 
            : base(typeof(ICommand)) 
        {
        
        }
        
        public AudioCommandMetadata(IDictionary<string, object> dict) 
            : this() 
        {
            this.Action = dict["Action"] as string; 
            this.Subject = dict["Subject"] as string; 
        }
        
        public string Action { get; set;} 
        public string Subject { get; set; } 
    }
}

This metadata attribute class simply allows any class supporting the ICommand interface to specify a subject string and action string that will eventually help us build verbal commands to which the program responds. Note: we could have made this attribute a little simpler by giving it a single ‘VerbalCommand’ property instead of the more complex Action property AND Subject property. In this case, I chose to use two properties so that it is easier to standardize the possible verbal commands. I hope that this will make it easier for the user to operate when there are lots and lots of commands by repeating the same basic verbal formula of “SoapBox [ACTION] [SUBJECT]” to trigger anything. I also hope that by using two properties it will be harder for other developers on my team to create their own verbal command patterns that don’t really match with other members of the team.

Now that we have made the necessary addition to the core, we are ready to create our Kinect add-in.

- Add-in Class Itself

Here is the entire Kinect add-in; it’s only 200 lines!

using System; 
using System.Collections.Generic; 
using System.ComponentModel.Composition; 
using System.IO; 
using System.Linq; 
using System.Threading; 
using System.Windows.Input; 
using System.Windows.Threading; 
using Microsoft.Research.Kinect.Audio; 
using Microsoft.Speech.AudioFormat; 
using Microsoft.Speech.Recognition; 
using SoapBox.Core; 

namespace SoapBox.KinectAddIn 
{
    [Export(SoapBox.Core.ExtensionPoints.Host.Void, typeof(Object))] 
    [Export(SoapBox.Core.ExtensionPoints.Host.ShutdownCommands, typeof(IExecutableCommand))]
    public class KinectAudioPlugIn : AbstractExtension, IExecutableCommand, IPartImportsSatisfiedNotification 
    {
        private const string RecognizerId = "SR_MS_en-US_Kinect_10.0";
        private const string SoftwareName = "SoapBox"; 
        private const double ConfidenceCutoff = 0.95; 
        protected SpeechRecognitionEngine _sre; 
        protected KinectAudioSource _kinectAudioSource; 
        protected Stream _kinectAudioStream; 
        protected Dispatcher _uiThreadDispatcher; 
        
        #region Protected Properties 
        
        protected IDictionary<string, IList<Lazy<ICommand, AudioCommandMetadata>>> AudioToCommandDict { get; set; }
        
        [ImportMany(typeof(ICommand))] 
        protected IEnumerable<Lazy<ICommand, AudioCommandMetadata>> Commands {get; set; }
        
        #endregion Protected Properties 
        
        #region Constructors 
        
        public KinectAudioPlugIn() 
        {
        }
        
        #endregion Constructors 
        
        #region Kinect Speech Recognition Event Handlers 
        
        void SreSpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e) 
        {
            logger.Info("\nSpeech Rejected"); 
        }
        
        void SreSpeechHypothesized(object sender, SpeechHypothesizedEventArgs e) 
        {
            logger.InfoWithFormat("\rSpeech Hypothesized: \t{0}", e.Result.Text); 
        }
        
        void SreSpeechRecognized(object sender, SpeechRecognizedEventArgs e) 
        {
            var resultText = e.Result.Text;
            var confidence = e.Result.Confidence; 
            if (confidence > ConfidenceCutoff && this.AudioToCommandDict.ContainsKey(resultText)) 
            {
                var cmdList = this.AudioToCommandDict[resultText]; 
                var a = new Action(() => 
                {
                    foreach (var item in cmdList) 
                    {
                        if (item.Value.CanExecute(null)) 
                        {
                            item.Value.Execute(null); 
                        }
                    }
                });
                this._uiThreadDispatcher.Invoke(a, null); 
            }
            else 
            {
                logger.InfoWithFormat("\nLow Confidence Speech Ignored:\n\tText = {0}\n\tConfidence = {1}",new object[2]{resultText,confidence}); 
            }
        }
        
        #endregion Kinect Speech Recognition Event Handlers 
        
        #region Helpers 
        
        Choices GetAllRecognizedCommands() 
        {
            //This implementation could probably be made into a nice LINQ statement if anyone knows/cares to do it 
            var recognizedCommands = new Choices(); 
            this.AudioToCommandDict = new Dictionary<string, IList<Lazy<ICommand, AudioCommandMetadata>>>(); 
            foreach (var item in this.Commands) 
            {
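                //Guard against missing metadata before trimming; Trim() on a null string would throw. 
                var action = (item.Metadata.Action ?? string.Empty).Trim(); 
                var subject = (item.Metadata.Subject ?? string.Empty).Trim(); 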
                if (string.IsNullOrEmpty(action) || string.IsNullOrEmpty(subject)) 
                {
                    continue; 
                }
                var phrase = string.Format("{0} {1} {2}", SoftwareName, action, subject); 
                if (this.AudioToCommandDict.ContainsKey(phrase)) 
                {
                    this.AudioToCommandDict[phrase].Add(item); 
                }
                else 
                {
                    recognizedCommands.Add(phrase);
                    this.AudioToCommandDict.Add(phrase, new List<Lazy<ICommand, AudioCommandMetadata>>() { item }); 
                }
            }
            
            return recognizedCommands; 
        }
        
        #endregion Helpers 
        
        #region IPartImportsSatisfiedNotification Members 
        
        public void OnImportsSatisfied() 
        {
            var t = new Thread(() => 
            {   //I know this is a perfect example of bad exception handling. I just don't know all the things 
                //that can go wrong with the Kinect yet, so I am just putting this entire initialization in 
                //one big try-catch block. If someone knows how to make this better, let me know. 
                try 
                {
                    this._kinectAudioSource = new KinectAudioSource(); 
                    _kinectAudioSource.FeatureMode = true; 
                    _kinectAudioSource.AutomaticGainControl = false; //Important to turn this off for speech recognition 
                    _kinectAudioSource.SystemMode = SystemMode.OptibeamArrayOnly; //No AEC for this sample 
                    RecognizerInfo ri = SpeechRecognitionEngine.InstalledRecognizers().Where(r => r.Id == RecognizerId).FirstOrDefault(); 

                    if (ri == null) 
                    {
                        return; 
                    }
                    
                    this._sre = new SpeechRecognitionEngine(ri.Id); 
                    var recCmnds = GetAllRecognizedCommands(); 
                    var gb = new GrammarBuilder(); 
                    
                    //Specify the culture to match the recognizer in case we are running in a different culture. 
                    gb.Culture = ri.Culture;
                    gb.Append(recCmnds);
                    
                    // Create the actual Grammar instance, and then load it into the speech recognizer. 
                    var g = new Grammar(gb);
                    _sre.LoadGrammar(g);
                    _sre.SpeechRecognized += SreSpeechRecognized;
                    _sre.SpeechHypothesized += SreSpeechHypothesized;
                    _sre.SpeechRecognitionRejected += SreSpeechRecognitionRejected;
                    this._kinectAudioStream = _kinectAudioSource.Start();
                    _sre.SetInputToAudioStream(_kinectAudioStream,new SpeechAudioFormatInfo( EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
                    _sre.RecognizeAsync(RecognizeMode.Multiple); 
                }
                catch(Exception e) 
                {
                    logger.Error("ERROR: Could not initialize Kinect audio and/or speech recognition engine.", e); 
                }
            });
            
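            //Capture the UI dispatcher before starting the worker thread so that a recognition 
            //event can never fire while _uiThreadDispatcher is still null. 
            this._uiThreadDispatcher = Dispatcher.CurrentDispatcher; 
            t.Start();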
        }
        
        #endregion
        
        #region IExecutableCommand Members 
        
        /// <summary>
        /// This is the shutdown command that cleans-up all the pieces we use here
        /// </summary>
        /// <param name="args"></param>
        public void Run(params object[] args) 
        {
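            //Any of these can still be null if initialization failed or no recognizer was found. 
            if (this._kinectAudioStream != null) { this._kinectAudioStream.Dispose(); } 
            if (this._sre != null) { this._sre.Dispose(); } 
            if (this._kinectAudioSource != null) { this._kinectAudioSource.Dispose(); } 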
        }
        
        #endregion
    }
}

Most of (90+% of) this code comes directly from the Audio Fundamentals Quickstart mentioned above. For a detailed explanation of that code, please watch the video and/or read the article. That said, there are a few threading tricks I had to implement in order to get the RecognizeAsync() method to work properly, but nothing there is terribly difficult to understand.

The parts of the above code that are of interest to SoapBox developers are the following:

GetAllRecognizedCommands Method

The GetAllRecognizedCommands() method of the Kinect add-in is really where most of the magic happens. In this method we parse all of the imported AudioCommandMetadata to create a dictionary mapping between the (lazily loaded) ICommand objects and their verbal triggers. We then return a Choices object that will be used to create the speech recognition grammar given to the SpeechRecognitionEngine.
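The comment in GetAllRecognizedCommands() asks whether the loop could be turned into a LINQ statement. Here is a rough sketch of how that grouping might look. To keep the snippet free of Kinect and MEF dependencies, it uses a hypothetical CommandInfo class in place of Lazy<ICommand, AudioCommandMetadata>; PhraseGrouping and GroupByPhrase are likewise names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Lazy<ICommand, AudioCommandMetadata>.
public class CommandInfo
{
    public string Name { get; set; }
    public string Action { get; set; }
    public string Subject { get; set; }
}

public static class PhraseGrouping
{
    // Maps each spoken phrase ("<software> <action> <subject>") to the commands it triggers.
    public static Dictionary<string, List<CommandInfo>> GroupByPhrase(
        IEnumerable<CommandInfo> commands, string softwareName)
    {
        return commands
            // Normalize the metadata first so null or padded values are handled once.
            .Select(c => new
            {
                Command = c,
                Action = (c.Action ?? string.Empty).Trim(),
                Subject = (c.Subject ?? string.Empty).Trim()
            })
            // Skip commands that do not define both halves of the phrase.
            .Where(x => x.Action.Length > 0 && x.Subject.Length > 0)
            // Commands that share a phrase end up in the same group.
            .GroupBy(
                x => string.Format("{0} {1} {2}", softwareName, x.Action, x.Subject),
                x => x.Command)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

The dictionary keys are the distinct phrases, so they could presumably be used to build the Choices object in place of the incremental Add calls.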

SreSpeechRecognized Method

By the time we reach the SreSpeechRecognized event handler we are already on the home stretch. Instead of just writing the recognized text to the console, as is done in the Audio Fundamentals Quickstart, we use it to find the commands to be executed when that particular speech is recognized. Once we have the list of ICommand objects to be executed we go through each of them and execute the ones that are executable.

CHANGES TO EXISTING CODE

- Just Add Water Metadata

As you may already be able to tell, getting the existing SoapBox PinBall demo to hook into this add-in is just as easy as adding the following metadata attribute to its ViewMenuPinBallTable class:

[AudioCommandMetadata(Action = "Show", Subject = "PinBallTable")]

Since the ViewMenuPinBallTable class inherits from AbstractMenuItem, it already supports the ICommand interface. All we have to do now is run the application with a Kinect hooked up to our machine and say the words, “SoapBox Show PinBallTable”; the Run() method of the ViewMenuPinBallTable class will then be executed by the Kinect add-in so that the pinball table magically appears on the screen. Pretty Cool, huh !?!

- Is The Custom Metadata Even Really Needed?

If you refer back to the top of this post you will see that my number one design constraint was to make, “ZERO source code changes to the existing PinBallTable add-in”. Now, some of you may be saying to yourselves, “Hey, you cheated, you added that metadata attribute and therefore you changed the PinBallTable demo source code!” Well, you are kind of right, if you really call metadata attributes source code. I mean are you really going to write new tests or expect old ones to break after adding a single line of metadata? If so you might want to take a second look at your tests and design because something is seriously wrong.

For all you hardcore sticklers out there, I want to make it clear that we could have achieved a very similar result without even making that single metadata attribute addition (an important point if you don’t want to, or can’t, re-compile your existing plug-in(s)). If, instead of using the [ImportMany(typeof(ICommand))] contract on the Commands property in the Kinect add-in, we had used the [Import(SoapBox.Core.ExtensionPoints.Workbench.MainMenu.ViewMenu, typeof(IMenuItem))] contract, then instead of parsing metadata attributes we could have simply parsed the ID property from each of the IMenuItems. The verbal commands would then have become something like “SoapBox PinBallTable”, because “PinBallTable” is the value given to the ID property of the ViewMenuPinBallTable class.

Though this approach doesn’t use lazy loading, it does something better: it only uses objects that have necessarily already been loaded. Since all the menu items must be loaded at start-up, there would be no new instantiations made with this approach.

The reason I chose the metadata approach is because I think it results in a much more robust and broadly useable solution. Specifically, it forces the designer to specify the Subject and Action of every command and it provides a more general way to hook into the Kinect add-in.

Conclusion

In this article, we discussed a method for robustly enabling voice commands in a wide range of SoapBox add-ins. We saw that through minimal effort on new code and essentially ZERO changes to existing code we were able to make the pre-existing PinBallTable add-in respond to verbal commands from the user.

All you SoapBox veterans out there are probably not too surprised by how easy SoapBox Core makes all of this, and how little code was needed, because you already know how powerful the framework is. On the other hand, if you are new to MEF and/or SoapBox you are almost certainly amazed at how easy it was to add Kinect support to an existing SoapBox application, in which case I hope you use this article as a motivator to check out SoapBox Core and give it a shot.

I hope you have enjoyed and understood this post. If you have questions, comments, concerns, suggestions, requests or jokes - or if you want my source code for this post - please e-mail beachfrontcoding@gmail.com and I will be happy to send it to you. Once you get the code up and running, here are a couple of things you can do to make it better:

1.) Add more commands. Change the AllowMultiple attribute on the custom metadata class to true so that the PinBallTable can respond to both “SoapBox Show PinBallTable” and “SoapBox Open PinBallTable” with the same ICommand object.

2.) Create a file similar to the Extensions.cs file in Soapbox Core that you can use to manage the subject and action string values for lots of commands used across your application.

3.) Export the “SoftwareName” and “ConfidenceCutoff” properties from another class property so that you have a strongly-typed config file to use in the Kinect add-in.

4.) The SpeechRecognitionEngine is sometimes a little slow and it makes the PinBallTable appear to load slowly. Add a StatusBarLabel to update the user on the happenings as soon as a piece of speech is recognized so that they don’t get impatient and issue the command over and over again.

5.) Change the AllowMultiple attribute on the AudioCommandMetadata class to true and change the corresponding constructor appropriately so that ICommand classes can respond to more than one verbal command. For example, maybe the pinball table should open when the user says, “SoapBox Open PinBallTable” AND when the user says, “SoapBox Show PinBallTable”.

Now go, get to work on SoapBox!

— Karl B.
