Getting set up for augmented reality using XNA and Kinect

The term virtual reality has gotten kind of stale and dusty. It seemed really cool back when I was a kid, but for whatever reason it has fallen out of vogue. But augmented reality is here to take its place.

I want to get started making augmented reality apps, so to get going I figure I need two things: a live video feed (let's call that reality for now) and a way to draw 3D stuff on top of it. I will be using the Kinect for the video feed and Microsoft's XNA framework for the 3D drawing. In this post I will describe how to bring in live Kinect video footage and draw 3D XNA content on top of it.

To get the Kinect video data rendered behind the 3D content we are going to use a nice reusable class from XNA, DrawableGameComponent. But first things first. Our arena today will be a standard XNA game project. Since we are going to be using the Kinect, we will need to add a reference to Microsoft.Research.Kinect to the game project. Next up we need to create a class that inherits from DrawableGameComponent. This class will be responsible for getting the video frames from the Kinect and drawing them on screen.

Specifically we need to:

  • Initialize the Kinect and subscribe to the video frame ready events
  • Allocate some space for Texture2D objects and buffer arrays that we can use to convert the Kinect data
  • Translate the video frames into Texture2D objects that can be used to draw with
  • Draw the Texture2D video frames onto the screen using a SpriteBatch
So we need a new class inheriting from DrawableGameComponent:
    public class KinectVideoFeedComponent : DrawableGameComponent
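The rest of the snippets in this post read and write a handful of member variables on this class, so let's declare them up front. This is a minimal sketch; the names match the code used throughout, and the constructor just forwards the Game to the base class:

        //The Kinect runtime, plus the textures and byte buffers the frame handlers fill in
        Runtime nui = new Runtime();
        SpriteBatch spriteBatch;
        Texture2D colorFrame;
        byte[] colorFrameBuffer;
        Texture2D depthFrame;
        byte[] depthFrameBuffer;

        public KinectVideoFeedComponent(Game game) : base(game) { }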
We will do the first two steps in the LoadContent method, which we can override from the base class:
        protected override void LoadContent()
        {
            spriteBatch = new SpriteBatch(GraphicsDevice);

            colorFrame = new Texture2D(GraphicsDevice, 640, 480, false, SurfaceFormat.Color);
            colorFrameBuffer = new byte[640 * 480 * 4];

            depthFrame = new Texture2D(GraphicsDevice, 320, 240, false, SurfaceFormat.Color);
            depthFrameBuffer = new byte[320 * 240 * 4];

            nui.Initialize(RuntimeOptions.UseColor | RuntimeOptions.UseDepth);
            nui.VideoFrameReady += new EventHandler<ImageFrameReadyEventArgs>(nui_VideoFrameReady);
            nui.VideoStream.Open(ImageStreamType.Video, 2, ImageResolution.Resolution640x480, ImageType.Color);

            nui.DepthFrameReady += new EventHandler<ImageFrameReadyEventArgs>(nui_DepthFrameReady);
            nui.DepthStream.Open(ImageStreamType.Depth, 2, ImageResolution.Resolution320x240, ImageType.Depth);

            base.LoadContent();
        }

Next up is translating the video frames into Texture2D objects that can be used to draw with, but before we do that I would like to pause to describe how images are passed around in XNA and with the Kinect. You know that images are really little more than a collection of colored pixels stacked together. You probably also know that a pixel is made up of a red value, a green value and a blue value, and how you combine those values tells you the color. If you turn red all the way up and green and blue down, you end up with red. If you turn up red and blue and leave green down, you get purple. If you turn them all up you get white. All down is black.

With the Kinect and XNA the red, green and blue values are stored as a simple array of bytes. You might expect a two-dimensional array so that you could get at any given point in an image using its X and Y coordinates, but when it comes to this sort of programming you need to make every bit count. So every pixel of an image is stored as one long series of bytes: the first few bytes represent the first pixel, the next few represent the next pixel, and so on. (In practice each pixel takes four bytes, the fourth being alpha or padding, which is why the buffers above are sized width * height * 4.) Everyone agrees that the pixels start in the top left and continue to the right, row by row. The problem is that not everyone agrees about the order of the color components.

In fact, not everyone at Microsoft agrees about it. The pixels coming from the Kinect are ordered blue, green, red. When you ask XNA to draw the pixels for you, it expects the bytes to be ordered red, green, blue. If you try to draw an image from the Kinect in XNA as-is, the red and blue values will be mixed up and your image will look weird. So if you are going to use them together, you have to swap the red and blue channels before handing the image off to XNA.
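To make the flat byte layout concrete, here is a small illustrative helper (my own, not part of the project) that finds a pixel by its X and Y coordinates in one of these arrays and swaps its blue and red bytes in place:

        //Illustrative only: swap the B and R bytes of the pixel at (x, y),
        //assuming a flat BGRA-style array with 4 bytes per pixel
        static void SwapRedBlueAt(byte[] pixels, int width, int x, int y)
        {
            int i = (y * width + x) * 4;    //index of the pixel's first byte
            byte temp = pixels[i];          //blue
            pixels[i] = pixels[i + 2];      //move red into the blue slot
            pixels[i + 2] = temp;           //and blue into the red slot
        }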

Ok, back to the problem at hand. We need to convert the video frames from the Kinect into a Texture2D that XNA can use:

        void nui_VideoFrameReady(object sender, ImageFrameReadyEventArgs e)
        {
            //Copy the frame bytes into our buffer, swapping the blue and red channels
            byte[] bits = e.ImageFrame.Image.Bits;
            for (int i = 0; i < bits.Length; i += 4)
            {
                colorFrameBuffer[i] = bits[i + 2];
                colorFrameBuffer[i + 1] = bits[i + 1];
                colorFrameBuffer[i + 2] = bits[i];
                colorFrameBuffer[i + 3] = 255; //opaque alpha, so the sprite isn't drawn transparent
            }
        }
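We also subscribed to DepthFrameReady back in LoadContent. I won't do anything with the depth data in this post, but a handler along these lines would unpack it into depthFrameBuffer as grayscale. This is a sketch under the assumption that ImageType.Depth frames carry two little-endian bytes of depth per pixel, measured in millimeters; the exact scaling to gray is up to you:

        //Sketch of a depth handler -- assumes each pixel is a
        //little-endian 16-bit depth value in millimeters
        void nui_DepthFrameReady(object sender, ImageFrameReadyEventArgs e)
        {
            byte[] bits = e.ImageFrame.Image.Bits;
            for (int i = 0; i < bits.Length / 2; i++)
            {
                int depth = bits[i * 2] | (bits[i * 2 + 1] << 8);
                //Map roughly 0-4m of depth onto a 0-255 gray ramp (closer = brighter)
                byte gray = (byte)(255 - Math.Min(255, depth / 16));
                depthFrameBuffer[i * 4] = gray;     //blue
                depthFrameBuffer[i * 4 + 1] = gray; //green
                depthFrameBuffer[i * 4 + 2] = gray; //red
                depthFrameBuffer[i * 4 + 3] = 255;  //alpha
            }
        }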

Lastly, we need to draw the frame on the screen:

        public override void Draw(GameTime gameTime)
        {
            GraphicsDevice.Textures[0] = null;    //unbind the texture so SetData doesn't complain
            colorFrame.SetData(colorFrameBuffer); //push the latest frame bytes into the texture
            spriteBatch.Begin(SpriteSortMode.Deferred, BlendState.Opaque);
            spriteBatch.Draw(colorFrame, new Vector2(0, 0), Color.White);
            spriteBatch.End();
            //Using SpriteBatch changes some device states,
            //so we need to flip them back or the rest of the render might look funny
            GraphicsDevice.BlendState = BlendState.Opaque;
            GraphicsDevice.DepthStencilState = DepthStencilState.Default;
        }

Did I say lastly? What I meant was lastly for the DrawableGameComponent. Now we just have to add it to the game. So head on over to your game class; by default it's called Game1.cs. We are going to need a reference to the component to work with throughout the game, so add a member variable up near the top:

    public class Game1 : Microsoft.Xna.Framework.Game
        KinectVideoFeedComponent kinectVideo;

Then in the constructor initialize and add the component to the Components collection:

        public Game1()
        {
            kinectVideo = new KinectVideoFeedComponent(this) { Visible = false };
            Components.Add(kinectVideo);
        }

You may have noticed that Visible = false part. DrawableGameComponents are a great way to package discrete, reusable bits of code that draw on the screen. One little problem with them is that out of the box their Draw() methods are called after the game's Draw() method. This is good for a lot of things, but in our case we want the video to draw almost first, so that the rest of the scene will render on top of it. To make that happen we set the Visible property to false and call the component's Draw() method manually. So at this point pretty much all that is left is to add a call to kinectVideo.Draw in the game's Draw method:

        protected override void Draw(GameTime gameTime)
        {
            GraphicsDevice device = graphics.GraphicsDevice;
            device.Clear(Color.CornflowerBlue);
            kinectVideo.Draw(gameTime); //draw the video first so the rest of the scene renders on top
            //...the rest of your 3D drawing goes here...
            base.Draw(gameTime);
        }

That is it. If you compile and run now you should see a live video feed from the Kinect rendered behind your scene. It is a simple start, but it should form the basis of future augmented reality applications.

Source code is available at: with Kinect in XNA


4 thoughts on “Getting set up for augmented reality using XNA and Kinect”

  1. turkim01 says:

    hell o thats very nice can you send me this proje my e mail

  2. Devinda says:

    Thanks Matt ,most of the people hiding there AR technology because of their high commercial value. But you are so generous..thanks again,

  3. Zen Ryan says:

    I am not able to download the project from the svn repository…it gives some access error…
    Could you please mail it to me?
