In December of last year I heard that Microsoft was providing a Kinect sensor in the FIRST Robotics Competition's "Kit of Parts". For those of you who are not familiar with this competition, it is an annual event in which high school students build a robot to solve a challenge that varies from year to year. Back in 2008, I volunteered as a judge at a FIRST Tech Challenge event, and I have never forgotten the experience of seeing so many young people so excited about engineering and science.

As an inveterate software engineer and amateur robotics researcher who uses many different Microsoft technologies, I decided to investigate the Kinect sensor and the SDK Microsoft provides to exploit its capabilities. My thought was that I might learn something myself while creating an example of how to use the SDK for a basic function: determining the angle between body segments of the person being tracked by the Kinect sensor. Since the Kinect sensor is designed to perform "skeletal tracking," this is well within its range of capabilities. It is also something the competitors may need in order to control their robots if they choose to use the Kinect sensor at the operator station. I am a bit late in presenting this article, considering that the initial six-week build period in the FIRST Competition is almost over, but I hope it might still be useful to the student competitors.

First, let me say that Microsoft has done an excellent job of providing a full-featured SDK that exposes all aspects of the Kinect sensor, from raw data to higher-level functions. It does so through classes available in both managed and native code, and example applications are provided to demonstrate the use of each class. I have used many robotics sensors over the years, and even though the Kinect sensor was initially designed for Xbox 360 games, it shows the most promise of any robotics sensor I have seen at this price (approximately $100 as a stand-alone product).

So back to the problem at hand: imagine yourself controlling a mobile robot with your body position. This is not unlike what a ground crew does when guiding a jumbo jet into the terminal area, using arm positions to tell the pilot how close the plane is to its mark and when to stop. I have created a small class that calculates the angle between two adjacent body segments so that the Kinect sensor can be used to accomplish a similar level of control.

The complete class, called the SkeletonAnalyzer, is available here and snippets are presented below to provide details of its operation. References used to create this code are provided along the way. Since I am not a mathematician, I too rely on the expertise of others so graciously provided on various web pages. The SkeletonAnalyzer is written in C# and uses the Kinect assemblies, in addition to the XNA framework assemblies. The XNA framework is used to perform some basic vector calculations.
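As a quick taste of those XNA vector calls, here is a purely illustrative fragment (it is not part of the SkeletonAnalyzer download) showing the Vector3 operations the class relies on:

using System;
using Microsoft.Xna.Framework;

class VectorDemo
{
    static void Main()
    {
        // A vector in the XY plane, rescaled to unit length in place.
        Vector3 v = new Vector3(3, 4, 0);
        v.Normalize();
        Console.WriteLine(v);                              // roughly {X:0.6 Y:0.8 Z:0}

        // The dot product of two unit vectors is the cosine of the
        // angle between them.
        Console.WriteLine(Vector3.Dot(v, Vector3.UnitX));  // 0.6
    }
}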

Whenever I learn a new SDK, I like to begin at the top and work my way down to the details of the implementation. To demonstrate the use of the SkeletonAnalyzer, I modified the KinectDiagnosticViewer user control provided with the Kinect SDK. The modifications made to the KinectDiagnosticViewer constructor to instantiate and configure the SkeletonAnalyzer are shown below:

 

public KinectDiagnosticViewer()
{
    InitializeComponent();

    // Instantiate two instances of the SkeletonAnalyzer, one for each arm where a
    // calculation of the angle between the upper arm and forearm is required.
    _LeftArmSkeletonAnalyzer = new SkeletonAnalyzer();
    _RightArmSkeletonAnalyzer = new SkeletonAnalyzer();

    // Reverse the coordinates to portray them correctly on the screen, with hands at the
    // head acting as quadrant zero (aviation ground control style).
    _LeftArmSkeletonAnalyzer.ReverseCoordinates = true;
    _RightArmSkeletonAnalyzer.ReverseCoordinates = true;

    // Reverse the order of the body segments in the left arm relative to the right arm
    // to account for the fact that they are on the opposite side.
    _LeftArmSkeletonAnalyzer.SetBodySegments(JointID.WristLeft, JointID.ElbowLeft, JointID.ShoulderLeft);
    _RightArmSkeletonAnalyzer.SetBodySegments(JointID.ShoulderRight, JointID.ElbowRight, JointID.WristRight);
}

Just after the two instances of the SkeletonAnalyzer class are created, one for the left arm and the other for the right, the ReverseCoordinates property is set to true on each. This is important because the angle is calculated from the true position of each body segment, not the position as depicted on the screen, which is a mirror image of the true position. By reversing the coordinates we are effectively calculating the angle of the body segments as they appear on the computer screen. To see this more clearly, draw a stick figure on a piece of paper, then turn the paper around and hold it up to the light to see how the angle changes position. Though the angle itself doesn't change, its value relative to the coordinate system does, since it now lies in a different quadrant. The ReverseCoordinates setting accounts for this and effectively flips the paper back again so it matches the true position of the body segments.
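The reversal itself happens inside the class's CalculateReverseCoordinates helper, which is called at the end of the GetBodySegmentAngle listing shown later. The downloadable source contains the real implementation; as a minimal sketch of the underlying idea, mirroring a point across the vertical axis maps (x, y) to (-x, y), which maps an angle measured from the positive X axis to 180 degrees minus that angle:

// Hypothetical sketch only -- see the downloadable SkeletonAnalyzer source
// for the actual implementation.  Reflects an angle (in degrees) across the
// vertical axis and wraps the result back into the range 0 to 360.
private double CalculateReverseCoordinates(double degrees)
{
    double reversed = (180 - degrees) % 360;
    if (reversed < 0)
    {
        reversed += 360;
    }
    return reversed;
}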

Next, the joints that comprise each body segment are supplied to each instance of the SkeletonAnalyzer using the SetBodySegments function. The first call includes the joints for the left arm, and the second call the joints for the right arm.
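SetBodySegments itself has little to do beyond recording the three joint IDs for later use; its body presumably amounts to something like this (a sketch inferred from how _JointId1 through _JointId3 are used in GetBodySegmentAngle below, not the verbatim source):

// Records the three joints that define the two adjacent body segments:
// the first segment runs from jointId1 to jointId2 and the second from
// jointId2 to jointId3, so jointId2 is the vertex of the measured angle.
public void SetBodySegments(JointID jointId1, JointID jointId2, JointID jointId3)
{
    _JointId1 = jointId1;
    _JointId2 = jointId2;
    _JointId3 = jointId3;
}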

Now that each instance of the SkeletonAnalyzer class is fully configured, you need to call it with data from the Kinect sensor. This is accomplished with a small change to the KinectDiagnosticViewer's nui_SkeletonFrameReady() method, which processes each new batch of skeleton data. An excerpt of this method containing the changes is shown below:

 

. . . code above removed for brevity . . .

            jointLine.StrokeThickness = 6;
            skeletonCanvas.Children.Add(jointLine);
        }

        float leftArmAngle = (float)_LeftArmSkeletonAnalyzer.GetBodySegmentAngle(data.Joints);
        angleLeftForeArmUpperArm.Text = leftArmAngle.ToString();
        float rightArmAngle = (float)_RightArmSkeletonAnalyzer.GetBodySegmentAngle(data.Joints);
        angleRightForeArmUpperArm.Text = rightArmAngle.ToString();
    }
    iSkeleton++;

. . . code below removed for brevity . . .

The SkeletonData object contains an array of Joint objects, retrieved from the Skeletons collection in the foreach loop located just above this code snippet. Each Joint object contains the coordinates of the associated joint, expressed as floating-point values. You can see how these values are converted to screen coordinates in the SkeletonViewer project provided with the SDK samples; for our purposes the coordinates are converted to unit vectors, so screen coordinates are not used. GetBodySegmentAngle() returns an angle in degrees, which is then displayed in the KinectDiagnosticViewer user control. The image below depicts the KinectDiagnosticViewer with the newly added left and right arm angles at the lower right, showing values of approximately 39 and 90 degrees, respectively.

Kinect Skeleton Viewer with Angle

Now that you have seen it work, I will briefly explain the source code for the SkeletonAnalyzer class, along with references for more information on the geometry used in the calculation of the angle. As a reminder, the SkeletonAnalyzer source and project files are available for download here.

At this point, there isn't much going on in the SkeletonAnalyzer class. My hope is that users of this class will enhance it and post their changes to the FIRST Kinect or embedded101.com forum. For now, the GetBodySegmentAngle() method is where all of the action takes place: it calculates the angle between the two body segments defined by the three joints previously specified in the KinectDiagnosticViewer constructor via the SetBodySegments() method. The GetBodySegmentAngle() method is presented below in its entirety:

 

public double GetBodySegmentAngle(JointsCollection joints)
{
    Joint joint1 = joints[_JointId1];
    Joint joint2 = joints[_JointId2];
    Joint joint3 = joints[_JointId3];

    // Build a vector in the XY plane for each body segment, then rescale
    // each one to unit length.
    Vector3 vectorJoint1ToJoint2 = new Vector3(joint1.Position.X - joint2.Position.X, joint1.Position.Y - joint2.Position.Y, 0);
    Vector3 vectorJoint2ToJoint3 = new Vector3(joint2.Position.X - joint3.Position.X, joint2.Position.Y - joint3.Position.Y, 0);
    vectorJoint1ToJoint2.Normalize();
    vectorJoint2ToJoint3.Normalize();

    // For unit vectors in the XY plane, the Z component of the cross product
    // is the sine of the angle between them and the dot product is the cosine.
    Vector3 crossProduct = Vector3.Cross(vectorJoint1ToJoint2, vectorJoint2ToJoint3);
    double crossProductLength = crossProduct.Z;
    double dotProduct = Vector3.Dot(vectorJoint1ToJoint2, vectorJoint2ToJoint3);
    double segmentAngle = Math.Atan2(crossProductLength, dotProduct);

    // Convert the result to degrees.
    double degrees = segmentAngle * (180 / Math.PI);

    // Add the angular offset, using modulo 360 to keep the result within
    // a single revolution.
    degrees = (degrees + _RotationOffset) % 360;

    // Reverse the coordinates if requested, to account for the mirror-image
    // display (see the discussion of ReverseCoordinates above).
    if (_ReverseCoordinates)
    {
        degrees = CalculateReverseCoordinates(degrees);
    }

    return degrees;
}

The heart of this method is the cross product, dot product, and Atan2 calculation. For unit vectors, the dot product equals the cosine of the angle between them, while the Z component of the cross product (stored in crossProductLength) equals the sine, including its sign. Atan2 combines these two values to produce the signed angle, in radians, between the body segments. A much better explanation of the geometry behind this calculation is available here and here.
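Because Atan2 returns a signed angle, the order in which the joints are passed to SetBodySegments matters: swapping the operands of the cross product negates its Z component, and therefore the resulting angle, which is why the constructor above lists the left-arm joints in the opposite order from the right-arm joints. A small standalone check of this behavior, using the same XNA calls (illustrative only, not part of the sample):

using System;
using Microsoft.Xna.Framework;

class SignedAngleDemo
{
    // Signed angle, in degrees, from v1 to v2 in the XY plane.
    static double SignedAngle(Vector3 v1, Vector3 v2)
    {
        v1.Normalize();
        v2.Normalize();
        return Math.Atan2(Vector3.Cross(v1, v2).Z, Vector3.Dot(v1, v2)) * 180 / Math.PI;
    }

    static void Main()
    {
        Vector3 right = new Vector3(1, 0, 0);
        Vector3 up = new Vector3(0, 1, 0);

        Console.WriteLine(SignedAngle(right, up));  // prints  90
        Console.WriteLine(SignedAngle(up, right));  // prints -90
    }
}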

Conclusion:

Using the Kinect SDK, it is easy to obtain skeletal data from the Kinect sensor once tracking begins. This data can then be used in a variety of ways to determine the intent of the person controlling an object, such as your FIRST robot. In this article we used a geometric formulation to derive the angle between two body segments for each skeletal frame as it becomes available from the sensor. This angle can then be interpreted as a motor speed, not unlike a member of the ground crew guiding a jumbo jet to a precise mark near the terminal building.
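As one last illustration of that final point, here is one hypothetical way a competitor might map the arm angle onto a normalized motor command. The 0-to-180-degree range and the MotorSpeedFromArmAngle name are inventions for this example, not part of the FIRST control system or the SkeletonAnalyzer class:

using System;

static class ArmControl
{
    // Hypothetical mapping: arm hanging straight down (0 degrees) is full
    // reverse, horizontal (90 degrees) is stop, and straight up (180 degrees)
    // is full forward.  Angles outside the range are clamped.
    public static double MotorSpeedFromArmAngle(double degrees)
    {
        double clamped = Math.Max(0.0, Math.Min(180.0, degrees));
        return (clamped - 90.0) / 90.0;   // -1.0 .. +1.0
    }
}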