Covariance Matrix

This weekend I was going through a section in Real-Time Collision Detection on computing the covariance matrix of a set of points (from here on, a “point cloud”). Of course, while working through the implementation it’s always good to have an example or two to help solidify the concept, so I worked through a few samples for my unit tests and decided to post them here.

Example 1:

Let’s start with a simple, somewhat uninteresting example where we have a single point on each of the cardinal axes.

+x:  1.0,  0.0,  0.0
+y:  0.0,  1.0,  0.0
+z:  0.0,  0.0,  1.0
-x: -1.0,  0.0,  0.0
-y:  0.0, -1.0,  0.0
-z:  0.0,  0.0, -1.0

Covariance Matrix:

0.333333,  0.0,  0.0
0.0,  0.333333,  0.0
0.0,  0.0,  0.333333

This is what we would expect because the spread is even along all the axes.

Example 2:

Now, let’s look at a rotated version of the point cloud used in Example 1: rotate the points 45 degrees about the y-axis. I’m going to use 0.5 instead of sqrt(2)/2 for the rotated components, which pulls the x and z points in toward the origin and illustrates what happens when there is a dominant axis.

+x:  1.0,  0.0,  0.0   ->    0.5,  0.0,  0.5
+y:  0.0,  1.0,  0.0   ->    0.0,  1.0,  0.0  (stays the same)
+z:  0.0,  0.0,  1.0   ->   -0.5,  0.0,  0.5
-x: -1.0,  0.0,  0.0   ->   -0.5,  0.0, -0.5
-y:  0.0, -1.0,  0.0   ->    0.0, -1.0,  0.0  (stays the same)
-z:  0.0,  0.0, -1.0   ->    0.5,  0.0, -0.5

Covariance Matrix:

0.166667,  0.0,  0.0
0.0,  0.333333,  0.0
0.0,  0.0,  0.166667

This result should make sense because the points in the xz-plane are no longer spread as wide as the two points on the y-axis. As a result, the y component of the diagonal is larger than its x and z counterparts.

Also, it should be noted that even though the points in the xz-plane were not on the cardinal axes, the result is still a diagonal matrix. This is because the point cloud in this example is symmetric: each point (x, 0, z) in the xz-plane has a corresponding point (-x, 0, -z).

Example 3:

In this example, we use the same point cloud as in Example 2, but translate all the points by (1.1, -0.4, 0.7).

 1.6, -0.4,  1.2
 1.1,  0.6,  0.7
 0.6, -0.4,  1.2
 0.6, -0.4,  0.2
 1.1, -1.4,  0.7
 1.6, -0.4,  0.2

Covariance Matrix:

0.166667,  0.0,  0.0
0.0,  0.333333,  0.0
0.0,  0.0,  0.166667

This example is a test to confirm a property of the covariance matrix: it is unchanged when the point cloud is translated, because the centroid is subtracted off before the entries are computed.

Example 4:

Okay, so now let’s just take an arbitrary point cloud of eight points. The example here has no built-in symmetry, nor is it centered at the origin.

 1.2,  1.2,  1.2
-0.8, -0.8, -0.8
 0.7,  0.7,  0.5
 0.3,  0.4, -0.7
-0.2,  1.1,  0.5
 1.3, -0.8,  0.9
-0.1, -0.1, -0.3
 0.4, -0.5, -0.7

Centroid: 0.35, 0.15, 0.075

Covariance Matrix:

0.447500,  0.102500,  0.353750
0.102500,  0.582500,  0.283750
0.353750,  0.283750,  0.551875

Implementation Notes:

The covariance matrix is symmetric, so only the upper triangular entries (including the diagonal) need to be computed; the lower triangle can be mirrored from them.

I included the centroid in the final example since it is subtracted from each point in the point cloud before computing the covariance matrix entries. We do this because we want our result to reflect the spread of the points in the local space of the point cloud.
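To make these notes concrete, here is a minimal sketch of the computation in C++. The Point3 struct and function name are placeholders of my own (not code from Real-Time Collision Detection or my unit tests), and it divides by the number of points, matching the examples above.

#include <cstddef>

struct Point3 { float x, y, z; };

// Computes the 3x3 covariance matrix of a point cloud.
// Only the upper triangle (including the diagonal) is accumulated;
// the lower triangle is mirrored because the matrix is symmetric.
void ComputeCovarianceMatrix(const Point3* points, const size_t count, float cov[3][3])
{
   if(count == 0)
      return;

   const float n = (float)count;

   // compute the centroid
   float cx = 0.0f, cy = 0.0f, cz = 0.0f;
   for(size_t i = 0; i < count; ++i)
   {
      cx += points[i].x;
      cy += points[i].y;
      cz += points[i].z;
   }
   cx /= n;  cy /= n;  cz /= n;

   // accumulate products of the centered coordinates (upper triangle only)
   float xx = 0.0f, yy = 0.0f, zz = 0.0f;
   float xy = 0.0f, xz = 0.0f, yz = 0.0f;
   for(size_t i = 0; i < count; ++i)
   {
      const float dx = points[i].x - cx;
      const float dy = points[i].y - cy;
      const float dz = points[i].z - cz;

      xx += dx * dx;  yy += dy * dy;  zz += dz * dz;
      xy += dx * dy;  xz += dx * dz;  yz += dy * dz;
   }

   // average and mirror into the lower triangle
   cov[0][0] = xx / n;  cov[1][1] = yy / n;  cov[2][2] = zz / n;
   cov[0][1] = cov[1][0] = xy / n;
   cov[0][2] = cov[2][0] = xz / n;
   cov[1][2] = cov[2][1] = yz / n;
}

Running this over the point cloud in Example 4 should reproduce the centroid and matrix listed above.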

Left- and Right-Handed Coordinate Systems

Lately I’ve been working on the core engine math library, and as a result have had to delve into the wonderful world of “handedness”. This part of implementing a cohesive math library has always been a stumbling block for me in the past, but this time around things are much clearer. Here are some of my notes.

Left-Handed Coordinate System
   Matrix Layout: Row Vectors
   Matrix Multiplication: M = M_1 M_2 ... M_n
   Polygon Winding: Clockwise

Right-Handed Coordinate System
   Matrix Layout: Column Vectors
   Matrix Multiplication: M = M_n ... M_2 M_1
   Polygon Winding: Counter-Clockwise
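As a quick illustration of the multiplication-order difference (my own example, not part of the original notes): to apply transform M_1 first and then M_2 to a point p, the two conventions compose in opposite orders.

   Row vectors:    p' = p M_1 M_2
   Column vectors: p' = M_2 M_1 p

In both cases M_1 is applied first; only the side the point sits on, and therefore the order you write the product in, changes.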

Book Review: Software Engineering for Game Developers

Software Engineering for Game Developers

John P Flynt, Ph.D.

Thoughts Overall:

SEGD is a detailed post-mortem of a small team’s game project: Ankh. It focuses in on the process of developing a game from start to finish… with a heavy emphasis on the process. The book spans 800+ pages, many of which are tabular explanations of forms and documents. In other words — this book has “academic” written all over it. Okay, granted the Ph.D. at the end of the author’s name also indicates this fact, but I was willing to give the book a chance (as I am with a lot of books!).

Overall, I felt let down about a hundred pages into the book and it never recovered. Initially, I had high hopes because I like reading post-mortems from game projects (even if it may have been from a game that no one’s ever heard of) and I like process. However, SEGD takes its readers down a road that veers strongly towards the managerial side of game development, sans a couple of chapter sections sprinkled with the code of a few classes in an attempt to catch the attention of software engineers. Don’t get me wrong, I don’t mind reading about practices and methodologies — I grew up on SCRUM and Test-Driven Development, but there’s something wrong when a book is detailing forms from the ISO standard.

Yes, I read all of it. The best parts were when they explained the expectations they had, the methods they used, the overall outline of their setup, the roles that team members had, and the problems they ran into and how they resolved them. These were great discussions. In fact, I think the “Stripe” (iteration) methodology used is a pretty viable structure. But all this could have been wrapped up well before hitting the 500 page mark.

Do I Recommend this Book?

For this particular book, I need you to answer my question first: Do you like digging for your information?

This book is dense and is, in fact, rewarding on varying levels for both managers and members of a game development team, but you’ll have to dig through layers of mechanics for those nuggets.

Pros:

I really appreciated the effort that went into tracking the info during the Ankh project. In fact, this is exactly why I bought the book: to read about a team project that goes from design doc to a fully functional game. It was very interesting to read about how a bunch of ideas were turned into something tangible and fun.

This book has a lot to offer in terms of how a team formulates a game and gets its ideas implemented from the initial scribblings written weeks or months before. It examines an iterative development methodology (each iteration is called a “Stripe”). Each stripe is a complete demo that fulfills a specified set of requirements, leading up to the final stripe, which completes the game. The book takes the reader through each stripe (not in detail for all 17, but that’s okay — it started to become monotonous after a while), explains many of the important issues that arose, and addresses the team dynamics.

Furthermore, each stripe is outlined (with requirements and use cases), implemented, and tested. All of these steps are elaborated upon with good context, though there is a heavy emphasis on the outlining and documentation and less so on the actual implementation.

As you might have expected, the topic of design patterns surfaces throughout the discussion. Although I had feared the worst when I reached this chapter (the overall view elsewhere being so lofty it made me feel uncomfortably distant from the software), it actually turned out to be one of the best chapters of the book. For SEGD, patterns aren't the be-all and end-all I have been reading about in other sources. The chapter takes a very practical point of view, and after you have finished reading it you don't feel like every problem can be solved by patterns, but rather that they are proven, structured designs for a distinct class of problems. This tone and treatment was very refreshing.

Cons:

After finishing this book I felt like I needed to have a formal review before moving on to the next book in my queue. My biggest complaint is the book was too focused on formalities of the game development process. I kept asking myself, “How did these guys make any progress with all these documents and meetings?”

Can I think of anything more boring to read about than metrics? I know: tables of ISO standard forms! I cannot fathom why the author could not have simply suggested that the reader look at a few forms to use as templates when setting up a design document or other related documents, rather than including pages upon pages of tables with explanations of each section (some of which boil down to “you probably don’t need to include this section”).

Don’t expect code. Classes, their objects, and the messages/events passed between them are all examined at a high architectural level. I was thinking this book might explain more about how library modules come together to form a game engine, but the game project itself was quite tiny compared to what I had already been exposed to at work, and it did not discuss engine architecture in any detail. Instead, the focus was on using UML and SmartDraw to design every class in your game. Personally, I believe if you have enough foresight and patience to draw out the UML and maintain it throughout the life of a game project, you should take a step back and consider how much time you’re spending not implementing your game. Tools and documentation are there to help you convey ideas and save time — not become a ball-and-chain!

Final Word:

This was an interesting read — though I had higher hopes. The cursory content on the vast number of subjects could have been thinned out and left for the reader to investigate in authoritative, dedicated books on those subjects. The core of the book was really the Ankh project and the Stripe methodology, but it was oftentimes buried under too many layers of formalization. Don’t pick up this book unless you’re willing to spend a lot of time sifting through those layers.

Reading PNG Images from Memory

As far as I can tell, most everyone is interested in reading PNG image data from a file (done internally in libpng via fread()), so there is very little out there about reading it from memory. I thought: “Wow, that’s quite unfortunate,” and figured I’d write about it after I figured it out. The article below is the result.

NOTE: the intent of this article is to help you get something up and running quickly. By design, it does not provide detailed explanations about the functions or interface of libpng — that’s what the documentation is for.

Motivation

There are several reasons why one might want to have data read from memory rather than directly from a file.

Testing – This was the first reason that came to my mind. In the Pulsar engine codebase, I want to test my format data loaders, which includes anything from textures and models to config files and scripts. More on this later.

Archives – If you pack a “file” into a custom archive, do you really want to hand a third-party library your file pointer? …

IO Performance – File IO has never been instant, and it can be worse depending upon the platform. As a result, you’re usually better off if you read in a file in its entirety as opposed to only reading a small chunk at a time, since the device may have seeked to somewhere else in between your reads.

Scenario in Pulsar

I have abstracted the operation of reading into an InputStream interface so that the source of the data (file, buffer in memory, network, or whatever) can be swapped out without affecting the behavior of the loader classes.

The function signature looks like this:

size_t InputStream::Read(byte* dest, const size_t byteCount);

You will see this used as the main data pipe in the next section.
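For reference, here is a minimal sketch of what a memory-backed implementation of this interface might look like. Only the InputStream::Read() signature above comes from Pulsar; the MemoryInputStream name and its internals are my own illustration.

#include <cstddef>   // size_t
#include <cstring>   // std::memcpy

typedef unsigned char byte;   // mirrors the engine typedef mentioned below

// Stand-in for the Pulsar InputStream interface (assumed here to expose a virtual Read()).
class InputStream
{
public:
   virtual ~InputStream() {}
   virtual size_t Read(byte* dest, const size_t byteCount) = 0;
};

// Hypothetical memory-backed stream over a buffer that was loaded up front.
class MemoryInputStream : public InputStream
{
public:
   MemoryInputStream(const byte* data, const size_t size)
      : m_data(data), m_size(size), m_offset(0) {}

   virtual size_t Read(byte* dest, const size_t byteCount)
   {
      // copy as many bytes as remain in the buffer, up to byteCount
      const size_t remaining = m_size - m_offset;
      const size_t count = (byteCount < remaining) ? byteCount : remaining;
      std::memcpy(dest, m_data + m_offset, count);
      m_offset += count;
      return count;
   }

private:
   const byte* m_data;
   size_t      m_size;
   size_t      m_offset;
};

With something like this, a PNG can be loaded (or unpacked from an archive) into a buffer up front and then handed to the loader code that follows.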

Implementation

I will assume you have already set the include and library dependencies for libpng. The libpng documentation is pretty clear about this topic, so I won’t delve into it here.

I should also note that the code provided below is practical and is taken from a module in my codebase. It includes a basic amount of error checking, but I stripped the rest out for clarity’s sake. As such, I recommend that you use this as a starting point.

You will undoubtedly have to switch out the calls to pulsar::InputStream::Read() and replace them with your own data source’s read interface.

Lastly, I have typedefed a few basic data types in my engine. For example: byte is typedefed as an unsigned char.

Setup

First, we check if the data has the PNG signature, which is held in the first 8 bytes of the image data. If libpng determines that the signature is invalid, we bail.

enum {kPngSignatureLength = 8};
byte pngSignature[kPngSignatureLength];

// call to pulsar::InputStream::Read()
// -> replace with your own data source's read interface
inputStream.Read(pngSignature, kPngSignatureLength);

if(!png_check_sig(pngSignature, kPngSignatureLength))
   return false;

Next, we need to create a pair of libpng structures: the read struct and the image info struct. For future compatibility reasons, libpng must allocate and free the memory for these structures. If either one of these calls fails, we perform the necessary cleanup and bail.

// get PNG file info struct (memory is allocated by libpng)
png_structp png_ptr = NULL;
png_ptr = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL, NULL, NULL);

if(png_ptr == NULL)
   return false;

// get PNG image data info struct (memory is allocated by libpng)
png_infop info_ptr = NULL;
info_ptr = png_create_info_struct(png_ptr);

if(info_ptr == NULL)
{
   // libpng must free file info struct memory before we bail
   png_destroy_read_struct(&png_ptr, NULL, NULL);
   return false;
}

This is where the magic happens. Now we must tell libpng to use our own custom routine for retrieving the image data. To do this, we use the png_set_read_fn() routine:

png_set_read_fn(png_ptr, &inputStream, ReadDataFromInputStream);

The first parameter is the pointer to the first struct we asked libpng to create for us: png_ptr.

The second parameter is a void pointer to the data source object. In this case, the data source is my InputStream object.

The final parameter is a pointer to a function that will handle copying the data from the data source object into a given byte buffer. Here, my reader function is called ReadDataFromInputStream(). This routine must have the following function signature:

void ReadDataFromInputStream(png_structp png_ptr, png_bytep outBytes,
   png_size_t byteCountToRead);

Now that we have told libpng where to go when it needs to read bytes, we need to implement that routine. Mine looks like this:

void ReadDataFromInputStream(png_structp png_ptr, png_bytep outBytes,
   png_size_t byteCountToRead)
{
   png_voidp io_ptr = png_get_io_ptr(png_ptr);
   if(io_ptr == NULL)
      return;   // add custom error handling here

   // using pulsar::InputStream
   // -> replace with your own data source interface
   InputStream& inputStream = *(InputStream*)io_ptr;
   const size_t bytesRead = inputStream.Read(
      (byte*)outBytes,
      (size_t)byteCountToRead);

   if((png_size_t)bytesRead != byteCountToRead)
      return;   // add custom error handling here

}  // end ReadDataFromInputStream()

The last piece of setup we need to do is tell libpng that we have already read the first 8 bytes when we checked the signature at the beginning.

// tell libpng we already read the signature
png_set_sig_bytes(png_ptr, kPngSignatureLength);

PNG Header

Now we are ready to read the PNG header info from the data stream. There is a lot of data one could extract from this, but here I’m only going to grab a few important values. Refer to the libpng documentation for more details on this function.

png_read_info(png_ptr, info_ptr);

png_uint_32 width = 0;
png_uint_32 height = 0;
int bitDepth = 0;
int colorType = -1;
png_uint_32 retval = png_get_IHDR(png_ptr, info_ptr,
   &width,
   &height,
   &bitDepth,
   &colorType,
   NULL, NULL, NULL);

if(retval != 1)
   return false;	// add error handling and cleanup

The width and height parameters are the image dimensions in pixels.

The bitDepth is the number of bits per channel. It is important to note that this is per channel, not per pixel (bits per pixel is what is normally stored in a TGA or BMP header). For example, an 8-bit RGBA image has a bitDepth of 8 but 32 bits per pixel.

The final parameter we care about is colorType, which is an integer that maps to one of the predefined values in libpng describing the image data layout (RGB, RGBA, etc.). You will need this to determine how many channels to read from the row bytes into your pixel color.
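For example, the mapping from color type to channel count can be a simple switch. This helper is a sketch of my own (not part of the loader above), and libpng can also report the value directly via png_get_channels(png_ptr, info_ptr).

// Sketch: number of 8-bit channels per pixel for the color types handled here.
static int ChannelCountFromColorType(const int colorType)
{
   switch(colorType)
   {
      case PNG_COLOR_TYPE_GRAY:       return 1;
      case PNG_COLOR_TYPE_GRAY_ALPHA: return 2;
      case PNG_COLOR_TYPE_RGB:        return 3;
      case PNG_COLOR_TYPE_RGB_ALPHA:  return 4;
      default:                        return 0;   // palette, etc. need extra handling
   }
}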

PNG Image Data

We have reached the point where we can read the image data into our internal image pixels.

outImage.Init(width, height);

switch(colorType)
{
   case PNG_COLOR_TYPE_RGB:
      ParseRGB(outImage, png_ptr, info_ptr);
      break;

   case PNG_COLOR_TYPE_RGB_ALPHA:
      ParseRGBA(outImage, png_ptr, info_ptr);
      break;

   default:
      PULSAR_ASSERT_MSG(false, "Invalid PNG ColorType enum value given.\n");
      png_destroy_read_struct(&png_ptr, &info_ptr, NULL);
      return false;
}

First, I initialize my internal image format, which allocates the appropriate memory for the pixels — each of which is a Color32 (red,green,blue,alpha).

The switch is to differentiate my parsing routines (one of which is listed below) based on the colorType value we retrieved from the header.

To parse the data into pixel colors all we have to do is grab each row from the image and proceed to read each tuple of bytes that represents a pixel. The number of bytes we read is dependent upon the data stored, which is why we differentiate between parsing a 24-bit (RGB) color value and a 32-bit (RGBA) color value via colorType.

I have provided the ParseRGBA() routine below; ParseRGB() is very similar, except that no byte is read for the alpha channel (a sketch of it follows after ParseRGBA()).

void ParseRGBA(Image& outImage, const png_structp& png_ptr,
   const png_infop& info_ptr)
{
   const u32 width = outImage.GetWidth();
   const u32 height = outImage.GetHeight();

   const png_uint_32 bytesPerRow = png_get_rowbytes(png_ptr, info_ptr);
   byte* rowData = new byte[bytesPerRow];

   // read single row at a time
   for(u32 rowIdx = 0; rowIdx < height; ++rowIdx)
   {
      png_read_row(png_ptr, (png_bytep)rowData, NULL);

      const u32 rowOffset = rowIdx * width;

      u32 byteIndex = 0;
      for(u32 colIdx = 0; colIdx < width; ++colIdx)
      {
         const u8 red   = rowData[byteIndex++];
         const u8 green = rowData[byteIndex++];
         const u8 blue  = rowData[byteIndex++];
         const u8 alpha = rowData[byteIndex++];

         const u32 targetPixelIndex = rowOffset + colIdx;
         outImage.SetPixel(targetPixelIndex, Color32(red, green, blue, alpha));
      }
      PULSAR_ASSERT(byteIndex == bytesPerRow);
   }

   delete [] rowData;

}  // end ParseRGBA()
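For completeness, here is what that ParseRGB() variant could look like. It mirrors ParseRGBA() above; the only assumption I am adding is treating every RGB pixel as fully opaque (an alpha of 255) when constructing the Color32.

void ParseRGB(Image& outImage, const png_structp& png_ptr,
   const png_infop& info_ptr)
{
   const u32 width = outImage.GetWidth();
   const u32 height = outImage.GetHeight();

   const png_uint_32 bytesPerRow = png_get_rowbytes(png_ptr, info_ptr);
   byte* rowData = new byte[bytesPerRow];

   // read single row at a time
   for(u32 rowIdx = 0; rowIdx < height; ++rowIdx)
   {
      png_read_row(png_ptr, (png_bytep)rowData, NULL);

      const u32 rowOffset = rowIdx * width;

      u32 byteIndex = 0;
      for(u32 colIdx = 0; colIdx < width; ++colIdx)
      {
         const u8 red   = rowData[byteIndex++];
         const u8 green = rowData[byteIndex++];
         const u8 blue  = rowData[byteIndex++];

         // assumption: RGB pixels are treated as fully opaque
         const u8 alpha = 255;

         const u32 targetPixelIndex = rowOffset + colIdx;
         outImage.SetPixel(targetPixelIndex, Color32(red, green, blue, alpha));
      }
      PULSAR_ASSERT(byteIndex == bytesPerRow);
   }

   delete [] rowData;

}  // end ParseRGB()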

Cleanup

Finally, now that the image data has been read, we should clean up the libpng structures; then we can be on our merry way!

png_destroy_read_struct(&png_ptr, &info_ptr, NULL);

That's it! Now we can go do whatever we need to with the image data we just read in!

Conclusion

Here I've only touched on the basics. Information on this seemingly simple task was practically nowhere to be found when I searched for it, so I ended up having to piece it together myself. Hopefully this will save someone else some time.

Resources

Image Library Resources

I have compiled a listing of resources I used while implementing my Image library. This will be updated as things progress.

TARGA File Format:

Extension: tga
Endian: Little Endian

Resources: