Sunday, June 24, 2012

Thumbs up while walking

So, Turing had a birthday, Galileo recanted, the recent nearby asteroid is actually 1km in size and will cause some "issues" in 750 years, and Monsters was a pretty good alien film.

Using some ideas from http://www.andol.info/hci/1661.htm as a base for doing defect detection in OpenCV, I combined a few recent developments. Discarded for initial speed issues was a SURF system looking at a hand image. Sure, it can find the parts that match, but I don't really need to know that I'm looking at a hand anymore; I've pretty much assured that already.

Walking the ±x and +y directions got me my original (too) tight bounding box, so I decided to start walking the depth pixels ±y as well to get decent coverage of what I should be seeing. Throwing in a ± depth check to keep pixels probably belonging to the hand, then making a rectangle and padding it by 30 pixels in all directions, gave me a really good box at most distances from the Kinect.
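Roughly, the walk looks something like this (just a quick plain C# sketch, not my actual code; the depth array layout and the tolerance value are placeholders, the 30 pixel pad is the one mentioned above):

using System;
using System.Drawing;

static class HandBox
{
    // Walk out from the hand's depth pixel in ±x and ±y, keeping pixels whose
    // depth is within `tolerance` mm of the seed depth, then pad the box.
    // depthMm is assumed to be a row-major array of depth values in millimetres.
    public static Rectangle Find(short[] depthMm, int width, int height,
                                 Point seed, int tolerance = 80, int pad = 30)
    {
        int seedDepth = depthMm[seed.Y * width + seed.X];
        Func<int, int, bool> onHand = (x, y) =>
            x >= 0 && x < width && y >= 0 && y < height &&
            Math.Abs(depthMm[y * width + x] - seedDepth) <= tolerance;

        int minX = seed.X, maxX = seed.X, minY = seed.Y, maxY = seed.Y;

        // Walk left/right along the seed row.
        while (onHand(minX - 1, seed.Y)) minX--;
        while (onHand(maxX + 1, seed.Y)) maxX++;

        // Walk up/down each column inside the horizontal span.
        for (int x = minX; x <= maxX; x++)
        {
            int y = seed.Y;
            while (onHand(x, y - 1)) y--;
            minY = Math.Min(minY, y);
            y = seed.Y;
            while (onHand(x, y + 1)) y++;
            maxY = Math.Max(maxY, y);
        }

        // Pad and clamp to the frame.
        var box = Rectangle.FromLTRB(minX - pad, minY - pad, maxX + pad, maxY + pad);
        return Rectangle.Intersect(box, new Rectangle(0, 0, width, height));
    }
}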

This suffers more from the walk wandering off the hand than the previous method did: slap your hand on your chest and it will pick up most of your body. A good friend has given me an alternate idea using a circle system if this does prove troublesome.

Anyway. I now have a region of interest that I can do some things with: contours and defects. Taking a straight bitmap from the Coding4Fun ToBitmap output really wasn't finding what I wanted it to. Turn on drawing of all contours and, at certain distances, curve your fingers towards the Kinect: they'll switch from bright white through to black/dark, and then the contour will show.

Pop in a ThresholdBinary and we can work on the biggest polygon contour, nab the convex hull and its defects. With defects and their start, depth and end points, we have our fingers.
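In Emgu CV terms it's roughly this shape (a sketch from memory against the 2.x API; method names like GetConvexityDefacts and the threshold value may differ for your version and lighting):

using System.Drawing;
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Structure;

static class FingerFinder
{
    // roi is the grey image cropped to the padded hand box from earlier.
    public static void Find(Image<Gray, byte> roi)
    {
        // Threshold so the hand becomes a white blob on black.
        Image<Gray, byte> mask = roi.ThresholdBinary(new Gray(100), new Gray(255));

        using (MemStorage storage = new MemStorage())
        {
            // Walk the contour list and keep the biggest one, assumed to be the hand.
            Contour<Point> biggest = null;
            for (Contour<Point> c = mask.FindContours(
                     CHAIN_APPROX_METHOD.CV_CHAIN_APPROX_SIMPLE,
                     RETR_TYPE.CV_RETR_EXTERNAL, storage);
                 c != null; c = c.HNext)
            {
                if (biggest == null || c.Area > biggest.Area) biggest = c;
            }
            if (biggest == null) return;

            // Convex hull plus its defects: the start/end points land near fingertips,
            // the depth point near the web between two fingers.
            Seq<Point> hull = biggest.GetConvexHull(ORIENTATION.CV_CLOCKWISE);
            MCvConvexityDefect[] defects =
                biggest.GetConvexityDefacts(storage, ORIENTATION.CV_CLOCKWISE).ToArray();

            foreach (MCvConvexityDefect d in defects)
            {
                Point start = d.StartPoint, end = d.EndPoint, valley = d.DepthPoint;
                // filter by defect depth/angle here to decide what counts as a finger
            }
        }
    }
}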

Since I know which hand I'm working on, as it's based on a skeleton point, my next step is to look at the new rotational information the 1.5 SDK is supposed to have. Get orientation and I can start looking at thumb distances and crafting a virtual hand. I think numbering the fingers could be tricky, but measuring distance gaps (taking depth into account), providing a "hold your hand up for calibration" step and storing that info per recognised person (why yes, yes I will investigate voice and facial identification) should make it low maintenance.
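For the gap measuring, something this simple would be the starting point (a rough sketch only; the pixel-to-millimetre scale is a stand-in for doing the depth-to-world mapping properly):

using System;
using System.Drawing;

static class FingerGaps
{
    // Rough 3D gap between two fingertip pixels, treating depth (mm) as the Z axis.
    // mmPerPixel is a placeholder for a proper per-depth conversion.
    public static double GapMm(Point a, int aDepthMm, Point b, int bDepthMm, double mmPerPixel)
    {
        double dx = (a.X - b.X) * mmPerPixel;
        double dy = (a.Y - b.Y) * mmPerPixel;
        double dz = aDepthMm - bDepthMm;
        return Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
}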

I know I'm missing more of what I've been up to, but most of it has been crawling through fictional towns to avoid zombies while hunting for beans and ammo in DayZ.

Sunday, June 3, 2012

KinectUI, oops

Tell a lie, I actually have done some work with the Kinect since 1.5. I came to the conclusion, while doing bounding box area changes, that grabbing the frame or UI element on a general Win7 desktop isn't a good idea (probably why Win8 touts Metro and touch/gesture so much).

Using skeleton wrist positions, a custom WinForm, some buttons and position notification objects, I threw together a sample for testing. It runs off a main hand and a secondary hand for manipulation. Some words follow:
1: Either hand mousing over a UI element causes it to change visuals as an indication.
2: The main hand hovering over an element for X seconds causes it to be "picked up" and centered to the main hand; it can then be dragged about the usable area (roughly sketched below).
3: Sharp movement of the main hand breaks the pick-up calculation and releases the UI element.
4: With a picked up element, moving the second hand into the area causes it to enter a resize mode. Main hand left/right changes width, secondary up/down changes height.
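A rough sketch of the pick-up/release part of that (hypothetical code, not the actual test app; the hover time and "sharp movement" speed are made-up numbers):

using System;
using System.Drawing;

// Hypothetical sketch of rules 2 and 3 above.
class PickUpTracker
{
    private readonly TimeSpan hoverTime = TimeSpan.FromSeconds(2); // the "X seconds"
    private const float BreakSpeed = 800f; // pixels/sec treated as a "sharp movement"

    private object hoveredControl;
    private DateTime hoverStart;
    private PointF lastHand;
    private DateTime lastUpdate;

    public object HeldControl { get; private set; }

    // Call once per skeleton frame with the main hand in screen coordinates
    // and whatever control (if any) that point currently sits over.
    public void Update(PointF mainHand, object controlUnderHand, DateTime now)
    {
        float dt = (float)(now - lastUpdate).TotalSeconds;
        float dx = mainHand.X - lastHand.X;
        float dy = mainHand.Y - lastHand.Y;
        float speed = dt > 0 ? (float)Math.Sqrt(dx * dx + dy * dy) / dt : 0f;
        lastHand = mainHand;
        lastUpdate = now;

        if (HeldControl != null)
        {
            // Rule 3: a sharp movement drops whatever is held.
            if (speed > BreakSpeed) HeldControl = null;
            return; // while held, the drag/centre/resize handling happens elsewhere
        }

        // Rule 2: hovering over the same control long enough picks it up.
        if (controlUnderHand == null || !ReferenceEquals(controlUnderHand, hoveredControl))
        {
            hoveredControl = controlUnderHand;
            hoverStart = now;
            return;
        }
        if (now - hoverStart >= hoverTime) HeldControl = controlUnderHand;
    }
}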

It's just a test setup, but it has already pointed me to areas that need thought, e.g. you end up crossing hands too much during resizing, "dropping" controls is just a side effect of framerates, and WinForms aren't really a great way to go here for the UI. Something in OpenTK might be the better target, but that brings a fun issue: if this is to be a presentation system, I'd end up writing my own readers for document formats etc. Maybe I'll have to look at WPF eventually.

And I'm still kicking around ideas to manage the Z axis (where a "proper" 3D environment for the UI would make it excellent).

Microsoft.Speech, getting voices in your head

A quick glance through the Kinect voice samples shows the Microsoft.Speech namespace. Great, get it working and test a few things out. Fire up some documentation on how to use the Synthesizer, set output to default audio and... crash. A quick further check: apparently you should be dumping the output stream to a SoundPlayer and playing it that way. Zero-size stream, hrmm.

Turns out all you really need is one of these http://www.microsoft.com/en-us/download/details.aspx?id=27224

Using System.Speech you'll get Microsoft Anna when running GetInstalledVoices(), but nothing when using Microsoft.Speech. Yup, you're missing a voice.

Some of them are a bit hit and miss. Anna isn't very fluid at all, en-GB_Hazel can't say my wife's name yet en-US_ZiraPro can. Oh well. I now know how to do both direct audio output and output to a stream for sending somewhere, such as a notification via a custom Lync client.
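For reference, the shape of it with Microsoft.Speech (assuming the runtime and the ZiraPro voice from the link above are installed; the exact voice name string depends on what you've actually got):

using System;
using System.IO;
using System.Media;
using Microsoft.Speech.Synthesis;

class TalkDemo
{
    static void Main()
    {
        var synth = new SpeechSynthesizer();

        // See what the Speech Platform runtime actually has installed.
        foreach (InstalledVoice v in synth.GetInstalledVoices())
            Console.WriteLine(v.VoiceInfo.Name);

        // Pick one; the name here matches the ZiraPro package but is an example.
        synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");

        // Direct to the default audio device...
        synth.SetOutputToDefaultAudioDevice();
        synth.Speak("Hello from the speech platform.");

        // ...or into a wave stream for sending somewhere else (Lync, a file, whatever).
        var buffer = new MemoryStream();
        synth.SetOutputToWaveStream(buffer);
        synth.Speak("This one goes to a stream.");
        buffer.Position = 0;
        new SoundPlayer(buffer).PlaySync();
    }
}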

Speech recognition itself with the grammar builder file is quite nice. I can see myself spending some time building a query/response/action library for automating small activities such as queuing up the latest episode of something (why yes, I mean on YouTube, of course), checking for DayZ patches and downloading/extracting them to a location, or looking at my RSS lists, bringing new items to my attention and reading them out. Having a new small human means this type of thing would really help (and it can leverage my non-iDevice environment and actual computing power, with ease of customisation).
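The grammar side is only a few lines (a sketch with example phrases; a real setup would point the input at the Kinect audio stream and build the grammar from a file rather than hard-coded choices):

using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class ListenDemo
{
    static void Main()
    {
        var culture = new CultureInfo("en-US");
        var engine = new SpeechRecognitionEngine(culture);

        // Example commands only, standing in for the query/response/action library.
        var commands = new Choices("check dayz patches", "read my feeds", "queue latest episode");
        var builder = new GrammarBuilder { Culture = culture };
        builder.Append(commands);
        engine.LoadGrammar(new Grammar(builder));

        engine.SpeechRecognized += (s, e) =>
            Console.WriteLine("Heard '{0}' ({1:0.00})", e.Result.Text, e.Result.Confidence);

        engine.SetInputToDefaultAudioDevice();
        engine.RecognizeAsync(RecognizeMode.Multiple);
        Console.ReadLine(); // keep listening until Enter
    }
}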

No further work done on any other Kinect projects though. The Lync project I have is almost at a decent beta phase; all the questions from my testing earlier this week have answers now, thanks to a few chats with two very smart chaps at work. The main part left is a decent configuration system and wiring the client up to look them all up.