So, Turing had a birthday and Galileo recanted, the recent nearby asteroid is actually 1km in size and will cause some "issues" in 750 years and Monsters was a pretty good alien film.
Using some ideas from http://www.andol.info/hci/1661.htm as a base for doing defect detection in OpenCV, I combined some recent developments together. A SURF system looking at a hand image was discarded for initial speed issues. Sure, it can find the parts that match, but I don't really need to know that I'm looking at a hand anymore; I've pretty much assured that.
Walking the +-x and +y directions got me my original (too) tight bounding box, so I decided to start walking the depth pixels +-y as well to get decent coverage of what I should be seeing. Throwing in a +- depth check to get pixels that probably belong to the hand, making a rectangle and padding that by 30 pixels in all directions gave me a really good box at most distances from the Kinect.
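As a rough illustration of the walk described above, here is a minimal Python sketch. It is not the original code, which isn't shown. It assumes the depth frame is a list of rows of millimetre values with 0 meaning "no reading"; the name `hand_bounding_box`, the 60 mm tolerance and the straight-line walk along each axis from the seed pixel are all assumptions made for the example. Only the 30 pixel padding comes from the post.

```python
def hand_bounding_box(depth, seed_x, seed_y, tolerance=60, padding=30):
    """Walk outward from a seed pixel and return a padded
    (left, top, right, bottom) box, clamped to the frame.

    depth: list of rows of depth values in millimetres (0 = no reading).
    seed_x, seed_y: pixel of the skeleton hand point (assumed to be on the hand).
    """
    height, width = len(depth), len(depth[0])
    seed_depth = depth[seed_y][seed_x]

    def in_band(x, y):
        # The +- depth check: keep pixels close to the seed's depth.
        value = depth[y][x]
        return value != 0 and abs(value - seed_depth) <= tolerance

    def walk(dx, dy):
        # Step one pixel at a time until the depth leaves the allowed band.
        x, y = seed_x, seed_y
        while (0 <= x + dx < width and 0 <= y + dy < height
               and in_band(x + dx, y + dy)):
            x, y = x + dx, y + dy
        return x, y

    left = walk(-1, 0)[0]
    right = walk(1, 0)[0]
    top = walk(0, -1)[1]
    bottom = walk(0, 1)[1]

    # Pad in all directions, then clamp to the frame edges.
    return (max(left - padding, 0), max(top - padding, 0),
            min(right + padding, width - 1), min(bottom + padding, height - 1))
```

Calling `hand_bounding_box(frame, hand_x, hand_y)` with the hand joint mapped into depth-frame coordinates returns the region of interest to crop; `frame`, `hand_x` and `hand_y` are placeholder names.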
This suffers more from stray walking pixels than the previous method: slapping your hand on your chest will pick up most of your body. A good friend gave me an alternate idea, using a circle system, that I can try if this does prove troublesome.
Anyway, I now have a region of interest that I can do some things with: contours and defects. Taking a straight bmp from the coding4fun tobitmap output really wasn't finding what I wanted it to. Turn on drawing of all contours and, at certain distances, curve your fingers towards the Kinect: they'll switch from bright white through to black/dark, and then the contour will show.
Pop in a ThresholdBinary and we can work on the biggest polygon contour, nab the convex hull and its defects. With defects and their start, depth and end points, we have our fingers.
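To make the hull-and-defects step a little more concrete, here is a small pure-Python sketch of the idea. It is an illustration only and does not use the OpenCV calls the post relies on: `threshold_binary`, `convex_hull` and `convexity_defects` are names made up for this example, the hull uses Andrew's monotone chain, and the contour is assumed to be an ordered list of distinct (x, y) points around the shape.

```python
import math


def threshold_binary(gray, threshold, max_value=255):
    """Binary threshold: pixels above the threshold become max_value, the rest 0."""
    return [[max_value if p > threshold else 0 for p in row] for row in gray]


def cross(o, a, b):
    """Z component of (a - o) x (b - o); the sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain. Collinear points are dropped from the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity_defects(contour, min_depth=1.0):
    """Return (start, deepest, end, depth) for each dent between hull points.

    start and end are consecutive hull vertices in contour order; deepest is
    the contour point between them that lies farthest from the start-end line.
    """
    hull = set(convex_hull(contour))
    hull_idx = [i for i, p in enumerate(contour) if p in hull]
    n = len(contour)
    defects = []
    for k, i in enumerate(hull_idx):
        j = hull_idx[(k + 1) % len(hull_idx)]
        start, end = contour[i], contour[j]
        length = math.hypot(end[0] - start[0], end[1] - start[1])
        if length == 0:
            continue
        best, best_depth = None, 0.0
        idx = (i + 1) % n
        while idx != j:
            p = contour[idx]
            depth = abs(cross(start, end, p)) / length  # point-to-line distance
            if depth > best_depth:
                best, best_depth = p, depth
            idx = (idx + 1) % n
        if best is not None and best_depth >= min_depth:
            defects.append((start, best, end, best_depth))
    return defects
```

For a hand, each sufficiently deep defect would sit in the gap between two fingers, with its start and end points near the neighbouring fingertips. The `min_depth` cut-off is a guess and would need tuning against real frames.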
Since I know which hand I'm working on, as it's based on a skeleton point, my next step is to look at the new rotational information the 1.5 SDK is supposed to have. Get orientation and I can start looking at thumb distances and crafting a virtual hand. I think identifying and numbering the fingers could be tricky, but measuring distance gaps (taking into account depth), providing a "Hold your hand up for calibration" step and storing that info per recognised person (why yes, yes I will investigate voice and facial identification) should make it low maintenance.
I know I'm missing more of what I've been up to, but most of what I've been up to is crawling through fictional towns to avoid zombies while hunting for beans and ammo in DayZ.
Sunday, June 24, 2012
Sunday, June 3, 2012
KinectUI, oops
Tell a lie, I actually have done some work with Kinect since 1.5. I came to the conclusion while doing bounding box area changes that actually grabbing the frame or UI element in a general Win7 desktop isn't a good idea (probably why Win8 touts Metro and touch/gesture so much).
Using skeleton wrist positions, a custom winform, some buttons and position notification objects I threw together a sample for testing. It runs off a main hand and a secondary hand for manipulation. Some words follow:
1: Mousing over a UI element with either hand causes it to change visuals as an indication.
2: Hovering the main hand over an element for X seconds causes it to be "picked up" and centered on the main hand; it can then be dragged about the usable area.
3: Sharp movement of the main hand breaks the pick up calculation and releases the UI element.
4: With a picked up element, moving the second hand into the area causes it to enter a resize mode. Main hand left/right changes width, secondary hand up/down changes height.
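The hover, pick up and release rules above can be sketched as a small state machine. This is a hypothetical Python illustration rather than the WinForms code from the test app: the class name `HoverPickup`, the 1.5 second dwell and the 800 pixels per second release speed are all assumed values, and resize mode is left out.

```python
import math


class HoverPickup:
    """Tracks one UI element through 'idle', 'hovering' and 'held' states."""

    def __init__(self, dwell_seconds=1.5, release_speed=800.0):
        self.dwell_seconds = dwell_seconds    # the "X seconds" hover time
        self.release_speed = release_speed    # pixels/second that counts as sharp
        self.state = "idle"
        self._hover_started = None
        self._last = None                     # (x, y, timestamp) of previous frame

    def update(self, x, y, timestamp, over_element):
        """Feed one main-hand position per skeleton frame; returns the state."""
        speed = 0.0
        if self._last is not None:
            last_x, last_y, last_t = self._last
            dt = timestamp - last_t
            if dt > 0:
                speed = math.hypot(x - last_x, y - last_y) / dt
        self._last = (x, y, timestamp)

        if self.state == "held":
            # A held element follows the hand until a sharp movement drops it.
            if speed > self.release_speed:
                self.state = "idle"
                self._hover_started = None
            return self.state

        if not over_element:
            self.state = "idle"
            self._hover_started = None
            return self.state

        if self._hover_started is None:
            self._hover_started = timestamp
        dwelled = timestamp - self._hover_started >= self.dwell_seconds
        self.state = "held" if dwelled else "hovering"
        return self.state
```

Measuring speed against elapsed time rather than per-frame distance is a design guess here: it should make the release threshold less sensitive to dropped or uneven frames, though that has not been tested against a real sensor.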
It's just a test setup but it has already pointed me to areas that need thought, e.g. you end up crossing hands too much during re-sizing, "dropping" controls is just a side effect of framerates and WinForms aren't really a great way to go here for the UI. Something in OpenTK might be the better target, but that raises a fun issue: if this is to be a presentation system, do I end up writing my own readers for document formats etc.? Maybe I'll have to look at WPF eventually.
And I'm still kicking around ideas to manage Z axis (where a "proper" 3D environment for the UI would make it excellent).
Microsoft.Speech, getting voices in your head
A quick glance through the voice samples of the Kinect shows the Microsoft.Speech namespace. Great, get it working and test a few things out. Fire up some documentation on how to use the Synthesizer, set to default audio and crash. Quick further check, apparently you should be dumping the stream to a SoundPlayer and playing that way. Zero size stream, hrmm.
Turns out all you really need is one of these http://www.microsoft.com/en-us/download/details.aspx?id=27224
Using System.Speech you'll get Microsoft Anna when running GetInstalledVoices(), but nothing when using Microsoft.Speech. Yup, you're missing a voice.
Some of them are a bit hit and miss. Anna isn't very fluid at all, and en-GB_Hazel can't say my wife's name, yet en-US_ZiraPro can. Oh well. I now know how to do both direct audio output and output to a stream for sending somewhere, such as a notification via a custom Lync client.
Speech recognition itself with the grammar builder file is quite nice. I can see myself spending some time building a query/response/action library for automating small activities such as queuing up the latest episode of something (why yes, I mean on YouTube, of course), checking for DayZ patches and downloading/extracting them to a location, or looking at my RSS lists, bringing new items to my attention and reading them out. Having a new small human means this type of thing would really help (and can leverage my non-iDevice environment and actual computing power with ease of customisation).
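One possible shape for such a query/response/action library, sketched in Python purely for illustration: the `CommandLibrary` class and its methods are hypothetical and nothing here touches Microsoft.Speech. The idea is that the registered phrases would be handed to the recogniser as its choices, and whatever text it recognises gets passed back to `handle`.

```python
class CommandLibrary:
    """Maps recognised phrases to an action callback and a spoken response."""

    def __init__(self):
        self._commands = {}

    @staticmethod
    def _normalise(phrase):
        # Compare a canonical form so casing and spacing don't matter.
        return " ".join(phrase.lower().split())

    def register(self, phrase, action, response):
        """Associate a phrase with a callable and the text to speak afterwards."""
        self._commands[self._normalise(phrase)] = (action, response)

    def phrases(self):
        """Phrases to offer the recogniser as its list of choices."""
        return sorted(self._commands)

    def handle(self, recognised_text):
        """Run the matching action and return the response to speak, or None."""
        entry = self._commands.get(self._normalise(recognised_text))
        if entry is None:
            return None
        action, response = entry
        action()
        return response
```

The returned response string is what would then be sent to the synthesizer, either straight to the audio device or to a stream.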
No further work done on any Kinect projects though. The Lync project I have is almost at a decent beta phase; all the questions from my testing earlier this week have answers now, thanks to a few chats with two very smart chaps at work. The main part left is a decent configuration system and wiring the client to look the settings up.
Sunday, May 27, 2012
Kinect SDK 1.5
Well it's out, been out for a day or two now with some great new additions such as the face recognition and near mode skeletal tracking. I've not had any time to really play with it and my ideas but I did discover something daft that I should have realised earlier.
Looking at the green screen sample my brain went "hold on, how is it showing the parts that make up the person instead of just the skeleton?" and dove into the code.
Yes, depth data has all the parts of a person. The original SDK had this as well, in the colouring of the player's depth data. Why didn't I realise this?
So my trials of bounding boxes, while not a total loss, were based on the assumption I'd have to find all the depth points that make up a person myself.
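For reference, this is roughly what reading that per-pixel player information looks like. In the 1.x SDK, when skeleton tracking is enabled, each 16-bit depth pixel carries a player index in its low 3 bits and the depth in millimetres in the bits above. The Python below is a sketch under that assumption, working on a flat list of raw values; `unpack_depth_pixel` and `player_bounds` are names invented for the example.

```python
PLAYER_INDEX_BITS = 3
PLAYER_INDEX_MASK = (1 << PLAYER_INDEX_BITS) - 1  # 0b111


def unpack_depth_pixel(raw):
    """Split a raw 16-bit depth pixel into (depth_mm, player_index)."""
    return raw >> PLAYER_INDEX_BITS, raw & PLAYER_INDEX_MASK


def player_bounds(raw_frame, width, player):
    """Return (left, top, right, bottom) of the pixels tagged with the given
    player index, or None if that player is not in the frame.

    raw_frame: flat, row-major list of raw depth pixels.
    """
    xs, ys = [], []
    for i, raw in enumerate(raw_frame):
        if raw & PLAYER_INDEX_MASK == player:
            xs.append(i % width)
            ys.append(i // width)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

A player index of 0 means the pixel belongs to nobody, so a whole-person bounding box falls out of a single pass over the frame with no pixel walking at all.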
MS also included an excellent PDF link called Human_Interface_Guidelines_v1.5.0. It covers a lot of points from hardware spec, UI design, usability and essentially says "Hey, we did a whole bunch of research with a whole bunch of people, these are the things you should know" which is really nice of MSR (one of my favourite groups out there).
On the Lync front, I've completed the code for my latest project barring some EWS calls. The ideas surrounding it are simple enough but should hopefully revolutionise how we collaborate and "use space" once I get some Kinect integrated. Small steps (and lots of testing).
Thursday, May 17, 2012
UCMA 3.0 and presence samples
Following samples is a good way to start things, unless they just don't work for your configured environment. I recently had issues connecting an application endpoint to our Lync 2010 setup; these started with a simple timeout from the server.
Checking a bit more (having confirmed the setup of the trusted endpoint) showed that, unlike the samples I'd found, using the proxy settings for the ApplicationEndpointSettings was a bad thing.
With that out the way, the next hurdle became a mutual TLS exception when using the provisioned settings. All signs point to "you messed up the cert somewhere", but quickly switching over to ServerPlatformSettings and applying the provided cert got the system up and running.
Finally, tying up a persistent RemotePresenceView and an event for PresenceNotificationReceived (Task 3) was the annoying part. State change events logged a repeated Subscribing, WaitingForRetry. The notification event never even got to fire and the Lync contact for the endpoint never registered as Available.
As a last shot before trying to start from scratch again, I went back to the start and changed the ApplicationEndpoint to a UserEndpoint, ran it and everything worked. The Lync client showed my endpoint as being available and my subscription to a contact's presence fired at the right times.
I think the issues I had stemmed from a few possible things:
1: Cert issues - did it get properly set on the Lync server? If this was the issue then not using ServerPlatformSettings may have helped resolve 2.
2: I didn't get the Lync server restarted (someone on the MSDN forums had similar issues with an application endpoint not firing events using ServerPlatformSettings, this was their fix)
And finally was it a lack of understanding the workings of Lync in general on my part?
It's probable, but at the same time, samples are there to walk you through a simple process.
Is our environment odd? I know there are some oddities that have been added to make it work with the bastard nightmare child of a phone system we like to call our "primary" phone system, but these don't touch the standard Lync setup I'm talking to.
At least I now have a serviceable framework for tracking presence that I can build a reporting system on.
Wednesday, May 16, 2012
Kinect for Windows
Well, I have run through a number of trials and tests using OpenCV and a webcam, looking at my own implementation of the sixthsense style of interaction. While the ways of implementing it are there, I sat down with a co-worker one day and realised: why not buy a Kinect? It has less of the over-the-shoulder portability of a sixthsense system but more presentation/presence potential, and MS is about to release 1.5 of the SDK.
The ideas I've come up with are interesting, especially since Kinect for Windows is just a bunch of smarts in a box with a camera, IR sensor and IR laser emitter.
Milestone 1 became:
"While the skeleton points are provided easily, I need to locate the object at the point"
After a nice side track off onto depth camera smoothing (a limited implementation of this project which I kind of deduced myself but it's a nice link regardless) and change only depth frame display, I ended up with a quick and dirty depth and skeleton point based bounding box.
Walking points up, left and right, with a quick guess at the range of depths to include (to allow for precision problems, angles etc.), gave me a nice framed area for further depth and image mapping.
A happy coincidence was that, when tracking hands, the open palm has a larger area automatically than a closed or half closed hand due to those parts falling outside the allowed +- Z axis. (Here I wrote "Handy!" then slapped myself for the bad pun).
So, Milestone 1 is completed. It's not great but I can make it better and it's required for two of the ideas I'm going to attempt to implement.
Next up, Milestone 2, depth map storing and comparison along with some image recognition.
I'll probably throw my "fun with Lync 2010" things on here as well since that's been driving me up the wall recently.
Labels:
bounding box,
depth,
kinect,
milestone1,
milestone2,
opencv