We aim to use the high-quality RGB-D images obtained by Kinect cameras to improve the state of the art in semantic labelling of indoor office spaces. To get started, we collected Kinect videos of 11 office scenes, stitched them into large point-cloud models, and labelled them manually. We trained and tested a very simple linear model, which achieves 65% accuracy in labelling segments into 6 categories. In addition, we formulated two MRF models that use context to infer labels, and worked out their learning and inference mechanisms.