The training set contains 1M scenes with up to three objects. We also provide ~1K test examples for the following variants:

2.1 Empty room: scenes consist of the sky, walls, and floor only.
2.2 Six ...