craked-code/MSD_implementation

MSD_implementation

Floor Plan Generation on MSD — Graph-Informed U-Net (UN)

A Google Colab implementation of the Graph-Informed U-Net (UN) baseline for floor plan generation, based on the paper "MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes" (van Engelenburg et al., arXiv:2407.10121).

Overview

This project implements the UN (Graph-Informed U-Net) baseline described in the MSD paper. The model takes a building structure image and a zoning graph as input, and predicts a segmented floor plan where each pixel is assigned a room type.

The MSD (Modified Swiss Dwellings) dataset contains over 5,300 floor plans of multi-apartment building complexes — significantly more complex than datasets like RPLAN or LIFULL.

Status: Active implementation — baseline model running, diffusion component in progress.

What's Implemented

1. Data Loading

  • Downloads the MSD dataset from Kaggle using the Kaggle API
  • Loads building structure images, zoning graphs (stored as NetworkX graphs), and ground truth floor plan images
  • Custom collate_fn to handle batches of variable-size graphs alongside image tensors
  • Ground truth floor plans are rasterized into per-pixel class label maps during loading
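The batching step above can be sketched as follows. This is a minimal illustration, not the repository's actual `collate_fn`: it assumes each NetworkX graph has already been converted to a node-feature tensor and an edge list, and the dictionary keys (`image`, `label`, `x`, `edge_index`) are hypothetical names chosen for the example. Graphs of different sizes are merged into one disjoint graph by offsetting node indices, with a `batch` vector recording which graph each node belongs to.

```python
import torch

def collate_fn(samples):
    """Batch samples whose graphs may have different node counts.

    Each sample is assumed (hypothetically) to be a dict with:
      "image":      float tensor (3, H, W)
      "label":      long tensor  (H, W) of per-pixel class indices
      "x":          float tensor (N_i, F) of node features
      "edge_index": long tensor  (2, E_i) of node index pairs
    """
    # Images and label maps share a fixed size, so they can be stacked directly.
    images = torch.stack([s["image"] for s in samples])
    labels = torch.stack([s["label"] for s in samples])

    xs, edges, batch_vec = [], [], []
    offset = 0
    for i, s in enumerate(samples):
        n = s["x"].shape[0]
        xs.append(s["x"])
        # Shift node ids so they stay unique inside the merged graph.
        edges.append(s["edge_index"] + offset)
        # Remember which graph every node came from.
        batch_vec.append(torch.full((n,), i, dtype=torch.long))
        offset += n

    return {
        "image": images,
        "label": labels,
        "x": torch.cat(xs, dim=0),
        "edge_index": torch.cat(edges, dim=1),
        "batch": torch.cat(batch_vec),
    }
```

A function of this shape can be passed to `torch.utils.data.DataLoader` through its `collate_fn` argument.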

2. U-Net Encoder

Four convolutional encoder blocks that progressively downsample the input image:

  • Each block applies a 3×3 convolution, batch norm, ReLU, and 2×2 max-pooling
  • Channel dimensions follow 3 → 64 → 128 → 256 → 512: the first block expands the 3 input channels to 64, and each later block doubles the channel count
  • Skip connections are saved at each encoder stage for use in the decoder
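A rough PyTorch sketch of the encoder described above is shown here. The class names `EncoderBlock` and `Encoder` are illustrative, and `padding=1` is an assumption made so that only the max-pooling changes the spatial size; the repository's code may differ in both respects.

```python
import torch
from torch import nn

class EncoderBlock(nn.Module):
    """3x3 conv -> batch norm -> ReLU, then 2x2 max-pooling."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),  # padding=1 is assumed
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skip = self.conv(x)            # pre-pooling features, kept for the decoder
        return self.pool(skip), skip

class Encoder(nn.Module):
    def __init__(self, channels=(3, 64, 128, 256, 512)):
        super().__init__()
        self.blocks = nn.ModuleList(
            [EncoderBlock(c_in, c_out) for c_in, c_out in zip(channels[:-1], channels[1:])]
        )

    def forward(self, x):
        skips = []
        for block in self.blocks:
            x, skip = block(x)
            skips.append(skip)
        return x, skips                # bottleneck plus one skip tensor per block
```

With a 512×512 input, four halvings give a 32×32 bottleneck, which matches the shapes in the architecture summary.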

3. Graph Convolutional Network (GCN)

A GCN processes the zoning graph in parallel with the image encoder:

  • 2 GCNConv layers, each with hidden size 256
  • Global mean pooling produces a fixed-size graph-level embedding of size 256
  • The graph embedding is tiled and concatenated to the U-Net bottleneck feature map
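To make the graph branch concrete, the sketch below re-implements the standard GCN propagation rule in plain PyTorch with a dense adjacency matrix, followed by per-graph mean pooling. It is a stand-in for illustration only: the project itself uses `GCNConv` layers, and the names `normalized_adjacency` and `GraphEncoder`, the omitted bias terms, and the single ReLU between layers are all assumptions of this example.

```python
import torch
from torch import nn

def normalized_adjacency(edge_index, num_nodes):
    """Dense D^-1/2 (A + I) D^-1/2 for an undirected graph."""
    adj = torch.zeros(num_nodes, num_nodes, device=edge_index.device)
    adj[edge_index[0], edge_index[1]] = 1.0
    adj[edge_index[1], edge_index[0]] = 1.0   # treat edges as undirected
    # Add self-loops; clamp so an existing self-loop is not counted twice.
    adj = torch.clamp(adj + torch.eye(num_nodes, device=adj.device), max=1.0)
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)   # degree >= 1 thanks to self-loops
    return deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]

class GraphEncoder(nn.Module):
    """Two GCN-style layers (hidden size 256) and global mean pooling."""

    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden, bias=False)
        self.lin2 = nn.Linear(hidden, hidden, bias=False)

    def forward(self, x, edge_index, batch):
        a_hat = normalized_adjacency(edge_index, x.shape[0])
        h = torch.relu(a_hat @ self.lin1(x))  # layer 1: propagate, then ReLU
        h = a_hat @ self.lin2(h)              # layer 2: propagate
        # Global mean pooling: average node embeddings within each graph.
        num_graphs = int(batch.max()) + 1
        sums = h.new_zeros(num_graphs, h.shape[1]).index_add(0, batch, h)
        counts = torch.bincount(batch, minlength=num_graphs).clamp(min=1)
        return sums / counts[:, None].to(h.dtype)   # (num_graphs, hidden)
```

The dense adjacency keeps the example dependency-free but scales quadratically with node count; a sparse message-passing layer avoids that cost.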

4. Feature Fusion

The encoder bottleneck (512 channels) and the tiled GCN embedding (256 channels) are concatenated along the channel axis at the deepest layer, producing a 768-channel feature map. This gives the decoder access to both structural and topological constraints.
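The tiling and concatenation can be written in a few lines. The helper name `fuse` is hypothetical; the shapes follow the ones stated in this README.

```python
import torch

def fuse(bottleneck, graph_emb):
    """Tile a (B, G) graph embedding over H x W and concatenate on channels."""
    b, _, h, w = bottleneck.shape
    # Add two spatial axes, then broadcast the vector to every pixel location.
    tiled = graph_emb[:, :, None, None].expand(b, graph_emb.shape[1], h, w)
    return torch.cat([bottleneck, tiled], dim=1)

fused = fuse(torch.randn(2, 512, 32, 32), torch.randn(2, 256))
print(fused.shape)  # torch.Size([2, 768, 32, 32])
```

Because the same 256-dimensional vector is repeated at every spatial position, the fusion carries graph-level information only; it does not tell the decoder where individual graph nodes lie in the image.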

5. U-Net Decoder

Four decoder blocks that progressively upsample back to the original image resolution:

  • Each block uses transposed convolutions for upsampling
  • Skip connections from the corresponding encoder block are concatenated at each stage
  • Final 1×1 convolution maps to the number of room/element classes
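One way the decoder described above could look in PyTorch is sketched below. The class names, the 3×3 convolution after each concatenation, and the default `num_classes=10` are assumptions for illustration; the README does not state the per-block output widths or the number of classes.

```python
import torch
from torch import nn

class DecoderBlock(nn.Module):
    """Transposed-conv upsampling, skip concatenation, then a 3x3 conv."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                               # doubles height and width
        return self.conv(torch.cat([x, skip], dim=1))

class Decoder(nn.Module):
    def __init__(self, in_ch=768, skip_chs=(512, 256, 128, 64), num_classes=10):
        super().__init__()
        blocks = []
        for skip_ch in skip_chs:
            blocks.append(DecoderBlock(in_ch, skip_ch, skip_ch))
            in_ch = skip_ch
        self.blocks = nn.ModuleList(blocks)
        self.head = nn.Conv2d(in_ch, num_classes, kernel_size=1)  # per-pixel logits

    def forward(self, x, skips):
        # Skips were saved shallow-to-deep, so consume them in reverse order.
        for block, skip in zip(self.blocks, reversed(skips)):
            x = block(x, skip)
        return self.head(x)
```

Here `skips` is expected in the order the encoder produced them (highest resolution first), and `in_ch=768` corresponds to the fused bottleneck.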

6. Training

  • Optimizer: Adam (learning rate 0.001)
  • Batch size: 2
  • Loss: Cross-entropy (pixel-wise classification)
  • Currently trained for 1 epoch on the MSD dataset
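A minimal training loop with these settings might look like the sketch below. It is not the repository's loop: `train_one_epoch` is a hypothetical helper, the stand-in model is a single 1×1 convolution that ignores the zoning graph, and the random tensors only mimic the expected shapes.

```python
import torch
from torch import nn

def train_one_epoch(model, loader, optimizer, device="cpu"):
    """Run one pass over the loader and return the mean batch loss."""
    criterion = nn.CrossEntropyLoss()       # pixel-wise classification loss
    model.train()
    total = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)              # (B, num_classes, H, W)
        loss = criterion(logits, labels)    # labels: (B, H, W) class indices
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / max(len(loader), 1)

# Stand-in model and data, used only to exercise the loop.
model = nn.Conv2d(3, 5, kernel_size=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = [
    (torch.randn(2, 3, 16, 16), torch.randint(0, 5, (2, 16, 16)))  # batch size 2
    for _ in range(3)
]
avg_loss = train_one_epoch(model, loader, optimizer)
```

`nn.CrossEntropyLoss` accepts `(B, C, H, W)` logits with `(B, H, W)` integer targets directly, so no reshaping is needed for segmentation.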

7. Evaluation — MIoU

Mean Intersection-over-Union (MIoU) is computed between predicted and ground truth floor plan segmentation maps, following the formulation in the paper.
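For reference, the standard per-class definition of MIoU can be sketched as follows. This uses the common formulation (intersection over union per class, averaged over classes that occur in either map); the paper's exact variant, for example which classes it excludes, may differ, and `mean_iou` is a hypothetical helper name.

```python
import torch

def mean_iou(pred, target, num_classes):
    """Mean IoU between two integer class maps of identical shape."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = (p | t).sum().item()
        if union == 0:
            continue                      # class absent from both maps: skip it
        ious.append((p & t).sum().item() / union)
    return sum(ious) / len(ious) if ious else float("nan")

pred = torch.tensor([[0, 0], [1, 1]])
target = torch.tensor([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=3))  # 7/12, about 0.583
```

In the example, class 0 has IoU 1/2, class 1 has IoU 2/3, and class 2 is skipped because it appears in neither map. Predicted class maps can be obtained from the network output with `logits.argmax(dim=1)`.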


Architecture Summary

Input: Building Structure (B, 3, 512, 512) + Zoning Graph
         │                                       │
    U-Net Encoder                              GCN
    (4 blocks, downsampling)         (2 GCNConv layers, GlobalPool)
         │                                       │
    Bottleneck (B, 512, 32, 32)     Graph Embedding (B, 256)
         └──────────── Concatenate ─────────────┘
                  Fused (B, 768, 32, 32)
                         │
                  U-Net Decoder
              (4 blocks, upsampling + skip connections)
                         │
              Output: (B, num_classes, 512, 512)

Not Yet Implemented

  • Boundary pre-processing (3-channel): The paper uses Segment Anything to convert the binary structure image into a 3-channel input distinguishing interior, exterior, and raw boundary. This is not yet part of the pipeline.
  • Modified HouseDiffusion (MHD): The second baseline from the paper is not implemented.
  • Full training: Only 1 epoch has been completed so far.

Dataset

Modified Swiss Dwellings (MSD) — available on Kaggle.

The dataset contains floor plans of European (Swiss) multi-apartment building complexes, with precise room polygon annotations, zoning graphs, building structure binary images, and room type labels.

Based On

van Engelenburg et al., "MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes", arXiv:2407.10121, 2024.

Status

Work in progress. The full training pipeline runs end-to-end. Further training epochs and a full evaluation of the trained model are pending.
