RoRD Midterm Review: Rotation-Robust Descriptors For IC Layout Recognition

Nov 1, 2025

RoRD: Rotation-Robust Descriptors For IC Layout Recognition

This is the midterm review report for a Zhejiang University Chu Kochen Honors College deep research training project. It summarizes the progress of RoRD-Layout-Recognition, including core implementation, performance testing, innovation analysis, and future work.

Report Information

Date: November 2025
Type: midterm review report
Presenter: Tiansheng Jiao

Project Overview

Background And Goals

IC layout recognition is a key technology in semiconductor manufacturing and EDA. As chip designs become increasingly complex, traditional pixel-matching methods struggle with geometric transformations such as rotation and scaling. Traditional geometry-based algorithms also face high computational complexity and long runtime.

Core goals:

  1. Develop rotation-robust IC layout descriptors, or RoRD, to address sensitivity to geometric transformations.
  2. Achieve high-accuracy layout geometric feature matching for multi-scale and multi-instance retrieval.
  3. Build an end-to-end layout recognition solution for semiconductor design and manufacturing.

Key Problems

| Problem | Description |
| --- | --- |
| Geometric invariance | IC layouts frequently use rotations such as 0, 90, 180, and 270 degrees, while traditional methods struggle to keep features consistent. |
| Dynamic IP library expansion | IP and standard-cell libraries are large and frequently updated, requiring adaptation to new templates without repeated retraining. |
| Data scarcity | Supervised learning needs large amounts of fine-grained labeled data, but pixel-level and bounding-box labels are expensive in layout domains. |
| Real-time requirements | Industrial use requires fast processing, batch execution, and concurrent inference. |

Limitations Of Traditional Methods

| Method | Strength | Limitation |
| --- | --- | --- |
| Direct pixel matching | Simple and intuitive | Sensitive to rotation and weak in robustness. |
| SIFT / SURF features | Scale invariance | Poor fit for IC layout geometry. |
| Deep learning classification | End-to-end learning | Requires large labeled datasets. |
| Traditional hash matching | Fast | Limited accuracy and weak geometric-transform support. |

Technical Advantages

  1. Manhattan geometry constraints for right-angle and grid-like layout structures.
  2. Diffusion-based data augmentation to synthesize data from real layouts.
  3. Multi-scale feature fusion for different process nodes and hierarchy levels.
  4. End-to-end automated pipeline from raw data to trained model.
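To make the term "Manhattan geometry" in the first item concrete, here is a minimal sketch that checks whether a polygon is rectilinear, meaning every edge is horizontal or vertical. The function name `is_manhattan` and the sample polygons are illustrative assumptions and are not taken from the project code.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def is_manhattan(polygon: List[Point]) -> bool:
    """Return True if every edge of the closed polygon is axis-aligned."""
    if len(polygon) < 4:
        return False  # a closed rectilinear polygon needs at least 4 vertices
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]  # wrap around to close the shape
        horizontal = y0 == y1 and x0 != x1
        vertical = x0 == x1 and y0 != y1
        if not (horizontal or vertical):
            return False
    return True


# Hypothetical shapes: an L-shaped wire and a polygon with one diagonal edge.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 5), (0, 5)]
with_diagonal = [(0, 0), (4, 0), (4, 3), (0, 1)]

print(is_manhattan(l_shape), is_manhattan(with_diagonal))
```

The project encodes this constraint in the model and loss rather than as a hard filter; the check above only illustrates the property being exploited.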

Completion Analysis

As of November 2025, the core framework and basic functionality are complete. Overall completion is about 65%. The core model implementation has reached 90%, while more effort is still needed in training validation and performance testing.

| Module | Completion | Quality | Key missing item |
| --- | --- | --- | --- |
| Core model implementation | 90% | Excellent | Training validation |
| Data processing pipeline | 85% | Good | Large-scale testing |
| Matching algorithm optimization | 80% | Good | Real-data validation |
| Training infrastructure | 70% | Medium | Distributed support |
| Documentation and examples | 60% | Medium | Industrial cases |
| Performance testing | 50% | Low | Post-training evaluation |

Key Remaining Tasks

| Task | Remaining work | Plan |
| --- | --- | --- |
| Model training and optimization | Real training, parameter tuning, grid search, and convergence validation. | Main focus of phase one. |
| Large-scale data testing | Real IC layout dataset validation and tests across process nodes and design complexity. | Gradually complete in phases one and two. |
| Performance limit exploration | Upper-bound testing, extreme resolution, and complex layout capability validation. | Main research focus of phase two. |
| Real-world scenario validation | Industrial testing, EDA tool integration, and interface adaptation. | Main focus of phase two. |

Features Needing Improvement

  1. Training infrastructure: configuration management, losses, and optimizer framework are complete; distributed training and automatic hyperparameter tuning remain unfinished.
  2. Performance testing: baseline inference testing with untrained models is complete; post-training accuracy and performance evaluation remain unfinished.
  3. Documentation: technical documentation, user guide, and API notes are complete; full tutorials, best practices, and deployment cases are still needed.

Completed Core Features

Multi-Backbone Support

The project supports VGG16, ResNet34, and EfficientNet-B0 to fit different speed and accuracy requirements.

| Feature | Problem solved | Value |
| --- | --- | --- |
| Multi-backbone support | Different applications need different speed-accuracy tradeoffs. | ResNet34 supports real-time processing, while VGG16 provides a high-accuracy baseline. |
| Geometry-aware heads | IC layouts have Manhattan geometry. | Integrates geometric constraints into layout recognition. |
| Feature Pyramid Network | Different hierarchy levels and process nodes have different sizes. | Supports large layouts up to 4096×4096 pixels. |

Diffusion-Based Data Augmentation

IC layout data is scarce and expensive to annotate, and traditional augmentation is limited. The project introduces diffusion-based data augmentation to synthesize layouts with realistic geometric relationships.

  1. Diffusion model integration: DDPM is applied to IC layout augmentation.
  2. Geometric transformation augmentation: supports the 8 discrete rotation-and-mirror transformations (4 rotations, each with and without mirroring), with H-consistency checking.
  3. Multi-source data mixing: configurable ratios of real and synthetic data.
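The geometric transformation set in item 2 can be sketched on a small grid. This is a minimal illustration in pure Python, assuming a layout patch is represented as a 2D list; the helper names (`rot90`, `mirror`, `eight_transforms`) and the sample patch are hypothetical, and the H-consistency check itself is not shown.

```python
from typing import List

Grid = List[List[int]]


def rot90(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def mirror(grid: Grid) -> Grid:
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def eight_transforms(grid: Grid) -> List[Grid]:
    """Return the 4 rotations of the grid, each with and without mirroring."""
    variants = []
    current = [list(row) for row in grid]
    for _ in range(4):
        variants.append(current)
        variants.append(mirror(current))
        current = rot90(current)
    return variants


# Hypothetical asymmetric 3x3 binary patch (1 = metal, 0 = empty).
patch = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
variants = eight_transforms(patch)
unique = {tuple(map(tuple, v)) for v in variants}
print(len(variants), len(unique))
```

Because the sample patch has no symmetry of its own, all eight variants are distinct; a symmetric cell would produce duplicates, which is worth keeping in mind when counting augmented samples.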

Geometry-Consistency Loss

IC layout training requires special losses and strategies to preserve geometric consistency:

$$L_{geo} = L_{det} + \lambda_1 L_{desc} + \lambda_2 L_{H\text{-}consistency}$$

Here, $L_{det}$ is the detection loss, $L_{desc}$ is the descriptor loss, and $L_{H\text{-}consistency}$ is the H-consistency loss; $\lambda_1$ and $\lambda_2$ weight the descriptor and H-consistency terms relative to the detection term.
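As a worked example of how the three terms combine, the following sketch evaluates the weighted sum for scalar loss values. The function name `geometry_loss`, the default weights, and the sample values are assumptions for illustration only; the report does not state the weights used in training.

```python
def geometry_loss(
    l_det: float,
    l_desc: float,
    l_h_consistency: float,
    lambda_1: float = 1.0,  # hypothetical weight, not taken from the report
    lambda_2: float = 0.5,  # hypothetical weight, not taken from the report
) -> float:
    """Combine detection, descriptor, and H-consistency losses as a weighted sum."""
    return l_det + lambda_1 * l_desc + lambda_2 * l_h_consistency


# Made-up per-batch loss values, used only to show the arithmetic.
total = geometry_loss(l_det=0.4, l_desc=0.2, l_h_consistency=0.1)
print(total)
```

Raising `lambda_2` makes consistency under geometric transformation count for more relative to detection quality, which is the tradeoff the weights expose.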

Advantages:

  1. Complete mathematical framework for the loss function.
  2. Feature consistency under rotation and mirroring.
  3. Manhattan geometry constraints integrated into deep learning loss design.
  4. Improved rotation invariance and geometric robustness.

Configuration-Driven Training

The project supports YAML configuration for hyperparameter management and experiment reproduction.

# configs/base_config.yaml
training:
  learning_rate: 5.0e-5
  batch_size: 8
  num_epochs: 50
  patch_size: 256

model:
  backbone:
    name: "resnet34"
    pretrained: false
  fpn:
    enabled: true
    out_channels: 256

data_sources:
  real:
    enabled: true
    ratio: 0.7
  diffusion:
    enabled: true
    png_dir: "data/diffusion_generated"
    ratio: 0.3
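One way to read the `data_sources` ratios is as a per-batch mix of real and diffusion-generated samples. The sketch below works through that interpretation for `batch_size: 8`; whether the project applies the ratio per batch or per epoch is not stated, so this is an assumption, and `split_batch` is a hypothetical helper.

```python
from typing import Dict


def split_batch(batch_size: int, ratios: Dict[str, float]) -> Dict[str, int]:
    """Split a batch across data sources using largest-remainder rounding."""
    if abs(sum(ratios.values()) - 1.0) > 1e-6:
        raise ValueError("source ratios must sum to 1.0")
    exact = {name: batch_size * ratio for name, ratio in ratios.items()}
    counts = {name: int(value) for name, value in exact.items()}  # floor each share
    leftover = batch_size - sum(counts.values())
    # Hand the remaining slots to the sources with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Values copied from the configuration above.
counts = split_batch(8, {"real": 0.7, "diffusion": 0.3})
print(counts)
```

With a batch of 8 the exact shares are 5.6 and 2.4, so the rounding rule matters: here the extra slot goes to the real data.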

Multi-Scale Template Matching

Existing matching methods struggle with rotation, scale changes, and multi-instance detection in IC layouts. The project implements multi-scale template matching, multi-instance detection, and RANSAC geometric verification.

# Basic matching
python match.py \
    --layout data/large_layout.png \
    --template data/small_template.png \
    --output results/matching.png \
    --json_output results/matching.json

| Module | Role |
| --- | --- |
| Multi-scale template matching | Uses pyramid search and multi-resolution feature fusion for cross-node matching. |
| Multi-instance detection | Finds multiple instances using region masking and iterative detection. |
| RANSAC geometric verification | Filters mismatches and improves matching accuracy. |
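The idea behind RANSAC geometric verification can be sketched with a deliberately simplified motion model. The example below fits a pure translation between template and layout keypoints and discards matches that disagree with it. The project presumably estimates a richer transformation, so the translation-only model, the function name `ransac_translation`, and all coordinates here are assumptions for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (template keypoint, layout keypoint)


def ransac_translation(
    matches: List[Match], tol: float = 2.0, iters: int = 100, seed: int = 0
) -> Tuple[Point, List[Match]]:
    """Estimate a 2D translation with RANSAC and return it with its inliers."""
    if not matches:
        raise ValueError("need at least one match")
    rng = random.Random(seed)
    best: List[Match] = []
    for _ in range(iters):
        # A single correspondence is enough to hypothesize a translation.
        (sx, sy), (dx, dy) = rng.choice(matches)
        tx, ty = dx - sx, dy - sy
        inliers = [
            (s, d) for s, d in matches
            if abs(d[0] - s[0] - tx) <= tol and abs(d[1] - s[1] - ty) <= tol
        ]
        if len(inliers) > len(best):
            best = inliers
    # Refine the estimate by averaging over the consensus set.
    tx = sum(d[0] - s[0] for s, d in best) / len(best)
    ty = sum(d[1] - s[1] for s, d in best) / len(best)
    return (tx, ty), best


# Six consistent matches shifted by (100, 50), plus three made-up mismatches.
good = [((x, y), (x + 100.0, y + 50.0))
        for x, y in [(0, 0), (10, 0), (10, 20), (0, 20), (5, 5), (7, 13)]]
bad = [((3, 3), (0.0, 90.0)), ((8, 1), (250.0, 7.0)), ((2, 9), (40.0, 300.0))]
translation, inliers = ransac_translation(good + bad)
print(translation, len(inliers))
```

The same sample, score, and refine loop applies when the model is a rotation or a full homography; only the minimal sample size and the residual computation change.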

Innovation Analysis

Geometry-Aware Descriptors

Traditional descriptors such as SIFT and SURF cannot capture Manhattan geometry in IC layouts, including right angles, grids, and sparse regions. The project introduces a geometry-aware feature function:

$$d_{geo} = F_{geo}(I, H)$$

where $I$ is the input layout image, $H$ is the geometric transformation matrix, and $F_{geo}$ is the geometry-aware feature extractor.

Advantages:

  1. Manhattan constraints force descriptors to learn right-angle and grid structures.
  2. Rotation invariance is built around 8 geometric transformations.
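  3. A projected accuracy improvement of 30% to 50% over traditional methods in IC layout matching; this figure is an expectation rather than a measurement, because post-training evaluation is still pending.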

Rotation-Invariant Loss

IC layouts are frequently rotated during design, and traditional methods cannot guarantee feature consistency after rotation. The project proposes:

$$L_{rotation} = \sum_{i=1}^{8} \| d(I) - d(T_i(I)) \|^2$$

where $T_i$ is the $i$-th geometric transformation, including 4 rotations and 4 mirrors.
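The loss above can be evaluated directly on a toy example. The sketch below assumes a global descriptor that maps a square patch to a fixed-length vector, which is a simplification of a learned dense descriptor; the two toy descriptors and all helper names are hypothetical.

```python
from typing import Callable, List, Sequence

Grid = List[List[int]]
Descriptor = Callable[[Grid], Sequence[float]]


def rot90(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def mirror(grid: Grid) -> Grid:
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def eight_transforms(grid: Grid) -> List[Grid]:
    """Return the 4 rotations of the grid, each with and without mirroring."""
    variants, current = [], [list(row) for row in grid]
    for _ in range(4):
        variants += [current, mirror(current)]
        current = rot90(current)
    return variants


def rotation_loss(patch: Grid, descriptor: Descriptor) -> float:
    """Sum of squared descriptor differences over the 8 transformations."""
    base = descriptor(patch)
    total = 0.0
    for variant in eight_transforms(patch):
        diff = [a - b for a, b in zip(base, descriptor(variant))]
        total += sum(v * v for v in diff)
    return total


def raw_descriptor(grid: Grid) -> List[float]:
    """Toy descriptor: flattened pixels, which change under rotation."""
    return [float(v) for row in grid for v in row]


def sorted_descriptor(grid: Grid) -> List[float]:
    """Toy descriptor: sorted pixels, which ignore position entirely."""
    return sorted(float(v) for row in grid for v in row)


patch = [[1, 2], [3, 4]]
print(rotation_loss(patch, raw_descriptor), rotation_loss(patch, sorted_descriptor))
```

The sorted descriptor reaches zero loss only by discarding all spatial information, which is why a term like this is trained together with detection and descriptor losses rather than minimized on its own.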

Diffusion Data Augmentation

To address IC layout data scarcity, diffusion models are used for layout augmentation:

$$I_{syn} = D_{\theta}^{-1}(z_T, I_{real})$$

The generated layouts are conditioned on real layout data and aim to preserve geometric constraints.

Modular Architecture

The project adopts a plugin-like modular design, supporting flexible combinations of backbones and attention mechanisms while reducing integration complexity with EDA tools.
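A plugin-like design of this kind is often built on a registry that maps names to builder functions. The sketch below shows that pattern with placeholder builders that return strings instead of real networks; the registry names, `register`, and `build_model` are assumptions and do not describe the project's actual API.

```python
from typing import Callable, Dict

Builder = Callable[[], str]

# Separate registries so backbones and attention modules combine freely.
BACKBONES: Dict[str, Builder] = {}
ATTENTION: Dict[str, Builder] = {}


def register(registry: Dict[str, Builder], name: str) -> Callable[[Builder], Builder]:
    """Decorator that records a builder function under the given name."""
    def decorator(builder: Builder) -> Builder:
        registry[name] = builder
        return builder
    return decorator


# Placeholder builders: a real implementation would return network modules.
register(BACKBONES, "vgg16")(lambda: "VGG16")
register(BACKBONES, "resnet34")(lambda: "ResNet34")
register(BACKBONES, "efficientnet_b0")(lambda: "EfficientNet-B0")
register(ATTENTION, "none")(lambda: "None")
register(ATTENTION, "se")(lambda: "SE")
register(ATTENTION, "cbam")(lambda: "CBAM")


def build_model(backbone: str, attention: str = "none") -> Dict[str, str]:
    """Look up both components by name and assemble a model description."""
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone}")
    if attention not in ATTENTION:
        raise ValueError(f"unknown attention module: {attention}")
    return {"backbone": BACKBONES[backbone](), "attention": ATTENTION[attention]()}


print(build_model("resnet34", "se"))
```

With this layout, adding a new backbone is one registration call, and a configuration file only has to carry the component names.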

Performance Testing And Analysis

Test Environment

| Item | Configuration |
| --- | --- |
| Hardware | Intel Xeon 8558P + NVIDIA A100 × 1 + 512 GB memory |
| Software | PyTorch 2.6+, CUDA 12.8 |
| Test type | Forward inference test with untrained model |
| Test data | Random 2048×2048 layout-like simulation data |

GPU Inference Performance

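| Rank | Backbone | Attention | Single-scale inference (ms) | FPN inference (ms) | FPS (single-scale) | Performance |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | ResNet34 | None | 18.10 ± 0.07 | 21.41 ± 0.07 | 55.3 | Best |
| 2 | ResNet34 | SE | 18.14 ± 0.05 | 21.53 ± 0.06 | 55.1 | Excellent |
| 3 | ResNet34 | CBAM | 18.23 ± 0.05 | 21.50 ± 0.07 | 54.9 | Excellent |
| 4 | EfficientNet-B0 | None | 21.40 ± 0.13 | 33.48 ± 0.42 | 46.7 | Good |
| 5 | EfficientNet-B0 | CBAM | 21.55 ± 0.05 | 33.33 ± 0.38 | 46.4 | Good |
| 6 | EfficientNet-B0 | SE | 21.67 ± 0.30 | 33.52 ± 0.33 | 46.1 | Good |
| 7 | VGG16 | None | 49.27 ± 0.23 | 102.08 ± 0.42 | 20.3 | Average |
| 8 | VGG16 | SE | 49.53 ± 0.14 | 101.71 ± 1.10 | 20.2 | Average |
| 9 | VGG16 | CBAM | 50.36 ± 0.42 | 102.47 ± 1.52 | 19.9 | Average |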

CPU vs GPU Speedup

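| Backbone | Attention | CPU inference (ms) | GPU inference (ms) | Speedup | Rating |
| --- | --- | --- | --- | --- | --- |
| ResNet34 | None | 171.73 | 18.10 | 9.5× | Efficient |
| ResNet34 | CBAM | 406.07 | 18.23 | 22.3× | Excellent |
| ResNet34 | SE | 419.52 | 18.14 | 23.1× | Excellent |
| VGG16 | None | 514.94 | 49.27 | 10.4× | Efficient |
| VGG16 | SE | 808.86 | 49.53 | 16.3× | Very good |
| VGG16 | CBAM | 809.15 | 50.36 | 16.1× | Very good |
| EfficientNet-B0 | None | 1820.03 | 21.40 | 85.1× | Outstanding |
| EfficientNet-B0 | SE | 1815.73 | 21.67 | 83.8× | Outstanding |
| EfficientNet-B0 | CBAM | 1954.59 | 21.55 | 90.7× | Outstanding |

Performance Conclusion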

GPU acceleration is significant. Across the nine configurations the average speedup is 39.7×, ranging from 9.5× (ResNet34 without attention) to 90.7× (EfficientNet-B0 + CBAM).
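The summary figures can be re-derived from the CPU and GPU latencies reported in the speedup table. The short script below does so; the variable names are arbitrary, and the timings are copied from the table as published, so results computed from these rounded values can differ from the table in the last digit.

```python
from statistics import mean

# (backbone, attention) -> (CPU ms, GPU ms), copied from the speedup table.
timings = {
    ("ResNet34", "None"): (171.73, 18.10),
    ("ResNet34", "CBAM"): (406.07, 18.23),
    ("ResNet34", "SE"): (419.52, 18.14),
    ("VGG16", "None"): (514.94, 49.27),
    ("VGG16", "SE"): (808.86, 49.53),
    ("VGG16", "CBAM"): (809.15, 50.36),
    ("EfficientNet-B0", "None"): (1820.03, 21.40),
    ("EfficientNet-B0", "SE"): (1815.73, 21.67),
    ("EfficientNet-B0", "CBAM"): (1954.59, 21.55),
}

# Speedup is the ratio of CPU latency to GPU latency.
speedups = {config: cpu / gpu for config, (cpu, gpu) in timings.items()}

average = mean(speedups.values())
fastest = max(speedups, key=speedups.get)
slowest = min(speedups, key=speedups.get)

print(f"average speedup: {average:.1f}x")
print(f"max: {fastest} at {speedups[fastest]:.1f}x")
print(f"min: {slowest} at {speedups[slowest]:.1f}x")
```

The large EfficientNet-B0 ratios come from its slow CPU runs rather than unusually fast GPU runs, so speedup alone should not drive backbone selection.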

Recommendations

  1. Best default configuration: ResNet34 without attention, at 18.1 ms single-scale inference (about 55.3 FPS).
  2. High-accuracy matching: ResNet34 + SE, with little inference overhead.
  3. Multi-scale search: FPN improves multi-scale capability but adds compute overhead.
  4. Batch processing: A100 can support 8 to 16 concurrent inference jobs.

Future Work Plan

Phase 1: Minimum Deliverable Standard (2025.11 - 2026.01)

| Direction | Work |
| --- | --- |
| Data preparation | Collect IC layout data from Prof. Zheng’s company, clean data, convert formats, and perform quality control. |
| Basic training | Train the ResNet34 backbone, validate basic geometry-consistency loss, and tune simple hyperparameters. |
| Functional validation | Complete end-to-end tests and basic performance benchmarks. |

Phase 2: Paper-Level Research (2026.01 - 2026.04)

This phase will use advanced-node data from Prof. Chen to support high-quality research papers and patent applications.

| Direction | Work |
| --- | --- |
| Advanced-node adaptation | Analyze advanced-process layout features and optimize geometric matching at very small scales. |
| Algorithmic innovation | Study mathematical modeling for more complex geometric transformations and multimodal layout information fusion. |
| Academic output | Target ICCAD, DAC, ASP-DAC, DATE, or TCAD. |

Risk Assessment

| Risk | Probability | Impact | Mitigation |
| --- | --- | --- | --- |
| Difficult model convergence | Medium | High | Adjust learning rate and increase data augmentation. |
| Insufficient training data | Medium | High | Use diffusion data augmentation and synthetic data. |
| Performance below target | Low | Medium | Compare multiple backbones and optimize architecture. |
| Overfitting | Medium | Medium | Use early stopping and regularization. |

Risk Strategy

Project risk will be reduced through modular design, staged execution, and parallel backup solutions, ensuring that core goals can be completed on time.

Summary And Outlook

Core results:

  1. Technical breakthrough: completed the RoRD model architecture and implemented rotation-robust IC layout descriptors.
  2. Inference performance: single-scale inference with ResNet34 reaches 55.3 FPS on one A100 (untrained model, simulated data), with an average GPU speedup of 39.7× over CPU.
  3. Application value: supports high-accuracy multi-scale and multi-instance layout matching.
  4. Innovations: geometry-aware loss, diffusion data augmentation, and modular architecture design.

Source: https://www.jiao77.com/en/blog/report/rord-midterm-review-2025-11/
Author: Jiao77
License: CC BY-NC-SA 4.0