Nov 1, 2025
This is the midterm review report for a Zhejiang University Chu Kochen Honors College deep research training project. It summarizes the progress of RoRD-Layout-Recognition, including core implementation, performance testing, innovation analysis, and future work.
Date: November 2025
Type: midterm review report
Presenter: Tiansheng Jiao
IC layout recognition is a key technology in semiconductor manufacturing and EDA. As chip designs become increasingly complex, traditional pixel-matching methods struggle with geometric transformations such as rotation and scaling. Traditional geometry-based algorithms also face high computational complexity and long runtime.
Core goals:
| Problem | Description |
|---|---|
| Geometric invariance | IC layouts frequently use rotations such as 0, 90, 180, and 270 degrees, while traditional methods struggle to keep features consistent. |
| Dynamic IP library expansion | IP and standard-cell libraries are large and frequently updated, requiring adaptation to new templates without repeated retraining. |
| Data scarcity | Supervised learning needs large amounts of fine-grained labeled data, but pixel-level and bounding-box labels are expensive in layout domains. |
| Real-time requirements | Industrial use requires fast processing, batch execution, and concurrent inference. |
| Method | Strength | Limitation |
|---|---|---|
| Direct pixel matching | Simple and intuitive | Sensitive to rotation and weak in robustness. |
| SIFT / SURF features | Scale invariance | Poor fit for IC layout geometry. |
| Deep learning classification | End-to-end learning | Requires large labeled datasets. |
| Traditional hash matching | Fast | Limited accuracy and weak geometric-transform support. |
As of November 2025, the core framework and basic functionality are complete. Overall completion is about 65%. The core model implementation has reached 90%, while more effort is still needed in training validation and performance testing.
| Module | Completion | Quality | Key missing item |
|---|---|---|---|
| Core model implementation | 90% | Excellent | Training validation |
| Data processing pipeline | 85% | Good | Large-scale testing |
| Matching algorithm optimization | 80% | Good | Real-data validation |
| Training infrastructure | 70% | Medium | Distributed support |
| Documentation and examples | 60% | Medium | Industrial cases |
| Performance testing | 50% | Low | Post-training evaluation |
| Task | Remaining work | Plan |
|---|---|---|
| Model training and optimization | Real training, parameter tuning, grid search, and convergence validation. | Main focus of phase one. |
| Large-scale data testing | Real IC layout dataset validation and tests across process nodes and design complexity. | Gradually complete in phases one and two. |
| Performance limit exploration | Upper-bound testing, extreme resolution, and complex layout capability validation. | Main research focus of phase two. |
| Real-world scenario validation | Industrial testing, EDA tool integration, and interface adaptation. | Main focus of phase two. |
The project supports VGG16, ResNet34, and EfficientNet-B0 to fit different speed and accuracy requirements.
| Feature | Problem solved | Value |
|---|---|---|
| Multi-backbone support | Different applications need different speed-accuracy tradeoffs. | ResNet34 supports real-time processing, while VGG16 provides a high-accuracy baseline. |
| Geometry-aware heads | IC layouts have Manhattan geometry. | Integrates geometric constraints into layout recognition. |
| Feature Pyramid Network | Different hierarchy levels and process nodes have different sizes. | Supports large layouts up to 4096×4096 pixels. |
IC layout data is scarce and expensive to annotate, and traditional augmentation is limited. The project introduces diffusion-based data augmentation to synthesize layouts with realistic geometric relationships.
IC layout training requires special losses and strategies to preserve geometric consistency. The training objective combines three terms: a detection loss, a descriptor loss, and an H-consistency loss.
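One plausible way to write such an objective is as a weighted sum of the three terms. The symbols $L_{\text{det}}$, $L_{\text{desc}}$, and $L_{H}$, and the weights $\lambda_{\text{desc}}$ and $\lambda_{H}$, are assumptions introduced here for illustration; the project's actual formula and weights may differ.

$$
L_{\text{total}} = L_{\text{det}} + \lambda_{\text{desc}}\, L_{\text{desc}} + \lambda_{H}\, L_{H}
$$

In this form, setting $\lambda_{H} = 0$ would recover a plain detection-plus-descriptor objective, so the last term isolates the contribution of the geometric consistency constraint.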
Advantages:
The project supports YAML configuration for hyperparameter management and experiment reproduction.
```yaml
# configs/base_config.yaml
training:
  learning_rate: 5.0e-5
  batch_size: 8
  num_epochs: 50
  patch_size: 256
model:
  backbone:
    name: "resnet34"
    pretrained: false
  fpn:
    enabled: true
    out_channels: 256
data_sources:
  real:
    enabled: true
    ratio: 0.7
  diffusion:
    enabled: true
    png_dir: "data/diffusion_generated"
    ratio: 0.3
```
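As an illustration of how the `ratio` fields under `data_sources` might be applied, the sketch below splits a batch between real and diffusion-generated samples. The function name `split_batch` and the largest-remainder rounding are assumptions made for this example, not the project's actual data loader.

```python
def split_batch(batch_size: int, ratios: dict[str, float]) -> dict[str, int]:
    """Split a batch across data sources in proportion to their ratios."""
    total = sum(ratios.values())
    if total <= 0:
        raise ValueError("ratios must sum to a positive value")
    # Ideal (fractional) share of the batch for each source.
    exact = {name: batch_size * r / total for name, r in ratios.items()}
    # Start from the truncated shares, then hand out the leftover samples
    # to the sources with the largest fractional remainders.
    counts = {name: int(value) for name, value in exact.items()}
    leftover = batch_size - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Values taken from the configuration above: batch_size 8, ratios 0.7 / 0.3.
counts = split_batch(8, {"real": 0.7, "diffusion": 0.3})
print(counts)
```

With a batch size of 8, the 0.7 / 0.3 split cannot be met exactly, so some rounding rule is needed; largest-remainder rounding keeps the counts summing to the batch size.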
Existing matching methods struggle with rotation, scale changes, and multi-instance detection in IC layouts. The project implements multi-scale template matching, multi-instance detection, and RANSAC geometric verification.
```bash
# Basic matching
python match.py \
  --layout data/large_layout.png \
  --template data/small_template.png \
  --output results/matching.png \
  --json_output results/matching.json
```
| Module | Role |
|---|---|
| Multi-scale template matching | Uses pyramid search and multi-resolution feature fusion for cross-node matching. |
| Multi-instance detection | Finds multiple instances using region masking and iterative detection. |
| RANSAC geometric verification | Filters mismatches and improves matching accuracy. |
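To make the RANSAC verification step concrete, here is a minimal sketch that filters point matches under a translation-only model. This is a deliberate simplification: the function name `ransac_translation`, the pixel threshold, and the translation model are assumptions for illustration, and the project's verifier presumably fits a richer transformation that also covers rotation and scale.

```python
import random

Point = tuple[float, float]


def ransac_translation(
    matches: list[tuple[Point, Point]],
    threshold: float = 2.0,
    iterations: int = 100,
    seed: int = 0,
) -> tuple[Point, list[int]]:
    """Estimate a 2D translation from point matches and return inlier indices."""
    if not matches:
        raise ValueError("need at least one match")
    rng = random.Random(seed)
    best_inliers: list[int] = []
    for _ in range(iterations):
        # A translation has two unknowns, so a single match is a minimal sample.
        (sx, sy), (dx, dy) = rng.choice(matches)
        tx, ty = dx - sx, dy - sy
        # A match is an inlier if the hypothesis maps its source point
        # to within `threshold` of its destination point.
        inliers = [
            i
            for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if ((ax + tx - bx) ** 2 + (ay + ty - by) ** 2) ** 0.5 <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers by averaging their displacements.
    n = len(best_inliers)
    tx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    ty = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (tx, ty), best_inliers


# Four matches agree on a shift of (10, 5); the match at index 3 is a mismatch.
matches = [
    ((0, 0), (10, 5)),
    ((3, 4), (13, 9)),
    ((7, 1), (17, 6)),
    ((2, 2), (40, 40)),
    ((5, 5), (15, 10)),
]
translation, inliers = ransac_translation(matches)
print(translation, inliers)
```

The same loop structure carries over to richer models: only the minimal sample size, the model fit, and the residual change.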
Traditional descriptors such as SIFT and SURF cannot capture the Manhattan geometry of IC layouts, including right angles, grids, and sparse regions. The project introduces a geometry-aware feature function defined over the input layout image, a geometric transformation matrix, and a geometry-aware feature extractor.
Advantages:
IC layouts are frequently rotated during design, and traditional methods cannot guarantee feature consistency after rotation. The project proposes a consistency formulation indexed over eight geometric transformations: 4 rotations and 4 mirrors.
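The eight transformations (4 rotations and 4 mirrors) can be enumerated on a small grid, and a candidate feature can be checked for consistency across all of them. This is a sketch under stated assumptions: the helper names and the toy features are hypothetical, and real features would come from the network rather than from simple functions of the pixels.

```python
from typing import Callable

Grid = list[list[int]]


def rotate90(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def mirror(grid: Grid) -> Grid:
    """Flip a 2D grid left to right."""
    return [row[::-1] for row in grid]


def eight_transforms(grid: Grid) -> list[Grid]:
    """Return the 4 rotations of a grid followed by the 4 rotations of its mirror."""
    results = []
    for start in (grid, mirror(grid)):
        current = start
        for _ in range(4):
            results.append(current)
            current = rotate90(current)
    return results


def is_invariant(feature: Callable[[Grid], int], grid: Grid) -> bool:
    """Check that a feature takes the same value on all eight variants."""
    return len({feature(g) for g in eight_transforms(grid)}) == 1


patch = [[1, 2],
         [3, 4]]
variants = eight_transforms(patch)

# The pixel sum ignores orientation; the top-left pixel does not.
print(is_invariant(lambda g: sum(map(sum, g)), patch))
print(is_invariant(lambda g: g[0][0], patch))
```

A check of this kind can serve as a unit test for any descriptor that is meant to stay consistent under layout rotation and mirroring.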
To address IC layout data scarcity, diffusion models are used for layout augmentation. The generated layouts are conditioned on real layout data and aim to preserve geometric constraints.
The project adopts a plugin-like modular design, supporting flexible combinations of backbones and attention mechanisms while reducing integration complexity with EDA tools.
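A plugin-like design of this kind is often built around a registry that maps names to builder functions. The sketch below shows the pattern only; the registry names, the `register` decorator, and the string placeholders standing in for network modules are all hypothetical and not taken from the project's code.

```python
from typing import Callable

# Hypothetical registries: names map to builder functions.
BACKBONES: dict[str, Callable[[], str]] = {}
ATTENTION: dict[str, Callable[[str], str]] = {}


def register(registry: dict, name: str):
    """Return a decorator that stores a builder function under `name`."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap


@register(BACKBONES, "resnet34")
def build_resnet34() -> str:
    return "resnet34"  # placeholder for a real network object


@register(BACKBONES, "vgg16")
def build_vgg16() -> str:
    return "vgg16"  # placeholder for a real network object


@register(ATTENTION, "none")
def no_attention(backbone: str) -> str:
    return backbone


@register(ATTENTION, "se")
def add_se(backbone: str) -> str:
    return f"{backbone}+se"  # placeholder for wrapping with an attention module


def build_model(backbone: str, attention: str = "none") -> str:
    """Combine a registered backbone with a registered attention mechanism."""
    if backbone not in BACKBONES:
        raise KeyError(f"unknown backbone: {backbone}")
    if attention not in ATTENTION:
        raise KeyError(f"unknown attention: {attention}")
    return ATTENTION[attention](BACKBONES[backbone]())


print(build_model("resnet34", "se"))
```

Adding a new backbone or attention variant then means registering one more function, with no change to `build_model` or to the calling code.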
| Item | Configuration |
|---|---|
| Hardware | Intel Xeon 8558P + NVIDIA A100 × 1 + 512GB memory |
| Software | PyTorch 2.6+, CUDA 12.8 |
| Test type | Forward inference test with untrained model |
| Test data | Random 2048×2048 layout-like simulation data |
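Latency figures of the form "mean ± standard deviation" are typically collected with a warm-up phase followed by repeated timed runs. The harness below is a generic sketch of that procedure, not the project's benchmark script: `benchmark` and `dummy_forward` are hypothetical names, and the CPU-only workload stands in for a model forward pass.

```python
import statistics
import time
from typing import Callable


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 10) -> tuple[float, float]:
    """Return (mean, standard deviation) of fn's latency in milliseconds."""
    # Warm-up runs are discarded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_forward() -> int:
    """Stand-in workload; a real test would run the model on an input tensor."""
    return sum(i * i for i in range(50_000))


mean_ms, std_ms = benchmark(dummy_forward)
print(f"{mean_ms:.2f} ± {std_ms:.2f} ms")
```

When timing GPU inference in PyTorch, the device should be synchronized (for example with `torch.cuda.synchronize()`) before reading the clock, because CUDA kernels are launched asynchronously.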
| Rank | Backbone | Attention | Single-scale inference (ms) | FPN inference (ms) | FPS (single-scale) | Performance |
|---|---|---|---|---|---|---|
| 1 | ResNet34 | None | 18.10 ± 0.07 | 21.41 ± 0.07 | 55.3 | Best |
| 2 | ResNet34 | SE | 18.14 ± 0.05 | 21.53 ± 0.06 | 55.1 | Excellent |
| 3 | ResNet34 | CBAM | 18.23 ± 0.05 | 21.50 ± 0.07 | 54.9 | Excellent |
| 4 | EfficientNet-B0 | None | 21.40 ± 0.13 | 33.48 ± 0.42 | 46.7 | Good |
| 5 | EfficientNet-B0 | CBAM | 21.55 ± 0.05 | 33.33 ± 0.38 | 46.4 | Good |
| 6 | EfficientNet-B0 | SE | 21.67 ± 0.30 | 33.52 ± 0.33 | 46.1 | Good |
| 7 | VGG16 | None | 49.27 ± 0.23 | 102.08 ± 0.42 | 20.3 | Average |
| 8 | VGG16 | SE | 49.53 ± 0.14 | 101.71 ± 1.10 | 20.2 | Average |
| 9 | VGG16 | CBAM | 50.36 ± 0.42 | 102.47 ± 1.52 | 19.9 | Average |
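The FPS column appears to follow from the single-scale latency as 1000 / latency in milliseconds. The short check below recomputes it for three rows; that relationship is an inference from the numbers rather than something the report states, and recomputing from the rounded latencies can differ from the reported FPS in the last digit.

```python
# Single-scale GPU latencies (ms) and reported FPS, copied from the table above.
rows = {
    "ResNet34 / None": (18.10, 55.3),
    "EfficientNet-B0 / None": (21.40, 46.7),
    "VGG16 / None": (49.27, 20.3),
}


def fps_from_latency(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


for name, (latency_ms, reported_fps) in rows.items():
    print(f"{name}: {fps_from_latency(latency_ms):.2f} FPS (reported {reported_fps})")
```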
| Backbone | Attention | CPU inference (ms) | GPU inference (ms) | Speedup | Rating |
|---|---|---|---|---|---|
| ResNet34 | None | 171.73 | 18.10 | 9.5× | Efficient |
| ResNet34 | CBAM | 406.07 | 18.23 | 22.3× | Excellent |
| ResNet34 | SE | 419.52 | 18.14 | 23.1× | Excellent |
| VGG16 | None | 514.94 | 49.27 | 10.4× | Efficient |
| VGG16 | SE | 808.86 | 49.53 | 16.3× | Very good |
| VGG16 | CBAM | 809.15 | 50.36 | 16.1× | Very good |
| EfficientNet-B0 | None | 1820.03 | 21.40 | 85.1× | Outstanding |
| EfficientNet-B0 | SE | 1815.73 | 21.67 | 83.8× | Outstanding |
| EfficientNet-B0 | CBAM | 1954.59 | 21.55 | 90.7× | Outstanding |
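GPU acceleration is substantial across all configurations: the average speedup is 39.7×, ranging from 9.5× (ResNet34 without attention) to 90.7× (EfficientNet-B0 + CBAM). The largest speedups come from EfficientNet-B0, whose CPU inference is the slowest of the three backbones (about 1.8 to 2.0 s) even though its GPU latency is close to that of ResNet34.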
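The summary statistics can be reproduced from the CPU and GPU latencies in the table. The sketch below recomputes each speedup as CPU time divided by GPU time; because it starts from the rounded table values, individual speedups may differ from the reported ones in the last digit.

```python
# (CPU ms, GPU ms) per configuration, copied from the CPU vs. GPU table.
latencies = {
    ("ResNet34", "None"): (171.73, 18.10),
    ("ResNet34", "CBAM"): (406.07, 18.23),
    ("ResNet34", "SE"): (419.52, 18.14),
    ("VGG16", "None"): (514.94, 49.27),
    ("VGG16", "SE"): (808.86, 49.53),
    ("VGG16", "CBAM"): (809.15, 50.36),
    ("EfficientNet-B0", "None"): (1820.03, 21.40),
    ("EfficientNet-B0", "SE"): (1815.73, 21.67),
    ("EfficientNet-B0", "CBAM"): (1954.59, 21.55),
}

# Speedup is the ratio of CPU latency to GPU latency.
speedups = {config: cpu / gpu for config, (cpu, gpu) in latencies.items()}

mean_speedup = sum(speedups.values()) / len(speedups)
fastest = max(speedups, key=speedups.get)
slowest = min(speedups, key=speedups.get)

print(f"mean speedup: {mean_speedup:.1f}x")
print(f"max: {fastest} at {speedups[fastest]:.1f}x")
print(f"min: {slowest} at {speedups[slowest]:.1f}x")
```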
| Direction | Work |
|---|---|
| Data preparation | Collect IC layout data from Prof. Zheng’s company, clean data, convert formats, and perform quality control. |
| Basic training | Train the ResNet34 backbone, validate basic geometry-consistency loss, and tune simple hyperparameters. |
| Functional validation | Complete end-to-end tests and basic performance benchmarks. |
This phase will use advanced-node data from Prof. Chen to support high-quality research papers and patent applications.
| Direction | Work |
|---|---|
| Advanced-node adaptation | Analyze advanced-process layout features and optimize geometric matching at very small scales. |
| Algorithmic innovation | Study mathematical modeling for more complex geometric transformations and multimodal layout information fusion. |
| Academic output | Target ICCAD, DAC, ASP-DAC, DATE, or TCAD. |
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Difficult model convergence | Medium | High | Adjust learning rate and increase data augmentation. |
| Insufficient training data | Medium | High | Use diffusion data augmentation and synthetic data. |
| Performance below target | Low | Medium | Compare multiple backbones and optimize architecture. |
| Overfitting | Medium | Medium | Use early stopping and regularization. |
Project risk will be reduced through modular design, staged execution, and parallel backup solutions, so that the core goals can be completed on time.
Core results: