rwightman / pytorch-image-models
Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
---|---|
ID | resnet18 |
Crop Pct | 0.875 |
Image Size | 224 |
Interpolation | bilinear |

Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
---|---|
ID | resnet26 |
Crop Pct | 0.875 |
Image Size | 224 |
Interpolation | bicubic |

Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
---|---|
ID | resnet34 |
Crop Pct | 0.875 |
Image Size | 224 |
Interpolation | bilinear |

Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
---|---|
ID | resnet50 |
Crop Pct | 0.875 |
Image Size | 224 |
Interpolation | bicubic |

Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling, Blur Pooling |
---|---|
ID | resnetblur50 |
Crop Pct | 0.875 |
Image Size | 224 |
Interpolation | bicubic |

Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
ID | tv_resnet101 |
LR | 0.1 |
Epochs | 90 |
Crop Pct | 0.875 |
LR Gamma | 0.1 |
Momentum | 0.9 |
Batch Size | 32 |
Image Size | 224 |
LR Step Size | 30 |
Weight Decay | 0.0001 |
Interpolation | bilinear |

Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
ID | tv_resnet152 |
LR | 0.1 |
Epochs | 90 |
Crop Pct | 0.875 |
LR Gamma | 0.1 |
Momentum | 0.9 |
Batch Size | 32 |
Image Size | 224 |
LR Step Size | 30 |
Weight Decay | 0.0001 |
Interpolation | bilinear |

Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
ID | tv_resnet34 |
LR | 0.1 |
Epochs | 90 |
Crop Pct | 0.875 |
LR Gamma | 0.1 |
Momentum | 0.9 |
Batch Size | 32 |
Image Size | 224 |
LR Step Size | 30 |
Weight Decay | 0.0001 |
Interpolation | bilinear |

Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling |
ID | tv_resnet50 |
LR | 0.1 |
Epochs | 90 |
Crop Pct | 0.875 |
LR Gamma | 0.1 |
Momentum | 0.9 |
Batch Size | 32 |
Image Size | 224 |
LR Step Size | 30 |
Weight Decay | 0.0001 |
Interpolation | bilinear |
Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. Residual blocks are stacked on top of each other to form the network: e.g., a ResNet-50 has fifty layers built from these blocks.
How do I load this model?
To load a pretrained model:
Replace the model name with the variant you want to use, e.g. `resnet18`. You can find the IDs in the model summaries at the top of this page.
How do I train this model?
You can follow the timm recipe scripts for training a new model afresh.
Image Classification on ImageNet
MODEL | TOP 1 ACCURACY | TOP 5 ACCURACY |
---|---|---|
resnetblur50 | 79.29% | 94.64% |
resnet50 | 79.04% | 94.39% |
tv_resnet152 | 78.32% | 94.05% |
tv_resnet101 | 77.37% | 93.56% |
tv_resnet50 | 76.16% | 92.88% |
resnet26 | 75.29% | 92.57% |
resnet34 | 75.11% | 92.28% |
tv_resnet34 | 73.3% | 91.42% |
resnet18 | 69.74% | 89.09% |
- First Online: 05 January 2021
- Brett Koonce
ResNet 50 is a crucial network for you to understand. It is the basis of much academic research in this field. Many different papers compare their results to a ResNet 50 baseline, so it is valuable as a reference point. We can also easily download the weights for ResNet 50 networks that have been trained on the ImageNet dataset and modify the last layers (called **retraining** or **transfer learning**) to quickly produce models that tackle new problems. For most problems, this is the best approach to get started with, rather than trying to invent new networks or techniques. Building a custom dataset and scaling it up with data augmentation techniques will get you a lot further than trying to build a new architecture.
Author information: Brett Koonce, Jefferson, MO, USA.
Copyright information
© 2021 Brett Koonce
About this chapter
Koonce, B. (2021). ResNet 50. In: Convolutional Neural Networks with Swift for Tensorflow. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-6168-2_6
Print ISBN : 978-1-4842-6167-5
Online ISBN : 978-1-4842-6168-2
Understanding ResNet50 architecture
ResNet50 is a variant of the ResNet model which has 48 convolution layers along with 1 max-pool and 1 average-pool layer. It requires 3.8 × 10^9 floating point operations. It is a widely used ResNet model, and we have explored the ResNet50 architecture in depth.
We start with some background information, comparison with other models and then, dive directly into ResNet50 architecture.
Introduction
After AlexNet won first prize at the ILSVRC 2012 classification contest, ResNet was the most interesting thing to happen to the computer vision and deep learning world.
Because of the framework that ResNets introduced, it became possible to train ultra-deep neural networks; by that I mean a network can contain hundreds or thousands of layers and still achieve great performance.
ResNets were initially applied to the image recognition task, but as the paper mentions, the framework can also be used for non-computer-vision tasks to achieve better accuracy.
Many of you may ask: if simply stacking more layers gives us better accuracy, why was there a need for residual learning to train ultra-deep neural networks?
We know that deep convolutional neural networks are really great at identifying low-, mid-, and high-level features from images, and stacking more layers generally gives us better accuracy. So a question arises: is getting better model performance as easy as stacking more layers?
With this question arises the problem of vanishing/exploding gradients. Those problems were largely handled by various techniques, which enabled networks with tens of layers to converge. But when deep neural networks start to converge, we see another problem: accuracy saturates and then degrades rapidly. This is not caused by overfitting, as one might guess, and adding more layers to a suitably deep model just increases the training error.
This degradation can be examined by taking a shallower model and a deeper model constructed from the shallow model's layers with identity layers added. The deeper model should not produce any higher training error than its counterpart, since the added layers are just identity mappings.
In Figure 1 we can see, on both the left and the right, that the deeper model is always producing more error, when in fact it should not.
The authors addressed this problem by introducing a deep residual learning framework. For this they introduced shortcut connections that simply perform identity mappings.
They explicitly let the layers fit a residual mapping. Denoting the desired underlying mapping as H(x), they let the stacked non-linear layers fit another mapping F(x) := H(x) − x, so the original mapping becomes H(x) = F(x) + x, as can be seen in Figure 2.
The benefit of these shortcut identity mappings is that no additional parameters are added to the model, and the computational time is kept in check.
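The reformulation can be stated in a few lines of plain Python (a conceptual sketch of the shortcut arithmetic, not a trained network):

```python
def residual_block(x, residual_fn):
    """Identity shortcut: the stacked layers fit F(x) = H(x) - x,
    so the block outputs H(x) = F(x) + x."""
    return residual_fn(x) + x

# If the optimal mapping H is the identity, the layers only have to
# drive F toward zero rather than learn the identity from scratch.
print(residual_block(3.0, lambda x: 0.0))  # 3.0
```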
To demonstrate how much better ResNets are, the authors compared a 34-layer and an 18-layer model, each with both plain and residual mappings. The results were striking: the 18-layer plain net outperformed the 34-layer plain net, while the 34-layer ResNet outperformed the 18-layer ResNet, as can be seen in Figure 3.
ResNet50 Architecture
Now we are going to discuss ResNet 50. The 18- and 34-layer ResNet architectures discussed above also use residual mappings, which are not shown here for simplicity.
A small change was made for ResNet 50 and deeper variants: where the shortcut connections previously skipped two layers, they now skip three, and 1×1 convolution layers were added, as we are going to see in detail in the ResNet 50 architecture.
As we can see in Table 1, the ResNet 50 architecture contains the following elements:
- A convolution with a kernel size of 7×7 and 64 different kernels, all with a stride of 2, giving us 1 layer.
- Next we see max pooling, also with a stride of 2.
- In the next convolution stage there is a 1×1, 64 kernel, followed by a 3×3, 64 kernel, and finally a 1×1, 256 kernel. These three layers are repeated 3 times in total, giving us 9 layers in this step.
- Next we see a 1×1, 128 kernel, after that a 3×3, 128 kernel, and finally a 1×1, 512 kernel. This step is repeated 4 times, giving us 12 layers.
- After that there is a 1×1, 256 kernel, with two more kernels of 3×3, 256 and 1×1, 1024, and this is repeated 6 times, giving us a total of 18 layers.
- Then again a 1×1, 512 kernel, with two more of 3×3, 512 and 1×1, 2048, and this is repeated 3 times, giving us a total of 9 layers.
- After that we do an average pool and end with a fully connected layer containing 1000 nodes, followed by a softmax function; this gives us 1 layer.
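One bottleneck unit from the stages above, sketched as a PyTorch module (an illustrative sketch, assuming PyTorch is available; channel widths follow the 1×1, 64 → 3×3, 64 → 1×1, 256 pattern of the first stage):

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """ResNet-50 bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand (x4),
    with an identity shortcut (1x1 projection when shapes differ)."""
    def __init__(self, in_ch, mid_ch, stride=1):
        super().__init__()
        out_ch = mid_ch * 4
        self.conv1 = nn.Conv2d(in_ch, mid_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_ch)
        self.conv2 = nn.Conv2d(mid_ch, mid_ch, 3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_ch)
        self.conv3 = nn.Conv2d(mid_ch, out_ch, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.proj = None  # projection shortcut when channels/stride change
        if stride != 1 or in_ch != out_ch:
            self.proj = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        identity = x if self.proj is None else self.proj(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)  # H(x) = F(x) + x

# First stage: 64 -> 256 channels on a 56x56 feature map.
block = Bottleneck(64, 64)
y = block(torch.randn(1, 64, 56, 56))
print(y.shape)  # torch.Size([1, 256, 56, 56])
```

Each such unit contributes the three convolution layers counted in the totals below.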
We don't count the activation functions or the max/average pooling layers.
Totaling this gives us a 1 + 9 + 12 + 18 + 9 + 1 = 50 layer deep convolutional network.
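The count can be verified in a couple of lines of plain Python, where `stages` holds the repeat counts of the four bottleneck stages:

```python
stages = [3, 4, 6, 3]  # bottleneck repeats per stage in ResNet 50
conv_layers = 1 + sum(3 * n for n in stages)  # 7x7 stem + 3 convs per bottleneck
total_layers = conv_layers + 1                # + the final fully connected layer
print(total_layers)  # 50
```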
The results on the ImageNet validation set were pretty good: the ResNet 50 model achieved a top-1 error rate of 20.47 percent and a top-5 error rate of 5.25 percent. This is reported for a single 50-layer model, not an ensemble. The paper provides a table if you want to compare it with other ResNets or with other models.
- This architecture can be used for computer vision tasks such as image classification, object localization, and object detection.
- The framework can also be applied to non-computer-vision tasks to give them the benefit of depth and to reduce computational expense.
- Research paper for Deep Residual Learning.
- VGG-19 by Aakash Kaushik (OpenGenus).
- Floating point operations per second (FLOPS) of Machine Learning models.
- Convolutional Neural Network by Piyush Mishra and Junaid N Z (OpenGenus).
Figure: ResNet-50 neural network architecture [56].
A Review of Computer Vision-Based Crack Detection Methods in Civil Infrastructure: Progress and Challenges
1. Introduction
2. Crack Detection Combining Traditional Image Processing Methods and Deep Learning
2.1. Crack Detection Based on Image Edge Detection and Deep Learning
2.2. Crack Detection Based on Threshold Segmentation and Deep Learning
2.3. Crack Detection Based on Morphological Operations and Deep Learning
3. Crack Detection Based on Multimodal Data Fusion
3.1. Multi-Sensor Fusion
3.2. Multi-Source Data Fusion
4. Crack Detection Based on Image Semantic Understanding
4.1. Crack Detection Based on Classification Networks
4.2. Crack Detection Based on Object Detection Networks
4.3. Crack Detection Based on Segmentation Networks
Model | Improvement/Innovation | Backbone/Feature Extraction Architecture | Efficiency | Results |
---|---|---|---|---|
FCS-Net [ ] | Integrating ResNet-50, ASPP, and BN | ResNet-50 | - | MIoU = 74.08% |
FCN-SFW [ ] | Combining fully convolutional network (FCN) and structural forests with wavelet transform (SFW) for detecting tiny cracks | FCN | Computing time = 1.5826 s | Precision = 64.1% Recall = 87.22% F1 score = 68.28% |
AFFNet [ ] | Using ResNet101 as the backbone network, and incorporating two attention mechanism modules, namely VH-CAM and ECAUM | ResNet101 | Execution time = 52 ms | MIoU = 84.49% FWIoU = 97.07% PA = 98.36% MPA = 92.01% |
DeepLabv3+ [ ] | Replacing ordinary convolution with separable convolution; improved SE_ASSP module | Xception-65 | - | AP = 97.63% MAP = 95.58% MIoU = 81.87% |
U-Net [ ] | The parameters were optimized (the depths of the network, the choice of activation functions, the selection of loss functions, and the data augmentation) | Encoder and decoder | Analysis speed (1024 × 1024 pixels) = 0.022 s | Precision = 84.6% Recall = 72.5% F1 score = 78.1% IoU = 64% |
KTCAM-Net [ ] | Combined CAM and RCM; integrating classification network and segmentation network | DeepLabv3 | FPS = 28 | Accuracy = 97.26% Precision = 68.9% Recall = 83.7% F1 score = 75.4% MIoU = 74.3% |
ADDU-Net [ ] | Featuring asymmetric dual decoders and dual attention mechanisms | Encoder and decoder | FPS = 35 | Precision = 68.9% Recall = 83.7% F1 score = 75.4% MIoU = 74.3% |
CGTr-Net [ ] | Optimized CG-Trans, TCFF, and hybrid loss functions | CG-Trans | - | Precision = 88.8% Recall = 88.3% F1 score = 88.6% MIoU = 89.4% |
PCSN [ ] | Using Adadelta as the optimizer and categorical cross-entropy as the loss function for the network | SegNet | Inference time = 0.12 s | mAP = 83% Accuracy = 90% Recall = 50% |
DEHF-Net [ ] | Introducing dual-branch encoder unit, feature fusion scheme, edge refinement module, and multi-scale feature fusion module | Dual-branch encoder unit | - | Precision = 86.3% Recall = 92.4% Dice score = 78.7% mIoU = 81.6% |
Student model + teacher model [ ] | Proposed a semi-supervised semantic segmentation network | EfficientUNet | - | Precision = 84.98% Recall = 84.38% F1 score = 83.15% |
5. Datasets
6. Evaluation Index
7. Discussion
8. Conclusions
Author Contributions
Data Availability Statement
Acknowledgments
Conflicts of Interest
Aspect | Combining Traditional Image Processing Methods and Deep Learning | Multimodal Data Fusion |
---|---|---|
Processing speed | Moderate—traditional methods are usually fast, but deep learning models may be slower, and the overall speed depends on the complexity of the deep learning model | Slower—data fusion and processing speed can be slow, especially with large-scale multimodal data, involving significant computational and data transfer overhead |
Accuracy | High—combines the interpretability of traditional methods with the complex pattern handling of deep learning, generally resulting in high detection accuracy | Typically higher—combining different data sources (e.g., images, text, audio) provides comprehensive information, improving overall detection accuracy |
Robustness | Strong—traditional methods provide background knowledge, enhancing robustness, but deep learning’s risk of overfitting may reduce robustness | Very strong—fusion of multiple data sources enhances the model’s adaptability to different environments and conditions, better handling noise and anomalies |
Complexity | High—integrating traditional methods and deep learning involves complex design and balancing, with challenges in tuning and interpreting deep learning models | High—involves complex data preprocessing, alignment, and fusion, handling inconsistencies and complexities from multiple data sources |
Adaptability | Strong—can adapt to different types of cracks and background variations, with deep learning models learning features from data, though it requires substantial labeled data | Very strong—combines diverse data sources, adapting well to various environments and conditions, and handling complex backgrounds and variations effectively |
Interpretability | Higher—traditional methods provide clear explanations, while deep learning models often lack interpretability; combining them can improve overall interpretability | Lower—fusion models generally have lower interpretability, making it difficult to intuitively explain how different data sources influence the final results |
Data requirements | High—deep learning models require a lot of labeled data, while traditional methods are more lenient, though deep learning still demands substantial data | Very high—requires large amounts of data from various modalities, and these data need to be processed and aligned effectively for successful fusion |
Flexibility | Moderate—combining traditional methods and deep learning handles various types of cracks, but may be limited in very complex scenarios | High—handles multiple data sources and different crack information, improving performance in diverse conditions through multimodal fusion |
Real-time capability | Poor—deep learning models are often slow to train and infer, making them less suitable for real-time detection, though combining with traditional methods can help | Poor—multimodal data fusion processing is generally slow, making it less suitable for real-time applications |
Maintenance cost | Moderate to high—deep learning models require regular updates and maintenance, while traditional methods have lower maintenance costs | High—involves ongoing maintenance and updates for multiple data sources, with complex data preprocessing and fusion processes |
Noise handling | Good—traditional methods effectively handle noise under certain conditions, and deep learning models can mitigate noise effects through training | Strong—multimodal fusion can complement information from different sources, improving robustness to noise and enhancing detection accuracy |
- Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-driven structural health monitoring and damage detection through deep learning: State-of-the-art review. Sensors 2020 , 20 , 2778. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Han, X.; Zhao, Z. Structural surface crack detection method based on computer vision technology. J. Build. Struct. 2018 , 39 , 418–427. [ Google Scholar ]
- Kruachottikul, P.; Cooharojananone, N.; Phanomchoeng, G.; Chavarnakul, T.; Kovitanggoon, K.; Trakulwaranont, D. Deep learning-based visual defect-inspection system for reinforced concrete bridge substructure: A case of thailand’s department of highways. J. Civ. Struct. Health Monit. 2021 , 11 , 949–965. [ Google Scholar ] [ CrossRef ]
- Gehri, N.; Mata-Falcón, J.; Kaufmann, W. Automated crack detection and measurement based on digital image correlation. Constr. Build. Mater. 2020 , 256 , 119383. [ Google Scholar ] [ CrossRef ]
- Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018 , 57 , 787–798. [ Google Scholar ] [ CrossRef ]
- Liu, Y.; Fan, J.; Nie, J.; Kong, S.; Qi, Y. Review and prospect of digital-image-based crack detection of structure surface. China Civ. Eng. J. 2021 , 54 , 79–98. [ Google Scholar ]
- Hsieh, Y.-A.; Tsai, Y.J. Machine learning for crack detection: Review and model performance comparison. J. Comput. Civ. Eng. 2020 , 34 , 04020038. [ Google Scholar ] [ CrossRef ]
- Xu, Y.; Bao, Y.; Chen, J.; Zuo, W.; Li, H. Surface fatigue crack identification in steel box girder of bridges by a deep fusion convolutional neural network based on consumer-grade camera images. Struct. Health Monit. 2019 , 18 , 653–674. [ Google Scholar ] [ CrossRef ]
- Wang, W.; Deng, L.; Shao, X. Fatigue design of steel bridges considering the effect of dynamic vehicle loading and overloaded trucks. J. Bridge Eng. 2016 , 21 , 04016048. [ Google Scholar ] [ CrossRef ]
- Zheng, K.; Zhou, S.; Zhang, Y.; Wei, Y.; Wang, J.; Wang, Y.; Qin, X. Simplified evaluation of shear stiffness degradation of diagonally cracked reinforced concrete beams. Materials 2023 , 16 , 4752. [ Google Scholar ] [ CrossRef ]
- Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986 , PAMI-8 , 679–698. [ Google Scholar ] [ CrossRef ]
- Otsu, N. A threshold selection method from gray-level histograms. Automatica 1975 , 11 , 23–27. [ Google Scholar ] [ CrossRef ]
- Sohn, H.G.; Lim, Y.M.; Yun, K.H.; Kim, G.H. Monitoring crack changes in concrete structures. Comput.-Aided Civ. Infrastruct. Eng. 2005 , 20 , 52–61. [ Google Scholar ] [ CrossRef ]
- Wang, P.; Qiao, H.; Feng, Q.; Xue, C. Internal corrosion cracks evolution in reinforced magnesium oxychloride cement concrete. Adv. Cem. Res. 2023 , 36 , 15–30. [ Google Scholar ] [ CrossRef ]
- Loutridis, S.; Douka, E.; Trochidis, A. Crack identification in double-cracked beams using wavelet analysis. J. Sound Vib. 2004 , 277 , 1025–1039. [ Google Scholar ] [ CrossRef ]
- Fan, C.L. Detection of multidamage to reinforced concrete using support vector machine-based clustering from digital images. Struct. Control Health Monit. 2021 , 28 , e2841. [ Google Scholar ] [ CrossRef ]
- Kyal, C.; Reza, M.; Varu, B.; Shreya, S. Image-based concrete crack detection using random forest and convolution neural network. In Computational Intelligence in Pattern Recognition: Proceedings of the International Conference on Computational Intelligence in Pattern Recognition (CIPR 2021), Held at the Institute of Engineering and Management, Kolkata, West Bengal, India, on 24–25 April 2021 ; Springer: Singapore, 2022; pp. 471–481. [ Google Scholar ]
- Jia, H.; Lin, J.; Liu, J. Bridge seismic damage assessment model applying artificial neural networks and the random forest algorithm. Adv. Civ. Eng. 2020 , 2020 , 6548682. [ Google Scholar ] [ CrossRef ]
- Park, M.J.; Kim, J.; Jeong, S.; Jang, A.; Bae, J.; Ju, Y.K. Machine learning-based concrete crack depth prediction using thermal images taken under daylight conditions. Remote Sens. 2022 , 14 , 2151. [ Google Scholar ] [ CrossRef ]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015 , 521 , 436–444. [ Google Scholar ] [ CrossRef ]
- Liu, Z.; Cao, Y.; Wang, Y.; Wang, W. Computer vision-based concrete crack detection using u-net fully convolutional networks. Autom. Constr. 2019 , 104 , 129–139. [ Google Scholar ] [ CrossRef ]
- Li, G.; Ma, B.; He, S.; Ren, X.; Liu, Q. Automatic tunnel crack detection based on u-net and a convolutional neural network with alternately updated clique. Sensors 2020 , 20 , 717. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Chaiyasarn, K.; Buatik, A.; Mohamad, H.; Zhou, M.; Kongsilp, S.; Poovarodom, N. Integrated pixel-level cnn-fcn crack detection via photogrammetric 3d texture mapping of concrete structures. Autom. Constr. 2022 , 140 , 104388. [ Google Scholar ] [ CrossRef ]
- Li, S.; Zhao, X.; Zhou, G. Automatic pixel-level multiple damage detection of concrete structure using fully convolutional network. Comput.-Aided Civ. Infrastruct. Eng. 2019 , 34 , 616–634. [ Google Scholar ] [ CrossRef ]
- Zheng, X.; Zhang, S.; Li, X.; Li, G.; Li, X. Lightweight bridge crack detection method based on segnet and bottleneck depth-separable convolution with residuals. IEEE Access 2021 , 9 , 161649–161668. [ Google Scholar ] [ CrossRef ]
- Azouz, Z.; Honarvar Shakibaei Asli, B.; Khan, M. Evolution of crack analysis in structures using image processing technique: A review. Electronics 2023 , 12 , 3862. [ Google Scholar ] [ CrossRef ]
- Hamishebahar, Y.; Guan, H.; So, S.; Jo, J. A comprehensive review of deep learning-based crack detection approaches. Appl. Sci. 2022 , 12 , 1374. [ Google Scholar ] [ CrossRef ]
- Meng, S.; Gao, Z.; Zhou, Y.; He, B.; Djerrad, A. Real-time automatic crack detection method based on drone. Comput.-Aided Civ. Infrastruct. Eng. 2023 , 38 , 849–872. [ Google Scholar ] [ CrossRef ]
- Humpe, A. Bridge inspection with an off-the-shelf 360 camera drone. Drones 2020 , 4 , 67. [ Google Scholar ] [ CrossRef ]
- Truong-Hong, L.; Lindenbergh, R. Automatically extracting surfaces of reinforced concrete bridges from terrestrial laser scanning point clouds. Autom. Constr. 2022 , 135 , 104127. [ Google Scholar ] [ CrossRef ]
- Cusson, D.; Rossi, C.; Ozkan, I.F. Early warning system for the detection of unexpected bridge displacements from radar satellite data. J. Civ. Struct. Health Monit. 2021 , 11 , 189–204. [ Google Scholar ] [ CrossRef ]
- Bonaldo, G.; Caprino, A.; Lorenzoni, F.; da Porto, F. Monitoring displacements and damage detection through satellite MT-INSAR techniques: A new methodology and application to a case study in rome (Italy). Remote Sens. 2023 , 15 , 1177. [ Google Scholar ] [ CrossRef ]
- Zheng, Z.; Zhong, Y.; Wang, J.; Ma, A.; Zhang, L. Building damage assessment for rapid disaster response with a deep object-based semantic change detection framework: From natural disasters to man-made disasters. Remote Sens. Environ. 2021 , 265 , 112636. [ Google Scholar ] [ CrossRef ]
- Chen, X.; Zhang, X.; Ren, M.; Zhou, B.; Sun, M.; Feng, Z.; Chen, B.; Zhi, X. A multiscale enhanced pavement crack segmentation network coupling spectral and spatial information of UAV hyperspectral imagery. Int. J. Appl. Earth Obs. Geoinf. 2024 , 128 , 103772. [ Google Scholar ] [ CrossRef ]
- Liu, F.; Liu, J.; Wang, L. Deep learning and infrared thermography for asphalt pavement crack severity classification. Autom. Constr. 2022 , 140 , 104383. [ Google Scholar ] [ CrossRef ]
- Liu, S.; Han, Y.; Xu, L. Recognition of road cracks based on multi-scale retinex fused with wavelet transform. Array 2022 , 15 , 100193. [ Google Scholar ] [ CrossRef ]
- Zhang, H.; Qian, Z.; Tan, Y.; Xie, Y.; Li, M. Investigation of pavement crack detection based on deep learning method using weakly supervised instance segmentation framework. Constr. Build. Mater. 2022 , 358 , 129117. [ Google Scholar ] [ CrossRef ]
- Dorafshan, S.; Thomas, R.J.; Maguire, M. Comparison of deep convolutional neural networks and edge detectors for image-based crack detection in concrete. Constr. Build. Mater. 2018 , 186 , 1031–1045. [ Google Scholar ] [ CrossRef ]
- Munawar, H.S.; Hammad, A.W.; Haddad, A.; Soares, C.A.P.; Waller, S.T. Image-based crack detection methods: A review. Infrastructures 2021 , 6 , 115. [ Google Scholar ] [ CrossRef ]
- Chen, D.; Li, X.; Hu, F.; Mathiopoulos, P.T.; Di, S.; Sui, M.; Peethambaran, J. Edpnet: An encoding–decoding network with pyramidal representation for semantic image segmentation. Sensors 2023 , 23 , 3205. [ Google Scholar ] [ CrossRef ]
- Mo, S.; Shi, Y.; Yuan, Q.; Li, M. A survey of deep learning road extraction algorithms using high-resolution remote sensing images. Sensors 2024 , 24 , 1708. [ Google Scholar ] [ CrossRef ]
- Chen, D.; Li, J.; Di, S.; Peethambaran, J.; Xiang, G.; Wan, L.; Li, X. Critical points extraction from building façades by analyzing gradient structure tensor. Remote Sens. 2021 , 13 , 3146. [ Google Scholar ] [ CrossRef ]
- Liu, Y.; Yeoh, J.K.; Chua, D.K. Deep learning-based enhancement of motion blurred UAV concrete crack images. J. Comput. Civ. Eng. 2020 , 34 , 04020028. [ Google Scholar ] [ CrossRef ]
- Flah, M.; Nunez, I.; Ben Chaabene, W.; Nehdi, M.L. Machine learning algorithms in civil structural health monitoring: A systematic review. Arch. Comput. Methods Eng. 2021 , 28 , 2621–2643. [ Google Scholar ] [ CrossRef ]
- Li, G.; Li, X.; Zhou, J.; Liu, D.; Ren, W. Pixel-level bridge crack detection using a deep fusion about recurrent residual convolution and context encoder network. Measurement 2021 , 176 , 109171. [ Google Scholar ] [ CrossRef ]
- Ali, R.; Chuah, J.H.; Talip, M.S.A.; Mokhtar, N.; Shoaib, M.A. Structural crack detection using deep convolutional neural networks. Autom. Constr. 2022 , 133 , 103989. [ Google Scholar ] [ CrossRef ]
- Wang, H.; Li, Y.; Dang, L.M.; Lee, S.; Moon, H. Pixel-level tunnel crack segmentation using a weakly supervised annotation approach. Comput. Ind. 2021 , 133 , 103545. [ Google Scholar ] [ CrossRef ]
- Zhu, J.; Song, J. Weakly supervised network based intelligent identification of cracks in asphalt concrete bridge deck. Alex. Eng. J. 2020 , 59 , 1307–1317. [ Google Scholar ] [ CrossRef ]
- Li, Y.; Bao, T.; Xu, B.; Shu, X.; Zhou, Y.; Du, Y.; Wang, R.; Zhang, K. A deep residual neural network framework with transfer learning for concrete dams patch-level crack classification and weakly-supervised localization. Measurement 2022 , 188 , 110641. [ Google Scholar ] [ CrossRef ]
- Yang, Q.; Shi, W.; Chen, J.; Lin, W. Deep convolution neural network-based transfer learning method for civil infrastructure crack detection. Autom. Constr. 2020 , 116 , 103199. [ Google Scholar ] [ CrossRef ]
- Dais, D.; Bal, I.E.; Smyrou, E.; Sarhosis, V. Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 2021 , 125 , 103606. [ Google Scholar ] [ CrossRef ]
- Abdellatif, M.; Peel, H.; Cohn, A.G.; Fuentes, R. Combining block-based and pixel-based approaches to improve crack detection and localisation. Autom. Constr. 2021 , 122 , 103492. [ Google Scholar ] [ CrossRef ]
- Dan, D.; Dan, Q. Automatic recognition of surface cracks in bridges based on 2D-APES and mobile machine vision. Measurement 2021 , 168 , 108429. [ Google Scholar ] [ CrossRef ]
- Weng, X.; Huang, Y.; Wang, W. Segment-based pavement crack quantification. Autom. Constr. 2019 , 105 , 102819. [ Google Scholar ] [ CrossRef ]
- Kao, S.-P.; Chang, Y.-C.; Wang, F.-L. Combining the YOLOv4 deep learning model with UAV imagery processing technology in the extraction and quantization of cracks in bridges. Sensors 2023 , 23 , 2572. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Li, X.; Xu, X.; He, X.; Wei, X.; Yang, H. Intelligent crack detection method based on GM-ResNet. Sensors 2023 , 23 , 8369. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Choi, Y.; Park, H.W.; Mi, Y.; Song, S. Crack detection and analysis of concrete structures based on neural network and clustering. Sensors 2024 , 24 , 1725. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Guo, J.-M.; Markoni, H.; Lee, J.-D. BARNet: Boundary aware refinement network for crack detection. IEEE Trans. Intell. Transp. Syst. 2021 , 23 , 7343–7358. [ Google Scholar ] [ CrossRef ]
- Luo, J.; Lin, H.; Wei, X.; Wang, Y. Adaptive canny and semantic segmentation networks based on feature fusion for road crack detection. IEEE Access 2023 , 11 , 51740–51753. [ Google Scholar ] [ CrossRef ]
- Ranyal, E.; Sadhu, A.; Jain, K. Enhancing pavement health assessment: An attention-based approach for accurate crack detection, measurement, and mapping. Expert Syst. Appl. 2024 , 247 , 123314. [ Google Scholar ] [ CrossRef ]
- Liu, K.; Chen, B.M. Industrial UAV-based unsupervised domain adaptive crack recognitions: From database towards real-site infrastructural inspections. IEEE Trans. Ind. Electron. 2022 , 70 , 9410–9420. [ Google Scholar ] [ CrossRef ]
- Wang, W.; Hu, W.; Wang, W.; Xu, X.; Wang, M.; Shi, Y.; Qiu, S.; Tutumluer, E. Automated crack severity level detection and classification for ballastless track slab using deep convolutional neural network. Autom. Constr. 2021 , 124 , 103484. [ Google Scholar ] [ CrossRef ]
- Xu, Z.; Zhang, X.; Chen, W.; Liu, J.; Xu, T.; Wang, Z. Muraldiff: Diffusion for ancient murals restoration on large-scale pre-training. IEEE Trans. Emerg. Top. Comput. Intell. 2024 , 8 , 2169–2181. [ Google Scholar ] [ CrossRef ]
- Bradley, D.; Roth, G. Adaptive thresholding using the integral image. J. Graph. Tools 2007 , 12 , 13–21. [ Google Scholar ] [ CrossRef ]
- Sezgin, M.; Sankur, B. Survey over image thresholding techniques and quantitative performance evaluation. J. Electron. Imaging 2004 , 13 , 146–168. [ Google Scholar ]
- Kapur, J.N.; Sahoo, P.K.; Wong, A.K. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 1985 , 29 , 273–285. [ Google Scholar ] [ CrossRef ]
- Pal, N.R.; Pal, S.K. A review on image segmentation techniques. Pattern Recognit. 1993 , 26 , 1277–1294. [ Google Scholar ] [ CrossRef ]
- Flah, M.; Suleiman, A.R.; Nehdi, M.L. Classification and quantification of cracks in concrete structures using deep learning image-based techniques. Cem. Concr. Compos. 2020 , 114 , 103781. [ Google Scholar ] [ CrossRef ]
- Mazni, M.; Husain, A.R.; Shapiai, M.I.; Ibrahim, I.S.; Anggara, D.W.; Zulkifli, R. An investigation into real-time surface crack classification and measurement for structural health monitoring using transfer learning convolutional neural networks and otsu method. Alex. Eng. J. 2024 , 92 , 310–320. [ Google Scholar ] [ CrossRef ]
- He, Z.; Xu, W. Deep learning and image preprocessing-based crack repair trace and secondary crack classification detection method for concrete bridges. Struct. Infrastruct. Eng. 2024 , 20 , 1–17. [ Google Scholar ] [ CrossRef ]
- He, T.; Li, H.; Qian, Z.; Niu, C.; Huang, R. Research on weakly supervised pavement crack segmentation based on defect location by generative adversarial network and target re-optimization. Constr. Build. Mater. 2024 , 411 , 134668. [ Google Scholar ] [ CrossRef ]
- Su, H.; Wang, X.; Han, T.; Wang, Z.; Zhao, Z.; Zhang, P. Research on a U-Net bridge crack identification and feature-calculation methods based on a CBAM attention mechanism. Buildings 2022 , 12 , 1561. [ Google Scholar ] [ CrossRef ]
- Kang, D.; Benipal, S.S.; Gopal, D.L.; Cha, Y.-J. Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning. Autom. Constr. 2020 , 118 , 103291. [ Google Scholar ] [ CrossRef ]
- Lei, Q.; Zhong, J.; Wang, C. Joint optimization of crack segmentation with an adaptive dynamic threshold module. IEEE Trans. Intell. Transp. Syst. 2024 , 25 , 6902–6916. [ Google Scholar ] [ CrossRef ]
- Lei, Q.; Zhong, J.; Wang, C.; Xia, Y.; Zhou, Y. Dynamic thresholding for accurate crack segmentation using multi-objective optimization. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Turin, Italy, 18 September 2023; Springer: Cham, Switzerland, 2023; pp. 389–404. [ Google Scholar ]
- Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991 , 13 , 583–598. [ Google Scholar ] [ CrossRef ]
- Huang, H.; Zhao, S.; Zhang, D.; Chen, J. Deep learning-based instance segmentation of cracks from shield tunnel lining images. Struct. Infrastruct. Eng. 2022 , 18 , 183–196. [ Google Scholar ] [ CrossRef ]
- Fan, Z.; Lin, H.; Li, C.; Su, J.; Bruno, S.; Loprencipe, G. Use of parallel resnet for high-performance pavement crack detection and measurement. Sustainability 2022 , 14 , 1825. [ Google Scholar ] [ CrossRef ]
- Kong, S.Y.; Fan, J.S.; Liu, Y.F.; Wei, X.C.; Ma, X.W. Automated crack assessment and quantitative growth monitoring. Comput.-Aided Civ. Infrastruct. Eng. 2021 , 36 , 656–674. [ Google Scholar ] [ CrossRef ]
- Dang, L.M.; Wang, H.; Li, Y.; Park, Y.; Oh, C.; Nguyen, T.N.; Moon, H. Automatic tunnel lining crack evaluation and measurement using deep learning. Tunn. Undergr. Space Technol. 2022 , 124 , 104472. [ Google Scholar ] [ CrossRef ]
- Andrushia, A.D.; Anand, N.; Lubloy, E. Deep learning based thermal crack detection on structural concrete exposed to elevated temperature. Adv. Struct. Eng. 2021 , 24 , 1896–1909. [ Google Scholar ] [ CrossRef ]
- Dang, L.M.; Wang, H.; Li, Y.; Nguyen, L.Q.; Nguyen, T.N.; Song, H.-K.; Moon, H. Deep learning-based masonry crack segmentation and real-life crack length measurement. Constr. Build. Mater. 2022 , 359 , 129438. [ Google Scholar ] [ CrossRef ]
- Nguyen, A.; Gharehbaghi, V.; Le, N.T.; Sterling, L.; Chaudhry, U.I.; Crawford, S. ASR crack identification in bridges using deep learning and texture analysis. Structures 2023 , 50 , 494–507. [ Google Scholar ] [ CrossRef ]
- Dong, C.; Li, L.; Yan, J.; Zhang, Z.; Pan, H.; Catbas, F.N. Pixel-level fatigue crack segmentation in large-scale images of steel structures using an encoder–decoder network. Sensors 2021 , 21 , 4135. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Jian, L.; Chengshun, L.; Guanhong, L.; Zhiyuan, Z.; Bo, H.; Feng, G.; Quanyi, X. Lightweight defect detection equipment for road tunnels. IEEE Sens. J. 2023 , 24 , 5107–5121. [ Google Scholar ]
- Liang, H.; Qiu, D.; Ding, K.-L.; Zhang, Y.; Wang, Y.; Wang, X.; Liu, T.; Wan, S. Automatic pavement crack detection in multisource fusion images using similarity and difference features. IEEE Sens. J. 2023 , 24 , 5449–5465. [ Google Scholar ] [ CrossRef ]
- Alamdari, A.G.; Ebrahimkhanlou, A. A multi-scale robotic approach for precise crack measurement in concrete structures. Autom. Constr. 2024 , 158 , 105215. [ Google Scholar ] [ CrossRef ]
- Liu, H.; Kollosche, M.; Laflamme, S.; Clarke, D.R. Multifunctional soft stretchable strain sensor for complementary optical and electrical sensing of fatigue cracks. Smart Mater. Struct. 2023 , 32 , 045010. [ Google Scholar ] [ CrossRef ]
- Dang, D.-Z.; Wang, Y.-W.; Ni, Y.-Q. Nonlinear autoregression-based non-destructive evaluation approach for railway tracks using an ultrasonic fiber bragg grating array. Constr. Build. Mater. 2024 , 411 , 134728. [ Google Scholar ] [ CrossRef ]
- Yan, M.; Tan, X.; Mahjoubi, S.; Bao, Y. Strain transfer effect on measurements with distributed fiber optic sensors. Autom. Constr. 2022 , 139 , 104262. [ Google Scholar ] [ CrossRef ]
- Shukla, H.; Piratla, K. Leakage detection in water pipelines using supervised classification of acceleration signals. Autom. Constr. 2020 , 117 , 103256. [ Google Scholar ] [ CrossRef ]
- Chen, X.; Zhang, X.; Li, J.; Ren, M.; Zhou, B. A new method for automated monitoring of road pavement aging conditions based on recurrent neural network. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 24510–24523. [ Google Scholar ] [ CrossRef ]
- Zhang, S.; He, X.; Xue, B.; Wu, T.; Ren, K.; Zhao, T. Segment-anything embedding for pixel-level road damage extraction using high-resolution satellite images. Int. J. Appl. Earth Obs. Geoinf. 2024 , 131 , 103985. [ Google Scholar ] [ CrossRef ]
- Park, S.E.; Eem, S.-H.; Jeon, H. Concrete crack detection and quantification using deep learning and structured light. Constr. Build. Mater. 2020 , 252 , 119096. [ Google Scholar ] [ CrossRef ]
- Yan, Y.; Mao, Z.; Wu, J.; Padir, T.; Hajjar, J.F. Towards automated detection and quantification of concrete cracks using integrated images and lidar data from unmanned aerial vehicles. Struct. Control Health Monit. 2021 , 28 , e2757. [ Google Scholar ] [ CrossRef ]
- Dong, Q.; Wang, S.; Chen, X.; Jiang, W.; Li, R.; Gu, X. Pavement crack detection based on point cloud data and data fusion. Philos. Trans. R. Soc. A 2023 , 381 , 20220165. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Kim, H.; Lee, S.; Ahn, E.; Shin, M.; Sim, S.-H. Crack identification method for concrete structures considering angle of view using RGB-D camera-based sensor fusion. Struct. Health Monit. 2021 , 20 , 500–512. [ Google Scholar ] [ CrossRef ]
- Chen, J.; Lu, W.; Lou, J. Automatic concrete defect detection and reconstruction by aligning aerial images onto semantic-rich building information model. Comput.-Aided Civ. Infrastruct. Eng. 2023 , 38 , 1079–1098. [ Google Scholar ] [ CrossRef ]
- Pozzer, S.; Rezazadeh Azar, E.; Dalla Rosa, F.; Chamberlain Pravia, Z.M. Semantic segmentation of defects in infrared thermographic images of highly damaged concrete structures. J. Perform. Constr. Facil. 2021 , 35 , 04020131. [ Google Scholar ] [ CrossRef ]
- Kaur, R.; Singh, S. A comprehensive review of object detection with deep learning. Digit. Signal Process. 2023 , 132 , 103812. [ Google Scholar ] [ CrossRef ]
- Sharma, V.K.; Mir, R.N. A comprehensive and systematic look up into deep learning based object detection techniques: A review. Comput. Sci. Rev. 2020 , 38 , 100301. [ Google Scholar ] [ CrossRef ]
- Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3708–3712. [ Google Scholar ]
- Yang, C.; Chen, J.; Li, Z.; Huang, Y. Structural crack detection and recognition based on deep learning. Appl. Sci. 2021 , 11 , 2868. [ Google Scholar ] [ CrossRef ]
- Rajadurai, R.-S.; Kang, S.-T. Automated vision-based crack detection on concrete surfaces using deep learning. Appl. Sci. 2021 , 11 , 5229. [ Google Scholar ] [ CrossRef ]
- Kim, B.; Yuvaraj, N.; Sri Preethaa, K.; Arun Pandian, R. Surface crack detection using deep learning with shallow CNN architecture for enhanced computation. Neural Comput. Appl. 2021 , 33 , 9289–9305. [ Google Scholar ] [ CrossRef ]
- O’Brien, D.; Osborne, J.A.; Perez-Duenas, E.; Cunningham, R.; Li, Z. Automated crack classification for the CERN underground tunnel infrastructure using deep learning. Tunn. Undergr. Space Technol. 2023 , 131 , 104668. [ Google Scholar ]
- Chen, K.; Reichard, G.; Xu, X.; Akanmu, A. Automated crack segmentation in close-range building façade inspection images using deep learning techniques. J. Build. Eng. 2021 , 43 , 102913. [ Google Scholar ] [ CrossRef ]
- Dong, Z.; Wang, J.; Cui, B.; Wang, D.; Wang, X. Patch-based weakly supervised semantic segmentation network for crack detection. Constr. Build. Mater. 2020 , 258 , 120291. [ Google Scholar ] [ CrossRef ]
- Buatik, A.; Thansirichaisree, P.; Kalpiyapun, P.; Khademi, N.; Pasityothin, I.; Poovarodom, N. Mosaic crack mapping of footings by convolutional neural networks. Sci. Rep. 2024 , 14 , 7851. [ Google Scholar ] [ CrossRef ] [ PubMed ]
- Zhang, Y.; Zhang, L. Detection of pavement cracks by deep learning models of transformer and UNet. arXiv 2023 , arXiv:2304.12596. [ Google Scholar ] [ CrossRef ]
- Al-Huda, Z.; Peng, B.; Algburi, R.N.A.; Al-antari, M.A.; Rabea, A.-J.; Zhai, D. A hybrid deep learning pavement crack semantic segmentation. Eng. Appl. Artif. Intell. 2023 , 122 , 106142. [ Google Scholar ] [ CrossRef ]
- Shamsabadi, E.A.; Xu, C.; Rao, A.S.; Nguyen, T.; Ngo, T.; Dias-da-Costa, D. Vision transformer-based autonomous crack detection on asphalt and concrete surfaces. Autom. Constr. 2022 , 140 , 104316. [ Google Scholar ] [ CrossRef ]
- Huang, S.; Tang, W.; Huang, G.; Huangfu, L.; Yang, D. Weakly supervised patch label inference networks for efficient pavement distress detection and recognition in the wild. IEEE Trans. Intell. Transp. Syst. 2023 , 24 , 5216–5228. [ Google Scholar ] [ CrossRef ]
- Huang, G.; Huang, S.; Huangfu, L.; Yang, D. Weakly supervised patch label inference network with image pyramid for pavement diseases recognition in the wild. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 7978–7982. [ Google Scholar ]
- Guo, J.-M.; Markoni, H. Efficient and adaptable patch-based crack detection. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 21885–21896. [ Google Scholar ] [ CrossRef ]
- König, J.; Jenkins, M.D.; Mannion, M.; Barrie, P.; Morison, G. Weakly-supervised surface crack segmentation by generating pseudo-labels using localization with a classifier and thresholding. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 24083–24094. [ Google Scholar ] [ CrossRef ]
- Al-Huda, Z.; Peng, B.; Algburi, R.N.A.; Al-antari, M.A.; Rabea, A.-J.; Al-maqtari, O.; Zhai, D. Asymmetric dual-decoder-U-Net for pavement crack semantic segmentation. Autom. Constr. 2023 , 156 , 105138. [ Google Scholar ] [ CrossRef ]
- Wen, T.; Lang, H.; Ding, S.; Lu, J.J.; Xing, Y. PCDNet: Seed operation-based deep learning model for pavement crack detection on 3d asphalt surface. J. Transp. Eng. Part B Pavements 2022 , 148 , 04022023. [ Google Scholar ] [ CrossRef ]
- Mishra, A.; Gangisetti, G.; Eftekhar Azam, Y.; Khazanchi, D. Weakly supervised crack segmentation using crack attention networks on concrete structures. Struct. Health Monit. 2024 , 23 , 14759217241228150. [ Google Scholar ] [ CrossRef ]
- Kompanets, A.; Pai, G.; Duits, R.; Leonetti, D.; Snijder, B. Deep learning for segmentation of cracks in high-resolution images of steel bridges. arXiv 2024 , arXiv:2403.17725. [ Google Scholar ]
- Liu, Y.; Yeoh, J.K. Robust pixel-wise concrete crack segmentation and properties retrieval using image patches. Autom. Constr. 2021 , 123 , 103535. [ Google Scholar ] [ CrossRef ]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [ Google Scholar ]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [ Google Scholar ]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 28. [ Google Scholar ]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [ Google Scholar ]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [ Google Scholar ]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [ Google Scholar ]
- Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018 , arXiv:1804.02767. [ Google Scholar ]
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020 , arXiv:2004.10934. [ Google Scholar ]
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475. [ Google Scholar ]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [ Google Scholar ]
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [ Google Scholar ]
- Xu, Y.; Li, D.; Xie, Q.; Wu, Q.; Wang, J. Automatic defect detection and segmentation of tunnel surface using modified mask R-CNN. Measurement 2021 , 178 , 109316. [ Google Scholar ] [ CrossRef ]
- Zhao, W.; Liu, Y.; Zhang, J.; Shao, Y.; Shu, J. Automatic pixel-level crack detection and evaluation of concrete structures using deep learning. Struct. Control Health Monit. 2022 , 29 , e2981. [ Google Scholar ] [ CrossRef ]
- Li, R.; Yu, J.; Li, F.; Yang, R.; Wang, Y.; Peng, Z. Automatic bridge crack detection using unmanned aerial vehicle and faster R-CNN. Constr. Build. Mater. 2023 , 362 , 129659. [ Google Scholar ] [ CrossRef ]
- Tran, T.S.; Nguyen, S.D.; Lee, H.J.; Tran, V.P. Advanced crack detection and segmentation on bridge decks using deep learning. Constr. Build. Mater. 2023 , 400 , 132839. [ Google Scholar ] [ CrossRef ]
- Zhang, J.; Qian, S.; Tan, C. Automated bridge crack detection method based on lightweight vision models. Complex Intell. Syst. 2023 , 9 , 1639–1652. [ Google Scholar ] [ CrossRef ]
- Ren, R.; Liu, F.; Shi, P.; Wang, H.; Huang, Y. Preprocessing of crack recognition: Automatic crack-location method based on deep learning. J. Mater. Civ. Eng. 2023 , 35 , 04022452. [ Google Scholar ] [ CrossRef ]
- Liu, Z.; Yeoh, J.K.; Gu, X.; Dong, Q.; Chen, Y.; Wu, W.; Wang, L.; Wang, D. Automatic pixel-level detection of vertical cracks in asphalt pavement based on gpr investigation and improved mask R-CNN. Autom. Constr. 2023 , 146 , 104689. [ Google Scholar ] [ CrossRef ]
- Li, Z.; Zhu, H.; Huang, M. A deep learning-based fine crack segmentation network on full-scale steel bridge images with complicated backgrounds. IEEE Access 2021 , 9 , 114989–114997. [ Google Scholar ] [ CrossRef ]
- Alipour, M.; Harris, D.K.; Miller, G.R. Robust pixel-level crack detection using deep fully convolutional neural networks. J. Comput. Civ. Eng. 2019 , 33 , 04019040. [ Google Scholar ] [ CrossRef ]
- Wang, S.; Pan, Y.; Chen, M.; Zhang, Y.; Wu, X. FCN-SFW: Steel structure crack segmentation using a fully convolutional network and structured forests. IEEE Access 2020 , 8 , 214358–214373. [ Google Scholar ] [ CrossRef ]
- Hang, J.; Wu, Y.; Li, Y.; Lai, T.; Zhang, J.; Li, Y. A deep learning semantic segmentation network with attention mechanism for concrete crack detection. Struct. Health Monit. 2023 , 22 , 3006–3026. [ Google Scholar ] [ CrossRef ]
- Sun, Y.; Yang, Y.; Yao, G.; Wei, F.; Wong, M. Autonomous crack and bughole detection for concrete surface image based on deep learning. IEEE Access 2021 , 9 , 85709–85720. [ Google Scholar ] [ CrossRef ]
- Wang, Z.; Leng, Z.; Zhang, Z. A weakly-supervised transformer-based hybrid network with multi-attention for pavement crack detection. Constr. Build. Mater. 2024 , 411 , 134134. [ Google Scholar ] [ CrossRef ]
- Chen, T.; Cai, Z.; Zhao, X.; Chen, C.; Liang, X.; Zou, T.; Wang, P. Pavement crack detection and recognition using the architecture of segNet. J. Ind. Inf. Integr. 2020 , 18 , 100144. [ Google Scholar ] [ CrossRef ]
- Bai, S.; Ma, M.; Yang, L.; Liu, Y. Pixel-wise crack defect segmentation with dual-encoder fusion network. Constr. Build. Mater. 2024 , 426 , 136179. [ Google Scholar ] [ CrossRef ]
- Wang, W.; Su, C. Semi-supervised semantic segmentation network for surface crack detection. Autom. Constr. 2021 , 128 , 103786. [ Google Scholar ] [ CrossRef ]
- Tabernik, D.; Šela, S.; Skvarč, J.; Skočaj, D. Segmentation-based deep-learning approach for surface-defect detection. J. Intell. Manuf. 2020 , 31 , 759–776. [ Google Scholar ] [ CrossRef ]
- König, J.; Jenkins, M.D.; Mannion, M.; Barrie, P.; Morison, G. Optimized deep encoder-decoder methods for crack segmentation. Digit. Signal Process. 2021 , 108 , 102907. [ Google Scholar ] [ CrossRef ]
- Wang, C.; Liu, H.; An, X.; Gong, Z.; Deng, F. Swincrack: Pavement crack detection using convolutional swin-transformer network. Digit. Signal Process. 2024 , 145 , 104297. [ Google Scholar ] [ CrossRef ]
- Lan, Z.-X.; Dong, X.-M. Minicrack: A simple but efficient convolutional neural network for pixel-level narrow crack detection. Comput. Ind. 2022 , 141 , 103698. [ Google Scholar ] [ CrossRef ]
- Salton, G. Introduction to Modern Information Retrieval ; McGraw-Hill: New York, NY, USA, 1983. [ Google Scholar ]
- Jenkins, M.D.; Carr, T.A.; Iglesias, M.I.; Buggy, T.; Morison, G. A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks. In Proceedings of the 2018 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, 3–7 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2120–2124. [ Google Scholar ]
- Tsai, Y.-C.; Chatterjee, A. Comprehensive, quantitative crack detection algorithm performance evaluation system. J. Comput. Civ. Eng. 2017 , 31 , 04017047. [ Google Scholar ] [ CrossRef ]
- Li, H.; Wang, J.; Zhang, Y.; Wang, Z.; Wang, T. A study on evaluation standard for automatic crack detection regard the random fractal. arXiv 2020 , arXiv:2007.12082. [ Google Scholar ]
Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
---|---|---|---|---|---|---|
Canny and YOLOv4 [ ] | Crack detection and measurement | Bridges | 1463 images 256 × 256 pixels | Smartphone and DJI UAV | Accuracy = 92% mAP = 92% | The Canny edge detector is affected by the threshold |
Canny and GM-ResNet [ ] | Crack detection, measurement, and classification | Road | 522 images 224 × 224 pixels | Concrete crack sub-dataset | Precision = 97.9% Recall = 98.9% F1 measure = 98.0% Accuracy in shadow conditions = 99.3% Accuracy in shadow-free conditions = 99.9% | Its detection performance for complex cracks is not yet perfect |
Sobel and ResNet50 [ ] | Crack detection | Concrete | 4500 images 100 × 100 pixels | FLIR E8 | Precision = 98.4% Recall = 88.7% F1 measure = 93.2% | - |
Sobel and BARNet [ ] | Crack detection and localization | Road | 206 images 800 × 600 pixels | CrackTree200 dataset | AIU = 19.85% ODS = 79.9% OIS = 81.4% | Hyperparameter tuning is needed to balance the penalty weights for different types of cracks |
Canny and DeepLabV3+ [ ] | Crack detection | Road | 2000 × 1500 pixels | Crack500 dataset | MIoU = 77.64% MAE = 1.55 PA = 97.38% F1 score = 63% | Detection performance deteriorates in dark environments or when interfering objects are present |
Canny and RetinaNet [ ] | Crack detection and measurement | Road | 850 images 256 × 256 pixels | SDNET 2018 dataset | Precision = 85.96% Recall = 84.48% F1 score = 85.21% | - |
Canny and Transformer [ ] | Crack detection and segmentation | Buildings | 11,298 images 450 × 450 pixels | UAVs | GA = 83.5% MIoU = 76.2% Precision = 74.3% Recall = 75.2% F1 score = 74.7% | Incurs a marginal increase in computational cost across the various network backbones |
Canny and Inception-ResNet-v2 [ ] | Crack detection, measurement, and classification | High-speed railway | 4650 images 400 × 400 pixels | The track inspection vehicle | High severity level: Precision = 98.37% Recall = 93.82% F1 score = 95.99% Low severity level: Precision = 94.25% Recall = 98.39% F1 score = 96.23% | Only the average width was used to define the severity of the crack, and the influence of the length on the detection result was not considered |
Canny and Unet [ ] | Crack detection | Buildings | 165 images | - | SSIM = 0.3206 PSNR = 14.5392 RMSE = 0.0747 | Relies on a large amount of mural data for training and enhancement |
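The hybrid methods in the table above all begin with a gradient-based edge map (Canny or Sobel) before a deep network refines or classifies the result. A minimal pure-Python sketch of the Sobel step on a toy 5 × 5 patch illustrates how a dark crack line produces strong gradient responses on either side of it; the function name, the synthetic patch values, and the magnitude threshold of 100 are illustrative assumptions, not part of any cited method.

```python
def sobel_magnitude(gray):
    """3x3 Sobel gradient magnitude; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += gx_k[j][i] * p
                    gy += gy_k[j][i] * p
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

# Toy patch: a dark vertical "crack" (20) on a bright background (200)
img = [[200, 200, 20, 200, 200] for _ in range(5)]
mag = sobel_magnitude(img)
# Hypothetical magnitude threshold of 100 turns gradients into a binary edge map
edges = [[1 if m > 100 else 0 for m in row] for row in mag]
```

Note that the response peaks on the two intensity transitions flanking the crack, not on the dark pixels themselves, which is why the surveyed pipelines pair the edge map with a learned segmentation or classification stage.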
Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
---|---|---|---|---|---|---|
Otsu and Keras classifier [ ] | Crack detection, measurement, and classification | Concrete | 4000 images 227 × 227 pixels | Open dataset available | Classifiers accuracy = 98.25%, 97.18%, 96.17% Length error = 1.5% Width error = 5% Angle of orientation error = 2% | Only accurately quantify one single crack per image |
Otsu and TL MobileNetV2 [ ] | Crack detection, measurement, and classification | Concrete | 11435 images 224 × 224 pixels | Mendeley data—crack detection | Accuracy = 99.87% Recall = 99.74% Precision = 100% F1 score = 99.87% | Dependency on image quality |
Otsu, YOLOv7, Poisson noise, and bilateral filtering [ ] | Crack detection and classification | Bridges | 500 images 640 × 640 pixels | Dataset | Training time = 35 min Inference time = 8.9 s Target correct rate = 85.97% Negative sample misclassification rate = 42.86% | It does not provide quantified information such as length and area |
Adaptive threshold and WSIS [ ] | Crack detection | Road | 320 images 3024 × 4032 pixels | Photos of cracks | Recall = 90% Precision = 52% IoU = 50% F1 score = 66% Accuracy = 98% | For small cracks (width below 3 pixels), the model can only identify their existence and struggles to delineate them in detail |
Adaptive threshold and U-GAT-IT [ ] | Crack detection | Road | 300 training images and 237 test images | DeepCrack dataset | Recall = 79.3% Precision = 82.2% F1 score = 80.7% | Further research is needed to address interference from small cracks, road shadows, and water stains |
Local thresholding and DCNN [ ] | Crack detection | Concrete | 125 images 227 × 227 pixels | Cameras | Accuracy = 93% Recall = 91% Precision = 92% F1 score = 91% | - |
Otsu and Faster R-CNN [ ] | Crack detection, localization, and quantification | Concrete | 100 images 1920 × 1080 pixels | Nikon d7200 camera and Galaxy s9 camera | AP = 95% mIoU = 83% RMSE = 2.6 pixels Length accuracy = 93% | The proposed method is useful for concrete cracks only; its applicability for the detection of other crack materials might be limited |
Adaptive Dynamic Thresholding Module (ADTM) and Mask DINO [ ] | Crack detection and segmentation | Road | 395 images 2000 × 1500 pixels | Crack500 | mIoU = 81.3% mAcc = 96.4% gAcc = 85.0% | ADTM module can only handle binary classification problems |
Dynamic Thresholding Branch and DeepCrack [ ] | Crack detection and classification | Bridges | 3648 × 5472 pixels | Crack500 | mIoU = 79.3% mAcc = 98.5% gAcc = 86.6% | Image-level thresholds lead to misclassification of the background |
Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
---|---|---|---|---|---|---|
Morphological closing operations and Mask R-CNN [ ] | Crack detection | Tunnel | 761 images 227 × 227 pixels | MTI-200a | Balanced accuracy = 81.94% F1 score = 68.68% IoU = 52.72% | The dataset is relatively small compared to the sample size required for universal conditions |
Morphological operations and Parallel ResNet [ ] | Crack detection and measurement | Road | 206 images (CrackTree200) 800 × 600 pixels and 118 images (CFD) 320 × 480 pixels | CrackTree200 dataset and CFD dataset | CrackTree200: Precision = 94.27% Recall = 92.52% F1 = 93.08% CFD: Precision = 96.21% Recall = 95.12% F1 = 95.63% | The method was only evaluated on static images |
Closing and CNN [ ] | Crack detection, measurement, and classification | Concrete | 3208 images 256 × 256 pixels or 128 × 128 pixels | Hand-held DSLR cameras | Relative error = 5% Accuracy > 95% Loss < 0.1 | Extraction of the crack edges has a large influence on the results |
Dilation and TunnelURes [ ] | Crack detection, measurement, and classification | Tunnel | 6810 images; sizes vary from 10441 × 2910 to 50739 × 3140 pixels | Night 4K line-scan cameras | AUC = 0.97 PA = 0.928 IoU = 0.847 | The medial-axis skeletonization algorithm produced many errors because it was susceptible to crack intersections and to image edges where the crack's representation changed |
Opening, closing, and U-Net [ ] | Crack detection, measurement, and classification | Concrete | 200 images 512 × 512 pixels | Canon SX510 HS camera | Precision = 96.52% Recall = 93.73% F measure = 96.12% Accuracy = 99.74% IoU = 78.12% | It can only detect other crack types whose geometry matches that of thermal cracks |
Morphological operations and DeepLabV3+ [ ] | Crack detection and measurement | Masonry structure | 200 images 780 × 355 pixels and 2880 × 1920 pixels | Internet, drones, and smartphones | IoU = 0.97 F1 score = 98% Accuracy = 98% | The model will not detect crack features that do not appear in the dataset (complicated cracks, tiny cracks, etc.) |
Erosion, texture analysis techniques, and InceptionV3 [ ] | Crack detection and classification | Bridges | 1706 images 256 × 256 pixels | Cameras | F1 score = 93.7% Accuracy = 94.07% | - |
U-Net, opening, and closing operations [ ] | Crack detection and segmentation | Bridges | 244 images 512 × 512 pixels | Cameras | mP = 44.57% mR = 53.13% mF1 = 42.79% mIoU = 64.79% | The model lacks generality, and there are cases of false detection |
Sensor Type | Fusion Method | Advantages | Disadvantages | Application Scenarios |
---|---|---|---|---|
Optical sensor [ ] | Data-level fusion | High resolution, rich in details | Susceptible to light and occlusion | Surface crack detection, general environments |
Thermal sensor [ ] | Feature-level fusion | Suitable for nighttime or low-light environments, detects temperature changes | Low resolution, lack of detail | Nighttime detection, heat-sensitive areas, large-area surface crack detection
Laser sensor [ ] | Data-level and feature-level fusion | High-precision 3D point cloud data, accurately measures crack morphology | High equipment cost, complex data processing | Complex structures, precise measurements
Strain sensor [ ] | Feature-level and decision-level fusion | High sensitivity to structural changes; durable | Requires contact with the material; installation complexity | Monitoring structural health in bridges and buildings; detecting early-stage crack development
Ultrasonic sensor [ ] | Data-level and feature-level fusion | Detects internal cracks in materials, strong penetration | Affected by material and geometric shape, limited resolution | Internal cracks, metal material detection
Optical fiber sensor [ ] | Feature-level fusion | High sensitivity to changes in material properties, non-contact measurement | Affected by environmental conditions, requires calibration | Surface crack detection, structural health monitoring
Vibration sensor [ ] | Data-level fusion | Detects structural vibration characteristics, strong adaptability | Affected by environmental vibrations, requires complex signal processing | Dynamic crack monitoring, bridges and other structures
Multispectral satellite sensor [ ] | Data-level fusion | Rich spectral information | Limited spectral resolution, weather- and lighting-dependent, high cost | Pavement crack detection, bridge and infrastructure monitoring, building facade inspection
High-resolution satellite sensor [ ] | Data-level and feature-level fusion | High spatial resolution, wide coverage, frequent revisit times, rich information content | Weather dependency, high cost, data processing complexity, limited temporal resolution | Road and pavement crack detection, bridge and infrastructure monitoring, urban building facade inspection, railway and highway crack monitoring
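The three fusion levels in the table differ mainly in where the sensor streams are combined. The following NumPy sketch illustrates the distinction; the feature extractors and decision rules are deliberately simplistic placeholders, not methods from the reviewed papers:

```python
import numpy as np

# Two spatially aligned sensor readings for the same surface patch
# (random values stand in for real measurements).
rng = np.random.default_rng(0)
optical = rng.random((64, 64))   # grayscale intensity
thermal = rng.random((64, 64))   # temperature map

# Data-level fusion: stack raw measurements before any processing.
fused_raw = np.stack([optical, thermal], axis=-1)      # shape (64, 64, 2)

# Feature-level fusion: extract features per sensor, then concatenate.
feat_opt = np.array([optical.mean(), optical.std()])   # placeholder features
feat_thm = np.array([thermal.mean(), thermal.std()])
fused_features = np.concatenate([feat_opt, feat_thm])  # shape (4,)

# Decision-level fusion: each sensor decides alone, decisions are combined.
vote_opt = optical.mean() > 0.5                        # placeholder decision rule
vote_thm = thermal.mean() > 0.5
crack_detected = bool(vote_opt and vote_thm)           # e.g. an AND/majority rule
```

The practical trade-off follows from the table: data-level fusion preserves the most information but requires pixel-aligned sensors, while decision-level fusion tolerates misalignment at the cost of discarded detail.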
Scale | Dataset/(Pixels × Pixels) | References |
---|---|---|
Image-based | 227 × 227 | [ , , , ]
 | 224 × 224 | [ ]
 | 256 × 256 | [ ]
 | 416 × 416 | [ ]
 | 512 × 512 | [ ]
Patch-based | 128 × 128 | [ , ]
 | 200 × 200 | [ ]
 | 224 × 224 | [ , , , , ]
 | 227 × 227 | [ ]
 | 256 × 256 | [ , ]
 | 300 × 300 | [ , ]
 | 320 × 480 | [ , ]
 | 544 × 384 | [ ]
 | 512 × 512 | [ , , , ]
 | 584 × 384 | [ ]
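The patch-based scales above arise from tiling full-resolution survey images into fixed-size crops before classification. A minimal sketch of non-overlapping patch extraction (the function name is illustrative):

```python
import numpy as np

def extract_patches(image, patch_h, patch_w):
    """Split an H x W (x C) image into non-overlapping patches,
    dropping any incomplete patches at the right/bottom borders."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_h + 1, patch_h):
        for x in range(0, w - patch_w + 1, patch_w):
            patches.append(image[y:y + patch_h, x:x + patch_w])
    return patches

# e.g. a 600 x 800 road image tiled into 200 x 200 patches -> 3 x 4 = 12 patches
img = np.zeros((600, 800), dtype=np.uint8)
patches = extract_patches(img, 200, 200)
```

Overlapping strides or border padding are common variants, so that cracks crossing patch boundaries are not missed.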
Model | Improvement/Innovation | Dataset | Backbone | Results |
---|---|---|---|---|
Faster R-CNN [ ] | Combined with drones for crack detection | 2000 images 5280 × 2970 pixels | VGG-16 | Precision = 92.03% Recall = 96.26% F1 score = 94.10% |
Faster R-CNN [ ] | A double-head structure is introduced, comprising an independent fully connected head and a convolution head | 1622 images 1612 × 1947 pixels | ResNet50 | AP = 47.2%
Mask R-CNN [ ] | The morphological closing operation was incorporated into the M-R-101-FPN model to form an integrated model | 761 images 227 × 227 pixels | ResNets and VGG | Balanced accuracy = 81.94% F1 score = 68.68% IoU = 52.72% |
Mask R-CNN [ ] | A PAFPN module and an edge detection branch were introduced | 9680 images 1500 × 1500 pixels | ResNet-FPN | Precision = 92.03% Recall = 96.26% AP = 94.10% mAP = 90.57% Error rate = 0.57%
Mask R-CNN [ ] | A side-join method is introduced into the FPN structure, FPN is combined with ResNet-101, and the RoI-Pooling layer is replaced with an RoI-Align layer | 3430 images 1024 × 1024 pixels | ResNet101 | AP = 83.3% F1 score = 82.4% Average error = 2.33% mIoU = 70.1%
YOLOv3-tiny [ ] | A structural crack detection and quantification method combined with structured light is proposed | 500 images 640 × 640 pixels | Darknet-53 | Accuracy = 94% Precision = 98% |
YOLOv4 [ ] | Lightweight networks (DenseNet, MobileNet, and GhostNet) were used in place of the original backbone feature extraction network | 800 images 416 × 416 pixels | DenseNet, MobileNet v1, MobileNet v2, MobileNet v3, and GhostNet | Precision = 93.96% Recall = 90.12% F1 score = 92%
YOLOv4 [ ] | - | 1463 images 256 × 256 pixels | Darknet-53 | Accuracy = 92% mAP = 92% |
Datasets Name | Number of Images | Image Resolution | Manual Annotation | Scope of Applicability | Limitations |
---|---|---|---|---|---|
CrackTree200 [ ] | 206 images | 800 × 600 pixels | Pixel-level annotations for cracks | Crack classification and segmentation | With only 206 images, the dataset's small size can hinder the model's ability to generalize across diverse conditions, potentially leading to overfitting on the specific examples provided
Crack500 [ ] | 500 images | 2000 × 1500 pixels | Pixel-level annotations for cracks | Crack classification and segmentation | Limited number of images compared to larger datasets, which might affect the generalization of models trained on this dataset |
SDNET 2018 [ ] | 56000 images | 256 × 256 pixels | Image-level labels (cracked/non-cracked sub-images) | Crack classification | The dataset's focus on concrete surfaces may limit the model's performance when applied to different types of surfaces or structures
Mendeley data—crack detection [ ] | 40000 images | 227 × 227 pixels | Image-level labels (crack/no crack) | Crack classification | The dataset might not cover all types of cracks or surface conditions, which can limit its applicability to a wide range of real-world scenarios
DeepCrack [ ] | 2500 images | 512 × 512 pixels | Annotations for cracks | Crack segmentation | The resolution might limit the ability of models to capture very small or subtle crack features |
CFD [ ] | 118 images | 320 × 480 pixels | Pixel-level annotations for cracks | Crack segmentation | The dataset contains a limited number of data samples, which may limit the generalization ability of the model |
CrackTree260 [ ] | 260 images | 800 × 600 pixels and 960 × 720 pixels | Pixel-level labeling, bounding boxes, or other crack markers | Object detection and segmentation | Because the dataset is small, a model can easily overfit the training data, especially a complex one
CrackLS315 [ ] | 315 images | 512 × 512 pixels | Pixel-level segmentation mask or bounding box | Object detection and segmentation | The small size of the dataset may make the model perform poorly in complex scenarios, especially when encountering different types of cracks or uncommon crack features |
Stone331 [ ] | 331 images | 512 × 512 pixels | Pixel-level segmentation mask or bounding box | Object detection and segmentation | The relatively small number of images limits the generalization ability of the model, especially in deep learning tasks where smaller datasets tend to lead to overfitting |
Index | Index Value and Calculation Formula | Curve |
---|---|---|
True positive (TP) | Crack samples correctly predicted as cracks | -
False positive (FP) | Non-crack samples incorrectly predicted as cracks | -
True negative (TN) | Non-crack samples correctly predicted as non-cracks | -
False negative (FN) | Crack samples incorrectly predicted as non-cracks | -
Precision | TP/(TP + FP) | PRC
Recall | TP/(TP + FN) | PRC, ROC curve
F1 score | 2 × Precision × Recall/(Precision + Recall) | F1 score curve
Accuracy | (TP + TN)/(TP + TN + FP + FN) | Accuracy vs. threshold curve
Average precision | Area under the precision-recall curve | PRC
Mean average precision | Mean of AP over all classes | -
IoU | TP/(TP + FP + FN), i.e., overlap/union of prediction and ground truth | IoU distribution curve, precision-recall curve with IoU thresholds
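For binary crack masks, the pixel-level indices above reduce to counts over a confusion matrix. A minimal NumPy sketch (assumes both classes occur in the masks, so no denominator is zero):

```python
import numpy as np

def crack_metrics(pred, gt):
    """Pixel-level metrics for binary crack masks (1 = crack, 0 = background)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.sum(pred & gt)     # crack pixels correctly detected
    fp = np.sum(pred & ~gt)    # background pixels flagged as crack
    tn = np.sum(~pred & ~gt)   # background pixels correctly rejected
    fn = np.sum(~pred & gt)    # crack pixels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    iou = tp / (tp + fp + fn)  # intersection over union of the crack class
    return dict(precision=precision, recall=recall, f1=f1,
                accuracy=accuracy, iou=iou)

# Toy 2 x 3 masks: precision = recall = f1 = accuracy = 2/3, iou = 0.5
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
m = crack_metrics(pred, gt)
```

AP and mAP additionally require ranking predictions by confidence and sweeping a threshold, so they are usually computed with a library evaluator rather than a single formula.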
Share and Cite
Yuan, Q.; Shi, Y.; Li, M. A Review of Computer Vision-Based Crack Detection Methods in Civil Infrastructure: Progress and Challenges. Remote Sens. 2024, 16, 2910. https://doi.org/10.3390/rs16162910
Based on recent research and technology development trends in the field of crack detection in civil engineering infrastructure, this paper proposes a comprehensive classification framework that classifies crack detection methods into three categories: combination of traditional methods and deep learning, multimodal data fusion, and semantic ...