

Model summaries (rwightman / pytorch-image-models):

MODEL          CROP PCT   IMAGE SIZE   INTERPOLATION
resnet18       0.875      224          bilinear
resnet26       0.875      224          bicubic
resnet34       0.875      224          bilinear
resnet50       0.875      224          bicubic
resnetblur50   0.875      224          bicubic
tv_resnet34    0.875      224          bilinear
tv_resnet50    0.875      224          bilinear
tv_resnet101   0.875      224          bilinear
tv_resnet152   0.875      224          bilinear

Architecture components listed for these models: 1x1 Convolution, Batch Normalization, Residual Connection, Bottleneck Residual Block, Global Average Pooling, Residual Block, Convolution, Max Pooling. resnetblur50 additionally uses Blur Pooling.

The tv_ variants (torchvision reference weights) list the training techniques SGD with Momentum and Weight Decay, with LR 0.1, LR Gamma 0.1, LR Step Size 30, Momentum 0.9, Weight Decay 0.0001, Batch Size 32, and 90 Epochs.

Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. Residual blocks are stacked on top of each other to form the network: e.g. a ResNet-50 has fifty layers built from these blocks.

How do I load this model?

To load a pretrained model:

Replace the model name with the variant you want to use, e.g. resnet18 . You can find the IDs in the model summaries at the top of this page.

How do I train this model?

You can follow the timm recipe scripts for training a new model afresh.

Image Classification on ImageNet


MODEL TOP 1 ACCURACY TOP 5 ACCURACY
resnetblur50 79.29% 94.64%
resnet50 79.04% 94.39%
tv_resnet152 78.32% 94.05%
tv_resnet101 77.37% 93.56%
tv_resnet50 76.16% 92.88%
resnet26 75.29% 92.57%
resnet34 75.11% 92.28%
tv_resnet34 73.3% 91.42%
resnet18 69.74% 89.09%
ResNet 50, by Brett Koonce (first online: 05 January 2021; 4940 accesses, 51 citations)
ResNet 50 is a crucial network for you to understand. It is the basis of much academic research in this field. Many different papers will compare their results to a ResNet 50 baseline, and it is valuable as a reference point. We can also easily download the weights for ResNet 50 networks that have been trained on the ImageNet dataset and modify the last layers (called **retraining** or **transfer learning**) to quickly produce models that tackle new problems. For most problems, this is the best approach to get started with, rather than trying to invent new networks or techniques. Building a custom dataset and scaling it up with data augmentation techniques will get you a lot further than trying to build a new architecture.


Author information

Brett Koonce, Jefferson, MO, USA

© 2021 Brett Koonce

About this chapter

Koonce, B. (2021). ResNet 50. In: Convolutional Neural Networks with Swift for Tensorflow. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-6168-2_6

Published: 05 January 2021
Print ISBN: 978-1-4842-6167-5
Online ISBN: 978-1-4842-6168-2


Understanding ResNet50 architecture


ResNet50 is a variant of the ResNet model which has 48 convolution layers along with 1 max pool and 1 average pool layer. It requires 3.8 × 10^9 floating point operations. It is a widely used ResNet model, and we explore the ResNet50 architecture in depth here.

We start with some background information and comparisons with other models, then dive directly into the ResNet50 architecture.

Introduction

In 2012, AlexNet won first place at the ILSVRC2012 classification contest. After that, ResNet was the most interesting thing to happen to the computer vision and deep learning world.

The framework that ResNets introduced made it possible to train ultra-deep neural networks. By that I mean networks that contain hundreds or thousands of layers and still achieve great performance.

ResNets were initially applied to the image recognition task, but as the paper mentions, the framework can also be used for non-computer-vision tasks to achieve better accuracy.

Many of you may argue that simply stacking more layers also gives better accuracy, so why was there a need for residual learning to train ultra-deep neural networks?

We know that deep convolutional neural networks are great at identifying low-, mid-, and high-level features from images, and stacking more layers generally gives better accuracy. So a question arises: is getting better model performance as easy as stacking more layers?

With this question arises the problem of vanishing/exploding gradients. Those problems were largely handled in many ways, enabling networks with tens of layers to converge. But when deep neural networks start to converge, we see another problem: accuracy gets saturated and then degrades rapidly. This is not caused by overfitting, as one might guess, and adding more layers to a suitably deep model just increases the training error.

This degradation problem was demonstrated by taking a shallower model and a deeper model constructed from the shallower one by adding identity layers. The deeper model should not produce any higher training error than its counterpart, since the added layers are just identity layers, yet in practice it does.

Fig 1

In Figure 1 we can see, on both the left and the right, that the deeper model produces more error, when in fact it should not.

The authors addressed this problem by introducing a deep residual learning framework: they add shortcut connections that simply perform identity mappings.

Fig 2

Denote the desired underlying mapping as H(x). The authors explicitly let the stacked non-linear layers fit the residual mapping F(x) := H(x) − x, so the original mapping becomes H(x) = F(x) + x, as can be seen in Figure 2.

The benefit of these shortcut identity mappings is that no additional parameters are added to the model, and the computational time is kept in check.
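The identity shortcut is easy to express in code; a minimal PyTorch sketch (illustrative only, with an arbitrary channel count rather than the paper's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two conv layers fitting F(x), plus the parameter-free shortcut x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # first layer of F(x)
        out = self.bn2(self.conv2(out))        # second layer of F(x)
        return F.relu(out + x)                 # H(x) = F(x) + x

y = ResidualBlock(64)(torch.randn(1, 64, 56, 56))  # shape is preserved
```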

Fig 3

To demonstrate how much better ResNets are, the authors compared 34-layer and 18-layer models, each in both plain and residual versions. The 18-layer plain net outperformed the 34-layer plain net, while the 34-layer ResNet outperformed the 18-layer ResNet, as can be seen in Figure 3.

ResNet50 Architecture

Table 1

Now we are going to discuss ResNet 50. Table 1 also gives the architecture of the 18- and 34-layer ResNets discussed above (their residual mappings are not shown for simplicity).

A small change was made for ResNet 50 and deeper variants: where the shortcut connections previously skipped two layers, they now skip three, and 1 × 1 convolution layers were added. We will see this in detail in the ResNet 50 architecture.


As we can see in Table 1, the ResNet 50 architecture contains the following elements:

  • A convolution with a 7 × 7 kernel size and 64 kernels, all with a stride of 2, giving us 1 layer.
  • Next, max pooling, also with a stride of 2.
  • In the next stage there is a 1 × 1,64 kernel, followed by a 3 × 3,64 kernel, and finally a 1 × 1,256 kernel. These three layers are repeated 3 times in total, giving us 9 layers in this step.
  • Next we see a 1 × 1,128 kernel, after that a 3 × 3,128 kernel, and finally a 1 × 1,512 kernel. This step is repeated 4 times, giving us 12 layers.
  • After that there is a 1 × 1,256 kernel, with two more kernels of 3 × 3,256 and 1 × 1,1024, repeated 6 times, giving us a total of 18 layers.
  • And then again a 1 × 1,512 kernel with two more of 3 × 3,512 and 1 × 1,2048, repeated 3 times, giving us a total of 9 layers.
  • Finally, we do an average pool and end with a fully connected layer containing 1000 nodes and, at the end, a softmax function, giving us 1 layer.
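The three-layer pattern in those middle stages can be sketched as a bottleneck unit; this is an illustrative PyTorch sketch of the first stage's 1 × 1,64 / 3 × 3,64 / 1 × 1,256 pattern, with a 1 × 1 projection on the shortcut so the 64-channel input can be added to the 256-channel output:

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False), nn.BatchNorm2d(mid_ch), nn.ReLU(),  # 1x1 reduce
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1, bias=False), nn.BatchNorm2d(mid_ch), nn.ReLU(),  # 3x3
            nn.Conv2d(mid_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch),  # 1x1 expand
        )
        self.proj = nn.Conv2d(in_ch, out_ch, 1, bias=False)  # match shortcut channels
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.body(x) + self.proj(x))

y = Bottleneck(64, 64, 256)(torch.randn(1, 64, 56, 56))  # output has 256 channels
```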

We don't count the activation functions or the max/average pooling layers.

Totaling this gives us 1 + 9 + 12 + 18 + 9 + 1 = 50 layers: a deep convolutional network.
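The stage-by-stage count above can be checked in a few lines:

```python
# Layers per step, as described above: the 7x7 stem conv, the four bottleneck
# stages (3, 4, 6, and 3 repeats of 3 conv layers each), and the final FC layer.
stages = [1, 3 * 3, 4 * 3, 6 * 3, 3 * 3, 1]
print(sum(stages))  # 50
```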

The results were pretty good on the ImageNet validation set: the ResNet 50 model achieved a top-1 error rate of 20.47 percent and a top-5 error rate of 5.25 percent. This is reported for a single 50-layer model, not an ensemble. The table below lets you compare it with other ResNets and with other models.

Result table

  • This architecture can be used for computer vision tasks such as image classification, object localization, and object detection.
  • The framework can also be applied to non-computer-vision tasks to give them the benefit of depth and to reduce computational expense.
  • Research paper for Deep residual learning.
  • VGG-19 by Aakash Kaushik (opengenus).
  • Floating point operations per second (FLOPS) of Machine Learning models.
  • Convolutional Neural Network by Piyush Mishra and Junaid N Z (OpenGenus)



A review of computer vision-based crack detection methods in civil infrastructure: progress and challenges.


1. Introduction
2. Crack Detection Combining Traditional Image Processing Methods and Deep Learning
  2.1. Crack Detection Based on Image Edge Detection and Deep Learning
  2.2. Crack Detection Based on Threshold Segmentation and Deep Learning
  2.3. Crack Detection Based on Morphological Operations and Deep Learning
3. Crack Detection Based on Multimodal Data Fusion
  3.1. Multi-Sensor Fusion
  3.2. Multi-Source Data Fusion
4. Crack Detection Based on Image Semantic Understanding
  4.1. Crack Detection Based on Classification Networks
  4.2. Crack Detection Based on Object Detection Networks
  4.3. Crack Detection Based on Segmentation Networks

Summary of segmentation-based crack detection models (improvement/innovation; backbone/feature extraction architecture; efficiency; results):

  • FCS-Net [ ]: integrates ResNet-50, ASPP, and BN. Backbone: ResNet-50. Results: MIoU = 74.08%.
  • FCN-SFW [ ]: combines a fully convolutional network (FCN) with structural forests with wavelet transform (SFW) for detecting tiny cracks. Backbone: FCN. Computing time = 1.5826 s. Results: Precision = 64.1%, Recall = 87.22%, F1 score = 68.28%.
  • AFFNet [ ]: uses ResNet101 as the backbone and incorporates two attention mechanism modules, VH-CAM and ECAUM. Backbone: ResNet101. Execution time = 52 ms. Results: MIoU = 84.49%, FWIoU = 97.07%, PA = 98.36%, MPA = 92.01%.
  • DeepLabv3+ [ ]: replaces ordinary convolution with separable convolution; improved SE_ASSP module. Backbone: Xception-65. Results: AP = 97.63%, MAP = 95.58%, MIoU = 81.87%.
  • U-Net [ ]: parameters optimized (network depth, choice of activation functions, selection of loss functions, and data augmentation). Backbone: encoder and decoder. Analysis speed (1024 × 1024 pixels) = 0.022 s. Results: Precision = 84.6%, Recall = 72.5%, F1 score = 78.1%, IoU = 64%.
  • KTCAM-Net [ ]: combines CAM and RCM; integrates a classification network and a segmentation network. Backbone: DeepLabv3. FPS = 28. Results: Accuracy = 97.26%, Precision = 68.9%, Recall = 83.7%, F1 score = 75.4%, MIoU = 74.3%.
  • ADDU-Net [ ]: features asymmetric dual decoders and dual attention mechanisms. Backbone: encoder and decoder. FPS = 35. Results: Precision = 68.9%, Recall = 83.7%, F1 score = 75.4%, MIoU = 74.3%.
  • CGTr-Net [ ]: optimized CG-Trans, TCFF, and hybrid loss functions. Backbone: CG-Trans. Results: Precision = 88.8%, Recall = 88.3%, F1 score = 88.6%, MIoU = 89.4%.
  • PCSN [ ]: uses Adadelta as the optimizer and categorical cross-entropy as the loss function. Backbone: SegNet. Inference time = 0.12 s. Results: mAP = 83%, Accuracy = 90%, Recall = 50%.
  • DEHF-Net [ ]: introduces a dual-branch encoder unit, feature fusion scheme, edge refinement module, and multi-scale feature fusion module. Backbone: dual-branch encoder unit. Results: Precision = 86.3%, Recall = 92.4%, Dice score = 78.7%, mIoU = 81.6%.
  • Student model + teacher model [ ]: a semi-supervised semantic segmentation network. Backbone: EfficientUNet. Results: Precision = 84.98%, Recall = 84.38%, F1 score = 83.15%.

5. Datasets
6. Evaluation Index
7. Discussion
8. Conclusions
Author Contributions, Data Availability Statement, Acknowledgments, Conflicts of Interest

Aspect-by-aspect comparison of the two approaches (T+DL: combining traditional image processing methods and deep learning; MDF: multimodal data fusion):

  • Processing speed. T+DL: moderate; traditional methods are usually fast, but deep learning models may be slower, and the overall speed depends on the complexity of the deep learning model. MDF: slower; data fusion and processing can be slow, especially with large-scale multimodal data, involving significant computational and data transfer overhead.
  • Accuracy. T+DL: high; combines the interpretability of traditional methods with the complex pattern handling of deep learning, generally resulting in high detection accuracy. MDF: typically higher; combining different data sources (e.g., images, text, audio) provides comprehensive information, improving overall detection accuracy.
  • Robustness. T+DL: strong; traditional methods provide background knowledge, enhancing robustness, but deep learning's risk of overfitting may reduce it. MDF: very strong; fusion of multiple data sources enhances the model's adaptability to different environments and conditions, better handling noise and anomalies.
  • Complexity. T+DL: high; integrating traditional methods and deep learning involves complex design and balancing, with challenges in tuning and interpreting deep learning models. MDF: high; involves complex data preprocessing, alignment, and fusion, handling inconsistencies and complexities from multiple data sources.
  • Adaptability. T+DL: strong; can adapt to different types of cracks and background variations, with deep learning models learning features from data, though substantial labeled data is required. MDF: very strong; combines diverse data sources, adapting well to various environments and conditions, and handling complex backgrounds and variations effectively.
  • Interpretability. T+DL: higher; traditional methods provide clear explanations, while deep learning models often lack interpretability; combining them can improve overall interpretability. MDF: lower; fusion models generally have lower interpretability, making it difficult to intuitively explain how different data sources influence the final results.
  • Data requirements. T+DL: high; deep learning models require a lot of labeled data, while traditional methods are more lenient. MDF: very high; requires large amounts of data from various modalities, which must be processed and aligned effectively for successful fusion.
  • Flexibility. T+DL: moderate; handles various types of cracks, but may be limited in very complex scenarios. MDF: high; handles multiple data sources and different crack information, improving performance in diverse conditions through multimodal fusion.
  • Real-time capability. T+DL: poor; deep learning models are often slow to train and infer, making them less suitable for real-time detection, though combining with traditional methods can help. MDF: poor; multimodal data fusion processing is generally slow, making it less suitable for real-time applications.
  • Maintenance cost. T+DL: moderate to high; deep learning models require regular updates and maintenance, while traditional methods have lower maintenance costs. MDF: high; involves ongoing maintenance and updates for multiple data sources, with complex data preprocessing and fusion processes.
  • Noise handling. T+DL: good; traditional methods effectively handle noise under certain conditions, and deep learning models can mitigate noise effects through training. MDF: strong; multimodal fusion can complement information from different sources, improving robustness to noise and enhancing detection accuracy.
  • Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-driven structural health monitoring and damage detection through deep learning: State-of-the-art review. Sensors 2020, 20, 2778.
  • Han, X.; Zhao, Z. Structural surface crack detection method based on computer vision technology. J. Build. Struct. 2018, 39, 418–427.
  • Kruachottikul, P.; Cooharojananone, N.; Phanomchoeng, G.; Chavarnakul, T.; Kovitanggoon, K.; Trakulwaranont, D. Deep learning-based visual defect-inspection system for reinforced concrete bridge substructure: A case of Thailand's department of highways. J. Civ. Struct. Health Monit. 2021, 11, 949–965.
  • Gehri, N.; Mata-Falcón, J.; Kaufmann, W. Automated crack detection and measurement based on digital image correlation. Constr. Build. Mater. 2020, 256, 119383.
  • Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018, 57, 787–798.
  • Liu, Y.; Fan, J.; Nie, J.; Kong, S.; Qi, Y. Review and prospect of digital-image-based crack detection of structure surface. China Civ. Eng. J. 2021, 54, 79–98.
  • Hsieh, Y.-A.; Tsai, Y.J. Machine learning for crack detection: Review and model performance comparison. J. Comput. Civ. Eng. 2020, 34, 04020038.
  • Xu, Y.; Bao, Y.; Chen, J.; Zuo, W.; Li, H. Surface fatigue crack identification in steel box girder of bridges by a deep fusion convolutional neural network based on consumer-grade camera images. Struct. Health Monit. 2019, 18, 653–674.
  • Wang, W.; Deng, L.; Shao, X. Fatigue design of steel bridges considering the effect of dynamic vehicle loading and overloaded trucks. J. Bridge Eng. 2016, 21, 04016048.
  • Zheng, K.; Zhou, S.; Zhang, Y.; Wei, Y.; Wang, J.; Wang, Y.; Qin, X. Simplified evaluation of shear stiffness degradation of diagonally cracked reinforced concrete beams. Materials 2023, 16, 4752.
  • Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI-8, 679–698.
  • Otsu, N. A threshold selection method from gray-level histograms. Automatica 1975, 11, 23–27.
  • Sohn, H.G.; Lim, Y.M.; Yun, K.H.; Kim, G.H. Monitoring crack changes in concrete structures. Comput.-Aided Civ. Infrastruct. Eng. 2005, 20, 52–61.
  • Wang, P.; Qiao, H.; Feng, Q.; Xue, C. Internal corrosion cracks evolution in reinforced magnesium oxychloride cement concrete. Adv. Cem. Res. 2023, 36, 15–30.
  • Loutridis, S.; Douka, E.; Trochidis, A. Crack identification in double-cracked beams using wavelet analysis. J. Sound Vib. 2004, 277, 1025–1039.
  • Fan, C.L. Detection of multidamage to reinforced concrete using support vector machine-based clustering from digital images. Struct. Control Health Monit. 2021, 28, e2841.
  • Kyal, C.; Reza, M.; Varu, B.; Shreya, S. Image-based concrete crack detection using random forest and convolution neural network. In Computational Intelligence in Pattern Recognition: Proceedings of the International Conference on Computational Intelligence in Pattern Recognition (CIPR 2021), Held at the Institute of Engineering and Management, Kolkata, West Bengal, India, on 24–25 April 2021; Springer: Singapore, 2022; pp. 471–481.
  • Jia, H.; Lin, J.; Liu, J. Bridge seismic damage assessment model applying artificial neural networks and the random forest algorithm. Adv. Civ. Eng. 2020, 2020, 6548682.
  • Park, M.J.; Kim, J.; Jeong, S.; Jang, A.; Bae, J.; Ju, Y.K. Machine learning-based concrete crack depth prediction using thermal images taken under daylight conditions. Remote Sens. 2022, 14, 2151.
  • LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  • Liu, Z.; Cao, Y.; Wang, Y.; Wang, W. Computer vision-based concrete crack detection using U-Net fully convolutional networks. Autom. Constr. 2019, 104, 129–139.
  • Li, G.; Ma, B.; He, S.; Ren, X.; Liu, Q. Automatic tunnel crack detection based on U-Net and a convolutional neural network with alternately updated clique. Sensors 2020, 20, 717.
  • Chaiyasarn, K.; Buatik, A.; Mohamad, H.; Zhou, M.; Kongsilp, S.; Poovarodom, N. Integrated pixel-level CNN-FCN crack detection via photogrammetric 3D texture mapping of concrete structures. Autom. Constr. 2022, 140, 104388.
  • Li, S.; Zhao, X.; Zhou, G. Automatic pixel-level multiple damage detection of concrete structure using fully convolutional network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 616–634.
  • Zheng, X.; Zhang, S.; Li, X.; Li, G.; Li, X. Lightweight bridge crack detection method based on SegNet and bottleneck depth-separable convolution with residuals. IEEE Access 2021, 9, 161649–161668.
  • Azouz, Z.; Honarvar Shakibaei Asli, B.; Khan, M. Evolution of crack analysis in structures using image processing technique: A review. Electronics 2023, 12, 3862.
  • Hamishebahar, Y.; Guan, H.; So, S.; Jo, J. A comprehensive review of deep learning-based crack detection approaches. Appl. Sci. 2022, 12, 1374.
  • Meng, S.; Gao, Z.; Zhou, Y.; He, B.; Djerrad, A. Real-time automatic crack detection method based on drone. Comput.-Aided Civ. Infrastruct. Eng. 2023, 38, 849–872.
  • Humpe, A. Bridge inspection with an off-the-shelf 360 camera drone. Drones 2020, 4, 67.
  • Truong-Hong, L.; Lindenbergh, R. Automatically extracting surfaces of reinforced concrete bridges from terrestrial laser scanning point clouds. Autom. Constr. 2022, 135, 104127.
  • Cusson, D.; Rossi, C.; Ozkan, I.F. Early warning system for the detection of unexpected bridge displacements from radar satellite data. J. Civ. Struct. Health Monit. 2021, 11, 189–204.
  • Bonaldo, G.; Caprino, A.; Lorenzoni, F.; da Porto, F. Monitoring displacements and damage detection through satellite MT-InSAR techniques: A new methodology and application to a case study in Rome (Italy). Remote Sens. 2023, 15, 1177.
  • Zheng, Z.; Zhong, Y.; Wang, J.; Ma, A.; Zhang, L. Building damage assessment for rapid disaster response with a deep object-based semantic change detection framework: From natural disasters to man-made disasters. Remote Sens. Environ. 2021, 265, 112636.
  • Chen, X.; Zhang, X.; Ren, M.; Zhou, B.; Sun, M.; Feng, Z.; Chen, B.; Zhi, X. A multiscale enhanced pavement crack segmentation network coupling spectral and spatial information of UAV hyperspectral imagery. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103772.
  • Liu, F.; Liu, J.; Wang, L. Deep learning and infrared thermography for asphalt pavement crack severity classification. Autom. Constr. 2022, 140, 104383.
  • Liu, S.; Han, Y.; Xu, L. Recognition of road cracks based on multi-scale Retinex fused with wavelet transform. Array 2022, 15, 100193.
  • Zhang, H.; Qian, Z.; Tan, Y.; Xie, Y.; Li, M. Investigation of pavement crack detection based on deep learning method using weakly supervised instance segmentation framework. Constr. Build. Mater. 2022, 358, 129117.
  • Dorafshan, S.; Thomas, R.J.; Maguire, M. Comparison of deep convolutional neural networks and edge detectors for image-based crack detection in concrete. Constr. Build. Mater. 2018, 186, 1031–1045.
  • Munawar, H.S.; Hammad, A.W.; Haddad, A.; Soares, C.A.P.; Waller, S.T. Image-based crack detection methods: A review. Infrastructures 2021, 6, 115.
  • Chen, D.; Li, X.; Hu, F.; Mathiopoulos, P.T.; Di, S.; Sui, M.; Peethambaran, J. EDPNet: An encoding–decoding network with pyramidal representation for semantic image segmentation. Sensors 2023, 23, 3205.
  • Mo, S.; Shi, Y.; Yuan, Q.; Li, M. A survey of deep learning road extraction algorithms using high-resolution remote sensing images. Sensors 2024, 24, 1708.
  • Chen, D.; Li, J.; Di, S.; Peethambaran, J.; Xiang, G.; Wan, L.; Li, X. Critical points extraction from building façades by analyzing gradient structure tensor. Remote Sens. 2021 , 13 , 3146. [ Google Scholar ] [ CrossRef ]
  • Liu, Y.; Yeoh, J.K.; Chua, D.K. Deep learning-based enhancement of motion blurred UAV concrete crack images. J. Comput. Civ. Eng. 2020 , 34 , 04020028. [ Google Scholar ] [ CrossRef ]
  • Flah, M.; Nunez, I.; Ben Chaabene, W.; Nehdi, M.L. Machine learning algorithms in civil structural health monitoring: A systematic review. Arch. Comput. Methods Eng. 2021 , 28 , 2621–2643. [ Google Scholar ] [ CrossRef ]
  • Li, G.; Li, X.; Zhou, J.; Liu, D.; Ren, W. Pixel-level bridge crack detection using a deep fusion about recurrent residual convolution and context encoder network. Measurement 2021 , 176 , 109171. [ Google Scholar ] [ CrossRef ]
  • Ali, R.; Chuah, J.H.; Talip, M.S.A.; Mokhtar, N.; Shoaib, M.A. Structural crack detection using deep convolutional neural networks. Autom. Constr. 2022 , 133 , 103989. [ Google Scholar ] [ CrossRef ]
  • Wang, H.; Li, Y.; Dang, L.M.; Lee, S.; Moon, H. Pixel-level tunnel crack segmentation using a weakly supervised annotation approach. Comput. Ind. 2021 , 133 , 103545. [ Google Scholar ] [ CrossRef ]
  • Zhu, J.; Song, J. Weakly supervised network based intelligent identification of cracks in asphalt concrete bridge deck. Alex. Eng. J. 2020 , 59 , 1307–1317. [ Google Scholar ] [ CrossRef ]
  • Li, Y.; Bao, T.; Xu, B.; Shu, X.; Zhou, Y.; Du, Y.; Wang, R.; Zhang, K. A deep residual neural network framework with transfer learning for concrete dams patch-level crack classification and weakly-supervised localization. Measurement 2022 , 188 , 110641. [ Google Scholar ] [ CrossRef ]
  • Yang, Q.; Shi, W.; Chen, J.; Lin, W. Deep convolution neural network-based transfer learning method for civil infrastructure crack detection. Autom. Constr. 2020 , 116 , 103199. [ Google Scholar ] [ CrossRef ]
  • Dais, D.; Bal, I.E.; Smyrou, E.; Sarhosis, V. Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 2021 , 125 , 103606. [ Google Scholar ] [ CrossRef ]
  • Abdellatif, M.; Peel, H.; Cohn, A.G.; Fuentes, R. Combining block-based and pixel-based approaches to improve crack detection and localisation. Autom. Constr. 2021 , 122 , 103492. [ Google Scholar ] [ CrossRef ]
  • Dan, D.; Dan, Q. Automatic recognition of surface cracks in bridges based on 2D-APES and mobile machine vision. Measurement 2021 , 168 , 108429. [ Google Scholar ] [ CrossRef ]
  • Weng, X.; Huang, Y.; Wang, W. Segment-based pavement crack quantification. Autom. Constr. 2019 , 105 , 102819. [ Google Scholar ] [ CrossRef ]
  • Kao, S.-P.; Chang, Y.-C.; Wang, F.-L. Combining the YOLOv4 deep learning model with UAV imagery processing technology in the extraction and quantization of cracks in bridges. Sensors 2023 , 23 , 2572. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Li, X.; Xu, X.; He, X.; Wei, X.; Yang, H. Intelligent crack detection method based on GM-ResNet. Sensors 2023 , 23 , 8369. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Choi, Y.; Park, H.W.; Mi, Y.; Song, S. Crack detection and analysis of concrete structures based on neural network and clustering. Sensors 2024 , 24 , 1725. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Guo, J.-M.; Markoni, H.; Lee, J.-D. BARNet: Boundary aware refinement network for crack detection. IEEE Trans. Intell. Transp. Syst. 2021 , 23 , 7343–7358. [ Google Scholar ] [ CrossRef ]
  • Luo, J.; Lin, H.; Wei, X.; Wang, Y. Adaptive canny and semantic segmentation networks based on feature fusion for road crack detection. IEEE Access 2023 , 11 , 51740–51753. [ Google Scholar ] [ CrossRef ]
  • Ranyal, E.; Sadhu, A.; Jain, K. Enhancing pavement health assessment: An attention-based approach for accurate crack detection, measurement, and mapping. Expert Syst. Appl. 2024 , 247 , 123314. [ Google Scholar ] [ CrossRef ]
  • Liu, K.; Chen, B.M. Industrial UAV-based unsupervised domain adaptive crack recognitions: From database towards real-site infrastructural inspections. IEEE Trans. Ind. Electron. 2022 , 70 , 9410–9420. [ Google Scholar ] [ CrossRef ]
  • Wang, W.; Hu, W.; Wang, W.; Xu, X.; Wang, M.; Shi, Y.; Qiu, S.; Tutumluer, E. Automated crack severity level detection and classification for ballastless track slab using deep convolutional neural network. Autom. Constr. 2021 , 124 , 103484. [ Google Scholar ] [ CrossRef ]
  • Xu, Z.; Zhang, X.; Chen, W.; Liu, J.; Xu, T.; Wang, Z. Muraldiff: Diffusion for ancient murals restoration on large-scale pre-training. IEEE Trans. Emerg. Top. Comput. Intell. 2024 , 8 , 2169–2181. [ Google Scholar ] [ CrossRef ]
  • Bradley, D.; Roth, G. Adaptive thresholding using the integral image. J. Graph. Tools 2007 , 12 , 13–21. [ Google Scholar ] [ CrossRef ]
  • Sezgin, M.; Sankur, B. Survey over image thresholding techniques and quantitative performance evaluation. J. Electron. Imaging 2004 , 13 , 146–168. [ Google Scholar ]
  • Kapur, J.N.; Sahoo, P.K.; Wong, A.K. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 1985 , 29 , 273–285. [ Google Scholar ] [ CrossRef ]
  • Pal, N.R.; Pal, S.K. A review on image segmentation techniques. Pattern Recognit. 1993 , 26 , 1277–1294. [ Google Scholar ] [ CrossRef ]
  • Flah, M.; Suleiman, A.R.; Nehdi, M.L. Classification and quantification of cracks in concrete structures using deep learning image-based techniques. Cem. Concr. Compos. 2020 , 114 , 103781. [ Google Scholar ] [ CrossRef ]
  • Mazni, M.; Husain, A.R.; Shapiai, M.I.; Ibrahim, I.S.; Anggara, D.W.; Zulkifli, R. An investigation into real-time surface crack classification and measurement for structural health monitoring using transfer learning convolutional neural networks and otsu method. Alex. Eng. J. 2024 , 92 , 310–320. [ Google Scholar ] [ CrossRef ]
  • He, Z.; Xu, W. Deep learning and image preprocessing-based crack repair trace and secondary crack classification detection method for concrete bridges. Struct. Infrastruct. Eng. 2024 , 20 , 1–17. [ Google Scholar ] [ CrossRef ]
  • He, T.; Li, H.; Qian, Z.; Niu, C.; Huang, R. Research on weakly supervised pavement crack segmentation based on defect location by generative adversarial network and target re-optimization. Constr. Build. Mater. 2024 , 411 , 134668. [ Google Scholar ] [ CrossRef ]
  • Su, H.; Wang, X.; Han, T.; Wang, Z.; Zhao, Z.; Zhang, P. Research on a U-Net bridge crack identification and feature-calculation methods based on a CBAM attention mechanism. Buildings 2022 , 12 , 1561. [ Google Scholar ] [ CrossRef ]
  • Kang, D.; Benipal, S.S.; Gopal, D.L.; Cha, Y.-J. Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning. Autom. Constr. 2020 , 118 , 103291. [ Google Scholar ] [ CrossRef ]
  • Lei, Q.; Zhong, J.; Wang, C. Joint optimization of crack segmentation with an adaptive dynamic threshold module. IEEE Trans. Intell. Transp. Syst. 2024 , 25 , 6902–6916. [ Google Scholar ] [ CrossRef ]
  • Lei, Q.; Zhong, J.; Wang, C.; Xia, Y.; Zhou, Y. Dynamic thresholding for accurate crack segmentation using multi-objective optimization. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Turin, Italy, 18 September 2023; Springer: Cham, Switzerland, 2023; pp. 389–404. [ Google Scholar ]
  • Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991 , 13 , 583–598. [ Google Scholar ] [ CrossRef ]
  • Huang, H.; Zhao, S.; Zhang, D.; Chen, J. Deep learning-based instance segmentation of cracks from shield tunnel lining images. Struct. Infrastruct. Eng. 2022 , 18 , 183–196. [ Google Scholar ] [ CrossRef ]
  • Fan, Z.; Lin, H.; Li, C.; Su, J.; Bruno, S.; Loprencipe, G. Use of parallel resnet for high-performance pavement crack detection and measurement. Sustainability 2022 , 14 , 1825. [ Google Scholar ] [ CrossRef ]
  • Kong, S.Y.; Fan, J.S.; Liu, Y.F.; Wei, X.C.; Ma, X.W. Automated crack assessment and quantitative growth monitoring. Comput.-Aided Civ. Infrastruct. Eng. 2021 , 36 , 656–674. [ Google Scholar ] [ CrossRef ]
  • Dang, L.M.; Wang, H.; Li, Y.; Park, Y.; Oh, C.; Nguyen, T.N.; Moon, H. Automatic tunnel lining crack evaluation and measurement using deep learning. Tunn. Undergr. Space Technol. 2022 , 124 , 104472. [ Google Scholar ] [ CrossRef ]
  • Andrushia, A.D.; Anand, N.; Lubloy, E. Deep learning based thermal crack detection on structural concrete exposed to elevated temperature. Adv. Struct. Eng. 2021 , 24 , 1896–1909. [ Google Scholar ] [ CrossRef ]
  • Dang, L.M.; Wang, H.; Li, Y.; Nguyen, L.Q.; Nguyen, T.N.; Song, H.-K.; Moon, H. Deep learning-based masonry crack segmentation and real-life crack length measurement. Constr. Build. Mater. 2022 , 359 , 129438. [ Google Scholar ] [ CrossRef ]
  • Nguyen, A.; Gharehbaghi, V.; Le, N.T.; Sterling, L.; Chaudhry, U.I.; Crawford, S. ASR crack identification in bridges using deep learning and texture analysis. Structures 2023 , 50 , 494–507. [ Google Scholar ] [ CrossRef ]
  • Dong, C.; Li, L.; Yan, J.; Zhang, Z.; Pan, H.; Catbas, F.N. Pixel-level fatigue crack segmentation in large-scale images of steel structures using an encoder–decoder network. Sensors 2021 , 21 , 4135. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Jian, L.; Chengshun, L.; Guanhong, L.; Zhiyuan, Z.; Bo, H.; Feng, G.; Quanyi, X. Lightweight defect detection equipment for road tunnels. IEEE Sens. J. 2023 , 24 , 5107–5121. [ Google Scholar ]
  • Liang, H.; Qiu, D.; Ding, K.-L.; Zhang, Y.; Wang, Y.; Wang, X.; Liu, T.; Wan, S. Automatic pavement crack detection in multisource fusion images using similarity and difference features. IEEE Sens. J. 2023 , 24 , 5449–5465. [ Google Scholar ] [ CrossRef ]
  • Alamdari, A.G.; Ebrahimkhanlou, A. A multi-scale robotic approach for precise crack measurement in concrete structures. Autom. Constr. 2024 , 158 , 105215. [ Google Scholar ] [ CrossRef ]
  • Liu, H.; Kollosche, M.; Laflamme, S.; Clarke, D.R. Multifunctional soft stretchable strain sensor for complementary optical and electrical sensing of fatigue cracks. Smart Mater. Struct. 2023 , 32 , 045010. [ Google Scholar ] [ CrossRef ]
  • Dang, D.-Z.; Wang, Y.-W.; Ni, Y.-Q. Nonlinear autoregression-based non-destructive evaluation approach for railway tracks using an ultrasonic fiber bragg grating array. Constr. Build. Mater. 2024 , 411 , 134728. [ Google Scholar ] [ CrossRef ]
  • Yan, M.; Tan, X.; Mahjoubi, S.; Bao, Y. Strain transfer effect on measurements with distributed fiber optic sensors. Autom. Constr. 2022 , 139 , 104262. [ Google Scholar ] [ CrossRef ]
  • Shukla, H.; Piratla, K. Leakage detection in water pipelines using supervised classification of acceleration signals. Autom. Constr. 2020 , 117 , 103256. [ Google Scholar ] [ CrossRef ]
  • Chen, X.; Zhang, X.; Li, J.; Ren, M.; Zhou, B. A new method for automated monitoring of road pavement aging conditions based on recurrent neural network. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 24510–24523. [ Google Scholar ] [ CrossRef ]
  • Zhang, S.; He, X.; Xue, B.; Wu, T.; Ren, K.; Zhao, T. Segment-anything embedding for pixel-level road damage extraction using high-resolution satellite images. Int. J. Appl. Earth Obs. Geoinf. 2024 , 131 , 103985. [ Google Scholar ] [ CrossRef ]
  • Park, S.E.; Eem, S.-H.; Jeon, H. Concrete crack detection and quantification using deep learning and structured light. Constr. Build. Mater. 2020 , 252 , 119096. [ Google Scholar ] [ CrossRef ]
  • Yan, Y.; Mao, Z.; Wu, J.; Padir, T.; Hajjar, J.F. Towards automated detection and quantification of concrete cracks using integrated images and lidar data from unmanned aerial vehicles. Struct. Control Health Monit. 2021 , 28 , e2757. [ Google Scholar ] [ CrossRef ]
  • Dong, Q.; Wang, S.; Chen, X.; Jiang, W.; Li, R.; Gu, X. Pavement crack detection based on point cloud data and data fusion. Philos. Trans. R. Soc. A 2023 , 381 , 20220165. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Kim, H.; Lee, S.; Ahn, E.; Shin, M.; Sim, S.-H. Crack identification method for concrete structures considering angle of view using RGB-D camera-based sensor fusion. Struct. Health Monit. 2021 , 20 , 500–512. [ Google Scholar ] [ CrossRef ]
  • Chen, J.; Lu, W.; Lou, J. Automatic concrete defect detection and reconstruction by aligning aerial images onto semantic-rich building information model. Comput.-Aided Civ. Infrastruct. Eng. 2023 , 38 , 1079–1098. [ Google Scholar ] [ CrossRef ]
  • Pozzer, S.; Rezazadeh Azar, E.; Dalla Rosa, F.; Chamberlain Pravia, Z.M. Semantic segmentation of defects in infrared thermographic images of highly damaged concrete structures. J. Perform. Constr. Facil. 2021 , 35 , 04020131. [ Google Scholar ] [ CrossRef ]
  • Kaur, R.; Singh, S. A comprehensive review of object detection with deep learning. Digit. Signal Process. 2023 , 132 , 103812. [ Google Scholar ] [ CrossRef ]
  • Sharma, V.K.; Mir, R.N. A comprehensive and systematic look up into deep learning based object detection techniques: A review. Comput. Sci. Rev. 2020 , 38 , 100301. [ Google Scholar ] [ CrossRef ]
  • Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3708–3712. [ Google Scholar ]
  • Yang, C.; Chen, J.; Li, Z.; Huang, Y. Structural crack detection and recognition based on deep learning. Appl. Sci. 2021 , 11 , 2868. [ Google Scholar ] [ CrossRef ]
  • Rajadurai, R.-S.; Kang, S.-T. Automated vision-based crack detection on concrete surfaces using deep learning. Appl. Sci. 2021 , 11 , 5229. [ Google Scholar ] [ CrossRef ]
  • Kim, B.; Yuvaraj, N.; Sri Preethaa, K.; Arun Pandian, R. Surface crack detection using deep learning with shallow CNN architecture for enhanced computation. Neural Comput. Appl. 2021 , 33 , 9289–9305. [ Google Scholar ] [ CrossRef ]
  • O’Brien, D.; Osborne, J.A.; Perez-Duenas, E.; Cunningham, R.; Li, Z. Automated crack classification for the CERN underground tunnel infrastructure using deep learning. Tunn. Undergr. Space Technol. 2023 , 131 , 104668. [ Google Scholar ]
  • Chen, K.; Reichard, G.; Xu, X.; Akanmu, A. Automated crack segmentation in close-range building façade inspection images using deep learning techniques. J. Build. Eng. 2021 , 43 , 102913. [ Google Scholar ] [ CrossRef ]
  • Dong, Z.; Wang, J.; Cui, B.; Wang, D.; Wang, X. Patch-based weakly supervised semantic segmentation network for crack detection. Constr. Build. Mater. 2020 , 258 , 120291. [ Google Scholar ] [ CrossRef ]
  • Buatik, A.; Thansirichaisree, P.; Kalpiyapun, P.; Khademi, N.; Pasityothin, I.; Poovarodom, N. Mosaic crack mapping of footings by convolutional neural networks. Sci. Rep. 2024 , 14 , 7851. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Zhang, Y.; Zhang, L. Detection of pavement cracks by deep learning models of transformer and UNet. arXiv 2023 , arXiv:2304.12596. [ Google Scholar ] [ CrossRef ]
  • Al-Huda, Z.; Peng, B.; Algburi, R.N.A.; Al-antari, M.A.; Rabea, A.-J.; Zhai, D. A hybrid deep learning pavement crack semantic segmentation. Eng. Appl. Artif. Intell. 2023 , 122 , 106142. [ Google Scholar ] [ CrossRef ]
  • Shamsabadi, E.A.; Xu, C.; Rao, A.S.; Nguyen, T.; Ngo, T.; Dias-da-Costa, D. Vision transformer-based autonomous crack detection on asphalt and concrete surfaces. Autom. Constr. 2022 , 140 , 104316. [ Google Scholar ] [ CrossRef ]
  • Huang, S.; Tang, W.; Huang, G.; Huangfu, L.; Yang, D. Weakly supervised patch label inference networks for efficient pavement distress detection and recognition in the wild. IEEE Trans. Intell. Transp. Syst. 2023 , 24 , 5216–5228. [ Google Scholar ] [ CrossRef ]
  • Huang, G.; Huang, S.; Huangfu, L.; Yang, D. Weakly supervised patch label inference network with image pyramid for pavement diseases recognition in the wild. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 7978–7982. [ Google Scholar ]
  • Guo, J.-M.; Markoni, H. Efficient and adaptable patch-based crack detection. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 21885–21896. [ Google Scholar ] [ CrossRef ]
  • König, J.; Jenkins, M.D.; Mannion, M.; Barrie, P.; Morison, G. Weakly-supervised surface crack segmentation by generating pseudo-labels using localization with a classifier and thresholding. IEEE Trans. Intell. Transp. Syst. 2022 , 23 , 24083–24094. [ Google Scholar ] [ CrossRef ]
  • Al-Huda, Z.; Peng, B.; Algburi, R.N.A.; Al-antari, M.A.; Rabea, A.-J.; Al-maqtari, O.; Zhai, D. Asymmetric dual-decoder-U-Net for pavement crack semantic segmentation. Autom. Constr. 2023 , 156 , 105138. [ Google Scholar ] [ CrossRef ]
  • Wen, T.; Lang, H.; Ding, S.; Lu, J.J.; Xing, Y. PCDNet: Seed operation-based deep learning model for pavement crack detection on 3d asphalt surface. J. Transp. Eng. Part B Pavements 2022 , 148 , 04022023. [ Google Scholar ] [ CrossRef ]
  • Mishra, A.; Gangisetti, G.; Eftekhar Azam, Y.; Khazanchi, D. Weakly supervised crack segmentation using crack attention networks on concrete structures. Struct. Health Monit. 2024 , 23 , 14759217241228150. [ Google Scholar ] [ CrossRef ]
  • Kompanets, A.; Pai, G.; Duits, R.; Leonetti, D.; Snijder, B. Deep learning for segmentation of cracks in high-resolution images of steel bridges. arXiv 2024 , arXiv:2403.17725. [ Google Scholar ]
  • Liu, Y.; Yeoh, J.K. Robust pixel-wise concrete crack segmentation and properties retrieval using image patches. Autom. Constr. 2021 , 123 , 103535. [ Google Scholar ] [ CrossRef ]
  • Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [ Google Scholar ]
  • Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [ Google Scholar ]
  • Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 28. [ Google Scholar ]
  • He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [ Google Scholar ]
  • Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [ Google Scholar ]
  • Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [ Google Scholar ]
  • Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018 , arXiv:1804.02767. [ Google Scholar ]
  • Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020 , arXiv:2004.10934. [ Google Scholar ]
  • Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475. [ Google Scholar ]
  • Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [ Google Scholar ]
  • Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [ Google Scholar ]
  • Xu, Y.; Li, D.; Xie, Q.; Wu, Q.; Wang, J. Automatic defect detection and segmentation of tunnel surface using modified mask R-CNN. Measurement 2021 , 178 , 109316. [ Google Scholar ] [ CrossRef ]
  • Zhao, W.; Liu, Y.; Zhang, J.; Shao, Y.; Shu, J. Automatic pixel-level crack detection and evaluation of concrete structures using deep learning. Struct. Control Health Monit. 2022 , 29 , e2981. [ Google Scholar ] [ CrossRef ]
  • Li, R.; Yu, J.; Li, F.; Yang, R.; Wang, Y.; Peng, Z. Automatic bridge crack detection using unmanned aerial vehicle and faster R-CNN. Constr. Build. Mater. 2023 , 362 , 129659. [ Google Scholar ] [ CrossRef ]
  • Tran, T.S.; Nguyen, S.D.; Lee, H.J.; Tran, V.P. Advanced crack detection and segmentation on bridge decks using deep learning. Constr. Build. Mater. 2023 , 400 , 132839. [ Google Scholar ] [ CrossRef ]
  • Zhang, J.; Qian, S.; Tan, C. Automated bridge crack detection method based on lightweight vision models. Complex Intell. Syst. 2023 , 9 , 1639–1652. [ Google Scholar ] [ CrossRef ]
  • Ren, R.; Liu, F.; Shi, P.; Wang, H.; Huang, Y. Preprocessing of crack recognition: Automatic crack-location method based on deep learning. J. Mater. Civ. Eng. 2023 , 35 , 04022452. [ Google Scholar ] [ CrossRef ]
  • Liu, Z.; Yeoh, J.K.; Gu, X.; Dong, Q.; Chen, Y.; Wu, W.; Wang, L.; Wang, D. Automatic pixel-level detection of vertical cracks in asphalt pavement based on gpr investigation and improved mask R-CNN. Autom. Constr. 2023 , 146 , 104689. [ Google Scholar ] [ CrossRef ]
  • Li, Z.; Zhu, H.; Huang, M. A deep learning-based fine crack segmentation network on full-scale steel bridge images with complicated backgrounds. IEEE Access 2021 , 9 , 114989–114997. [ Google Scholar ] [ CrossRef ]
  • Alipour, M.; Harris, D.K.; Miller, G.R. Robust pixel-level crack detection using deep fully convolutional neural networks. J. Comput. Civ. Eng. 2019 , 33 , 04019040. [ Google Scholar ] [ CrossRef ]
  • Wang, S.; Pan, Y.; Chen, M.; Zhang, Y.; Wu, X. FCN-SFW: Steel structure crack segmentation using a fully convolutional network and structured forests. IEEE Access 2020 , 8 , 214358–214373. [ Google Scholar ] [ CrossRef ]
  • Hang, J.; Wu, Y.; Li, Y.; Lai, T.; Zhang, J.; Li, Y. A deep learning semantic segmentation network with attention mechanism for concrete crack detection. Struct. Health Monit. 2023 , 22 , 3006–3026. [ Google Scholar ] [ CrossRef ]
  • Sun, Y.; Yang, Y.; Yao, G.; Wei, F.; Wong, M. Autonomous crack and bughole detection for concrete surface image based on deep learning. IEEE Access 2021 , 9 , 85709–85720. [ Google Scholar ] [ CrossRef ]
  • Wang, Z.; Leng, Z.; Zhang, Z. A weakly-supervised transformer-based hybrid network with multi-attention for pavement crack detection. Constr. Build. Mater. 2024 , 411 , 134134. [ Google Scholar ] [ CrossRef ]
  • Chen, T.; Cai, Z.; Zhao, X.; Chen, C.; Liang, X.; Zou, T.; Wang, P. Pavement crack detection and recognition using the architecture of segNet. J. Ind. Inf. Integr. 2020 , 18 , 100144. [ Google Scholar ] [ CrossRef ]
  • Bai, S.; Ma, M.; Yang, L.; Liu, Y. Pixel-wise crack defect segmentation with dual-encoder fusion network. Constr. Build. Mater. 2024 , 426 , 136179. [ Google Scholar ] [ CrossRef ]
  • Wang, W.; Su, C. Semi-supervised semantic segmentation network for surface crack detection. Autom. Constr. 2021 , 128 , 103786. [ Google Scholar ] [ CrossRef ]
  • Tabernik, D.; Šela, S.; Skvarč, J.; Skočaj, D. Segmentation-based deep-learning approach for surface-defect detection. J. Intell. Manuf. 2020 , 31 , 759–776. [ Google Scholar ] [ CrossRef ]
  • König, J.; Jenkins, M.D.; Mannion, M.; Barrie, P.; Morison, G. Optimized deep encoder-decoder methods for crack segmentation. Digit. Signal Process. 2021 , 108 , 102907. [ Google Scholar ] [ CrossRef ]
  • Wang, C.; Liu, H.; An, X.; Gong, Z.; Deng, F. Swincrack: Pavement crack detection using convolutional swin-transformer network. Digit. Signal Process. 2024 , 145 , 104297. [ Google Scholar ] [ CrossRef ]
  • Lan, Z.-X.; Dong, X.-M. Minicrack: A simple but efficient convolutional neural network for pixel-level narrow crack detection. Comput. Ind. 2022 , 141 , 103698. [ Google Scholar ] [ CrossRef ]
  • Salton, G. Introduction to Modern Information Retrieval ; McGraw-Hill: New York, NY, USA, 1983. [ Google Scholar ]
  • Jenkins, M.D.; Carr, T.A.; Iglesias, M.I.; Buggy, T.; Morison, G. A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks. In Proceedings of the 2018 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, 3–7 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2120–2124. [ Google Scholar ]
  • Tsai, Y.-C.; Chatterjee, A. Comprehensive, quantitative crack detection algorithm performance evaluation system. J. Comput. Civ. Eng. 2017 , 31 , 04017047. [ Google Scholar ] [ CrossRef ]
  • Li, H.; Wang, J.; Zhang, Y.; Wang, Z.; Wang, T. A study on evaluation standard for automatic crack detection regard the random fractal. arXiv 2020 , arXiv:2007.12082. [ Google Scholar ]


| Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
| --- | --- | --- | --- | --- | --- | --- |
| Canny and YOLOv4 [ ] | Crack detection and measurement | Bridges | 1463 images, 256 × 256 pixels | Smartphone and DJI UAV | Accuracy = 92%; mAP = 92% | The Canny edge detector is sensitive to the choice of threshold |
| Canny and GM-ResNet [ ] | Crack detection, measurement, and classification | Road | 522 images, 224 × 224 pixels | Concrete crack sub-dataset | Precision = 97.9%; Recall = 98.9%; F1 measure = 98.0%; Accuracy (shadow) = 99.3%; Accuracy (shadow-free) = 99.9% | Detection performance on complex cracks is not yet perfect |
| Sobel and ResNet50 [ ] | Crack detection | Concrete | 4500 images, 100 × 100 pixels | FLIR E8 | Precision = 98.4%; Recall = 88.7%; F1 measure = 93.2% | - |
| Sobel and BARNet [ ] | Crack detection and localization | Road | 206 images, 800 × 600 pixels | CrackTree200 dataset | AIU = 19.85%; ODS = 79.9%; OIS = 81.4% | Hyperparameter tuning is needed to balance the penalty weights for different crack types |
| Canny and DeepLabV3+ [ ] | Crack detection | Road | 2000 × 1500 pixels | Crack500 dataset | MIoU = 77.64%; MAE = 1.55; PA = 97.38%; F1 score = 63% | Detection performance deteriorates in dark environments or when interfering objects are present |
| Canny and RetinaNet [ ] | Crack detection and measurement | Road | 850 images, 256 × 256 pixels | SDNET 2018 dataset | Precision = 85.96%; Recall = 84.48%; F1 score = 85.21% | - |
| Canny and Transformer [ ] | Crack detection and segmentation | Buildings | 11,298 images, 450 × 450 pixels | UAVs | GA = 83.5%; MIoU = 76.2%; Precision = 74.3%; Recall = 75.2%; F1 score = 74.7% | Incurs a marginal increase in computational cost across network backbones |
| Canny and Inception-ResNet-v2 [ ] | Crack detection, measurement, and classification | High-speed railway | 4650 images, 400 × 400 pixels | Track inspection vehicle | High severity: Precision = 98.37%, Recall = 93.82%, F1 score = 95.99%; Low severity: Precision = 94.25%, Recall = 98.39%, F1 score = 96.23% | Severity was defined by average crack width only; the influence of crack length on the detection result was not considered |
| Canny and Unet [ ] | Crack detection | Buildings | 165 images | - | PSNR = 14.5392; SSIM = 0.3206; RMSE = 0.0747 | Relies on a large amount of mural data for training and enhancement |
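The methods summarized above pair a classical edge detector (Canny or Sobel) with a deep network, and the recurring limitation is sensitivity to the edge threshold. That sensitivity can be seen even in a minimal sketch: the following pure-Python Sobel-style detector runs on a tiny synthetic image (not any dataset from the table), and the resulting "crack" mask changes entirely with the cut-off value.

```python
# Minimal Sobel-style edge detector illustrating threshold sensitivity.
# Toy synthetic image only; not the pipeline of any paper in the table.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient_magnitude(img):
    """Convolve with the Sobel kernels and return per-pixel |g| (interior only)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

def edge_map(img, threshold):
    """Binarize the gradient magnitude: the edge mask depends on the cut-off."""
    mag = gradient_magnitude(img)
    return [[1 if v >= threshold else 0 for v in row] for row in mag]

# Synthetic 6x6 image: bright background with a dark vertical "crack" column.
img = [[200, 200, 40, 200, 200, 200] for _ in range(6)]

weak = sum(map(sum, edge_map(img, 100)))    # low threshold: edges detected
strict = sum(map(sum, edge_map(img, 700)))  # high threshold: edges vanish
```

With the low threshold the crack's two edges are recovered on every interior row; with the high threshold the same crack disappears entirely, which is exactly the failure mode the table attributes to fixed-threshold Canny pipelines.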
MethodFeaturesDomainDatasetImage Device/SourceResultsLimitations
| Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
|---|---|---|---|---|---|---|
| Otsu and Keras classifier [ ] | Crack detection, measurement, and classification | Concrete | 4000 images, 227 × 227 pixels | Open dataset available | Classifier accuracy = 98.25%, 97.18%, 96.17%; Length error = 1.5%; Width error = 5%; Angle of orientation error = 2% | Can only accurately quantify a single crack per image |
| Otsu and TL MobileNetV2 [ ] | Crack detection, measurement, and classification | Concrete | 11,435 images, 224 × 224 pixels | Mendeley data—crack detection | Accuracy = 99.87%; Recall = 99.74%; Precision = 100%; F1 score = 99.87% | Dependent on image quality |
| Otsu, YOLOv7, Poisson noise, and bilateral filtering [ ] | Crack detection and classification | Bridges | 500 images, 640 × 640 pixels | Dataset | Training time = 35 min; Inference time = 8.9 s; Target correct rate = 85.97%; Negative-sample misclassification rate = 42.86% | Does not provide quantified information such as length and area |
| Adaptive threshold and WSIS [ ] | Crack detection | Road | 320 images, 3024 × 4032 pixels | Photos of cracks | Recall = 90%; Precision = 52%; IoU = 50%; F1 score = 66%; Accuracy = 98% | For small cracks (width below 3 pixels), the model can only confirm their existence; it struggles to depict them in detail |
| Adaptive threshold and U-GAT-IT [ ] | Crack detection | Road | 300 training images and 237 test images | DeepCrack dataset | Recall = 79.3%; Precision = 82.2%; F1 score = 80.7% | Further research is needed on interference from small cracks, road shadows, and water stains |
| Local thresholding and DCNN [ ] | Crack detection | Concrete | 125 images, 227 × 227 pixels | Cameras | Accuracy = 93%; Recall = 91%; Precision = 92%; F1 score = 91% | - |
| Otsu and Faster R-CNN [ ] | Crack detection, localization, and quantification | Concrete | 100 images, 1920 × 1080 pixels | Nikon D7200 camera and Galaxy S9 camera | AP = 95%; mIoU = 83%; RMSE = 2.6 pixels; Length accuracy = 93% | Designed for concrete cracks only; applicability to other crack materials may be limited |
| Adaptive Dynamic Thresholding Module (ADTM) and Mask DINO [ ] | Crack detection and segmentation | Road | 395 images, 2000 × 1500 pixels | Crack500 | mIoU = 81.3%; mAcc = 96.4%; gAcc = 85.0% | ADTM module can only handle binary classification problems |
| Dynamic Thresholding Branch and DeepCrack [ ] | Crack detection and classification | Bridges | 3648 × 5472 pixels | Crack500 | mIoU = 79.3%; mAcc = 98.5%; gAcc = 86.6% | Image-level thresholds lead to misclassification of the background |
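Several methods in the table above pair Otsu's global threshold with a deep model. As a minimal, self-contained illustration (not any cited paper's implementation), the following NumPy sketch selects the Otsu threshold by maximizing the between-class variance of the grey-level histogram, then segments dark crack pixels on a synthetic patch:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the intensity threshold (0-255) maximizing between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # all pixels on one side; no valid split
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Synthetic bimodal patch: a dark horizontal "crack" (30) on a bright background (200)
img = np.full((64, 64), 200, dtype=np.uint8)
img[28:36, :] = 30
t = otsu_threshold(img)
mask = img < t   # crack pixels fall below the Otsu threshold
```

In the pipelines above, a mask like this typically serves as a cheap region proposal or post-processing step around the deep model, rather than as the detector itself.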
| Method | Features | Domain | Dataset | Image Device/Source | Results | Limitations |
|---|---|---|---|---|---|---|
| Morphological closing operations and Mask R-CNN [ ] | Crack detection | Tunnel | 761 images, 227 × 227 pixels | MTI-200a | Balanced accuracy = 81.94%; F1 score = 68.68%; IoU = 52.72% | Dataset is relatively small compared with the sample size required for universal conditions |
| Morphological operations and Parallel ResNet [ ] | Crack detection and measurement | Road | 206 images (CrackTree200), 800 × 600 pixels, and 118 images (CFD), 320 × 480 pixels | CrackTree200 dataset and CFD dataset | CrackTree200: Precision = 94.27%, Recall = 92.52%, F1 = 93.08%; CFD: Precision = 96.21%, Recall = 95.12%, F1 = 95.63% | The method was only evaluated on accurate static images |
| Closing and CNN [ ] | Crack detection, measurement, and classification | Concrete | 3208 images, 256 × 256 or 128 × 128 pixels | Hand-held DSLR cameras | Relative error = 5%; Accuracy > 95%; Loss < 0.1 | Extraction of the cracks' edges has a large influence on the results |
| Dilation and TunnelURes [ ] | Crack detection, measurement, and classification | Tunnel | 6810 images, sizes varying from 10,441 × 2910 to 50,739 × 3140 pixels | Night 4K line-scan cameras | AUC = 0.97; PA = 0.928; IoU = 0.847 | The medial-axis skeletonization algorithm produced many errors because it was susceptible to crack intersections and to image edges where the crack's representation changed |
| Opening, closing, and U-Net [ ] | Crack detection, measurement, and classification | Concrete | 200 images, 512 × 512 pixels | Canon SX510 HS camera | Precision = 96.52%; Recall = 93.73%; F-measure = 96.12%; Accuracy = 99.74%; IoU = 78.12% | Can only detect other crack types that share the same geometry as thermal cracks |
| Morphological operations and DeepLabV3+ [ ] | Crack detection and measurement | Masonry structure | 200 images, 780 × 355 and 2880 × 1920 pixels | Internet, drones, and smartphones | IoU = 0.97; F1 score = 98%; Accuracy = 98% | Will not detect crack features absent from the dataset (complicated cracks, tiny cracks, etc.) |
| Erosion, texture analysis techniques, and InceptionV3 [ ] | Crack detection and classification | Bridges | 1706 images, 256 × 256 pixels | Cameras | F1 score = 93.7%; Accuracy = 94.07% | - |
| U-Net, opening, and closing operations [ ] | Crack detection and segmentation | Bridges | 244 images, 512 × 512 pixels | Cameras | mP = 44.57%; mR = 53.13%; mF1 = 42.79%; mIoU = 64.79% | The model lacks generality, and there are cases of false detection |
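The morphological operations these methods rely on (dilation, erosion, opening, closing) can be sketched in a few lines of NumPy for binary masks. This is a generic illustration, not the pipeline of any cited work; note that `np.roll` wraps at image borders, which is acceptable for a demo but not for production use:

```python
import numpy as np

def dilate(mask, k=1):
    """Binary dilation with a (2k+1) x (2k+1) square structuring element."""
    out = mask.copy()
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

def erode(mask, k=1):
    """Binary erosion, via duality: complement of the dilation of the complement."""
    return ~dilate(~mask, k)

def closing(mask, k=1):
    """Dilation followed by erosion: bridges small gaps in thin crack masks."""
    return erode(dilate(mask, k), k)

# A thin horizontal "crack" with a one-pixel gap that closing should bridge
m = np.zeros((9, 9), dtype=bool)
m[4, 1:4] = True
m[4, 5:8] = True        # gap at (4, 4)
closed = closing(m)
```

This is why closing is a popular post-processing step after segmentation: it reconnects crack fragments that the network predicts as disjoint pieces.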
| Sensor Type | Fusion Method | Advantages | Disadvantages | Application Scenarios |
|---|---|---|---|---|
| Optical sensor [ ] | Data-level fusion | High resolution, rich in details | Susceptible to light and occlusion | Surface crack detection, general environments |
| Thermal sensor [ ] | Feature-level fusion | Suitable for nighttime or low-light environments; detects temperature changes | Low resolution, lack of detail | Nighttime detection, heat-sensitive areas, large-area surface crack detection |
| Laser sensor [ ] | Data-level and feature-level fusion | High-precision 3D point cloud data; accurately measures crack morphology | High equipment cost, complex data processing | Complex structures, precise measurements |
| Strain sensor [ ] | Feature-level and decision-level fusion | High sensitivity to structural changes; durable | Requires contact with the material; installation complexity | Monitoring structural health in bridges and buildings; detecting early-stage crack development |
| Ultrasonic sensor [ ] | Data-level and feature-level fusion | Detects internal cracks in materials; strong penetration | Affected by material and geometric shape; limited resolution | Internal cracks, metal material detection |
| Optical fiber sensor [ ] | Feature-level fusion | High sensitivity to changes in material properties; non-contact measurement | Affected by environmental conditions; requires calibration | Surface crack detection, structural health monitoring |
| Vibration sensor [ ] | Data-level fusion | Detects structural vibration characteristics; strong adaptability | Affected by environmental vibrations; requires complex signal processing | Dynamic crack monitoring, bridges and other structures |
| Multispectral satellite sensor [ ] | Data-level fusion | Rich spectral information | Limited spectral resolution; weather- and lighting-dependent; high cost | Pavement crack detection, bridge and infrastructure monitoring, building facade inspection |
| High-resolution satellite sensor [ ] | Data-level and feature-level fusion | High spatial resolution, wide coverage, frequent revisit times, rich information content | Weather dependency, high cost, data-processing complexity, limited temporal resolution | Road and pavement crack detection, bridge and infrastructure monitoring, urban building facade inspection, railway and highway crack monitoring |
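Feature-level fusion, listed for several sensor types above, typically concatenates per-sensor feature vectors into one descriptor before a joint classifier. The extractors below are hypothetical stand-ins (real systems would use CNN embeddings or engineered signal features); the sketch only shows the fusion step itself:

```python
import numpy as np

def optical_features(img):
    # Stand-in for optical features: simple intensity statistics of a patch
    return np.array([img.mean(), img.std()])

def thermal_features(tmap):
    # Stand-in for thermal features: temperature range as a coarse crack cue
    return np.array([tmap.max() - tmap.min()])

def fuse(img, tmap):
    """Feature-level fusion: concatenate per-sensor descriptors into one vector."""
    return np.concatenate([optical_features(img), thermal_features(tmap)])

rng = np.random.default_rng(1)
fused = fuse(rng.random((32, 32)), rng.random((8, 8)))  # 2 + 1 = 3 features
```

Data-level fusion, by contrast, would merge the raw images (e.g. registered pixel stacks) before any feature extraction, and decision-level fusion would combine the per-sensor classifier outputs instead.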
| Scale | Dataset/(Pixels × Pixels) | References |
|---|---|---|
| Image-based | 227 × 227 | [ , , , ] |
| | 224 × 224 | [ ] |
| | 256 × 256 | [ ] |
| | 416 × 416 | [ ] |
| | 512 × 512 | [ ] |
| Patch-based | 128 × 128 | [ , ] |
| | 200 × 200 | [ ] |
| | 224 × 224 | [ , , , , ] |
| | 227 × 227 | [ ] |
| | 256 × 256 | [ , ] |
| | 300 × 300 | [ , ] |
| | 320 × 480 | [ , ] |
| | 544 × 384 | [ ] |
| | 512 × 512 | [ , , , ] |
| | 584 × 384 | [ ] |
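Patch-based approaches tile a large image into fixed-size patches such as 224 × 224 or 256 × 256 before classification. A minimal sketch of non-overlapping patch extraction, assuming a single-channel image and dropping edge remainders:

```python
import numpy as np

def extract_patches(img, size):
    """Tile an image into non-overlapping size x size patches (edge remainders dropped)."""
    h, w = img.shape[:2]
    patches = [img[y:y + size, x:x + size]
               for y in range(0, h - size + 1, size)
               for x in range(0, w - size + 1, size)]
    return np.stack(patches)

# A 512 x 512 image yields a 2 x 2 grid of 256 x 256 patches
img = np.zeros((512, 512), dtype=np.uint8)
patches = extract_patches(img, 256)
```

Overlapping strides or padding are common variants when cracks must not be cut at patch boundaries, at the cost of redundant computation.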
| Model | Improvement/Innovation | Dataset | Backbone | Results |
|---|---|---|---|---|
| Faster R-CNN [ ] | Combined with drones for crack detection | 2000 images, 5280 × 2970 pixels | VGG-16 | Precision = 92.03%; Recall = 96.26%; F1 score = 94.10% |
| Faster R-CNN [ ] | Introduced a double-head structure with an independent fully connected head and a convolution head | 1622 images, 1612 × 1947 pixels | ResNet50 | AP = 47.2% |
| Mask R-CNN [ ] | Incorporated the morphological closing operation into the M-R-101-FPN model to form an integrated model | 761 images, 227 × 227 pixels | ResNets and VGG | Balanced accuracy = 81.94%; F1 score = 68.68%; IoU = 52.72% |
| Mask R-CNN [ ] | Introduced a PAFPN module and an edge detection branch | 9680 images, 1500 × 1500 pixels | ResNet-FPN | Precision = 92.03%; Recall = 96.26%; AP = 94.10%; mAP = 90.57%; Error rate = 0.57% |
| Mask R-CNN [ ] | FPN structure with a side-join method, combining FPN with ResNet-101 and replacing the RoI-Pooling layer with a RoI-Align layer | 3430 images, 1024 × 1024 pixels | ResNet101 | AP = 83.3%; F1 score = 82.4%; Average error = 2.33%; mIoU = 70.1% |
| YOLOv3-tiny [ ] | Proposed a structural crack detection and quantification method combined with structured light | 500 images, 640 × 640 pixels | Darknet-53 | Accuracy = 94%; Precision = 98% |
| YOLOv4 [ ] | Replaced the original backbone feature extraction network with lightweight networks; DenseNet, MobileNet, and GhostNet were selected | 800 images, 416 × 416 pixels | DenseNet, MobileNet v1, MobileNet v2, MobileNet v3, and GhostNet | Precision = 93.96%; Recall = 90.12%; F1 score = 92% |
| YOLOv4 [ ] | - | 1463 images, 256 × 256 pixels | Darknet-53 | Accuracy = 92%; mAP = 92% |
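The AP and mAP figures reported for these detection models are built on the bounding-box IoU between predicted and ground-truth boxes (a prediction counts as a true positive when its IoU exceeds a threshold such as 0.5). A minimal reference implementation for axis-aligned boxes:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
    inter = iw * ih                               # intersection area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

# Two 10 x 10 boxes overlapping in a 5 x 10 strip: IoU = 50 / (100 + 100 - 50) = 1/3
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
```

AP then summarizes the precision-recall curve obtained by sweeping the detector's confidence threshold under a fixed IoU criterion, and mAP averages AP over classes (or over several IoU thresholds, in COCO style).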
| Datasets Name | Number of Images | Image Resolution | Manual Annotation | Scope of Applicability | Limitations |
|---|---|---|---|---|---|
| CrackTree200 [ ] | 206 images | 800 × 600 pixels | Pixel-level annotations for cracks | Crack classification and segmentation | With only about 200 images, the dataset's small size can hinder generalization across diverse conditions and may lead to overfitting on the specific examples provided |
| Crack500 [ ] | 500 images | 2000 × 1500 pixels | Pixel-level annotations for cracks | Crack classification and segmentation | Limited number of images compared with larger datasets, which may affect the generalization of models trained on it |
| SDNET 2018 [ ] | 56,000 images | 256 × 256 pixels | Pixel-level annotations for cracks | Crack classification and segmentation | The focus on concrete surfaces may limit performance when applied to other types of surfaces or structures |
| Mendeley data—crack detection [ ] | 40,000 images | 227 × 227 pixels | Pixel-level annotations for cracks | Crack classification | May not cover all crack types or surface conditions, limiting applicability to a wide range of real-world scenarios |
| DeepCrack [ ] | 2500 images | 512 × 512 pixels | Annotations for cracks | Crack segmentation | The resolution may limit the ability of models to capture very small or subtle crack features |
| CFD [ ] | 118 images | 320 × 480 pixels | Pixel-level annotations for cracks | Crack segmentation | The limited number of samples may limit the generalization ability of the model |
| CrackTree260 [ ] | 260 images | 800 × 600 and 960 × 720 pixels | Pixel-level labeling, bounding boxes, or other crack markers | Object detection and segmentation | Because the dataset is small, models (especially complex ones) can easily overfit the training data |
| CrackLS315 [ ] | 315 images | 512 × 512 pixels | Pixel-level segmentation mask or bounding box | Object detection and segmentation | The small dataset may cause poor performance in complex scenarios, especially on unusual crack types or uncommon crack features |
| Stone331 [ ] | 331 images | 512 × 512 pixels | Pixel-level segmentation mask or bounding box | Object detection and segmentation | The relatively small number of images limits generalization, since smaller datasets tend to lead to overfitting in deep learning tasks |
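A common mitigation for the small-dataset limitations noted above is geometric augmentation. A generic sketch, not tied to any cited dataset, that generates the eight flip/rotation variants of a square patch:

```python
import numpy as np

def augment(img):
    """Eightfold augmentation: four 90-degree rotations, each with and without a flip."""
    out = []
    for k in range(4):
        r = np.rot90(img, k)
        out.append(r)
        out.append(np.fliplr(r))
    return out

img = np.arange(16).reshape(4, 4)
aug = augment(img)   # 8 variants from a single patch
```

Because cracks have no canonical orientation, these symmetries usually preserve label validity, effectively multiplying a 200-image dataset without new annotation effort.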
| Index | Index Value and Calculation Formula | Curve |
|---|---|---|
| True positive (TP) | Crack samples correctly predicted as cracks | - |
| False positive (FP) | Non-crack samples incorrectly predicted as cracks | - |
| True negative (TN) | Non-crack samples correctly predicted as non-cracks | - |
| False negative (FN) | Crack samples incorrectly predicted as non-cracks | - |
| Precision | TP / (TP + FP) | PRC |
| Recall | TP / (TP + FN) | PRC, ROC curve |
| F1 score | 2 × Precision × Recall / (Precision + Recall) | F1 score curve |
| Accuracy | (TP + TN) / (TP + FP + TN + FN) | Accuracy vs. threshold curve |
| Average precision (AP) | Area under the precision-recall curve | PRC |
| Mean average precision (mAP) | Mean of AP over all classes | - |
| IoU | Area of overlap / area of union between prediction and ground truth | IoU distribution curve, precision-recall curve with IoU thresholds |
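The threshold-based indices above reduce to a handful of formulas over the confusion counts. A compact reference implementation for pixel-level evaluation:

```python
def metrics(tp, fp, tn, fn):
    """Precision, recall, F1, accuracy, and IoU from pixel-level confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    iou = tp / (tp + fp + fn)          # true negatives do not enter IoU
    return precision, recall, f1, accuracy, iou

p, r, f1, acc, iou = metrics(tp=80, fp=20, tn=880, fn=20)
# p = 0.8, r = 0.8, f1 = 0.8, acc = 0.96, iou = 80/120 ~ 0.667
```

Note how accuracy (0.96) flatters the result because background pixels dominate crack images, while IoU, which ignores true negatives, gives a much stricter 0.667; this is why segmentation papers in the tables above report IoU or mIoU alongside accuracy.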
Share and Cite

Yuan, Q.; Shi, Y.; Li, M. A Review of Computer Vision-Based Crack Detection Methods in Civil Infrastructure: Progress and Challenges. Remote Sens. 2024 , 16 , 2910. https://doi.org/10.3390/rs16162910
