
Developments in Image Processing Using Deep Learning and Reinforcement Learning


1. Introduction

2. Methodology

2.1. Search Process and Sources of Information

2.2. Inclusion and Exclusion Criteria for Article Selection

3. Technical Background

3.1. Graphics Processing Units

3.2. Image Processing

3.3. Machine Learning Overview

  • In supervised learning, predictive functions are determined using labeled training datasets, meaning each training instance must include both the input values and the expected label or output value [ 21 ]. This class of algorithms tries to identify the relationships between input and output values and to generate a predictive model able to determine the result based only on the corresponding input data [ 3 , 21 ]. Supervised learning methods are suitable for regression and data classification and are implemented by a variety of algorithms, such as linear regression, artificial neural networks (ANNs), decision trees (DTs), support vector machines (SVMs), k-nearest neighbors (KNNs), and random forests (RFs) [ 3 ]. As an example, systems using RF and DT algorithms have had a major impact on areas such as computational biology and disease prediction, while SVMs have also been used to study drug–target interactions and to predict several life-threatening diseases, such as cancer or diabetes [ 23 ].
  • Unsupervised learning is typically used to solve pattern recognition problems based on unlabeled training datasets. Unsupervised learning algorithms are able to group the training data into different categories according to their characteristics [ 21 , 24 ], mainly by means of clustering algorithms [ 24 ]. Because the number of categories is unknown in advance and the meaning of each category is unclear, unsupervised learning is usually applied to clustering problems and to association mining. Some commonly employed algorithms include K-means [ 3 ], as well as SVM or DT classifiers. Data processing tools like principal component analysis (PCA), which is used for dimensionality reduction, are often necessary prerequisites before attempting to cluster a set of data.
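The supervised/unsupervised distinction above can be made concrete with a minimal sketch (an illustrative toy example, not taken from the reviewed studies): a KNN classifier predicts a label from labeled data, while K-means groups the same points without ever seeing labels. All data and parameters here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic 2-D clusters standing in for extracted image features.
a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
b = rng.normal(loc=(3.0, 3.0), scale=0.5, size=(50, 2))
X = np.vstack([a, b])
y = np.array([0] * 50 + [1] * 50)  # labels, used only by the supervised method

def knn_predict(X_train, y_train, x, k=3):
    """Supervised: label a point by majority vote of its k nearest neighbors."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()

def kmeans(X, k=2, iters=20, seed=0):
    """Unsupervised: partition points into k clusters by iterative refinement."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

pred = knn_predict(X, y, np.array([2.8, 3.1]))  # supervised prediction
labels, centers = kmeans(X)                     # unsupervised clustering
```

Note that the cluster indices returned by K-means are arbitrary: without labels, the algorithm can say two points belong together, but not what the group means, which is exactly the limitation described above.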

3.3.1. Deep Learning Concepts

  • Training a DNN implies the definition of a loss function, which quantifies the error made by the network as the difference between the expected output value and the one the network produces. One of the most widely used loss functions in regression problems is the mean squared error (MSE) [ 30 ]. In the training phase, the weight vector is adjusted to minimize the loss function; since analytical solutions are generally not obtainable, the minimization is performed iteratively, most commonly by gradient descent [ 30 ].
  • Activation functions are fundamental to the learning process of neural network models, as well as to the representation of complex nonlinear functions. The activation function adds nonlinearity to the model; without it, the network would collapse to a single linear function, no matter how many layers it had. The Sigmoid function is the activation function most commonly used in the early stages of studying neural networks [ 30 ].
  • Because their capacity to learn and adjust to data is greater than that of traditional ML models, overfitting is more likely to occur in DL models. For this reason, regularization represents a crucial and highly effective set of techniques for reducing generalization error in ML. Other techniques that contribute to this goal are increasing the size of the training dataset, stopping the training phase early (early stopping), and randomly discarding a portion of the neuron outputs during training (dropout) [ 30 ].
  • Optimizers are used to increase stability and reduce convergence times in DL algorithms, and they also make the hyperparameter adjustment process more efficient [ 30 ].
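The interplay of the MSE loss, the gradient-descent update, and the sigmoid activation described above can be sketched in a few lines (a toy single-neuron example with invented data and learning rate, not code from the reviewed works):

```python
import numpy as np

def sigmoid(z):
    # Classic activation: squashes any real input into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Toy regression data: y = 2x + 1 plus a little noise.
rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=100)

w, b = 0.0, 0.0  # parameters to learn
lr = 0.5         # learning rate (one of the hyperparameters an optimizer tunes)
for _ in range(200):
    y_hat = w * x + b
    err = y_hat - y
    mse = np.mean(err ** 2)          # the loss being minimized
    grad_w = 2.0 * np.mean(err * x)  # dMSE/dw
    grad_b = 2.0 * np.mean(err)      # dMSE/db
    w -= lr * grad_w                 # gradient-descent update step
    b -= lr * grad_b
```

After training, `w` and `b` approach the generating values 2 and 1; a DNN repeats the same loop over millions of weights, with the gradients supplied by backpropagation.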

3.3.2. Reinforcement Learning Concepts

3.4. Current Challenges

4. Image Processing Developments

4.1. Domains

4.1.1. Research Using Deep Learning

  • One of the first DL models used for video prediction, inspired by the sequence-to-sequence model commonly used in natural language processing [ 97 ], uses a long short-term memory (LSTM) recurrent network to predict future images based on a sequence of images encoded during video data processing [ 97 ].
  • In their research, Salahzadeh et al. [ 98 ] presented a novel mechatronics platform for static and real-time posture analysis, combining three components: a mechanical structure with cameras, a software module for data collection and semi-automatic image analysis, and a network to provide the raw data to the DL server. The authors concluded that their device, in addition to being inexpensive and easy to use, provides stable, non-invasive postural assessment, proving to be a useful tool in the rehabilitation of patients.
  • Studies in graphical search engines and content-based image retrieval (CBIR) systems have also been successfully developed recently [ 11 , 82 , 99 , 100 ], with processing times that might be compatible with real-time applications. Most importantly, the results of these studies appeared to show adequate image retrieval capabilities, displaying a clear similarity between input and output, both on a semantic and a graphical basis [ 82 ]. In a review by Latif et al. [ 101 ], the authors concluded that image content cannot be adequately captured by a single feature representation; instead, combining several low-level features, which describe the image in the form of patches, increases retrieval performance.
  • In their publication, Rani et al. [ 102 ] reviewed the literature on this topic from 1995 to 2021. The authors found that researchers in microbiology have employed ML techniques for the image recognition of four types of micro-organisms: bacteria, algae, protozoa, and fungi. In their research work, Kasinathan and Uyyala [ 17 ] applied computer vision and knowledge-based approaches to improve insect detection and classification in dense image scenarios. Image processing techniques were applied to extract features, and classification models were built using ML algorithms. The proposed approach used different feature descriptors, such as texture, color, shape, histograms of oriented gradients (HOG), and global image descriptors (GIST). ML was used to analyze multi-variety insect data to make efficient use of resources and to improve classification accuracy for field-crop insects with a similar appearance.
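To give a feel for the HOG-style descriptors mentioned above, the following is a deliberately simplified sketch: a single global orientation histogram over image gradients, rather than the cell-and-block layout of full HOG. The image, bin count, and helper name are invented for illustration and do not come from the cited studies.

```python
import numpy as np

def hog_descriptor(image, n_bins=9):
    """Simplified histogram of oriented gradients: bin gradient
    orientations over the whole image, weighted by gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))   # gy: rows (vertical), gx: columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in classic HOG.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bins = np.minimum((orientation / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())  # magnitude-weighted votes
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist  # L2-normalize for comparability

# A vertical edge: intensity jumps along the x axis, so the dominant
# gradient orientation is horizontal (0 degrees -> first bin).
img = np.zeros((16, 16))
img[:, 8:] = 1.0
feat = hog_descriptor(img)
```

Descriptors like this one, concatenated with color and texture features, are what the classification models in the studies above consume in place of raw pixels.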

4.1.2. Research Using Reinforcement Learning

5. Discussion and Future Directions

6. Conclusions

  • Interest in image-processing systems using DL methods has exponentially increased over the last few years. The most common research disciplines for image processing and AI are medicine, computer science, and engineering.
  • Traditional ML methods are still extremely relevant and are frequently used in fields such as computational biology and disease diagnosis and prediction or to assist in specific tasks when coupled with other more complex methods. DL methods have become of particular interest in many image-processing problems, particularly because of their ability to circumvent some of the challenges that more traditional approaches face.
  • Much of the research effort focuses on improving model performance, reducing computational resources and time, and expanding the application of ML models to solve concrete real-world problems.
  • The medical field seems to have developed a particular interest in research using multiple classes and methods of learning algorithms. DL image processing has been useful in analyzing medical exams and other imaging applications. Some areas have also still found success using more traditional ML methods.
  • Another area of interest appears to be autonomous driving and driver profiling, possibly powered by the increased access to information available both for the drivers and the vehicles alike. Indeed, modern driving assistance systems have already implemented features such as (a) road lane finding, (b) free driving space finding, (c) traffic sign detection and recognition, (d) traffic light detection and recognition, and (e) road-object detection and tracking. This research field will undoubtedly be responsible for many more studies in the near future.
  • Graphical search engines and content-based image retrieval systems also present themselves as an interesting topic of research for image processing, with a diverse body of work and innovative approaches.

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

AI	Artificial Intelligence
ML	Machine Learning
DL	Deep Learning
CBIR	Content-Based Image Retrieval
CNN	Convolutional Neural Network
DNN	Deep Neural Network
DCNN	Deep Convolutional Neural Network
RGB	Red, Green, and Blue
  • Raschka, S.; Patterson, J.; Nolet, C. Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence. Information 2020 , 11 , 193. [ Google Scholar ] [ CrossRef ]
  • Barros, D.; Moura, J.; Freire, C.; Taleb, A.; Valentim, R.; Morais, P. Machine learning applied to retinal image processing for glaucoma detection: Review and perspective. BioMed. Eng. OnLine 2020 , 19 , 20. [ Google Scholar ] [ CrossRef ]
  • Zhu, M.; Wang, J.; Yang, X.; Zhang, Y.; Zhang, L.; Ren, H.; Wu, B.; Ye, L. A review of the application of machine learning in water quality evaluation. Eco-Environ. Health 2022 , 1 , 107–116. [ Google Scholar ] [ CrossRef ]
  • Singh, V.; Chen, S.S.; Singhania, M.; Nanavati, B.; Kar, A.K.; Gupta, A. How are reinforcement learning and deep learning algorithms used for big data based decision making in financial industries–A review and research agenda. Int. J. Inf. Manag. Data Insights 2022 , 2 , 100094. [ Google Scholar ] [ CrossRef ]
  • Moscalu, M.; Moscalu, R.; Dascălu, C.G.; Țarcă, V.; Cojocaru, E.; Costin, I.M.; Țarcă, E.; Șerban, I.L. Histopathological Images Analysis and Predictive Modeling Implemented in Digital Pathology—Current Affairs and Perspectives. Diagnostics 2023 , 13 , 2379. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, S.; Yang, D.M.; Rong, R.; Zhan, X.; Fujimoto, J.; Liu, H.; Minna, J.; Wistuba, I.I.; Xie, Y.; Xiao, G. Artificial Intelligence in Lung Cancer Pathology Image Analysis. Cancers 2019 , 11 , 1673. [ Google Scholar ] [ CrossRef ]
  • van der Velden, B.H.M.; Kuijf, H.J.; Gilhuijs, K.G.; Viergever, M.A. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Med. Image Anal. 2022 , 79 , 102470. [ Google Scholar ] [ CrossRef ]
  • Prevedello, L.M.; Halabi, S.S.; Shih, G.; Wu, C.C.; Kohli, M.D.; Chokshi, F.H.; Erickson, B.J.; Kalpathy-Cramer, J.; Andriole, K.P.; Flanders, A.E. Challenges related to artificial intelligence research in medical imaging and the importance of image analysis competitions. Radiol. Artif. Intell. 2019 , 1 , e180031. [ Google Scholar ] [ CrossRef ]
  • Smith, K.P.; Kirby, J.E. Image analysis and artificial intelligence in infectious disease diagnostics. Clin. Microbiol. Infect. 2020 , 26 , 1318–1323. [ Google Scholar ] [ CrossRef ]
  • Wu, Q. Research on deep learning image processing technology of second-order partial differential equations. Neural Comput. Appl. 2023 , 35 , 2183–2195. [ Google Scholar ] [ CrossRef ]
  • Jardim, S.; António, J.; Mora, C. Graphical Image Region Extraction with K-Means Clustering and Watershed. J. Imaging 2022 , 8 , 163. [ Google Scholar ] [ CrossRef ]
  • Ying, C.; Huang, Z.; Ying, C. Accelerating the image processing by the optimization strategy for deep learning algorithm DBN. EURASIP J. Wirel. Commun. Netw. 2018 , 232 , 232. [ Google Scholar ] [ CrossRef ]
  • Protopapadakis, E.; Voulodimos, A.; Doulamis, A.; Doulamis, N.; Stathaki, T. Automatic crack detection for tunnel inspection using deep learning and heuristic image post-processing. Appl. Intell. 2019 , 49 , 2793–2806. [ Google Scholar ] [ CrossRef ]
  • Yong, B.; Wang, C.; Shen, J.; Li, F.; Yin, H.; Zhou, R. Automatic ventricular nuclear magnetic resonance image processing with deep learning. Multimed. Tools Appl. 2021 , 80 , 34103–34119. [ Google Scholar ] [ CrossRef ]
  • Freeman, W.; Jones, T.; Pasztor, E. Example-based super-resolution. IEEE Comput. Graph. Appl. 2002 , 22 , 56–65. [ Google Scholar ] [ CrossRef ]
  • Rodellar, J.; Alférez, S.; Acevedo, A.; Molina, A.; Merino, A. Image processing and machine learning in the morphological analysis of blood cells. Int. J. Lab. Hematol. 2018 , 40 , 46–53. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Kasinathan, T.; Uyyala, S.R. Machine learning ensemble with image processing for pest identification and classification in field crops. Neural Comput. Appl. 2021 , 33 , 7491–7504. [ Google Scholar ] [ CrossRef ]
  • Yadav, P.; Gupta, N.; Sharma, P.K. A comprehensive study towards high-level approaches for weapon detection using classical machine learning and deep learning methods. Expert Syst. Appl. 2023 , 212 , 118698. [ Google Scholar ] [ CrossRef ]
  • Suganyadevi, S.; Seethalakshmi, V.; Balasamy, K. Reinforcement learning coupled with finite element modeling for facial motion learning. Int. J. Multimed. Inf. Retr. 2022 , 11 , 19–38. [ Google Scholar ] [ CrossRef ]
  • Zeng, Y.; Guo, Y.; Li, J. Recognition and extraction of high-resolution satellite remote sensing image buildings based on deep learning. Neural Comput. Appl. 2022 , 34 , 2691–2706. [ Google Scholar ] [ CrossRef ]
  • Pratap, A.; Sardana, N. Machine learning-based image processing in materials science and engineering: A review. Mater. Today Proc. 2022 , 62 , 7341–7347. [ Google Scholar ] [ CrossRef ]
  • Mahesh, B. Machine Learning Algorithms—A Review. Int. J. Sci. Res. 2020 , 9 , 1–6. [ Google Scholar ] [ CrossRef ]
  • Singh, D.P.; Kaushik, B. Machine learning concepts and its applications for prediction of diseases based on drug behaviour: An extensive review. Chemom. Intell. Lab. Syst. 2022 , 229 , 104637. [ Google Scholar ] [ CrossRef ]
  • Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. In Proceedings of the 4th International Conference on Learning Representations 2016, San Juan, Puerto Rico, 2–4 May 2016. [ Google Scholar ] [ CrossRef ]
  • Dworschak, F.; Dietze, S.; Wittmann, M.; Schleich, B.; Wartzack, S. Reinforcement Learning for Engineering Design Automation. Adv. Eng. Inform. 2022 , 52 , 101612. [ Google Scholar ] [ CrossRef ]
  • Khan, T.; Tian, W.; Zhou, G.; Ilager, S.; Gong, M.; Buyya, R. Machine learning (ML)-centric resource management in cloud computing: A review and future directions. J. Netw. Comput. Appl. 2022 , 204 , 103405. [ Google Scholar ] [ CrossRef ]
  • Botvinick, M.; Ritter, S.; Wang, J.X.; Kurth-Nelson, Z.; Blundell, C.; Hassabis, D. Reinforcement Learning, Fast and Slow. Trends Cogn. Sci. 2019 , 23 , 408–422. [ Google Scholar ] [ CrossRef ]
  • Moravčík, M.; Schmid, M.; Burch, N.; Lisý, V.; Morrill, D.; Bard, N.; Davis, T.; Waugh, K.; Johanson, M.; Bowling, M. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker. Science 2017 , 356 , 508–513. [ Google Scholar ] [ CrossRef ]
  • ElDahshan, K.A.; Farouk, H.; Mofreh, E. Deep Reinforcement Learning based Video Games: A Review. In Proceedings of the 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 8–9 May 2022. [ Google Scholar ] [ CrossRef ]
  • Huawei Technologies Co., Ltd. Overview of Deep Learning. In Artificial Intelligence Technology ; Springer: Singapore, 2023; Chapter 1–4; pp. 87–122. [ Google Scholar ] [ CrossRef ]
  • Le, N.; Rathour, V.S.; Yamazaki, K.; Luu, K.; Savvides, M. Deep reinforcement learning in computer vision: A comprehensive survey. Artif. Intell. Rev. 2022 , 55 , 2733–2819. [ Google Scholar ] [ CrossRef ]
  • Melanthota, S.K.; Gopal, D.; Chakrabarti, S.; Kashyap, A.A.; Radhakrishnan, R.; Mazumder, N. Deep learning-based image processing in optical microscopy. Biophys. Rev. 2022 , 14 , 463–481. [ Google Scholar ] [ CrossRef ]
  • Winovich, N.; Ramani, K.; Lin, G. ConvPDE-UQ: Convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 2019 , 394 , 263–279. [ Google Scholar ] [ CrossRef ]
  • Pham, H.; Warin, X.; Germain, M. Neural networks-based backward scheme for fully nonlinear PDEs. SN Partial. Differ. Equ. Appl. 2021 , 2 , 16. [ Google Scholar ] [ CrossRef ]
  • Wei, X.; Jiang, S.; Li, Y.; Li, C.; Jia, L.; Li, Y. Defect Detection of Pantograph Slide Based on Deep Learning and Image Processing Technology. IEEE Trans. Intell. Transp. Syst. 2020 , 21 , 947–958. [ Google Scholar ] [ CrossRef ]
  • E, W.; Yu, B. The deep ritz method: A deep learning based numerical algorithm for solving variational problems. Commun. Math. Stat. 2018 , 6 , 1–12. [ Google Scholar ] [ CrossRef ]
  • Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Imaging 2018 , 9 , 611–629. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Archarya, U.; Oh, S.; Hagiwara, Y.; Tan, J.; Adam, M.; Gertych, A.; Tan, R. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2021 , 89 , 389–396. [ Google Scholar ] [ CrossRef ]
  • Ha, V.K.; Ren, J.C.; Xu, X.Y.; Zhao, S.; Xie, G.; Masero, V.; Hussain, A. Deep Learning Based Single Image Super-resolution: A Survey. Int. J. Autom. Comput. 2019 , 16 , 413–426. [ Google Scholar ] [ CrossRef ]
  • Jeong, C.Y.; Yang, H.S.; Moon, K. Fast horizon detection in maritime images using region-of-interest. Int. J. Distrib. Sens. Netw. 2018 , 14 , 1550147718790753. [ Google Scholar ] [ CrossRef ]
  • Olmos, R.; Tabik, S.; Lamas, A.; Pérez-Hernández, F.; Herrera, F. A binocular image fusion approach for minimizing false positives in handgun detection with deep learning. Inf. Fusion 2019 , 49 , 271–280. [ Google Scholar ] [ CrossRef ]
  • Zhao, X.; Wu, Y.; Tian, J.; Zhang, H. Single Image Super-Resolution via Blind Blurring Estimation and Dictionary Learning. Neurocomputing 2016 , 212 , 3–11. [ Google Scholar ] [ CrossRef ]
  • Qi, C.; Song, C.; Xiao, F.; Song, S. Generalization ability of hybrid electric vehicle energy management strategy based on reinforcement learning method. Energy 2022 , 250 , 123826. [ Google Scholar ] [ CrossRef ]
  • Ritto, T.; Beregi, S.; Barton, D. Reinforcement learning and approximate Bayesian computation for model selection and parameter calibration applied to a nonlinear dynamical system. Mech. Syst. Signal Process. 2022 , 181 , 109485. [ Google Scholar ] [ CrossRef ]
  • Hwang, R.; Lee, H.; Hwang, H.J. Option compatible reward inverse reinforcement learning. Pattern Recognit. Lett. 2022 , 154 , 83–89. [ Google Scholar ] [ CrossRef ]
  • Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022 , 85 , 1–22. [ Google Scholar ] [ CrossRef ]
  • Khayyat, M.M.; Elrefaei, L.A. Deep reinforcement learning approach for manuscripts image classification and retrieval. Multimed. Tools Appl. 2022 , 81 , 15395–15417. [ Google Scholar ] [ CrossRef ]
  • Nguyen, D.P.; Ho Ba Tho, M.C.; Dao, T.T. A review on deep learning in medical image analysis. Comput. Methods Programs Biomed. 2022 , 221 , 106904. [ Google Scholar ] [ CrossRef ]
  • Laskin, M.; Lee, K.; Stooke, A.; Pinto, L.; Abbeel, P.; Srinivas, A. Reinforcement Learning with Augmented Data. In Proceedings of the 34th Conference on Neural Information Processing Systems 2020, Vancouver, BC, Canada, 6–12 December 2020; pp. 19884–19895. [ Google Scholar ]
  • Li, H.; Xu, H. Deep reinforcement learning for robust emotional classification in facial expression recognition. Knowl.-Based Syst. 2020 , 204 , 106172. [ Google Scholar ] [ CrossRef ]
  • Gomes, G.; Vidal, C.A.; Cavalcante-Neto, J.B.; Nogueira, Y.L. A modeling environment for reinforcement learning in games. Entertain. Comput. 2022 , 43 , 100516. [ Google Scholar ] [ CrossRef ]
  • Georgeon, O.L.; Casado, R.C.; Matignon, L.A. Modeling Biological Agents beyond the Reinforcement-learning Paradigm. Procedia Comput. Sci. 2015 , 71 , 17–22. [ Google Scholar ] [ CrossRef ]
  • Yin, S.; Liu, H. Wind power prediction based on outlier correction, ensemble reinforcement learning, and residual correction. Energy 2022 , 250 , 123857. [ Google Scholar ] [ CrossRef ]
  • Badia, A.P.; Piot, B.; Kapturowski, S.; Sprechmann, P.; Vitvitskyi, A.; Guo, D.; Blundell, C. Agent57: Outperforming the Atari Human Benchmark. arXiv 2020 , arXiv:2003.13350. [ Google Scholar ] [ CrossRef ]
  • Zong, K.; Luo, C. Reinforcement learning based framework for COVID-19 resource allocation. Comput. Ind. Eng. 2022 , 167 , 107960. [ Google Scholar ] [ CrossRef ]
  • Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015 , 518 , 529–533. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Ren, J.; Guan, F.; Li, X.; Cao, J.; Li, X. Optimization for image stereo-matching using deep reinforcement learning in rule constraints and parallax estimation. Neural Comput. Appl. 2023 , 1–11. [ Google Scholar ] [ CrossRef ]
  • Morales, E.F.; Murrieta-Cid, R.; Becerra, I.; Esquivel-Basaldua, M.A. A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning. Intell. Serv. Robot. 2021 , 14 , 773–805. [ Google Scholar ] [ CrossRef ]
  • Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017 , 60 , 84–90. [ Google Scholar ] [ CrossRef ]
  • Krichen, M. Convolutional Neural Networks: A Survey. Computers 2023 , 12 , 151. [ Google Scholar ] [ CrossRef ]
  • Song, D.; Kim, T.; Lee, Y.; Kim, J. Image-Based Artificial Intelligence Technology for Diagnosing Middle Ear Diseases: A Systematic Review. J. Clin. Med. 2023 , 12 , 5831. [ Google Scholar ] [ CrossRef ]
  • Muñoz-Saavedra, L.; Escobar-Linero, E.; Civit-Masot, J.; Luna-Perejón, F.; Civit, A.; Domínguez-Morales, M. A Robust Ensemble of Convolutional Neural Networks for the Detection of Monkeypox Disease from Skin Images. Sensors 2023 , 23 , 7134. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, Y.; Hargreaves, C.A. A Review Study of the Deep Learning Techniques used for the Classification of Chest Radiological Images for COVID-19 Diagnosis. Int. J. Inf. Manag. Data Insights 2022 , 2 , 100100. [ Google Scholar ] [ CrossRef ]
  • Teng, Y.; Pan, D.; Zhao, W. Application of deep learning ultrasound imaging in monitoring bone healing after fracture surgery. J. Radiat. Res. Appl. Sci. 2023 , 16 , 100493. [ Google Scholar ] [ CrossRef ]
  • Zaghari, N.; Fathy, M.; Jameii, S.M.; Sabokrou, M.; Shahverdy, M. Improving the learning of self-driving vehicles based on real driving behavior using deep neural network techniques. J. Supercomput. 2021 , 77 , 3752–3794. [ Google Scholar ] [ CrossRef ]
  • Farag, W. Cloning Safe Driving Behavior for Self-Driving Cars using Convolutional Neural Networks. Recent Patents Comput. Sci. 2019 , 11 , 120–127. [ Google Scholar ] [ CrossRef ]
  • Agyemang, I.; Zhang, X.; Acheampong, D.; Adjei-Mensah, I.; Kusi, G.; Mawuli, B.C.; Agbley, B.L. Autonomous health assessment of civil infrastructure using deep learning and smart devices. Autom. Constr. 2022 , 141 , 104396. [ Google Scholar ] [ CrossRef ]
  • Zhou, S.; Canchila, C.; Song, W. Deep learning-based crack segmentation for civil infrastructure: Data types, architectures, and benchmarked performance. Autom. Constr. 2023 , 146 , 104678. [ Google Scholar ] [ CrossRef ]
  • Guerrieri, M.; Parla, G. Flexible and stone pavements distress detection and measurement by deep learning and low-cost detection devices. Eng. Fail. Anal. 2022 , 141 , 106714. [ Google Scholar ] [ CrossRef ]
  • Hoang, N.; Nguyen, Q. A novel method for asphalt pavement crack classification based on image processing and machine learning. Eng. Comput. 2019 , 35 , 487–498. [ Google Scholar ] [ CrossRef ]
  • Tabrizi, S.E.; Xiao, K.; Van Griensven Thé, J.; Saad, M.; Farghaly, H.; Yang, S.X.; Gharabaghi, B. Hourly road pavement surface temperature forecasting using deep learning models. J. Hydrol. 2021 , 603 , 126877. [ Google Scholar ] [ CrossRef ]
  • Jardim, S.V.B. Sparse and Robust Signal Reconstruction. Theory Appl. Math. Comput. Sci. 2015 , 5 , 1–19. [ Google Scholar ]
  • Jackulin, C.; Murugavalli, S. A comprehensive review on detection of plant disease using machine learning and deep learning approaches. Meas. Sens. 2022 , 24 , 100441. [ Google Scholar ] [ CrossRef ]
  • Keceli, A.S.; Kaya, A.; Catal, C.; Tekinerdogan, B. Deep learning-based multi-task prediction system for plant disease and species detection. Ecol. Inform. 2022 , 69 , 101679. [ Google Scholar ] [ CrossRef ]
  • Kotwal, J.; Kashyap, D.; Pathan, D. Agricultural plant diseases identification: From traditional approach to deep learning. Mater. Today Proc. 2023 , 80 , 344–356. [ Google Scholar ] [ CrossRef ]
  • Naik, A.; Thaker, H.; Vyas, D. A survey on various image processing techniques and machine learning models to detect, quantify and classify foliar plant disease. Proc. Indian Natl. Sci. Acad. 2021 , 87 , 191–198. [ Google Scholar ] [ CrossRef ]
  • Thaiyalnayaki, K.; Joseph, C. Classification of plant disease using SVM and deep learning. Mater. Today Proc. 2021 , 47 , 468–470. [ Google Scholar ] [ CrossRef ]
  • Carnegie, A.J.; Eslick, H.; Barber, P.; Nagel, M.; Stone, C. Airborne multispectral imagery and deep learning for biosecurity surveillance of invasive forest pests in urban landscapes. Urban For. Urban Green. 2023 , 81 , 127859. [ Google Scholar ] [ CrossRef ]
  • Hadipour-Rokni, R.; Askari Asli-Ardeh, E.; Jahanbakhshi, A.; Esmaili paeen-Afrakoti, I.; Sabzi, S. Intelligent detection of citrus fruit pests using machine vision system and convolutional neural network through transfer learning technique. Comput. Biol. Med. 2023 , 155 , 106611. [ Google Scholar ] [ CrossRef ]
  • Agrawal, P.; Chaudhary, D.; Madaan, V.; Zabrovskiy, A.; Prodan, R.; Kimovski1, D.; Timmerer, C. Automated bank cheque verification using image processing and deep learning methods. Multimed. Tools Appl. 2021 , 80 , 5319–5350. [ Google Scholar ] [ CrossRef ]
  • Gordo, A.; Almazán, J.; Revaud, J.; Larlus, D. Deep Image Retrieval: Learning Global Representations for Image Search. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 241–257. [ Google Scholar ]
  • Jardim, S.; António, J.; Mora, C.; Almeida, A. A Novel Trademark Image Retrieval System Based on Multi-Feature Extraction and Deep Networks. J. Imaging 2022 , 8 , 238. [ Google Scholar ] [ CrossRef ]



Share and Cite

Valente, J.; António, J.; Mora, C.; Jardim, S. Developments in Image Processing Using Deep Learning and Reinforcement Learning. J. Imaging 2023 , 9 , 207. https://doi.org/10.3390/jimaging9100207



Research Topics

Biomedical Imaging

The current plethora of imaging technologies, such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), optical coherence tomography (OCT), and ultrasound, provides great insight into the different anatomical and functional processes of the human body.

Computer Vision

Computer vision is the science and technology of teaching a computer to interpret images and video as well as a typical human. Technically, computer vision encompasses the fields of image/video processing, pattern recognition, biological vision, artificial intelligence, augmented reality, mathematical modeling, statistics, probability, optimization, 2D sensors, and photography.

Image Segmentation/Classification

Extracting information from a digital image often depends on first identifying desired objects or breaking down the image into homogeneous regions (a process called 'segmentation') and then assigning these objects to particular classes (a process called 'classification'). This is a fundamental part of computer vision, combining image processing and pattern recognition techniques.
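As a minimal sketch of this segment-then-classify idea, the toy NumPy function below (names are illustrative, not from the VIP lab's work) segments a grayscale image by intensity thresholding and summarizes each region:

```python
import numpy as np

def segment_by_threshold(image, threshold):
    """Segmentation: split a grayscale image into foreground/background regions.
    The per-region mean intensities stand in for a (very crude) classification cue."""
    mask = image > threshold                                  # boolean foreground mask
    fg_mean = image[mask].mean() if mask.any() else 0.0
    bg_mean = image[~mask].mean() if (~mask).any() else 0.0
    return mask, fg_mean, bg_mean

# Tiny synthetic image: a bright 2x2 square on a dark background.
img = np.zeros((4, 4))
img[1:3, 1:3] = 200.0
mask, fg, bg = segment_by_threshold(img, threshold=100.0)
```

Practical systems replace the fixed threshold with learned models, but the segment-then-classify structure remains the same.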

Multiresolution Techniques

The VIP lab has a particularly extensive history with multiresolution methods, and a significant number of research students have explored this theme. Multiresolution methods are very broad, essentially meaning that an image or video is modeled, represented, or has features extracted at more than one scale, allowing both local and non-local phenomena to be captured.
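The core idea, representing the same image at several scales, can be sketched as an average-pooling pyramid (a minimal NumPy illustration; the function name is hypothetical):

```python
import numpy as np

def build_pyramid(image, levels):
    """Build a multiresolution pyramid by repeated 2x2 average-pooling:
    each level halves the resolution, trading local detail for global context."""
    pyramid = [image]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        coarse = pyramid[-1][:h - h % 2, :w - w % 2]          # crop to even size
        coarse = coarse.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(coarse)
    return pyramid

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = build_pyramid(img, levels=3)   # shapes: (8, 8), (4, 4), (2, 2)
```

Laplacian and wavelet pyramids refine this scheme, but the principle of per-scale representation is the same.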

Remote Sensing

Remote sensing, or the science of capturing data of the Earth from airplanes or satellites, enables regular monitoring of land, ocean, and atmosphere expanses, providing data that cannot be captured by any other means. A vast amount of information is generated by remote sensing platforms, and there is an obvious need to analyze the data accurately and efficiently.

Scientific Imaging

Scientific Imaging refers to working on two- or three-dimensional imagery taken for a scientific purpose, in most cases acquired either through a microscope or as remotely sensed images taken at a distance.

Stochastic Models

In many image processing, computer vision, and pattern recognition applications, there is often a large degree of uncertainty associated with factors such as the appearance of the underlying scene within the acquired data, the location and trajectory of the object of interest, and the physical appearance (e.g., size, shape, and color) of the objects being detected.

Video Analysis

Video analysis is a field within computer vision that involves the automatic interpretation of digital video using computer algorithms. Although humans are readily able to interpret digital video, developing algorithms for the computer to perform the same task has proven highly elusive and is now an active research field.
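As a toy example of automatic video interpretation, frame differencing flags the pixels that change between consecutive frames; this NumPy sketch is illustrative only, not the lab's method:

```python
import numpy as np

def motion_mask(prev_frame, frame, threshold=10.0):
    """Flag pixels that changed between consecutive frames: the simplest
    building block of video interpretation (motion detection)."""
    return np.abs(frame.astype(float) - prev_frame.astype(float)) > threshold

a = np.zeros((4, 4))
b = np.zeros((4, 4))
b[1, 1] = 50.0                      # one "moving" pixel
moved = motion_mask(a, b)
```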

Evolutionary Deep Intelligence

Deep learning has shown considerable promise in recent years, producing tremendous results and significantly improving accuracy on a variety of challenging problems compared to other machine learning methods.

Discovery Radiomics

Radiomics, which involves the high-throughput extraction and analysis of a large amount of quantitative features from medical imaging data to characterize tumor phenotype in a quantitative manner, is ushering in a new era of imaging-driven quantitative personalized cancer decision support and management. 

Sports Analytics

Sports Analytics is a growing field in computer vision that analyzes visual cues from images to provide statistical data on players, teams, and games. How does a player's technique improve the quality of the team? Can a team, based on its defensive position, increase its chances of reaching the finals? These are a few of the many questions that sports analytics answers.


OpenCV

Open Computer Vision Library

Research Areas in Computer Vision: Trends and Challenges

Farooq Alvi, February 7, 2024


Basics of Computer Vision

Computer Vision (CV) is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos, along with deep learning models, computers can accurately identify and classify objects, and then react to what they “see.”

Key Concepts in Computer Vision

Image Processing: At the heart of CV is image processing, which involves enhancing image data (removing noise, sharpening, or brightening an image) and preparing it for further analysis.
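Sharpening, one of the enhancement steps mentioned above, can be sketched as a small-kernel convolution; the naive NumPy implementation below (illustrative, 'valid' mode only) applies a standard 3x3 sharpening kernel:

```python
import numpy as np

SHARPEN = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

def convolve2d(image, kernel):
    """Naive 'valid'-mode 2-D convolution, used here to apply a sharpening kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    flipped = kernel[::-1, ::-1]                  # true convolution flips the kernel
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * flipped).sum()
    return out

flat = np.full((5, 5), 10.0)          # a constant image is unchanged by sharpening
sharpened = convolve2d(flat, SHARPEN)
```

The kernel's weights sum to 1, so flat regions pass through unchanged while intensity edges are amplified.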

Feature Detection and Matching: This involves identifying and using specific features of an image, like edges, corners, or objects, to understand the content of the image.
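Edges are a classic example of such features: correlating the image with a Sobel kernel responds wherever intensity changes left-to-right. A minimal NumPy sketch (names are illustrative):

```python
import numpy as np

# Sobel kernel for horizontal intensity change (vertical edges).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def vertical_edges(image):
    """Cross-correlate the image with SOBEL_X; large values mark vertical edges."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (image[i:i + 3, j:j + 3] * SOBEL_X).sum()
    return out

# A step image: dark left half, bright right half -> one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
edges = vertical_edges(img)
```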

Pattern Recognition: CV uses pattern recognition to identify patterns and regularities in data. This can be as simple as recognizing the shape of an object or as complex as identifying a person’s face.

Core Technologies Powering Computer Vision

Machine Learning and Deep Learning: These are crucial for teaching computers to recognize patterns in visual data. Deep learning, especially, has been a game-changer, enabling advancements in facial recognition, object detection, and more.

Neural Networks: A type of machine learning, neural networks, particularly Convolutional Neural Networks (CNNs), are pivotal in analyzing visual imagery.

Image Recognition and Classification: This is the process of identifying and labeling objects within an image. It’s one of the most common applications of CV.

Object Detection: This goes a step further than image classification by not only identifying objects in images but also locating them.

Applications of Basic Computer Vision

Automated Inspection: Used in manufacturing to identify defects.

Surveillance: Helps in monitoring activities for security purposes.

Retail: For example, in cashier-less stores where CV tracks what customers pick up.

Healthcare: Assisting in diagnostic procedures through medical image analysis.

Challenges and Limitations

Data Quality and Quantity: The accuracy of a computer vision system is highly dependent on the quality and quantity of the data it’s trained on.

Computational Requirements: Advanced CV models require significant computational power, making them resource-intensive.

Ethical and Privacy Concerns: The use of CV in surveillance and data collection raises ethical and privacy issues that need to be addressed.

The "2024 Guide to becoming a Computer Vision Engineer" offers a practical starting point for that journey.

Key Research Areas in Computer Vision


Augmented Reality: The Convergence with Computer Vision

In 2024, Augmented Reality (AR) continues to make significant strides, increasingly integrating with computer vision (CV) to create more immersive and interactive experiences across various sectors. This integration is crucial as AR requires understanding and interacting with the real world through visual information, a capability at the core of CV.

Manufacturing, Retail, and Education: Transformative Sectors

Manufacturing : AR devices enable manufacturing workers to access real-time instructional and administrative information. This integration significantly enhances efficiency and accuracy in production processes.

Retail : In the retail sector, AR is revolutionizing the shopping experience. Consumers can now visualize products in great detail, including pricing and features, right from their AR devices, offering a more engaging and informed shopping experience.

Education: The impact of AR in education is substantial. Traditional teaching methods are being supplemented with immersive and interactive AR experiences, making learning more engaging and effective for students.

Technological Advances in AR

Advances in AR technology, backed by major companies like Apple and Meta, are bringing a surge of consumer-grade AR devices to the market. These devices are set to become more widely available, making AR more integral to daily life and work.

The development of sophisticated AR gaming is a testament to this growth. AR games now offer realistic gameplay, integrating virtual objects and characters into the real world, enhancing player engagement, and creating new possibilities in gaming and non-gaming applications. Startups like Mohx-games and smar.toys are at the forefront of this innovation, developing platforms and controllers that elevate the AR gaming experience.

Mobile AR tools are another significant advancement. These tools utilize the increasing capabilities of smartphone cameras and sensors to enhance AR interactions’ realism and immersion. Platforms like Phantom Technology’s PhantomEngine enable developers to create more sophisticated and context-aware AR applications.

Wearables with AR capabilities, such as those developed by ARKH and Wavelens, are offering hands-free experiences, further expanding the usability and applications of AR in various industries, including manufacturing and logistics. These wearables provide real-time guidance and information directly in the user’s field of view, enhancing convenience and efficiency.

3D design and prototyping in AR, as exemplified by Virtualist’s building design platform, are enabling industries like architecture and automotive to visualize products and designs in real-world contexts, significantly improving the decision-making process and reducing design errors.

Robotic Language-Vision Models (RLVM)

Integration of vision and language in robotics.

In 2024, the field of robotics is witnessing a significant shift with the integration of Robotic Language-Vision Models (RLVM), which are transforming how robots understand and interact with their environment. This blend of visual comprehension and language interpretation is paving the way for a new era of intelligent, responsive robotics.

Advancements in Robotic Language-Vision Models

Enhanced Learning Capabilities: Research and development efforts are increasingly focusing on using generative AI to make robots faster learners, especially for complex manipulation tasks. This advancement is likely to continue throughout 2024, potentially leading to commercial applications in robotics.

Natural Language Understanding: 

Robots are becoming more personable, thanks to their improved ability to understand natural language instructions. This evolution is exemplified by projects where robots, such as Boston Dynamics’ Spot, are turned into interactive agents like tour guides.

Wider Application Spectrum: 

Robots are moving beyond traditional environments like warehouses and manufacturing into public-facing roles in restaurants, hotels, hospitals, and more. Enabled by generative AI, these robots are expected to interact more naturally with people, enhancing their utility in these new roles.

Autonomous Mobile Robots (AMRs): 

AMRs, combining sensors, AI, and computer vision, are increasingly used in varied settings, from factory floors to hospital corridors, for tasks like material handling, disinfection, and delivery services.

Intelligent Robotics: 

Integration of AI in robotics is allowing robots to use real-time information to optimize tasks. This includes leveraging computer vision and machine learning for improved accuracy and performance in applications such as manufacturing automation and customer service in retail and hospitality.

Collaborative Robots (Cobots): 

Cobots are being designed to safely interact and work alongside humans, augmenting human efforts in various industrial processes. Advances in sensor technology and software are enabling these robots to perform tasks more safely and efficiently alongside human workers.

Robotics as a Service (RaaS): 

RaaS models are becoming more popular, providing businesses with flexible and scalable access to robotic solutions. This approach is particularly beneficial for small and medium-sized enterprises that can leverage robotic technology without incurring significant upfront costs.

Robotics Cybersecurity: 

As robotics systems become more interconnected, the importance of cybersecurity in robotics is growing. Solutions are being developed to protect robotic systems from cyber threats, ensuring the safety and reliability of these systems in various applications.


Advanced Satellite Vision: 

Monitoring environmental and urban changes.

In 2024, the capabilities of satellite imagery have been significantly enhanced by advancements in computer vision (CV), leading to more effective monitoring of environmental and urban changes.

Satellite Imagery and Computer Vision

High-Resolution Monitoring: CV-powered satellite imagery provides high-resolution monitoring of various terrestrial phenomena. This includes tracking urban sprawl, deforestation, and changes in marine environments.

Environmental Management

These technological advancements are crucial for environmental monitoring and management. The detailed data from satellite imagery enables the study of ecological and climatic changes with unprecedented precision.

Urban Planning and Development

In urban areas, satellite vision assists in planning and development, providing critical data for infrastructure development, land use planning, and resource management.

Disaster Response and Management

Advanced satellite vision plays a key role in disaster management. It helps in assessing the impact of natural disasters and planning effective response strategies.

Agricultural Applications

In agriculture, satellite imagery helps in monitoring crop health, soil conditions, and water resources, enabling more efficient and sustainable farming practices.

Climate Change Analysis

Satellite vision is instrumental in understanding and monitoring the effects of climate change globally, including polar ice melt, sea-level rise, and changes in weather patterns.

3D Computer Vision: Enhancing Autonomous Vehicles and Digital Twin Modeling

In 2024, 3D Computer Vision (3D CV) is playing a pivotal role in advancing technologies in various sectors, particularly in autonomous vehicles and digital twin modeling.

3D Computer Vision in Autonomous Vehicles

Depth Perception: 3D CV enables autonomous vehicles to accurately perceive depth and distance. This is crucial for navigating complex environments and ensuring safety on the roads.
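Under the common pinhole stereo model, this depth perception reduces to Z = f · B / d, where f is the focal length in pixels, B the camera baseline, and d the disparity; a minimal sketch (the function name and values are illustrative):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo model: depth Z = f * B / d, where d is the horizontal
    pixel shift of the same point between the left and right images."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# e.g. f = 700 px, cameras 0.12 m apart, a 10 px shift -> the point is 8.4 m away
z = depth_from_disparity(10.0, focal_px=700.0, baseline_m=0.12)
```

Note the inverse relationship: nearby objects produce large disparities, and distant ones produce small disparities, which is why depth error grows with range.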

Object Detection and Tracking: It allows for precise detection and tracking of objects around the vehicle, including other vehicles, pedestrians, and road obstacles.

Environment Mapping: Advanced 3D imaging and processing help in creating detailed maps of the vehicle’s surroundings, essential for route planning and navigation.

Digital Twin Modeling with 3D Computer Vision

Accurate Replication: 3D CV is integral in creating accurate digital replicas of physical objects, buildings, or even entire cities for digital twin applications.

Simulation and Analysis: These digital twins are used for simulations, allowing for analysis and optimization of systems in a virtual environment before actual implementation.

Predictive Maintenance and Planning: In industries such as manufacturing and urban planning, digital twins aid in predictive maintenance and strategic planning, minimizing risks and enhancing efficiency.

Ethics in Computer Vision: Navigating Bias and Privacy Concerns

As computer vision (CV) technologies become increasingly integrated into various aspects of life, ethical considerations, particularly related to bias and privacy, are gaining prominence.

Addressing Bias in Computer Vision

Data Diversity: One major ethical challenge in CV is the bias in algorithms, often stemming from non-representative training data. Efforts are being made to create more diverse and inclusive datasets to help overcome biases related to race, gender, and other factors.

Fairness in Algorithms: There is a growing focus on developing algorithms that are fair and non-discriminatory. This includes techniques to detect and correct biases in CV systems.


Transparent and Explainable AI: Transparency in how CV models are built and function is crucial. There’s an emphasis on explainable AI, where the decision-making process of CV systems can be understood and interrogated by users.

Ensuring Privacy in Computer Vision

Consent and Anonymity: With CV technologies being used in public spaces, ensuring individual privacy is paramount. Techniques like face-blurring in videos and images are being adopted to protect identities.
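Face blurring of the kind described above can be sketched as a mean blur applied to a detected face box; in this NumPy toy (all names illustrative) the box is hard-coded, where a real pipeline would take it from a face detector:

```python
import numpy as np

def blur_region(image, top, left, height, width, k=3):
    """Anonymize a rectangular region (e.g. a detected face) with a k x k mean blur."""
    out = image.astype(float).copy()
    pad = k // 2
    region = np.pad(image[top:top + height, left:left + width].astype(float),
                    pad, mode='edge')                 # edge-pad so borders blur too
    for i in range(height):
        for j in range(width):
            out[top + i, left + j] = region[i:i + k, j:j + k].mean()
    return out

img = np.zeros((6, 6))
img[2, 2] = 90.0                       # a single bright pixel inside the "face" box
blurred = blur_region(img, top=1, left=1, height=3, width=3)
```

Stronger anonymization (pixelation, black boxes) follows the same pattern: transform only the sensitive region, leave the rest of the frame intact.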

Regulatory Compliance: Governments and regulatory bodies are proposing strict regulations to ensure responsible development and use of AI and CV technologies. This includes guidelines for data collection, processing, and storage to protect individual privacy.

Ethical Design and Deployment: Ethical considerations are increasingly becoming a part of the design and deployment process of CV technologies. This involves assessing the potential impact on society and individuals and ensuring that privacy and individual rights are safeguarded.

Synthetic Data and Generative AI in Computer Vision

The role of generative AI in creating synthetic data has become increasingly significant in developing and improving computer vision (CV) systems.

Generative AI and Synthetic Data Creation

Enhancing Training of CV Models: Generative AI algorithms can create realistic, high-quality synthetic data. This data is particularly valuable for training CV models, especially when real-world data is scarce, sensitive, or difficult to obtain.

Diversity and Volume: Synthetic data generated by AI can encompass various scenarios and variations, offering a rich and diverse dataset. This diversity is crucial for training robust CV models capable of performing accurately in various real-world conditions.

Privacy and Ethical Compliance: Using synthetic data mitigates privacy concerns associated with using real data, especially in sensitive areas like healthcare and security. It offers a way to train effective CV models without compromising individual privacy.

Cost-Effectiveness and Efficiency: Generating synthetic data can be more cost-effective and efficient than collecting and labeling vast amounts of real-world data. It also speeds up the iterative process of training and refining CV models.
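As a minimal illustration of the idea (plain NumPy, not a generative model), the sketch below fabricates labeled toy training images with controlled variation and noise:

```python
import numpy as np

def make_synthetic_sample(rng, size=16):
    """Generate one labeled training image: a filled square (label 0) or a
    cross (label 1) at a random position, with Gaussian pixel noise."""
    img = rng.normal(0.0, 0.05, (size, size))
    label = int(rng.integers(2))
    r, c = rng.integers(3, size - 4, size=2)          # keep the shape inside the frame
    if label == 0:
        img[r - 2:r + 3, c - 2:c + 3] += 1.0          # square
    else:
        img[r - 2:r + 3, c] += 1.0                    # cross: vertical bar
        img[r, c - 2:c + 3] += 1.0                    # cross: horizontal bar
    return img, label

rng = np.random.default_rng(0)
dataset = [make_synthetic_sample(rng) for _ in range(100)]
```

Generative models extend this same recipe to photorealistic imagery, but the payoff is identical: unlimited labeled data with variation the designer controls.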

Computer Vision in Edge Computing

In 2024, the trend of integrating Computer Vision (CV) with edge computing is becoming increasingly prominent, revolutionizing how data is processed in various applications.

The Shift to On-Device Processing

Reduced Latency: By processing visual data directly on the device (edge computing), response times are significantly decreased. This is vital in applications where real-time analysis is crucial, such as in autonomous vehicles or real-time monitoring systems.

Improved Privacy and Security: Edge computing allows for sensitive data to be processed locally, reducing the risk of data breaches during transmission to cloud-based servers. This is particularly important in applications involving personal or sensitive information.

Enhanced Efficiency: Local data processing minimizes the need to transfer large volumes of data to the cloud, thereby reducing bandwidth usage and associated costs. This is beneficial for devices operating in remote or bandwidth-constrained environments.

Scalability: Edge computing enables scalability in CV applications. Devices can process data independently, alleviating the load on central servers and allowing for the deployment of more devices without a proportional increase in central processing requirements.

Applications in Diverse Fields

Intelligent Security Systems: In security and surveillance, edge computing allows for immediate processing and analysis of visual data, enabling quicker response to potential security threats.

Healthcare: Portable medical devices with integrated CV can process data on the edge, aiding in immediate diagnostic procedures and patient monitoring.

Retail and Consumer Applications: In retail, edge computing enables smart shelves and inventory management systems to process visual data in real time, improving efficiency and customer experience.

Industrial and Manufacturing: In industrial settings, edge computing facilitates real-time monitoring and quality inspection, improving operational efficiency and safety.

Computer Vision in Healthcare

Computer Vision (CV) is significantly impacting the healthcare sector, offering innovative solutions for medical image analysis, surgical assistance, and patient monitoring.

Medical Image Analysis

Diagnostic Accuracy: CV algorithms are increasingly used to analyze medical images such as X-rays, MRIs, and CT scans. They assist in identifying abnormalities, leading to quicker and more accurate diagnoses.

Cancer Detection: In oncology, CV aids in the early detection of cancers, such as breast or skin cancer, through detailed analysis of medical imagery.

Automated Analysis: Automated image analysis can handle large volumes of medical images, reducing the workload on radiologists and increasing efficiency.

Aiding Surgeries

Surgical Robotics: CV is integral to the functioning of surgical robots, providing them with the necessary visual information to assist surgeons in performing precise and minimally invasive procedures.

Real-Time Navigation: During surgeries, CV provides real-time imaging, aiding surgeons in navigating complex procedures and avoiding critical structures.

Training and Simulation: CV technologies are used in surgical training, providing simulations that help surgeons hone their skills in a risk-free environment.

Patient Monitoring

Remote Monitoring: CV enables remote patient monitoring, allowing healthcare providers to observe patients’ physical condition and movements without being physically present. This is particularly beneficial for elderly care and monitoring patients in intensive care units.

Fall Detection and Prevention: In elderly care, CV systems can detect falls or unusual behaviors, alerting caregivers to potential emergencies.

Behavioral Analysis: CV is also used in analyzing patients’ behaviors and movements, which can be vital in psychiatric care and physical therapy.

Challenges and Future Directions

While CV is bringing transformative changes to healthcare, it also presents challenges such as data privacy concerns, the need for large annotated datasets, and ensuring the accuracy and reliability of algorithms. The future of CV in healthcare is promising, with ongoing research and development aimed at addressing these challenges and expanding its applications.

Detecting Deepfakes: The Crucial Role of Computer Vision

As AI-generated deepfakes become increasingly realistic and pervasive, the importance of Computer Vision (CV) in detecting and combating them has become more critical.

The Challenge of Deepfakes

Realism and Proliferation: Deepfakes, synthesized using advanced AI algorithms, are becoming more sophisticated, making them harder to distinguish from real footage. Their potential use in spreading misinformation or malicious content poses significant challenges.

Misinformation and Security Threats: The use of deepfakes in spreading false information can have serious implications in various spheres, including politics, security, and personal privacy.

CV’s Role in Deepfake Detection

Analyzing Visual Inconsistencies: CV algorithms are trained to detect subtle inconsistencies in videos and images that are typically overlooked by the human eye. This includes irregularities in facial expressions, lip movements, and eye blinking patterns.

Temporal and Spatial Analysis: CV techniques analyze both spatial features (like facial features) and temporal features (like movement over time) in videos to identify anomalies that suggest manipulation.

Training on Diverse Data Sets: To improve the accuracy of deepfake detection, CV systems are trained on diverse datasets that include various types of manipulations and original content.
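As a toy illustration of the temporal-analysis idea above, the sketch below scores consecutive video frames by their mean absolute intensity change; a sudden spike can flag a spliced or tampered segment. This is only a hedged sketch: real deepfake detectors learn far richer spatio-temporal features, and the `temporal_anomaly_scores` function and synthetic "video" here are hypothetical illustrations, not any production method.

```python
import numpy as np

def temporal_anomaly_scores(frames):
    """Mean absolute intensity change between consecutive frames.

    Real detectors use learned spatio-temporal features; this toy
    version only illustrates the idea of scoring temporal consistency.
    """
    return [
        float(np.abs(cur.astype(np.int16) - prev.astype(np.int16)).mean())
        for prev, cur in zip(frames[:-1], frames[1:])
    ]

# Synthetic "video": identical frames except one abruptly altered frame.
rng = np.random.default_rng(0)
base = rng.integers(0, 255, size=(32, 32), dtype=np.uint8)
frames = [base.copy() for _ in range(5)]
frames[3] = 255 - frames[3]            # simulate a tampered frame

scores = temporal_anomaly_scores(frames)
suspect = int(np.argmax(scores)) + 1   # frame entered by the largest jump
print(suspect)                         # 3
```

The quiet transitions score near zero, while the transition into the altered frame spikes, which is the kind of anomaly a temporal detector looks for.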

The importance of CV in identifying deepfakes cannot be overstated, as it stands at the forefront of preserving information integrity in the digital age. Advancements in this field will be instrumental in maintaining trust and authenticity in digital media.

Real-Time Computer Vision

Enhancing security, crowd monitoring, and industrial safety.

Real-time computer vision (CV) technologies are increasingly being deployed in various fields like security, crowd monitoring, and industrial safety, offering dynamic and immediate data analysis for enhanced operational efficiency and safety.

Applications in Security

Surveillance Systems: Real-time CV is revolutionizing surveillance by enabling immediate identification and alerting of security breaches or unusual activities. This includes facial recognition, intrusion detection, and unauthorized access alerts.

Automated Threat Detection: CV systems can detect potential threats in real time, such as identifying unattended bags in public areas or spotting unusual behaviors that could indicate criminal activity.
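A minimal sketch of how such a real-time system might flag scene changes, using a simple running-average background model in NumPy. The function names and thresholds are illustrative assumptions, not from any specific product; production systems typically use statistical or learned models such as OpenCV's `cv2.createBackgroundSubtractorMOG2`.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Exponential running average of the scene (the 'background model')."""
    return (1 - alpha) * background + alpha * frame

def detect_motion(background, frame, threshold=30):
    """Flag pixels that differ strongly from the background model."""
    return np.abs(frame.astype(np.float32) - background) > threshold

# Static scene, then an "intruder" (bright block) appears.
scene = np.full((24, 24), 50, dtype=np.uint8)
background = scene.astype(np.float32)
frame = scene.copy()
frame[8:16, 8:16] = 220                # moving object enters the frame

mask = detect_motion(background, frame)
print(mask.sum())                      # 64 changed pixels -> raise an alert
background = update_background(background, frame)
```

The slow `alpha` update lets the model absorb gradual lighting changes while still reacting to objects that appear suddenly.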

Crowd Monitoring and Management

Public Safety: In large public gatherings, real-time CV aids in crowd density analysis, helping to prevent stampedes or accidents by alerting authorities to potential dangers due to overcrowding.

Traffic Management: In urban settings, CV systems monitor and analyze traffic flow in real time, helping in congestion management and accident prevention.

Event Management: For events like concerts or sports games, real-time CV can assist in crowd control, ensuring that safety regulations are adhered to and identifying potential bottlenecks or overcrowding situations.

Industrial Safety

Workplace Monitoring: CV systems monitor industrial environments in real time, detecting potential hazards like equipment malfunctions or unsafe worker behavior, thus preventing accidents and ensuring compliance with safety protocols.

Quality Control: In manufacturing, real-time CV assists in continuous monitoring of production lines, instantly identifying defects or deviations from standard protocols.

Equipment Maintenance: CV can help in predictive maintenance by detecting early signs of wear and tear in machinery, preventing costly downtime and accidents.

Conclusion: Navigating the Future of Computer Vision

From enhancing healthcare and security to revolutionizing interactive technologies like AR, CV is reshaping our interaction with the digital world. Its advancements, including AI integration and edge computing, highlight a future rich with potential.

Yet, this journey forward isn’t without challenges. Balancing innovation with ethical responsibility, privacy, and fairness remains crucial. As CV becomes more embedded in our lives, it calls for a collaborative approach among technologists, ethicists, and policymakers to ensure it benefits society responsibly and equitably.

In essence, CV’s future is not just about technological growth but also about addressing ethical and societal needs, marking an exciting, transformative journey ahead.


Trends and Advancements of Image Processing and Its Applications

  • © 2022
  • Prashant Johri
  • Mario José Diván
  • Ruqaiya Khanam
  • Marcelo Marciszack
  • Adrián Will

Plot No 2, Sector 17 A, Yamuna Expressway, School of Computing Science & Engineering, Galgotias University, G.B.Nagar, Greater Noida, India

Data Science Research Group at Economy School, National University of La Pampa, Santa Rosa, Argentina

Computer Science and Engineering Department, SET, Sharda University, Greater Noida, India; Information System Department, Center for Research, Development, and Transfer of Information Systems (CIDS), National Technological University - Córdoba Regional Faculty, Córdoba, Argentina; Information System Department, National Technological University - Tucumán Regional Faculty, Tucumán, Argentina.

  • Presents developments of current research in various areas of image processing
  • Includes applications of image processing in remote sensing, astronomy, and manufacturing
  • Pertains to researchers, academics, students, and practitioners in image processing

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)


About this book

This book covers current technological innovations and applications in image processing, introducing analysis techniques and describing applications in remote sensing and manufacturing, among others. The authors include new concepts of color space transformation, such as color interpolation. The Shearlet Transform and Wavelet Transform and their implementation are also discussed. The authors offer a perspective on concepts and techniques of remote sensing, such as image mining and geographical and agricultural resources. The book also includes several applications of biomedical image analysis of human organs. In addition, the principle of moving object detection and tracking, including recent trends in moving vehicle and ship detection, is described.


  • Medical Image detection and diagnoses
  • Image Recognition and Analysis
  • Deep learning-based image analysis
  • Pattern Recognition and Capsule Networks in Object Tracking
  • Moving Object Tracking
  • Wavelet Transformation

Table of contents (15 chapters)

Front Matter

Recent Trends and Advancements of Image Processing and Its Applications

Using Convolutional Neural Networks for Classifying COVID-19 in Computerized Tomography Scans

  • Lúcio Flávio de Jesus Silva, Elilson dos Santos, Omar Andres Carmona Cortes

Challenges in Processing Medical Images in Mobile Devices

  • Mariela Curiel, Leonardo Flórez-Valencia

Smart Traffic Control for Emergency Vehicles Using the Internet of Things and Image Processing

  • Sandesh Kumar Srivastava, Anshul Singh, Ruqaiya Khanam, Prashant Johri, Arya Siddhartha Gupta, Gaurav Kumar

Combining Image Processing and Artificial Intelligence for Dental Image Analysis: Trends, Challenges, and Applications

  • M. B. H. Moran, M. D. B. Faria, L. F. Bastos, G. A. Giraldi, A. Conci

Median Filter Based on the Entropy of the Color Components of RGB Images

  • José Luis Vázquez Noguera, Horacio Legal-Ayala, Julio César Mello Román, Derlis Argüello, Thelma Balbuena

Deep Learning Models for Predicting COVID-19 Using Chest X-Ray Images

  • L. J. Muhammad, Ebrahem A. Algehyne, Sani Sharif Usman, I. A. Mohammed, Ahmad Abdulkadir, Muhammed Besiru Jibrin et al.

Deep Learning Methods for Chronic Myeloid Leukaemia Diagnosis

  • Tanya Arora, Mandeep Kaur, Parma Nand

An Automatic Bean Classification System Based on Visual Features to Assist the Seed Breeding Process

  • Miguel Garcia, Deisy Chaves, Maria Trujillo

Supervised Machine Learning Classification of Human Sperm Head Based on Morphological Features

  • Natalia V. Revollo, G. Noelia Revollo Sarmiento, Claudio Delrieux, Marcela Herrera, Rolando González-José

Future Contribution of Artificial Vision in Methodologies for the Development of Applications That Allow for Identifying Optimal Harvest Times of Medicinal Cannabis Inflorescences in Colombia

  • Luis Octavio González-Salcedo, Andrés Palomino-Tovar, Adriana Martínez-Arias

Detection of Brain Tumor Region in MRI Image Through K-Means Clustering Algorithms

  • Sanjay Kumar, Naresh Kumar, J. N. Singh, Prashant Johri, Sanjeev Kumar Singh

Estimation of Human Posture Using Convolutional Neural Network Using Web Architecture

  • Dhruv Kumar, Abhay Kumar, M. Arvindhan, Ravi Sharma, Nalliyanna Goundar Veerappan Kousik, S. Anbuchelian

Histogram Distance Metric Learning to Diagnose Breast Cancer using Semantic Analysis and Natural Language Interpretation Methods

  • D. Gnana Jebadas, M. Sivaram, Arvindhan M, B. S. Vidhyasagar, B. Bharathi Kannan

Human Skin Color Detection Technique Using Different Color Models

  • Ruqaiya Khanam, Prashant Johri, Mario José Diván

A Study of Improved Methods on Image Inpainting

  • Ajay Sudhir Bale, S. Saravana Kumar, M. S. Kiran Mohan, N. Vinay

Back Matter

Editors and Affiliations

Plot No 2, Sector 17 A, Yamuna Expressway, School of Computing Science & Engineering, Galgotias University, G.B. Nagar, Greater Noida, India

Prashant Johri

Mario José Diván

Ruqaiya Khanam

Marcelo Marciszack

Adrián Will

About the editors

Dr. Prashant Johri is a Professor in the School of Computing Science & Engineering, Galgotias University, Greater Noida, India. He completed his B.Sc. (H) at Aligarh Muslim University in 1992, his M.C.A. at Aligarh Muslim University in 1995, and his Ph.D. in Computer Science at Jiwaji University, Gwalior, India, in 2011. He has also worked as Professor and Director (M.C.A.) at the Galgotias Institute of Management and Technology (G.I.M.T.) and at the Noida Institute of Engineering and Technology (N.I.E.T.), Greater Noida. He has served as Chair of many conferences and as a program committee member for many conferences in India and abroad. He has supervised 2 Ph.D. students and several M.Tech. students for their theses. He has published more than 100 research papers in national and international journals and conferences, has published edited books with Elsevier and Springer, and has contributed numerous chapters to books from publishers of high international repute. Apart from his scholarly contributions to the scientific community, he has organized several conferences, workshops, and seminars at the national and international levels, and he voluntarily serves as a reviewer for various international journals and conferences. His research interests include Information Security, Cloud Computing, Blockchain, Machine Learning, AI, AR & VR, Healthcare, Agriculture, Image Processing, and Software Reliability. He is actively publishing in these areas.

Prof. (Dr.) Mario José Diván was born in Santa Rosa (La Pampa, Argentina) on March 10, 1979. He received an engineering degree in Information Systems from the National Technological University (Argentina) in 2003, and holds specialties in managerial engineering (National Technological University, Argentina, 2004), in data mining and knowledge discovery in databases (University of Buenos Aires, Argentina, 2007), and in high-performance and grid computing (National University of La Plata, Argentina, 2011). He obtained his Ph.D. in Computer Science in 2012 from the National University of La Plata (Argentina).

He is an IEEE Senior Member and has twice been recognized with the Electronic Government Award from the Argentine Computer Science Society.

Dr. Ruqaiya Khanam is a Professor in the School of Electrical, Electronics and Communication Engineering, Sharda University, Greater Noida, India. She received her M.Tech. degree from Aligarh Muslim University and her Ph.D. degree in Electronics and Communication from Jamia Millia Islamia University, New Delhi, India. She has also worked as a Professor at Galgotias University, Greater Noida. She has served as Chair of many conferences and as a program committee member for many conferences in India. She has guided 07 research scholars toward the Ph.D. degree, supervised 27 PG dissertations, authored 03 books and several research articles, and contributed to a patent earned for Galgotias University. Her current research interests include Image Processing, Biomedical Signal and Image Processing, Fuzzy Logic Processor Design, Internet of Things, VLSI Design and Technology, high-level design using VHDL/Verilog, and Low-Power Chip Design. She is actively publishing in these areas.

Prof. (Dr.) Marcelo Martín Marciszack is a professor and researcher at the National Technological University, Córdoba Regional Faculty (UTN-FRC). He is an Information Systems Engineer, graduated from the National Technological University (UTN-FRC), Argentina; holds a master's degree in Software Engineering from the National University of La Plata, Argentina; and is a Doctor in Software Engineering, with work on reusable components and man-machine interface applications, graduated from the University of Vigo, Spain. He is the Director of the Center for Research, Development and Transfer of Information Systems (CIDS) at the UTN-FRC. He is Coordinator of the Information Systems Research Program, a Member of the Central Postgraduate Commission of the UTN, and currently serves as the Postgraduate Secretary at the Córdoba Regional Faculty.

Prof. (Dr.) Adrián Will received a degree in Mathematics from FAMAF-UNC and a Ph.D. in Mathematics from the same university. He is currently the Director of the research group G.I.T.I.A. (www.gitia.org), which focuses on Artificial Intelligence, Machine Learning, and their applications, at the National Technological University (UTN), Tucumán Faculty. He is also the Director of the Specialization and Master's in Information Systems Engineering at the same faculty. He has published articles in several national and international journals and conferences and has presented two patent applications. His main research interests are Evolutionary Algorithms, Artificial Neural Networks, and Heuristic Optimization. He is also involved in several industrial projects dealing with the optimization of electricity distribution networks, energy consumption management, and human thermal comfort, among others.

Bibliographic Information

Book Title: Trends and Advancements of Image Processing and Its Applications

Editors: Prashant Johri, Mario José Diván, Ruqaiya Khanam, Marcelo Marciszack, Adrián Will

Series Title: EAI/Springer Innovations in Communication and Computing

DOI: https://doi.org/10.1007/978-3-030-75945-2

Publisher: Springer Cham

eBook Packages: Engineering, Engineering (R0)

Copyright Information: Springer Nature Switzerland AG 2022

Hardcover ISBN: 978-3-030-75944-5 (published 14 November 2021)

Softcover ISBN: 978-3-030-75947-6 (published 14 November 2022)

eBook ISBN: 978-3-030-75945-2 (published 13 November 2021)

Series ISSN: 2522-8595

Series E-ISSN: 2522-8609

Edition Number: 1

Number of Pages: XII, 305

Number of Illustrations: 42 b/w illustrations, 108 illustrations in colour

Topics: Computer Imaging, Vision, Pattern Recognition and Graphics; Signal, Image and Speech Processing; Control, Robotics, Mechatronics; Artificial Intelligence


Recent advances and clinical applications of deep learning in medical image analysis

Affiliations.

  • 1 School of Electrical and Computer Engineering, University of Oklahoma, Norman, OK 73019, USA.
  • 2 School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China.
  • 3 Department of Pathology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA.
  • 4 Department of Radiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA.
  • 5 Department of Obstetrics and Gynecology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA.
  • 6 School of Electrical and Computer Engineering, University of Oklahoma, Norman, OK 73019, USA. Electronic address: [email protected].
  • PMID: 35472844
  • PMCID: PMC9156578
  • DOI: 10.1016/j.media.2022.102444

Deep learning has received extensive research interest in developing new medical image processing algorithms, and deep learning based models have been remarkably successful in a variety of medical imaging tasks to support disease detection and diagnosis. Despite the success, the further improvement of deep learning models in medical image analysis is majorly bottlenecked by the lack of large-sized and well-annotated datasets. In the past five years, many studies have focused on addressing this challenge. In this paper, we reviewed and summarized these recent studies to provide a comprehensive overview of applying deep learning methods in various medical image analysis tasks. Especially, we emphasize the latest progress and contributions of state-of-the-art unsupervised and semi-supervised deep learning in medical image analysis, which are summarized based on different application scenarios, including classification, segmentation, detection, and image registration. We also discuss major technical challenges and suggest possible solutions in the future research efforts.

Keywords: Attention; Classification; Deep learning; Detection; Medical images; Registration; Segmentation; Self-supervised learning; Semi-supervised learning; Unsupervised learning; Vision Transformer.

Copyright © 2022. Published by Elsevier B.V.

Conflict of interest statement

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

  • The overall structure of this survey.
  • A simple CNN for disease classification from MRI images (Anwar et al., 2018).
  • (a) MoCo (He et al., 2020); (b) SimCLR (Chen et al., 2020a).
  • Mean Teacher model application in medical image analysis (Li et al., 2020b). π…
  • Units of different segmentation networks: (a) forward convolutional unit (U-Net), (b) recurrent convolutional…
  • (a) Transformer layer; (b) the architecture of TransUNet (Chen et al., 2021b).
  • VoxelMorph (Balakrishnan et al., 2018).


Grants and Funding

  • P20 GM135009/GM/NIGMS NIH HHS/United States
  • P30 CA225520/CA/NCI NIH HHS/United States


Top 10 Digital Image Processing Project Topics

We guide research scholars in choosing novel digital image processing project topics. What is meant by digital image processing? Digital Image Processing is a method of handling images to get different insights into a digital image. It comprises a set of technologies for analyzing an image in multiple aspects for better human/machine image interpretation. To be clearer, improving the actual quality of the image, or extracting the essential features from the entire picture, is what digital image processing projects achieve.

This page presents new and upcoming Digital Image Processing Project Topics for scholars who wish to create a masterpiece in their research careers!

Generally, a digital image is represented as pixels arranged in an array. The dimensions of the rectangular array give the size of the image (MxN), where M denotes the number of columns and N the number of rows. Further, x and y coordinates are used to denote the position of a single pixel in the image: the x value increases from left to right, and the y value increases from top to bottom. When you get into the DIP research field, you need to know the following key terminologies.
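The array view of an image described above can be made concrete with a few lines of NumPy. The tiny synthetic image and variable names here are illustrative, not from any particular dataset; note that NumPy indexes as `[row, column]`, i.e. `[y, x]`.

```python
import numpy as np

# A grayscale image is just a 2-D array of pixel intensities.
# Here: a synthetic image with 4 rows and 6 columns, values in 0..255.
image = np.arange(24, dtype=np.uint8).reshape(4, 6)

# Pixel position (x, y): x grows left-to-right, y grows top-to-bottom.
x, y = 2, 1                 # third column, second row
pixel = image[y, x]         # NumPy order is [row, column] = [y, x]

print(image.shape)          # (4, 6) -> (rows, columns)
print(pixel)                # 8, the intensity stored at (x=2, y=1)
```

Keeping the `[y, x]` indexing convention straight is one of the most common stumbling blocks when moving between image coordinates and array coordinates.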

Important Digital Image Processing Terminologies  

  • Stereo Vision and Super Resolution
  • Multi-Spectral Remote Sensing and Imaging
  • Digital Photography and Imaging
  • Acoustic Imaging and Holographic Imaging
  • Computer Vision and Graphics
  • Image Manipulation and Retrieval
  • Quality Enrichment in Volumetric Imaging
  • Color Imaging and Bio-Medical Imaging
  • Pattern Recognition and Analysis
  • Imaging Software Tools, Technologies and Languages
  • Image Acquisition and Compression Techniques
  • Mathematical Morphological Image Segmentation

Image Processing Algorithms

In general, image processing techniques are used to perform certain actions on input images and, accordingly, to extract the desired information from them. The input is an image, and the result is an improved image suited to the task at hand. Algorithms play a crucial role in current real-time image processing applications, and various algorithms serve various purposes, as follows:

  • Digital Image Detection
  • Image Reconstruction
  • Image Restoration
  • Image Enhancement
  • Image Quality Estimation
  • Spectral Image Estimation
  • Image Data Compression

For the above image processing tasks, algorithms can be customized to the number of training and testing samples and can also be used for real-time/online processing. To date, filtering techniques have been the mainstay of image processing and enhancement, and their main functions are as follows:

  • Brightness Correction
  • Contrast Enhancement
  • Resolution and Noise Level of Image
  • Contouring and Image Sharpening
  • Blurring, Edge Detection and Embossing
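Two of the filtering functions listed above, brightness/contrast correction and blurring, can be sketched as plain array operations. This is a minimal NumPy version under illustrative names; real pipelines would typically call library routines such as OpenCV's `cv2.convertScaleAbs` and `cv2.blur` instead.

```python
import numpy as np

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Linear point operation: out = contrast * in + brightness."""
    out = img.astype(np.float32) * contrast + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

def box_blur(img, k=3):
    """Naive k x k mean filter; edges handled by replicating border pixels."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).astype(np.uint8)

img = np.full((5, 5), 100, dtype=np.uint8)
bright = adjust_brightness_contrast(img, brightness=50)
print(bright[0, 0])          # 150
print(box_blur(img)[2, 2])   # 100 (a uniform image is unchanged by blurring)
```

The point operation touches each pixel independently, while the blur combines each pixel with its neighborhood, which is exactly the distinction between point filters and spatial filters.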

Some of the commonly used techniques for image processing can be classified as follows:

  • Low-Level Image Processing Techniques – Noise Elimination and Color Contrast Enhancement
  • Medium-Level Image Processing Techniques – Binarization and Compression
  • Higher-Level Image Processing Techniques – Image Segmentation
  • Recognition and Detection Image Processing Algorithms – Semantic Analysis

Next, let’s look at some of the traditional image processing algorithms for your information. Our research team will guide you in handpicking apt solutions for research problems. If there is a need, we are also ready to design our own hybrid algorithms and techniques for sorting out complicated models.

Types of Digital Image Processing Algorithms

  • Hough Transform Algorithm
  • Canny Edge Detector Algorithm
  • Scale-Invariant Feature Transform (SIFT) Algorithm
  • Generalized Hough Transform Algorithm
  • Speeded Up Robust Features (SURF) Algorithm
  • Marr–Hildreth Algorithm
  • Connected-component labeling algorithm: identifies and labels the connected regions of an image
  • Histogram equalization algorithm: enhances the contrast of an image by redistributing its histogram
  • Adaptive histogram equalization algorithm: applies histogram equalization locally, so the contrast adjustment adapts to each region of the image
  • Error Diffusion Algorithm
  • Ordered Dithering Algorithm
  • Floyd–Steinberg Dithering Algorithm
  • Riemersma Dithering Algorithm
  • Richardson–Lucy deconvolution algorithm: also known as a deblurring algorithm; it iteratively removes blur distortion from an image to recover the original
  • Seam carving algorithm: resizes an image by removing seams of low importance based on the image content; also known as content-aware image resizing
  • Region Growing Algorithm
  • GrowCut Algorithm
  • Watershed Transformation Algorithm
  • Random Walker Algorithm
  • Elser difference-map algorithm: a search-based algorithm for solving general constraint satisfaction problems, used primarily in X-ray diffraction microscopy
  • Blind deconvolution algorithm: similar to Richardson–Lucy deconvolution, it reconstructs a sharp image from a blurred one; unlike Richardson–Lucy, the blur kernel is not known in advance
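To make one of these concrete, the histogram equalization algorithm listed above can be sketched in a few lines of Python (NumPy assumed; this is the classic CDF-stretching formulation for 8-bit grayscale images, not a production implementation):

```python
import numpy as np

def histogram_equalize(img):
    """Contrast enhancement via the cumulative histogram (8-bit grayscale)."""
    hist = np.bincount(img.ravel(), minlength=256)  # per-intensity counts
    cdf = hist.cumsum()                             # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                       # first non-zero CDF value
    n = img.size
    # Classic mapping: stretch the CDF to cover the full [0, 255] range.
    lut = np.round((cdf - cdf_min) / (n - cdf_min) * 255).astype(np.uint8)
    return lut[img]
```

A low-contrast input (e.g. intensities confined to [100, 163]) is mapped so its values span the full 0 to 255 range.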

Nowadays, various industries also utilize digital image processing by developing customized procedures to satisfy their requirements, achieved either from scratch or through hybrid algorithmic functions. As a result, it is clear that image processing has driven revolutionary developments in many information technology sectors and applications.


Digital Image Processing Techniques

  • To smooth an image, substitute the neighborhood median/common value in place of the actual pixel value; this is performed in the case of weak edge sharpness and blur effects
  • Eliminate distortion in an image through scaling, warping, translation, and rotation
  • Differentiate the in-depth image content to uncover hidden data, or convert a color image into a gray-scale image
  • Break an image into multiple segments based on certain constraints, for instance foreground and background
  • Enhance the image display through pixel-based threshold operations
  • Reduce the noise in an image by averaging multiple images of diverse quality
  • Sharpen the image by increasing the pixel values along its edges
  • Extract specific features in order to remove noise from an image
  • Perform arithmetic operations (addition, subtraction, division, and multiplication) to identify the variation between images
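Three of the techniques above (median-based smoothing, pixel-based thresholding, and noise reduction by averaging multiple images) can be sketched as follows. Python with NumPy is assumed, and the naive loops are written for clarity rather than speed:

```python
import numpy as np

def median_filter3(img):
    """Smoothing: replace each interior pixel by the median of its 3x3 neighborhood."""
    out = img.astype(float).copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

def threshold(img, t):
    """Pixel-based threshold operation: foreground where intensity exceeds t."""
    return (img > t).astype(np.uint8) * 255

def average_images(stack):
    """Noise reduction by averaging several exposures of the same scene."""
    return np.mean(stack, axis=0)
```

For example, a single salt-noise pixel in an otherwise uniform region is removed by the median filter, since eight of the nine neighborhood values are the background value.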

Beyond this, the field offers numerous Digital Image Processing Project Topics for current and upcoming scholars. Below, we have mentioned some research ideas that help you classify, analyze, represent, and display images or particular characteristics of an image.

Latest 11 Interesting Digital Image Processing Project Topics

  • Acoustic and Color Image Processing
  • Digital Video and Signal Processing
  • Multi-spectral and Laser Polarimetric Imaging
  • Image Processing and Sensing Techniques
  • Super-resolution Imaging and Applications
  • Passive and Active Remote Sensing
  • Time-Frequency Signal Processing and Analysis
  • 3-D Surface Reconstruction using Remote Sensed Image
  • Digital Image based Steganalysis and Steganography
  • Radar Image Processing for Remote Sensing Applications
  • Adaptive Clustering Algorithms for Image processing

Moreover, if you want to know more about Digital Image Processing Project Topics for your research, then communicate with our team. We will give you detailed information on current trends, future developments, and real-time challenges in the research grounds of Digital Image Processing.



SPECIALTY GRAND CHALLENGE article

Grand Challenges in Image Processing

Frédéric Dufaux

  • Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France

Introduction

The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging, medical imaging, and machine learning applied to these respective topics. Hereafter, we will consider both image and video content (i.e. sequence of images), and more generally all forms of visual information.

Rapid technological advances, especially in terms of computing power and network transmission bandwidth, have resulted in many remarkable and successful applications. Nowadays, images are ubiquitous in our daily life. Entertainment is one class of applications that has greatly benefited, including digital TV (e.g., broadcast, cable, and satellite TV), Internet video streaming, digital cinema, and video games. Beyond entertainment, imaging technologies are central in many other applications, including digital photography, video conferencing, video monitoring and surveillance, satellite imaging, but also in more distant domains such as healthcare and medicine, distance learning, digital archiving, cultural heritage or the automotive industry.

In this paper, we highlight a few research grand challenges for future imaging and video systems, in order to achieve breakthroughs to meet the growing expectations of end users. Given the vastness of the field, this list is by no means exhaustive.

A Brief Historical Perspective

We first briefly discuss a few key milestones in the field of image processing. Key inventions in the development of photography and motion pictures can be traced to the 19th century. The earliest surviving photograph of a real-world scene was made by Nicéphore Niépce in 1827 ( Hirsch, 1999 ). The Lumière brothers made the first cinematographic film in 1895, with a public screening the same year ( Lumiere, 1996 ). After decades of remarkable developments, the second half of the 20th century saw the emergence of new technologies launching the digital revolution. While the first prototype digital camera using a Charge-Coupled Device (CCD) was demonstrated in 1975, the first commercial consumer digital cameras started appearing in the early 1990s. These digital cameras quickly surpassed cameras using films and the digital revolution in the field of imaging was underway. As a key consequence, the digital process enabled computational imaging, in other words the use of sophisticated processing algorithms in order to produce high quality images.

In 1992, the Joint Photographic Experts Group (JPEG) released the JPEG standard for still image coding ( Wallace, 1992 ). In parallel, in 1993, the Moving Picture Experts Group (MPEG) published its first standard for coding of moving pictures and associated audio, MPEG-1 ( Le Gall, 1991 ), and a few years later MPEG-2 ( Haskell et al., 1996 ). By guaranteeing interoperability, these standards have been essential in many successful applications and services, for both the consumer and business markets. In particular, it is remarkable that, almost 30 years later, JPEG remains the dominant format for still images and photographs.

In the late 2000s and early 2010s, we could observe a paradigm shift with the appearance of smartphones integrating a camera. Thanks to advances in computational photography, these new smartphones soon became capable of rivaling the quality of consumer digital cameras at the time. Moreover, these smartphones were also capable of acquiring video sequences. Almost concurrently, another key evolution was the development of high bandwidth networks. In particular, the launch of 4G wireless services circa 2010 enabled users to quickly and efficiently exchange multimedia content. From this point, most of us are carrying a camera, anywhere and anytime, allowing to capture images and videos at will and to seamlessly exchange them with our contacts.

As a direct consequence of the above developments, we are currently observing a boom in the usage of multimedia content. It is estimated that today 3.2 billion images are shared each day on social media platforms, and 300 h of video are uploaded every minute on YouTube 1 . In a 2019 report, Cisco estimated that video content represented 75% of all Internet traffic in 2017, and this share is forecasted to grow to 82% in 2022 ( Cisco, 2019 ). While Internet video streaming and Over-The-Top (OTT) media services account for a significant bulk of this traffic, other applications are also expected to see significant increases, including video surveillance and Virtual Reality (VR)/Augmented Reality (AR).

Hyper-Realistic and Immersive Imaging

A major direction and key driver to research and development activities over the years has been the objective to deliver an ever-improving image quality and user experience.

For instance, in the realm of video, we have observed constantly increasing spatial and temporal resolutions, with the emergence nowadays of Ultra High Definition (UHD). Another aim has been to provide a sense of the depth in the scene. For this purpose, various 3D video representations have been explored, including stereoscopic 3D and multi-view ( Dufaux et al., 2013 ).

In this context, the ultimate goal is to be able to faithfully represent the physical world and to deliver an immersive and perceptually hyper-realistic experience. For this purpose, we discuss hereafter some emerging innovations. These developments are also very relevant in VR and AR applications ( Slater, 2014 ). Finally, while this paper focuses only on the visual information processing aspects, it is obvious that emerging display technologies ( Masia et al., 2013 ) and audio also play key roles in many application scenarios.

Light Fields, Point Clouds, Volumetric Imaging

In order to wholly represent a scene, the light information coming from all the directions has to be represented. For this purpose, the 7D plenoptic function is a key concept ( Adelson and Bergen, 1991 ), although it is unmanageable in practice.

By introducing additional constraints, the light field representation collects radiance from rays in all directions. Therefore, it contains much richer information compared to traditional 2D imaging, which captures a 2D projection of the light in the scene by integrating over the angular domain. For instance, this allows post-capture processing such as refocusing and changing the viewpoint. However, it also entails several technical challenges, in terms of acquisition and calibration, as well as computational image processing steps including depth estimation, super-resolution, compression and image synthesis ( Ihrke et al., 2016 ; Wu et al., 2017 ). The trade-off between spatial and angular resolutions is a fundamental issue. With a significant fraction of the earlier work focusing on static light fields, it is also expected that dynamic light field videos will stimulate more interest in the future. In particular, dense multi-camera arrays are becoming more tractable. Finally, the development of efficient light field compression and streaming techniques is a key enabler in many applications ( Conti et al., 2020 ).

Another promising direction is to consider a point cloud representation. A point cloud is a set of points in the 3D space represented by their spatial coordinates and additional attributes, including color pixel values, normals, or reflectance. They are often very large, easily ranging in the millions of points, and are typically sparse. One major distinguishing feature of point clouds is that, unlike images, they do not have a regular structure, calling for new algorithms. To remove the noise often present in acquired data, while preserving the intrinsic characteristics, effective 3D point cloud filtering approaches are needed ( Han et al., 2017 ). It is also important to develop efficient techniques for Point Cloud Compression (PCC). For this purpose, MPEG is developing two standards: Geometry-based PCC (G-PCC) and Video-based PCC (V-PCC) ( Graziosi et al., 2020 ). G-PCC considers the point cloud in its native form and compresses it using 3D data structures such as octrees. Conversely, V-PCC projects the point cloud onto 2D planes and then applies existing video coding schemes. More recently, deep learning-based approaches for PCC have been shown to be effective ( Guarda et al., 2020 ). Another challenge is to develop generic and robust solutions able to handle potentially widely varying characteristics of point clouds, e.g. in terms of size and non-uniform density. Efficient solutions for dynamic point clouds are also needed. Finally, while many techniques focus on the geometric information or the attributes independently, it is paramount to process them jointly.
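As a small illustration of how the irregular structure of point clouds is often tamed in practice, the sketch below performs voxel-grid downsampling, a common preprocessing step before filtering or compression. Python with NumPy is assumed; this is not the G-PCC or V-PCC codec itself, merely an illustrative quantization step:

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Quantize points to a voxel grid and keep one centroid per occupied voxel.

    `points` is an (N, 3) array of x, y, z coordinates.
    """
    # Integer voxel key for every point.
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel key; `inverse` maps each point to its voxel index.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)   # unbuffered scatter-add per voxel
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]      # centroid of each occupied voxel
```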

High Dynamic Range and Wide Color Gamut

The human visual system is able to perceive, using various adaptation mechanisms, a broad range of luminous intensities, from very bright to very dark, as experienced every day in the real world. Nonetheless, current imaging technologies are still limited in terms of capturing or rendering such a wide range of conditions. High Dynamic Range (HDR) imaging aims at addressing this issue. Wide Color Gamut (WCG) is also often associated with HDR in order to provide a wider colorimetry.

HDR has reached some levels of maturity in the context of photography. However, extending HDR to video sequences raises scientific challenges in order to provide high quality and cost-effective solutions, impacting the whole imaging processing pipeline, including content acquisition, tone reproduction, color management, coding, and display ( Dufaux et al., 2016 ; Chalmers and Debattista, 2017 ). Backward compatibility with legacy content and traditional systems is another issue. Despite recent progress, the potential of HDR has not been fully exploited yet.

Coding and Transmission

Three decades of standardization activities have continuously improved the hybrid video coding scheme based on the principles of transform coding and predictive coding. The Versatile Video Coding (VVC) standard was finalized in 2020 ( Bross et al., 2021 ), achieving approximately 50% bit rate reduction for the same subjective quality when compared to its predecessor, High Efficiency Video Coding (HEVC). While substantially outperforming VVC in the short term may be difficult, one encouraging direction is to rely on improved perceptual models to further optimize compression in terms of visual quality. Another direction, which has already shown promising results, is to apply deep learning-based approaches ( Ding et al., 2021 ). Here, one key issue is the ability to generalize these deep models to a wide diversity of video content. The second key issue is the implementation complexity, both in terms of computation and memory requirements, which is a significant obstacle to a widespread deployment. Besides, the emergence of new video formats targeting immersive communications is also calling for new coding schemes ( Wien et al., 2019 ).

Considering that in many application scenarios, videos are processed by intelligent analytic algorithms rather than viewed by users, another interesting track is the development of video coding for machines ( Duan et al., 2020 ). In this context, the compression is optimized taking into account the performance of video analysis tasks.

The push toward hyper-realistic and immersive visual communications entails most often an increasing raw data rate. Despite improved compression schemes, more transmission bandwidth is needed. Moreover, some emerging applications, such as VR/AR, autonomous driving, and Industry 4.0, bring a strong requirement for low latency transmission, with implications on both the imaging processing pipeline and the transmission channel. In this context, the emergence of 5G wireless networks will positively contribute to the deployment of new multimedia applications, and the development of future wireless communication technologies points toward promising advances ( Da Costa and Yang, 2020 ).

Human Perception and Visual Quality Assessment

It is important to develop effective models of human perception. On the one hand, it can contribute to the development of perceptually inspired algorithms. On the other hand, perceptual quality assessment methods are needed in order to optimize and validate new imaging solutions.

The notion of Quality of Experience (QoE) relates to the degree of delight or annoyance of the user of an application or service ( Le Callet et al., 2012 ). QoE is strongly linked to subjective and objective quality assessment methods. Many years of research have resulted in the successful development of perceptual visual quality metrics based on models of human perception ( Lin and Kuo, 2011 ; Bovik, 2013 ). More recently, deep learning-based approaches have also been successfully applied to this problem ( Bosse et al., 2017 ). While these perceptual quality metrics have achieved good performances, several significant challenges remain. First, when applied to video sequences, most current perceptual metrics are applied on individual images, neglecting temporal modeling. Second, whereas color is a key attribute, there are currently no widely accepted perceptual quality metrics explicitly considering color. Finally, new modalities, such as 360° videos, light fields, point clouds, and HDR, require new approaches.
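As a baseline for the perceptual metrics discussed above, the classical fidelity measure is the Peak Signal-to-Noise Ratio (PSNR), which compares a distorted image to its reference through the mean squared error. PSNR is known to correlate imperfectly with perceived quality, which is precisely what motivates perceptual metrics; it is shown here only as a reference point. A minimal Python sketch (NumPy assumed):

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher means closer to the reference."""
    mse = np.mean((reference.astype(float) - distorted.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```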

Another closely related topic is image esthetic assessment ( Deng et al., 2017 ). The esthetic quality of an image is affected by numerous factors, such as lighting, color, contrast, and composition. It is useful in different application scenarios such as image retrieval and ranking, recommendation, and photos enhancement. While earlier attempts have used handcrafted features, most recent techniques to predict esthetic quality are data driven and based on deep learning approaches, leveraging the availability of large annotated datasets for training ( Murray et al., 2012 ). One key challenge is the inherently subjective nature of esthetics assessment, resulting in ambiguity in the ground-truth labels. Another important issue is to explain the behavior of deep esthetic prediction models.

Analysis, Interpretation and Understanding

Another major research direction has been the objective to efficiently analyze, interpret and understand visual data. This goal is challenging, due to the high diversity and complexity of visual data. This has led to many research activities, involving both low-level and high-level analysis, addressing topics such as image classification and segmentation, optical flow, image indexing and retrieval, object detection and tracking, and scene interpretation and understanding. Hereafter, we discuss some trends and challenges.

Keypoints Detection and Local Descriptors

Local image matching has been the cornerstone of many analysis tasks. It involves the detection of keypoints, i.e. salient visual points that can be robustly and repeatedly detected, and descriptors, i.e. compact signatures locally describing the visual features at each keypoint. Pairwise matching between these features can then be computed to reveal local correspondences. In this context, several frameworks have been proposed, including Scale Invariant Feature Transform (SIFT) ( Lowe, 2004 ) and Speeded Up Robust Features (SURF) ( Bay et al., 2008 ), and later binary variants including Binary Robust Independent Elementary Feature (BRIEF) ( Calonder et al., 2010 ), Oriented FAST and Rotated BRIEF (ORB) ( Rublee et al., 2011 ) and Binary Robust Invariant Scalable Keypoints (BRISK) ( Leutenegger et al., 2011 ). Although these approaches exhibit scale and rotation invariance, they are less suited to deal with large 3D distortions such as perspective deformations, out-of-plane rotations, and significant viewpoint changes. Besides, they tend to fail under significantly varying and challenging illumination conditions.

These traditional approaches based on handcrafted features have been successfully applied to problems such as image and video retrieval, object detection, visual Simultaneous Localization And Mapping (SLAM), and visual odometry. Besides, the emergence of new imaging modalities as introduced above can also be beneficial for image analysis tasks, including light fields ( Galdi et al., 2019 ), point clouds ( Guo et al., 2020 ), and HDR ( Rana et al., 2018 ). However, when applied to high-dimensional visual data for semantic analysis and understanding, these approaches based on handcrafted features have been supplanted in recent years by approaches based on deep learning.

Deep Learning-Based Methods

Data-driven deep learning-based approaches ( LeCun et al., 2015 ), and in particular the Convolutional Neural Network (CNN) architecture, nowadays represent the state of the art in performance for complex pattern recognition tasks in scene analysis and understanding. By combining multiple processing layers, deep models are able to learn data representations with different levels of abstraction.

Supervised learning is the most common form of deep learning. It requires a large and fully labeled training dataset, a typically time-consuming and expensive process needed whenever tackling a new application scenario. Moreover, in some specialized domains, e.g. medical data, it can be very difficult to obtain annotations. To alleviate this major burden, methods such as transfer learning and weakly supervised learning have been proposed.

In another direction, deep models have been shown to be vulnerable to adversarial attacks ( Akhtar and Mian, 2018 ). Those attacks consist in introducing subtle perturbations to the input, such that the model predicts an incorrect output. For instance, in the case of images, imperceptible pixel differences are able to fool deep learning models. Such adversarial attacks are definitively an important obstacle to the successful deployment of deep learning, especially in applications where safety and security are critical. While some early solutions have been proposed, a significant challenge is to develop effective defense mechanisms against those attacks.

Finally, another challenge is to enable low complexity and efficient implementations. This is especially important for mobile or embedded applications. For this purpose, further interactions between signal processing and machine learning can potentially bring additional benefits. For instance, one direction is to compress deep neural networks in order to enable their more efficient handling. Moreover, by combining traditional processing techniques with deep learning models, it is possible to develop low complexity solutions while preserving high performance.

Explainability in Deep Learning

While data-driven deep learning models often achieve impressive performances on many visual analysis tasks, their black-box nature often makes it inherently very difficult to understand how they reach a predicted output and how it relates to particular characteristics of the input data. However, this is a major impediment in many decision-critical application scenarios. Moreover, it is important not only to have confidence in the proposed solution, but also to gain further insights from it. Based on these considerations, some deep learning systems aim at promoting explainability ( Adadi and Berrada, 2018 ; Xie et al., 2020 ). This can be achieved by exhibiting traits related to confidence, trust, safety, and ethics.

However, explainable deep learning is still in its early phase. More developments are needed, in particular to develop a systematic theory of model explanation. Important aspects include the need to understand and quantify risk, to comprehend how the model makes predictions for transparency and trustworthiness, and to quantify the uncertainty in the model prediction. This challenge is key in order to deploy and use deep learning-based solutions in an accountable way, for instance in application domains such as healthcare or autonomous driving.

Self-Supervised Learning

Self-supervised learning refers to methods that learn general visual features from large-scale unlabeled data, without the need for manual annotations. Self-supervised learning is therefore very appealing, as it allows exploiting the vast amount of unlabeled images and videos available. Moreover, it is widely believed that it is closer to how humans actually learn. One common approach is to use the data to provide the supervision, leveraging its structure. More generally, a pretext task can be defined, e.g. image inpainting, colorizing grayscale images, predicting future frames in videos, by withholding some parts of the data and by training the neural network to predict it ( Jing and Tian, 2020 ). By learning an objective function corresponding to the pretext task, the network is forced to learn relevant visual features in order to solve the problem. Self-supervised learning has also been successfully applied to autonomous vehicles perception. More specifically, the complementarity between analytical and learning methods can be exploited to address various autonomous driving perception tasks, without the prerequisite of an annotated data set ( Chiaroni et al., 2021 ).

While good performances have already been obtained using self-supervised learning, further work is still needed. A few promising directions are outlined hereafter. Combining self-supervised learning with other learning methods is a first interesting path. For instance, semi-supervised learning ( Van Engelen and Hoos, 2020 ) and few-shot learning ( Fei-Fei et al., 2006 ) methods have been proposed for scenarios where limited labeled data is available. The performance of these methods can potentially be boosted by incorporating a self-supervised pre-training. The pretext task can also serve to add regularization. Another interesting trend in self-supervised learning is to train neural networks with synthetic data. The challenge here is to bridge the domain gap between the synthetic and real data. Finally, another compelling direction is to exploit data from different modalities. A simple example is to consider both the video and audio signals in a video sequence. In another example in the context of autonomous driving, vehicles are typically equipped with multiple sensors, including cameras, LIght Detection And Ranging (LIDAR), Global Positioning System (GPS), and Inertial Measurement Units (IMU). In such cases, it is easy to acquire large unlabeled multimodal datasets, where the different modalities can be effectively exploited in self-supervised learning methods.

Reproducible Research and Large Public Datasets

The reproducible research initiative is another way to further ensure high-quality research for the benefit of our community ( Vandewalle et al., 2009 ). Reproducibility, referring to the ability by someone else working independently to accurately reproduce the results of an experiment, is a key principle of the scientific method. In the context of image and video processing, it is usually not sufficient to provide a detailed description of the proposed algorithm. Most often, it is essential to also provide access to the code and data. This is even more imperative in the case of deep learning-based models.

In parallel, the availability of large public datasets is also highly desirable in order to support research activities. This is especially critical for new emerging modalities or specific application scenarios, where it is difficult to get access to relevant data. Moreover, with the emergence of deep learning, large datasets, along with labels, are often needed for training, which can be another burden.

Conclusion and Perspectives

The field of image processing is very broad and rich, with many successful applications in both the consumer and business markets. However, many technical challenges remain in order to further push the limits in imaging technologies. Two main trends are on the one hand to always improve the quality and realism of image and video content, and on the other hand to be able to effectively interpret and understand this vast and complex amount of visual data. However, the list is certainly not exhaustive and there are many other interesting problems, e.g. related to computational imaging, information security and forensics, or medical imaging. Key innovations will be found at the crossroad of image processing, optics, psychophysics, communication, computer vision, artificial intelligence, and computer graphics. Multi-disciplinary collaborations are therefore critical moving forward, involving actors from both academia and the industry, in order to drive these breakthroughs.

The “Image Processing” section of Frontiers in Signal Processing aims to give the research community a forum to exchange, discuss, and improve new ideas, with the goal of contributing to the further advancement of the field of image processing and of bringing exciting innovations in the foreseeable future.

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1 https://www.brandwatch.com/blog/amazing-social-media-statistics-and-facts/ (accessed on Feb. 23, 2021).

Adadi, A., and Berrada, M. (2018). Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6, 52138–52160. doi:10.1109/access.2018.2870052

Adelson, E. H., and Bergen, J. R. (1991). “The plenoptic function and the elements of early vision,” in Computational Models of Visual Processing. Cambridge, MA: MIT Press, 3–20.

Akhtar, N., and Mian, A. (2018). Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430. doi:10.1109/access.2018.2807385

Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding 110 (3), 346–359. doi:10.1016/j.cviu.2007.09.014

Bosse, S., Maniry, D., Müller, K. R., Wiegand, T., and Samek, W. (2017). Deep neural networks for no-reference and full-reference image quality assessment. IEEE Trans. Image Process. 27 (1), 206–219. doi:10.1109/TIP.2017.2760518

Bovik, A. C. (2013). Automatic prediction of perceptual image and video quality. Proc. IEEE 101 (9), 2008–2024. doi:10.1109/JPROC.2013.2257632

Bross, B., Chen, J., Ohm, J. R., Sullivan, G. J., and Wang, Y. K. (2021). Developments in international video coding standardization after AVC, with an overview of Versatile Video Coding (VVC). Proc. IEEE . doi:10.1109/JPROC.2020.3043399

Calonder, M., Lepetit, V., Strecha, C., and Fua, P. (2010). Brief: binary robust independent elementary features. In K. Daniilidis, P. Maragos, and N. Paragios (eds) European conference on computer vision . Berlin, Heidelberg: Springer , 778–792. doi:10.1007/978-3-642-15561-1_56

Chalmers, A., and Debattista, K. (2017). HDR video past, present and future: a perspective. Signal. Processing: Image Commun. 54, 49–55. doi:10.1016/j.image.2017.02.003

Chiaroni, F., Rahal, M.-C., Hueber, N., and Dufaux, F. (2021). Self-supervised learning for autonomous vehicles perception: a conciliation between analytical and learning methods. IEEE Signal. Process. Mag. 38 (1), 31–41. doi:10.1109/msp.2020.2977269

Cisco (2019). Cisco visual networking index: forecast and trends, 2017–2022 (white paper). Indianapolis, IN: Cisco Press.

Conti, C., Soares, L. D., and Nunes, P. (2020). Dense light field coding: a survey. IEEE Access 8, 49244–49284. doi:10.1109/ACCESS.2020.2977767

Da Costa, D. B., and Yang, H.-C. (2020). Grand challenges in wireless communications. Front. Commun. Networks 1 (1), 1–5. doi:10.3389/frcmn.2020.00001

Deng, Y., Loy, C. C., and Tang, X. (2017). Image aesthetic assessment: an experimental survey. IEEE Signal. Process. Mag. 34 (4), 80–106. doi:10.1109/msp.2017.2696576

Ding, D., Ma, Z., Chen, D., Chen, Q., Liu, Z., and Zhu, F. (2021). Advances in video compression system using deep neural network: a review and case studies. Ithaca, NY: Cornell University.

Duan, L., Liu, J., Yang, W., Huang, T., and Gao, W. (2020). Video coding for machines: a paradigm of collaborative compression and intelligent analytics. IEEE Trans. Image Process. 29, 8680–8695. doi:10.1109/tip.2020.3016485

Dufaux, F., Le Callet, P., Mantiuk, R., and Mrak, M. (2016). High dynamic range video - from acquisition, to display and applications . Cambridge, Massachusetts: Academic Press .

Dufaux, F., Pesquet-Popescu, B., and Cagnazzo, M. (2013). Emerging technologies for 3D video: creation, coding, transmission and rendering . Hoboken, NJ: Wiley .

Fei-Fei, L., Fergus, R., and Perona, P. (2006). One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach Intell. 28 (4), 594–611. doi:10.1109/TPAMI.2006.79

Galdi, C., Chiesa, V., Busch, C., Lobato Correia, P., Dugelay, J.-L., and Guillemot, C. (2019). Light fields for face analysis. Sensors 19 (12), 2687. doi:10.3390/s19122687

Graziosi, D., Nakagami, O., Kuma, S., Zaghetto, A., Suzuki, T., and Tabatabai, A. (2020). An overview of ongoing point cloud compression standardization activities: video-based (V-PCC) and geometry-based (G-PCC). APSIPA Trans. Signal Inf. Process. 9, 2020. doi:10.1017/ATSIP.2020.12

Guarda, A., Rodrigues, N., and Pereira, F. (2020). Adaptive deep learning-based point cloud geometry coding. IEEE J. Selected Top. Signal Process. 15, 415-430. doi:10.1109/mmsp48831.2020.9287060

Guo, Y., Wang, H., Hu, Q., Liu, H., Liu, L., and Bennamoun, M. (2020). Deep learning for 3D point clouds: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. doi:10.1109/TPAMI.2020.3005434

Han, X.-F., Jin, J. S., Wang, M.-J., Jiang, W., Gao, L., and Xiao, L. (2017). A review of algorithms for filtering the 3D point cloud. Signal. Processing: Image Commun. 57, 103–112. doi:10.1016/j.image.2017.05.009

Haskell, B. G., Puri, A., and Netravali, A. N. (1996). Digital video: an introduction to MPEG-2 . Berlin, Germany: Springer Science and Business Media .

Hirsch, R. (1999). Seizing the light: a history of photography . New York, NY: McGraw-Hill .

Ihrke, I., Restrepo, J., and Mignard-Debise, L. (2016). Principles of light field imaging: briefly revisiting 25 years of research. IEEE Signal. Process. Mag. 33 (5), 59–69. doi:10.1109/MSP.2016.2582220

Jing, L., and Tian, Y. (2020). “Self-supervised visual feature learning with deep neural networks: a survey,” IEEE transactions on pattern analysis and machine intelligence , Ithaca, NY: Cornell University .

Le Callet, P., Möller, S., and Perkis, A. (2012). Qualinet white paper on definitions of quality of experience. European network on quality of experience in multimedia systems and services (COST Action IC 1003), 3(2012) .

Le Gall, D. (1991). Mpeg: A Video Compression Standard for Multimedia Applications. Commun. ACM 34, 46–58. doi:10.1145/103085.103090

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521 (7553), 436–444. doi:10.1038/nature14539

Leutenegger, S., Chli, M., and Siegwart, R. Y. (2011). “BRISK: binary robust invariant scalable keypoints,” IEEE International conference on computer vision , Barcelona, Spain , 6-13 Nov, 2011 ( IEEE ), 2548–2555.

Lin, W., and Jay Kuo, C.-C. (2011). Perceptual visual quality metrics: a survey. J. Vis. Commun. image representation 22 (4), 297–312. doi:10.1016/j.jvcir.2011.01.005

Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60 (2), 91–110. doi:10.1023/b:visi.0000029664.99615.94

Lumière, L. (1996). 1936: the Lumière cinematograph. J. SMPTE 105 (10), 608–611. doi:10.5594/j17187

Masia, B., Wetzstein, G., Didyk, P., and Gutierrez, D. (2013). A survey on computational displays: pushing the boundaries of optics, computation, and perception. Comput. & Graphics 37 (8), 1012–1038. doi:10.1016/j.cag.2013.10.003

Murray, N., Marchesotti, L., and Perronnin, F. (2012). “AVA: a large-scale database for aesthetic visual analysis,” IEEE conference on computer vision and pattern recognition , Providence, RI , June, 2012 . ( IEEE ), 2408–2415. doi:10.1109/CVPR.2012.6247954

Rana, A., Valenzise, G., and Dufaux, F. (2018). Learning-based tone mapping operator for efficient image matching. IEEE Trans. Multimedia 21 (1), 256–268. doi:10.1109/TMM.2018.2839885

Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). “ORB: an efficient alternative to SIFT or SURF,” IEEE International conference on computer vision , Barcelona, Spain , November, 2011 ( IEEE ), 2564–2571. doi:10.1109/ICCV.2011.6126544

Slater, M. (2014). Grand challenges in virtual environments. Front. Robotics AI 1, 3. doi:10.3389/frobt.2014.00003

Van Engelen, J. E., and Hoos, H. H. (2020). A survey on semi-supervised learning. Mach Learn. 109 (2), 373–440. doi:10.1007/s10994-019-05855-6

Vandewalle, P., Kovacevic, J., and Vetterli, M. (2009). Reproducible research in signal processing. IEEE Signal. Process. Mag. 26 (3), 37–47. doi:10.1109/msp.2009.932122

Wallace, G. K. (1992). The JPEG still picture compression standard. IEEE Trans. Consumer Electron. 38 (1), xviii–xxxiv. doi:10.1109/30.125072

Wien, M., Boyce, J. M., Stockhammer, T., and Peng, W.-H. (2019). Standardization status of immersive video coding. IEEE J. Emerg. Sel. Top. Circuits Syst. 9 (1), 5–17. doi:10.1109/JETCAS.2019.2898948

Wu, G., Masia, B., Jarabo, A., Zhang, Y., Wang, L., Dai, Q., et al. (2017). Light field image processing: an overview. IEEE J. Sel. Top. Signal. Process. 11 (7), 926–954. doi:10.1109/JSTSP.2017.2747126

Xie, N., Ras, G., van Gerven, M., and Doran, D. (2020). Explainable deep learning: a field guide for the uninitiated. Ithaca, NY: Cornell University.

Keywords: image processing, immersive, image analysis, image understanding, deep learning, video processing

Citation: Dufaux F (2021) Grand Challenges in Image Processing. Front. Sig. Proc. 1:675547. doi: 10.3389/frsip.2021.675547

Received: 03 March 2021; Accepted: 10 March 2021; Published: 12 April 2021.

Copyright © 2021 Dufaux. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Frédéric Dufaux, [email protected]

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

2024 IEEE International Conference on Image Processing Explores the Latest Technical Innovations

Sep 17, 2024, 11:37 ET

  ICIP 2024 focuses on trustworthy visual data processing and covers cutting-edge topics involving computer vision

PISCATAWAY, N.J. , Sept. 17, 2024 /PRNewswire/ -- IEEE, the world's largest technical professional organization dedicated to advancing technology for humanity, and the IEEE Signal Processing Society (SPS) will hold the 2024 IEEE International Conference on Image Processing (ICIP 2024) , 27-30 October 2024 at the Abu Dhabi National Exhibition Centre in Abu Dhabi , UAE. ICIP 2024 will be a hub of innovation and learning - examining how AI-based approaches in the field are bringing opportunities, as well as challenges.

The program delves into "Trustworthy Generalization in Visual Machine Learning," AI/ML learning on visual data, and examines the latest trends in data-driven, learning-based image and video coding standards.

ICIP 2024 will feature insightful tutorials, engaging exhibits and demonstrations, workshops by leading tech companies, career development opportunities, and networking events for practicing engineers, aspiring students, and young entrepreneurs.

"ICIP will provide thought leaders with a platform to reach diverse audiences and encourage a dialogue on topics making a large impact on our industry. We look forward to building connections between top researchers and practitioners in the field," said Kostas Plataniotis , IEEE Signal Processing Society President.

Plenary speakers include Dr. Touradj Ebrahimi , Convenor of the JPEG Standardization Committee, Dr. Gitta Kutyniok , Bavarian AI Chair for Mathematical Foundations of AI, LMU Munich, and Dr. Mohamad Sawan, Chair Professor, Westlake University ; Emeritus Professor, Polytechnique Montreal. View the Technical Program for a summary of conference activities.

ICIP 2024 will bring together corporate, government, and academic researchers, top global leaders in the field and more - register today .

About IEEE Signal Processing Society

Founded as IEEE's first society in 1948, the Signal Processing Society is the world's premier association for signal processing engineers and industry professionals.  

IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. Through its highly cited publications, conferences, technology standards, and professional and educational activities, IEEE is the trusted voice on a wide variety of areas ranging from aerospace systems, computers and telecommunications to biomedical engineering, electric power and consumer electronics.

Media Contact: Caroline Johnson - Director, Conferences, Marketing, and Data Analytics, [email protected]

SOURCE IEEE Signal Processing Society

Sensors (Basel)

Modern Trends and Applications of Intelligent Methods in Biomedical Signal and Image Processing

Jan Kubicek

1 Department of Cybernetics and Biomedical Engineering, VŠB-Technical University of Ostrava, 17. listopadu 15, 70 833 Ostrava-Poruba, Czech Republic; [email protected]

Marek Penhaker

Ondrej Krejcar

2 Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Rokitanskeho 62, 50 003 Hradec Kralove, Czech Republic; [email protected]

Ali Selamat

3 Malaysia Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Jalan Sultan Yahya Petra, Kuala Lumpur 54100, Malaysia; aselamat@utm.my

There are various modern systems for the measurement and subsequent acquisition of valuable patient records in the form of medical signals and images, which are processed to provide significant information about the state of biological tissues. In the modern age of digital technologies in the healthcare sector, we are therefore surrounded by big clinical data containing valuable information about patients' current state and future prognosis, which needs to be extracted from biomedical signals and images. Thus, current trends in this area of biomedical engineering focus on the design and development of intelligent methods, containing elements of artificial intelligence, that allow the extraction, classification, and optimization of clinical information from various medical data. Such methods significantly reduce the workload of medical staff and, at the same time, provide effective feedback for clinicians as decision-making systems. In particular, such intelligent methods are employed for data smoothing, feature extraction, segmentation, identification, and classification. These tasks require the participation of clinical specialists, mathematicians, and information experts, who together develop intelligent systems that can be employed in the healthcare sector as a support to medical staff.

The Special Issue “Modern Trends and Applications of Intelligent Methods in Biomedical Signal and Image Processing” is aimed at the new proposals and intelligent solutions that constitute the state of the art of the intelligent methods for biomedical data processing from selected areas of signal and image processing. This Special Issue brings together research works from various fields that are related to the area of biomedical engineering to describe the recent trends and advances in this area.

For this Special Issue, we received 20 contributions in total. After judging scientific impact and novelty, we selected the 10 contributions included herein. The published papers include nine research papers and one review.

We appreciate all the authors who have decided to publish their research in this Special Issue. Thanks to these authors, we could provide the state of the art of the recent research of intelligent techniques in applications of biomedical engineering. Below, we summarize the individual contributions published in this Special Issue.

In recent years, image-guided navigation systems (IGNS) have become an important tool for various surgical operations. For planning a surgical path, verifying the location of a lesion, and similar preparations, it is an essential tool; in procedures such as bronchoscopy, the inspection and retrieval of diagnostic samples for lung-related surgeries, it is even more so. In Reference [ 1 ], the authors propose a novel registration method that matches real bronchoscopy images with virtual bronchoscope images from a 3D bronchial tree model built from computed tomography (CT) image stacks, in order to obtain the current 3D position of the bronchoscope in the airways. The method combines a novel position-tracking step using the current frames from the bronchoscope with verification of the real bronchoscope image against an image extracted from the 3D model, using an adaptive-network-based fuzzy inference system (ANFIS)-based image matching method. Experimental results show that the proposed method performs better than the other methods used in the comparison.

Heart problems are responsible for the majority of deaths worldwide. The use of intelligent techniques to assist in identifying existing patterns in these diseases can facilitate treatments and decision making in the field of medicine. In Reference [ 2 ], the authors extract knowledge from a dataset based on heart noise behaviors in order to determine whether a predilection for heart murmur exists in the analyzed patients. A heart murmur can be pathological due to defects in the heart, so an evolving hybrid technique can assist in detecting this comorbidity and, at the same time, extract knowledge through fuzzy linguistic rules, facilitating the understanding of the nature of the evaluated data. Heart disease detection tests were performed to compare the proposed hybrid model's performance with the state of the art for the subject. The results obtained showed 90.75% accuracy, in addition to great assertiveness in detecting heart murmurs.

In recent years, research has focused on generating mechanisms to assess subjects' cognitive workload when performing activities that demand high concentration levels, such as driving a vehicle. These mechanisms have involved the implementation of several tools for analyzing the cognitive workload, among which electroencephalographic (EEG) signals have been most frequently used due to their high precision. In Reference [ 3 ], the authors present a new feature selection model, called GALoRIS (Genetic Algorithms and Logistic Regression), focused on pattern recognition from EEG signals using machine learning techniques. GALoRIS combines genetic algorithms and logistic regression to build a new fitness function that identifies and selects the critical EEG features contributing to the recognition of high and low cognitive workloads.
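As a generic illustration of the genetic-algorithm half of such a feature selection pipeline (a toy sketch, not the GALoRIS implementation; in the actual method the fitness function is derived from a logistic-regression model, whereas here the caller supplies a stand-in), a binary feature mask can be evolved as follows:

```python
import random

def evolve_mask(n_features, fitness, generations=30, pop_size=12, seed=0):
    """Toy genetic algorithm over binary feature masks.

    fitness(mask) -> float; higher is better. Selection keeps the best half,
    offspring are produced by one-point crossover plus a single bit mutation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # selection: keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)           # point mutation: flip one bit
            child[i] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness: reward masks that agree with a known "ideal" selection.
target = [1, 0, 1, 0, 0, 0]
best = evolve_mask(6, lambda m: sum(1 for x, t in zip(m, target) if x == t))
```

In a GALoRIS-style method, each mask would select a subset of EEG features and the fitness would score a classifier trained on that subset.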

In the area of medical data processing, wavelet transformation is frequently used for various applications, including data decomposition, smoothing, feature extraction, and image segmentation. One of the essential steps is the selection of suitable wavelet settings, including the mother wavelet and the decomposition level. Since wavelet transformation offers a great many settings, selecting the most appropriate ones is usually a complicated task. In Reference [ 4 ], the authors propose a novel scheme able to simultaneously evaluate the effectiveness of selected wavelet settings in the form of spatial 2D maps. The authors also study the influence of dynamical noise on wavelet smoothing by using volumetric mapping, and report testing of these techniques on both 1D EMG signals and 2D medical images from various imaging modalities.

In Reference [ 5 ], the authors present a novel approach intended for the periodical testing of the function evaluation of fetal heart rate monitors. The proposed simulator was designed to be compliant with the standard requirements for the accurate assessment and measurement of medical devices. The accuracy of the simulated signals was evaluated, and it was shown to be stable and reliable. The generated frequencies showed an error of about 0.5% with respect to the nominal one, while the accuracy of the test equipment was within ±3% of the test signal set frequency. The proposed device ensures easy and fast testing of fetal heart rate monitors. Hence, it provides an effective way to evaluate and test the correlation of commercial devices.

The invasive method of fetal electrocardiogram (fECG) monitoring is widely used, with electrodes directly attached to the fetal scalp. Because of potential risks such as infection, it is usually carried out only during labor, when required. Recent advances in electronics and technologies have enabled fECG monitoring from the early stages of pregnancy through fECG extraction from the combined fetal/maternal ECG (f/mECG) signal recorded noninvasively in the abdominal area of the mother. In Reference [ 6 ], the authors propose an end-to-end deep learning model aimed at the detection of fetal QRS complexes. The proposed model builds on the residual network (ResNet) architecture and adopts a novel 1D octave convolution (OctConv) that captures multiple temporal frequency features, which reduces memory and computational demands.

Time-of-flight (ToF) sensors are subject to various errors, including the multicamera interference artifact caused by the parallel scanning mode of the sensors. In Reference [ 7 ], the authors present a novel importance map, based on a median filtering algorithm, aimed at suppressing interference artifacts. The proposed method processes multiple depth frames, detects the interference region, and applies interpolation. The performance of the algorithm was evaluated against popular filtering methods based on neural networks and statistics, on a dataset consisting of real-world objects with different textures and morphologies.

In Reference [ 8 ], the authors propose using electrodes for the continuous measurement of glucose concentration for the purpose of specifying further hemodynamic parameters. The proposal includes the design of the electronic measuring system, the construction of the electrodes themselves, and the functionality of the entire system, verified experimentally using various electrode materials. The proposed circuit is based on a microammeter measuring the magnitude of the flowing electric current, and the electrochemical measurement method is used for determining the glucose concentration. The electrode system comprises two electrodes embedded in a silicone tube. The authors' testing indicates that although the Ag/AgCl electrode appears to be the most suitable, showing high stability, gold-plated electrodes were similarly stable throughout the measurement but did not achieve the same sensitivity and readability of the measured results.

The next study [ 9 ] proposes a novel multinetwork intelligent architecture, containing a multiscale convolutional neural network (MSCNN) with a fully connected graph convolution network (GCN), named MSCNN-GCN, for the detection of musculoskeletal abnormalities via musculoskeletal radiographs. The effectiveness of this model was verified by comparing the performance of radiologists and three popular CNN models (DenseNet169, CapsNet, and MSCNN) with three evaluation metrics (accuracy, F1 score, and kappa score) using the MURA dataset (a large dataset of bone X-rays).

The intake of microbially contaminated food poses severe health issues due to outbreaks of serious foodborne diseases. Therefore, there is a need for the precise detection and identification of pathogenic microbes and toxins in food to prevent these concerns. Thus, understanding the concept of biosensing has enabled researchers to develop nano-biosensors with different nanomaterials and composites to improve the sensitivity as well as the specificity of pathogen detection. In Reference [ 10 ], the authors publish a review that summarizes various sensing methods used in foodborne pathogen detection; in addition, the authors focus on the design, technical principles, and advances in sensing systems.

Author Contributions

Conceptualization, J.K.; methodology, M.P.; validation, O.K.; formal analysis, A.S. All authors have read and agreed to the published version of the manuscript.

The work and the contributions were supported by the project SV450994 Biomedical Engineering Systems XV’. This study was supported by the research project The Czech Science Foundation (TACR) ETA No. TL01000302 Medical Devices development as an effective investment for public and private entities.

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


What is Image Processing? Examples, Types, and Benefits


We see thousands of images every day, online and out in the real world. It’s likely that the images have been changed in some way before being released into the wild. 

Whether someone simply brightened or sharpened the visuals or performed more extensive edits to extract critical information, many industries rely on the technique of image processing to complete their work.

What is image processing?

Image processing is a group of methods used to understand, interpret, and alter visual data. Two-dimensional digital images are made up of pixels, small units that each contain bits of information about the shade, color, and opacity of that specific part of the visual. Together, all of the pixels make up the full image. This data is then processed to enhance the image or to extract information contained within it.
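As a minimal sketch of this pixel model in plain Python (the layout and the luma weights used for brightness are illustrative choices, not tied to any particular tool):

```python
# A 2x2 RGBA image: each pixel stores red, green, blue (the shade/color)
# and alpha (the opacity), each as an 8-bit value in 0..255.
image = [
    [(255, 0, 0, 255), (0, 255, 0, 255)],      # row 0: opaque red, opaque green
    [(0, 0, 255, 255), (128, 128, 128, 128)],  # row 1: opaque blue, translucent gray
]

def brightness(pixel):
    """Perceived brightness of one pixel, using the common BT.601 luma weights."""
    r, g, b, _ = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

# A tiny example of "extracting information contained within" the pixels:
avg = sum(brightness(p) for row in image for p in row) / 4
```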

While image processing has been around for at least 80 years in some form, technological developments over the last decade have seen an increase in the use of artificial intelligence (AI) tools. Algorithms have been developed to replicate how the human brain would process these images. Image recognition software , also known as computer vision, completes the processing functions that the machine has been trained to perform. 

Most forms of image processing these days are digital: pixelated graphics are processed through a computer using an algorithm. With AI, these algorithms elevate the precision and sophistication of identification and modification.

Analog image processing still happens, though. Special types of optical computers are used to process physical images using light waves generated by the object. Hard copying, like printing or photocopying, stands as the most common application of analog image processing.

Types of image processing

The goal for most image processing is to either improve the quality of the visual itself or to gain a better understanding of different elements in the image. Different objectives call for different types of processing. 

Some of the most common types of image processing are:

  • Image enhancement. Not every picture comes out perfectly in its original form. Image processing tools can alter the quality of images by doing things like adjusting the brightness, sharpness, clarity, and contrast. 
  • Object detection and classification. The practice of object detection identifies different elements within an image. You can find patterns when they’re cleanly separated in a visual or you can quickly highlight specific objects when the visual is scanned.
  • Image segmentation . Images may need to be divided into different sections for object detection or other purposes. After that, you can analyze the separate regions independently from each other. This happens a lot in medical imaging like MRIs, which shows different shades of gray and black to represent solid masses around fluid.
  • Image compression. This type reduces the file size of an image while still preserving its original quality. Compression makes uploading images to websites faster, improves page loading times, and minimizes storage needs for businesses that keep numerous image files.
  • Image restoration. Images of any kind can lose their quality over time; physical photos especially degrade over decades. Image processing is a good way to restore the original look and feel, especially for physical photographs.
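The enhancement type above can be illustrated with the simplest possible operation, a linear brightness/contrast adjustment on grayscale pixel values (a toy sketch; the gain and offset values here are arbitrary):

```python
def adjust(pixels, gain=1.2, offset=10):
    """Linear point operation: out = gain * in + offset, clipped to 0..255.

    gain > 1 stretches contrast; offset > 0 raises brightness.
    """
    return [[min(255, max(0, round(gain * p + offset))) for p in row]
            for row in pixels]

gray = [[10, 120], [200, 250]]
enhanced = adjust(gray)   # dark pixels lifted, bright pixels clipped at 255
```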

What is annotation in image processing?

The practice of image annotation labels elements within digital images, whether done manually by humans or automatically by computers. It lets computers interpret an image and extract important information.

When AI functions as the primary method of image processing, machine learning (ML) engineers typically predetermine the labels entered into a digital image processing algorithm, helping introduce the computer to different objects.

This is an essential part of the object detection and classification process, as any mistakes here become difficult to fix as the machine learning tool grows. Precision and accuracy at this early stage of training are non-negotiable.
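A single annotation of the kind described above is often stored as a class label plus a bounding box; a minimal sketch (the field names are illustrative, loosely echoing the common [x, y, width, height] box convention):

```python
annotation = {
    "image_id": "street_0042",   # which image the label belongs to (hypothetical id)
    "label": "car",              # class assigned by the annotator
    "bbox": [34, 50, 120, 80],   # [x, y, width, height] in pixels
}

def bbox_area(ann):
    """Area of the annotated region, often used to filter out tiny boxes."""
    _, _, w, h = ann["bbox"]
    return w * h
```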

How are digital images processed?

For any image processing project, there are several key steps that must happen for the image to be thoroughly altered (if necessary) and reviewed before a better output can be generated. Not every image will need to go through all of these steps, but this sequence is the most commonly used in image processing.

1. Acquisition 

The first step is simple: taking a photo with a camera or converting an analog image to a digital one. Acquisition moves the image from its original source onto a computer, and often includes light pre-processing such as scaling.

2. Enhancement or restoration

Edits to the image can start right away. This could include sharpening the image to remove blurry features, increasing the contrast to better see different parts of the image, or restoring areas of the image that may have been damaged.

3. Color processing

When working with color visuals, you might need corrections at this stage so that the final colors of the image match a standardized color chart as accurately as possible.

4. Wavelets and multi-resolution processing

Wavelets represent different parts of the image at various resolution levels. When an image is divided into its wavelets for compression and analysis, the computer has an easier time working on a smaller scale.
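A hedged, minimal sketch of the idea behind wavelets: a one-level Haar decomposition splits a signal into coarse pairwise averages and fine pairwise differences, so later stages can work at a smaller scale. The sample signal is made up; real wavelet codecs apply this recursively in two dimensions:

```python
# One-level Haar wavelet decomposition of a 1-D signal:
# pairwise averages give a half-resolution approximation,
# pairwise differences capture the detail lost by averaging.

def haar_step(signal):
    approx = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]   # perfectly reverses the decomposition
    return out

signal = [9, 7, 3, 5, 6, 10, 2, 6]
approx, detail = haar_step(signal)
print(approx)
print(detail)
```

Because the transform is exactly invertible, nothing is lost by analyzing (or compressing) the two half-size pieces instead of the original.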

5. Compression

Reducing the size of the image at this point in the process scales down the file size and simultaneously keeps the image quality as high as possible.

6. Morphological processing

Morphological processing analyzes and simplifies shapes within the image, using operations such as erosion and dilation to remove elements that aren’t needed for analysis or extraction. This reduces overall processing times.
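Morphological processing is usually built from shape-based operations such as erosion and dilation on a binary image. As a minimal sketch, here is erosion with a 3x3 square structuring element (the tiny binary image is invented):

```python
# Binary erosion with a 3x3 square structuring element: a pixel stays
# foreground (1) only if its entire 3x3 neighborhood is foreground.
# Erosion shrinks regions and deletes specks smaller than the element.

def erode(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(img[y + dy][x + dx] == 1
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

img = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = erode(img)
print(eroded)  # only the centre pixel of the 3x3 square survives
```

Pairing erosion with its dual, dilation, gives opening and closing, the workhorse operations for cleaning up masks before segmentation.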

7. Segmentation

At this important step, each region of the graphic is broken down into groups based on characteristics in the pixels. This helps discern different areas of the image.

8. Representation and description

This step helps find borders in segmented regions of the image. Attributes of these segmented regions are assigned during the description phase, which distinguishes one group from another.

9. Object detection

Once all of the image segments have been described and assigned, labels are added to let human users identify the different parts of the image. For instance, in a street scene, object detection differentiates between cars and street lamps and then labels them accordingly.

Hundreds of applications for image processing exist, from healthcare and agriculture to security and legal services.


Face and text recognition

Facial recognition software looks for comparisons between two images, usually between a person, or a live image of the person, and an ID, like a passport or driver’s license. This software can also be used for multi-factor authentication (MFA) for unlocking a phone, along with automatic tagging in photos on social media platforms.

This technology doesn’t just help with images. You can also turn to these tools to scan for recognizable patterns in both typed and handwritten text. The documents can then be entered into natural language processing (NLP) software for extraction, annotation, and review, just like with visuals.

Reverse image search

Have you ever done a reverse Google Images search? That’s powered by image processing technology. Reverse image searches assess the features in the original image and scan the web for similar or exact matches of that image elsewhere online.
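Reverse image search engines typically compare compact "fingerprints" rather than raw pixels. One simple fingerprint (a stand-in for whatever the real services use) is the average hash: mark each pixel as above or below the image mean, then compare fingerprints by Hamming distance. This sketch assumes images have already been downscaled to a tiny grayscale grid:

```python
# Average hash (aHash): threshold each pixel against the image mean to
# get a short bit string. Visually similar images yield similar bits,
# so a small Hamming distance between hashes suggests a match.

def average_hash(img):
    pixels = [px for row in img for px in row]
    mean = sum(pixels) / len(pixels)
    return [1 if px > mean else 0 for px in pixels]

def hamming(h1, h2):
    return sum(a != b for a, b in zip(h1, h2))

original  = [[200, 200, 10], [200, 15, 10], [5, 5, 10]]
near_dup  = [[190, 205, 12], [198, 20, 14], [9, 3, 11]]     # slight noise
unrelated = [[10, 10, 200], [10, 200, 200], [200, 200, 190]]

h0, h1, h2 = map(average_hash, (original, near_dup, unrelated))
print(hamming(h0, h1))  # small distance: likely the same image
print(hamming(h0, h2))  # large distance: a different image
```

The noisy near-duplicate hashes identically to the original, while the unrelated image lands far away, which is exactly the property a web-scale similarity search needs.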

Autonomous vehicle object detection

Self-driving vehicles must immediately and constantly sense possible hazards like pedestrians, buildings, and other cars to keep everyone safe. Object detection algorithms can quickly identify specific objects within the vehicle’s viewing radius, which triggers the car’s safety functions.

Medical imaging

From research to diagnosis to recovery, medical professionals apply image processing technology extensively. Healthcare workers detect tumors and other anomalies, while 3D image processing empowers surgeons to navigate the most complex parts of our anatomy.

Professionals across fields have found many benefits from using image processing tools. Just a few are mentioned here.

Increased accuracy

Image processing tools detect even the smallest detail, which makes finding errors much easier. Automating many of the steps in the image processing pipeline reduces human error. Many industries, like medicine and agriculture, put a lot of trust in the high level of precision that modern image processing offers.

Cost savings

Catching issues early in the process, like in product manufacturing or retail, means that businesses save money on correcting these later with recalls or returns. Image processing can be used for quality control to identify possible defects in products as they’re made, along with verifying information such as batch numbers or expiration dates. If errors are made during manufacturing but are spotted straight away, they can be fixed before going out to customers.

Real-time updates

When image processing tools are used in industries like security and surveillance, their ability to communicate real-time data can mark the difference between a criminal’s success and failure. This allows security teams to act quickly when responding to incidents.

Improved customer experience

Customer-facing fields, such as retail and hospitality, use image processing in a number of ways. This includes comparing a digital capture of inventory in a stockroom or warehouse against system inventory levels. 

This ensures that stock counts are accurate and gives managers the okay to reorder. Now, customers don’t have to wait as long for their items.

The introduction of AI to image processing has significantly changed the way many industries use this technology in their day-to-day. As algorithms become more sophisticated at training machines to think and process like humans, the applications for this technology continue to grow.

Using deep learning with image processing has cleared the path for computers to detect objects within an image and recognize patterns more accurately. The models we have today process and understand visual data much faster than traditional digital or analog image processing techniques. 

For many of the industries that already count on image processing, AI has improved efficiency by automating even the most complex tasks like segmentation and image enhancement.

Facial and object recognition are among the most widely used applications of AI image processing. Image generation also takes up space in this field by creating new work based on information from previously created visuals.

The process of digital image processing using AI

Engineers use ML techniques to harness the power of AI algorithms for interpreting visual data. The core functionality behind this process consists of neural networks: interconnected nodes arranged in a layered structure to mimic the way a human brain understands data. Once they’re in place, the algorithm can conduct its image processing using the following method.

  • Data collection. The first stage is gathering a large dataset of labeled or annotated images to train the algorithm on. They should relate closely to your project or task; more relevant data upfront increases the odds of accurate outcomes later. At this stage, images are also resized for consistency.
  • Pattern recognition. As training begins, the model starts to identify and distinguish patterns within the dataset.
  • Model training. Here, the neural network reviews the input dataset and all elements within it, like image labels or patterns. This information develops the neural network’s intelligence for use in future projects.
  • Feature extraction. Trained models should reach a point where they can start working on their own, including identifying the features of new, previously unseen images. Based on what the algorithm learned during the training phase, relevant features should now be recognizable. For instance, in facial recognition, neural networks should be able to pinpoint facial features like noses or eyes at this stage.
  • Validation. Think of this as the testing stage for all of the completed steps. You evaluate the model’s performance on a separate validation dataset to find inaccuracies and areas that need fine-tuning.
  • Inference. Once errors have been corrected, you introduce new images to the model for continued training. This builds on the previously learned patterns and allows the model to start making its own predictions about new visuals.
  • Learning and improvement. The process continues even after fully trained models have been deployed. Continual improvement through additional cycles of training with new data improves performance and raises accuracy over time.
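The loop above can be sketched end to end with a deliberately tiny stand-in model: a one-weight perceptron trained to label toy 2x2 "images" as bright or dark. This is a pedagogical simplification of the neural networks the article describes, with invented data, not a realistic architecture:

```python
# Minimal training sketch: data collection -> training -> inference,
# using a perceptron on the mean brightness of tiny 2x2 images.

def mean_brightness(img):
    pixels = [px for row in img for px in row]
    return sum(pixels) / len(pixels) / 255.0     # scale into [0, 1]

# "Data collection": labelled examples (1 = bright, 0 = dark).
train = [([[250, 240], [230, 245]], 1), ([[10, 20], [15, 5]], 0),
         ([[200, 220], [210, 190]], 1), ([[40, 30], [25, 35]], 0)]

w, b, lr = 0.0, 0.0, 1.0
for _ in range(20):                       # "model training" epochs
    for img, label in train:
        x = mean_brightness(img)
        pred = 1 if w * x + b > 0 else 0
        w += lr * (label - pred) * x      # perceptron update rule
        b += lr * (label - pred)

def predict(img):                         # "inference" on unseen images
    return 1 if w * mean_brightness(img) + b > 0 else 0

print(predict([[255, 245], [250, 240]]))  # expect 1 (bright)
print(predict([[5, 0], [10, 5]]))         # expect 0 (dark)
```

Validation would hold out a few labelled images and count how often `predict` agrees with their labels, exactly as the bullet list describes, just at toy scale.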

Data scientists use image processors and recognition tools to train image recognition models and to help engineers add image processing capabilities to existing software. These tools are an important part of machine learning and enable businesses to do more with their visual media.

To be included in the image recognition software category, platforms must:

  • Provide a deep learning algorithm specifically for image recognition 
  • Connect with image data pools to learn a specific solution or function 
  • Consume the image data as an input and provide an outputted solution 
  • Provide image recognition capabilities to other applications, processes, or services

Below are the top five leading image recognition software solutions from G2’s Summer 2024 Grid Report. Some reviews may be edited for clarity.

1. Google Cloud Vision API

Google Cloud’s Vision API is an image processing tool that can detect and classify multiple objects within images and helps developers leverage the power of machine learning. With pre-trained ML models, developers are able to classify images into millions of predefined categories for more efficient image processing.

What users like best:

“The best thing about the API is it is trained on a very huge dataset which makes the lives of developers easy as we can build great image recognition models with a very high accuracy without even having big data available with us.”

- Google Cloud Vision API Review, Saurabh D.

What users dislike:

“For low quality images, it sometimes gives the wrong answer as some food has the same color. It does not provide us the option to customize or train the model for our specific use case.”

- Google Cloud Vision API Review, Badal O.

2. Gesture Recognition Toolkit

With the Gesture Recognition Toolkit, developers can use existing datasets to complete real-time image processing quickly and easily. The toolkit is cross-platform and open source, making it easy for both new and experienced developers to benefit from others working on similar projects.

“I like how it is designed to work with real time sensor data and at the same time the traditional offline machine learning task. I like that it has a double precision float and can easily be changed to single precision, making it a very flexible tool.”

- Gesture Recognition Toolkit Review, Diana Grace Q.

“Gesture Recognition Toolkit has occasional lag and a less smooth implementation process.”

- Gesture Recognition Toolkit Review, Civic V.

3. SuperAnnotate

SuperAnnotate is a leading image annotation software, helping businesses to build, fine-tune, and iterate AI models with high-quality training data. The advanced annotation technology, data curation, automated features, and data governance tools enable you to build large scale AI models with predetermined datasets.

“The platform is very easy and intuitive to use. The user interface is friendly and everything is easy to find.”

- SuperAnnotate Review, Dani S.

“We have had some issues with custom workflows that the team implemented for specific projects on their platform.”

- SuperAnnotate Review, Rohan K.

4. Syte

Syte is a visual AI product discovery platform that uses camera search, a personalization engine, and in-store tools to help eCommerce and brick-and-mortar retail businesses connect shoppers with their products. The tools are instant and intuitive, making it easy for shoppers to discover and purchase products.

“The visual search discovery button is a great addition to our ecommerce site. I like that it helps customers find similar items visually for products that might not be in their size, thereby increasing conversion and the overall shopping experience. I also like that customers can adjust the visual search selection to encourage cross-shopping with other items featured in our images.”

- Syte Review, Lexis K.

“The backend merch platform is not the most intuitive as other platforms. The “complete the look” function doesn't showcase the exact products part of the look, only lookalikes.”

- Syte Review, Cristina F.

5. Dataloop

Dataloop allows developers to build custom algorithms and train data throughout all parts of the AI lifecycle. From management and annotation to model selection and deployment, Dataloop uses intuitive features to help you get the most out of your AI systems.

“DataLoop excels at constructing quality data infrastructure for unstructured data, streamlining computer-vision pipelines, and ensuring seamless integration with robust security measures.”

- Dataloop Review, George M.

“I have had challenges with some steep learning curves, infrastructure dependency, and customization limitations. These have in a way limited me in its usage.”

- Dataloop Review, Dennis R.


Picture this: perfect pixels every time!

Using AI to label, classify, and process your image can save your team time every month. Train your machine with the right functions and datasets so it becomes a customized worker that improves performance with accuracy and efficiency. 

Find the right data labeling software for your business and industry to turn unlabeled datasets into comprehensive inputs for your AI training.

Holly Landis

Holly Landis is a freelance writer for G2. She also works as a digital marketing consultant, focusing on on-page SEO, copy, and content writing. She works with SMEs and creative businesses that want to be more intentional with their digital strategies and grow organically on channels they own. As a Brit now living in the USA, you'll usually find her drinking copious amounts of tea in her cherished Anne Boleyn mug while watching endless reruns of Parks and Rec.



Medical imaging articles from across Nature Portfolio

Medical imaging comprises different imaging modalities and processes for imaging the human body for diagnostic and treatment purposes. It is also used to follow the course of a disease that has already been diagnosed and/or treated.

Related Subjects

  • Bone imaging
  • Brain imaging
  • Magnetic resonance imaging
  • Molecular imaging
  • Radiography
  • Radionuclide imaging
  • Three-dimensional imaging
  • Ultrasonography
  • Whole body imaging

Latest Research and Reviews


Automated gall bladder cancer detection using artificial gorilla troops optimizer with transfer learning on ultrasound images

  • Sana Alazwari
  • Jamal Alsamri
  • Ahmed S. Salama


AMD-SD: An Optical Coherence Tomography Image Dataset for wet AMD Lesions Segmentation


Integrating neural networks with advanced optimization techniques for accurate kidney disease diagnosis

  • Samar Elbedwehy
  • Esraa Hassan
  • Rady Elmonier


A comparative evaluation of deep learning approaches for ophthalmology

  • Glenn Linde
  • Waldir Rodrigues de Souza Jr
  • Sheng Chiong Hong


Interactive computer-aided diagnosis on medical image using large language models

Wang et al. developed a machine learning strategy for improving large language model to understand and analyse visual medical information. Their framework seamlessly integrates medical image computer-aided diagnosis networks with large language models, converting medical image inputs into a clear and concise textual summary of the patient’s condition.

  • Dinggang Shen


Non-invasive in vivo imaging of changes in Collagen III turnover in myocardial fibrosis

  • Nadia Chaher
  • Sara Lacerda
  • Alkystis Phinikaridou


News and Comment

MIDAS: a new platform for quality-graded health data for AI-enabled healthcare in India

  • Dibyajyoti Maity
  • Rohit Satish
  • Debnath Pal

BrECADD raises the bar in classical Hodgkin lymphoma

  • David Killock


Shielding sensitive medical imaging data

Differential privacy offers protection in medical image processing but is traditionally thought to hinder accuracy. A recent study offers a reality check on the relationship between privacy measures and the ability of an artificial intelligence (AI) model to accurately analyse medical images.

  • Gaoyang Liu


Mapping fibrosis pathways with MRI and genetic association analyses

We quantified liver, pancreas, heart and kidney fibrosis using MRI T1 mapping in over 40,000 individuals. Using genetic association analyses, we identified a total of 58 loci, 10 of which overlapped across organs. A high burden of fibrosis in three or more organs was associated with an increased risk of mortality.

Refractive shifts in astronauts during spaceflight: mechanisms, countermeasures, and future directions for in-flight measurements

  • Kelsey Vineyard
  • Andrew G. Lee


Adapting vision–language AI models to cardiology tasks

Vision–language models can be trained to read cardiac ultrasound images with implications for improving clinical workflows, but additional development and validation will be required before such models can replace humans.

  • Rima Arnaout


M.Tech/Ph.D Thesis Help in Chandigarh | Thesis Guidance in Chandigarh


What is Digital Image Processing?

Digital image processing is the use of computer algorithms to perform image processing on digital images. The latest research and thesis topics in digital image processing are based on these algorithms. As a subcategory of digital signal processing, digital image processing carries many advantages over analog image processing: it permits a wide range of algorithms to be applied to the input data, and it avoids problems such as the build-up of noise and signal distortion during processing. Because images are defined over two or more dimensions, digital image processing can be modeled as a multidimensional system. The history of digital image processing dates back to the early 1920s, when its first applications appeared. Many students choose this field for their M.Tech as well as Ph.D. theses. There are various thesis topics in digital image processing for M.Tech, M.Phil and Ph.D. students, listed below. Before going into topics in image processing, you should have some basic knowledge of image processing.


Latest research topics in image processing for research scholars:

  • A hybrid classification scheme for plant disease detection in image processing
  • An edge detection scheme in image processing using ant and bee colony optimization
  • An improved PNLM filtering scheme to denoise MRI images
  • A classification method for brain tumor detection
  • A CNN approach for lung cancer detection in image processing
  • A neural network method for diabetic retinopathy detection
  • A copy-move forgery detection approach using textural feature extraction
  • A face spoof detection method based on eigenfeature extraction and classification
  • A classification and segmentation method for number plate detection

Formation of Digital Images

Firstly, the image is captured by a camera using sunlight as the source of energy. For the acquisition of the image, a sensor array is used. These sensors sense the amount of light reflected by the object when light falls on that object. A continuous voltage signal is generated when the data is being sensed. The data collected is converted into a digital format to create digital images. For this process, sampling and quantization methods are applied. This will create a 2-dimensional array of numbers which will be a digital image.
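The sampling-and-quantization step described above can be sketched as mapping a continuous sensor voltage in [0, 1] onto a fixed number of discrete gray levels. The voltages below are invented, and four levels (2 bits per sample) is an arbitrary choice for illustration:

```python
# Quantization: map continuous sensor voltages (0.0-1.0) onto a fixed
# set of integer levels. Fewer levels mean smaller files but more banding.

def quantize(voltage, levels=4):
    """Map a value in [0, 1] to an integer level in [0, levels - 1]."""
    level = int(voltage * levels)
    return min(level, levels - 1)          # clamp the 1.0 edge case

samples = [0.05, 0.30, 0.55, 0.80, 1.00]   # one scan line of voltages
digital = [quantize(v) for v in samples]
print(digital)
```

Doing this for every sample on a 2-D sampling grid is exactly what produces the 2-dimensional array of numbers, the digital image, that the paragraph describes.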

Why is Image Processing Required?

Image processing serves the following main purposes:
  • Visualization of the hidden objects in the image.
  • Enhancement of the image through sharpening and restoration.
  • Seek valuable information from the images.
  • Measuring different patterns of objects in the image.
  • Distinguishing different objects in the image.

Applications of Digital Image Processing

There are various applications of digital image processing, any of which can also be a good thesis topic in image processing. Following are the main applications of image processing:
  • Image Processing is used to enhance the image quality through techniques like image sharpening and restoration. The images can be altered to achieve the desired results.
  • Digital Image Processing finds its application in the medical field for gamma-ray imaging, PET Scan, X-ray imaging, UV imaging.
  • It is used for transmission and encoding.
  • It is used in color processing in which processing of colored images is done using different color spaces.
  • Image Processing finds its application in machine learning for pattern recognition.

List of topics in image processing for thesis and research

There are various topics in digital image processing for thesis and research. Here is a list of the latest thesis and research topics in digital image processing:
  • Image Acquisition
  • Image Enhancement
  • Image Restoration
  • Color Image Processing
  • Wavelets and Multi Resolution Processing
  • Compression
  • Morphological Processing
  • Segmentation
  • Representation and Description
  • Object recognition
  • Knowledge Base

1. Image Acquisition:

Image acquisition is the first and most important step of digital image processing. In the simplest case, you are given an image that is already in digital form, and the step involves pre-processing such as scaling. Otherwise, it starts with the capture of an image by a sensor (such as a monochrome or color TV camera); if the output of the camera or sensor is not in digital form, an analog-to-digital converter (ADC) digitizes it. If the image is not properly acquired, you will not be able to achieve the tasks you want to. Customized hardware is used for advanced image acquisition techniques and methods; 3D image acquisition is one such advanced method. Students can pursue this method for their master’s thesis and research.

2. Image Enhancement:

Image enhancement is one of the easiest and most important areas of digital image processing. The core idea behind image enhancement is to bring out information that is obscured, or to highlight specific features of an image according to requirements, such as changing brightness and contrast. Basically, it involves manipulating an image to obtain something more suitable than the original for a specific application. Many algorithms have been designed to change an image’s contrast, brightness, and other such properties. Image enhancement aims to change human perception of the image. Image enhancement techniques are of two types: spatial domain and frequency domain.
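A minimal spatial-domain sketch of the brightness/contrast manipulation just described, using the common linear point operation new = gain * old + bias (the gain and bias values here are arbitrary):

```python
# Linear point operation for enhancement: out = gain * in + bias,
# clamped to the valid 0-255 range. gain > 1 raises contrast,
# bias > 0 raises brightness.

def adjust(img, gain=1.5, bias=20):
    return [[max(0, min(255, round(gain * px + bias))) for px in row]
            for row in img]

img = [[40, 100], [160, 220]]
print(adjust(img))  # dark pixels brighten; bright pixels clip at 255
```

Frequency-domain enhancement works on the image's spectrum instead of its pixels, but the goal, reshaping intensities to suit human perception, is the same.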

3. Image Restoration:

Image restoration involves improving the appearance of an image. In contrast to image enhancement, which is subjective, image restoration is completely objective, in the sense that restoration techniques are based on probabilistic or mathematical models of image degradation. Image restoration removes blur and noise from images to produce a clean image close to the original, which makes it a good choice for an M.Tech thesis on image processing. The image information lost during blurring is restored through a reversal process, unlike in image enhancement. A deconvolution technique, performed in the frequency domain, is used to correct the main defects that degrade an image.
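The frequency-domain deconvolution mentioned above can be sketched in one dimension (NumPy assumed): blur a signal by circular convolution with a known kernel, then divide by the kernel's spectrum to undo it. Real deblurring must also regularize against noise and near-zero spectrum values, e.g. with Wiener filtering, which this noise-free toy ignores:

```python
import numpy as np

# Deconvolution in the frequency domain: convolution in space is
# multiplication of spectra, so dividing the blurred spectrum by the
# (known, everywhere-nonzero) blur spectrum inverts the blur exactly.

signal = np.array([0.0, 0.0, 5.0, 9.0, 5.0, 0.0, 0.0, 0.0])
kernel = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])  # blur PSF

blurred = np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))
restored = np.real(np.fft.ifft(np.fft.fft(blurred) / np.fft.fft(kernel)))

print(np.allclose(restored, signal))   # True: the blur is undone
```

This is why restoration is called objective: given a mathematical model of the degradation (the kernel), the reversal is computed rather than judged by eye.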

4. Color Image Processing:

Color image processing has proved to be of great interest because of the significant increase in the use of digital images on the Internet. It includes color modeling and processing in a digital domain. Various color models are used to specify a color using a 3D coordinate system: the RGB, CMY, HSI, and YIQ models. Color image processing is done because humans can perceive thousands of colors. There are two areas of color image processing: full-color processing, in which the image is processed in full color, and pseudo-color processing, in which grayscale images are converted to colored images. It is an interesting topic in image processing.
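The pseudo-color processing mentioned above can be sketched as a lookup from grayscale intensity bands to RGB triples. The three-band palette below is an arbitrary illustrative choice, not a standard mapping:

```python
# Pseudo-color: assign an RGB color to each grayscale intensity band,
# so structures with similar brightness stand out in a distinct hue.

PALETTE = [
    (0, 0, 128),      # dark pixels   -> navy
    (0, 200, 0),      # mid pixels    -> green
    (255, 255, 0),    # bright pixels -> yellow
]

def pseudo_color(gray_img):
    def color(px):                        # bands: 0-85 / 86-170 / 171-255
        return PALETTE[min(px // 86, 2)]
    return [[color(px) for px in row] for row in gray_img]

gray = [[10, 120], [240, 90]]
print(pseudo_color(gray))
```

This is the trick behind false-color renderings of thermal or medical scans: the data stays grayscale, and only the display mapping adds color.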

COMMENTS

  1. Image processing

    Image processing is manipulation of an image that has been digitised and uploaded into a computer. ... Latest Research and Reviews. ... are having a substantial impact on many areas of applied ...

  2. Recent Trends in Image Processing and Pattern Recognition

    The 5th International Conference on Recent Trends in Image Processing and Pattern Recognition (RTIP2R) aims to attract current and/or advanced research on image processing, pattern recognition, computer vision, and machine learning. The RTIP2R will take place at the Texas A&M University—Kingsville, Texas (USA), on November 22-23, 2022, in ...

  3. Deep learning models for digital image processing: a review

    This article surveys various deep learning methods for image restoration, enhancement, segmentation, feature extraction, and classification. It compares and contrasts the strengths and limitations of different models, and provides evaluation metrics and applications for each domain.

  4. Image processing

    Read the latest Research articles in Image processing from Nature Methods. ... We also speculate about potential approaches and areas of focus to overcome these challenges and thus build the ...

  5. Developments in Image Processing Using Deep Learning and Reinforcement

    This article reviews the developments and applications of artificial intelligence (AI) methods, such as deep learning and reinforcement learning, in image processing. It covers the challenges, benefits, and future directions of using AI to analyze and extract information from images in various domains.

  6. Editorial: Current Trends in Image Processing and Pattern Recognition

    This article is part of the Research Topic Current Trends in Image Processing and Pattern Recognition View all ... authors introduced a new deep learned quantization-based coding for 3D airborne LiDAR point cloud image. In their experimental results, authors showed that their model compressed an image into constant 16-bits of data and ...

  7. Image processing articles within Scientific Reports

    Read the latest Research articles in Image processing from Scientific Reports. ... Impact of acquisition area on deep-learning-based glaucoma detection in different plexuses in OCTA.

  8. Current Trends in Image Processing and Pattern Recognition

    The international conference on Recent Trends in Image Processing and Pattern Recognition (RTIP2R) aims to attract researchers working on promising areas of image processing, pattern recognition, computer vision, artificial intelligence, and machine learning. This special Research Topic, part of Frontiers in Robotics and AI, welcomes original ...

  9. Research Topics

    Research Topics. Biomedical Imaging. The current plethora of imaging technologies such as magnetic resonance imaging (MR), computed tomography (CT), position emission tomography (PET), optical coherence tomography (OCT), and ultrasound provide great insight into the different anatomical and functional processes of the human body. Computer Vision.

  10. Recent trends in image processing and pattern recognition

    The Call for Papers of the special issue was initially sent out to the participants of the 2018 conference (2nd International Conference on Recent Trends in Image Processing and Pattern Recognition). To attract high quality research articles, we also accepted papers for review from outside the conference event.

  11. Research Areas in Computer Vision: Trends and Challenges

    Learn about the key concepts, technologies, applications, and challenges of computer vision, a field of artificial intelligence that trains computers to interpret and understand the visual world. Explore the research areas of augmented reality, robotic language-vision models, and autonomous mobile robots in 2024.

  12. (PDF) Advances in Artificial Intelligence for Image Processing

    AI has had a substantial influence on image processing, allowing cutting-edge methods and uses. The foundations of image processing are covered in this chapter, along with representation, formats ...

  13. Trends and Advancements of Image Processing and Its Applications

    The authors include new concepts of color space transformation like color interpolation, among others. Also, the concept of Shearlet Transform and Wavelet Transform and their implementation are discussed. ... Presents developments of current research in various areas of image processing; Includes applications of image processing in remote ...

  14. Image Processing: Research Opportunities and Challenges

    Interest in digital image processing methods stems from two principal application areas: improvement of pictorial information for human interpretation; and processing of image data for storage ...

  15. Recent advances and clinical applications of deep learning in medical

    Abstract. Deep learning has received extensive research interest in developing new medical image processing algorithms, and deep learning based models have been remarkably successful in a variety of medical imaging tasks to support disease detection and diagnosis. Despite the success, the further improvement of deep learning models in medical ...

  16. Image Processing

    Learn about various processes and applications of image processing in engineering, such as enhancement, segmentation, registration, and classification. Explore chapters and articles on topics like firefly algorithm, demons registration, and neutron radiography.

  17. Top 10 Digital Image Processing Project Topics

    This field offers numerous Digital Image Processing project topics for current and upcoming scholars. Below are some research ideas for classifying, analyzing, representing, and displaying images or particular characteristics of an image.

  18. Grand Challenges in Image Processing

    Introduction. The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging ...

  19. 2024 IEEE International Conference on Image Processing Explores the

    ICIP 2024 focuses on trustworthy visual data processing and covers cutting-edge topics involving computer vision. PISCATAWAY, N.J., Sept. 17, 2024 /PRNewswire/ -- IEEE, the world's largest ...

  20. Modern Trends and Applications of Intelligent Methods in Biomedical

    The Special Issue "Modern Trends and Applications of Intelligent Methods in Biomedical Signal and Image Processing" is aimed at new proposals and intelligent solutions that constitute the state of the art of intelligent methods for biomedical data processing in selected areas of signal and image processing. This Special Issue ...

  21. What are the latest research topics in Image Processing nowadays?

    There are a number of recent research topics in the broad areas of object tracking, facial expression recognition, retrieval from image databases, medical imaging ...

  22. What is Image Processing? Examples, Types, and Benefits

    What is image processing? Image processing is a group of methods used to understand, interpret, and alter visual data. Two-dimensional digital images are made up of pixels, small units that each contain bits of information about the shade, color, and opacity of that specific part of the visual. Together, all of the pixels make up the full image.
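    The pixel-grid model described above can be sketched in a few lines of Python. This is a minimal illustration (not drawn from any of the cited articles): a grayscale image is represented as a 2D list of intensities in 0–255, and "altering visual data" is shown with a simple brightness adjustment that operates on every pixel.

    ```python
    # A grayscale image as a 2D list of pixel intensities (0-255).
    # Each number is one pixel's shade; together they form the image.

    def brighten(image, delta):
        """Return a new image with every pixel raised by `delta`,
        clamped to the valid 0-255 intensity range."""
        return [[min(255, max(0, px + delta)) for px in row] for row in image]

    # A tiny 2x3 "image".
    image = [[10, 120, 250],
             [0,  60,  200]]

    print(brighten(image, 40))  # → [[50, 160, 255], [40, 100, 240]]
    ```

    Real image-processing code typically uses array libraries such as NumPy for this, where the same clamped addition applies to the whole pixel grid at once; the list-based version here only illustrates the underlying pixel model.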

  23. Medical imaging

    Research, Open Access, 10 Sept. 2024, Scientific Reports, vol. 14, p. 21054: "Label-free functional imaging of vagus nerve stimulation-evoked potentials at the cortical surface"

  24. Latest thesis topics in digital image processing | Research ...

    Latest research topics in image processing for research scholars: a hybrid classification scheme for plant disease detection; edge detection using ant and bee colony optimization; improving the PNLM filtering scheme to denoise MRI images; and classification methods for brain tumor detection.

  25. What are the new research areas in Image Processing and Machine

    Hiba J. Aleqabie (University of Kerbala): a new area is the overlap of image processing with machine learning, in terms of methods for extracting features and cutting using deep ...