ICDM 2024 Call for papers

Aims and Scope

The IEEE International Conference on Data Mining (ICDM) has established itself as the world’s premier research conference in data mining. It provides an international forum for sharing original research results, as well as exchanging and disseminating innovative and practical development experiences. The conference covers all aspects of data mining, including algorithms, software, systems, and applications. ICDM draws researchers, application developers, and practitioners from a wide range of data mining related areas such as big data, deep learning, pattern recognition, statistical and machine learning, databases, data warehousing, data visualization, knowledge-based systems, high-performance computing, and large models. By promoting novel, high-quality research findings, and innovative solutions to challenging data mining problems, the conference seeks to advance the state-of-the-art in data mining.

Topics of interest

Topics of interest include, but are not limited to:

  • Foundations, algorithms, models, and theory of data mining, including big data mining.
  • Deep learning and statistical methods for data mining.
  • Mining from heterogeneous data sources, including text, semi-structured, spatio-temporal, streaming, graph, web, and multimedia data.
  • Data mining systems and platforms, and their efficiency, scalability, security, and privacy.
  • Data mining for modelling, visualization, personalization, and recommendation.
  • Data mining for cyber-physical systems and complex, time-evolving networks.
  • Advantages and potential limitations of data mining with large models.
  • Applications of data mining in social sciences, physical sciences, engineering, life sciences, climate science, web, marketing, finance, precision medicine, health informatics, and other domains.

We particularly encourage submissions in emerging topics of high importance such as ethical data analytics, automated data analytics, data-driven reasoning, interpretable modeling, modeling with evolving environments, multi-modal data mining, and heterogeneous data integration and mining.

Submission Guidelines

Authors are invited to submit original papers, which have not been published elsewhere and which are not currently under consideration for another journal, conference or workshop.

Paper submissions should be limited to a maximum of ten (10) pages, in the IEEE 2-column format ( https://www.ieee.org/conferences/publishing/templates.html ), including the bibliography and any possible appendices. Submissions longer than 10 pages will be rejected without review. All submissions will be triple-blind reviewed by the Program Committee on the basis of technical quality, relevance to scope of the conference, originality, significance, and clarity. The following sections give further information for authors.

Triple-blind submission guidelines

Since 2011, ICDM has imposed a triple-blind submission and review policy for all submissions. Authors must therefore not include identifying information in the text of the paper, and must cite references in a way that preserves anonymity.

What is triple-blind reviewing?

Traditional blind reviewing hides the referee names from the authors, and double-blind reviewing also hides the author names from the referees. Triple-blind reviewing further hides the referee names from one another during paper discussions before acceptance decisions are made. The names of authors and referees remain known only to the PC Co-Chairs, and the author names are disclosed only after the ranking and acceptance of submissions are finalized. It is imperative that all authors of ICDM submissions conceal their identity and affiliation information in their paper submissions. It is not sufficient to remove the author names and affiliations from the first page; identifying information must also be removed from the content of each paper submission.

How to prepare your submissions

The authors shall omit their names from the submission. For formatting templates with author and institution information, simply replace all these information items in the template by “Anonymous”.

In the submission, authors should refer to their own prior work as they would to the prior work of any other author, and include all relevant citations. This can be done either by referring to their prior work in the third person or by referencing papers generically. For example, if your name is Smith and you have worked on clustering, instead of saying “We extend our earlier work on distance-based clustering (Smith 2005),” you might say “We extend Smith’s earlier work (Smith 2005) on distance-based clustering.” In addition:

  • Exclude citations to your own work that is not fundamental to understanding the paper, including prior versions (e.g., technical reports, unpublished internal documents) of the submitted paper. Do not write “In our previous work [3],” as it reveals that citation 3 was written by the current authors.
  • Remove mentions of funding sources, personal acknowledgments, and other auxiliary information that could be related to your identity. These can be reinstated in the camera-ready copy once the paper is accepted for publication.
  • Keep statements about well-known or unique systems that could identify an author as vague as possible with respect to the authors’ identity.
  • Name the submitted files with care so that the file names do not compromise author anonymity. For example, do not name your submission “Smith.pdf”; instead, give it a name that is descriptive of the title of your paper, such as “ANewApproachtoClustering.pdf” (or a shorter version of the same).

Algorithms and resources used in a paper should be described as completely as possible to allow reproducibility. This includes experimental methodology, empirical evaluations, and results. Authors are strongly encouraged to make their code and data publicly available whenever possible. In addition, authors are strongly encouraged to also report, whenever possible, results for their methods on publicly available datasets.

Accepted papers will be published in the conference proceedings by the IEEE Computer Society Press. All manuscripts are submitted as full papers and are reviewed based on their scientific merit. There is no separate abstract submission step. There are no separate industrial, application, short paper or poster tracks during submission. Manuscripts must be submitted electronically in the online submission system ( https://www.wi-lab.com/cyberchair/2024/icdm24/scripts/submit.php?subarea=DM ). We do not accept email submissions.

Reproducibility guidelines

Authors must complete a reproducibility checklist at the time of paper submission (the questions are available in PDF format). Authors are strongly encouraged to start thinking about these questions while writing the paper and to fill in the questionnaire well before the submission deadline. These responses will become part of each paper submission and will be shared with the area chairs and/or reviewers to help them in the evaluation process. Authors are encouraged to include in their papers all technical details (proofs, descriptions of assumptions, algorithm pseudocode) as well as information about each reproducibility criterion, as appropriate. Reviewers will be asked to assess the degree to which the results reported in a paper are reproducible, and this assessment will be weighed when making final decisions about each paper.

Best Paper Awards

Awards will be conferred at the conference to the authors of the best paper and the best student paper. A selected number of best papers will be invited for possible inclusion, in an expanded and revised form, in the Knowledge and Information Systems journal ( http://kais.bigke.org/ ) published by Springer.

ICDM is a premier forum for presenting and discussing current research in data mining. Therefore, at least one author of each accepted paper must complete the conference registration and present the paper at the conference, in order for the paper to be included in the proceedings and conference program. The exact format of the conference (in person, online, or hybrid) will be decided later.

For queries regarding this call, please contact: Dr. Mingming Gong ( [email protected] ).

Data Mining and Big Data Analytics Technical Committee

The Data Mining and Big Data Analytics Technical Committee (DMTC) is established to (1) promote research, development, education, and understanding of the principles and applications of data mining and big data analytics, and (2) help researchers whose background is primarily in computational intelligence increase their contributions to this area.

The DMTC shall engage in various activities in order to advance the goals described above, including but not limited to the following:

  • Identify and promote new areas of research
  • Propose special sessions to the CIS-sponsored conference organizers
  • Publicize success stories on solving real data mining problems
  • Participate in paper review and selection for CIS-sponsored conferences and publications
  • Recommend candidates/papers for awards and collaborate on production of tutorials and book series with the Multimedia Committee
  • The DMTC will also assist in soliciting proposals for focused workshops or special sessions and actively work with the organizers of CIS-sponsored conferences to ensure their technical excellence

Committee Membership

  • Grant Scott, Chair
  • Ata Kaban and Handing Wang, Vice Chairs

Current directory of Chairs and members: Data Mining and Big Data Analytics Technical Committee Members

DMTC members are appointed to serve a one-year term. Reappointment is allowed.

The Chair is appointed by the CIS President. Two Vice Chairs are appointed by the Chair and endorsed by the CIS VP of Technical Activities. TC Members are appointed by the Chair and endorsed by the CIS VP of Technical Activities. (The number of members varies and is typically fewer than 20.)

If you would like more information on the activities of the Data Mining and Big Data Analytics Technical Committee, send an email to [email protected]. If you are interested in being considered for a position on the committee, please include a link to your CV.

Task Forces

  • Chair: Longzhi Yang
  • Chair: Longbin Cao
  • Chair: Gang Li
  • Chair: Sandra Ortega-Martorell
  • Chair: Boudewijn van Dongen
  • Chair: Weiping Ding
  • Chair: Kohei Nakajima
  • Chair: Alfredo Vellido
  • Chair: Ata Kaban

Updated: 5/9/2024

SSCI 2023

IEEE Symposium on Computational Intelligence in Data Mining (CIDM)

IEEE CIDM 2023, organized by the IEEE Computational Intelligence Society Data Mining Technical Committee, is one of the largest and best-attended symposia of the IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2023). IEEE CIDM 2023 will bring together researchers and practitioners from around the world to discuss the latest advances in computational intelligence applied to data mining, and will act as a major forum for the presentation of recent results in theory, algorithms, systems, and applications.

Topics related to all aspects of data mining and machine learning, such as theories, algorithms, systems and applications, particularly those based on computational intelligence technologies, are welcome; these include, but are not limited to:

  • Neural networks for data mining
  • Evolutionary algorithms for data mining
  • Fuzzy sets for data mining
  • Data mining with soft computing
  • Foundations of data mining
  • Mining with big data
  • Classification, Clustering, Regression
  • Association
  • Feature learning and feature engineering
  • Machine learning algorithms
  • Mining from streaming data
  • Deep learning
  • Data mining from nonstationary and drifting environments
  • Multimedia data mining
  • Text mining
  • Link and graph mining
  • Social media mining
  • Collaborative filtering
  • Crowd sourcing
  • Personalization
  • Security, privacy and social impact of data mining
  • Data mining applications

Symposium Chairs

  • Zhen Ni, Florida Atlantic University (zhenni@fau.edu)

Programme Committee

  • Tufan Kumbasar, Istanbul Technical University
  • Wil van der Aalst, Eindhoven University of Technology
  • Sansanee Auephanwiriyakul, Chiang Mai University
  • Ahmad Taher Azar, Benha University
  • Giacomo Boracchi, Politecnico di Milano
  • Qi Chen, Victoria University of Wellington
  • Keeley Crockett, Manchester Metropolitan University
  • Weiping Ding, Nantong University
  • Gregory Ditzler, The University of Arizona
  • Haibo He, University of Rhode Island
  • Bach Nguyen Hoai, Victoria University of Wellington
  • Ting Hu, Queen’s University
  • Yonghong (Catherine) Huang, McAfee AI Research
  • Ata Kaban, University of Birmingham
  • Gang Li, Deakin University
  • Yun Li, i4AI Ltd
  • Jane Jing Liang, Zhengzhou University
  • Simone Ludwig, North Dakota State University
  • Paulo Lisboa, Liverpool John Moores University
  • Patricia Melin, Tijuana Institute of Technology
  • Sanaz Mostaghim, Otto von Guericke University of Magdeburg
  • Su Nguyen, Hoa Sen University
  • Yonghong Peng, University of Sunderland
  • Robi Polikar, Rowan University
  • Kai Alex Qin, Swinburne University of Technology
  • Marek Reformat, University of Alberta, Canada
  • Manuel Roveri, Politecnico di Milano
  • Antonio Tallon
  • Alfredo Vellido, Universitat Politècnica de Catalunya (UPC BarcelonaTech)
  • Handing Wang, Xidian University
  • Lipo Wang, Nanyang Technological University
  • Anna M. Wilbik, Eindhoven University of Technology
  • Guandong Xu, University of Technology Sydney
  • Gary G. Yen, Oklahoma State University
  • Yang Yu, Nanjing University
  • Zhiwen Yu, South China University of Technology
  • Mengjie Zhang, Victoria University of Wellington
  • Dongbin Zhao, Institute of Automation, Chinese Academy of Sciences

Data mining articles from across Nature Portfolio

Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods to organise, examine and combine large data sets, including machine learning, visualisation methods and statistical analyses. Data mining is used in computational biology and bioinformatics to detect trends or patterns without knowledge of the meaning of the data.

Discrete latent embeddings illuminate cellular diversity in single-cell epigenomics

CASTLE, a deep learning approach, extracts interpretable discrete representations from single-cell chromatin accessibility data, enabling accurate cell type identification, effective data integration, and quantitative insights into gene regulatory mechanisms.

Latest Research and Reviews

Expression characteristics of lipid metabolism-related genes and correlative immune infiltration landscape in acute myocardial infarction

  • Xiaorong Hu

Multi role ChatGPT framework for transforming medical data analysis

  • Haoran Chen
  • Shengxiao Zhang

A tensor decomposition reveals ageing-induced differences in muscle and grip-load force couplings during object lifting

  • Seyed Saman Saboksayr
  • Ioannis Delis

Research on coal mine longwall face gas state analysis and safety warning strategy based on multi-sensor forecasting models

  • Haoqian Chang
  • Xiangrui Meng

PDE1B, a potential biomarker associated with tumor microenvironment and clinical prognostic significance in osteosarcoma

  • Qingzhong Chen
  • Chunmiao Xing
  • Zhongwei Qian

A real-world pharmacovigilance study on cardiovascular adverse events of tisagenlecleucel using machine learning approach

  • Juhong Jung
  • Ju Hwan Kim
  • Ju-Young Shin

News and Comment

Discovering cryptic natural products by substrate manipulation

Cryptic halogenation reactions result in natural products with diverse structural motifs and bioactivities. However, these halogenated species are difficult to detect with current analytical methods because the final products are often not halogenated. An approach to identify products of cryptic halogenation using halide depletion has now been discovered, opening up space for more effective natural product discovery.

  • Ludek Sehnal
  • Libera Lo Presti
  • Nadine Ziemert

Chroma is a generative model for protein design

  • Arunima Singh

Efficient computation reveals rare CRISPR–Cas systems

A study published in Science develops an efficient mining algorithm to identify and then experimentally characterize many rare CRISPR systems.

SEVtras characterizes cell-type-specific small extracellular vesicle secretion

Although single-cell RNA-sequencing has revolutionized biomedical research, exploring cell states from an extracellular vesicle viewpoint has remained elusive. We present an algorithm, SEVtras, that accurately captures signals from small extracellular vesicles and determines source cell-type secretion activity. SEVtras unlocks an extracellular dimension for single-cell analysis with diagnostic potential.

Protein structural alignment using deep learning

Data Mining

Featured article.

Guide to the top data mining algorithms

In today’s ever-expanding technological environment, companies in sectors such as banking, retail, and social media store large batches of data online and across many systems. Companies can make use of this data and benefit from it through data mining. This article explores how data mining algorithms work and how you can use them. It also looks at some of the top data mining algorithms available today.

To start, data mining is an important step in the larger process of knowledge discovery. It is the process of exploring and analyzing large data sets for patterns, relationships, and trends. Companies engage in data mining to gain useful business insights. For example, a company might use data mining to analyze a group’s buying habits, bank transactions, or medical history to predict the group’s future actions.

People sometimes confuse data mining with data harvesting. However, data harvesting is the process of extracting and analyzing data from online sources. Data mining does not involve “harvesting” data. Instead, it centers on examining data to produce new information.

To do so, data mining typically uses a machine learning method called supervised learning. Supervised learning “teaches” algorithms new processes in data review and analysis. Typically, a supervised learning algorithm views data, applies conditions, acts on the data, and produces results. It then applies the same process and reasoning to new data.

What is an algorithm in data mining?

In general, algorithms employ a series of steps or rules to process data and produce a specific outcome, result, or prediction. Within data mining, algorithms perform functions such as analyzing, classifying, and forecasting data and monitoring data trends.

Many data mining algorithms are supervised learning algorithms. They draw on statistics, probability, and artificial intelligence to explore data and generate results that benefit companies, industries, and organizations all over the world.

Supervised learning algorithms and other supervised learning methods depend on labeled data. The labeled data includes the algorithm’s expected data output. A simple example is a picture of a dog labeled with the word “dog.” Labeled data helps the algorithm “learn” patterns in the data and later apply these patterns to unlabeled sets of data. In the above example, a labeled picture of a dog could help an algorithm recognize other images of dogs.
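The idea of learning from labeled data can be sketched in a few lines of Python. The example below is a minimal illustration, not a production method: it uses a one-nearest-neighbor rule, and the features (weight and height), the sample values, and the function name predict_label are all assumptions made up for this sketch.

```python
import math

# Hypothetical labeled data: (weight_kg, height_cm) paired with the expected output.
labeled_examples = [
    ((30.0, 60.0), "dog"),
    ((25.0, 55.0), "dog"),
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
]


def predict_label(features, examples):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    nearest = min(examples, key=lambda example: math.dist(features, example[0]))
    return nearest[1]


# An unlabeled animal gets the label of the most similar labeled one.
prediction = predict_label((28.0, 58.0), labeled_examples)
```

Here the labeled examples play the role of the labeled dog picture: the algorithm carries a known pattern over to data it has not seen before.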

Having learned from labeled data, supervised learning algorithms can predict or estimate unknown quantities for new data. This is possible as long as the calculations are based on prior patterns in known data.

What is the role of the algorithm in data mining?

Data mining algorithms process large groups of data to produce certain statistical analyses or results for businesses, industries, or organizations. As such, they are a vital part of the data mining process.

A data mining algorithm’s role depends on the expectations of a user, creator, or investor. As we noted previously, many data mining algorithms conduct data analysis on large data sets. Across many fields, data mining algorithms can analyze audio, textual, and visual data according to demographic factors such as age, gender, and income.

For instance, a shoe company might develop a data mining algorithm to uncover the percentage of the company’s stock that women between the ages of twenty-five and thirty own. An organization within the medical field might use data mining algorithms to conduct research on certain diseases and their impacts on different groups of patients. A social media company might use a data mining algorithm to provide facial tagging suggestions.

All of these examples rely on data points, or criteria, that work with a data mining algorithm to produce the best or closest desired outcome.

Many types of data mining algorithms exist to analyze and interpret data and help achieve desired results. Examples include decision trees, support vector machines, k-nearest neighbors, and neural networks. We will discuss these in more depth later.

However, despite this variety in data mining algorithms, the basic underlying process for all of them is similar. Regardless of its specific purpose, a data mining algorithm’s process—taking data and producing a result—remains the same.

What are the main components of an algorithm for data mining?

At a basic level, data mining algorithms contain different elements that, in combination, lead to a result. The main components of data mining algorithms include data, conditions, and expectations (i.e., end goals).

As we discussed previously, a data mining algorithm relies on data to operate. This data usually comes in the form of large data sets that the algorithm reviews and breaks down into smaller data sets. The algorithm breaks down and analyzes data in relationship to variables. Examples of variables include age, gender, salary, and location.

Different types of variables produce different results. Three types of variables that algorithms use are discrete, continuous, and categorical.

Discrete variables consist of countable values. An example is the number of people who attended a concert. Continuous variables, in contrast, can take any value within a range. Examples include the length of a piece of equipment and the date and time a company receives a payment. Categorical variables contain a finite number of categories or groups. These variables aren’t dependent on order. Examples include material type and payment method.
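As a rough illustration, the three kinds of variables might appear in a single record as follows. The class name TicketSale, its field names, and the set of payment methods are hypothetical choices for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed set of categories; a real system would define its own.
PAYMENT_METHODS = ("cash", "card", "transfer")


@dataclass
class TicketSale:
    tickets_sold: int        # discrete: a countable whole number
    received_at: datetime    # continuous: any point in time
    payment_method: str      # categorical: one of a fixed, unordered set

    def __post_init__(self):
        # Reject values that do not fit the variable's type.
        if self.tickets_sold < 0:
            raise ValueError("tickets_sold must be a non-negative count")
        if self.payment_method not in PAYMENT_METHODS:
            raise ValueError(f"unknown payment method: {self.payment_method}")


sale = TicketSale(3, datetime(2024, 5, 9, 14, 30), "card")
```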

In an algorithm, multiple variables work together to create conditions. When the algorithm applies certain conditions, it produces specific results.

For example, say a company wants to see how many elderly customers buy a certain type of toothpaste. Conditions of the data mining algorithm would include variables such as customer age, toothpaste type, and purchase confirmation. When the algorithm applies these conditions, it can generate the company’s desired result.
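The toothpaste example can be sketched as a simple filter over purchase records. The records, the age threshold of 65, and the function name count_matching are assumptions for illustration only.

```python
# Hypothetical purchase records; the values are made up.
purchases = [
    {"customer_age": 72, "toothpaste_type": "sensitive", "purchase_confirmed": True},
    {"customer_age": 35, "toothpaste_type": "sensitive", "purchase_confirmed": True},
    {"customer_age": 68, "toothpaste_type": "whitening", "purchase_confirmed": True},
    {"customer_age": 80, "toothpaste_type": "sensitive", "purchase_confirmed": False},
    {"customer_age": 66, "toothpaste_type": "sensitive", "purchase_confirmed": True},
]

ELDERLY_MIN_AGE = 65  # assumed definition of "elderly" for this sketch


def count_matching(records, toothpaste_type, min_age=ELDERLY_MIN_AGE):
    """Count records that satisfy all three conditions at once."""
    return sum(
        1
        for record in records
        if record["customer_age"] >= min_age
        and record["toothpaste_type"] == toothpaste_type
        and record["purchase_confirmed"]
    )


elderly_sensitive_buyers = count_matching(purchases, "sensitive")  # 2
```

Each variable contributes one condition, and only records that meet every condition count toward the result.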

Many data mining algorithms also use conditional probability to generate outcomes. Conditional probability involves events and if/then instances. We can look at a coin toss to illustrate this. If I toss a coin, then it will land as either heads or tails. The algorithm “learns” how to understand and use this conditional “language” in order to seek out a specific outcome. Through such language, for example, an algorithm can identify the face of a specific person in a database.

Often, data mining algorithms incorporate Bayes’s theorem of conditional probability and predictive analysis into their data mining processes. Bayes’s eighteenth-century theorem describes how the probability of one event changes once another, related event is known to have happened. Although the theorem is dated, its if/then concepts and approaches remain useful to determine outcomes today. Likewise, predictive analytics offers another way to estimate future outcomes for other large data sets.
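A worked example may make the theorem more concrete. Bayes’s theorem states that P(A | B) = P(B | A) P(A) / P(B). The sketch below computes this for a hypothetical screening test; the probabilities and the function name bayes_posterior are assumptions chosen only to show the arithmetic.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A | B) using Bayes's theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B | A) P(A) + P(B | not A) P(not A).
    """
    evidence_b = (
        likelihood_b_given_a * prior_a
        + likelihood_b_given_not_a * (1.0 - prior_a)
    )
    if evidence_b == 0:
        raise ValueError("P(B) is zero, so the posterior is undefined")
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical numbers: 1% of people have a condition, the test flags 90% of
# true cases, and it wrongly flags 5% of people without the condition.
posterior = bayes_posterior(0.01, 0.90, 0.05)
```

With these made-up numbers the posterior is roughly 15 percent: a positive result raises the probability from 1 percent, but most positives still come from the much larger group without the condition.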

How do you write an algorithm for data mining?

Data experts and programmers create data mining algorithms through careful thought, planning, and execution. By establishing input variables, conditions, and output variables, they create algorithms that produce models from data. These models can then predict future data outcomes based on past incidences.

It is important to keep in mind that data programmers write different types of algorithms to create data mining models with specific end goals in mind. Examples of data mining models include the following:

  • A classification model to label loan applicants as low, medium, or high credit risk
  • A decision tree to predict whether a particular consumer will like a product and describe how factors like age and gender will determine product popularity
  • A mathematical model to forecast product sales
  • A set of rules to explain the probabilities that a consumer will purchase a group of products together
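The last item in this list, rules about products bought together, can be illustrated with the support and confidence measures commonly used for association rules. The baskets below are made up, and the function names are hypothetical.

```python
# Hypothetical shopping baskets, one set of products per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        raise ValueError("the antecedent never occurs in the transactions")
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / antecedent_support


# How often do baskets with bread also contain butter?
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
```

In this toy data, bread appears in four of five baskets and bread with butter in three, so the rule "bread implies butter" has a confidence of about 0.75.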

Classifications are ways of breaking down and comparing data points. For example, the solar system breaks down into classifications such as planets, moons, and stars. If an algorithm tried to label a specific object in our solar system, it would likely consider these different classifications and their connections to each other in its analytical process.

Decision trees resemble classification models. They start with a main idea that breaks down into several other related ideas when an algorithm applies certain factors. In turn, those ideas break down further as the algorithm applies more conditions. Eventually, these “branches” of ideas lead to an end result.
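The branching described here can be sketched as a small tree that is walked from the root to a leaf. This tree is written by hand rather than learned from data, and its features, thresholds, and labels are hypothetical.

```python
# Internal nodes test one feature against a threshold; leaves hold the result.
tree = {
    "feature": "age",
    "threshold": 30,
    "below": {
        "feature": "visits_per_month",
        "threshold": 4,
        "below": {"label": "unlikely to buy"},
        "at_or_above": {"label": "likely to buy"},
    },
    "at_or_above": {"label": "unlikely to buy"},
}


def classify(node, record):
    """Follow one branch per condition until a leaf is reached."""
    while "label" not in node:
        if record[node["feature"]] < node["threshold"]:
            node = node["below"]
        else:
            node = node["at_or_above"]
    return node["label"]


result = classify(tree, {"age": 25, "visits_per_month": 6})
```

A tree-learning algorithm would choose the features and thresholds automatically from labeled data; the traversal at prediction time works the same way.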

Both human and machine learning use decision trees as part of the decision-making process. Decision trees present data simply and linearly. For this reason, they represent a key approach to data mining.

Mathematical processes are key to identifying correlations in large data sets and then creating predictions. Linear algebra and probability, for example, play an important role in some data mining models.

Rules are also important in data mining models. These rules tell a data mining algorithm where it should act first. Consider, for example, a situation in which you need to know the probability of event A to predict the likelihood that event B will happen because of event A. An important rule in the equation would instruct the algorithm to discern event A’s probability before proceeding to any other calculations.

Once operating, many data mining algorithms work independently, without human oversight. That’s what makes them part of the machine learning family. However, someone must first set up the algorithm, supply it with labeled training data, and make adjustments as necessary. Learning from labeled data in this way is what makes these algorithms supervised learning algorithms.

How to use data mining algorithms

Various industries use data mining algorithms for research, investigation, and analytical purposes. These algorithms produce useful insights from the large data sets that companies have at their disposal.

An example of a field employing data mining algorithms for research today is the medical field. Often, doctors and other medical professionals use different data mining algorithms to predict the prevalence of certain diseases, such as heart disease, among a population.

In contrast, law enforcement agencies and social media companies might use data mining algorithms for investigation and analytical purposes. Although for different reasons, both types of organizations might conduct facial recognition searches to confirm a person’s identity.

What should you look for in an algorithm for data mining?

It is important to choose a data mining algorithm that meets your specific needs and goals. As we have discussed above, data mining algorithms vary according to their purpose. If you are considering data mining, you want to ensure that you choose algorithms that fit with your intended purpose.

The ultimate goal of data mining is actionable insight. Finding patterns among large data sets alone might be interesting to an individual or company. But the true value of a data mining algorithm comes from the user’s ability to act on the new information that data mining produces. You should always keep this in mind when evaluating data mining algorithms.

What do you need to write an algorithm for data mining?

Before developers create a data mining algorithm, they must first know the purpose of the algorithm and what it will analyze in terms of both data type and data format. Will the algorithm examine handwriting? Will it examine cell phone photographs? Will it examine shopping tendencies?

In addition to knowing what an algorithm will examine, developers also need an appropriate set of data. Based on the application, data could vary from a collection of sample handwriting or cell phone photos to a large database, such as the history of transactions in a group of retail stores.

Finally, developers need to write an equation that enables the algorithm to test the data. This equation often includes probability and predictive analysis.

How do you measure the efficacy of an algorithm for data mining?

Different algorithms have different levels of efficacy. Testing efficacy sometimes means running data through multiple data mining algorithms in order to see which one produces the best results.

One study in the medical field compared different data mining algorithms’ ability to predict heart disease. When scientists ran data through various algorithms to test for heart disease prevalence, the algorithms produced different results. Some algorithms produced more accurate information and thus proved more useful than others.

Some researchers recommend high-utility itemset mining as a very efficient data mining technique. In this type of data mining, an algorithm searches sets of data for items of high importance to the user. Highly important items might include, for example, specific business transactions, exact medical files, or personal security information.
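
To make the idea concrete, here is a minimal brute-force sketch of high-utility itemset mining. It assumes that each transaction records a quantity per item, that each item has a unit profit, and that an itemset's utility is the summed quantity times profit over every transaction containing all of its items. The shop data and the name `high_utility_itemsets` are hypothetical. Published algorithms prune the search space instead of enumerating every combination as this sketch does.

```python
from itertools import combinations

def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Return {itemset: utility} for every itemset whose utility >= min_utility.

    Brute force: enumerates all item combinations, so it only suits tiny inputs.
    """
    items = sorted({item for transaction in transactions for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            total = 0
            for transaction in transactions:
                # Only transactions containing the whole itemset contribute.
                if all(item in transaction for item in itemset):
                    total += sum(transaction[item] * unit_profit[item]
                                 for item in itemset)
            if total >= min_utility:
                result[itemset] = total
    return result

# Hypothetical data: each transaction maps item -> quantity purchased.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "caviar": 1},
    {"milk": 3},
    {"bread": 1, "milk": 2, "caviar": 1},
]
unit_profit = {"bread": 1, "milk": 2, "caviar": 20}

found = high_utility_itemsets(transactions, unit_profit, min_utility=20)
for itemset, utility in found.items():
    print(itemset, utility)
```

In this toy data, caviar appears in only two of four transactions, yet every high-utility itemset contains it. That is the difference from frequency-based mining: importance to the user, not how often an item occurs, decides what is reported.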

The development of this type of data mining points to the advancing functionality and promising future of the field. As the world becomes more technologically reliant, more and more data become available. This creates more opportunity for data analysis solutions.

To stay up to date on the latest developments in data mining solutions, check out the IEEE Xplore digital library. Xplore is one of the world’s largest collections of technical literature in engineering, computer science, and related technologies, with five million documents now available in its vast repository. You can search through this library to find out more about ongoing advances in data mining.

Best algorithms for data mining

As mentioned earlier, data mining algorithms fit within the broader category of learning algorithms. Typically, learning algorithms depend on either classification or regression to produce results.

What are the most-used data mining algorithms?

Classification and regression algorithms remain the most-used data mining algorithms available today.

Classification algorithms take data and separate it into groups. Usually the groups correspond to answers to questions, such as “yes” or “no.” Spam filters in email provide a good example of a classification algorithm at work. As an email comes in, an algorithm analyzes its contents (such as sender, subject, and message). Then, the algorithm files the email into either a “yes spam” or “no spam” category.

Examples of classification algorithms are naive Bayes and k-nearest neighbors. (However, you can also use a k-nearest neighbors algorithm as part of a regression model.)

Naive Bayes algorithms use Bayes’s theorem of probability to review data and assign certain classifications to it. For instance, a naive Bayes algorithm might analyze a text to determine its main theme. It might determine, for example, that a text is discussing cats or dogs.
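
The cats-or-dogs example can be sketched with a small multinomial naive Bayes classifier. Everything below is an illustrative assumption: the four training sentences, the function names `train_naive_bayes` and `classify`, the whitespace tokenizer, and the choice of add-one (Laplace) smoothing. Production systems typically rely on a tested library implementation.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary

def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior P(label).
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training texts.
documents = [
    ("the cat purrs and meows", "cats"),
    ("my kitten chased the cat toy", "cats"),
    ("the dog barks at the mailman", "dogs"),
    ("my puppy fetches the ball", "dogs"),
]
model = train_naive_bayes(documents)
print(classify("the kitten meows", model))
print(classify("a puppy barks", model))
```

The "naive" part is the assumption that words occur independently of one another given the topic, which is why the per-word log-likelihoods can simply be added.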

K-nearest neighbors algorithms are among the simplest and easiest-to-use data mining algorithms today. The underlying nearest-neighbor rule dates back to the 1950s. Their main goal is to place a data point into a certain category based on the data points closest to it.

Examples of systems using k-nearest neighbors algorithms include recommendation lists from streaming services such as Netflix or Hulu. These lists take data points (such as movies or TV shows) and recommend similar/related content to users.
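
A minimal sketch of k-nearest neighbors classification follows. The two-number description of each show (an "action" score and a "romance" score), the labels, and the function name `knn_predict` are hypothetical, and plain Euclidean distance is an assumption; real recommendation systems use much richer features and are not necessarily built this way.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label the query by majority vote among its k closest training points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical shows described by (action score, romance score).
train = [
    ((0.9, 0.1), "action"), ((0.8, 0.2), "action"), ((0.7, 0.3), "action"),
    ((0.1, 0.9), "romance"), ((0.2, 0.8), "romance"), ((0.3, 0.9), "romance"),
]

print(knn_predict(train, (0.85, 0.15)))
print(knn_predict(train, (0.2, 0.85)))
```

There is no training step beyond storing the data. All the work happens at prediction time, which keeps the method simple but can make it slow on large data sets.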

Regression algorithms, on the other hand, predict a numeric value rather than a category. Their goal is to discern a relationship between input variables and an outcome. For example, a regression model might estimate a store’s future sales from its history of transactions.
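
One common way to discern a relationship between data points is to fit a straight line by ordinary least squares. The sketch below is a minimal single-variable version; the function name `fit_line` and the sample numbers are hypothetical, and the data were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that follow y = 2x + 1 exactly.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predicted outcome for an unseen input
print(slope, intercept, prediction)
```

Unlike a classifier, the fitted line returns a number on a continuous scale, so it can produce values that never appeared in the original data.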

Neural networks can perform both regression and classification. Loosely modeled on the human brain’s neural pathways, they consist of layers of interconnected units whose weights, often numbering in the thousands or millions, are learned from data. Each unit computes a weighted sum of its inputs, much as a linear regression does, and then applies a nonlinear function to the result.

Support vector machines (SVMs) can also handle both classification and regression. For classification, an SVM looks for the boundary that separates the classes with the widest possible margin. With two or three input variables, this boundary can be drawn on a graph, which lends a visual component to the algorithm. A variant called support vector regression applies the same idea to numeric prediction.

What makes a data mining algorithm popular?

Companies use data mining algorithms to solve many different problems. Consequently, a wide variety of data mining algorithms exist today. You can fine-tune each algorithm to solve a particular problem.

Generally speaking, a data mining algorithm’s popularity hinges on its ability to provide detailed answers to questions concerning big data. These answers can help users predict an event or trend or, more broadly, the future of an industry. But they also help users with tasks such as avoiding spam in their email inboxes or choosing a nightly TV show.

How do algorithms vary from data mining project to project?

All in all, algorithms are versatile. Likewise, their use varies across many projects in different industries. Some projects call for specific types of algorithms. For example, one project might require an algorithm that can test for classification-based outcomes. Another might require an algorithm that can test for regression-based outcomes.

Additionally, some projects depend on multiple algorithms to work. For instance, the results from one algorithm might help produce results that are used by a second algorithm.

Top software packages for using data mining algorithms

Today, companies often choose to invest in software packages that make data mining easy and approachable. Many of these software packages offer the added bonus of providing data managing and storage services in addition to data mining algorithms and tools.

As we have stated above, data mining algorithms vary according to their intended purpose. As such, users should choose a data mining software package that fits with their specific needs.

What software is available for using data mining algorithms?

Software packages reduce the need to produce algorithms from scratch. Likewise, they provide different data analytics tools that aid algorithms and help the user get desired results. Examples of such tools include artificial intelligence and predictive analytics.

Popular software packages such as Alteryx Analytics, Orange, and KNIME contain data analytics tools like these. They also contain additional features that appeal to users. These include, for example, data visualization and display features and accessibility across multiple platforms.

What should you look for in software for using data mining algorithms?

You should keep your goals in mind when considering software options. When you choose software, you should make sure its offerings match your data mining vision. For example, you might want a system that creates visual displays, such as charts and graphs, from a data mining algorithm’s output information. In this case, you want to make sure the software you invest in includes data visualization among its features.

Likewise, you should consider the package’s accessibility options. For instance, can Mac and PC users access the software equally as easily? Is there a cloud-based storage system or a Software-as-a-Service (SaaS) option? What does the package’s interface look like? How would the interface affect your ability to explore and utilize the software?

Furthermore, you might benefit from a software package that you can add paid or free features to over time. Some software packages allow users with a valid product license to freely download or purchase additional features. The future of data mining looks promising. Because of this, having the ability to add features might be especially important going forward.

What are the best free and paid options for data mining algorithm software?

Software packages and their offerings vary according to their monetary value. Often paid-for packages include more high-tech, innovative, and appealing elements. In contrast, free versions generally contain fewer features. However, quality free options do exist.

According to a conference paper on free software tools for data mining, the best free offerings include RapidMiner, Weka, R, KNIME, Orange, and scikit-learn. Many of the companies behind these free tools also offer data mining services.

Paid-for options include Sisense, Neural Designer, and Alteryx Analytics. These companies focus on different data mining tools: business intelligence, machine learning, and analytics, respectively.

Ultimately, as technology continues to improve, the variety of data mining algorithms and software packages will likely continue to grow. So too will the importance and potential value of data mining as a field continue to grow in the future.

Interested in becoming an IEEE member? Joining this community of over 420,000 technology and engineering professionals will give you access to the resources and opportunities you need to keep on top of changes in technology, as well as help you get involved in standards development, network with other professionals in your local area or within a specific technical interest, mentor the next generation of engineers and technologists, and so much more.

latest ieee research papers on data mining

Data Mining and Knowledge Discovery

  • Publishes original research papers and practice in data mining and knowledge discovery.
  • Provides surveys and tutorials of important areas and techniques.
  • Offers detailed descriptions of significant applications.
  • Addresses theory, foundational issues, data mining methods, and knowledge discovery processes.
  • Renowned for its focus on techniques and applications.

This is a transformative journal; you may have access to funding.

  • Eyke Hüllermeier

Latest issue

Volume 38, Issue 3

Latest articles

Robust explainer recommendation for time series classification.

  • Thu Trang Nguyen
  • Thach Le Nguyen
  • Georgiana Ifrim

Series2vec: similarity-based self-supervised representation learning for time series classification

  • Navid Mohammadi Foumani
  • Chang Wei Tan
  • Mahsa Salehi

GeoRF: a geospatial random forest

  • Margot Geerts
  • Seppe vanden Broucke
  • Jochen De Weerdt

Modelling event sequence data by type-wise neural point process

  • Bingqing Liu

The impact of variable ordering on Bayesian network structure learning

  • Neville K. Kitson
  • Anthony C. Constantinou

Journal updates

Call for Papers: Special Issue on Explainable and Interpretable Machine Learning and Data Mining

Guest Editors: Martin Atzmueller, Johannes Fuernkranz, Tomas Klieger, Ute Schmid
Submission deadline: extended to June 15, 2021

Call for Papers: Special Issue on Bias and Fairness in AI

Guest Editors: T. Calders, E. Ntoutsi, M. Pechenizkiy, B. Rosenhahn, S. Ruggieri
Submission deadline: August 31, 2021

Call for Papers: ECMLPKDD 2022

Submission deadline: February 11, 2022

Call for Papers: Special Issue on Programming Language Processing

Guest Editors: Chang Xu, Siqi Ma, David Lo
Submission deadline: November 30, 2021

Journal information

  • ACM Digital Library
  • Current Contents/Engineering, Computing and Technology
  • Current Index to Statistics
  • EI Compendex
  • Google Scholar
  • Japanese Science and Technology Agency (JST)
  • Mathematical Reviews
  • OCLC WorldCat Discovery Service
  • Science Citation Index Expanded (SCIE)
  • TD Net Discovery Service
  • UGC-CARE List (India)

© Springer Science+Business Media, LLC, part of Springer Nature

DHS Informatics

IEEE 2024-2025 : Data Science Projects

For outstation students, we offer online project classes, covering both technical and coding topics, through net-meeting software.

For details, call: 9886692401/9845166723.

DHS Informatics provides the latest 2024-2025 IEEE projects on data science for final-year engineering students. DHS Informatics trains all students to develop their projects with a clear idea of what they need to submit in college to get good marks. DHS Informatics also offers placement training in Bangalore through a program called OJT (On Job Training); job seekers as well as final-year college students can join this program to pursue job opportunities in their dream IT companies. We have been providing IEEE projects for B.E / B.TECH, M.TECH, MCA, BCA, and DIPLOMA students for more than two decades.

Python final-year CSE projects in Bangalore

  • Python 2024 – 2025 IEEE PYTHON PROJECTS CSE | ECE | ISE
  • Python 2024 – 2025 IEEE PYTHON MACHINE LEARNING PROJECTS
  • Python 2024 – 2025 IEEE PYTHON IMAGE PROCESSING PROJECTS
  • Python 2024 – 2025 IEEE IOT PYTHON RASPBERRY PI PROJECTS

DATA SCIENCE PROJECTS

A data mining based model for detection of fraudulent behaviour in water consumption.

Abstract: Fraudulent behavior in drinking water consumption is a significant problem facing water supply companies and agencies. This behavior results in a massive loss of income and accounts for the largest share of non-technical loss. Finding efficient methods for detecting fraudulent activities has been an active research area in recent years. Intelligent data mining techniques can help water supply companies detect these fraudulent activities and reduce such losses. This research explores the use of two classification techniques (SVM and KNN) to detect customers suspected of water fraud. The main motivation of this research is to assist Yarmouk Water Company (YWC) in Irbid city of Jordan to overcome its profit loss. The SVM-based approach uses customer load profile attributes to expose abnormal behavior that is known to be correlated with non-technical loss activities. The data was collected from the historical records of the company’s billing system. The accuracy of the generated model exceeded 74%, which is better than the manual prediction procedures currently used by YWC. To deploy the model, a decision tool was built using the generated model. The system will help the company identify suspicious water customers to be inspected on site.

Correlated Matrix Factorization for Recommendation with Implicit Feedback

Abstract: As a typical latent factor model, Matrix Factorization (MF) has demonstrated its great effectiveness in recommender systems. Users and items are represented in a shared low-dimensional space so that the user preference can be modeled by linearly combining the item factor vector V using the user-specific coefficients U. From a generative model perspective, U and V are drawn from two independent Gaussian distributions, which is not faithful to reality. Items are produced to maximally meet users’ requirements, which makes U and V strongly correlated. Meanwhile, the linear combination between U and V forces a bijection (one-to-one mapping), which thereby neglects the mutual correlation between the latent factors. In this paper, we address the above drawbacks and propose a new model, named Correlated Matrix Factorization (CMF). Technically, we apply Canonical Correlation Analysis (CCA) to map U and V into a new semantic space. Besides achieving the optimal fitting on the rating matrix, one component in each vector (U or V) is also tightly correlated with every single component in the other. We derive efficient inference and learning algorithms based on variational EM methods. The effectiveness of our proposed model is comprehensively verified on four public data sets. Experimental results show that our approach achieves competitive performance on both prediction accuracy and efficiency compared with the current state of the art.
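
For readers unfamiliar with the baseline that this abstract builds on, the sketch below shows plain matrix factorization trained by stochastic gradient descent. It is not the CMF model the paper proposes: there is no CCA step and no variational EM. The ratings, the function names `factorize` and `rmse`, and all hyperparameter values are hypothetical choices for illustration.

```python
import math
import random

def factorize(ratings, n_users, n_items, k=2, steps=500, lr=0.02, reg=0.02, seed=0):
    """Learn user factors U and item factors V so that dot(U[u], V[i]) ~ rating."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error of the predictions on the given ratings."""
    squared = [(r - sum(a * b for a, b in zip(U[u], V[i]))) ** 2
               for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical (user, item, rating) triples.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]

before = rmse(ratings, *factorize(ratings, 3, 3, steps=0))
U, V = factorize(ratings, 3, 3)
after = rmse(ratings, U, V)
print(before, after)
```

Here each predicted rating is simply the dot product of one user vector and one item vector, which is the linear combination of U and V that the abstract refers to.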

Heterogeneous Information Network Embedding for Recommendation

Abstract: Due to their flexibility in modelling data heterogeneity, heterogeneous information networks (HINs) have been adopted to characterize complex and heterogeneous auxiliary data in recommender systems, an approach called HIN-based recommendation. It is challenging to develop effective methods for HIN-based recommendation in both extraction and exploitation of the information from HINs. Most HIN-based recommendation methods rely on path-based similarity, which cannot fully mine latent structure features of users and items. In this paper, we propose a novel heterogeneous network embedding based approach for HIN-based recommendation, called HERec. To embed HINs, we design a meta-path based random walk strategy to generate meaningful node sequences for network embedding. The learned node embeddings are first transformed by a set of fusion functions, and subsequently integrated into an extended matrix factorization (MF) model. The extended MF model together with fusion functions are jointly optimized for the rating prediction task. Extensive experiments on three real-world datasets demonstrate the effectiveness of the HERec model. Moreover, we show the capability of the HERec model for the cold-start problem, and reveal that the transformed embedding information from HINs can improve the recommendation performance.

NetSpam: A Network-Based Spam Detection Framework for Reviews in Online Social Media

Abstract: Nowadays, many people rely on content available in social media when making decisions (e.g., reviews and feedback on a topic or product). The possibility that anybody can leave a review provides a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research. Although a considerable number of studies have been done recently toward this end, the methodologies put forth so far still barely detect spam reviews, and none of them show the importance of each extracted feature type. In this paper, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review data sets as heterogeneous information networks to map the spam detection procedure into a classification problem in such networks. Using the importance of spam features helps us to obtain better results in terms of different metrics on real-world review data sets from the Yelp and Amazon Web sites. The results show that NetSpam outperforms the existing methods and that, among four categories of features (review-behavioral, user-behavioral, review-linguistic, and user-linguistic), the first type performs better than the other categories.

Comparative Study to Identify the Heart Disease Using Machine Learning Algorithms

Abstract: Heart disease is now a common disease and has claimed many lives worldwide. Large numbers of people are affected every year, especially in the USA, followed by India. Doctors and clinical research indicate that heart disease does not develop suddenly; it results from a sustained irregular lifestyle and other bodily factors over a long period, after which symptoms appear abruptly. Once symptoms appear, people seek hospital treatment involving various tests and therapies, which are somewhat expensive. The results of this research can therefore give people an early indication of a patient’s condition before the disease appears. This research collected data from different sources and split the data into two parts: 80% for the training dataset and the remaining 20% for the test dataset. Different classifier algorithms were applied in search of better accuracy, and the resulting accuracies were then summarized. The algorithms are Random Forest Classifier, Decision Tree Classifier, Support Vector Machine, k-nearest neighbor, Logistic Regression, and Naive Bayes. SVM, Logistic Regression, and KNN gave the same accuracy, which was better than that of the other algorithms. This paper identifies which factors indicate vulnerability to heart disease, given basic attributes such as sex, glucose, blood pressure, and heart rate. Future work will use different devices and clinical trials for real-life experiments.
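
The 80%/20% division into training and test data described in the abstract above might be sketched as follows. The function name `train_test_split`, the fixed random seed, and the stand-in records are assumptions for illustration; the paper does not say how its split was performed.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for 100 patient records.
records = list(range(100))
train, test = train_test_split(records)
print(len(train), len(test))
```

Shuffling before cutting guards against any ordering in the source data, such as records sorted by diagnosis, leaking into one side of the split.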

A machine learning approach for opinion mining online customer reviews

Abstract: This study applied supervised machine learning methods to opinion mining of online customer reviews. First, the study automatically collected 39,976 traveler reviews of hotels in Vietnam from the Agoda.com website. It then trained machine learning models to find out which model best fits the training dataset and applied this model to predict opinions for the collected dataset. The results showed that Logistic Regression (LR), Support Vector Machines (SVM), and Neural Network (NN) methods have the best performance in opinion mining in the Vietnamese language. This study is valuable as a reference for applications of opinion mining in the field of business.

Hybrid Machine Learning Classification Technique for Improve Accuracy of Heart Disease

Abstract: The area of medical science has attracted great attention from researchers. Investigators have identified several causes of early human mortality. The related literature confirms that diseases have different causes, and heart-related illness is one of them. Many researchers have proposed distinctive methods to preserve human life and help health care experts recognize, prevent, and manage heart disease. Some of the convenient methodologies facilitate the expert’s decision, but every successful scheme has its own restrictions. The proposed approach analyzes the performance of Hidden Markov Model (HMM), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Decision Tree J48 together with two feature selection methods, Correlation Based Feature Selection (CFS) and Gain Ratio. Gain Ratio is used with the Ranker method over different groups of statistics. After this analysis, the proposed method builds a Naive Bayes process that combines the two most appropriate methods in a suitable layered design. Initially, the intention is to select the most appropriate method by analyzing the performance of the available schemes executed with different features.

Novel Supervised Machine Learning Classification Technique for Improve Accuracy of Multi-Valued Datasets in Agriculture

Abstract: In the modern era, agricultural plant disease has many causes, including unfavorable weather conditions. Factors that influence disease in agricultural plants include variety/hybrid genetics, the age of plants at the time of infection, environment (soil, climate), weather (temperature, wind, rain, hail, etc.), single versus mixed infections, and genetics of the pathogen populations. Due to these factors, diagnosing plant diseases at the early stages can be a difficult task. Machine Learning (ML) classification techniques such as Naïve Bayes (NB) and Neural Network (NN) techniques were compared to develop a novel technique to improve the level of accuracy.

Machine Learning and Deep Learning Approaches for Brain Disease Diagnosis: Principles and Recent Advances

Abstract: The brain is the controlling center of our body. Over time, more and more brain diseases are being discovered. Because of the variability of brain diseases, diagnosis and detection are becoming more challenging and remain an open problem for research. Detection of brain diseases at an early stage can make a huge difference in attempting to cure them. In recent years, the use of artificial intelligence (AI) has been surging through all spheres of science, and it is revolutionizing the field of neurology. Application of AI in medical science has made brain disease prediction and detection more accurate and precise. In this study, we present a review of recent machine learning and deep learning approaches to detecting four brain diseases: Alzheimer’s disease (AD), brain tumor, epilepsy, and Parkinson’s disease. 147 recent articles on the four brain diseases are reviewed considering diverse machine learning and deep learning approaches, modalities, datasets, etc. Twenty-two datasets that are used most frequently in the reviewed articles as a primary source of brain disease data are discussed. Moreover, a brief overview of different feature extraction techniques used in diagnosing brain diseases is provided. Finally, key findings from the reviewed articles are summarized, and a number of major issues related to machine learning/deep learning-based brain disease diagnostic approaches are discussed. Through this study, we aim to find the most accurate technique for detecting different brain diseases, which can be employed for future betterment.

Prediction of Chronic Kidney Disease - A Machine Learning Perspective

Abstract: Chronic Kidney Disease is one of the most critical illnesses today, and proper diagnosis is required as early as possible. Machine learning techniques have become reliable aids in medical treatment. With the help of machine learning classifier algorithms, a doctor can detect the disease in time. From this perspective, Chronic Kidney Disease prediction is discussed in this article. The Chronic Kidney Disease dataset was taken from the UCI repository. Seven classifier algorithms were applied in this research: artificial neural network, C5.0, Chi-square Automatic Interaction Detector, logistic regression, linear support vector machine (LSVM) with penalty L1 and with penalty L2, and random tree. Feature selection techniques were also applied to the dataset. For each classifier, the results were computed based on (i) full features, (ii) correlation-based feature selection, (iii) wrapper method feature selection, (iv) least absolute shrinkage and selection operator (LASSO) regression, (v) synthetic minority over-sampling technique (SMOTE) with LASSO regression selected features, and (vi) SMOTE with full features. The results show that LSVM with penalty L2 gives the highest accuracy, 98.86%, with SMOTE and full features. Along with accuracy, precision, recall, F-measure, area under the curve, and GINI coefficient were computed, and the compared results of the various algorithms are shown in a graph. LASSO regression selected features with SMOTE gave the best results after SMOTE with full features. With SMOTE and LASSO selected features, linear support vector machine again gave the highest accuracy, 98.46%. Along with the machine learning models, one deep neural network was applied to the same dataset, and it achieved the highest accuracy of 99.6%.
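
Several of the evaluation measures named in the abstract above (accuracy, precision, recall, and F-measure) can be computed from four counts: true positives, false positives, false negatives, and true negatives. The sketch below assumes binary labels and takes F-measure to mean the F1 score; the function name `binary_metrics` and the ten example labels are hypothetical and unrelated to the paper’s data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when one class is rare, because a model can score high accuracy while missing most of the rare cases.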

Potato Disease Detection Using Machine Learning

Abstract: In Bangladesh, potato is one of the major crops. Potato cultivation has been very popular in Bangladesh for the last few decades. However, potato production is being hampered by diseases that increase farmers’ production costs and disrupt farmers’ lives. An automated and rapid disease detection process is needed to increase potato production and digitize the system. Our main goal is to diagnose potato disease from leaf pictures using advanced machine learning technology. This paper presents an automated system, based on image processing and machine learning, that identifies and classifies potato leaf diseases. Image processing is the best solution for detecting and analyzing these diseases. In this analysis, image segmentation is performed on more than 2034 pictures of unhealthy potatoes and potato leaves taken from the publicly available Plant Village database, and several pre-trained models are used for recognition and classification of diseased and healthy leaves. Among them, the program predicts with an accuracy of 99.23% in testing with 25% test data and 75% training data. Our results show that machine learning outperforms all existing work in potato disease detection.

A Comparative Evaluation of Traditional Machine Learning and Deep Learning Classification Techniques for Sentiment Analysis

Abstract: With technological advancement in the field of digital transformation, the use of the internet and social media has increased immensely. Many people use these platforms to share their views, opinions, and experiences. Analyzing such information is significant for any organization, as it helps the organization understand the needs of its customers. Sentiment analysis is an intelligent way to interpret the emotions in textual information, and it helps to determine whether an emotion is positive or negative. This paper outlines the data cleaning and data preparation process for sentiment analysis and presents experimental findings that demonstrate the comparative performance of various classification algorithms. In this context, we have analyzed various machine learning techniques (Support Vector Machine and Multinomial Naive Bayes) and deep learning techniques (Bidirectional Encoder Representations from Transformers and Long Short-Term Memory) for sentiment analysis.

A Comprehensive Review on Email Spam Classification using Machine Learning Algorithms

Abstract: Email is the most used method of official communication for business purposes. The usage of email continues to increase despite the availability of other methods of communication. Automated management of emails is important today, as the volume of emails grows day by day. Of all emails, more than 55 percent are identified as spam. Spam consumes email users’ time and resources while generating no useful output. Spammers use sophisticated and creative methods to carry out their criminal activities through spam emails. Therefore, it is vital to understand different spam email classification techniques and their mechanisms. This paper mainly focuses on spam classification approaches using machine learning algorithms. Furthermore, this study provides a comprehensive analysis and review of research done on different machine learning techniques and email features used in different machine learning approaches. It also provides future research directions and outlines challenges in the spam classification field that can be useful for future researchers.

Heart Disease Prediction using Hybrid machine Learning Model

Abstract: Heart disease causes a significant mortality rate around the world, and it has become a health threat for many people. Early prediction of heart disease may save many lives; detecting cardiovascular diseases such as heart attacks and coronary artery diseases is a critical challenge in regular clinical data analysis. Machine learning (ML) can bring an effective solution for decision making and accurate predictions. The medical industry is showing enormous development in using machine learning techniques. In the proposed work, a novel machine learning approach is proposed to predict heart disease. The proposed study used the Cleveland heart disease dataset, and data mining techniques such as regression and classification are used. The machine learning techniques Random Forest and Decision Tree are applied, and a novel machine learning model is designed. In implementation, three machine learning algorithms are used: (1) Random Forest, (2) Decision Tree, and (3) a hybrid model (a hybrid of Random Forest and Decision Tree). Experimental results show an accuracy level of 88.7% for the heart disease prediction model with the hybrid model. An interface is designed to collect the user’s input parameters to predict heart disease, for which we used the hybrid model of Decision Tree and Random Forest.

Heart Failure Prediction by Feature Ranking Analysis in Machine Learning

Abstract: Heart disease is one of the major causes of mortality in the world today. Prediction of cardiovascular disease is a critical challenge in the field of clinical data analysis. Advances in machine learning (ML), artificial intelligence (AI), and data science have been shown to be effective in assisting decision making and prediction from the large quantity of data produced by the healthcare industry. ML approaches have brought many improvements and broadened study in the medical field by recognizing patterns in the human body using various algorithms and correlation techniques. One such case is coronary heart disease, where various studies give insight into predicting heart disease with ML techniques. Initially ML was used to find the degree of heart failure, but it is also used to identify significant features that affect heart disease by using correlation techniques. Many features contribute to heart disease, such as age, blood pressure, sodium creatinine, and ejection fraction. In this paper we propose a method for finding important features by applying machine learning techniques. The work is to design and develop heart disease prediction based on feature ranking with machine learning. ML thus has a huge impact in saving lives and helping doctors, widening the scope of research into actionable insights, driving complex decisions, and creating innovative products that help businesses achieve key goals.
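
The correlation-based feature ranking mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's actual ranking procedure, so the use of absolute Pearson correlation, the function names pearson and rank_features, and the toy patient records are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Sort features by the absolute value of their correlation with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up records for six hypothetical patients (not real clinical data).
columns = {
    "age": [45, 50, 61, 70, 52, 66],
    "ejection_fraction": [60, 55, 30, 25, 50, 20],
    "constant_feature": [1, 1, 1, 1, 1, 1],
}
target = [0, 0, 1, 1, 0, 1]  # 1 = heart failure event, 0 = no event

ranking = rank_features(columns, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value treats strong negative correlations (here, low ejection fraction with failure events) as just as informative as strong positive ones. Correlation only captures linear, single-feature relationships, so model-based importance scores are a common complement.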

Design of face detection and recognition system to monitor students during online examinations using Machine Learning algorithms

Abstract: Today’s pandemic situation has transformed the way students are educated. Education is undertaken remotely through online platforms. In addition to changing online course content and online teaching, it has also changed the way assessments are conducted. In online education, monitoring student attendance is very important, as the presence of students is part of a good assessment of teaching and learning. Educational institutions have been adopting online examination portals for student assessment. These portals use face recognition techniques to monitor students' activities and identify malpractice. This is done by capturing the students’ activities through a web camera and analyzing their gestures and postures. Image processing algorithms are widely used in the literature to perform face recognition. Despite the progress made in improving the performance of face detection systems, issues such as variations in facial appearance caused by lighting conditions, noise in face images, scale, and pose still block progress toward human-level accuracy. The aim of this study is to increase the accuracy of existing face recognition systems by making use of SVM and Eigenface algorithms. In this project, an approach similar to Eigenface is used to extract facial features as facial vectors, and the datasets are trained using the Support Vector Machine (SVM) algorithm to perform face classification and detection. This allows face recognition to be faster and to be used for online exam monitoring.

IEEE DATA SCIENCE PROJECTS (2024-2025)

1. IEEE : Deep Air Learning: Interpolation, Prediction, and Feature Analysis of Fine-grained Air Quality
2. IEEE : Classification Of A Bank Data Set On Various Data Mining Platforms (Bir Banka Müşteri Verilerinin Farklı Veri Madenciliği Platformlarında Sınıflandırılması)
3. IEEE : A Data Mining based Model for Detection of Fraudulent Behaviour in Water Consumption
4. IEEE : Collaborative Filtering Algorithm Based on Rating Difference and User Interest
5. IEEE : A Framework for Real-Time Spam Detection in Twitter
6. IEEE : Serendipitous Recommendation in E-Commerce Using Innovator-Based Collaborative Filtering
7. IEEE : Review Spam Detection using Machine Learning
8. IEEE : NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media
9. IEEE : SociRank: Identifying and Ranking Prevalent News Topics Using Social Media Factors

DHS Informatics believes in student satisfaction. We first brief students on the technologies and the types of Data Science projects and other domain projects. After a complete explanation of the concepts behind the IEEE Data Science projects, students may choose more than one IEEE Data Science project to review in functional detail. Students can also pick one project topic from Data Science and another two from other domains such as data mining, image processing, information forensics, big data, and blockchain. DHS Informatics is a pioneer institute in Bangalore / Bengaluru, and we support project work for other institutes all over India. We are the leading final year project centre in Bangalore / Bengaluru, with offices in five main locations: Jayanagar, Yelahanka, Vijayanagar, RT Nagar, and Indiranagar.

We allow ECE, CSE, and ISE final year students to use the lab and assist them in project development work. We also encourage students to bring their own ideas and develop them into final year projects for college submission.

DHS Informatics first trains students on project-related topics, after which students move on to practical sessions. We have a well-equipped lab, experienced faculty who work on our client projects, and a friendly student coordinator to assist students with their college project work.

We are appreciated by students for our latest IEEE projects and concepts for final year Data Mining projects in the ECE, CSE, and ISE departments.

We offer the latest IEEE 2024-2025 projects on Data Mining with real-time concepts, implemented using Java, MATLAB, and NS2 with innovative ideas. Final year students in data mining, computer science, information science, and electronics and communication can contact our corporate office in Jayanagar, Bangalore for Data Science project details.

DATA SCIENCE

Data Science is the mining of knowledge from data, involving methods at the intersection of machine learning, statistics, and database systems. It is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. We have best-in-class infrastructure, lab setup, and training facilities, and an experienced research and development team for both the educational and corporate sectors.

Data Science is the process of searching huge amounts of data from different perspectives and summarizing it into useful information. Data Science is a logical rather than a physical subset. Our projects usually involve mining and text-based classification in Data Science projects for students.

The use of a variety of data analysis tools to identify relationships in data is the process of Data Science. Our firm supports data mining projects for IT and CSE students carrying out their academic research projects.

Relationship to Statistics

The popularity of the term “data science” has exploded in business environments and academia, as indicated by a jump in job openings. However, many critical academics and journalists see no distinction between data science and statistics. Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced “business analytics” in contexts such as graduate degree programs. In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted applied statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician…. Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.” Similarly, in the business sector, multiple researchers and analysts state that data scientists alone are far from sufficient to grant companies a real competitive advantage, and consider data scientists to be only one of the four greater job families companies require to leverage big data effectively, namely: data analysts, data scientists, big data developers, and big data engineers.

On the other hand, responses to criticism are as numerous. In a 2014 Wall Street Journal article, Irving Wladawsky-Berger compares the data science enthusiasm with the dawn of computer science. He argues that data science, like any other interdisciplinary field, employs methodologies and practices from across academia and industry, but will then morph them into a new discipline. He draws attention to the sharp criticisms that computer science, now a well-respected academic discipline, once had to face. Likewise, NYU Stern’s Vasant Dhar, as do many other academic proponents of data science, argued more specifically in December 2013 that data science is different from the existing practice of data analysis across all disciplines, which focuses only on explaining data sets. Data science seeks actionable and consistent patterns for predictive uses. This practical engineering goal takes data science beyond traditional analytics. Now the data in those disciplines and applied fields that lacked solid theories, like health science and social science, can be sought out and used to generate powerful predictive models.

Java Final year CSE projects in Bangalore

  • Java Information Forensic / Block Chain B.E Projects
  • Java Cloud Computing B.E Projects
  • Java Big Data with Hadoop B.E Projects
  • Java Networking & Network Security B.E Projects
  • Java Data Mining / Web Mining / Cyber Security B.E Projects
  • Java Data Science / Machine Learning B.E Projects
  • Java Artificial Intelligence B.E Projects
  • Java Wireless Sensor Network B.E Projects
  • Java Distributed & Parallel Networking B.E Projects
  • Java Mobile Computing B.E Projects

Android Final year CSE projects in Bangalore

  • Android GPS, GSM, Bluetooth & GPRS B.E Projects
  • Android Embedded System Application Projects for B.E
  • Android Database Applications Projects for B.E Students
  • Android Cloud Computing Projects for Final Year B.E Students
  • Android Surveillance Applications B.E Projects
  • Android Medical Applications Projects for B.E

Embedded Final year CSE projects in Bangalore

  • Embedded Robotics Projects for M.Tech Final Year Students
  • Embedded IEEE Internet of Things Projects for B.E Students
  • Embedded Raspberry Pi Projects for B.E Final Year Students
  • Embedded Automotive Projects for Final Year B.E Students
  • Embedded Biomedical Projects for B.E Final Year Students
  • Embedded Biometric Projects for B.E Final Year Students
  • Embedded Security Projects for B.E Final Year

MATLAB Final year CSE projects in Bangalore

  • MATLAB Image Processing Projects for B.E Students
  • MATLAB Wireless Communication B.E Projects
  • MATLAB Communication Systems B.E Projects
  • MATLAB Power Electronics Projects for B.E Students
  • MATLAB Signal Processing Projects for B.E
  • MATLAB Geo Science & Remote Sensors B.E Projects
  • MATLAB Biomedical Projects for B.E Students

DATA MINING IEEE PAPERS AND PROJECTS-2020

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

Data stream mining, as its name suggests, is connected with two basic fields of computer science, i.e. data mining and data streams. Data mining [1 4] is an interdisciplinary subfield of computer science whose main aim is to develop tools and methods for exploring

latest ieee research papers on data mining

COMMENTS

  1. Data Mining: Data Mining Concepts and Techniques

    Data mining is a field of intersection of computer science and statistics used to discover patterns in the information bank. The main aim of the data mining process is to extract the useful information from the dossier of data and mold it into an understandable structure for future use. There are different process and techniques used to carry out data mining successfully.

  2. Data mining techniques and applications

    This paper reviews data mining techniques and its applications such as educational data mining (EDM), finance, commerce, life sciences and medical etc. We group existing approaches to determine how the data mining can be used in different fields. Our categorization specifically focuses on the research that has been published over the period ...

  3. Paper Review On Data Mining, components, And Big Data

    Recent progress in software and hardware has allowed different data measurements in a variety of fields to be captured. These measures are produced continuously at very fluctuating data rates. For example, network sensors, web logs and computer network traffic. Computer-intensive activities are the collection, query and removal of these data sets. Mining data sources include the recovery of ...

  4. Call for papers

    The IEEE International Conference on Data Mining (ICDM) has established itself as the world's premier research conference in data mining. It provides an international forum for sharing original research results, as well as exchanging and disseminating innovative and practical development experiences. The conference covers all aspects of data ...

  5. Data mining

    Mountainous amounts of data records are now available in science, business, industry and many other areas. Such data can provide a rich resource for knowledge discovery and decision support. Data mining is the process of identifying interesting patterns from large databases. Data mining is the core part of the knowledge discovery in database (KDD) process. The KDD process may consist of the ...

  6. Perspectives on Test Data Mining from Industrial Experience

    This paper offers some perspectives on the practice of data mining based on recent experimental research work to establish a link between system-level failures and structural scan test patterns. Beyond the obvious goal to obtain accurate results, knowledge discovery and data insights deserve equal if not higher emphasis. Domain knowledge plays a crucial role in guiding the use of multiple ...

  7. A New Technique Research on Data Mining

    Classification is a basic problem in the field of data mining. It is one of the key steps that intelligent systems take when extracting meaningful information from complex and massive data. This paper introduces a new classification approach based on the theory of human vision from the perspective of bionics. The experimental results show that the new algorithm is efficient for the classification.

  8. Call for Papers

    The IEEE International Conference on Data Mining (ICDM) has established itself as the world's premier research conference in data mining. It provides an international forum for presentation of original research results, as well as exchange and dissemination of innovative and practical development experiences.

  9. (PDF) Trends in data mining research: A two-decade review using topic

    Trends in data mining research: A two-decade review using topic analysis ... is the set of all papers in year . y. ... 2001 IEEE International Conference on Data Mining (ICDM), San Jose, CA, USA ...

  10. Data Mining

    Data Mining and Big Data Analytics Technical Committee Who We Are. The Data Mining and Big Data Analytics Technical Committee (DMTC) is established to: (1) promote the research, development, education and understanding the principles and applications of data mining and big data analytics and (2) to help researchers whose background is primarily in computational intelligence in increasing their ...

  11. (PDF) IEEE Access Special Section Editorial: Advanced Data Mining

    He has published over 50 research papers in top-tier journals and conferences, including IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (TNNLS), IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ...

  12. IEEE Symposium on Computational Intelligence in Data Mining (CIDM)

    IEEE CIDM 2023 will bring together researchers and practitioners from around the world to discuss the latest advances in the field of computational intelligence applied to data mining and will act as a major forum for the presentation of recent results in theory, algorithms, systems and applications.

  13. Recent advances in domain-driven data mining

    Data mining research has been significantly motivated by and benefited from real-world applications in novel domains. This special issue was proposed and edited to draw attention to domain-driven data mining and disseminate research in foundations, frameworks, and applications for data-driven and actionable knowledge discovery. Along with this special issue, we also organized a related ...

  14. Data mining

    Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods to organise, examine and combine large data sets, including machine learning ...

  15. Research on Big Data Mining Application of Internet of ...

    In order to scientifically deal with the problem of network extension, optimize the application performance of the Internet of Things technology and equipment, and truly meet the requirements of big data application of the Internet of Things, in the development of modern economic construction, scholars from various countries in the integration of their own research experience on the basis of ...

  16. Data Mining on IEEE Technology Navigator

    According to a conference paper on free software tools for data mining, the best free offerings include RapidMiner, Weka, R, KNIME, Orange, and scikit-learn. Many of the companies behind these free tools also offer data mining services. Paid-for options include Sisense, Neural Designer, and Alteryx Analytics.

  17. 345193 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DATA MINING. Find methods information, sources, references or conduct a literature review on DATA MINING

  18. Home

    Data Mining and Knowledge Discovery is a leading technical journal focusing on the extraction of information from vast databases. Publishes original research papers and practice in data mining and knowledge discovery. Provides surveys and tutorials of important areas and techniques. Offers detailed descriptions of significant applications.

  19. (PDF) Data mining techniques and applications

    Data Mining Algorithms and Techniques. Various algorithms and techniques like Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees ...

  20. IEEE Data Science projects

    For details, Call: 9886692401/9845166723. DHS Informatics providing latest 2024-2025 IEEE projects on Data science for the final year engineering students. DHS Informatics trains all students to develop their project with good idea what they need to submit in college to get good marks. DHS Informatics offers placement training in Bangalore and ...

  21. 50 selected papers in Data Mining and Machine Learning

    Active Sampling for Feature Selection, S. Veeramachaneni and P. Avesani, Third IEEE Conference on Data Mining, 2003. Heterogeneous Uncertainty Sampling for Supervised Learning, D. Lewis and J. Catlett, In Proceedings of the 11th International Conference on Machine Learning, 148-156, 1994. Learning When Training Data are Costly: The Effect of ...

  22. Data Mining Ieee Papers and Projects-2020

    DATA MINING IEEE PAPERS AND PROJECTS-2020. Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data stream mining , as its name suggests, is connected with two basic fields of computer science, ie data mining and data streams.

  23. Data Mining: Research Papers

    The wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge. In this article, we have listed a few research papers related to Data Mining. It will help the students to select seminar topics for CSE and computer science engineering projects. Download the PDF papers to study and ...