The 18 biggest data breaches of the 21st century
Data breaches affecting millions of users are far too common. Here are some of the biggest, baddest breaches in recent memory.
In today’s data-driven world, data breaches can affect hundreds of millions or even billions of people at a time. Digital transformation has increased the volume of data in motion, and data breaches have scaled up with it as attackers exploit the data dependencies of daily life. How large cyberattacks of the future might become remains speculation, but as this list of the biggest data breaches of the 21st century indicates, they have already reached enormous magnitudes.
For transparency, this list has been calculated by the number of users impacted, records exposed, or accounts affected. We have also made a distinction between incidents where data was actively stolen or reposted maliciously and those where an organization has inadvertently left data unprotected and exposed, but there has been no significant evidence of misuse. The latter have purposefully not been included in the list.
So, here it is – an up-to-date list of the 18 biggest data breaches in recent history, including details of those affected, who was responsible, and how the companies responded (as of July 2021).
1. Yahoo
Date: August 2013 Impact: 3 billion accounts
Securing the number one spot – almost seven years after the initial breach and four since the true number of records exposed was revealed – is the attack on Yahoo. The company first publicly announced the incident – which it said took place in 2013 – in December 2016. At the time, it was in the process of being acquired by Verizon and estimated that account information of more than a billion of its customers had been accessed by a hacking group. Less than a year later, Yahoo announced that the actual figure of user accounts exposed was 3 billion. Yahoo stated that the revised estimate did not represent a new “security issue” and that it was sending emails to all the “additional affected user accounts.”
Despite the attack, the deal with Verizon was completed, albeit at a reduced price. Verizon’s CISO Chandra McMahon said at the time: “Verizon is committed to the highest standards of accountability and transparency, and we proactively work to ensure the safety and security of our users and networks in an evolving landscape of online threats. Our investment in Yahoo is allowing that team to continue to take significant steps to enhance their security, as well as benefit from Verizon’s experience and resources.” After investigation, it was discovered that, while the attackers accessed account information such as security questions and answers, plaintext passwords, payment card and bank data were not stolen.
2. Aadhaar [tie with Alibaba]
Date: January 2018 Impact: 1.1 billion Indian citizens’ identity/biometric information exposed
In early 2018, news broke that malicious actors had infiltrated the world’s largest ID database, Aadhaar, exposing information on more than 1.1 billion Indian citizens including names, addresses, photos, phone numbers, and emails, as well as biometric data like fingerprints and iris scans. What’s more, since the database – established by the Unique Identification Authority of India (UIDAI) in 2009 – also held information about bank accounts connected with unique 12-digit numbers, it became a credit breach too. This was despite the UIDAI initially denying that the database held such data.
The actors infiltrated the Aadhaar database through the website of Indane, a state-owned utility company connected to the government database through an application programming interface that allowed applications to retrieve data stored by other applications or software. Unfortunately, Indane’s API had no access controls, thus rendering its data vulnerable. Hackers sold access to the data for as little as $7 via a WhatsApp group. Despite warnings from security researchers and tech groups, it took Indian authorities until March 23, 2018, to take the vulnerable access point offline.
2. Alibaba [tie with Aadhaar]
Date: November 2019 Impact: 1.1 billion pieces of user data
Over an eight-month period, a developer working for an affiliate marketer scraped customer data, including usernames and mobile numbers, from the Alibaba Chinese shopping website, Taobao, using crawler software that he created. It appears the developer and his employer were collecting the information for their own use and did not sell it on the black market, although both were sentenced to three years in prison.
A Taobao spokesperson said in a statement: “Taobao devotes substantial resources to combat unauthorized scraping on our platform, as data privacy and security is of utmost importance. We have proactively discovered and addressed this unauthorized scraping. We will continue to work with law enforcement to defend and protect the interests of our users and partners.”
4. LinkedIn
Date: June 2021 Impact: 700 million users
Professional networking giant LinkedIn saw data associated with 700 million of its users posted on a dark web forum in June 2021, impacting more than 90% of its user base. A hacker going by the moniker of “God User” scraped the data by exploiting the site’s (and others’) API before dumping a first data set covering around 500 million customers. They then followed up with a boast that they were selling the full 700 million customer database. LinkedIn argued that, because no sensitive, private personal data was exposed, the incident was a violation of its terms of service rather than a data breach. However, a scraped data sample posted by God User contained information including email addresses, phone numbers, geolocation records, genders and other social media details. As the UK’s NCSC warned, this would give malicious actors plenty of data to craft convincing follow-on social engineering attacks in the wake of the leak.
5. Sina Weibo
Date: March 2020 Impact: 538 million accounts
With over 600 million users, Sina Weibo is one of China’s largest social media platforms. In March 2020, the company announced that an attacker obtained part of its database, impacting 538 million Weibo users and their personal details including real names, site usernames, gender, location, and phone numbers. The attacker is reported to have then sold the database on the dark web for $250.
China’s Ministry of Industry and Information Technology (MIIT) ordered Weibo to enhance its data security measures to better protect personal information and to notify users and authorities when data security incidents occur. In a statement, Sina Weibo argued that an attacker had gathered publicly posted information by using a service meant to help users locate the Weibo accounts of friends by inputting their phone numbers and that no passwords were affected. However, it admitted that the exposed data could be used to associate accounts to passwords if passwords are reused on other accounts. The company said it strengthened its security strategy and reported the details to the appropriate authority.
6. Facebook
Date: April 2019 Impact: 533 million users
In April 2019, it was revealed that two datasets from Facebook apps had been exposed to the public internet. The information related to more than 530 million Facebook users and included phone numbers, account names, and Facebook IDs. However, two years later (April 2021) the data was posted for free, indicating new and real criminal intent surrounding the data. In fact, given the sheer number of phone numbers impacted and readily available on the dark web as a result of the incident, security researcher Troy Hunt added functionality to his HaveIBeenPwned (HIBP) breached credential checking site that would allow users to verify if their phone numbers had been included in the exposed dataset.
“I’d never planned to make phone numbers searchable,” Hunt wrote in a blog post. “My position on this was that it didn’t make sense for a bunch of reasons. The Facebook data changed all that. There’s over 500 million phone numbers but only a few million email addresses so >99% of people were getting a miss when they should have gotten a hit.”
7. Marriott International (Starwood)
Date: September 2018 Impact: 500 million customers
Hotel group Marriott International announced the exposure of sensitive details belonging to half a billion Starwood guests following an attack on its systems in September 2018. In a statement published in November the same year, the hotel giant said: “On September 8, 2018, Marriott received an alert from an internal security tool regarding an attempt to access the Starwood guest reservation database. Marriott quickly engaged leading security experts to help determine what occurred.”
Marriott learned during the investigation that there had been unauthorized access to the Starwood network since 2014. “Marriott recently discovered that an unauthorized party had copied and encrypted information and took steps towards removing it. On November 19, 2018, Marriott was able to decrypt the information and determined that the contents were from the Starwood guest reservation database,” the statement added.
The data copied included guests’ names, mailing addresses, phone numbers, email addresses, passport numbers, Starwood Preferred Guest account information, dates of birth, gender, arrival and departure information, reservation dates, and communication preferences. For some, the information also included payment card numbers and expiration dates, though these were apparently encrypted.
Marriott carried out an investigation assisted by security experts following the breach and announced plans to phase out Starwood systems and accelerate security enhancements to its network. The company was eventually fined £18.4 million (reduced from £99 million) by UK data governing body the Information Commissioner’s Office (ICO) in 2020 for failing to keep customers’ personal data secure. An article by The New York Times attributed the attack to a Chinese intelligence group seeking to gather data on US citizens.
8. Yahoo
Date: 2014 Impact: 500 million accounts
Making its second appearance in this list is Yahoo, which suffered an attack in 2014 separate to the one in 2013 cited above. On this occasion, state-sponsored actors stole data from 500 million accounts including names, email addresses, phone numbers, hashed passwords, and dates of birth. The company took initial remedial steps back in 2014, but it wasn’t until 2016 that Yahoo went public with the details after a stolen database went on sale on the black market.
9. Adult Friend Finder
Date: October 2016 Impact: 412.2 million accounts
The adult-oriented social networking service The FriendFinder Network had 20 years’ worth of user data across six databases stolen by cyber-thieves in October 2016. Given the sensitive nature of the services offered by the company – which include casual hookup and adult content websites like Adult Friend Finder, Penthouse.com, and Stripshow.com – the breach of data from more than 412 million accounts including names, email addresses, and passwords had the potential to be particularly damaging for victims. What’s more, the vast majority of the exposed passwords were hashed via the notoriously weak algorithm SHA-1, with an estimated 99% of them cracked by the time LeakedSource.com published its analysis of the data set on November 14, 2016.
10. MySpace
Date: 2013 Impact: 360 million user accounts
Though it had long stopped being the powerhouse that it once was, social media site MySpace hit the headlines in 2016 after 360 million user accounts were both leaked onto LeakedSource.com and put up for sale on dark web market The Real Deal with an asking price of 6 bitcoin (around $3,000 at the time).
According to the company, lost data included email addresses, passwords and usernames for “a portion of accounts that were created prior to June 11, 2013, on the old Myspace platform. In order to protect our users, we have invalidated all user passwords for the affected accounts created prior to June 11, 2013, on the old Myspace platform. These users returning to Myspace will be prompted to authenticate their account and to reset their password by following instructions.”
It’s believed that the passwords were stored as SHA-1 hashes of the first 10 characters of the password converted to lowercase.
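To make the weakness of that reported scheme concrete, here is a minimal Python sketch. It assumes the description above is accurate (truncate to 10 characters, lowercase, then apply unsalted SHA-1); the function name `myspace_style_hash` and the sample passwords are hypothetical and used only for illustration.

```python
import hashlib


def myspace_style_hash(password: str) -> str:
    """Hypothetical reconstruction of the reported scheme: SHA-1 over the
    first 10 characters of the password, converted to lowercase.
    For ASCII input, truncating before or after lowercasing is equivalent."""
    truncated = password[:10].lower()
    return hashlib.sha1(truncated.encode("utf-8")).hexdigest()


# Two visibly different passwords share the same first 10 characters once
# case is ignored, so they collapse to the same stored hash.
first = myspace_style_hash("CorrectHorseBattery")
second = myspace_style_hash("correcthorSE-staple")
print(first == second)
```

Because everything past the tenth character and all case information is discarded, and no salt is used, an attacker only has to search a much smaller space, and identical truncated passwords produce identical hashes across accounts.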
11. NetEase
Date: October 2015 Impact: 235 million user accounts
NetEase, a provider of mailbox services through the likes of 163.com and 126.com, reportedly suffered a breach in October 2015 when email addresses and plaintext passwords relating to 235 million accounts were being sold by dark web marketplace vendor DoubleFlag. NetEase has maintained that no data breach occurred and to this day HIBP states: “Whilst there is evidence that the data itself is legitimate (multiple HIBP subscribers confirmed a password they use is in the data), due to the difficulty of emphatically verifying the Chinese breach it has been flagged as ‘unverified.’”
12. Court Ventures (Experian)
Date: October 2013 Impact: 200 million personal records
Experian subsidiary Court Ventures fell victim in 2013 when a Vietnamese man tricked it into giving him access to a database containing 200 million personal records by posing as a private investigator from Singapore. The details of Hieu Minh Ngo’s exploits only came to light following his arrest for selling personal information of US residents (including credit card numbers and Social Security numbers) to cybercriminals across the world, something he had been doing since 2007. In March 2014, he pleaded guilty to multiple charges including identity fraud in the US District Court for the District of New Hampshire. The DoJ stated at the time that Ngo had made a total of $2 million from selling personal data.
13. LinkedIn
Date: June 2012 Impact: 165 million users
Making its second appearance on this list is LinkedIn, this time in reference to a breach it suffered in 2012 when it announced that 6.5 million unassociated passwords (unsalted SHA-1 hashes) had been stolen by attackers and posted onto a Russian hacker forum. However, it wasn’t until 2016 that the full extent of the incident was revealed. The same hacker selling MySpace’s data was found to be offering the email addresses and passwords of around 165 million LinkedIn users for just 5 bitcoins (around $2,000 at the time). LinkedIn acknowledged that it had been made aware of the breach, and said it had reset the passwords of affected accounts.
14. Dubsmash
Date: December 2018 Impact: 162 million user accounts
In December 2018, New York-based video messaging service Dubsmash had 162 million email addresses, usernames, PBKDF2 password hashes, and other personal data such as dates of birth stolen, all of which was then put up for sale on the Dream Market dark web market in February 2019. The information was being sold as part of a collected dump also including the likes of MyFitnessPal, MyHeritage (92 million), ShareThis, Armor Games, and dating app CoffeeMeetsBagel.
Dubsmash acknowledged the breach and sale of information had occurred and provided advice around password changing. However, it failed to state how the attackers got in or confirm how many users were affected.
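The Dubsmash passwords were stored as PBKDF2 hashes, in contrast to the unsalted SHA-1 hashes seen in several other breaches on this list. The Python sketch below illustrates the difference using only the standard library. The iteration count, salt length, and function names are assumptions chosen for illustration, not details of Dubsmash’s actual configuration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, not taken from the source


def unsalted_sha1(password: str) -> str:
    """Fast, unsalted hash: identical passwords always give identical output."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def pbkdf2_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def pbkdf2_verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = pbkdf2_hash(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = pbkdf2_hash("hunter2")
print(pbkdf2_verify("hunter2", salt, stored))
print(pbkdf2_verify("wrong-guess", salt, stored))
```

The per-password salt defeats precomputed lookup tables, and the iteration count makes each guess deliberately expensive, which is why leaked PBKDF2 hashes are generally far slower to crack than leaked unsalted SHA-1 hashes.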
15. Adobe
Date: October 2013 Impact: 153 million user records
In early October 2013, Adobe reported that hackers had stolen almost three million encrypted customer credit card records and login data for an undetermined number of user accounts. Days later, Adobe increased that estimate to include IDs and encrypted passwords for 38 million “active users.” Security blogger Brian Krebs then reported that a file posted just days earlier “appears to include more than 150 million username and hashed password pairs taken from Adobe.” Weeks of research showed that the hack had also exposed customer names, passwords, and debit and credit card information. An agreement in August 2015 called for Adobe to pay $1.1 million in legal fees and an undisclosed amount to users to settle claims of violating the Customer Records Act and unfair business practices. In November 2016, the amount paid to customers was reported to be $1 million.
16. National Public Data
Date: December 2023 Impact: 270 million people
A breach of background checking firm National Public Data exposed the data of hundreds of millions of people through the disclosure of an estimated 2.9 billion records. As a result of the December 2023 hack, stolen data was put up for sale on the dark web by hacking group USDoD in April 2024. Much of the stolen data was leaked and made freely available in a 4TB dump on a cybercrime forum in July 2024.
The incident, which only became public knowledge after a class action was filed in August 2024, exposed social security numbers, names, mailing addresses, emails, and phone numbers of 270 million people, mostly US citizens. Much of the data, which also includes information pertaining to Canadian and British residents, appears to be outdated or inaccurate but the impact of the exposure of so much personal information is nonetheless severe. An estimated 70 million rows of records cover US criminal records.
The mechanism of the initial breach remains unconfirmed but investigative reporter Brian Krebs reports that up until early August 2024 an NPD property, recordscheck.net, contained the usernames and password for the site’s administrator in a plain text archive.
In a statement, Jericho Pictures (which trades as National Public Data) advised people to closely monitor their financial accounts for unauthorised activity. National Public Data said it was working with law enforcement and governmental investigators, adding that it is reviewing potentially affected records to understand the scope of the breach. It will “try to notify” affected parties if there are “further significant developments”.
Experts advise consumers to consider freezing credit with the three major bureaus (Equifax, Experian, and TransUnion) and using identity theft protection services as potential precautions.
17. Equifax
Date: 2017 Impact: 159 million records
Credit reference agency Equifax suffered a data breach in 2017 that affected 147 million US citizens and 15 million Britons. Names, social security numbers, birth dates, addresses as well as driver’s licenses of more than 10 million were exposed after attackers took advantage of a web security vulnerability to break into Equifax’s systems. The breach also exposed the credit card data of a smaller group of 209,000 people.
Attackers broke into Equifax’s systems between May and July 2017 by taking advantage of an unpatched Apache Struts vulnerability to hack into the credit reference agency’s dispute resolution portal. Patches for the exploited vulnerability had been available since March 2017, months before the attack. Struts is a popular framework for creating Java-based web applications.
Cybercriminals moved laterally through their ingress points before stealing credentials that allowed them to query its databases, systematically siphoning off stolen data. US authorities charged four named members of the Chinese military with masterminding the hack. Chinese authorities have denied any involvement in the attack.
Equifax faced numerous lawsuits and government investigations in the wake of the breach. The credit reference agency was left an estimated $1.7 billion out of pocket because of the breach without taking into account the effect on its stock price. Equifax spent an estimated $337 million on improving its technology and data security, legal and computer forensic fees and other direct costs alone.
18. eBay
Date: 2014 Impact: 145 million records
A breach on online marketplace eBay between late February and early March 2014 exposed sensitive personal information of an estimated 145 million user accounts. Cybercriminals gained access to eBay’s systems after compromising a small number of employee login credentials.
The hack allowed miscreants access to sensitive information including encrypted passwords, email addresses, mailing addresses, phone numbers and dates of birth. Financial information, including data on PayPal accounts, was stored on a separate system and therefore not affected by the breach. In response to the incident, eBay applied a forced reset to user passwords.
Michael Hill is the UK editor of CSO Online. He has spent the past five-plus years covering various aspects of the cybersecurity industry, with particular interest in the ever-evolving role of the human-related elements of information security. A keen storyteller with a passion for the publishing process, he enjoys working creatively to produce media that has the biggest possible impact on the audience.
Dan Swinhoe is UK Editor of CSO Online. Previously he was Senior Staff Writer at IDG Connect.
John Leyden is a senior writer for CSO Online. He has written about computer networking and cyber-security for more than 20 years. Prior to the advent of the web, he worked as a crime reporter at a local newspaper in Manchester, UK. John holds an honors degree in electronic engineering from City, University of London.
Advancing database security: a comprehensive systematic mapping study of potential challenges
- Open access
- Published: 17 July 2023
- Volume 30 , pages 6399–6426, ( 2024 )
- Asif Iqbal 1 ,
- Siffat Ullah Khan 1 ,
- Mahmood Niazi 2 , 3 ,
- Mamoona Humayun 4 ,
- Najm Us Sama 5 ,
- Arif Ali Khan 6 &
- Aakash Ahmad 7
The value of data to a company means that it must be protected. When it comes to safeguarding their local and worldwide databases, businesses face a number of challenges. This study systematically reviews the literature to highlight the difficulties in establishing, implementing, and maintaining secure databases. In order to better understand database system problems, we conducted a systematic mapping study (SMS). We analyzed 100 research publications from different digital libraries and found 20 issues after applying inclusion and exclusion criteria. This SMS aimed to identify the most up-to-date research in database security and the different challenges faced by users/clients using various databases from a software engineering perspective. In total, 20 challenges were identified related to database security. Our results show that “weak authorization system”, “weak access control”, “privacy issues/data leakage”, “lack of NOP security”, and “database attacks” are the most frequently cited critical challenges. Further analyses were performed to show different challenges with respect to different phases of the software development lifecycle, venue of publications, types of database attacks, and active research institutes/universities researching database security. Organizations should implement adequate mitigation strategies to address the identified database challenges. This research will also provide a direction for new research in this area.
1 Introduction
Companies’ databases (DBs) are repositories of their most significant and high-value data. As DB utilization has surged, so has the frequency of attacks on these databases. A DB attack is characterized as an event that jeopardizes a resource by altering or destroying vital data [ 1 , 2 ]. The common goal of DB attacks is to access critical information. Illicitly acquiring sensitive data such as credit card details, banking data, and personal identifiers is another prevalent motive behind DB hacks. In our interconnected global society, several technologies provide avenues for DB attacks to exploit vulnerabilities in DB architecture, as per common understanding [ 1 , 3 , 4 ].
Many enterprises confront challenges like data piracy, data replication, and denial of service attacks. To infiltrate a company’s DBs, cybercriminals scout for system vulnerabilities and exploit them using specialized tools [ 5 , 6 ].
The aspect of security should be prioritized during the development of information systems, particularly DBs. In terms of software development, security concerns must be addressed at every stage of the development cycle [ 7 ]. As illustrated in Fig. 1 , security breaches, including the loss of critical data, have become commonplace in recent years. Given the importance of data security to numerous businesses, a range of measures and methodologies are required to safeguard the DB [ 8 , 9 , 10 ]. A secure DB is designed to react appropriately in the event of a potential DB attack [ 11 ].
Fig. 1 Total data breach costs in different countries [ 5 ]
In the current world, the impact of cyber-attacks on the commercial landscape must be addressed. To succeed in the globalized environment, businesses must ensure the protection of their vital data. DBs can be safeguarded from unauthorized access [ 12 , 13 , 14 ]. When a DB is outsourced to the cloud, cloud platforms introduce security challenges such as unreliable service providers, malicious cloud employees, data protection, consistency, and scalability. With cloud DBs becoming increasingly susceptible to both external and internal threats, traditional and conventional security measures are insufficient for their protection [ 15 , 16 ].
While extensive work has been done in this field, much of it focuses on a few specific DB platforms or problems, typically explored through standard literature reviews. We aim to provide a more holistic view by conducting a systematic mapping study (SMS) to identify security concerns in DB architecture, development, and maintenance from a software engineering perspective. This SMS will help us identify the ongoing research challenges and priorities.
The following research questions (RQs) will guide our SMS to achieve our study objectives:
RQ1 What is the current state of the art in the development and implementation of secure DBs?
RQ2 What are the security issues in building, implementing, and maintaining secure DBs, as reported in the literature?
1.1 Paper contribution
The contributions of the intended work are as follows:
The proposed research undertakes a systematic mapping study (SMS) to identify and emphasize the challenges associated with developing and maintaining secure databases.
In addition to showing the difficulties experienced by users using various databases from a software engineering standpoint, our SMS survey sheds light on some of the most current database security studies.
It also highlights the importance of maintaining careful attention to database security and suggests a direction for future research in this field.
1.2 Motivation for the paper
Several studies in the literature seek to provide solutions for database security. However, before moving forward with new solutions, it is necessary to synthesize current knowledge to offer security practitioners the most up-to-date information. We must identify the state of the art in constructing, implementing, and maintaining dependable databases, as well as the associated security challenges, so that DB design, development, and maintenance can be made secure. The motivation behind this research is to provide in-depth solutions to these problems.
1.3 Paper organization
The remainder of the article is arranged as follows.
Sect. 2 discusses the background of DB security, and Sect. 3 describes the research methodology in detail. The results of our SMS are given in Sect. 4. The implications of our findings are discussed in Sect. 5. Finally, the conclusion and future work are presented in Sect. 6. Other supporting information is provided in the remaining sections at the end of this paper.
2 Background
There are a number of studies that look at database security from different angles. In their study [ 17 ], Mai et al. suggest using cloud-based security measures to safeguard power system databases. Using an RSA encryption method, public and private keys are generated for database encryption; a huge prime integer is chosen randomly from the cloud platform’s Simple Storage Service and used as the client key. When the database receives a verification key, it compares it to the public key and private key established by the RSA encryption method. If the database determines that the access is legitimate, it provides feedback on the access. According to the findings of the tests, the database can be protected against threats as the threat situation value is always less than 0.50 once the design technique has been implemented.
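As a rough illustration of the RSA key pair idea mentioned above, the following Python sketch works through textbook RSA with deliberately tiny primes. It is not the scheme of Mai et al. [ 17 ], whose implementation details are not given here. The primes, exponent, function names, and message value are all assumptions chosen for readability, and a real system would rely on a vetted cryptography library with large keys and proper padding.

```python
from math import gcd
from typing import Tuple


def make_toy_keypair(p: int, q: int, e: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Build a textbook RSA key pair from two primes (toy sizes only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with (p - 1) * (q - 1)")
    d = pow(e, -1, phi)  # modular inverse of e, Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def rsa_apply(value: int, key: Tuple[int, int]) -> int:
    """Modular exponentiation; used for both encryption and decryption."""
    n, exponent = key
    return pow(value, exponent, n)


# Illustrative parameters only: real RSA primes are hundreds of digits long.
public_key, private_key = make_toy_keypair(p=61, q=53, e=17)

message = 65  # must be smaller than n = 3233
ciphertext = rsa_apply(message, public_key)
recovered = rsa_apply(ciphertext, private_key)
print(recovered == message)
```

The sketch shows only the relationship between the public and private exponents. How the verification key is compared against them in the cited design is not specified in this section, so that step is left out.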
Ibrahim et al. developed a data encryption algorithm that provides an encryption-based solution for DB security [18]. In this system, information is encrypted using standard ASCII characters. All of the data in the database is encrypted, and three keys are used to access the primary formula. The data may be numeric or textual. The suggested formula can restore the data’s original format by combining another coordinator with the aforementioned three keys. The algorithm prioritizes data size and recording speed, so that the encrypted data remains comparable in size to the original and is processed at a reasonable pace [18].
Another article [19] offers a lightweight cryptosystem based on the Rivest Cipher 4 (RC4) algorithm as a solution to the widespread problem of insecure database transfer between sender and recipient. This cryptosystem safeguards sensitive information by encrypting it before it is sent through a network and decrypting it upon safe arrival. In this system, the database tables are encapsulated in ciphered form.
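For readers unfamiliar with RC4, the sketch below shows the standard key-scheduling and keystream-generation steps in Python. It illustrates the generic RC4 stream cipher only, not the specific cryptosystem of [19]; the function name is an assumption. RC4 is shown for explanatory purposes: it has known weaknesses and has been prohibited in TLS, so it should not be chosen for new designs.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    if not key:
        raise ValueError("key must not be empty")

    # Key-scheduling algorithm: permute the state array using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation: XOR each data byte with a keystream byte.
    output = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        output.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(output)


# Hypothetical record sent from sender to recipient.
record = b"account=42;balance=1000"
sent = rc4(b"shared-secret", record)
received = rc4(b"shared-secret", sent)
```

Because encryption and decryption are the same XOR operation, applying `rc4` twice with the same key returns the original record.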
Continual advances in digitization have increased the prominence of online services. Enterprises must store essential data in corporate DB systems, including bank records, transactions, patient histories, personal data, agreements, etc. These institutions must also maintain the data’s authenticity, privacy, and availability. Any breach of security procedures or data may cause severe economic loss and damage the company’s reputation [20]. The remarkable growth in the deployment of DBs reflects the architecture required to cope with the volume of information attributable to the rise of big data. According to research, the entire quantity of institutional information doubles every 1.2 years [21].
Most of the latest studies provide encryption-based solutions for DB security. However, before proceeding towards these solutions, there is a need to find out the flaws that lead to security breaches.
One or more of the following sources can lead to a security flaw:
Interior: Internal attacks originate from inside the corporation. Human resources (organization supervisors, admins, workers, and interns) all fall within this category of insiders. Almost all insiders are trusted to some degree, and only a few, such as IT professionals, hold significant access levels.
Exterior: External attacks originate from entities outside the organization, for instance cybercriminals, organized criminal groups, and government agencies. Usually, no trust or privilege is extended to external sources.
Collaborator: Any third party in a business relationship with the organization, firm, or group is considered a partner. This broad collection of partners, distributors, vendors, contract labor, and customers is known as the extended enterprise. Some level of trust and access privilege is usually implied among partners in the extended enterprise.
2.1 Secure databases
With highly sensitive data and an expanded online presence, concerns about DB security are at an all-time high. As more systems are connected and brought online to improve access, exposure to attacks also increases, with financial losses estimated at about $1.3 million; these malicious attacks also damage the organization’s public reputation and client relations [21, 22]. In an insecure DB system, any user can retrieve information from the DB server without restriction. Any host may connect to the DB server from any IP address, making everyone’s information in the storage engine accessible [23, 24].
Hence, DB systems are equipped with numerous security mechanisms, including the prevention of unauthorized access to data by insiders or outsiders of an organization. Proper encryption techniques should be applied to secure DBs [25]. The most comprehensive secure DB model is the multilevel model, which arranges information according to its sensitivity and relies on mandatory access control (MAC) [7]. DB services are intended to ensure that client DBs are secure by implementing backup and recovery techniques [26].
The DB can be protected from unauthorized third parties through cryptography and related techniques. The primary motivation behind DB security is ensuring data privacy against unauthorized outsiders. The essential techniques in DB security are authentication, confidentiality, and integrity, which are used to secure DBs [27]. DB construction, in particular, must treat security as a main goal when developing a data system. In this respect, security should be addressed at all stages of the software development process [7, 28, 29, 30].
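The idea behind a multilevel model with mandatory access control can be sketched in a few lines of Python. The sketch assumes the classic "no read up, no write down" rules; the level names, the row layout, and the function names are hypothetical and are not taken from [7].

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def can_read(clearance: Level, classification: Level) -> bool:
    """No read up: a subject may read data at or below its clearance."""
    return clearance >= classification


def can_write(clearance: Level, classification: Level) -> bool:
    """No write down: a subject may write data at or above its clearance."""
    return clearance <= classification


def visible_rows(rows, clearance: Level):
    """Return only the rows a subject with the given clearance may read."""
    return [row for row in rows if can_read(clearance, row["level"])]


# Hypothetical table in which every row carries its own classification.
rows = [
    {"id": 1, "level": Level.PUBLIC, "value": "office address"},
    {"id": 2, "level": Level.CONFIDENTIAL, "value": "salary band"},
    {"id": 3, "level": Level.SECRET, "value": "audit findings"},
]
analyst_view = visible_rows(rows, Level.CONFIDENTIAL)
```

The point of the design is that the decision depends only on the labels attached to subjects and data, not on permissions that individual users can grant to each other.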
2.2 Related work
Various articles examine the importance of security controls from the perspective of software engineering [31]. For instance, Márquez et al. [32] conducted a systematic survey concentrating on the security of telemedicine platforms from the software engineering viewpoint. The key focus of that article is investigating how software development assists in designing a reliable telehealth platform. However, the work is restricted to telehealth systems.
Al-Sayid et al. [1] studied the challenges of data stores and outlined DB security issues. They observed a wide variety of DB security issues that must be addressed to prevent unauthorized access to or alteration of the DB’s critical material. Another study by Zeb focuses on identifying potential attacks on the DB system using a standard research study. Mousa et al. [33] identified various risks to DB safety through an unstructured research study. Moghadam et al. [15] investigated cloud servers to identify all conceivable threats.
Nevertheless, this analysis is restricted to the cloud DB environment. Segundo Toapanta et al. [5] uncovered real-world examples of cybercrime. However, their research is restricted to cyberattacks.
The authors in [21] suggested an innovative technique for spotting distinct threats to DB systems by assessing the risk of incoming new activities. Their research discovered various harmful attacks that could damage the DB system. The emphasis of their research is confined to security assessment involving DBs. The authors in [32] present a comprehensive mapping analysis, but their observations are limited to the privacy of telehealth systems from the software engineering point of view. These studies did not define the security problems in creating, implementing, and managing secure DBs. Furthermore, with the rapid development of ICTs, it is essential to stay up to date on the most recent developments in this field.
The primary goal of this research is to gain a greater understanding of this topic by conducting a Systematic Mapping Study (SMS) to identify the problems in building, managing, and sustaining reliable DBs.
3 Research methodology
The goal of this study was accomplished by evaluating the current state of DB privacy and suggesting areas that need further research. With an SMS, researchers can better connect the data from the literature to a series of questions [34, 35]. An SMS is a descriptive investigation that involves selecting and combining all published research articles associated with a particular challenge, and it gives a broad summary of existing materials relating to the particular questions. Software engineers stand to benefit significantly from an SMS because it provides a comprehensive overview of the research in the field. Figure 2 outlines the process that was followed to conduct the mapping study.
Fig. 2 SMS process
3.1 Research questions
Our primary objective is to find the obstacles in planning, creating, and managing data protection. To achieve this objective, the following research questions have been devised.
RQ 1 What is the current state of the art in the development and implementation of secure DBs?
To address RQ1, we studied the literature based on the sub-questions listed below:
RQ 1.1 In terms of reliable data modeling, development, and maintenance, which stage has received the most attention in the research?
RQ 1.2 What are the primary publication venues for robust DB design?
RQ 1.3 Which research institutions are active in robust data modeling?
RQ 1.4 What kinds of DB attacks have been described in the research?
RQ 1.5 According to the research, what are the various categories of DBs?
RQ 1.6 What kinds of DBMS platforms are often employed, as stated in the literature?
RQ 2 What are the security issues in building, implementing, and maintaining secure DBs, as reported in the literature?
3.2 Search strategy
Following the scholars in [36, 37, 38], we employed the PICO (Population, Intervention, Comparison, and Outcomes) framework to develop a list of terms and then drew search terms from the research questions.
Population: DBs and software development in general.
Intervention: Security strategies.
Comparison: No comparison is made in the present investigation.
Outcomes: Reliable DBs.
3.3 Search strings
After several trials, the following two groups of search terms were selected and linked, in line with the PICO aspects, using the Boolean connector AND:
((“Database security” OR “Secure Databases” OR “Database protection” OR “Guarding Database” OR “Database intrusion” OR “Database prevention”) AND (“Security Mechanisms” OR “Security Models” OR “Security methods” OR “Security policies” OR “Security techniques” OR “Security Guidelines”)).
For the ScienceDirect online repository, we shortened the above search string because of length limits. As a result, the following keywords were entered into the ScienceDirect database:
((“Database security” OR “Secure Databases” OR “Database protection” OR “Guarding Database” OR “Database prevention”) AND (“Security Mechanisms” OR “Security methods” OR “Security techniques” OR “Security guidelines”)).
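The way the two term groups are combined can be sketched in Python as follows. The helper names are hypothetical, and straight quotes are used instead of typographic ones; the term lists are copied from the two search strings above, with the shortened ScienceDirect string obtained by dropping the terms that do not appear in it.

```python
def or_group(terms):
    """Join quoted terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Link several OR-groups with the Boolean connector AND."""
    return "(" + " AND ".join(or_group(group) for group in groups) + ")"


population_terms = [
    "Database security", "Secure Databases", "Database protection",
    "Guarding Database", "Database intrusion", "Database prevention",
]
intervention_terms = [
    "Security Mechanisms", "Security Models", "Security methods",
    "Security policies", "Security techniques", "Security Guidelines",
]

# Full search string used for most repositories.
full_query = build_query(population_terms, intervention_terms)

# Shortened string for ScienceDirect: the same groups with three terms removed.
dropped = {"Database intrusion", "Security Models", "Security policies"}
short_query = build_query(
    [term for term in population_terms if term not in dropped],
    [term for term in intervention_terms if term not in dropped],
)
```

Keeping the terms in lists makes it straightforward to regenerate a shorter query when a repository limits the number of Boolean operators or characters.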
3.4 Literature resources
We chose the digital repositories below (A to F) to conduct our SMS and executed the search strings to retrieve publications.
IEEE Xplore–B
SpringerLink–C
AIS Electronic Library (AISeL)–D
ScienceDirect–E
Wiley Online Library–F
3.5 Research evaluation criteria
Titles, abstracts, full-text readings, and quality assessments were all factors in our selection of research publications. The primary goal of the selection process is to compile an appropriate collection of papers by applying inclusion and exclusion criteria. We set the following inclusion and exclusion criteria to perform our SMS effectively. The same criteria have been used in other studies [39, 40, 41].
3.5.1 Inclusion criteria
Only articles that meet one or more of the criteria below were considered for inclusion in our collection.
I1 Research involving the design and implementation of database security measures.
I2 Research that explains how to protect DBs.
I3 Research on the difficulties and dangers of creating, implementing, and maintaining secure DBs.
I4 Research on the planning, development, and management of reliable DBs.
3.5.2 Exclusion criteria
The following exclusion criteria were applied to find relevant articles.
E1 Publications that are not published in the English language.
E2 Materials that have not been published in any journal, magazine, or conference proceedings, such as unpublished books and grey literature.
E3 Books as well as non-peer-reviewed articles, including briefs, proposals, keynotes, evaluations, tutorials, and forum discussions.
E4 Articles whose full text is not available in digital form.
E5 Publications that do not meet the inclusion requirements.
E6 Research that is only provided as abstracts or PowerPoint slides.
We used the snowballing approach [42, 43, 44] in addition to the above inclusion/exclusion criteria for our final selection. The snowball method was used to choose seven articles from various research repositories. Appendix 1 contains the papers selected using the snowballing approach, numbered 94 to 100. Scholars have employed the same method in recent research [45, 46].
3.6 Quality evaluation
All selected articles were evaluated for quality against the quality evaluation criteria listed in Table 1.
To evaluate the papers, we used a three-point Likert scale (yes, partially, no) for every element of the quality evaluation criteria. We awarded each element a score of 2 (yes), 1 (partially), or 0 (no). An article is included in the SMS if it achieves an average score of at least 0.5. Many other scholars [45, 47, 48, 49] have employed a similar approach. The quality ranking covers all of the questions from Table 1.
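The scoring rule can be illustrated with a short Python sketch. It assumes that the average is taken directly over the raw 0 to 2 scores of the individual criteria; the function names and the example answers are hypothetical.

```python
# Scores assigned to each answer on the three-point scale.
SCORES = {"yes": 2, "partially": 1, "no": 0}


def quality_score(answers):
    """Average the per-criterion scores of one article."""
    if not answers:
        raise ValueError("at least one answer is required")
    return sum(SCORES[answer] for answer in answers) / len(answers)


def is_included(answers, threshold=0.5):
    """An article passes when its average score reaches the threshold."""
    return quality_score(answers) >= threshold


# Hypothetical assessments of two articles against four criteria.
strong_article = ["yes", "partially", "no", "no"]
weak_article = ["no", "no", "no", "partially"]
```

With these example answers, the first article averages 0.75 and is kept, while the second averages 0.25 and is excluded.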
3.7 Article selection
Following the tollgate technique of Afzal et al. [50], we refined the selection of primary publications in our SMS after executing the search strings (Sect. 3.3) on the online repositories (Sect. 3.4). The five stages of this method are as follows (Table 2):
Stage 1 (St-1) Literature searches are conducted in digital repositories/DBs for the most relevant articles.
Stage 2 (St-2) An article is included or excluded based on a reading of its title and abstract.
Stage 3 (St-3) An article is included or excluded based on a review of its introduction and findings.
Stage 4 (St-4) An article is included or excluded based on a full-text review.
Stage 5 (St-5) The primary studies to be included in the SMS are vetted and finalized.
The initial search-string iteration (see Sect. 3.3) returned 4827 documents from the chosen online libraries/DBs, to which the inclusion and exclusion criteria (Sects. 3.5.1 and 3.5.2, respectively) were applied. The tollgate strategy led to a shortlist of 100 publications that were eventually selected for the research. The selected articles were then assessed against the quality evaluation criteria (Sect. 3.6). Appendix 1 lists the publications that were ultimately chosen.
3.8 Extracting and synthesizing content
The data were obtained by reviewing the selected articles. To address the questions stated in Sect. 3.1, the entire content of every article was reviewed and the pertinent data extracted. The detailed data extraction procedure is given in the SMS protocol.
4 Description of key findings
A systematic mapping study was used to determine the current state of the art and the privacy issues in data modeling, development, and maintenance. Sections 4.1, 4.2, 4.3, 4.4, 4.5 and 4.6 present the details of our observations.
4.1 The current state of the art
RQ1 has been addressed through the sub-questions below (Sects. 4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5 and 4.1.6).
4.1.1 Stages in the building of a protected database
RQ 1.1 focuses on the most frequently studied stages of a reliable DB (design, development, and maintenance). As seen in Table 3, the “design” stage was the most frequently mentioned, appearing in 27% of the publications. The “development” stage was mentioned in 25% of the publications. The “maintenance” stage was mentioned in only 5 of the publications in our SMS.
4.1.2 Well-known sources for the building of reliable DBs
RQ 1.2 is addressed in the second part of this SMS, which concentrates on the venues of the selected papers. For the venue and provider type analyses, we looked at five repositories (A, B, C, D, and E). In Tables 4 and 5, papers obtained through the snowballing method are labeled “others.” The papers from these collections were published in conferences, journals, and workshops/symposia, among other venues. As shown in Table 4, 45 out of 100 articles were published in conference venues. Journals were the second-largest channel, with 37 out of 100 publications. Workshops and symposiums accounted for 18% of the articles.
Table 4 lists a total of 100 articles spanning a wide range of topics related to DB privacy. This indicates that scholars have devoted a great deal of attention to this topic. “International Journal of Information Security (IJIS)”, “The International Journal on Very Large Data Bases (VLDB)”, “Computers and Security (C&S)”, “Digital Investigation (DI)”, “Journal of Natural Sciences (JNS)” and “Journal of Zhejiang University SCIENCE A (JZUS-A)” were found to be the most popular journals for privacy mechanisms in secure DB design, as listed in Table 5. We also found that the “Annual Computer Security Applications Conference (ACSAC)” and the “International Workshop on Digital Watermarking (IWDW)” are the most frequently represented conference and workshop venues on the topic of our research. Software engineering and other related domains can benefit greatly from DB privacy studies.
4.1.3 Research institutions participating in the construction of a reliable DB
The affiliation of each paper’s first author was used to identify the most active research institutions in the field of protected DBs. Table 6 shows the findings for RQ 1.3, which reveal that “University of Florida, USA (UOF)” and “CISUC, University of Coimbra, Portugal (UOC)” produced the most research publications on protected DBs (3 out of 100 each). Ben-Gurion University of the Negev (BGU); RMIT University, Melbourne, Australia; Yonsei University, Seoul; TELECOM Bretagne, Brest, France (ENST); Anna University, Chennai, India (AUC); Huazhong University of Science and Technology, Wuhan (HUST); and George Mason University, Fairfax, Virginia (GMU) each contributed two of the selected publications.
4.1.4 The most common kind of DB attacks, according to academic research
RQ 1.4 is concerned with identifying the kinds of DB attacks reported in the literature. Table 7 shows the three types of attack: internal, external, and both (internal and external). To understand intrusions effectively, cyber-attacks must be considered together with breaches by collaborators. When a single article mentions both internal and external attacks, we classify it as both (internal and external). According to Table 7, “Both (Internal & External)” attacks were the most frequent, appearing in 52 of the 100 articles in our SMS. “External” attacks followed with a frequency of 35%. In total, 13 papers in our SMS addressed “internal” attacks.
4.1.5 Database types that have been identified in the literature
To answer RQ 1.5, we identified the various DB types discussed in the literature. Seventeen different DB types have been documented in the articles included in our SMS. Table 8 shows that, of the 100 articles in our SMS, 24 mentioned the term “Web DB.” “Commercial DB” was the second most frequent, appearing in 11 of the 100 articles. “Multilevel DB and distributed DB” were mentioned in ten publications.
4.1.6 Kinds of database management systems (DBMS) presented in the research
Database management systems (DBMS) are examined in RQ 1.6. Eleven distinct DBMS types have been documented in the studies selected for our SMS. As shown in Table 9, the “Oracle DB system” was the most frequently mentioned (31 out of 100 articles). The “MySQL DB system” was the second most frequently mentioned (23 out of 100). Our SMS found 21 publications that mentioned the “SQL Server DB system.”
4.2 Issues in databases
As demonstrated in Table 10 and Fig. 3, our research into DB privacy has uncovered 20 issues from a pool of 100 studies (see Appendix 1).
Fig. 3 Issues in DB security
CC #1 Poor authentication system: A weak authentication system lets an unauthorized individual gain access to a DB and harvest vital information, allowing a hostile attacker to violate the security of trusted DBs [1, 51].
CC #2 Database intruders: Database intrusion threats include anomalous queries (anomalous query attack), harmful queries (query flood attack), and inferential attacks (polyinstantiation issue, aggregation problem).
CC #3 Inadequate database protection best strategies: Requirements engineering, architecture, planning, and development all suffer from the absence of proper security procedures.
CC #4 Authorized/malicious user threats: An authorized individual, employee, or administrator may collect or disclose critical data [52].
CC #5 Inadequate access control: Whenever many persons need access to the information, the risk of data fraud and leakage increases. Access should be restricted and regulated [1].
CC #6 Inadequate NOP protection: Inadequate NOP protection is a lack of network security, operating system security, and physical security.
CC #7 Data leakage/privacy challenges: Clients of database systems are increasingly concerned about information security. Attacks that disclosed confidential information, including passwords, emails, and private photographs, triggered this concern. Individuals and database systems cannot stop the propagation of data exploitation and destruction once the content has been leaked [53].
CC #8 Inappropriate database implementation/configuration/maintenance: Numerous DBs are improperly set up, configured, and maintained, which is among the main reasons for database privacy issues [54].
CC #9 Absence of resources: A shortage of resources covers, among other things, a lack of trained employees, a lack of time and budget, a shortage of reliable resources, and insufficient storage capacity.
CC #10 Database management challenges: These concern the effective handling of database systems, connectivity, and information at different levels [53].
CC #11 Inadequate connectivity platforms: Presently, the majority of customer, user, and third-party communications are conducted online. The opportunity to link DBs over the Internet has introduced an insecure transmission medium [1].
CC #12 Loss of information usage monitoring: Many users are unconcerned about their communications and may inadvertently send important information to an unauthorized person or an untrustworthy server. Because data usage is not supervised, data may also be lost or destroyed [1].
CC #13 Web-based accessibility of tools for database attacks: Many intrusion tools are freely accessible in this globally networked domain, allowing intruders to expose weak spots with minimal knowledge of the victim DB architecture [1].
CC #14 Inadequate database monitoring strategy: Regulatory risk and discovery, mitigation, and restoration risk are just a few of the dangers posed by a lack of DB auditing [1].
CC #15 Poor cryptography and anonymization: No DB privacy plan, regulation, or technology is sufficient without cryptography, whether the information is traveling over a network or being kept in the DB system [1].
CC #16 Unauthorized data alteration/deletion: Any type of unauthorized information alteration or deletion can result in substantial economic losses for an organization or corporation [55].
CC #17 Semantic ambiguities: These DB issues include semantic uncertainty, which arises from an absence of semantics or inadequate semantic descriptions, as well as dissemination issues, update scope constraints, and tuple mistrust [56, 57].
CC #18 DB outsourcing problems: Because so many DBs are now being outsourced, there are serious concerns about the data’s accuracy and safety. Clients have to relinquish management of the information they have outsourced [58, 59].
CC #19 Regulatory and licensing challenges: DBs have many security issues, including policy and licensing concerns. Does the corporation have a consistent and approved policy and license from the relevant authorities or organization [1, 60]?
CC #20 Poor verification system: A poor verification system allows an attacker to assume the credentials of a legitimate DB user and access its data. The attacker has a wide range of options for obtaining credentials, such as guessing passwords that are easy to remember [1] or using a default username and password.
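As one illustration of how the authentication-related issues (CC #1 and CC #20) are commonly mitigated, the sketch below stores salted password hashes instead of plain passwords and rejects well-known default passwords. It is a minimal example using Python's standard library, not a method proposed in the reviewed studies; the iteration count, the list of default passwords, and the function names are assumptions.

```python
import hashlib
import hmac
import secrets

# Hypothetical examples of passwords that ship as defaults or are easily guessed.
DEFAULT_PASSWORDS = {"admin", "password", "123456", "root"}


def hash_password(password: str, salt: bytes = None, iterations: int = 200_000):
    """Derive a salted PBKDF2-HMAC-SHA256 hash; returns (salt, digest)."""
    if password.lower() in DEFAULT_PASSWORDS:
        raise ValueError("default or easily guessed password rejected")
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per account
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 200_000) -> bool:
    """Recompute the hash and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(digest, expected)


# Only the salt and digest would be stored in the DB, never the password.
stored_salt, stored_digest = hash_password("correct horse battery staple")
```

A per-account salt means that two users with the same password get different stored values, and the constant-time comparison avoids leaking how many leading bytes of a guess were correct.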
4.3 An assessment of database protection issues based on continents
Our SMS findings include research from several continents. This work presents a comparative analysis of only three continents, i.e., Europe, North America, and Asia (see Appendix 2 for more details). We want to find out whether these issues differ across continents. We believe that examining the similarities and differences among these problems helps us prepare to deal with them on the continent in question. We employed the linear-by-linear association chi-square test to determine whether there were significant variations among the issues in the three continents (Martin, 2000). There are many more similarities than differences among the issues in the three continents. Poor authentication systems, DB intruders, inadequate DB protection best strategies, and authorized/malicious user threats are the only major differences found in Table 11. According to our findings, the most prevalent risks in the three continents are “Inadequate Access Control” (65%, 57%, and 64%), “Inadequate NOP Protection” (59%, 57%, and 60%), “Data Leakage/Privacy Challenges” (49%, 60%, and 64%), and “Authorized/Malicious User Threats” (40%, 20%, and 52%). “Authorized/Malicious User Threats,” “Inadequate Access Control,” and “Inadequate NOP Protection” are common across Europe and Asia. Inadequate Connectivity Platforms, Poor Verification Systems, Data Leakage/Privacy Challenges, and Regulatory and Licensing Challenges are among the problems that North American and European clients/users face when creating secure DBs, as shown in Table 11. According to our research, the “Poor Verification System” problem affects the largest share of customers and users in Asia (78 percent). “Data Leakage/Privacy Challenges” is the most common issue faced by European customers and individuals (60 percent). Many customers in North America face “Inadequate Access Control” and “Data Leakage/Privacy Challenges” concerns (64 percent each) (Fig. 4).
Fig. 4 Distribution depending on continents
4.4 Methodological assessment of database privacy issues
Table 12 shows how we divided the publications and their reported issues among three methodological approaches: tests, ordinary literature review (OLR), and other/mixed approaches, as shown in Fig. 5. Other approaches include experience reports, case studies, surveys, and fuzzy methodologies. By “mixed approaches,” we mean that more than one methodology is employed in a single work. Testing is the most commonly used approach (39 out of 100 publications, according to Table 12). The second notable finding in our SMS is that 31 of the 100 publications used an ordinary literature review approach. Appendix 2 has further information. Studying the distribution of publications among the three methodologies revealed many issues. Seventeen issues have been identified in relation to OLR, as shown in Table 12. Two of the seventeen issues have been mentioned in over 50% of the publications: Inadequate Access Control (74%) and Data Leakage/Privacy Challenges (52%). Publications based on tests report a total of 18 issues. Four of these 18 issues have been cited in more than 50% of the publications: “Data Leakage/Privacy Challenges” (64 percent), “Inadequate NOP Protection” (62 percent), “Poor Authentication System” (56 percent), and “Inadequate Access Control” (56 percent). Other/Mixed Approaches publications have highlighted twenty issues, 4 of which were cited in more than half of the publications: “Poor Authentication System” (73%), “Inadequate NOP Protection” (63%), “Inadequate Access Control” (60%), and “Data Leakage/Privacy Challenges” (60%).
Fig. 5 Methodological-based distribution of papers
Table 12 shows that no study employed an SMS approach (n = 0). This finding suggests that our study methodology is novel in this particular field. We performed the Linear-by-Linear Chi-Square test across the methodologies mentioned above to establish whether the issues differed significantly among them. “Poor Authentication System” and “Inappropriate DB implementation/configuration/maintenance” are the only issues with significant differences.
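For readers who want to see how such a test is computed, the following Python sketch implements the linear-by-linear association statistic, M² = (N − 1) r², where r is the Pearson correlation between row and column scores weighted by the cell counts, and M² is compared with a chi-square distribution with one degree of freedom. The equally spaced scores and the example counts are assumptions made for illustration; they are not data from Table 12.

```python
from math import erfc, sqrt


def linear_by_linear(table, row_scores=None, col_scores=None):
    """Return (M2, p_value) for a contingency table of counts."""
    u = row_scores or list(range(1, len(table) + 1))
    v = col_scores or list(range(1, len(table[0]) + 1))
    n = sum(sum(row) for row in table)

    # Weighted sums over all cells, using the counts as weights.
    su = sv = suu = svv = suv = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            su += u[i] * count
            sv += v[j] * count
            suu += u[i] ** 2 * count
            svv += v[j] ** 2 * count
            suv += u[i] * v[j] * count

    cov = suv - su * sv / n
    var_u = suu - su ** 2 / n
    var_v = svv - sv ** 2 / n
    if var_u == 0 or var_v == 0:
        raise ValueError("scores must vary across rows and columns")

    r = cov / sqrt(var_u * var_v)
    m2 = (n - 1) * r ** 2
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(m2 / 2))
    return m2, p_value


# Hypothetical counts: rows = issue cited (yes, no), columns = three groups.
example = [[12, 18, 27], [22, 16, 5]]
m2, p_value = linear_by_linear(example)
```

A small p-value (conventionally below .05) would indicate a trend in how often the issue is cited across the ordered groups.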
4.5 Years-based study of database privacy issues
A comparison of issues over two time periods, 1990–2010 and 2011–2021, is shown in Table 13 and presented in Fig. 6. More information can be found in Appendix 2. For the first period, we found that 18 issues had been highlighted in the research. Four of the 18 issues have been cited in more than 50% of the publications. Inadequate Access Control (70 percent), Poor Authentication System (65 percent), Inadequate NOP Protection (62 percent), and Data Leakage/Privacy Challenges (52 percent) are the most commonly stated vulnerabilities. Inadequate Access Control was reported in 70 percent of the publications from 1990 to 2010, indicating that designers failed to control access permissions effectively during implementation.
Fig. 6 Year-based distribution of publications
Furthermore, admins in an organization are responsible for ensuring that data is adequately protected via access permissions. The “Inadequate Access Control” issue dropped to 58 percent in the second period. The literature reveals 19 issues for the second time period. Four of the 19 have been cited in at least half of the publications: “Data Leakage/Privacy Challenges” (63%), “Inadequate Access Control” (58%), “Poor Authentication System” (55%), and “Inadequate NOP Protection” (55%). We used the Linear-by-Linear Chi-Square analysis and identified a significant difference for only one issue, “DB Management Challenges,” with a p-value of less than .05.
4.6 Evaluation of articles based on their venue
Table 14 breaks down the selected articles by publication venue. The final articles in our SMS appeared in journals, symposiums, conferences, and workshops, as well as other publication venues. For ease of analysis, we grouped these venues into three categories: journals, workshops/symposiums, and conferences. According to Table 14 and Fig. 7, 45 percent of the articles in our study were presented at conferences. Additionally, 37% of the publications in Table 14 were published in journals. For further information, please see Appendix 2 at the end of the study. Analyzing the papers across these three channels revealed many issues. According to our findings, journal publications report 18 issues. Four of the 18 have been cited in at least half of the publications: “Data Leakage/Privacy Challenges” (84 percent), “Inadequate Access Control” (59 percent), “Inadequate NOP Protection” (59 percent), and “Poor Authentication System” (54 percent). Conference publications report a total of 20 issues. Three of these 20 have been cited in more than 50% of the publications: “Poor Authentication System” (71 percent), “Inadequate Access Control” (69 percent), and “Inadequate NOP Protection” (62 percent). Workshop/symposium publications report a total of 16 issues. Two of the 16 have been mentioned in over half of the publications: “Data Leakage/Privacy Challenges” (61 percent) and “Inadequate Access Control” (56 percent). The Linear-by-Linear Chi-Square test was used to find significant differences among the issues across venues. We found only one significant difference, for “Data Leakage/Privacy Challenges”.
Venue-based distribution of articles
4.7 Comparison with existing studies
A wealth of studies have delved into various aspects of database security. Some of these have centered their attention on securing data transmission from server to client, while others have prioritized the construction of secure databases through secure coding practices. The increasing dependence on geographically dispersed information systems for daily operations might augment productivity and efficiency but simultaneously heightens the risk of security violations. Current security measures ensure data transmission protection, yet a comprehensive security strategy must also encompass mechanisms to enforce diverse access control policies. These policies should consider the content sensitivity, data attributes and traits, and other contextual data such as timing.
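As a rough illustration of an access control policy that takes content sensitivity, a subject attribute, and timing into account, consider the sketch below. The `Policy` fields, the role names, and the numeric sensitivity levels are hypothetical and do not correspond to any particular database product or to a mechanism proposed in the reviewed studies.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Policy:
    """A hypothetical context-aware rule for reading protected records."""
    role: str               # subject attribute the rule applies to
    max_sensitivity: int    # highest content sensitivity level the role may read
    allowed_from: time      # start of the permitted access window
    allowed_until: time     # end of the permitted access window


def is_allowed(policy, role, sensitivity, at):
    """Grant access only if role, content sensitivity, and time all match."""
    return (
        role == policy.role
        and sensitivity <= policy.max_sensitivity
        and policy.allowed_from <= at <= policy.allowed_until
    )


# Hypothetical rule: analysts may read records up to level 2 during office hours.
analyst_policy = Policy("analyst", 2, time(8, 0), time(18, 0))
print(is_allowed(analyst_policy, "analyst", 2, time(9, 30)))
print(is_allowed(analyst_policy, "analyst", 3, time(9, 30)))
```

A request is denied as soon as any one condition fails, so the same user can be allowed to read a low-sensitivity record at 09:30 and refused a more sensitive one, or refused the same record outside the access window.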
The consensus in the field is that effective access control systems should integrate data semantics. Moreover, strategies ensuring data integrity and availability must be customized for databases. Consequently, the database security community has developed an array of strategies and procedures over time to safeguard the privacy, integrity, and accessibility of stored data.
Nonetheless, despite these advancements, fresh challenges persist in the database security landscape. Evolving threats, data access “disintermediation,” and emerging computing paradigms and applications like grid-based computing and on-demand business have all introduced new security demands and innovative contexts where existing methodologies can be employed or extended. Despite a multitude of available solutions, raising awareness about existing security breaches is critical for bolstering database security.
In response, we decided to conduct a Systematic Mapping Study (SMS) on secure databases to offer an up-to-date perspective for both database users and developers. We did not find any comprehensive systematic literature review (SLR) or mapping study on this topic to draw comparisons with. However, we believe this research will offer a strategic roadmap for all database stakeholders.
5 Practical implications of research
The practical implications of this research are manifold. First, the results of this SMS will serve as a valuable resource for DB privacy professionals and users. By drawing on the insights from this study, experts gain a better understanding of the DB privacy issues that need addressing and can therefore prioritize the most significant security challenges. This, in turn, makes DB users aware of their potential privacy risks. Thus, this study benefits consumers by assisting organizations in developing secure DB systems, mindful of the challenges they face (Table 10).
Furthermore, professionals such as DB designers, project managers, and scholars specializing in secure DB design are keen to keep abreast of the latest developments. This research provides DB developers with insights into novel strategies for DB security and the latest advancements in DB technology. Journals such as “VLDB,” “Computers & Security,” “DI,” and “JNS” should be of particular interest to them. Likewise, they would find it beneficial to examine papers from the “ACSAC” and “IWDW” conferences and workshops. These venues are well suited to studying reliable DB development.
These venues, recognized for their focus on secure DB design, encourage scholars to contribute high-quality academic articles. The outcomes of this study will inform experts’ decision-making processes, providing guidance on where to invest when developing tools and methodologies for safeguarding DB systems. Lastly, it underscores the need for organizations to provide appropriate training for their customers to tackle critical challenges.
Acknowledgements
The authors would like to acknowledge the support provided by the Deanship of Scientific Research via project number DF201007 at King Fahd University of Petroleum and Minerals, Saudi Arabia.
Open Access funding provided by University of Oulu including Oulu University Hospital.
Author information
Authors and affiliations.
Department of Computer Science and IT, Software-Engineering-Research-Group (SERG-UOM), University of Malakand, Chakdara, Pakistan
Asif Iqbal & Siffat Ullah Khan
Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
Mahmood Niazi
Interdisciplinary Research Center for Intelligent Secure Systems, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka, 72311, Saudi Arabia
Mamoona Humayun
Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
Najm Us Sama
M3S Empirical Software Engineering Research Unit, University of Oulu, Oulu, Finland
Arif Ali Khan
Lancaster University Leipzig, Leipzig, Germany
Aakash Ahmad
Corresponding author
Correspondence to Arif Ali Khan.
Additional information
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
See Table 15.
See Table 16.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Iqbal, A., Khan, S.U., Niazi, M. et al. Advancing database security: a comprehensive systematic mapping study of potential challenges. Wireless Netw 30, 6399–6426 (2024). https://doi.org/10.1007/s11276-023-03436-z
Published: 17 July 2023
Issue Date: October 2024
DOI: https://doi.org/10.1007/s11276-023-03436-z
- Database security
- Systematic mapping study
- Secure databases
- Modeling and maintenance of protected databases
- Issues in the development