Data Loss Prevention

A Glimpse Into the Future of Data Loss Prevention

The following collection of short topic explanations and articles offers additional insight into the innovative world of data loss prevention.

Data Loss Prevention

Simply explained, data loss prevention means ensuring that data are not exfiltrated from an organization. The traditional approach has been to implement technical controls at organizational network entry and exit points to examine data coming into and leaving the network, along with controls at data repositories and endpoint devices. Organizations have also restricted removable media and the use of personally owned devices to prevent data exfiltration.

However, with the increasingly complex and widespread geographic dispersion of data and the growing sophistication of network attacks, highly innovative data loss prevention technologies are emerging. The focus of these more modern strategies is on the data themselves, not the storage endpoints or even the enterprise network. The idea is simple: once data are removed from a defined context, they cease to be meaningful and, in some cases, even self-destruct.

  • Takebayashi, T., Tsuda, H., Hasebe, T., & Masuoka, R. (2010). Data loss prevention technologies. Fujitsu Scientific and Technical Journal, 46(1), 47–55.
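As a minimal illustration of the traditional egress-inspection approach described above, the sketch below scans outbound text against hypothetical policy patterns. The SSN and card-number regexes are illustrative assumptions, not a real product's rules:

```python
import re

# Hypothetical egress-inspection rules; real DLP policies are far richer.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
}

def inspect_outbound(payload: str) -> list[str]:
    """Return the names of the sensitive-data rules the payload matches."""
    return [name for name, rx in PATTERNS.items() if rx.search(payload)]

print(inspect_outbound("ship to 123-45-6789"))  # ['ssn']
```

In practice such checks run at network egress points, mail gateways, and endpoint agents rather than on a single string.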

Big Data Analytics

The main purpose of analytics is to examine existing data and the knowledge contained in them to observe patterns and relationships, discover new information, and eventually create new knowledge and intelligence. The goal is to make better informed, data-driven decisions. Big data analytics creates new knowledge from large-scale unstructured data, or big data.

  • LaValle, S., Lesser, E., Shockley, R., Hopkins, M. S., & Kruschwitz, N. (2011). Big data, analytics and the path from insights to value. MIT Sloan Management Review, 52(2), 21.
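The pattern-discovery idea can be sketched with plain Python by counting recurring event patterns in a toy set of unstructured log lines. The log data below are invented for illustration:

```python
from collections import Counter

# Toy "unstructured" log lines; a real deployment would run the same idea
# at scale over distributed big data stores.
logs = [
    "login failed for alice",
    "login failed for bob",
    "login succeeded for alice",
]

# Count recurring event patterns to surface relationships in the raw text.
events = Counter(" ".join(line.split()[:2]) for line in logs)
print(events.most_common(1))  # [('login failed', 2)]
```

The same count-then-rank pattern scales up to distributed frameworks, where surfacing a dominant pattern (here, repeated login failures) is the first step toward new intelligence.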

Big Data Integrity

With massive dispersion of data and with the rapid proliferation of unstructured data—known also as big data—organizations are increasingly relying on big data and big data analytics to guide decisions. Data integrity is important because flawed data will lead to flawed decisions.

While data integrity is easier to maintain when the data reside within defined organizational boundaries, big data creates a complex data integrity challenge. In a big data environment, it is easier for malicious actors to modify data in order to skew analyses and the decisions based on them. These malicious actors can even be highly sophisticated state actors, which is why big data integrity matters to cybersecurity professionals and can also have national security implications.

  • Liu, C., Yang, C., Zhang, X., & Chen, J. (2015). External integrity verification for outsourced big data in cloud and IoT: A big picture. Future Generation Computer Systems, 49, 58–67.
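A common building block for verifying the integrity of outsourced data is comparing a cryptographic digest recorded by the data owner against the copy a remote store returns. A minimal sketch, using SHA-256 from Python's standard library (the sample data are illustrative):

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA-256 fingerprint recorded by the owner before outsourcing data."""
    return hashlib.sha256(data).hexdigest()

original = b"quarterly sensor readings"
recorded = digest(original)  # kept by the data owner

# Later, verify copies returned by the external store.
returned = b"quarterly sensor readings"
tampered = b"quarterly sensor readings!"
print(digest(returned) == recorded)  # True
print(digest(tampered) == recorded)  # False
```

Real external-verification schemes avoid re-downloading whole datasets by checking digests over sampled blocks, but the core comparison is the same.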


Blockchain

In order to protect data, in terms of both confidentiality and integrity, we have long relied on technologies such as encryption, hashes, public key infrastructure (PKI), and similar mechanisms. Blockchain technology elevates the level of both confidentiality and integrity. Blockchain can also add a level of anonymity to the data.

Blockchain technology underpins the emerging world of digital currencies such as Bitcoin, which has been the currency of choice for perpetrators of ransomware attacks. Blockchain is also becoming increasingly important in health care for ensuring the confidentiality, integrity, and privacy of patient records.

  • Jaag, C., & Bach, C. (2016). Blockchain technology and cryptocurrencies: Opportunities for postal financial services (No. 0056).
  • Pilkington, M. (2015, September 18). Blockchain technology: Principles and applications. In F. Xavier Olleros and Majlinda Zhegu (Eds.) Research handbook on digital transformations. Cheltenham, UK and Northampton, MA: Edward Elgar Publishing.
  • Böhme, R., Christin, N., Edelman, B., & Moore, T. (2015). Bitcoin: Economics, technology, and governance. The Journal of Economic Perspectives, 29(2), 213–238.
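The integrity property of a blockchain comes from chaining records by hash, so that altering any earlier record breaks every later link. A toy sketch of such a hash chain (the record contents are invented; real blockchains add consensus, signatures, and much more):

```python
import hashlib
import json

def block(data: dict, prev_hash: str) -> dict:
    """Link a record to its predecessor by hashing the record plus prev hash."""
    body = {"data": data, "prev": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    return body

genesis = block({"record": "patient A admitted"}, prev_hash="0" * 64)
second = block({"record": "patient A discharged"}, prev_hash=genesis["hash"])

# Any change to the genesis record changes its hash and breaks the link.
print(second["prev"] == genesis["hash"])  # True
```

Because each block's hash covers its predecessor's hash, a verifier can detect tampering anywhere in the chain by recomputing forward from the genesis block.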

Data Obfuscation

The idea behind data obfuscation is to ensure that defined portions of data, or entire chunks, are disaggregated so that the portion in any one party's possession does not provide complete information. While the technique evolved primarily to address privacy and deidentification needs in the health care sector, it is also widely used in file storage for increased security: a disk holds only bits and pieces of files rather than whole files, so even a stolen disk does not expose readable files.

  • Parameswaran, R., & Blough, D. (2005). A robust data obfuscation approach for privacy preservation of clustered data. In Workshop on Privacy and Security Aspects of Data Mining (pp. 18–25).
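One simple way to disaggregate data in the spirit described above is XOR-based splitting, in which every piece looks random and all pieces are required to reconstruct the original. A minimal sketch, not a production secret-sharing scheme:

```python
import os

def split(secret: bytes, n: int = 3) -> list[bytes]:
    """Disaggregate data into n random-looking pieces; all n are required."""
    parts = [os.urandom(len(secret)) for _ in range(n - 1)]
    last = bytes(secret)
    for p in parts:
        last = bytes(a ^ b for a, b in zip(last, p))
    return parts + [last]

def combine(pieces: list[bytes]) -> bytes:
    """XOR all pieces together to recover the original data."""
    out = bytes(len(pieces[0]))
    for p in pieces:
        out = bytes(a ^ b for a, b in zip(out, p))
    return out

pieces = split(b"medical record #42")
print(combine(pieces) == b"medical record #42")  # True
```

Any single piece is statistically indistinguishable from random bytes, which is exactly the property that makes a stolen fragment useless on its own.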

Data Masking

Data masking is another emerging data protection technique in which the true information represented by data is available only within a context or through an authorized manner of access. When the same data are accessed in an unauthorized manner, they show completely false information that appears true to the unauthorized person, leading that person to believe he or she has obtained something valuable when in fact he or she has not. Data masking technologies are frequently deployed in front of legacy data repositories that cannot be upgraded because they have reached end of life or because upgrade costs are prohibitive.

  • Ravikumar, G. K., Manjunath, T. N., Ravindra, S., & Umesh, I. M. (2011). A survey on recent trends, process and development in data masking for testing. International Journal of Computer Science Issues, 8(2), 535–544.
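A toy sketch of the idea: a lookup function that returns the true value only to authorized callers and a false but plausibly formatted value otherwise. The account ID, stored value, and the simplistic substitution rule are all invented for illustration:

```python
# Hypothetical masking proxy sitting in front of a legacy repository.
RECORDS = {"acct-1001": "4111 1111 1111 1111"}

def fetch(record_id: str, authorized: bool) -> str:
    value = RECORDS[record_id]
    if authorized:
        return value
    # Unauthorized access yields a false but plausibly formatted value,
    # preserving only the last four characters of the real one.
    return "5500 0000 0000 0000"[: len(value) - 4] + value[-4:]

print(fetch("acct-1001", authorized=True))   # 4111 1111 1111 1111
print(fetch("acct-1001", authorized=False))  # 5500 0000 0000 1111
```

Production masking systems use format-preserving substitution rules far more convincing than this, precisely so the unauthorized viewer cannot tell the data are false.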

Operational Context/Context-Aware Security

Context awareness is another way to protect data and to ensure that internet of things (IoT) appliances and similar devices are properly authenticated and authorized. The principle behind this is simple, yet powerful: data should be meaningful within a context or an authorized environment. When taken outside this authorized context, they should be rendered useless or yield useless results to an unauthorized user.

  • Habib, K., & Leister, W. (2015). Context-aware authentication for the internet of things. In Proceedings of the 11th International Conference on Autonomic and Autonomous Systems (ICAS) (pp. 134–139).
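The principle can be sketched as a guard that releases data only when the request's context matches an authorized one. The zone names and time window below are invented for illustration:

```python
from typing import Optional

# Hypothetical authorized context: a network zone and a time-of-day window.
AUTHORIZED = {"zone": "clinic-lan", "hours": range(8, 18)}

def read_record(record: str, zone: str, hour: int) -> Optional[str]:
    """Serve data only inside the authorized context; otherwise nothing."""
    if zone == AUTHORIZED["zone"] and hour in AUTHORIZED["hours"]:
        return record
    return None  # outside the authorized context the data are useless

print(read_record("vitals: 120/80", "clinic-lan", 9))   # vitals: 120/80
print(read_record("vitals: 120/80", "coffee-shop", 9))  # None
```

Real context-aware authentication for IoT weighs many more signals (device identity, location, behavior history), but the gate-on-context structure is the same.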

Data Tokenization

Data tokenization is a particular form of data obfuscation whereby data are represented by tokens. These data are rendered meaningful only when accessed with proper authorization and authentication. When accessed in any other manner or stolen, these tokens are meaningless. This form of data obfuscation is increasingly being used in the payment card industry (PCI).

  • PCI Security Standards Council. (2015). Tokenization product security guidelines—irreversible and reversible tokens.
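A minimal sketch of vault-based tokenization, in which a random token stands in for a card number and the mapping lives only inside a trusted system. The token format and vault structure here are illustrative assumptions, not the PCI specification:

```python
import secrets
from typing import Dict, Optional

# Hypothetical token vault; in practice this mapping lives only inside a
# hardened, access-controlled system.
_vault: Dict[str, str] = {}

def tokenize(pan: str) -> str:
    """Replace a primary account number with a random, meaningless token."""
    token = "tok_" + secrets.token_hex(8)
    _vault[token] = pan
    return token

def detokenize(token: str, authorized: bool) -> Optional[str]:
    """Recover the real value only for authorized, authenticated callers."""
    return _vault.get(token) if authorized else None

t = tokenize("4111111111111111")
print(detokenize(t, authorized=False))  # None: the token alone means nothing
```

Because the token is random rather than derived from the card number, stealing tokens yields nothing without also compromising the vault.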


Tamperproofing

Tamperproofing is probably one of the most important features needed by IoT and similar network devices. Just as medicine or food packaging can show signs of tampering, an IoT device should have tamperproof characteristics. IoT devices that have been tampered with should reveal any unauthorized change to their contents or configuration; they could even stop working or self-destruct.

  • Cloud Security Alliance. (2016). Future-proofing the connected world: 13 steps to developing secure IoT products.
  • Bhatia, E. K., Ohri, S., Kaur, G., Dhankar, M., & Dabas, S. (2015). Future perspective and current aspects of Internet of things enable design. International Journal of Software Engineering and Its Applications, 9(8), 127–132.
  • Kopetz, H. (2011). Internet of things. In Real-time systems (pp. 307–323). Springer US.
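One common way to give a device tamper-evident configuration is to store a keyed MAC alongside its contents, so any unauthorized change is detected at verification time. A minimal sketch using HMAC-SHA-256 with a hypothetical factory-provisioned key:

```python
import hashlib
import hmac

# Hypothetical device key provisioned at manufacture time.
DEVICE_KEY = b"factory-provisioned-key"

def seal(config: bytes) -> bytes:
    """Compute a tag the device stores alongside its configuration."""
    return hmac.new(DEVICE_KEY, config, hashlib.sha256).digest()

def verify(config: bytes, tag: bytes) -> bool:
    """Detect any unauthorized change to the stored configuration."""
    return hmac.compare_digest(seal(config), tag)

cfg = b"report-interval=60"
tag = seal(cfg)
print(verify(cfg, tag))                   # True
print(verify(b"report-interval=1", tag))  # False: tampering detected
```

On real hardware the key would live in a secure element rather than in code, and a failed verification could trigger the stop-working or self-destruct behaviors described above.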

Data Governance

Data governance is the active oversight and maintenance of data: their location, their integrity, and ultimately the preservation of their value to an organization and its authorized users. Proper data governance not only protects the true value of data to an organization but can also enhance it. Data governance can ensure that only data with high levels of integrity are fed into data analytics and intelligence engines, so that high-value, data-informed decisions can be made.

  • Informatica. (2013). Holistic data governance: A framework for competitive advantage.