Provide a brief overview of the case study and prepare a diagram for the ENISA Big Data security infrastructure
ENISA’s Threat Landscape 2016 covers the security of Big Data. Today, cyber attacks increasingly target internet assets held in the form of Big Data (Enisa.europa.eu, 2017). At the same time, recent developments show that Big Data is becoming one of the most prominent tools available to security professionals. Its appeal is that it records patterns and builds intelligence about threats that have already occurred or may occur in the future. ENISA aims to provide enhanced cyber security to the European Union and its Member States by identifying Big Data security challenges (Enisa.europa.eu, 2017). This case study discusses the threats posed by cyber terrorists and criminals and the ways to tackle them in the realm of Big Data. Here, Big Data refers to the technologies, tools and analytics used to process huge amounts of data, either as one entity or as separate entities.
This report focuses on the challenges of information security and on the changes Big Data has brought to the business world. We are witnessing a revolution driven by digital data, the automation of systems and computation, and the digitization of almost every device. The activities of humans, machines and industries all generate data in some form, and almost everything can be quantified as data (Brewer, 2015). The amount of data is growing at an unprecedented rate, and collecting it is becoming a daunting task. Big Data is used by research institutes, scientists and business leaders to analyze patterns and to build new Big Data technologies (Enisa.europa.eu, 2017). Businesses are collaborating with technology providers to build new systems that collect, analyze and rapidly process large amounts of data. New business models are being developed in which Big Data analysis is performed faster and with greater precision than before. Big Data is arguably still in its research phase, but at the current rate of development it will soon reach maturity. One of the biggest problems for Big Data is security. Big Data systems are multifaceted and heterogeneous, so their security must be designed holistically, without loose ends. New technologies are integrated into Big Data systems every day, raising new kinds of security issues that need to be handled. This report discusses the security problems encountered by Big Data systems across several facets: authentication and access control, security of data management, and filtering and source validation.
Out of the ‘Top threats’, which threat would you regard to be the most significant and why?
The biggest threat in Big Data security management is the threat to privacy. Privacy concerns are the most vital and need to be addressed as a top priority (Enisa.europa.eu, 2017). In the realm of Big Data, privacy concerns the blocks of identifiable information that can be used to establish the identity of a person or entity. When an expert designs the security of a Big Data system, three concerns guide the design. The first is what kind of personal data is being shared, and with whom. The second is the privacy and authentication of the network over which the data is shared, to ensure that the data is not viewed by any undesirable entity on the network; this also raises the concern of unlawful communication. The third is anonymous communication (Chen & Storey, 2012): no anonymous party should be allowed to communicate with the system, because this could be disastrous for its security. Many experts advocate the business opportunities created by Big Data systems, but others are wary of the many threats that private data collectors can pose. In Big Data systems the cause for concern is not profile data but personal data, which can be misused in countless ways if it falls into the wrong hands. Data can be protected using knowledge, technologies, tools and programs, but compiling huge amounts of data and making it available in one place creates a greater risk. At the same time, cyber security systems and processes are tailored to protect data and to analyze it for potential threats arising from the latest data management and analytics technologies.
Cyber security is an addition to the Big Data system, and the privacy problem is magnified when data transactions create large numbers of logs, which themselves become visible data that has to be protected. Only a limited number of logs are available for users to view, because logs contain confidential information that can be exploited in different ways (Enisa.europa.eu, 2017). In Big Data systems, not all of the available data is relevant to every user. Only some of the data is made available to a given user, and the rest is protected against viewing and access. A user who tries to look at data he or she is not authorized to see will not get through the system. Big Data systems are complex and layered, and this layering is designed to preserve confidentiality.
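The view and access protection described above can be illustrated with a small role-based check. This is only a minimal sketch under stated assumptions: the role names, the dataset names and the `DATASET_READERS` mapping are hypothetical and are not taken from the ENISA report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset


# Hypothetical mapping: dataset name -> roles that are allowed to read it.
DATASET_READERS = {
    "transaction_logs": frozenset({"security_auditor"}),
    "aggregated_metrics": frozenset({"analyst", "security_auditor"}),
}


def can_read(user, dataset):
    """Return True only if the user holds a role allowed to read the dataset."""
    # Unknown datasets map to an empty set, so access is denied by default.
    allowed = DATASET_READERS.get(dataset, frozenset())
    return bool(user.roles & allowed)


def read_dataset(user, dataset, store):
    """Return the dataset from the store, or raise if the user is not authorized."""
    if not can_read(user, dataset):
        raise PermissionError(f"{user.name} may not read {dataset}")
    return store[dataset]
```

The sketch denies access by default, so a dataset that has not been explicitly opened to a role stays protected, which matches the idea that most of the data is hidden from any single user.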
Identify and discuss the key Threat Agents. What could be done to minimize their impact on the system? Based on the data provided, discuss the trends in threat probability.
A number of threat agents can compromise the security of an organization’s Big Data systems, and they need to be identified and addressed as a priority. The following are some of the key threat agents:
Source of data: Data enters Big Data systems from innumerable sources, so its reliability and authenticity are a key concern. To mitigate this threat, incoming data is validated and filtered at several levels, because trust and protection are required at every level (Danaher, 2014). The levels are defined by the criticality of the incoming data. There are two levels: the digital relay level, which is considered critical because any compromise there could change the functionality of the entire system, and the battery management level, which is less critical. Both levels are protected by validating the data.
Application Software: Big Data systems combine closed-source and open-source software. There are multiple application models in which data flows continuously from one piece of software to another (Gandomi & Haider, 2015), which increases the risk of data loss in transit between them.
Infrastructure: The security of the meters through which data enters the system falls under infrastructure. Here the physical security of the system is of prime importance (Gantz & Reinsel, 2012). Sensor networks and data meters are the main means of collecting large amounts of data in Big Data systems. These input points are monitored so that the entire Big Data system remains secure and sound.
DoS Attack: Because Big Data systems are networks of distributed systems, the Denial of Service (DoS) attack is one of the most prominent threats. Availability of service is key to running such a system properly (Gao, Peng & Li, 2015). A DoS attack prevents the system from responding to requests, which makes it unavailable. This attack is linked with physical cyber security. A system under DoS attack may also give wrong results. The threat is checked by securing the input systems and monitoring the kinds of requests made over time.
Authentication: Weak access control is one of the major threats to a Big Data system. The system holds heaps of data, including personal and classified data from various fields, and it is critical to keep that data from falling into the wrong hands (Hurst & Fergus, 2014). Some energy switches in the Big Data system act automatically, which puts data at risk because the switches have access to the data available in the system. Most of these switches are not designed for authentication, which makes access hard to monitor. This is rectified by authenticating each switch for each transaction.
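Two of the mitigations above, validating incoming data and authenticating each device for each transaction, can be sketched together. The example below is an illustration rather than a method described in the ENISA report: it assumes every device shares a secret key with the system and signs each message with HMAC-SHA256. The device identifier, the key and the accepted value range are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical per-device secrets; a real system would keep these in a key store.
DEVICE_KEYS = {"meter-001": b"example-secret-key"}


def sign_reading(device_id, payload, key):
    """Compute an HMAC-SHA256 signature over the device id and payload."""
    message = json.dumps(
        {"device": device_id, "payload": payload}, sort_keys=True
    ).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def accept_reading(device_id, payload, signature):
    """Authenticate the sender, then filter out implausible values."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False  # unknown source
    expected = sign_reading(device_id, payload, key)
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, signature):
        return False  # forged or tampered message
    value = payload.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False  # wrong type
    # Assumed plausible range for this hypothetical meter.
    return 0.0 <= value <= 1000.0
```

Because the signature covers the payload, a reading that is altered in transit fails the check even when it comes from a known device, so authentication and source validation reinforce each other.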
How could the ETL process be improved? Discuss.
Extract, Transform, Load (ETL) is the process by which data is inserted into the Big Data analysis system. While there is not much data at the beginning, the ETL process can run overnight and upload all the data into the system (Enisa.europa.eu, 2017). However, as the flow of data increases over time, the ETL process takes longer to execute. The need then arises to improve the ETL process so that the work is done faster and with the desired accuracy. Some methods to improve performance are:
Tackle Bottlenecks: Recording metrics such as hardware usage, run time and number of records processed is a must and should be done as a priority (Kitchin, 2014). The resources consumed by each step should be logged so that the heaviest and the lightest steps can be identified.
Incremental Data Loading: Loading data takes time. Loading only the difference between the previous data and the current data reduces the duration of the process (Leyshock & Tufte, 2014). This may be more difficult than loading the entire data set, but it is needed to improve performance.
Division of Large Tables: In a relational database, partitioning improves data-processing performance compared with using large tables. The method is to cut large tables into smaller ones by date or by value. Indices are created on the partitioned tables to give quick and easy access (Kwon & Shin, 2014). Partitioning also allows data to be switched in and out of a table instead of using conventional insertion and deletion.
Remove Heavy Data: A Big Data collection holds heaps of data, and identifying the relevant data and removing vague data is very important for improving the ETL process (Omosebi & Hill, 2015). The data entering the data warehouse should be relevant and useful. At first a small amount of data is targeted for scrutiny, and as the amount of data grows the selection is narrowed down with more precision.
Cache the Data: Cache is among the fastest memory in a computer system. Placing the system’s frequently used data in cache will noticeably improve processing speed. It is a better option than keeping the data on a hard drive, which takes comparatively more time to access (Oseku-Afful, 2016). Because cache capacity is small, not all data fits, so unimportant data is automatically excluded.
Parallel Processing: Processing data serially takes more time than processing it in parallel (Lynch, 2014). Sources need to be optimized so that more than one type of data is processed at the same time, with sorting and aggregating functions applied to it.
Use of Hadoop: Apache Hadoop is a tool designed for processing large amounts of data over a cluster of systems (O'Neil, 2017). It can help maintain the integrity of the system, for example through jobs that exclude duplicate files.
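Of the methods listed, incremental data loading is the easiest to show in a few lines. The sketch below is a minimal illustration using Python's built-in `sqlite3` module. It assumes a hypothetical `events` table whose `id` column only ever increases, so the highest `id` already present in the target can serve as a watermark. Rows that are updated or deleted in the source are not detected by this approach.

```python
import sqlite3


def incremental_load(source, target):
    """Copy only the rows newer than the target's watermark; return the count."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, value TEXT)"
    )
    # Watermark: the highest id already loaded (0 when the target is empty).
    (watermark,) = target.execute(
        "SELECT COALESCE(MAX(id), 0) FROM events"
    ).fetchone()
    # Extract only the difference between the source and the target.
    rows = source.execute(
        "SELECT id, value FROM events WHERE id > ? ORDER BY id", (watermark,)
    ).fetchall()
    target.executemany("INSERT INTO events (id, value) VALUES (?, ?)", rows)
    target.commit()
    return len(rows)
```

Each run transfers only the rows added since the previous run, so the load time depends on the volume of new data rather than on the total size of the source table.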
To sum up, should ENISA be satisfied with its current state of IT Security? Why? Or why not?
ENISA needs to improve and amend its current state of IT security. It should adopt a number of recommendations and mitigate the underlying issues to make its Big Data system more robust and efficient. Based on the issues discussed above, the following recommendations should be taken into account and implemented.
ENISA’s policy makers should provide guidance on securing Big Data systems in critical areas (Enisa.europa.eu, 2017). A Big Data system is by nature distributed across various levels and different clusters of systems. For this reason, the crucial support systems need proper guidelines to make them foolproof with respect to security. Another recommendation is that the products used in the Big Data system, whether devices, cloud platforms or services, should be required to comply with security standards. This can reduce threats, because the products will be certified and will come from companies that supply genuine, attack-resistant products made specifically for Big Data systems. Such companies offer solutions that are more flexible and cost-efficient, and they provide self-certification and self-attestation.
Authorities that are competent in the critical sectors should advise vendors to apply authentication mechanisms and protocols to raise the security level of their products. A system consists mainly of the devices deployed to process data and run the entire system; they have access to the data, and processing is done through them. If the devices cannot secure the data and the processing, this becomes a serious security concern and should not be accepted. The security of Big Data systems should be standardized, and the standards should be set by governing bodies. As of now, no security standards specific to Big Data systems are listed anywhere (Enisa.europa.eu, 2017). Given the speed of growth and the seriousness of Big Data systems, groups of industries or organizations should create security standards for the rest of the world to follow. To comply with those standards, vendors should train and update their technical staff so that they can enhance security and learn the security standards applicable to Big Data systems. New systems will evolve and new technology will replace the old. Technical staff should know the latest technology and be well versed in tackling any challenge to the security of the system. To that end, vendors and organizations that use Big Data systems should provide certification and training to their staff.
Brewer, R. (2015). Cyber threats: reducing the time to detection and response. Network Security, 2015(5), 5-8.
Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36(4).
Danaher, J. (2014). Rule by algorithm? Big data and the threat of algocracy. Institute for Ethics and Emerging Technologies.
Enisa.europa.eu. (2017). Big Data Threat Landscape — ENISA. Enisa.europa.eu. Retrieved 8 September 2017, from
Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), 137-144.
Gantz, J., & Reinsel, D. (2012). Extracting value from chaos. IDC iview, 1142(2011), 1-12.
Gao, H., Peng, Y., Jia, K., Wen, Z., & Li, H. (2015, September). Cyber-Physical Systems Testbed Based on Cloud Computing and Software Defined Network. In Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 2015 International Conference on (pp. 337-340). IEEE.
Hurst, W., Merabti, M., & Fergus, P. (2014, May). Big data analysis techniques for cyber-threat detection in critical infrastructures. In Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on (pp. 916-921). IEEE.
Kitchin, R. (2014). The real-time city? Big data and smart urbanism. GeoJournal, 79(1), 1-14.
Kwon, O., Lee, N., & Shin, B. (2014). Data quality management, data usage experience and acquisition intention of big data analytics. International Journal of Information Management, 34(3), 387-394.
Leyshock, P., Maier, D., & Tufte, K. (2014, October). Minimizing data movement through query transformation. In Big Data (Big Data), 2014 IEEE International Conference on (pp. 311-316). IEEE.
Lynch, C. (2014). Big data: How do your data grow? Nature, 455(7209), 28-29.
Omosebi, O., Sotiriadis, S., Asimakopoulou, E., Bessis, N., Trovati, M., & Hill, R. (2015, November). Designing a Subscription Service for Earthquake Big Data Analysis from Multiple Sources. In P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2015 10th International Conference on (pp. 601-604). IEEE.
O'Neil, C. (2017). Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books.
Oseku-Afful, T. (2016). The use of Big Data Analytics to protect Critical Information Infrastructures from Cyber-attacks.
Patel, A. K., & Bhilare, D. S. (2013). A Survey of Big Data Analytics for Network Traffic Monitoring to Identify Cyber Attacks.
Scherer, V., & Kaponig, B. (2013, December). EMC Hadoop as a service solution for use cases in the automotive industry. In Connected Vehicles and Expo (ICCVE), 2013 International Conference on (pp. 488-493). IEEE.
Terry, N. P. (2012). Protecting patient privacy in the age of big data. UMKC L. Rev., 81, 385.
Townsend, A. M. (2013). Smart cities: Big data, civic hackers, and the quest for a new utopia. WW Norton & Company.
Vashisht, P., & Gupta, V. (2015, October). Big data analytics techniques: A survey. In Green Computing and Internet of Things (ICGCIoT), 2015 International Conference on (pp. 264-269). IEEE.