Essential Guide To Data Science Application Essay


Discuss the Essential Guide To Data Science Application.



The Industrial Internet project is an information-centric solution proposed by General Electric to bring its machines, data and the internet together in a single connected system that provides real-time analysis (MIT Sloan Management Review, 2018). General Electric is a multinational company whose businesses span many sectors and types of operation. The company has four themes for which it wants to provide connectivity, so that all applications can be accessed at the same time. The project is to be developed not only as a remote system but as one that is effective and secure under all conditions. This report focuses on how General Electric can integrate and manage its information systems and data in the Industrial Internet project. It also provides explanations based on the EAI RA templates and evaluates cloud options and their suitability.

The report covers, in order, the architecture overview diagram and component relationship diagram of the EAI RA templates, information management and integration strategies, cloud infrastructure strategy, application and service integration strategies, and recommendations.


The company has proposed the Industrial Internet mainly for industries such as oil and gas, power generation and healthcare, where system failures and unplanned downtime are frequent. Such failures can create high-risk and sometimes life-threatening situations. The Industrial Internet can detect failures and critical situations and so helps protect the company from losing information (Digital and Things, 2018). The company created its own software, Predix, to address its digital innovation needs. The project is explained through two EAI RA templates: the architecture overview diagram and the component relationship diagram. The report also discusses strategies for integrating multiple data sources used for different purposes by different teams, the suitability of various cloud infrastructures, and the integration of services and applications.

EAI RA Templates

Architecture Overview Diagram

Component Relationship Diagram

Information Management and Integration

Companies currently adopt data analytics to support their own decision making in real time. A company needs to extract insights from data in operational areas such as sales, finance, marketing and procurement. Each department needs its own targets for success, and the analysis of data needs to be developed and integrated within each department (Shmueli et al., 2017). The data in each department needs to be kept separate from that of other departments to ensure security and privacy and to prevent data breaches. The starting point is to look at each team individually and at what defines success for that department. For example, the marketing team measures success by the volume of leads it creates and passes on to the sales team in support of the company’s overall target. The sales team, in contrast, measures success by the volume of appointments created and converted into new business. The information technology team supports wider use of data analytics and makes business processes more successful by ensuring the right use of data in the company.

The data sources include networking data sources such as web analytics, marketing automation data and CRM data. More broadly, the company’s data sources are media, cloud, web, IoT and databases (Talia, 2013). Media is the most popular source of data for analytics. It is found in every company and includes Google, Facebook, Twitter, YouTube and Instagram. Media gives insight into consumer preferences and changing trends. Cloud storage, which most companies have now adopted, holds both the structured and the unstructured data that is generated. It is available as public and private clouds, which make for efficient and economical data sources. The public web is a further data source. It provides free and quick information with diverse uses, and it lets the company take advantage of data without having to develop its own infrastructure (Baesens, 2014). The Internet of Things (IoT) is a valuable source of data for analytics, and its content is machine generated, usually by sensors connected to electronic devices. Databases are a major data source because a database is required in every aspect and segment of the company. Databases keep investment and information technology costs low. Their data is structured: it is formally arranged and can be extracted without difficulty.
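Machine-generated sensor data of this kind is what allows failures to be detected early. The sketch below is only an illustration and not the method used by General Electric or Predix: it flags a reading whose distance from the mean of the preceding readings exceeds a multiple of their standard deviation. The function name, window size, threshold and temperature values are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate strongly from the
    preceding `window` readings (a simple z-score rule)."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to compare against.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical temperature readings from one sensor; index 6 is a spike.
temps = [70.1, 70.3, 69.9, 70.2, 70.0, 70.1, 95.4, 70.2]
print(flag_anomalies(temps))  # prints [6]
```

A rule this simple would need tuning for real equipment, but it shows how a stream of sensor values can be turned into an alert without manual inspection.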

A data strategy has five components: identification, storage, provision, integration and governance. Identification is carried out by constructing and sharing data across the company, whether the content is structured or unstructured. Storage is important so that data can be retrieved in future for various purposes (Jaseena and David, 2014). Two types of data are stored: internal data, for example customer details, and external data, for example third-party data. The data is stored in separate locations from which the company can easily access it later without creating copies and without risk of misuse. Provision means that all data must be packaged and prepared so that it can be shared efficiently among the company’s workforce for standard business processes. Integration is costly because it involves not only extracting, transforming and loading data but also moving and combining data across systems (Davenport and Dyché, 2013). Governance means establishing, managing and communicating information policies and procedures for the effective use of data across the company.

Data integration is challenging for the information technology departments of companies because of its high cost. Each application creates its own integration logic, and the data held differs from one application to another (Gal, 2015). Data sources such as client data, pricing data, sales data and client contact data are perceived and valued differently by different applications, such as those of the sales department and the campaign management department.
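One common way to reduce this cost is to map every application’s records onto a single shared layout before combining them. The sketch below illustrates the idea under stated assumptions: the two record layouts, the field names and the sample values are hypothetical and are not taken from any General Electric system.

```python
def from_sales_app(record):
    """Map a record from a hypothetical sales application to the shared layout."""
    return {
        "client_id": record["cust_no"],
        "name": record["cust_name"].strip().title(),
        "price": float(record["unit_price"]),
    }


def from_campaign_app(record):
    """Map a record from a hypothetical campaign management application."""
    return {
        "client_id": record["clientId"],
        "name": record["fullName"].strip().title(),
        "contact": record["email"].lower(),
    }


def integrate(sales_records, campaign_records):
    """Combine both sources into one record per client, keyed by client_id."""
    merged = {}
    for rec in map(from_sales_app, sales_records):
        merged.setdefault(rec["client_id"], {}).update(rec)
    for rec in map(from_campaign_app, campaign_records):
        merged.setdefault(rec["client_id"], {}).update(rec)
    return merged


sales = [{"cust_no": "C1", "cust_name": " acme corp ", "unit_price": "19.90"}]
campaign = [{"clientId": "C1", "fullName": "ACME CORP", "email": "Ops@Acme.example"}]
print(integrate(sales, campaign)["C1"])
```

With a shared layout, each new application needs only one mapping function instead of separate integration logic for every other application it exchanges data with.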

Application and Data Storage Infrastructure Design

Cloud storage is a necessary asset for every company. The service models of cloud storage are IaaS, PaaS and SaaS (Rittinghouse and Ransome, 2016). The deployment models are private cloud, public cloud, community cloud and hybrid cloud. In a private cloud, the services and infrastructure are maintained and managed by the company or by a third party. In a public cloud, the services are stored off-site and accessed by the company over the internet. A community cloud exists where several organizations with similar security requirements and considerations share access to a particular private cloud. In a hybrid cloud, the company takes advantage of both private and public cloud services for its operations (Herbst, Kounev and Reussner, 2013). Infrastructure as a Service (IaaS) means buying or renting computing power and disk space from external providers for business use. Software as a Service (SaaS) is the most common model in business, as the software is accessed through a browser over the internet (Mishra et al., 2013). Platform as a Service (PaaS) is described as a crossover between the SaaS and IaaS models.

The suitability of each cloud service model depends on the services and operations required. SaaS is the most familiar and accessible service for customers. It is mainly used to reduce costs: the company no longer needs technical staff to install, manage and update software, and software licensing costs are reduced. It is offered on a subscription basis. PaaS provides a platform on which software can be developed and deployed for the company. It provides an environment in which clients can deal efficiently with server software as well as server hardware (Wang et al., 2017). This gives the company scalability on the business side, as PaaS expands its services when demand for resources grows. IaaS suits companies that need complete control over their high-performing applications, with capacity scaled up or down according to network traffic.
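Scaling up or down according to traffic can be reduced to a small rule. The following sketch assumes, purely for illustration, that each server instance handles a fixed number of requests per second and that the company sets a minimum and maximum number of instances. The function name and all figures are hypothetical.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Return how many instances are needed for the current traffic,
    kept within the configured minimum and maximum."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(450, 100))   # prints 5
print(desired_instances(0, 100))     # prints 1
print(desired_instances(5000, 100))  # prints 10
```

The lower bound keeps the application available when traffic is quiet, and the upper bound limits cost when traffic peaks.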

General Electric adopted the cloud for six reasons (The cloud advantage, 2015): speed of implementation and innovation, end-to-end security, low cost, the ability to scale, ubiquitous and global visibility, and failure isolation with microservices. Cloud storage makes implementation efficient and rapid. Security is strengthened because vulnerabilities are patched frequently with the latest updates, with reduced variation. The lower cost comes from the reductions that the large scale of a big data platform makes possible. The ability to scale lets customers and businesses specify the right amount of capacity for their usage, scaling up and down as requirements change. Ubiquitous and global visibility allows operation beyond geographical boundaries for global business operations and optimization. Failure isolation with microservices contains a fault within a single service and provides reusability of software.
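Failure isolation can be illustrated with a simplified sketch in which each service is called independently, so that an error in one does not stop the others from responding. In a real microservice architecture the services would run as separate processes reached over a network. Here they are plain functions, and their names and return values are hypothetical.

```python
def call_isolated(services):
    """Call every service and record its result or error separately,
    so one failing service cannot prevent the others from answering."""
    results = {}
    for name, service in services.items():
        try:
            results[name] = {"ok": True, "value": service()}
        except Exception as exc:
            results[name] = {"ok": False, "error": str(exc)}
    return results


# Hypothetical services; the inventory service is deliberately failing.
def pricing():
    return 42.0


def inventory():
    raise TimeoutError("inventory service unavailable")


def catalog():
    return ["turbine", "sensor"]


status = call_isolated({"pricing": pricing, "inventory": inventory, "catalog": catalog})
for name, outcome in status.items():
    print(name, outcome)
```

The pricing and catalog services still return their values even though the inventory service fails, which is the behaviour that failure isolation is meant to guarantee.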

The company developed its own cloud software, Predix (GE Digital, 2018). The cloud infrastructure strategies for the company are as follows. First, cloud computing needs to be further developed and more advanced before the company adopts it fully, because there are trust issues regarding security and legacy systems. Second, the information technology department needs to deliver greater business agility, scalability and support, which will be a high priority in the near future. Third, the company needs to tackle the challenges of recruiting, training and retaining cloud architects, developers, engineers, support staff and service professionals. Fourth, it needs to measure how much cloud storage contributes to expenses and manage that cost to protect the company’s overall financial outcome (Hwang, Dongarra and Fox, 2013). Fifth, it needs to create a cloud decision framework that keeps technology evaluations and company investments aligned with the business strategies developed for growth. Sixth, it needs to mitigate risk and liability for future business operations, as these can expose the company to several risks and should be monitored.

Application and Service Integration

The integration strategies for applications and services are discussed below. There are two types of application integration: automating a business process by connecting two or more applications, and creating a composite application (Bussler, 2013). The first is referred to as enterprise application integration when it takes place within one organization and as business-to-business integration when it takes place between two organizations. The second provides a common front end to a group of existing applications, making them easier and more effective to use. Business process automation within an organization provides four significant benefits: the process becomes faster, cheaper, more accurate and more visible. Across organizations, in business-to-business integration, it likewise makes the processes that span them faster, cheaper and more accurate. A composite application lets employees do their jobs more efficiently and effectively, and it also provides cross-application functions.

The strategies for service integration are as follows. The first is to address systematic and procedural barriers to the effective collaboration of business processes (Charter and Tischner, 2017). The second is to form a collaborative group of service providers, which is needed to integrate services and to meet the needs of each client individually. The third is a clear establishment phase in which every member agrees on the scope and structure of the business arrangements for future operations. The fourth is an agreement about the groups targeted for assessment of the business program and about the assessment process, so that services are integrated for business purposes through target groups (Hoffmann, 2017). The last is a case plan structure model to guide the assessment and the structure of the case plan. This final strategy results in the case plan model for integrating services in the company.
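Business process automation by connecting two applications can be sketched with the marketing and sales example used earlier in this report: leads created in a marketing application are passed automatically to a sales application, which books appointments. The class names, the lead score and the threshold below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Lead:
    """A lead as produced by a hypothetical marketing application."""
    name: str
    score: int


@dataclass
class SalesApp:
    """A stand-in for a sales application that books appointments."""
    appointments: list = field(default_factory=list)

    def book_appointment(self, lead):
        self.appointments.append(lead.name)


def hand_over_leads(leads, sales_app, min_score=50):
    """Pass every sufficiently qualified lead to the sales application
    and return how many leads were handed over."""
    passed = 0
    for lead in leads:
        if lead.score >= min_score:
            sales_app.book_appointment(lead)
            passed += 1
    return passed


app = SalesApp()
count = hand_over_leads([Lead("Ana", 80), Lead("Ben", 20), Lead("Cy", 50)], app)
print(count, app.appointments)  # prints 2 ['Ana', 'Cy']
```

Because the hand-over is automatic, the process is faster and less error-prone than copying leads by hand, and the returned count makes it visible to both teams.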


The discussion above concludes that the company’s project is valuable and effective in terms of security, cost, speed, and real-time monitoring and analytics. The report has covered the various aspects of adopting and implementing the project using the templates given. It has discussed strategies for collecting data from different sources and for different purposes, and it has evaluated the cloud options and their suitability for those purposes. The cloud infrastructure strategy shows why a cloud strategy is necessary for business purposes and how it can be adopted effectively for future use. The integration of services and applications suggests that applications and services need to be managed and maintained properly. It can therefore be deduced that the strategies evaluated and described above need to be taken into account when integrating and managing the company’s information system. In addition, the question of what data is fed into the project also needs to be evaluated.


  • The company should begin the project by taking inventory of and sharing best security practices, establishing common security across the organization.
  • The company should reorient its overall business strategy to leverage the full potential of the latest advances in the project.
  • The company’s policy-makers should re-examine and update its data protection and liability policies in a timely manner.
  • Stakeholders should participate actively in the areas of security, interoperability and management of system risks.


Baesens, B., 2014. Analytics in a big data world: The essential guide to data science and its applications. John Wiley & Sons.

Bussler, C., 2013. B2B integration: Concepts and architecture. Springer Science & Business Media.

Charter, M. and Tischner, U. eds., 2017. Sustainable solutions: developing products and services for the future. Routledge.

Davenport, T.H. and Dyché, J., 2013. Big data in big companies. International Institute for Analytics, 3.

Digital, G. and Things, E., 2018. Everything You Need to Know About the Industrial Internet of Things. [online] GE Digital. Available at: [Accessed 1 Jan. 2018].

Gal, A., 2015, August. Big data integration. In Keynote speech at international conference on open and big data (OBD 2015).

GE Digital, 2018. GE Announces Predix Cloud - The World’s First Cloud Service Built for Industrial Data and Analytics. [online] Available at: [Accessed 1 Jan. 2018].

Herbst, N.R., Kounev, S. and Reussner, R.H., 2013, June. Elasticity in Cloud Computing: What It Is, and What It Is Not. In ICAC (Vol. 13, pp. 23-27).

Hoffmann, E., 2017. User integration in sustainable product development: Organisational learning through boundary-spanning processes. Routledge.

Hwang, K., Dongarra, J. and Fox, G.C., 2013. Distributed and cloud computing: from parallel processing to the internet of things. Morgan Kaufmann.

Jaseena, K.U. and David, J.M., 2014. Issues, challenges, and solutions: big data mining. NeTCoM, CSIT, GRAPH-HOC, SPTM–2014, pp.131-140.

Mishra, A., Mathur, R., Jain, S. and Rathore, J.S., 2013. Cloud computing security. International Journal on Recent and Innovation Trends in Computing and Communication, 1(1), pp.36-39.

MIT Sloan Management Review, 2018. GE’s Big Bet on Data and Analytics. [online] Available at: [Accessed 1 Jan. 2018].

Rittinghouse, J.W. and Ransome, J.F., 2016. Cloud computing: implementation, management, and security. CRC Press.

Shmueli, G., Bruce, P.C., Patel, N.R., Yahav, I. and Lichtendahl Jr, K.C., 2017. Data Mining for Business Analytics: Concepts, Techniques, and Applications in R. John Wiley & Sons.

Talia, D., 2013. Clouds for scalable big data analytics. Computer, 46(5), pp.98-101.

The cloud advantage, 2015. [ebook] General Electric. Available at: [Accessed 1 Jan. 2018].

Wang, L., Ranjan, R., Chen, J. and Benatallah, B. eds., 2017. Cloud computing: methodology, systems, and applications. CRC Press.
