INFS 5095 Big Data Basics For Bunnings Hardware Essay

Question:

Develop a proposal for management of a nominated organisation to implement Big Data capabilities. Include a high-level architecture and recommendations of which Big Data technologies and methodologies should be introduced and why.

Answer:

About Bunnings Hardware

Bunnings Hardware is a business founded in 1887 by the Bunning brothers, who immigrated to Australia from the United Kingdom. Wesfarmers bought Bunnings Hardware in 1994, making it a subsidiary of the Wesfarmers Group. The headquarters of Bunnings Hardware are located in Hawthorn East, in the state of Victoria, Australia (Bunnings Hardware, 2018).

The company mainly sells home equipment and hardware, with the bulk of its market in Australia and New Zealand. Since its formation in 1887, the company has established market dominance by buying competing companies, especially in the 1970s and 1980s prior to the acquisition by Wesfarmers. This dominance is visible today, with the company holding an estimated 20% share of the Australian hardware market (Bunnings Hardware, 2018).

Bunnings Hardware's attempted entry into the United Kingdom and Ireland markets failed. The attempt followed Wesfarmers' purchase of Homebase, a company based in the United Kingdom and Ireland. The acquisition covered the brand and assets of Homebase and was intended to mark the entry of Bunnings Hardware into the United Kingdom and Ireland hardware markets, with the Homebase stores subsequently being rebranded as Bunnings stores. Losses, however, saw the move fall apart, and Wesfarmers opted to sell the United Kingdom and Ireland business (Bunnings Hardware, 2018).

Despite the failed attempt to enter the United Kingdom and Ireland hardware market, Bunnings still maintained its dominance in the Australian and New Zealand hardware markets.

Key Business priority

Bunnings Hardware is an established brand, and a traditional one, having been founded in 1887. Established and traditional brands face three main concerns: rapid technological advancement, market entry by an external brand, and maintaining sales (Farris & Neil, 2010).

Bunnings Hardware may not be directly affected by rapid technological advancement, because it is a retailer and not a manufacturer. The company concerns itself only with the sale of equipment, while the burden of producing equipment with the latest technology falls on the manufacturers. This cushions Bunnings Hardware from the impact of rapid technological advancement, in which a technology is rendered redundant within a short period of time. This redundancy effect can be attributed to the fast pace at which new product models with better technology are developed and produced (Laudon & Guercio, 2014).

Bunnings may, however, be affected by other aspects of technological advancement, such as marketing and online stores. The rise of social media has changed marketing strategies, and updating marketing strategies to fit the world of social media is vital in maintaining dominance (Pappas, 2016). Rapid technological advancement has also led to the rise of online stores. Customers increasingly prefer shopping online to visiting a physical store, which has led a growing number of physical retailers to open their businesses to online customers by setting up online stores (Kotler, 2009). Making this move, and ensuring that the quality of service in the online store is satisfactory, is important in giving a company a firm grip on e-commerce (Kiechel, 2010).

The threat of entry of an external hardware brand into the Australian and New Zealand markets is one that Bunnings has to prepare for. The modern world of business is highly interconnected, with very few or no laws preventing market entry by a foreign brand (Lechner & Boli, 2012). In countries that are free markets, market entry is particularly easy, holding other factors constant (Vujakovic, 2010). This lack of legislative barriers to entry presents an opportunity for foreign brands to expand into the Australian and New Zealand hardware markets. Local competitors may lack the capital to compete effectively with Bunnings Hardware, but a foreign company may have enough capital to compete with or even surpass it.

The two concerns above are closely linked to sales, which makes sales the key priority for Bunnings Hardware. Updating marketing strategies and moving into e-commerce, both responses to rapid technological advancement, are intended to maintain or increase the sales of Bunnings Hardware. The threat of entry of an external hardware brand into the Australian and New Zealand markets can likewise be prepared for by having robust plans for maintaining or increasing sales.

Maintaining and increasing sales helps a business establish or maintain market dominance (Anthony & Johnson, 2008). Bunnings Hardware is already the dominant business in the hardware industry in Australia and New Zealand. To maintain that status it needs consistent sales, and to strengthen its dominance it needs to increase them.

Big data approach

In order to address the key business priority of Bunnings Hardware, which is maintaining and increasing sales, we will apply two main big data approaches: Cluster Analysis and Regression Analysis.

Cluster Analysis

Cluster analysis is an exploratory data analysis tool that groups either variables or observations according to a predetermined condition (Jon, 2006; Nguyen, et al., 2009). In both cases, this condition may be a variable within the same dataset.

Cluster analysis is an important tool for identifying similarities and differences among the variables or observations in a dataset. It groups together variables or observations that are similar and places those that are not alike in separate groups (Ren & Ying, 2010).

The categorisation provided by cluster analysis allows for the identification of observations or variables with high likelihood of having closely related characteristics (Yu, et al., 2011). Such information is important in determining what business approach to implement for different segments and niches of the relevant market. The categorisation is also essential in gaining a better understanding of the market dynamics over time.

Bunnings Hardware can apply the use of this statistical analysis tool in four ways:

Clustering of time periods depending on sales

The time period considered can be in years, seasons, months or weeks, depending on the availability of sales data. In the case of clustering yearly sales data, the results of the analysis will show which years were similar in terms of sales and which were different. The resultant clusters/groups can then be examined to determine what aspects of those years made the sales similar.

The cluster representing the group of years with the highest sales can be examined and findings can form the new policy on maintaining and improving the levels of sales for Bunnings Hardware moving forward.

The findings from the cluster representing the group of years with the lowest sales will also be important in informing Bunnings Hardware of what to avoid in order to improve sales.

This analysis can also be store specific. In this case, instead of focusing on the general sales made by Bunnings Hardware as a whole, the focus will be on the sales made by specific stores of interest over a specified number of years. The years will then be clustered and a similar examination conducted to establish what made certain groups of years have high sales and what made other groups not have high sales. This information is especially useful when the aim is to restore a store that previously performed well, but now performs poorly, to consistently high sales.
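As a minimal sketch of this kind of analysis, the Python example below groups a set of yearly sales figures into two clusters using a plain one-dimensional k-means. The sales figures, the choice of two clusters and the helper names kmeans_1d and assign are assumptions made for illustration. They are not Bunnings data, and the proposal itself names R as the analysis tool.

```python
# Hypothetical yearly sales in AUD billions (illustrative values, not Bunnings data).
yearly_sales = {
    2010: 6.1, 2011: 6.4, 2012: 6.8, 2013: 7.5,
    2014: 8.2, 2015: 9.0, 2016: 10.6, 2017: 11.5,
}


def kmeans_1d(values, k, iterations=50):
    """Plain k-means on a list of numbers (k >= 2); returns sorted cluster centres."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres evenly over the sorted values.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty.
        new_centres = [
            sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


def assign(value, centres):
    """Index of the centre closest to value (0 = lowest-sales cluster)."""
    return min(range(len(centres)), key=lambda c: abs(value - centres[c]))


centres = kmeans_1d(list(yearly_sales.values()), k=2)
clusters = {}
for year, sales in yearly_sales.items():
    clusters.setdefault(assign(sales, centres), []).append(year)

for index, years in sorted(clusters.items()):
    print(f"cluster {index} (centre {centres[index]:.2f}): {years}")
```

The years in the highest-sales cluster are the ones that would then be examined for common factors. The same function could be applied to a single store's yearly sales for the store-specific variant described above.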

Clustering of Stores depending on Sales

This cluster analysis will cover all the stores owned and operated by Bunnings Hardware in New Zealand and Australia. The focus will be on grouping the stores by their levels of sales. The resultant clusters/groups can then be examined to determine what aspects of those stores made the sales similar.

The cluster representing the group of stores with the highest sales can be examined and findings can form the new policy on maintaining and improving the levels of sales for Bunnings Hardware moving forward.

The findings from the cluster representing the group of stores with the lowest sales will also be important in informing Bunnings Hardware of what to avoid in order to improve sales. This cluster will also indicate to Bunnings Hardware which stores are performing poorly.

Another useful analysis is the clustering of online store sales across different periods of time. The sales in different time periods can be grouped through this analysis and the most profitable times of day, week, month or year identified.

The information from this analysis would assist in the resource allocation in terms of when the online stores would require maximum attention from the staff. This information will also inform the marketing decisions in order to have customers spending more time on the online store.
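A simple first step towards identifying the most profitable times is to total online sales by hour of day, as sketched below in Python. The order records and the function name sales_by_hour are hypothetical and serve only to illustrate the idea. The resulting hourly totals could then be fed into a cluster analysis.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical online orders: (ISO timestamp, order value in AUD). Illustrative only.
orders = [
    ("2018-10-01T09:15:00", 120.0),
    ("2018-10-01T19:40:00", 310.0),
    ("2018-10-02T19:05:00", 95.5),
    ("2018-10-02T20:30:00", 240.0),
    ("2018-10-03T19:55:00", 180.0),
    ("2018-10-03T12:10:00", 60.0),
]


def sales_by_hour(records):
    """Total the order values for each hour of the day (0-23)."""
    totals = defaultdict(float)
    for stamp, value in records:
        totals[datetime.fromisoformat(stamp).hour] += value
    return dict(totals)


hourly = sales_by_hour(orders)
# The hour with the largest total is a candidate for maximum staff attention.
peak_hour = max(hourly, key=hourly.get)
print(f"peak hour: {peak_hour}:00, sales: {hourly[peak_hour]:.2f}")
```

Grouping by weekday or month instead of hour only requires changing the attribute read from the parsed timestamp.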

Clustering of Regions/Cities depending on Sales

The results from this analysis will give information on which regions or cities showed similarity in terms of sales and which showed differences. The resultant clusters/groups can then be examined to determine what aspects of those regions or cities made the sales similar.

The cluster representing the group of regions or cities with the highest sales can be examined and the findings can form the new policy on maintaining and improving the levels of sales for Bunnings Hardware moving forward.

The findings from the cluster representing the group of regions or cities with the lowest sales will also be important in informing Bunnings Hardware of what to avoid in order to improve sales.

This analysis can also be city specific. In this case the focus will be on the sales made in specific regions or cities of interest over a specified number of years. The years will then be clustered and an examination conducted to establish what made certain groups of years have high sales and what made other groups not have high sales.

When interest is in understanding why sales in a specific region or city have decreased and consequently developing a solution to regain high sales in the region or city, the information from this analysis would be important.

Clustering of Products depending on Sales

The results from this analysis will give information on which products showed similarity in terms of sales and which showed differences. The resultant clusters/groups can then be examined to determine what aspects of those products made the sales similar.

The cluster representing the group of products with the highest sales can be examined and the findings can form the new policy on maintaining and improving the levels of sales for Bunnings Hardware moving forward. This may include adopting the marketing strategies used in marketing of products in this cluster as the blueprint for marketing for all Bunnings Hardware products.

The findings from the cluster representing the group of products with the lowest sales will also be important in informing Bunnings Hardware of what to avoid in order to improve sales.

This analysis can also be product specific. In this case the focus will be on the sales of specific products of interest over a specified number of years.

The years will then be clustered and an examination conducted to establish what made certain groups of years have high sales and what made other groups not have high sales. The information from this analysis becomes especially important when the aim is to understand why sales of a specific product have decreased and to develop a solution to regain high sales for that product.

Regression Analysis

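Regression Analysis can be described as a data analytics technique that defines the nature of the relationship between variables in a dataset (Oscar, 2009; Hosmer, 2013). The definition is usually in the form of an equation, which for a linear regression has the basic form below:

Y = β0 + β1X1 + β2X2 + ... + βkXk + ε

where Y is the variable of interest, X1 to Xk are the other variables, β0 is the intercept, β1 to βk are the coefficients and ε is the error term.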

In this analysis the variable of interest is referred to as the dependent variable while the other variables in the dataset are referred to as the independent variables (Jorge, et al., 2013). The regression analysis for Bunnings Hardware would have the sales as the dependent variable. The independent variables would include aspects such as product category, product type, product price, year, region, city and store.

There are two broad types of regression analysis techniques: explanatory regression analysis and predictive regression analysis.

Explanatory regression analysis is a form of regression analysis that focuses on describing the effect (in terms of nature and level) that the independent variable(s) have on the dependent variable (Tri & Jugal, 2015; Witten Ian, 2011). For the explanatory regression we will determine the effect the independent variables have on the sales of Bunnings Hardware. This will provide information on the most influential factors for the sales of Bunnings Hardware.

Predictive regression analysis is a form of regression analysis which uses the values of the independent variable(s) to predict the value of the dependent variable (Galit, et al., 2018; Han Kamber & Jaiwei, 2011). This analysis gives an indication of what the value of the dependent variable would be if the independent variable(s) were set to particular value(s) (Usama & Padhraic, 2008; Cortes & Mohri, 2014).

In the case of Bunnings Hardware, the predictive regression analysis would be useful in the prediction of the value of the sales for the company by altering the values of the independent variables. This would enable the company to continuously obtain the optimum values of the independent variables that would give the desired sales values.
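The two uses of regression can be illustrated with a small Python sketch that fits a one-variable least-squares line. It assumes product price is the only independent variable and uses made-up figures, and the names fit_line and predict are hypothetical. A real analysis would include the other independent variables listed above and would be carried out in R as proposed.

```python
# Hypothetical data: product price (AUD) and units sold. Illustrative only.
prices = [10.0, 20.0, 30.0, 40.0, 50.0]
units_sold = [96.0, 84.0, 76.0, 64.0, 55.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(prices, units_sold)


def predict(price):
    """Predicted units sold at a given price, using the fitted line."""
    return intercept + slope * price


# Explanatory use: the slope describes how sales change per unit of price.
print(f"slope: {slope:.2f} units per AUD")
# Predictive use: estimate sales at a price that was not observed.
print(f"predicted units at AUD 35: {predict(35.0):.1f}")
```

The slope corresponds to the explanatory question (how strongly price influences sales), while predict corresponds to the predictive question (what sales to expect at a chosen price).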

Information and sources

The sales information for Bunnings Hardware since 1994 would provide the necessary data for this analysis. The year 1994 is chosen because it is the year in which ownership of Bunnings Hardware passed to Wesfarmers. This is significant considering the new approach to business that Wesfarmers brought to Bunnings Hardware.

The information source for this data will be Wesfarmers (2018). This source contains data on the annual sales of Bunnings Hardware from 1994, making it the most appropriate source of information for this big data analysis.

The table below gives a description of the data variables that will be used in the big data analysis for Bunnings Hardware:

Data Variable Name     Data Variable Type      Data Variable Category   Measurement Scale
1. Sales               Dependent Variable      Numerical                Ratio
2. Product Category    Independent Variable    Categorical              Nominal
3. Product Type        Independent Variable    Categorical              Nominal
4. Product Price       Independent Variable    Numerical                Ratio
5. Year                Independent Variable    Numerical                Interval
6. Region              Independent Variable    Categorical              Nominal
7. City                Independent Variable    Categorical              Nominal
8. Store               Independent Variable    Categorical              Nominal
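One way to make the variables in the table concrete is to express a single observation as a typed record. The Python sketch below is an assumption about how such a record might look. The class name SalesRecord, the field names and the example values are hypothetical and do not come from the Wesfarmers data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesRecord:
    """One observation holding the eight variables described in the table."""
    sales: float            # dependent variable, numerical
    product_category: str   # independent variable, categorical
    product_type: str       # independent variable, categorical
    product_price: float    # independent variable, numerical
    year: int               # independent variable, numerical
    region: str             # independent variable, categorical
    city: str               # independent variable, categorical
    store: str              # independent variable, categorical

    def __post_init__(self):
        # Sales and price cannot be negative amounts.
        if self.sales < 0 or self.product_price < 0:
            raise ValueError("sales and product_price must be non-negative")


# Hypothetical example record (illustrative values only).
record = SalesRecord(
    sales=1290.0,
    product_category="Tools",
    product_type="Drill",
    product_price=129.0,
    year=2018,
    region="Victoria",
    city="Melbourne",
    store="Store A",
)
print(record)
```

Rejecting negative amounts at construction time is a small example of the data quality checks that would be needed before analysis.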

Big data technologies

The application of big data analysis to Bunnings Hardware will require three big data technologies: Processing of Streaming Data, Data Storage and Data Analysis.

Processing of Streaming Data Technology

Despite the availability of secondary data from Wesfarmers (2018), there is a need for up-to-date data that can be analysed continuously to provide current results. For this to be possible, the various stores of Bunnings Hardware must be able to send their sales information simultaneously at the close of business every day. This data should then be arranged in terms of the variables. This will be made possible by Apache Spark, a technology that enables efficient and reliable processing of streaming data.
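The step of arranging incoming store data in terms of the variables can be sketched in plain Python. This is a stand-in for illustration only and does not use the Apache Spark API. The comma-separated feed format, the field order in FIELDS and the sample rows are assumptions.

```python
import csv
import io
from collections import defaultdict

# Assumed field order of each end-of-day record sent by a store.
FIELDS = [
    "store", "city", "region", "year",
    "product_category", "product_type", "product_price", "sales",
]

# Hypothetical end-of-day feed from two stores (illustrative values only).
feed = io.StringIO(
    "Store A,Melbourne,Victoria,2018,Tools,Drill,129.00,1290.00\n"
    "Store B,Auckland,Auckland,2018,Garden,Hose,35.50,355.00\n"
    "Store A,Melbourne,Victoria,2018,Garden,Hose,35.50,71.00\n"
)


def parse_feed(stream):
    """Turn raw feed lines into records keyed by variable name, with typed values."""
    for row in csv.DictReader(stream, fieldnames=FIELDS):
        row["year"] = int(row["year"])
        row["product_price"] = float(row["product_price"])
        row["sales"] = float(row["sales"])
        yield row


# Running total of sales per store as records arrive.
totals_by_store = defaultdict(float)
for record in parse_feed(feed):
    totals_by_store[record["store"]] += record["sales"]

for store, total in sorted(totals_by_store.items()):
    print(f"{store}: {total:.2f}")
```

In the proposed architecture a stream processing engine would perform this parsing and aggregation across all stores at once, with the structured records then passed on to storage.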

Data Storage Technology

The streaming data processed by Apache Spark requires proper storage prior to the analysis stage. Proper, structured storage reduces the chance that the data loses value and improves the precision of the analysis (Ulf-Dietrich & Uwe, 2014). A NoSQL database will provide the storage for the processed streaming data. NoSQL databases come in several forms, and a wide-column store, which organises data into rows and columns, suits the structured sales records of Bunnings Hardware.

Data Analysis Technology

Both the secondary data from Wesfarmers (2018) and the streaming data processed by Apache Spark and stored in the NoSQL database must be analysed to provide actionable inferences for Bunnings Hardware. The big data approaches of cluster analysis and regression analysis will be carried out using R in the RStudio environment. R is statistical software that uses the R language for the analysis of data (Galit, et al., 2018).

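Below is a high-level architecture for the big data technologies that will be applied in the case of Bunnings Hardware, summarised as a data flow:

Store sales data (sent daily at close of business) -> Apache Spark (stream processing) -> NoSQL database (storage) -> R in RStudio (cluster analysis and regression analysis)
Wesfarmers (2018) secondary data -> R in RStudio (cluster analysis and regression analysis)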

Through this analysis we are able to cluster together stores with similar sales characteristics, the characteristic in this instance being the level of sales. The results are important in determining which stores have the highest sales. The group/cluster of stores with the highest sales can then be examined to understand what similarities make their sales high.

Big data adoption challenges and governance

The biggest challenges facing the adoption of big data are privacy and security. These two challenges are closely related, with the main concern being to establish a means of safely accessing and analysing big data without the threat of a data breach. Big data is a large source of valuable information, which makes it a target for hacking (Ulf-Dietrich & Uwe, 2014). To help secure big data, software solutions such as Apache Ranger are available for use at any stage of the big data analysis process.

Data governance deals with the question of the credibility of data. Credibility concerns the source of the data and the permission to use it (Ulf-Dietrich & Uwe, 2014). Data governance also checks how appropriate the data is for the type of analysis intended. Much like data privacy and security, data governance has software solutions: vendors such as IBM and SAS provide software for data governance checks.

References

Anthony, S. D. & Johnson, M. W., 2008. Innovator's Guide to Growth. 1st ed. New York: Harvard Business School Press.

Bunnings Hardware, 2018. Bunnings History. [Online]
Available at: www.bunnings.com.au/contact-us
[Accessed 1 November 2018].

Cortes, C. & Mohri, M., 2014. Domain Adaptation and Sample Bias Correction Theory and Algorithm for Regression. Theoretical Computer Science, pp. 103-126.

Farris, P. W. & Neil, B. T., 2010. Marketing Metrics. 2nd ed. New Jersey: Pearson Education.

Galit, S. et al., 2018. Data Mining for Business Analytics. 1st ed. New Delhi: John Wiley & Sons, Inc.

Han Kamber & Jaiwei, P., 2011. Data Mining: Concepts and Techniques. 3rd ed. London: Morgan Kaufmann.

Hosmer, D., 2013. Applied Logistic Regression. 1 ed. Hoboken, New Jersey: Wiley.

Jon, K. R., 2006. The Practice of Cluster Analysis. Journal of Classification, 23(1), pp. 3-30.

Jorge, A. A., Angela, A. & Edson, Z. M., 2013. Robust Linear Regression Models: Use of Stable Distribution for the Response Data. Open Journal of Statistics, Volume 3, pp. 3-5.

Kiechel, W., 2010. The Lords of Strategy. 2nd ed. New York: Harvard Business Press.

Kotler, P., 2009. Marketing Management. 1st ed. Washington DC: Pearson: Prentice-Hall.

Laudon, K. C. & Guercio, T. C., 2014. E-commerce. Business. Technology. Society. 1st ed. Chicago: Pearson.

Lechner, F. J. & Boli, J., 2012. The Globalization Reader. 4th ed. Blackwell: Wiley.

Nguyen, X. V., Julien, E. & James, B., 2009. Information Theoretic Measures of Clustering Comparison. Chicago: The Original.

Oscar, M., 2009. A data mining and knowledge discovery process model. 1st ed. Vienna: Julio Ponce.

Pappas, N., 2016. Marketing Strategies Perceived Risks, and Consumer Trust in Online Behaviour. Journal of Retailing and Consumer Services, 29(1), pp. 92-103.

Ren, J. & Ying, S., 2010. Research and Improvement of Clustering Algorithms in Data Mining, s.l.: 2010 2nd International Conference on Signal Processing Systems.

Tri, D. & Jugal, K., 2015. Select Machine Learning Algorithms Using Regression Models, s.l.: 2015 IEEE Conference.

Ulf-Dietrich, R. & Uwe, M., 2014. Mining "Big Data" Using Big Data Services. International Journal of Internet Science, 1(1), pp. 1-8.

Usama, F. & Padhraic, S., 2008. From data mining to Knowledge Discovery in Databases. 4th ed. New York: CRC Press.

Vujakovic, P., 2010. How to Measure Globalization? A New Globalization Index (NGI). Atlantic Economic Journal, 38(2), p. 237.

Wesfarmers, 2018. Reports. [Online]
Available at:
[Accessed 12 November 2018].

Witten Ian, 2011. Data Mining: Practical Machine Learning Tools. s.l.:Elsevier.

Yu, Y. P. et al., 2011. Pattern Clustering of Forest Fires Based on Meteorological Variables and its Classification Using Hybrid Data Mining Methods. Journal of Computational Biology and Bioinformatics Research, 3(1), pp. 47-52.
