Big Data - The Analytic Perspective
Work in progress by Dr. Joseph Aluya and Dr. Ossian L. Garraway

As a branch of data science, predictive modeling uses historical data to predict the future and is usually employed as a tool to reduce uncertainty by testing strategies prior to implementation. For example, in the oil and gas industry worldwide, predictive modeling has been used to integrate technology throughout the oil and gas lifecycle, to connect oil companies with the best solutions and practices for major challenges, and to improve each firm's opportunity set for addressing future threats. When applied to data analysis, predictive modeling has rendered valuable insights that strengthen an organization's decision-making power. For example, the McDonald's Corporation has set up some stores with devices that collect operational data to track customer interactions, in-store traffic, and buying patterns. Armed with such real-time information, corporate researchers have been using predictive modeling to better understand the impact that menu changes, restaurant layouts, and employee training have on customer demand (McGuire, James & Chui, 2008).

When applied to the big data phenomenon, the field of analytics requires a holistic approach to maximize insight; hence the optimal approach has to include contributions from machine learning, statistics, and operations research. In the twenty-first century, research on personalization has expanded exponentially. New technologies have enabled potent interactions with customers on an individual basis by serving the appropriate content in the desired format to the individual at the right time, in real time. Such technologies have been able to give insight into the context-specific behavior of online customers using a combination of learning analytics and statistics. For example, in a Nordic retail bank study carried out in 2012, the researchers applied a mixture of predictive analytics and statistics to measure the effect personalized online advertising had on customers. Using clickstream data to calculate customers' attention spans, the bank learned that turnover based on personalization was higher than that of direct mail promotions (Siemens & Gasevic, 2009; Bragge, Sunikka, & Kallio, 2012).

In the past, many data technology departments were nudged to increase the rigor with which they cleaned and transformed data in order to maximize the high known value of data per byte. In today's marketplace, the value of big data is increasingly determined by the questions that have not yet been asked of the unquantified data. Coupled with the effects of data velocity, volume, and variety, this environment impedes the effective use of the former strategy. As a result, big data is typically attributed a low value per byte because of the prevalence of unasked questions. Addressing such unasked questions could reduce the sources of sample bias that may affect statistical conclusions. The push towards exploring big data content to gain more insight into unexplored phenomena has been encouraged and catalyzed by (a) faster machine-to-machine interconnectivity, (b) an explosion in the year-to-year growth rate of data, and (c) progressive reductions in the cost and size of integrated circuitry.

Three of the most significant characteristics defining big data are volume, velocity, and variety. The volume of data in storage has increased from 800,000 petabytes (0.8 zettabytes) in 2000 to a projected 35 zettabytes by 2020, a projected average increase of roughly 1.7 zettabytes annually.
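As a rough check on that growth figure, the following minimal sketch works out the implied average annual increase; it assumes 1 zettabyte = 1,000,000 petabytes and a 2000-2020 projection horizon, which are interpretive assumptions rather than figures stated in the sources cited.

```python
# Back-of-the-envelope check of the projected average annual growth in stored data.
# Assumes 1 zettabyte = 1,000,000 petabytes and a 2000-2020 projection horizon.

volume_2000_zb = 800_000 / 1_000_000   # 800,000 petabytes expressed in zettabytes (0.8 ZB)
volume_2020_zb = 35.0                  # projected global data volume by 2020, in zettabytes
years = 2020 - 2000

avg_annual_increase_zb = (volume_2020_zb - volume_2000_zb) / years
print(f"Average annual increase: {avg_annual_increase_zb:.2f} zettabytes per year")  # ~1.71
```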
According to IBM, there is a divergence between the growth rate of data being stored and the growth rate of the capacity to process, analyze, and understand such data. With regard to the enterprise, then, as data availability increases, the growth of the enterprise's sense-making capabilities is in decline. In the twenty-first century, the explosion of smart devices, sensors, and social communication technologies such as Twitter and Facebook has increased the variety of formats and database models used to parse and store the available data. With respect to data in motion and the speed of data, the shelf life of collected data is shortening due to continuous updating from stream computing technologies; the processing of big data therefore demands the continual performance of analytics on data in motion. In the financial industry, Sarbanes-Oxley compliance extends to the use of big data when it is connected to certifying the accuracy of financial statements under Section 302, "Corporate Responsibility for Financial Reports" (Eaton, Deroos, Deytch, Lapis & Zikopoulos, 2012; Jarausch & Hardy, 1991).

There is anecdotal evidence that the effectiveness of corporate information technology departments has waned as customers demand more flexible and fluid forms of data access to improve productivity. The current surge in demand has come about as a result of the collective efforts of IBM, Google, and Microsoft to invest heavily in cloud computing and software services delivered from cyberspace. Such investments have expanded the computing market to include more small and medium-sized companies previously excluded by cost. Other factors contributing to the increased demand include the effective scalability of big data through cloud automation and the increased diversity of available data-capturing devices such as iPads, smartphones, and machine sensors (Anonymous, 2008; Rogers, 2009; Keats, Hogue, Walsh, Mirakaj & Bruckner, 2011; Gobble, 2013). On the other hand, the influx of such devices poses security threats to the enterprise with regard to (a) the volume of structured, semi-structured, and unstructured data; (b) the company's infrastructure to manage that volume; (c) the lack of qualified big data analysts; (d) categorizing and managing the information; and (e) data integrity risk (Brand, 2013).

In the information design sector, data products are in demand because they integrate many of the disciplines practiced in the new economy and are well positioned to influence work modes in the future. As a result, companies that leverage data products hold the competitive advantage in the current market space. The ubiquitous escalation of unstructured data has energized progressive companies worldwide to leverage available data for competitive advantage, and the consequent surge has overwhelmed current corporate infrastructures and modi operandi in terms of managing the volume, variety, velocity, and veracity of data (Anonymous, 2012; Anonymous, 2013b; Anonymous, 2013c; Berg, 2012; Gadney, 2010). The deluge of unstructured data has spawned new business models, improvements in business processes, and reductions in costs and risks. Succinctly, big data is "... when your data sets become so large that you have to start innovating how to collect, store, organize, analyze and share it" (Gobble, 2013, para. 6). In addition, big data has transformed the output and methods of innovation.
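To make the continual analytics on data in motion described above more concrete, here is a minimal sketch of a sliding-window aggregate computed over a stream of readings; the stream contents, window size, and reporting interval are hypothetical and are not drawn from any specific platform mentioned in the text.

```python
from collections import deque
from statistics import mean
import random

def sliding_window_average(stream, window_size=100):
    """Yield a rolling average over the most recent `window_size` readings.

    `stream` is any iterable of numeric readings (e.g., sensor values or
    transaction amounts); older readings fall out of the window as new ones
    arrive, mimicking analytics on data in motion rather than data at rest.
    """
    window = deque(maxlen=window_size)
    for reading in stream:
        window.append(reading)
        yield mean(window)

# Hypothetical usage: monitor a simulated stream of sensor readings.
if __name__ == "__main__":
    simulated_stream = (random.gauss(50, 5) for _ in range(1000))
    for i, rolling_avg in enumerate(sliding_window_average(simulated_stream)):
        if i % 250 == 0:
            print(f"event {i}: rolling average = {rolling_avg:.2f}")
```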
The current wave of innovations in the health care sector has been driven by new and different kinds of data; new ways of extracting data; new ways of gaining insight into data, whereby problem comprehension supersedes problem solving; and the new business models spawned from such innovations. Among companies whose unleveraged data is collected from disparate systems, the larger ones apply the new data science techniques in-house while aligning resources to understand the data available. Such companies are combining data scientists with the proper technology and network infrastructure so that upper management can realize and leverage the value of big data. Commercially, upper management in the modern enterprise has been compelled to consider the potential value of big data because of the prevalent reliance on the skill sets and viewpoints of the data scientists who know how to render the data useful efficiently and effectively. The ensuing paradigm shift in data perspectives has nudged companies to redesign strategies aimed at exploiting big data when planning for future needs (Egodigwe, 2009; Ward, Tripp, & Maki, 2013; Bonometti, 2012; Cameron, 2013; Strahilevitz, 2013).

Big data requires specialized technical structures, such as structural equation modeling along with various factor analyses, to make sense of the data for some useful purpose. Lately, the global exponential increase in the unstructured portion of data has made it difficult for firms to systematize it using conventional analytical tools. For instance, Web 3.0 (i.e., the Internet of Things) has played a significant part in the rise of unstructured data, which has resulted in new forms of automation in the collection, analysis, and transmission of data. From an institutional perspective, big data can describe rapid changes in the availability, use, and growth of information beyond its initial scope. As a result, many practitioners have struggled with the decision to implement big data initiatives; at the source of that decision is the opportunity cost of big data, conditioned with the caveat to avoid pyrrhic endeavors (Crampton, Graham, Poorthuis, Shelton, Taylor, Stephens, Wilson & Zook, 2013; Lapide, 2012; Brynko, 2009; Wisniewski, 2009; Wainer & Braun, 1988).

Further examination of the foundational background of a few companies engaging in the use of the 5Vs is paramount to this study, starting with the Apple Corporation; the final paper, with its references, will be attached in jofdt/blog.php?blPid=48.
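As an illustration of the factor-analysis step mentioned above, the sketch below reduces a table of hypothetical customer metrics to two latent factors with scikit-learn; the data, variable meanings, and factor count are assumptions for demonstration only, not results from any of the studies cited.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical matrix of customer metrics: rows are customers, columns are observed
# variables (e.g., visit frequency, basket size, click-through rate, dwell time).
rng = np.random.default_rng(42)
observed = rng.normal(size=(500, 4))

# Fit a two-factor model to summarize the observed variables with latent factors.
fa = FactorAnalysis(n_components=2, random_state=0)
latent_scores = fa.fit_transform(observed)

print("Factor loadings (variables x factors):")
print(fa.components_.T.round(2))
print("First customer's latent factor scores:", latent_scores[0].round(2))
```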
Posted on: Thu, 31 Oct 2013 16:25:01 +0000
