Big data usually refers to data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time.
The term has been in use since the 1990s, with some giving credit to John Mashey for coining or at least making it popular.
CHARACTERISTICS:
Volume: big data doesn't sample; it just observes and tracks what happens
Velocity: big data is often available in real-time
Variety: big data draws from text, images, audio, and video, and it completes missing pieces through data fusion
Machine Learning: big data often doesn't ask why and simply detects patterns
Digital Footprint: big data is often a cost-free byproduct of digital interaction
BIG DATA Vs. BUSINESS INTELLIGENCE:
Business Intelligence uses descriptive statistics on data with high information density to measure things, detect trends, and so on.
Big data uses inductive statistics and concepts from nonlinear system identification to infer laws (regressions, nonlinear relationships, and causal effects) from large sets of data with low information density, in order to reveal relationships and dependencies or to predict outcomes and behaviors.
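As a rough illustration of that inductive, model-fitting style of analysis, the Python sketch below fits a nonlinear relationship to a large synthetic data set. The saturating-growth model and the data are assumptions made purely for the example, not a method prescribed by the sources above.

```python
# Minimal sketch: inferring a nonlinear relationship from a large, noisy
# (low information density) data set. The synthetic data and the
# saturating-growth model are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Many noisy observations: each row carries little information on its own.
x = rng.uniform(0, 10, size=100_000)
true_a, true_b = 3.0, 0.7
y = true_a * (1 - np.exp(-true_b * x)) + rng.normal(0, 1.0, size=x.size)

# Candidate nonlinear model whose parameters are inferred inductively from the data.
def model(x, a, b):
    return a * (1 - np.exp(-b * x))

params, _ = curve_fit(model, x, y, p0=[1.0, 0.1])
print("estimated parameters:", params)  # should recover roughly (3.0, 0.7)
```

Even though no single observation is informative, the sheer volume of data lets the fit recover the underlying relationship.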
ARCHITECTURE:
Big data analytics for manufacturing applications is marketed as a 5C architecture (connection, conversion, cyber, cognition, and configuration).
A multiple-layer architecture is one option to address the issues that big data presents. A distributed parallel architecture distributes data across multiple servers; these parallel execution environments can dramatically improve data processing speeds.
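To make the parallel-execution idea concrete, here is a minimal Python sketch that partitions data, processes the partitions on separate worker processes, and merges the partial results. A single machine's process pool stands in for the multiple servers a real distributed architecture would use; the word-count task is an assumption chosen only for illustration.

```python
# Minimal sketch of the parallel-execution idea: data is split into chunks,
# each chunk is processed by a separate worker, and partial results are merged.
from multiprocessing import Pool
from collections import Counter

def count_words(chunk):
    """Map step: count word occurrences in one partition of the data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts

if __name__ == "__main__":
    # Hypothetical data set, partitioned across workers (in a real system the
    # partitions would live on different servers).
    lines = ["big data big volume", "velocity and variety", "big velocity"] * 1000
    chunks = [lines[i::4] for i in range(4)]

    with Pool(processes=4) as pool:
        partial_counts = pool.map(count_words, chunks)

    # Reduce step: merge the partial results into a single view.
    total = sum(partial_counts, Counter())
    print(total.most_common(3))
```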
TECHNOLOGIES:
A 2011 report characterizes the main components and ecosystem of big data as follows:
Techniques for analyzing data, such as A/B testing, machine learning, and natural language processing (a minimal A/B testing sketch follows this list).
Big Data technologies, like business intelligence, cloud computing and databases.
Visualization, such as charts, graphs and other displays of the data.
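As a concrete illustration of the first item, A/B testing, the sketch below runs a two-proportion z-test on hypothetical conversion counts for two variants. The numbers and the choice of test are assumptions for the example only.

```python
# Minimal sketch of A/B testing: a two-proportion z-test comparing
# conversion rates of variants A and B. The counts are hypothetical.
from math import sqrt, erf

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for H0: rates are equal."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return z, p_value

z, p = two_proportion_z_test(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p-value suggests the variants differ
```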
Big data has increased the demand for information management specialists so much so that Software AG, Oracle Corporation, IBM, Microsoft, SAP, EMC, HP and Dell have spent more than $15 billion on software firms specializing in data management and analytics. In 2010, this industry was worth more than $100 billion and was growing at almost 10 percent a year: about twice as fast as the software business as a whole.
APPLICATIONS:
TECHNOLOGY:
eBay.com uses two data warehouses at 7.5 PB and 40 PB, as well as a 40 PB Hadoop cluster for search, consumer recommendations, and merchandising.
Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 they had the world's three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB.
Facebook handles 50 billion photos from its user base.
As of August 2012, Google was handling roughly 100 billion searches per month.
Oracle NoSQL Database has been tested to pass the 1M ops/sec mark with 8 shards and proceeded to hit 1.2M ops/sec with 10 shards.
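Throughput figures like these come from spreading keys across independent shards. The sketch below shows the basic hash-sharding idea, with in-memory dictionaries standing in for database nodes; it is a toy illustration under those assumptions, not Oracle's implementation.

```python
# Minimal sketch of hash-based sharding: each key is hashed to one of N shards,
# so reads and writes spread across independent nodes. The "shards" list of
# dictionaries is a stand-in for real database nodes.
import hashlib

NUM_SHARDS = 10
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str) -> int:
    """Pick a shard by hashing the key; more shards spread the load further."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def put(key: str, value) -> None:
    shards[shard_for(key)][key] = value

def get(key: str):
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Alice"})
print(get("user:42"), "stored on shard", shard_for("user:42"))
```

Because each shard handles only a fraction of the keys, adding shards is what lets aggregate operations per second keep climbing.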
INFORMATION TECHNOLOGY:
Especially since 2015, Big Data has come to prominence within business operations as a tool to help employees work more efficiently and to streamline the collection and distribution of Information Technology (IT). The use of Big Data to tackle IT and data-collection issues within an enterprise is called IT Operations Analytics (ITOA). By applying Big Data principles to machine intelligence and deep computing, IT departments can predict potential issues and provide solutions before the problems even happen. During this period, ITOA businesses also began to play a major role in systems management by offering platforms that brought individual data silos together and generated insights from the whole of the system rather than from isolated pockets of data.
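As a minimal sketch of that predictive idea, the following example flags anomalous values in a hypothetical response-time series using a rolling z-score rule. Both the data and the detection rule are illustrative assumptions rather than any particular ITOA product's method.

```python
# Minimal sketch of the ITOA idea above: flag anomalous behaviour in an IT
# metric (hypothetical response times) before it turns into an outage.
import statistics

def detect_anomalies(samples, window=30, threshold=3.0):
    """Flag points that deviate strongly from the recent rolling baseline."""
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mean = statistics.fmean(baseline)
        stdev = statistics.pstdev(baseline) or 1e-9  # avoid division by zero
        z = (samples[i] - mean) / stdev
        if abs(z) > threshold:
            anomalies.append((i, samples[i], round(z, 1)))
    return anomalies

# Steady response times with a sudden spike that a team would want to catch early.
response_ms = [100 + (i % 5) for i in range(60)] + [400] + [100 + (i % 5) for i in range(20)]
print(detect_anomalies(response_ms))
```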