Big Data or Big Saver?
Big Data is an apparently complex term with a simple explanation.
We all use smartphones, right? But have you ever thought about how much data each one generates in the form of texts, phone calls, emails, photos, videos, searches and music? Approximately 40 exabytes of data is estimated to be generated by a single smartphone user. Multiply that across millions of smartphone users and the total is beyond anything we can picture, and that is from smartphones alone!
That is far more data than traditional computing systems can handle, and this massive data is what we term Big Data.
Let's look at some of the data generated every minute on the internet:
Snapchat: 2.1 million snaps shared
Google: 3.8 million search queries
Facebook: 1.1 million people log on
YouTube: 4.5 million videos watched
Emails: 188 million sent
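To get a feel for these numbers, here is a quick back-of-the-envelope calculation in Python, extrapolating the rough per-minute figures above (which are approximate estimates) to a full day:

```python
# Rough per-minute figures from the list above (all approximate).
PER_MINUTE = {
    "Snapchat snaps": 2_100_000,
    "Google queries": 3_800_000,
    "Facebook logins": 1_100_000,
    "YouTube videos": 4_500_000,
    "Emails sent": 188_000_000,
}

MINUTES_PER_DAY = 60 * 24  # 1,440 minutes in a day

for name, per_minute in PER_MINUTE.items():
    per_day = per_minute * MINUTES_PER_DAY
    print(f"{name}: {per_day:,} per day")
```

Emails alone work out to roughly 270 billion per day, which gives a sense of why per-minute snapshots barely scratch the surface.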
To put it simply, Big Data is the ability to take massive amounts of data and derive a story from it. This is remarkable because, for years, all of this data sat around as noise that nobody could make sense of. The arrival of distributed computing and parallel processing changed all of that: now we can analyse the data, derive predictions and detect patterns.
The Three V’s of Big Data
Big Data is characterised by three defining properties, the three V's: Volume, Velocity and Variety.
Volume
Many factors contribute to the growth in data volume. Taking social media as an example, volume refers to the data generated through websites, portals and applications. For B2C companies, volume covers all the data that is out there and can be assessed for relevance.
Volume, in brief, is about:
• The amount of data that is generated
• Offline and online transactions
• Data saved in tables, records and files
Velocity
Velocity refers to the speed at which data is generated. Staying with the social media example, there is an explosion of new data every second and every minute. Big Data technology helps a company keep up with this explosion and process it rapidly enough to avoid bottlenecks.
Velocity, in summary, is about:
• The speed at which data is generated
• Online and offline data
• Data generated in real time
• Data arriving in streams, bits and batches
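The idea of taming a fast stream by processing it in small batches can be sketched in a few lines of Python. This is a toy illustration, not a real streaming framework; the event source and batch size are made-up examples:

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Group an unbounded stream of events into fixed-size batches."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Toy event stream: 10 incoming "events" processed 4 at a time.
events = (f"event-{i}" for i in range(10))
for batch in micro_batches(events, batch_size=4):
    print(f"processing {len(batch)} events")  # a real system would aggregate here
```

Real stream processors work on the same principle: rather than waiting for all the data (there is no "all"), they consume it in manageable chunks as it arrives.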
Variety
The importance of variety in Big Data is that data now arrives both structured (tables, records and machine-generated readings such as ECG data) and unstructured (texts, tweets, emails, voicemails, audio recordings, hand-written notes, videos and pictures). Variety, in simple words, is about classifying the incoming data into different categories:
• Machine-generated readings
• Online videos and pictures
• Structured and unstructured data
Which big data technology is in demand?
As per recent trends, Hadoop is the most in-demand Big Data technology. It stands out for several reasons:
The ability to store and process huge volumes of data quickly: With data variety and volume constantly increasing, especially from social media, this is the key consideration.
High computing power: Hadoop's distributed computing model processes big data fast. The more computing nodes you use, the more processing power you get.
Flexibility: Unlike traditional systems, you do not have to pre-process data before storing it. You can store as much data as you like and decide how to use it later.
Fault tolerance: Data and application processing are protected from hardware failure. If a node goes down, its jobs are automatically redirected to other nodes so the computation does not fail, and multiple copies of all data are stored automatically.
Scalability: You can grow the system to handle more data simply by adding more nodes.
Low cost: The open-source framework is free, and it uses commodity hardware to store large quantities of data.
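The classic first Hadoop program is a word count: a mapper emits a count of 1 for every word, Hadoop shuffles and groups the output by word, and a reducer sums the counts. A minimal local simulation of those two halves in Python (a sketch of the idea only; a real job would run the mapper and reducer across many nodes, for example via Hadoop Streaming):

```python
from collections import defaultdict

def map_words(lines):
    """Mapper: emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_counts(pairs):
    """Reducer: sum the counts for each word. In a real job, Hadoop sorts and
    groups the mapper output by key before the reducer sees it."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Simulate the whole job locally on a tiny "dataset".
lines = ["big data is big", "data never sleeps"]
print(reduce_counts(map_words(lines)))
# {'big': 2, 'data': 2, 'is': 1, 'never': 1, 'sleeps': 1}
```

The same split between a per-record map step and a per-key reduce step is what lets Hadoop spread one job across thousands of commodity machines.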