“Big Data” is a term that is generally misunderstood. At least that’s what I found in a very small, unscientific survey of my family members. When I bring up the subject or simply ask, “What do you think about big data?” my family instantly reacts negatively, saying things like, “Don’t like it” and “It’s bad.”
When pressed to explain what they think big data is, the consensus is that it is an invasion of privacy.
My little polling experiment revealed to me that the reality of big data needs to be better explained to the general public.
I did a little research on the topic and saw that there are a lot of definitions of big data being bandied about. But the meaning that made the most sense to me, and the one on which I’ve settled, is simply that big data is the amount of accessible information available on people, primarily through their everyday use of the Internet, and how companies are managing, analyzing, utilizing, and securing that data.
Under this definition, big data is mainly a big issue with which businesses must contend. In order to grow businesses in our digital age, access to big data and the proper use of it are essential.
I personally remember the early days of the Internet in the late 1980s and early 1990s, when database management was all the rage in the public relations and marketing fields. Everyone in PR, marketing, and advertising wanted to learn how to get and effectively use databases. At the time, the databases were very simple, containing a limited number of fields, such as name, address, phone, and email address, and the records might be categorized by specific interests. They were used for email campaigns that would invite people to events, request media to run press releases, or solicit donors for nonprofits. But that was before Web 2.0 and the rise of social media, and definitely years away from the “Semantic” Web 3.0.
These days, available data is big! The information fields of current databases are complex and varied; they may contain web browsing histories, financial transactions, a range of demographics, and buying habits. Big data requires special software to analyze and store it. It is also commonly described in terms of four “V’s”:
Volume: The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can be considered big data or not.
Variety: The type and nature of the data. This helps people who analyze it to effectively use the resulting insight. Big data draws from text, images, audio, video; plus it completes missing pieces through data fusion.
Velocity: In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development. Big data is often available in real time.
Veracity: The quality of captured data can vary greatly, which affects the accuracy of any analysis.
Today, companies must be proficient at effectively using big data to create personalized experiences for their customers. Certain companies are great at utilizing big data, such as Amazon, which is managing its access to big data to build brand loyalty and increase sales.
For example, I recently purchased an exercise bike on Amazon.com for my mother to use for physical therapy following her knee replacement surgery. The purchase was convenient because Amazon has my credit card information saved, and delivery to my mother was easy because Amazon has her address saved from previous packages I’ve sent her. Finally, based on a suggestion in Amazon’s follow-up email, I purchased a floor mat for the bike, which I didn’t initially think about but ended up needing.
The following information was cited in Wikipedia, under the definition of big data, describing the sheer volume of data that large digital companies contend with:
- eBay.com uses two data warehouses at 7.5 petabytes and 40PB as well as a 40PB Hadoop cluster for search, consumer recommendations, and merchandising.
- Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005, they had the world’s three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB.
- Facebook handles 50 billion photos from its user base. As of June 2017, Facebook reached 2 billion monthly active users.
- Google was handling roughly 100 billion searches per month as of August 2012.
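To put the figures in the list above on a single scale, here is a small Python sketch. It is only back-of-the-envelope arithmetic and rests on two assumptions of mine that are not in the cited figures: decimal units (1 petabyte = 1,000 terabytes) and a 30-day month. The function names are made up for illustration, and the eBay and Amazon numbers come from different years, so the comparison is rough at best.

```python
# Figures quoted in the list above.
EBAY_PETABYTES = [7.5, 40, 40]             # two data warehouses plus a Hadoop cluster
AMAZON_2005_TERABYTES = [7.8, 18.5, 24.7]  # three Linux databases, as of 2005
GOOGLE_SEARCHES_PER_MONTH = 100_000_000_000  # as of August 2012

# Assumptions for this sketch (not part of the cited figures).
TB_PER_PB = 1000                       # decimal units
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # a 30-day month


def ebay_total_tb():
    """Total eBay storage from the list, converted to terabytes."""
    return sum(EBAY_PETABYTES) * TB_PER_PB


def amazon_total_tb():
    """Combined capacity of the three Amazon databases, in terabytes."""
    return sum(AMAZON_2005_TERABYTES)


def searches_per_second():
    """Average Google searches per second implied by the monthly figure."""
    return GOOGLE_SEARCHES_PER_MONTH / SECONDS_PER_MONTH


if __name__ == "__main__":
    print(f"eBay total:   {ebay_total_tb():,.0f} TB")
    print(f"Amazon total: {amazon_total_tb():,.1f} TB")
    print(f"eBay / Amazon ratio: about {ebay_total_tb() / amazon_total_tb():,.0f}x")
    print(f"Google: about {searches_per_second():,.0f} searches per second")
```

Under those assumptions, the eBay systems in the list hold well over a thousand times what Amazon's three largest databases held in 2005, and Google's monthly total works out to tens of thousands of searches every second.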
Looking at these numbers, it’s hard to wrap one’s mind around the concept of big data, and it’s easy to be overwhelmed by the idea. But despite my family members’ misgivings and trepidations, I don’t fear big data because it’s just information. I’m excited about its myriad possibilities.