A Quick Note about (Big) Data: Just to Be Sure We Understand

Data Science for Enthusiastic People

Most people who come from the relational database world (including the majority of statisticians who used to work with SAS) have some difficulty when we talk about the new paradigms of storage and analytics on (big) data.

Here are some quick definitions that can help in understanding the Big Data world.


Two types of databases:

  • Relational databases

based on the ACID principles:

Atomicity: either all operations in a transaction complete, or none do

Consistency: a transaction takes the database from one consistent state to another, so the data is valid before and after it

Isolation: each transaction operates as if it were the only operation being performed on the database

Durability: once a transaction has completed, its changes are not lost or reversed, even after a failure
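
To make atomicity more concrete, here is a minimal sketch in Python using the standard sqlite3 module. The accounts table, the make_db, transfer and balances helpers, and the names and amounts are all hypothetical, made up for this illustration. The credit is deliberately done before the debit, so that a failing debit shows the earlier credit being rolled back as well.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one transaction.

    Returns True if the transfer was committed, False if it was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit breaks the CHECK constraint, the credit above
            # is undone too: all operations complete, or none do.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

With this sketch, calling transfer(conn, "alice", "bob", 500) on a fresh database should be refused and leave both balances untouched, while a transfer of 30 should go through and update both rows together.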

  • NoSQL databases

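often described with the CAP theorem, which states that a distributed system cannot guarantee all three of the following properties at once:

Consistency: no contradiction in the data, every read sees the most recent write

Availability: every operation must terminate with the intended response

Partition tolerance: the system keeps working when the network between its nodes is broken (fault tolerance)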
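
The trade-off between these three properties can be sketched with a toy model in Python. ToyCluster is a hypothetical class written for this note, not the API of any real database, and real systems are far more subtle. It keeps two in-memory copies of one value and shows what each choice costs when the link between the copies is cut: a "CP" cluster refuses the write (it stays consistent but is not available), while an "AP" cluster accepts it (it stays available but the copies disagree).

```python
class ToyCluster:
    """Two in-memory replicas of a single value (a deliberately simplified model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [0, 0]
        self.partitioned = False  # True when the replicas cannot talk to each other

    def write(self, replica, value):
        """Try to write `value` through one replica; return True if accepted."""
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            self.replicas = [value, value]
            return True
        if self.mode == "CP":
            # Prefer consistency: refuse rather than let the replicas diverge.
            return False
        # Prefer availability: accept locally, the other replica is now stale.
        self.replicas[replica] = value
        return True

    def read(self, replica):
        """Read the value as seen by one replica."""
        return self.replicas[replica]
```

A short usage example: create ToyCluster("AP"), set partitioned to True, and write 42 through replica 0. Reading from replica 0 and from replica 1 then gives two different answers, which is exactly the contradiction that the consistency property forbids.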

Shortcomings of relational databases (RDBs)

  • Schema

The schema may become gigantic, which makes it hard to understand

CONS=>Huge data are hard to…

