Data Science for Enthusiastic People
Most people who come from relational databases (the majority of statisticians who used to work with SAS) have some trouble when we talk about the new paradigm of storage and analytics on (big) data.
Here are some quick definitions that can help in understanding the Big Data world.
- DATABASES
Two types of databases:
- Relational databases
based on the ACID principles
Atomicity: all operations in a transaction complete, or none do
Consistency: the database is in a consistent state before and after the transaction
Isolation: the transaction operates as if it were the only operation being performed on the database
Durability: once the transaction completes, its changes will not be lost or reversed
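To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names alice and bob, and the transfer and balances helpers are all hypothetical, chosen only for illustration. The credit is applied first so that, when the debit violates the CHECK constraint, there is already a change that has to be undone.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside a single transaction (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False

def balances(conn):
    """Return the current balances as a dict (hypothetical helper)."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # neither update is kept
print(ok, failed, balances(conn))
```

The second transfer would overdraw alice, so the whole transaction is rolled back and bob does not keep the credit he received a moment earlier: all operations complete, or none do.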
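- NoSQL databases
based on the CAP principle (three properties, of which a distributed system can fully guarantee at most two at the same time)
Consistency: no contradiction in the data; every read sees the latest write
Availability: every operation terminates with an intended response
Partition tolerance: the system keeps working when a network failure cuts nodes off from each other (fault tolerance)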
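The trade-off between these three properties can be illustrated with a toy model. This is only a sketch, not how any real NoSQL store is implemented: the Replica and TinyCluster classes and the "CP"/"AP" mode labels are hypothetical names made up for this example. It models two copies of a key-value store and shows the two choices available once the network between them is cut.

```python
class Replica:
    """One copy of the data (hypothetical class for this sketch)."""
    def __init__(self):
        self.data = {}

class TinyCluster:
    """Two replicas; mode is 'CP' (stay consistent) or 'AP' (stay available)."""
    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when a and b cannot talk to each other

    def write(self, key, value):
        if not self.partitioned:
            # Healthy network: both replicas receive the write.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Refuse the write: consistency is kept, availability is given up.
            return False
        # AP: answer the client using the reachable replica only,
        # so the two replicas now hold different data.
        self.a.data[key] = value
        return True

cp = TinyCluster("CP")
cp.write("x", 1)
cp.partitioned = True
cp_accepted = cp.write("x", 2)

ap = TinyCluster("AP")
ap.write("x", 1)
ap.partitioned = True
ap_accepted = ap.write("x", 2)

print(cp_accepted, cp.a.data == cp.b.data)  # write refused, replicas agree
print(ap_accepted, ap.a.data == ap.b.data)  # write accepted, replicas differ
```

In the CP cluster the write during the partition is rejected and both replicas still agree; in the AP cluster the write is accepted but the replicas diverge until they can be reconciled.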
Shortcomings of RDBs
- Schema
The schema may be gigantic, making it hard to understand
CONS=>Huge data are hard to…