README
Databases
Can have a master slave distrubuting reads
Can have sharding handle different parts of requests
See AWS Storage
SQL VS NoSQL
SQL
battle-tested and refined
better consistency/ACID(Atomic Consisent Isolation Durability)
NoSQL
easier to use
more scalable(not a real problem until massive)
faster joins
SQL
SQL has tables and a schema managed by a RDMS(Relational Database Management System)
battle-tested and refined
better consistency/ACID(Atomic Consisent Isolation Durability)
NoSQL
easier to use
more scalable(not a real problem until massive)
faster joins
no complex transactions or constraints on data
BASE(Basically Available Soft-state Eventual Consistency) trades C and I for availability, graceful degradation, and performance
Types of SQL
Key value store i.e Memcached, Redis
Tabular i.e BigTable(Google)
Document Oriented i.e Mongodb
Data Warehousing
Hadoop is good for huge(PB) analytics using distrubuted file systems based on GFS(google file system) from paper in 2003
Map Reduce is used to do analytics
Hive Query Language is used to turn SQL queries into Map Reduce Jobs
Last updated