Data Analytics from the Perspective of AWS

TL;DR

Data Analytics at a Glance

Analysis is a detailed examination of something in order to understand its nature or determine its essential features. Data analysis is the process of compiling, processing, and analyzing data so that you can use it to make decisions.

Analytics is the systematic analysis of data. Data analytics is the specific analytical process being applied.

Data analysis solutions, which are broader than big data solutions, consist of four stages: gathering, storing, processing, and visualizing data.

The challenges in many data analysis solutions can be summarized as the five Vs: volume, velocity, variety, veracity, and value.

Structured vs Semi-structured vs Unstructured data

  • Structured data is organized and stored in the form of values that are grouped into rows and columns of a table. Commonly stored in relational databases.

  • Semi-structured data is often stored in a series of key-value pairs that are grouped into elements within a file. Often stored in NoSQL databases or CSV, XML or JSON files.

  • Unstructured data is not structured in a consistent way. Some of it may have structure similar to semi-structured data, while other data may carry only metadata. Often takes the form of files or objects.
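
The three categories above can be illustrated with a small sketch using only the Python standard library. The table name, field names, and sample values are made up for illustration; the point is only the shape of the data in each case.

```python
import json
import sqlite3

# Structured: rows and columns with a fixed schema, here an in-memory relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "Lisbon"), (2, "Bo", "Oslo")],
)
structured_rows = conn.execute(
    "SELECT id, name, city FROM customers ORDER BY id"
).fetchall()

# Semi-structured: key-value pairs grouped into elements; the keys may differ per element.
semi_structured = json.loads(
    '[{"id": 1, "name": "Ana", "tags": ["vip"]},'
    ' {"id": 2, "name": "Bo", "city": "Oslo"}]'
)
keys_per_element = [sorted(item.keys()) for item in semi_structured]

# Unstructured: an opaque object with no consistent internal layout;
# often only metadata about it is available.
unstructured_object = b"Call notes: Ana asked about delivery times..."
object_metadata = {"filename": "call-notes.txt", "size_bytes": len(unstructured_object)}

print(structured_rows)
print(keys_per_element)
print(object_metadata)
```

Every row of the table has the same three columns, whereas the two JSON elements carry different keys, and the byte string is described only by its metadata.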


Learning Paths towards AWS Certification

TL;DR

AWS IaaS

Amazon S3

Scalable, durable object storage; decouples storage from processing; supports parallelization; keeps data centralized and accessible, which avoids moving it between systems.

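An S3 bucket policy and an IAM policy are different: a bucket policy is resource-based and attached to the bucket, while an IAM policy is identity-based and attached to a user, group, or role.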
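Blocking traffic from an offending website's IP address with a security group does not work: security groups support only “allow” rules, not “deny” rules. A network ACL, which supports explicit deny rules, can be used instead.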
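
To make the bucket-policy versus IAM-policy distinction concrete, here is a minimal sketch that builds one of each as a plain Python dictionary. The bucket name, account ID, and user name are hypothetical placeholders, and the policies are deliberately tiny; a real policy would usually need more statements.

```python
import json

# Hypothetical identifiers, for illustration only.
BUCKET_ARN = "arn:aws:s3:::example-bucket"
USER_ARN = "arn:aws:iam::111122223333:user/example-user"

# Resource-based policy: attached to the bucket, so it must say WHO is
# allowed through a Principal element.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowReadForOneUser",
        "Effect": "Allow",
        "Principal": {"AWS": USER_ARN},
        "Action": "s3:GetObject",
        "Resource": f"{BUCKET_ARN}/*",
    }],
}

# Identity-based policy: attached to a user, group, or role, so the principal
# is implied by the attachment and no Principal element appears.
iam_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowReadOnOneBucket",
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": f"{BUCKET_ARN}/*",
    }],
}

print(json.dumps(bucket_policy, indent=2))
print(json.dumps(iam_policy, indent=2))
```

Both documents grant the same action on the same objects; what differs is where the policy is attached and therefore whether the principal has to be spelled out.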

AWS IaC

CloudFormation

AWS CloudFormation treats infrastructure / environment as code.
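
As a rough illustration of "infrastructure as code", the sketch below builds a minimal CloudFormation template as a Python dictionary and serializes it to JSON. The logical ID `NotesBucket` and the description text are assumptions made up for this example; the template declares a single versioned S3 bucket.

```python
import json

# A minimal CloudFormation template expressed as a Python dict.
# "NotesBucket" is a hypothetical logical ID chosen for this sketch.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: one versioned S3 bucket declared as code",
    "Resources": {
        "NotesBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        },
    },
}

# The JSON text is what would be handed to CloudFormation.
template_body = json.dumps(template, indent=2)
print(template_body)
```

The printed JSON could be saved to a file and kept in version control alongside application code, then deployed with a tool such as `aws cloudformation deploy`.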

AWS DBaaS

RDS

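RDS automated backups take a full backup daily, and transaction logs are backed up every 5 minutes. The retention period is configurable up to 35 days (7 days is the default when the instance is created in the console). Manual DB snapshots are retained until you delete them.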
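
The backup figures above imply a window of times a database can be restored to. The sketch below computes that window under a simplified model: the earliest point is "now minus the retention period" and the latest point is "now minus one 5-minute log interval". The function name and the model itself are assumptions for illustration, not the RDS API.

```python
from datetime import datetime, timedelta, timezone

# Transaction-log backup cadence taken from the notes above.
LOG_BACKUP_INTERVAL = timedelta(minutes=5)
MAX_RETENTION_DAYS = 35


def restore_window(now, retention_days):
    """Return (earliest, latest) restorable times under the simplified model."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35 in this sketch")
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_BACKUP_INTERVAL
    return earliest, latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
earliest, latest = restore_window(now, retention_days=7)
print(earliest.isoformat(), latest.isoformat())
```

With a 7-day retention period, the window in this model runs from one week before `now` up to five minutes before `now`.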

RDS can have up to 5 read replicas, within an AZ, cross-AZ, or cross-Region. Replication is asynchronous, so reads from a replica are eventually consistent. A replica can be promoted to a standalone database. It is also possible to set up Multi-AZ read replicas for disaster recovery (DR).
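
The effect of asynchronous replication on reads can be shown with a toy in-memory model. The `Primary` and `ReadReplica` classes below are hypothetical and have nothing to do with the RDS API; they only show why a read from a replica can briefly return stale data, and what promotion carries over.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class ReadReplica:
    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        """Apply log entries not yet replicated (the asynchronous step)."""
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def promote(self):
        """Return a standalone primary holding only what was replicated."""
        new_primary = Primary()
        new_primary.data = dict(self.data)
        new_primary.log = list(self.primary.log[:self.applied])
        return new_primary


primary = Primary()
replica = ReadReplica(primary)

primary.write("order:1", "paid")
stale = replica.read("order:1")  # the replica has not applied the write yet
replica.catch_up()
fresh = replica.read("order:1")  # after replication the read is up to date
print(stale, fresh)
```

In this model a write is visible on the primary immediately but on the replica only after `catch_up` runs, which is the eventual consistency described above. A promoted replica starts from whatever it had applied at that moment.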
