Data Lake

A data lake is a centralized repository that stores both structured and unstructured data in its raw form at any scale. Teams often land web scrape outputs in object storage such as Amazon S3 or Google Cloud Storage before running downstream analytics.
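As a rough illustration of landing scrape output in object storage, the Python sketch below builds a date-partitioned object key and serializes one raw record. The key layout (raw/<source>/<YYYY>/<MM>/<DD>/<page_id>.json), the record fields, and the function names are assumptions made for this example, not a standard. The upload helper assumes the third-party boto3 library and valid AWS credentials.

```python
import json
from datetime import datetime, timezone


def build_raw_key(source: str, fetched_at: datetime, page_id: str) -> str:
    """Build a date-partitioned object key for one raw scrape record.

    The layout is a hypothetical convention chosen for this sketch.
    """
    day = fetched_at.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"raw/{source}/{day}/{page_id}.json"


def serialize_record(url: str, html: str, fetched_at: datetime) -> bytes:
    """Keep the scraped payload unmodified and attach minimal provenance."""
    record = {"url": url, "fetched_at": fetched_at.isoformat(), "html": html}
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def upload_to_s3(bucket: str, key: str, body: bytes) -> None:
    """Upload one object to Amazon S3.

    Assumes boto3 is installed and credentials are configured; the import is
    deferred so the helpers above run without it.
    """
    import boto3

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)


if __name__ == "__main__":
    ts = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
    key = build_raw_key("example-site", ts, "page-001")
    body = serialize_record("https://example.com/", "<html></html>", ts)
    print(key)  # raw/example-site/2024/01/15/page-001.json
    # upload_to_s3("my-data-lake-bucket", key, body)  # hypothetical bucket name
```

Partitioning keys by source and date is one common way to let downstream jobs read only the slice of raw data they need; other layouts work equally well.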