How to choose the right cloud data warehouse for your company - The Ultimate Checklist
Information is wealth. Today, less than 0.5% of data is actually being used, and businesses lose over $600 billion a year because of bad data. That means that, by and large, businesses are not properly storing, using and analyzing the data that they have coming in each day.
This is a huge problem, especially in today’s data-driven culture. Your data is only powerful if you can use it - and it is no good to you if you can’t properly organize and analyze it.
For many, this bad data problem comes from choosing the wrong type of data storage and running ineffective analytics as a result. Here, we break down the basics of data warehouses and discuss the important criteria that companies must keep in mind when considering a data warehouse for their business
Choosing a data warehouse is a crucial decision that you may have to make, and a bad decision can haunt you for years. When it comes to popular cloud data warehousing solutions, you have,
Cloud Data Warehouse Evaluation Checklist:
In the big picture, it’s important to choose a data warehouse solution that will fit in with your business model, your budget and your existing systems. Here's a comprehensive checklist for you to evaluate. Make sure it ticks all the crucial boxes.
1. Rich SQL Support: Its obvious, The cloud data warehouse should offer good SQL support. Run the below tests to make sure the product works fine and covers all your use cases.
Create a Table |
Create a View |
Create a Table as Select from Another table (CTAS) |
Rename a Table |
Create a View |
Create a View with complex joins |
Create views with multiple layers |
Alter table add, modify, drop columns |
Delete a Table, Delete a Table using complex joins |
Access provisioning - Test Groups, Roles and Grants |
Join Multiple tables on non Key fields |
Test Updates on Tables |
Test Inserts into Table |
Test Merge or Upsert statements |
Test Analytical functions - ROW_NUM, MIN, MAX, AVG, PERCENTILE |
Test Recursive functions |
Test ETL Loads and Unloads |
Thoroughly check all the datatypes are present and supported by your reporting tools |
Transaction control - Does it rollback the batch if any transaction fails |
Check support for Constraints - Identity, Not Nulls, Primary Keys, Foreign Keys |
Check connectivity and driver support with your consuming applications |
Test Stored procedures, macros, conditional constructs etc |
2. Speed : Perform speed benchmarks between your current solution vs the cloud data warehouse. Its essential to have a good QoS (Quality of Service) in your data warehouse. After all we want to provide an exceptional experience for your analytical users.
Performance benchmarks between your-current-solution vs New-cloud-datawarehouse |
Concurrency and spike load testing to ensure the solution scales with load |
Involve your reporting users to test their dashboards and gather their feedback |
3. Scalability & Concurrency : We need to ensure the cloud data warehouse scales in proportion to the load and also supports concurrent read / write operations.
Spike testing - Test if the cloud Data Warehouse scales up to handle any surge in the no of queries |
Test concurrent write / reads to make sure there is no lock waits |
Test the latency between Read and Write Operations |
Test Scale down functionality |
4. Cost and Return-on-Investment: The POC is never complete without performing extensive tests on the operating costs. After all, its our primary goal to evaluate the ROI, we need to make sure the solution is sustainable for our organisation
Estimate Storage costs - consider active storage and long term storage discounts |
Estimate Compute costs |
Estimate Other Fixed costs |
Estimate the staff training and migration costs |
Estimate the opportunity costs, to make sure there is no impact to the ongoing operations |
Run cost benchmarks with your on premise solution |
Run an ROI on the cloud proposal |
5.Innovation: Modern cloud platforms come with a bunch of innovative features. These features will save you costs as well as make your life easier. Here are few of the cool things you need to look for, before you settle on your platform
Time travel - Cloud DW platforms let you go back in time and look at the data at that point in time. You can also use this to recover any accidentally deleted data |
Zero copy Cloning - You can duplicate your database, tables without paying for storage twice |
Automated Backups |
Automated Restores |
Caching of query results - Would help you on your cloud bill |
Compression - Would help you on your cloud storage bill |
System schemas - To get metadata about your objects |
Cost monitoring and Alerts |
6.Security: Typical cloud providers stay hyper up-to-date with security patches and protocols to keep their host of customers safe and happy. Make sure the boxes are checked while choosing your cloud data warehouse.
Is the data at rest encrypted |
Is the data in transit encrypted |
Are there tools for handling sensitive data elements (PII) |
Authentication controls - SSO / 2 Factor Auth |
How are the encryption keys managed |
Cloud data warehouse provides you low entry of cost and happily scales as your company grows. Hope this checklist helps you choose the right data warehouse for your company and also help you run effective analytics and BI that you will need for long-term stability and success
victor
posted on 04 Dec 19Enjoy great content like this and a lot more !
Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds
Post Comment