Secure Data

Securing data access is the function of datasets in QuickSight. (You can find more details on it here )
Q works with datasets that have Row level security enabled. We will see this in action in below exercises.
Datasets with column level security are not supported by Q.

Exercise 1 - Check unsecured data
Let’s first get a sense of the data that is available in our monthly summary dataset.

  1. Ask Q (Pause after typing Tiers, for suggestions, ahead of submitting the question) - Show me overall sales for all Tiers
    Note that Q provides Tier-1,2 and 3 in the suggestion list and that overall sales is 2.2M.
  2. Ask Q - Show me sales by tier
    Note that Tier-1 has 1.08M, Tier-2 has 0.84M and Tier-3 has 0.37M in sales.
  3. If you are in topic view, click QuickSight icon to exit it.
QWorkshop-SecureData-Exercise

Exercise 2 - Create RLS rules dataset

  1. Open Datasets view from left panel and click New dataset button.
  2. Select S3, enter following details
    Data source name: SaaS-Sales-MonthlySummary-QRLS
    Manifest file URL: s3://ee-assets-prod-us-east-1/modules/337d5d05acc64a6fa37bcba6b921071c/v1/SaaSSalesQRLSManifest.json
    and click Connect button.
  3. On Finish dataset creation dialogue, click Edit/Preview data button.
  4. Click the User menu dropdown from top panel and followed by the Username tile therein. Copy the Username to clipboard and click Close button.
  5. Click Add calculated field and paste the Username into the calculation box.
    Then set up the calculation as shown below.
    Name: UserName
    Calculation: ifelse(User='User1', '<Fill your username>', User)
    and click Save button.
    In a true RLS dataset, we will be bringing in the UserName/UserArn directly. Here, we are creating that through a calculation just for ease of switching the applied rules and seeing their impact on Q answers.
  6. From left panel, open User fields menu and click Exclude field.
    Note that the preview shows Tier-1,Tier-2 against your UserName now. This rule, once applied on SaaS-Sales-MonthlySummary dataset, will tell QuickSight to show you data for only Tier-1 and Tier-2 products. Likewise, the rule for User2 grants access to Tier-3 products and the rule against Admin, which is blank in Tier column, grants access to all Tiers. These two additional rules are given in the file to allow you to tweak the calculation, have these apply to your username, and explore the end user experience later on.
  7. Click Save & publish / Save button.
  8. Click QuickSight icon to exit data prep view.
QWorkshop-SecureData-Exercise

Exercise 3 - Apply Row Level Security to SaaS-Sales-MonthlySummary dataset

  1. From Datasets view, click on SaaS-Sales-MonthlySummary dataset.
  2. Click Row-level security button.
  3. Expand User-based rules section.
  4. Select SaaS-Sales-MonthlySummary-QRLS dataset (Search for it if needed) and click Apply dataset button.
  5. In confirmation screen, click Apply and activate button.
  6. Click Back to datasets link to exit RLS configuration screen.
QWorkshop-SecureData-Exercise

Exercise 4 - Check topic state Let’s see the impact of above action on our topic.

  1. Click Topics from left panel and then on SaaS-Sales topic to open it.
  2. Scroll down on to Datasets section and expand SaaS-Sales-MonthlySummary dataset. Note that Q acknowledges that the dataset has got RLS enabled.
  3. Scroll up and click on Data tab.
  4. From View dropdown, click on SaaS-Sales to deselect it. Note that the Non date dimension fields - Product and Tier - has been disabled.
  5. Try clicking the slider to enable these fields.
    These are disabled on purpose to ensure that there is no data leakage via Q for RLS enabled datasets. We will see the options to enable this in later part of this module.
QWorkshop-SecureData-Exercise

Exercise 5 - Force query routing and check results
We have secured only one of the two datasets in our topic. It can be tricky to verify results ahead of enabling the dimensions in the secured dataset. So, let’s add couple of calculations that will help us force Q to route the query to the intended dataset.

  1. Click View dropdown and choose the Select all option to include both datasets.
  2. Click Add calculated field button,
    Type name as LGS (short for Lower Grain Sales),
    select SaaS-Sales dataset from right panel,
    add Sales field to the calculation box. (Type in directly/Search and double click from right panel Fields list)
    and click Save button.
  3. Click Add calculated field button,
    Type name as HGS (short for Higher Grain Sales),
    select SaaS-Sales-MonthlySummary dataset from right panel,
    add Sales field to the calculation box. (Type in directly/Search and double click from right panel Fields list)
    and click Save button.
  4. Ask Q - Show me overall LGS
    Q returns 2.2M result as obtained from the unsecured SaaS-Sales dataset.
  5. Ask Q - Show me overall HGS
    Q returns 1.9M result as obtained from the secured SaaS-Sales-MonthlySummary dataset. The rules applied earlier grants you access to only Tier-1 and Tier-2 products. Hence, the drop in overall sales that you are able to see from this dataset.
  6. Ask Q - Show me sales by tier Q was able to answer this question for us earlier. Now, Q asks us what we meant by tier. This is because the non date dimensions have all been disabled in Q when we enabled RLS on the dataset.
  7. Click QuickSight icon to exit topic view.
QWorkshop-SecureData-Exercise

Exercise 6 - Allow dimensions to be used in Q
As we saw in prior exercise, Q honors the RLS rules applied on underlying dataset in the results that are returned. However, the type ahead suggestions that Q provides are not bound by these RLS rules. So, users might see category values that they do not have access to in the type ahead suggestion list. They selecting such a value will of course return no data. So, allowing dimensions for Q’s use has to be a conscious decision from the dataset owner. We will see how to do this below.

  1. Click on Datasets from left panel and then on SaaS-Sales-MonthlySummary dataset.
  2. Click on Edit dataset button to launch the data prep view.
  3. From left panel, click on Product and Tier to select both fields.
  4. Open the Field menu and click Row level security for Q.
  5. Change the selection to Allow users to ask questions regarding this field
    Note that this should be done for only dimensions that in itself doesn’t have sensitive information.
  6. Click Apply button.
    You should now see a Q icon against both these fields in left panel. This indicates that you have allowed Q to use these fields.
  7. Click Save & publish / Save button.
  8. Click QuickSight icon to exit data prep view.
QWorkshop-SecureData-Exercise

Exercise 7 - Enable dimensions in Q
Now that we allowed these dimensions to be used in Q, let’s return to the topic and enable these dimensions.

  1. Click Topics from left panel and then on SaaS-Sales topic to open it.
  2. Click on Data tab.
  3. From View dropdown, click on SaaS-Sales to deselect it.
  4. Click the sliders to enable Product and Tier fields.
  5. Ask Q (Pause after typing Tiers, for suggestions, ahead of submitting the question) - Show me sales by tiers
    Note that Q provides Tier-1,2 and 3 in the suggestion list even though you currently have access to only Tier-1 and Tier-2. As explained earlier, this is why the access to use the dimension in Q has to be explicitly granted by dataset author and then the dimension be explicitly enabled by Q author.
    Submitting the question will bring back result with Tier-1 and Tier-2 only.
  6. Ask Q - Show me sales by tiers for Tier-3
    To re-emphasize, Q honors the RLS rules in the results. So, Q responds with a No data message since you don’t currently have access to Tier-3 products.
  7. Click on QuickSight icon to exit the topic view.
QWorkshop-SecureData-Exercise

Exercise 8 - Optional - Change RLS rules
Let’s change the RLS rules in the rules dataset and see the impact on response to above questions.

  1. From Datasets view, click on SaaS-Sales-MonthlySummary-QRLS dataset.
  2. Click on Edit dataset button to launch data prep view.
  3. From left panel, open Field menu for UserName and click Edit calculation,
    Change the ifelse function’s condition to User='User2' while retaining rest of the parts as is,
    Click Save button in calculation edit screen and
    Click Save & publish / Save button to save the dataset.
  4. Ask Q - Show me sales by tiers
    Q now returns data for only Tier-3 products as specified in the RLS rule.
  5. Edit the calculation again.
    Change the ifelse function’s condition to User='Admin' while retaining rest of the parts as is,
    Click Save button in calculation edit screen and
    Click Save & publish / Save button to save the dataset.
  6. Resubmit the question in Q bar (Show me sales by tiers)
    Q now returns data for all product tiers per the new RLS rule we applied.
QWorkshop-SecureData-Exercise