Sunday, February 8, 2026

Configure seamless single sign-on with SQL analytics in Amazon SageMaker Unified Studio


Amazon SageMaker Unified Studio provides a unified experience for using data, analytics, and AI capabilities. SageMaker Unified Studio now supports trusted identity propagation (TIP) for SQL workloads, enabling fine-grained data access control based on individual user identities. Organizations can use this integration to manage data permissions through AWS Lake Formation while using their existing single sign-on (SSO) infrastructure.

Organizations already using Amazon Redshift with TIP can extend their existing Lake Formation permissions to SageMaker Unified Studio. Users simply log in through SSO and access their authorized data using the SQL editor, maintaining consistent security controls across their analytics environment.

This post demonstrates how to configure SageMaker Unified Studio with SSO, set up projects and user onboarding, and access data securely using integrated analytics tools.

Solution overview

For our use case, a retail company is planning to implement sales analytics to identify sales patterns and product categories that are performing well. This will help the sales team improve sales planning with targeted promotions and help the finance team plan budgets with better inventory management. The company stores a customer table in an Amazon Simple Storage Service (Amazon S3) data lake and a store_sales table in a Redshift cluster.

The company uses SageMaker Unified Studio as the UI, with users onboarded from their identity provider (IdP) to AWS IAM Identity Center with TIP. Amazon SageMaker Lakehouse centralizes data from Amazon S3 and Amazon Redshift, and Lake Formation provides fine-grained access control based on user identity. For our example use case, we explore two different users. The following table summarizes their roles, the tools they use, and their data access.

User | Group | Tool | Data access
Ethan (Data Analyst) | Sales | Amazon Athena for interactive SQL analysis | Non-sensitive customer data (id, c_country, birth_year) and full access to the store_sales table
Frank (BI Analyst) | Finance | Amazon Redshift for reports and visualization | US customer data (c_country='US')
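
To make the access model in the table concrete, the following Python sketch simulates it on a few in-memory rows. It is an illustration only, not how Lake Formation enforces permissions: the sample rows, the `email` column standing in for a sensitive attribute, and the `visible_rows` helper are all hypothetical.

```python
# Hypothetical sample of the customer table; `email` stands in for a sensitive column.
CUSTOMERS = [
    {"id": 1, "c_country": "US", "birth_year": 1985, "email": "a@example.com"},
    {"id": 2, "c_country": "DE", "birth_year": 1990, "email": "b@example.com"},
    {"id": 3, "c_country": "US", "birth_year": 1978, "email": "c@example.com"},
]

# Access rules taken from the table above, keyed by user.
POLICIES = {
    # Ethan: every row, but only the non-sensitive columns.
    "Ethan": {"columns": ["id", "c_country", "birth_year"], "row_filter": None},
    # Frank: only rows where c_country = 'US'; no column restriction is stated.
    "Frank": {"columns": None, "row_filter": ("c_country", "US")},
}


def visible_rows(user, rows):
    """Return the rows and columns that `user` may see under POLICIES."""
    policy = POLICIES[user]
    row_filter = policy["row_filter"]
    result = []
    for row in rows:
        if row_filter is not None and row[row_filter[0]] != row_filter[1]:
            continue  # row-level filter: skip rows outside the allowed country
        columns = policy["columns"] or list(row)
        result.append({name: row[name] for name in columns})  # column-level filter
    return result


if __name__ == "__main__":
    for user in POLICIES:
        print(user, visible_rows(user, CUSTOMERS))
```

The two rules are independent: a column filter narrows what each row exposes, while a row filter narrows which rows are returned at all.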

The following diagram illustrates the solution architecture.

SageMaker Unified Studio with IAM Identity Center simplifies the user journey from authentication to data analysis. The workflow consists of the following steps:

  1. Users sign in with organizational SSO credentials through their IdP and are redirected to SageMaker Unified Studio.
  2. Users configure IAM Identity Center authentication for Amazon Redshift, linking identity management with data access.
  3. Users access the query editor for Amazon Redshift or SageMaker Lakehouse, triggering IAM Identity Center federation to generate session and access tokens.
  4. SageMaker Unified Studio retrieves user authorization details and group membership using the session token.
  5. Users are authenticated as IAM Identity Center users, ready to explore and analyze data using Amazon Redshift and Amazon Athena.

To implement our solution, we walk through the following high-level steps:

  1. Set up SageMaker Lakehouse resources.
  2. Create a SageMaker Unified Studio domain with SSO and TIP enabled.
  3. Configure Amazon Redshift for TIP and validate access.
  4. Validate data access using Amazon Athena.

Prerequisites

Before you begin implementing the solution, you must have the following in place:

  1. If you don’t have an AWS account, you can sign up for one.
  2. We provide utility scripts to help set up various sections of the post. To use them:
    1. Right-click this link and save the utility scripts zip file.
    2. Unzip the file in an environment with a terminal that has the AWS Command Line Interface (AWS CLI) configured. You can also use AWS CloudShell.
    3. Run the scripts only when prompted in the relevant sections.

    Note: The utility scripts are configured for the us-east-1 Region. If you need another Region, edit the Region in the scripts before running them.

  3. To deploy the infrastructure, right-click this link and choose ‘Save Link As’ to save it as sagemaker-unified-studio-infrastructure.yaml. Then upload the file when creating a new stack in the AWS CloudFormation console, which will create the following resources:
    1. An S3 bucket to hold the customer data used in this post.
    2. An AWS Identity and Access Management (IAM) role called DataTransferRole with permissions as defined in Prerequisites for managing Amazon Redshift namespaces in the AWS Glue Data Catalog.
    3. An IAM role called IAMIDCRedshiftRole, which will be used later to set up the IAM Identity Center Redshift application.
    4. An IAM role called LakeFormationRegistrationRole, following the instructions in Requirements for roles used to register locations, and necessary IAM policies.
  4. If you don’t have a Lake Formation user, you can create one. For this post, we use an admin user. For instructions, see Create a data lake administrator.
  5. If IAM Identity Center is not enabled, refer to Enabling AWS IAM Identity Center for instructions to enable it.
    1. If you need to migrate existing Redshift users and groups, use the IAM Identity Center Redshift migration utility.
    2. For a quick way to test the feature and familiarize yourself with the process, we provide a script to generate mock users and groups. Run the setup-idc.sh script, which is provided in Step 2, to create test users and groups in IAM Identity Center for demonstration purposes.
  6. Integrate IAM Identity Center with Lake Formation. For instructions, see Connecting Lake Formation with IAM Identity Center.
  7. Register the S3 bucket as a data lake location:
    1. On the Lake Formation console, choose Data lake locations in the navigation pane.
    2. Choose Register location.
    3. For the role, use LakeFormationRegistrationRole.
  8. Create an IAM Identity Center Redshift application, as detailed in our earlier post:
    1. On the Amazon Redshift console, choose IAM Identity Center connections in the navigation pane and choose Create application.
    2. For both the display name and application name, enter redshift-idc-app.
    3. Set the IdP namespace to awsidc.
    4. Choose IAMIDCRedshiftRole as the IAM role.
    5. Choose Next to create the application.
    6. Take note of the application Amazon Resource Name (ARN) to use in subsequent steps. The ARN format is arn:aws:sso:::application/ssoins-/apl-.
  9. If you don’t have existing Redshift tables to work with, run the script setup-producer-redshift.sh, which is provided in Step 2, to create a producer namespace and workgroup, set up a sample sales database, and generate necessary tables with test data.
  10. The post also uses simulated customer data stored in the AWS Glue Data Catalog. To set up this data and configure the required Lake Formation permissions, run the setup-glue-tables-and-access.sh script provided in Step 2.
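
Later steps need the application ARN noted in step 8. As a small sanity check before reusing it, the following Python sketch splits such an ARN into its parts. The sample ARN and the `parse_idc_application_arn` helper are hypothetical, and the pattern assumes that the elided segments of the format above are a 12-digit account ID, an instance ID, and an application ID.

```python
import re

# Assumed shape: arn:aws:sso::<account>:application/ssoins-<id>/apl-<id>
ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):sso::(?P<account>\d{12}):"
    r"application/(?P<instance>ssoins-[0-9a-z]+)/(?P<application>apl-[0-9a-z]+)$"
)


def parse_idc_application_arn(arn):
    """Return the ARN components as a dict, or raise ValueError if malformed."""
    match = ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM Identity Center application ARN: {arn!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Made-up example value; substitute the ARN shown for redshift-idc-app.
    sample = (
        "arn:aws:sso::123456789012:application/"
        "ssoins-0123456789abcdef/apl-fedcba9876543210"
    )
    print(parse_idc_application_arn(sample))
```

Failing fast on a truncated or mistyped ARN is cheaper than debugging an authentication error in a later step.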

Set up SageMaker Lakehouse resources

In this section, we configure the foundational lakehouse resources required for SageMaker to access and analyze data across multiple storage systems. We register the Redshift instance to the AWS Glue Data Catalog to make warehouse data discoverable, and establish Lake Formation permissions on lakehouse resources for user identities to provide secure, governed access to both data lake and data warehouse resources from within SageMaker environments.

Register Redshift instance to the Data Catalog

In this step, we use the store_sales data, which we created earlier using the setup-producer-redshift.sh script. You can register entire clusters to the Data Catalog and create catalogs managed by AWS Glue. To register a cluster to the Data Catalog, complete the following steps:

  1. On the Lake Formation console, choose Administrative roles and tasks in the navigation pane.
  2. Under Data lake administrators, choose Add.
  3. Choose Read-only administrator, then choose AWSServiceRoleForRedshift.
  4. On the Amazon Redshift console, open your namespace.
  5. On the Actions dropdown menu, choose Register with AWS Glue Data Catalog, then choose Register.
  6. Sign in to the Lake Formation console as the data lake administrator and choose Catalogs in the navigation pane.
  7. Under Pending catalog invitations, select the namespace and accept the invitation by choosing Approve and create catalog.
  8. Provide the name for the catalog as salescatalog.
  9. Select Access this catalog from Apache Iceberg compatible engines, choose DataTransferRole for the IAM role, then choose Next.
  10. Choose Add permissions and choose the admin IAM role under IAM users and roles.
  11. Select Super user for catalog permissions and choose Add.
  12. Choose Next.
  13. Choose Create catalog.

Set up Lake Formation permissions on lakehouse resources for user identities

In this section, we configure Lake Formation permissions to enable secure access to lakehouse resources for federated user identities. Lake Formation provides fine-grained access control that works with IAM Identity Center, so you can manage permissions centrally while maintaining security boundaries.

We focus on granting database access to IAM Identity Center groups in Lake Formation and setting table-level permissions for federated Redshift catalog tables. These permissions form the security foundation for our federated query architecture, enabling users to access both S3 data lake and Redshift data warehouse resources through a unified interface.

Grant database access to IAM Identity Center groups in Lake Formation

After you share your Redshift catalog with the Data Catalog and integrate with Lake Formation, you must grant appropriate database access. Follow these steps to set up permissions on your data lake resources for corporate identities:

  1. On the Lake Formation console, under Permissions in the navigation pane, choose Data permissions.
  2. Choose Grant.
  3. Select Principals for Principal type.
  4. Under Principals, select IAM Identity Center and choose Add.
  5. In the pop-up window, if this is your first time assigning users and groups, choose Get started.
  6. Search for and select the IAM Identity Center groups awssso-sales and awssso-finance.
  7. Choose Assign.
  8. Under LF-Tags or catalog resources, choose Named Data Catalog resources.
    1. Choose :salescatalog/dev for Catalogs.
    2. Choose sales_schema for Database.
  9. Under Database permissions, select Describe.
  10. Choose Grant to apply the permissions.

Grant table-level permissions for federated Redshift catalog tables

Complete the following steps to grant table permissions to the IAM Identity Center groups:

  1. On the Lake Formation console, under Permissions in the navigation pane, choose Data permissions.
  2. Choose Grant.
  3. Select Principals for Principal type.
  4. Under Principals, select IAM Identity Center and choose Add.
  5. In the pop-up window, if this is your first time assigning users and groups, choose Get started.
  6. Search for and select the IAM Identity Center group awssso-sales.
  7. Choose Assign.
  8. Under LF-Tags or catalog resources, choose Named Data Catalog resources.
    1. Choose :salescatalog/dev for Catalogs.
    2. Choose sales_schema for Database.
    3. Choose store_sales for Table.
  9. Select Select and Describe for Table permissions.
  10. Choose Grant to apply the permissions.
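
The database and table grants described in the console steps can also be expressed as request parameters for the Lake Formation `GrantPermissions` API, which boto3 exposes as `grant_permissions`. The sketch below only builds the parameter dictionaries and does not call AWS. The account ID and group ID are placeholders, and two details are assumptions to verify against your own account: that an IAM Identity Center group principal is written as `arn:aws:identitystore:::group/<group-id>`, and that the federated catalog ID takes the form `<account-id>:salescatalog/dev`.

```python
import json


def idc_group_principal(group_id):
    """Lake Formation principal for an IAM Identity Center group (assumed ARN form)."""
    return {"DataLakePrincipalIdentifier": f"arn:aws:identitystore:::group/{group_id}"}


def database_grant(account_id, group_id):
    """Parameters for granting DESCRIBE on sales_schema in the salescatalog/dev catalog."""
    return {
        "Principal": idc_group_principal(group_id),
        "Resource": {
            "Database": {
                "CatalogId": f"{account_id}:salescatalog/dev",  # assumed catalog ID form
                "Name": "sales_schema",
            }
        },
        "Permissions": ["DESCRIBE"],
    }


def table_grant(account_id, group_id):
    """Parameters for granting SELECT and DESCRIBE on the store_sales table."""
    return {
        "Principal": idc_group_principal(group_id),
        "Resource": {
            "Table": {
                "CatalogId": f"{account_id}:salescatalog/dev",  # assumed catalog ID form
                "DatabaseName": "sales_schema",
                "Name": "store_sales",
            }
        },
        "Permissions": ["SELECT", "DESCRIBE"],
    }


if __name__ == "__main__":
    # Placeholder IDs; look up the real group ID for awssso-sales in IAM Identity Center.
    print(json.dumps(database_grant("123456789012", "example-group-id"), indent=2))
    print(json.dumps(table_grant("123456789012", "example-group-id"), indent=2))
```

With credentials configured, each dictionary could be passed to `boto3.client("lakeformation").grant_permissions(**params)`. Keeping the payloads as plain data makes them easy to review before anything is applied.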

Create a SageMaker Unified Studio domain with SSO and TIP enabled

For instructions to create a SageMaker Unified Studio domain, refer to Create an Amazon SageMaker Unified Studio domain – quick setup. Because your IAM Identity Center integration is already complete, you can specify an IAM Identity Center user in the domain configuration settings.

Enable TIP in SageMaker Unified Studio

Complete the following steps to enable TIP in SageMaker Unified Studio:

  1. On the SageMaker console, use the AWS Region selector in the top navigation bar to choose the appropriate Region.
  2. Choose View domains and choose the domain’s name from the list.
  3. On the domain’s details page, on the Project profiles tab, choose a project profile, for example, SQL analytics.
  4. Select SQL analytics and choose Edit.
  5. In the Blueprint parameters section, select enableTrustedIdentityPropagationPermissions and choose Edit.
  6. Update the value to true.
  7. To enforce authorization based on TIP, the SageMaker Unified Studio admin can make this parameter non-editable.
  8. Choose Save.

Enable user access for the SageMaker Unified Studio domain

Complete the following steps to enable user access for the SageMaker Unified Studio domain:

  1. Open the SageMaker console in the appropriate Region and choose Domains in the navigation pane.
  2. Choose an existing SageMaker Unified Studio domain where you want to add SSO user access.
  3. On the domain’s details page, on the User management tab, in the Users section, choose Add and Add SSO users and groups.
  4. Choose the user (for this post, we add the user Frank) from the dropdown list and choose Add users and groups.

Add project members

SageMaker Unified Studio projects facilitate team collaboration for various business initiatives. As the project owner, Ethan can now add Frank as a team member to enable their collaboration. To add members to an existing project, complete the following steps:

  1. Sign in to the SageMaker Unified Studio console using the SSO credentials of the project owner (for this post, Ethan).
  2. Choose Select a project.
  3. Choose the project you want to edit.
  4. On the Project overview page, expand Actions and choose Manage members.
  5. Choose Add members.
  6. Enter the name of the user or group you want to add (for this post, we add Frank).
  7. Select Contributor if you want to add the project member as a contributor.
  8. (Optional) Repeat these steps to add more project members. You can add up to eight project members at a time.
  9. Choose Add members.

Create a SQL analytics project in Unified Studio

In this step, we federate into SageMaker Unified Studio and create a project using SQL analytics. Complete the following steps:

  1. Federate into SageMaker Unified Studio using your IAM Identity Center credentials:
    1. On the SageMaker console, choose Domains in the navigation pane.
    2. Copy the SageMaker Unified Studio URL for your domain and enter it into a new browser window.
    3. Choose Sign in with SSO.
    4. A browser pop-up will redirect you to your preferred IdP login page, where you enter your IdP credentials.
    5. If authentication is successful, you’ll be redirected to SageMaker Unified Studio.
  2. After logging in, choose Create project.
  3. Enter a name for your project. This project name is final and can’t be changed later.
  4. (Optional) Enter a description for your project. You can edit this later.
  5. Choose a project profile. For this demo, we choose the SQL analytics profile from the available templates.
  6. Leave the default values as they are or modify them according to your use case, then choose Continue.
  7. Choose Create project to finalize the project and initialize your SQL analytics workspace.

For more detailed information and advanced configurations, refer to Create a project.

Configure Amazon Redshift for TIP and validate access

Run the setup-consumer-redshift.sh script (provided in the prerequisites). This script creates a new namespace and workgroup and adds the required tags, which you’ll use later to integrate with SageMaker Unified Studio compute.

If you are creating the cluster manually, add one of the following tags to the Redshift cluster or workgroup that you want to add to SageMaker Unified Studio:

  • Option 1 – Add a tag to allow only a specific SageMaker Unified Studio project to access it: AmazonDataZoneProject=
  • Option 2 – Add a tag to allow all SageMaker Unified Studio projects in this account to access it: for-use-with-all-datazone-projects=true
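
If you script the manual path, each of the two options reduces to a single key/value pair. The following Python sketch builds that pair. `redshift_access_tag` is a hypothetical helper, and it assumes that the value elided after `AmazonDataZoneProject=` is the ID of your SageMaker Unified Studio project.

```python
def redshift_access_tag(project_id=None):
    """Return the (key, value) tag that exposes Redshift compute to Unified Studio.

    With a project ID (assumed to be the elided tag value), only that project
    gets access (Option 1). Without one, all projects in the account do (Option 2).
    """
    if project_id:
        return ("AmazonDataZoneProject", project_id)
    return ("for-use-with-all-datazone-projects", "true")


if __name__ == "__main__":
    print(redshift_access_tag("example-project-id"))  # Option 1, placeholder ID
    print(redshift_access_tag())                      # Option 2
```

Option 1 is the narrower choice and is usually preferable when several projects share an account.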

Create compute using IAM Identity Center authentication

After you set up your project, the next step is to establish a compute resource connection on the SageMaker Unified Studio console. Follow these steps to add either Amazon Redshift Serverless or a provisioned cluster to your project environment:

  1. Go to the Compute section of your project in SageMaker Unified Studio.
  2. On the Data warehouse tab, choose Add compute.
  3. You can create a new compute resource or choose an existing one. For this post, we choose Connect to existing compute resources, then choose Next.
  4. Choose the type of compute resource you want to add, then choose Next. For this post, we choose Redshift Serverless.
  5. Under Connection properties, provide the JDBC URL or the compute you want to add, which is integrated with IAM Identity Center. If the compute resource is in the same account as your SageMaker Unified Studio project, you can select the compute resource from the dropdown menu. In our example, we use the consumer account that was just provisioned.
  6. Under Authentication, select IAM Identity Center.
  7. For Name, enter the name of the Redshift Serverless or provisioned cluster you want to add.
  8. For Description, enter a description of the compute resource.
  9. Choose Add compute.

The SageMaker Unified Studio Project Compute and Data pages will now display information for that resource.

If everything is configured correctly, your compute will be created using IAM Identity Center. Because your IdP credentials are already cached while you’re logged in to SageMaker Unified Studio, it uses the same credentials to create the compute.

Test data access using Amazon Redshift

When Ethan logs in to SageMaker Unified Studio using IAM Identity Center authentication, he successfully federates and can access customer data from all countries but only for non-sensitive columns. Let’s connect to Amazon Redshift in SageMaker Unified Studio by following these steps:

  1. Choose Actions and choose Open Query editor.
  2. Choose Redshift in the Data explorer pane.
  3. Run the customer sales calculation query to observe that user Ethan (a data analyst) can access customer data from all countries but only non-sensitive columns (id, birth_country, product_id):
    select current_user, c.*, sum(s.sales_amount) as total_sales
    from "awsdatacatalog"."customerdb"."customer" c
    join "dev@salescatalog"."sales_schema"."store_sales" s
    on c.id=s.id
    group by all;
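
To see what the query computes without a Redshift connection, the following Python sketch runs the same join and aggregation against an in-memory SQLite database. It is only an approximation: the rows are invented, the customer table holds just three non-sensitive columns, `current_user` is omitted, and the grouping columns are listed explicitly because SQLite is not assumed to support `GROUP BY ALL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER, c_country TEXT, birth_year INTEGER);
    CREATE TABLE store_sales (id INTEGER, sales_amount REAL);
    INSERT INTO customer VALUES (1, 'US', 1985), (2, 'DE', 1990);
    INSERT INTO store_sales VALUES (1, 10.0), (1, 15.5), (2, 7.0);
    """
)

# Same shape as the Redshift query: join on id, then total sales per customer.
QUERY = """
    SELECT c.id, c.c_country, c.birth_year, SUM(s.sales_amount) AS total_sales
    FROM customer c
    JOIN store_sales s ON c.id = s.id
    GROUP BY c.id, c.c_country, c.birth_year
    ORDER BY c.id
"""

rows = conn.execute(QUERY).fetchall()

if __name__ == "__main__":
    for row in rows:
        print(row)
```

In the Redshift version, `group by all` groups by every non-aggregated column in the select list, which is why the explicit column list here produces the same grouping.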

You have successfully configured Redshift to use IAM Identity Center authentication in SageMaker Unified Studio.

Validate data access using Amazon Athena

When Frank logs in to SageMaker Unified Studio using IAM Identity Center authentication, he successfully federates and can access customer data only for the United States. To query with Athena, complete the following steps:

  1. Choose Actions and choose Open Query editor.
  2. Choose Lakehouse in the Data explorer pane.
  3. Find AwsDataCatalog, expand the database, choose the respective table, and on the options menu (three dots), choose Preview data.

The following demonstration illustrates how user Frank, a BI analyst, can perform SQL analysis using Athena. Because of row-level filtering implemented through Lake Formation, Frank’s access is restricted to customer data from the United States only. Additionally, you can observe that in the Data explorer pane, Frank can only view the customerdb database. The dev@salescatalog database is not visible to Frank because no access has been granted to his group in Lake Formation.
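
Row-level filtering of this kind is defined in Lake Formation as a data cells filter. This post configures it through the setup-glue-tables-and-access.sh script, whose contents are not shown here, so the following Python sketch is only a guess at an equivalent definition. It builds the `TableData` parameter for the `CreateDataCellsFilter` API (boto3 `create_data_cells_filter`) without calling AWS. The filter name `us_customers_only` and the account ID are hypothetical.

```python
import json


def us_only_filter(account_id):
    """Build a data cells filter that keeps every column but only US rows."""
    return {
        "TableCatalogId": account_id,
        "DatabaseName": "customerdb",
        "TableName": "customer",
        "Name": "us_customers_only",  # hypothetical filter name
        "RowFilter": {"FilterExpression": "c_country='US'"},
        "ColumnWildcard": {"ExcludedColumnNames": []},  # exclude no columns
    }


if __name__ == "__main__":
    # Placeholder account ID; replace with the account that owns the Data Catalog.
    print(json.dumps(us_only_filter("123456789012"), indent=2))
```

A filter like this takes effect only once it is granted to a principal, for example Frank’s group, with SELECT permission on the filter rather than on the whole table.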

The IAM Identity Center authentication integration is complete; you can use both Amazon Redshift and Athena through SageMaker Unified Studio in a simplified, all-in-one interface. Note that, at the time of writing, Athena doesn’t work with Redshift Managed Storage (RMS).

Clean up

Complete the following steps to clean up the resources you created as part of this post:

  1. Delete the data from the S3 bucket.
  2. Delete the Data Catalog objects.
  3. Delete the Lake Formation resources and Athena account.
  4. Delete the SageMaker Unified Studio project and associated domain.
  5. If you created a new Redshift cluster for testing this solution, delete the cluster.

Conclusion

In this post, we provided a comprehensive guide to enabling trusted identity propagation within SageMaker Unified Studio. We covered the setup of a SageMaker Unified Studio domain with SSO, the creation of tailored projects, efficient user onboarding with appropriate permissions, and the management of AWS Glue and Amazon Redshift managed catalog permissions using Lake Formation. Through practical examples, we demonstrated how to use both Amazon Redshift and Athena within SageMaker Unified Studio, showcasing secure data access and analysis capabilities. This approach helps organizations maintain strict identity controls while helping data scientists and analysts derive valuable insights from both data lake and data warehouse environments, supporting both security and productivity in machine learning workflows.

For more information on this integration, refer to Trusted identity propagation.


About the authors

Maneesh Sharma

Maneesh is a Sr. Architect at AWS with 15 years of experience designing and implementing large-scale data warehouse and analytics solutions. He works closely with customers to help them modernize their legacy applications to AWS cloud-based platforms.

Srividya Parthasarathy

Srividya is a Senior Big Data Architect with Amazon SageMaker Lakehouse. She works with the product team and customers to build robust features and solutions for their analytical data platform. She enjoys building data mesh solutions and sharing them with the community.

Arun A K

Arun is a Senior Big Data Specialist Solutions Architect at Amazon Web Services. He helps customers design and scale data platforms that power innovation through analytics and AI. Arun is passionate about exploring how data and emerging technologies can solve real-world problems. Outside of work, he enjoys sharing knowledge with the tech community and spending time with his family.
