Step 3: Creating a Clean Room

(Optional) If you already have a test clean room set up for use with the Clean Room API, you can skip this step.

This tutorial assumes that you will create a hybrid clean room to use as your API sandbox. The clean room types available to your organization depend on your LiveRamp contract and use cases. Your exact configuration will depend on many details, such as source storage location, data types, and so on. For help determining your clean room type, contact your LiveRamp representative.

Replace the variable inputs with your IDs and other values.

Endpoint used: cleanrooms

For more information, see the POST /cleanrooms call.

# To create a hybrid clean room in the US on GCP, define the cleanroom_metadata body JSON.

cleanroom_metadata = {
  "type": "Hybrid",
  "parameters": {
    "REGION": "REGION_US",
    "CLOUD": "CLOUD_GCP",
    "SUB_REGION": "SUB_REGION_EAST1"
  },
  "name": "API Tutorial Sandbox",
  "description": "Used to run tutorials",
  "startAt": "<INSERT CURRENT DATE YYYY-MM-DD>"
}

# Create the clean room.
response = cleanroom_api.create_clean_rooms(cleanroom_metadata)
print(response)
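
The startAt placeholder expects the current date in YYYY-MM-DD form. A minimal sketch, assuming the API accepts a plain calendar date as the placeholder suggests, fills that value from the standard library instead of by hand. The helper name build_cleanroom_metadata is hypothetical and is not part of the Clean Room API; only the dictionary layout comes from this tutorial.

```python
from datetime import date


def build_cleanroom_metadata(name, description, start=None):
    """Return a request body like the one above, with startAt filled in.

    Hypothetical helper for illustration only.
    """
    start = start or date.today()
    return {
        "type": "Hybrid",
        "parameters": {
            "REGION": "REGION_US",
            "CLOUD": "CLOUD_GCP",
            "SUB_REGION": "SUB_REGION_EAST1",
        },
        "name": name,
        "description": description,
        # date.isoformat() yields YYYY-MM-DD.
        "startAt": start.isoformat(),
    }


cleanroom_metadata = build_cleanroom_metadata(
    "API Tutorial Sandbox", "Used to run tutorials"
)
print(cleanroom_metadata["startAt"])
```

The resulting dictionary can be passed to create_clean_rooms in place of the hand-written body.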

Tip: To verify that your clean room was created, check the Clean Room UI.

[OPTIONAL EXAMPLES] Create Credentials and Data Connections

For this tutorial, use LiveRamp's synthetic data connections, which are provided to every organization. If these have already been set up for your organization, skip to the next step. If not, contact your LiveRamp representative to request them. To configure a data connection, you also need corresponding credentials. You can use the following example operations if you need to programmatically get credential options and data source options.

# Get credential source IDs.

# response = cleanroom_api.get_credential_source_options()
# print(json.dumps(response, indent=4))

# Get data source IDs.

# response = cleanroom_api.get_data_source_options()
# print(json.dumps(response, indent=4))

The credential and data source GET commands return various configuration parameters. This tutorial uses the synthetic dataset libraries, which are not configurable by the API and should be provisioned in your organization. If you lack access to these libraries, contact your LiveRamp representative to request them for your organization.
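
If you do call these operations, you will usually want a single ID out of the returned options. The sketch below assumes the parsed response is a list of dictionaries that each carry id, name, and displayName keys, mirroring the dataSource object in the data connection example in this tutorial. The real response shape may differ, so inspect the printed JSON first. The helper find_option_id and the sample list are hypothetical.

```python
def find_option_id(options, name):
    """Return the id of the first option whose name or displayName matches."""
    for option in options:
        if name in (option.get("name"), option.get("displayName")):
            return option["id"]
    raise LookupError(f"No option named {name!r}")


# Stand-in for the parsed response. The values are copied from the data
# connection example in this tutorial; the list wrapper is an assumption.
sample_options = [
    {
        "id": "e1a1a9ba-91c4-4343-88d5-effacafad516",
        "name": "CLIENT_GCS_SA",
        "displayName": "Google Cloud Storage(with SA)",
    },
]

print(find_option_id(sample_options, "CLIENT_GCS_SA"))
```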

To complete this tutorial, you don't need to create a credential or data connection. The following example is provided in case you need to create them programmatically. Most Clean Room customers create data connections infrequently, so they typically create them manually in the Clean Room UI.

# Example: Create a GCS data connection (commented out because it is not needed for the tutorial).

# Create the Google Cloud Storage (with SA) Credential

# gcs_cred_json = {<INSERT YOUR JSON HERE>}
# organization_credential_details = {
#    "name" : "API Test GCS Credential",
#    "credentialSourceId" : "7b047e6c-7c98-432f-a959-1d1e1165db61",
#    "credentials" : [
#        { 
#            "name": "Credential JSON",
#            "value": json.dumps(gcs_cred_json)
#        },
#        {
#            "name": "Project ID",
#            "value": "<YOUR PROJECT ID>"
#        }
#    ]
# }

# response = cleanroom_api.create_org_credentials(organization_credential_details)

# Make note of the ID for the credential in the response. This is needed to create the data connection.

# Create the Google Cloud Storage with service account (SA) Data Connection
# Note: To get the inputs for dataSourceConfiguration, call the data-source-parameters endpoint. You need both the data source ID and the data type ID.
# Use dataTypeId = ac8b015e-036d-43ec-8aea-f3aededa1d70 for Generic.

# data_connection_details = {
#    "name" : "API Test GCS Data Connection",
#    "category" : "Testing",
#    "credentialId" : "8abb4a4f-39ca-4199-bc83-8273a65bd4d0",
#    "dataType" : {
#        "displayName":"Generic"
#        },
#    "dataSource" : {
#        "id" : "e1a1a9ba-91c4-4343-88d5-effacafad516",
#        "name" : "CLIENT_GCS_SA",
#        "displayName" : "Google Cloud Storage(with SA)",
#        "credentialSource" : "Google Service Account"
#    },
#    "dataSourceConfiguration" : [
#        {
#            "name" : "FieldDelimiter",
#            "value" : "PIPE"
#        },
#        {
#            "name" : "QuoteCharacter",
#            "value" : "\""
#        },
#        {
#            "name" : "FileFormat",
#            "value" : "CSV"
#        },
#        {
#            "name" : "DataLocation",
#            "value" : "<YOUR gs://bucket-path/>"
#        },
#        {
#            "name": "SampleFilePath",
#            "value" : "< Path to YOUR sample file>"
#        }
#    ]
# }

# Create the data connection

# response = cleanroom_api.create_data_connections(data_connection_details)
# print(json.dumps(response, indent=4))

# Take note of the ID in the response.
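
The credentials and dataSourceConfiguration lists in the example above are both written as name/value pairs. As a convenience, a small helper can build such a list from an ordinary dictionary. This is a sketch only: to_name_value_pairs is a hypothetical name, and the settings shown are the ones from the example, not a complete or validated configuration.

```python
import json


def to_name_value_pairs(settings):
    """Convert {"FileFormat": "CSV"} into [{"name": "FileFormat", "value": "CSV"}]."""
    return [{"name": name, "value": value} for name, value in settings.items()]


data_source_configuration = to_name_value_pairs(
    {
        "FieldDelimiter": "PIPE",
        "QuoteCharacter": '"',
        "FileFormat": "CSV",
    }
)
print(json.dumps(data_source_configuration, indent=4))
```

Dictionaries preserve insertion order in Python 3.7 and later, so the pairs come out in the order they were written.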

[OPTIONAL EXAMPLE] Data Connection Field Mapping

For this tutorial, your synthetic datasets have already been mapped, and the following optional example uses a hard-coded data connection ID. Alternatively, you can store the ID returned in the previous step in a variable and assign it dynamically.
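
A minimal sketch of capturing that ID, assuming the create call returns a dictionary with a top-level "id" key. That shape is an assumption, so check the printed response before relying on it. The helper extract_id and the sample response are hypothetical.

```python
def extract_id(response):
    """Return response["id"], failing loudly if the key is missing."""
    if not isinstance(response, dict) or "id" not in response:
        raise ValueError(f"No 'id' field in response: {response!r}")
    return response["id"]


# Stand-in for the value returned by create_data_connections (hypothetical).
sample_response = {
    "id": "<YOUR data connection id>",
    "name": "API Test GCS Data Connection",
}
data_connection_id = extract_id(sample_response)
print(data_connection_id)
```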

Some Clean Room customers have data connections with hundreds of fields that they don't want to configure manually. The following examples show how to map the fields of such a data connection programmatically through the API.

Before executing this step, check the status of your data connection and ensure it has moved to the MAPPING REQUIRED stage.
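
One way to wait for that stage is to poll until the status matches. The sketch below is deliberately generic: fetch_status is a hypothetical callable that you would implement with whichever call returns your data connection's status, since this tutorial does not show one. Only the target value MAPPING REQUIRED comes from the text above; the other status strings are placeholders.

```python
import time


def wait_for_status(fetch_status, target="MAPPING REQUIRED",
                    attempts=10, delay_seconds=30.0):
    """Call fetch_status() until it returns target or attempts run out."""
    for attempt in range(attempts):
        if fetch_status() == target:
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Demonstration with a fake status source instead of a real API call.
fake_statuses = iter(["<PLACEHOLDER STATUS>", "<PLACEHOLDER STATUS>", "MAPPING REQUIRED"])
reached = wait_for_status(lambda: next(fake_statuses), delay_seconds=0)
print(reached)
```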

Endpoint used: data-connections

For more information, see the following calls:

# Get the list of configurations you need to create for the data connection. This will return the existing values prior to configuration.
# data_connection_id = "<YOUR data connection id>"
# response = cleanroom_api.get_data_connection_field_configurations(data_connection_id)
# print(json.dumps(response, indent=4))

# Update the field mapping with your bulk updates.

# data_connection_id = "<YOUR data connection id>"

# field_configuration_details = [
#    {
#        "fieldName" : "RampID",
#        "dataType" : "STRING",
#        "identifierType" : "RampID",
#        "delimiter" : "",
#        "isPii" : "false",
#        "isUserIdField" : "true",
#        "isExcluded" : "false"
#    },
#    {
#        "fieldName": "MD5_CID",
#        "dataType": "STRING",
#        "identifierType": "Customer First Party Identifier",
#        "delimiter": "",
#        "isPii": "false",
#        "isUserIdField": "true",
#        "isExcluded": "false"
#    }
# ]

# response = cleanroom_api.create_map_field_configurations(data_connection_id, field_configuration_details)

# print(json.dumps(response, indent=4))
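
When a data connection has many fields, writing each dictionary by hand is error-prone. The sketch below generates the same two entries from a compact list of (field name, identifier type) pairs. The helper name build_field_configuration is hypothetical. The keys and the string-valued booleans are copied from the example above, and the defaults (STRING, user ID field, not PII, not excluded) are assumptions that fit only the two identifier fields shown.

```python
import json


def build_field_configuration(field_name, identifier_type, data_type="STRING",
                              is_user_id=True, is_pii=False, is_excluded=False):
    """Build one field configuration entry in the layout used above."""

    def flag(value):
        # The tutorial's example passes booleans as lowercase strings.
        return "true" if value else "false"

    return {
        "fieldName": field_name,
        "dataType": data_type,
        "identifierType": identifier_type,
        "delimiter": "",
        "isPii": flag(is_pii),
        "isUserIdField": flag(is_user_id),
        "isExcluded": flag(is_excluded),
    }


identifier_fields = [
    ("RampID", "RampID"),
    ("MD5_CID", "Customer First Party Identifier"),
]
field_configuration_details = [
    build_field_configuration(name, identifier_type)
    for name, identifier_type in identifier_fields
]
print(json.dumps(field_configuration_details, indent=4))
```

For fields that are not identifiers, review each value individually rather than relying on these defaults.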