In a previous article, we explained how to use DataLakeHouse.io with Snowflake using a Harvest connector.
Now let's create a Data Lake in Google BigQuery using a Salesforce Sales Cloud data source.
First, go to Connections > Sources and add a new Salesforce connection.
Set both the connection name and the Target Schema Prefix to salesforce_salescloud, then select either your Production or Sandbox Salesforce environment. Click Authorize Your Account; you will be redirected to your Salesforce login page. Enter your Salesforce credentials:
After that, you will receive a Connection Successful message.
Open the Actions menu and click Reload Schema from Server. This operation can take a few minutes, depending on the number of entities available to your Salesforce organization.
On the Salesforce Source Connection, go to Actions and click Edit. On the Connection Settings screen, go to the Source Schema tab. You should see all of the metadata objects, as in the screen below.
NOTE: If you can't see any schema objects, return to the Connection Strings and click Re-authorize your connection. After this step, go to Source Schema and click Load Schema again. If the error persists, reconnect to DataLakeHouse.io, clear your browser cache, and create a new Salesforce source connection.
Open Google BigQuery
Go to https://console.cloud.google.com/ and click on Create or Select a Project.
Create a new project named DLH-io and enter dlh-io as the project ID. You will need this ID a few steps later.
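Google Cloud only accepts project IDs in a specific format, which is why the ID here is lowercase while the project name is not. As a minimal sketch, the check below encodes the documented format rules (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen). The helper name is_valid_project_id is hypothetical, and the check covers the format only; it cannot tell whether an ID is already taken.

```python
import re

# Documented Google Cloud project ID format: 6-30 characters, lowercase
# letters, digits and hyphens only, starting with a letter and not ending
# with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")


def is_valid_project_id(project_id: str) -> bool:
    """Hypothetical helper: True if project_id matches the format rules above."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


print(is_valid_project_id("dlh-io"))  # True: the project ID used in this article
print(is_valid_project_id("DLH-io"))  # False: uppercase is fine in a name, not in an ID
```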
After that, you will get a message like this. Click Select Project.
You will be redirected to a page like this. Search for BigQuery.
You should see an empty BigQuery project like this.
Now search for IAM:
On the IAM page, click Grant Access.
Grant access to dlh-global-bq-data-sync-svc@prd-datalakehouse.iam.gserviceaccount.com, assign it the BigQuery User role, and click Save.
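Behind the console form, this grant is stored as a binding in the project's IAM policy. The sketch below shows, under stated assumptions, what that binding looks like and how you might check for it in a policy exported as JSON (for example with gcloud projects get-iam-policy dlh-io --format=json). The helper names and the sample policy are hypothetical; roles/bigquery.user is the role ID that corresponds to the BigQuery User label.

```python
# Service account quoted in the step above.
DLH_SERVICE_ACCOUNT = (
    "dlh-global-bq-data-sync-svc@prd-datalakehouse.iam.gserviceaccount.com"
)
# Role ID behind the console's "BigQuery User" label.
BIGQUERY_USER_ROLE = "roles/bigquery.user"


def build_binding(service_account: str, role: str) -> dict:
    """Hypothetical helper: one IAM policy binding for a service account."""
    return {"role": role, "members": [f"serviceAccount:{service_account}"]}


def has_binding(policy: dict, service_account: str, role: str) -> bool:
    """Hypothetical helper: True if the policy grants role to the account."""
    member = f"serviceAccount:{service_account}"
    return any(
        binding.get("role") == role and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


# Illustrative policy only; a real export contains your project's own bindings.
sample_policy = {
    "bindings": [build_binding(DLH_SERVICE_ACCOUNT, BIGQUERY_USER_ROLE)]
}
print(has_binding(sample_policy, DLH_SERVICE_ACCOUNT, BIGQUERY_USER_ROLE))  # True
```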
If successful, you will see the new role on the IAM page:
Return to DataLakeHouse.io and create a new BigQuery target:
Name the new target BigQuery and enter the project ID you created earlier:
If successful, you will receive a message like this:
Go to Sync Bridge and create a new sync named Salesforce-BigQuery. Select the Salesforce source, the BigQuery target, and the GMT time zone, then click Save Sync Bridge.
A new Salesforce-BigQuery sync bridge will appear, as in the screen below:
Click Actions, then click Re-Sync All History to bring in all historical and current data.
You should receive this message. Click OK.
Next, go to the Monitoring page in DataLakeHouse.io and verify the logs:
Return to your Google BigQuery console and refresh the page. You should see a table for each Salesforce object, populated with the replicated data.
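If you prefer to confirm the replication with a query rather than by browsing the table list, a sketch like the following builds one you can paste into the BigQuery query editor. It uses BigQuery's __TABLES__ metadata view, which lists each table in a dataset with its row count. The helper name is hypothetical, and the dataset name salesforce_salescloud is an assumption based on the Target Schema Prefix set earlier; check the actual dataset name in your console.

```python
def build_row_count_query(project_id: str, dataset: str) -> str:
    """Hypothetical helper: SQL listing every table in a dataset with its row count."""
    return (
        "SELECT table_id, row_count\n"
        f"FROM `{project_id}.{dataset}.__TABLES__`\n"
        "ORDER BY table_id"
    )


# Assumption: the dataset is named after the Target Schema Prefix.
print(build_row_count_query("dlh-io", "salesforce_salescloud"))
```

Tables that show a row count of zero may simply correspond to Salesforce objects that hold no records in your organization.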
Conclusion
DataLakeHouse.io lets you set up your Salesforce source and BigQuery destination in just a few clicks. It doesn't require any coding, firewall, or API configuration. Just make sure that all of the connections are successful before you create your sync bridge, then wait a few minutes, monitor the logs, and view your Salesforce data in BigQuery.
DataLakeHouse.io is a must-have technology for any company that requires an advanced, easy-to-use data lake and analytics platform with connectors to popular cloud applications. More details at DLH.io.
About the author: Angelo Buss is a Solutions Architect and the founder of BRF Consulting, a DataLakeHouse.io and Salesforce ISV partner with expertise in data engineering.