Copy Production data to Developer Sandbox

 

I'm on Enterprise Edition and only have a developer sandbox... (I really miss my Full Copy sandbox and UE).

I need to do some tests against "real" data. I have a strict rule against testing or developing in Production, which means I need to get a sample of my real production data into my sandbox.

Is there any way to do this without doing a bunch of exports, imports, and additional reports/imports to get the relationships right?


Attribution to: Salesforce Wizard

Possible Suggestion/Solution #1

I've written an open-source Windows app to do this. It's called SandboxBerry and it's on GitHub:

https://github.com/codeulike/SandboxBerry

This is how SandboxBerry works:

  • You give it a list of Salesforce objects (in the form of an XML file)
  • Optionally, you can specify filters for those objects in SOQL syntax
  • SandboxBerry will then transfer the data for those objects from your Source (usually Production) to a Target (always a Sandbox)
  • While transferring the data, it fixes the following problems:
    • Relationships between objects will automatically be preserved, substituting new Ids for the old Ids
    • Any inactive or missing users will be defaulted to a user specified by you
    • Only the object fields that are present in both Source and Target will be transferred
  • Optionally, you can skip specific fields or hard-code their values in the Target - this can be useful for anonymisation
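The Id-substitution behaviour described above can be sketched in a few lines. This is a minimal illustration, not SandboxBerry's actual code; `DEFAULT_USER_ID` and the insert callback are made-up stand-ins:

```python
# Sketch of preserving lookups while copying records to a sandbox:
# as parents are inserted into the target, a map from source Id to
# target Id is built up, and child lookup fields are rewritten through it.

DEFAULT_USER_ID = "005TARGETUSER000"  # hypothetical fallback user in the sandbox

def copy_records(records, lookup_fields, id_map, active_target_users, insert_into_target):
    """Copy records to the target, rewriting lookups as we go.

    records: dicts queried from the source org (must include "Id")
    lookup_fields: names of fields on this object that reference other records
    id_map: {source_id: target_id}, shared across calls; parents go first
    insert_into_target: callable taking one record dict, returning its new Id
    """
    for rec in records:
        out = dict(rec)
        source_id = out.pop("Id")
        for field in lookup_fields:
            if out.get(field) is not None:
                out[field] = id_map[out[field]]   # substitute new Id for old Id
        if "OwnerId" in out and out["OwnerId"] not in active_target_users:
            out["OwnerId"] = DEFAULT_USER_ID      # default inactive/missing users
        id_map[source_id] = insert_into_target(out)
    return id_map
```

The key constraint is visible in the sketch: parent objects have to be copied before the children that look up to them, so that the Id map already contains the target Ids the children need.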

It's a tool aimed at developers, so you'll need to be comfortable editing XML files and knowing the API names for Salesforce objects, but hopefully it will be useful.

See the Readme and the wiki on the GitHub repo for more info.

You can download the built version from the GitHub Releases section.


Attribution to: codeulike

Possible Suggestion/Solution #2

The SFDX CLI can download and upsert:

sfdx force:data:soql:query -q "SELECT Name FROM Blog__c" -r csv -u prod > data.csv
sfdx force:data:bulk:upsert -s Blog__c -f data.csv -i Name -u sandbox

The object must have a field with unique values across all rows - ideally one with a unique constraint or marked as an external ID, but that's not required.

It also supports lookup fields, as long as the lookup object has an external ID field. When selecting the data, use SELECT Parent.My_External_Id__c, ... FROM ...
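The shape of that query result is easy to picture: relationship fields come back nested, and the CSV export flattens them into dotted column names. A rough sketch of that flattening (the `Parent` / `My_External_Id__c` names are just placeholders, and this is not the CLI's actual implementation):

```python
import csv
import io

def flatten(record, prefix=""):
    """Flatten nested relationship fields (e.g. Parent.My_External_Id__c)
    into dotted column names, the way CSV query output is shaped."""
    flat = {}
    for key, value in record.items():
        if key == "attributes":          # metadata returned by the REST API
            continue
        if isinstance(value, dict):
            flat.update(flatten(value, prefix + key + "."))
        else:
            flat[prefix + key] = value
    return flat

def to_csv(records):
    """Render a list of query-result dicts as a CSV string."""
    rows = [flatten(r) for r in records]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=sorted(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A record like `{"Name": "Post 1", "Parent": {"My_External_Id__c": "EXT-1"}}` comes out with the columns `Name` and `Parent.My_External_Id__c`, which is the form the upsert can then match against the parent's external ID.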


Attribution to: Brian Low

Possible Suggestion/Solution #3

We did some R&D on this topic and built a similar tool to copy and keep data in sync between sandboxes and production. We use a self-written Java tool that copies the production data to a MySQL database, and then from that database to a sandbox you specify. In the tool you can specify which objects you want to sync to your sandbox, and so on. We are now trying to create a subset of data to load into a 10 MB sandbox environment, so we are still improving the tool, but it already has quite a few capabilities. If you are interested in such a tool, let me know.


Attribution to: Sven Delporte

Possible Suggestion/Solution #4

I think there is nothing better than the Full-copy sandbox to test on, and Salesforce makes it easy to create one. You can always purchase a Full-copy sandbox for an additional fee on Enterprise Edition too.

The other way is more manual: either import/export, or writing code against the describe calls that expose sObject structures and querying out all the related records based on the relationship structure.


Attribution to: manubkk

Possible Suggestion/Solution #5

Recently I began using Jitterbit's data loader in place of the official Salesforce one. I found it much better for this type of scenario because it handles complex logins, saved source/target scenarios, and scheduled actions. Short of an ETL tool or the more complex suggestions above, this seems to be a decent option. And it's free!


Attribution to: Ryan Phillips

Possible Suggestion/Solution #6

It appears that as of Winter '14 Salesforce will be offering something like this: https://success.salesforce.com/ideaView?id=08730000000BrKsAAK

In Winter '14 we will be offering a new product that will allow you to copy a sampled subset of production data based on what is defined in a sandbox template.

For each object selected to be copied, we will copy up to 10,000 records. The storage limit for this sandbox will be 5GB and the refresh interval will be 5 Days. The data sampling does take into account required foreign key relationships.

For example, if the sandbox template you define contains Accounts and Contacts, then, when applied to a partial sandbox creation, we will copy up to 10,000 account records and up to 10,000 related Contacts.

The new documentation says this:

https://help.salesforce.com/apex/HTViewHelpDoc?id=data_sandbox_create.htm&language=en

Partial Data sandboxes include all of your organization’s metadata and add a selected amount of your production organization's data that you define using a sandbox template. A Partial Data sandbox is a Developer sandbox plus the data you define in a sandbox template. It includes the reports, dashboards, price books, products, apps, and customizations under Setup (including all of your metadata). Additionally, as defined by your sandbox template, Partial Data sandboxes can include your organization's standard and custom object records, documents, and attachments up to 5 GB of data and a maximum of 10,000 records per selected object. A Partial Data sandbox is smaller than a Full sandbox and has a shorter refresh interval. You can refresh a Partial Data sandbox every 5 days.


Attribution to: Jen Johnson

Possible Suggestion/Solution #7

I have a few ETL scripts I built using Talend Open Studio to move data over.

Using a tool like Talend (which is free and has SFDC connectors), Informatica Cloud, or Boomi, you can create scripts to pull data from your Production org and populate your Sandbox org. Of course, depending on your sandbox you'll hit size limits, so you need to filter your data pulls. Not a big deal.

This approach also requires Enterprise Edition as you will need API access.

You can build your scripts to properly create the relationships.

For Example:

Step 1 - Upsert Accounts (store the SFDC Production ID in an external field, say PROD_ID__c).
Step 2 - Upsert Contacts (store the SFDC Production ID in an external field, say PROD_ID__c). Use the ACCOUNTID on the Contact as the external ID in the script to automatically relate the parent/child objects.
Step 3 - Continue the pattern.

Pros of the ETL approach: repeatable, and saves you tons of time on sandbox refreshes. Less error-prone, as you're not manually running Data Loader / reports. Low cost if you use open source or Jitterbit, etc. You can automate this to keep data in sync as necessary via schedulable services (Apple Automator, Windows scheduled tasks, etc.).

Cons: requires some up-front time and effort to build out the ETL scripts. If you modify fields in Production, you need to update them in the Sandbox scripts. You need to know your data model. That may sound obvious, but if you have dependencies across multiple objects, you need to build your script so that it loads the object records in an order that makes sense.
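That load-order problem can be handled mechanically: treat each lookup as a dependency and sort the objects topologically before loading. A minimal sketch, with example dependencies (derive the real ones from your own data model):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Map each object to the objects its lookup fields point at.
# These dependencies are examples; yours come from your data model.
dependencies = {
    "Account": set(),
    "Contact": {"Account"},
    "Opportunity": {"Account"},
    "OpportunityContactRole": {"Opportunity", "Contact"},
}

# Parents always appear before the objects that look up to them.
load_order = list(TopologicalSorter(dependencies).static_order())
```

Loading in `load_order` guarantees that by the time a child record is upserted, the parent it references already exists in the sandbox.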


Attribution to: CoryCowgill

Possible Suggestion/Solution #8

Try Informatica Cloud Test Data Management (TDM). You simply choose the objects you're interested in moving, and TDM figures out which related objects to bring over. You can also choose groupings of objects as defined by SFDC, like Sales, Marketing, Case, Process, etc. You can find more at: http://www.informaticacloud.com/cloud-tdm

If you want to see more you can just fill out the Contact us link and someone will get back to you.


Attribution to: Prashant

Possible Suggestion/Solution #9

You could also look into Monarch, a product by DreamFactory, the folks who make Snapshot.

We use it and it is extremely easy. We have a checkbox on the Account object, hidden from all non-sysadmin users; we flag 100 Account records, and Monarch helps us pull all the dependencies, right down to Assets, Service Contracts, etc. We then use this to import (again using Monarch) into all our Dev sandboxes.


Attribution to: Matt Brown

Possible Suggestion/Solution #10

This can be done, for free, with open-source ETL tools, for example Talend.

I've just written a lengthy step-by-step example of how to do this with Talend on my blog:

Copying data to Salesforce Sandboxes using TalenD

This covers all the details, including how to use External IDs to make sure relationships between objects are preserved as you copy the data from Source to Target.


Attribution to: codeulike

Possible Suggestion/Solution #11

I have addressed this requirement before using reflection and the SF API. Through an external script, you can use reflection to get the schema and query the data from an object. By setting a recursion level, you can have the script pull related data from production and insert it into the sandbox in reverse order.

There are a lot of hand-coded exceptions that have to be built in, as some data cannot be transferred and other values need to be referenced by name, since IDs will be different (record types and the like).

I implemented my approach in Python, but you can take your pick of the litter; http://www.salesforce.com/us/developer/docs/api/Content/sforce_api_calls_describesobjects.htm is the reference API doc for the reflection.

This will return a bunch of info on whether each field is writable, its type, etc. Selecting a record by ID afterwards will give you the available data. Put those two things together with some code to recurse over foreign-key fields, and you're off to the races =)
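A rough sketch of that recursion, using hand-mocked describe metadata instead of real API calls (the objects and fields here are illustrative; in practice they would come from describeSObjects):

```python
# Mocked describe metadata - in practice this comes from describeSObjects.
describe = {
    "Contact": {"fields": [
        {"name": "AccountId", "type": "reference", "referenceTo": ["Account"]},
        {"name": "LastName", "type": "string", "referenceTo": []},
    ]},
    "Account": {"fields": [
        {"name": "ParentId", "type": "reference", "referenceTo": ["Account"]},
    ]},
}

def related_objects(name, describe, depth, seen=None):
    """Walk reference (foreign-key) fields to `depth` levels, visiting each
    object once. Reversing the result gives an order in which parents are
    inserted before the children that look up to them."""
    seen = [] if seen is None else seen
    if name in seen:
        return seen
    seen.append(name)
    if depth == 0:
        return seen
    for field in describe[name]["fields"]:
        if field["type"] == "reference":
            for target in field["referenceTo"]:
                if target in describe:
                    related_objects(target, describe, depth - 1, seen)
    return seen

order = related_objects("Contact", describe, 2)
insert_order = list(reversed(order))  # ["Account", "Contact"]
```

The `seen` list doubles as cycle protection (note the self-referencing `ParentId` lookup on Account), which you need on real schemas where objects reference each other freely.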


Attribution to: ebt
This content is remixed from stackoverflow or stackexchange. Please visit https://salesforce.stackexchange.com/questions/273
