Using the Salesforce Bulk API and Apex Code

At Brightcove we use Salesforce to manage our customer information. Our sales, account management, support, and finance teams use it for activities such as contacting sales leads, tracking support cases, and generating usage reports. It’s important for our business to keep pushing customer data into Salesforce in a timely and reliable way.

The data model for both our Video Cloud and App Cloud products supports a many-to-many relationship between users and accounts. An account object represents an organization or a department within a large organization, and a user object represents an individual who works for one or multiple organizations. In Salesforce we customize the built-in Contact object to represent each user of Brightcove services and we define a custom object called BCAccount to represent an account (see figure 1).

Figure 1. Data model in Brightcove service and Salesforce

Several years ago we built the data synchronization feature using the Salesforce SOAP API and Quartz. Over time, two major difficulties emerged with this implementation:

  • It is too chatty, which makes it slow. Only 700 objects can be synchronized to Salesforce per hour.
  • It requires a lot of effort to make any change to the data model. To add a new field to an object, we have to export a new WSDL file from Salesforce and regenerate Java classes from it.

In light of these difficulties, we decided to build a new synchronization system using the Salesforce Bulk API and Apex code. The new implementation consists of a data-pushing engine called RedLine and a set of Salesforce Apex classes that process the bulk data pushed from RedLine.

Figure 2. New data synchronization

RedLine is built using Sinatra, a lightweight Ruby web framework, and runs as a standalone service independent of the other Brightcove services. RedLine uses rufus-scheduler to periodically query Video Cloud and App Cloud, via their RESTful APIs, for objects that have been created, updated, or deleted. RedLine then transforms the JSON response to CSV and sends it to Salesforce as a bulk request. Salesforce has a limit of 10,000 objects per bulk request, which is enough for our usage. Because bulk requests are processed asynchronously in Salesforce, neither the Brightcove services nor RedLine needs to wait after sending data to Salesforce.
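The JSON-to-CSV step and the 10,000-object limit can be illustrated with a minimal sketch. This is not RedLine's actual code: the helper names `records_to_csv` and `build_batches`, the explicit column list, and the idea of splitting an oversized result into several payloads are assumptions made for illustration. Only Ruby's standard `json` and `csv` libraries are used.

```ruby
require "csv"
require "json"

# Per-request object limit quoted in the article.
MAX_RECORDS_PER_BATCH = 10_000

# Hypothetical helper: render an array of hashes as one CSV string,
# with a header row, using the given column order.
def records_to_csv(records, headers)
  CSV.generate do |csv|
    csv << headers
    records.each { |record| csv << headers.map { |h| record[h] } }
  end
end

# Hypothetical helper: parse a JSON array and return one CSV payload
# per slice of at most max_records objects.
def build_batches(json_text, headers, max_records = MAX_RECORDS_PER_BATCH)
  records = JSON.parse(json_text)
  records.each_slice(max_records).map { |slice| records_to_csv(slice, headers) }
end
```

Each string returned by `build_batches` carries its own header row, so every payload could be submitted on its own. How the payload is actually sent to Salesforce is left out here.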

We wrote a few Apex classes to process bulk requests, including adapting the user and account objects to the corresponding Salesforce objects. We deployed these classes to Salesforce and scheduled Apex batch jobs to run them once data arrives as a bulk request. This way, no code in the Brightcove services depends on the Salesforce data model; only the Apex code needs to deal with it. Salesforce provides a set of monitoring tools for both bulk requests and Apex batch jobs.

If there are any errors during the processing of a bulk request, we can easily see them in the Salesforce web UI. We also deployed an Apex class that runs periodically to check whether bulk requests are arriving at the expected frequency, and that raises an alert if none has arrived for a while.
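The arrival check itself is an Apex class running inside Salesforce. The sketch below only illustrates the kind of logic involved, written in Ruby for consistency with RedLine. The function name `bulk_request_overdue?` and the one-hour threshold in the usage note are assumptions, not values from the production system.

```ruby
# Hypothetical check: true when no bulk request has ever arrived,
# or when the most recent one is older than max_gap_seconds.
def bulk_request_overdue?(last_arrival, now, max_gap_seconds)
  last_arrival.nil? || (now - last_arrival) > max_gap_seconds
end
```

A periodic job could call `bulk_request_overdue?(last_seen, Time.now, 3600)` and send an alert whenever it returns true.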

In the new synchronization system, to release a new field on the user or account object we only need to add the field to the Salesforce custom object and then expose it in the JSON response of the Brightcove service API. We don’t need to change or restart RedLine when the object format changes, because RedLine converts any new fields in the JSON into new columns in the CSV of its bulk requests.
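One way to get this schema-agnostic behaviour is to derive the CSV header from whatever keys appear in the JSON, instead of hard-coding a column list. The sketch below assumes a flat JSON array of objects. The helper names `derive_headers` and `json_to_csv` and the `region` field in the usage note are hypothetical, and RedLine's real implementation may differ.

```ruby
require "csv"
require "json"

# Hypothetical helper: collect every key seen in any record,
# in first-seen order, so new fields become new columns.
def derive_headers(records)
  records.flat_map(&:keys).uniq
end

# Hypothetical helper: convert a JSON array of flat objects to CSV.
# Records lacking a field get an empty cell in that column.
def json_to_csv(json_text)
  records = JSON.parse(json_text)
  headers = derive_headers(records)
  CSV.generate do |csv|
    csv << headers
    records.each { |record| csv << record.values_at(*headers) }
  end
end
```

With this approach, a response whose objects gain a `region` field produces a CSV with an extra `region` column, without any change to the conversion code.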

Since the launch of App Cloud, there have been four changes to account objects and one change to user objects, and we didn’t have to change a line of RedLine code for any of them. With the old SOAP API-based synchronization system, it used to take us one to two weeks to synchronize a new field for user or account objects.

After running this new synchronization application in production for eight months, we have seen it handle a couple of bursts of data changes gracefully. Recently a batch change of 900 accounts was made during a deployment, and all of them were synchronized to Salesforce in less than a minute (most of that time was spent in the Apex classes running in Salesforce). The old synchronization system used to take longer than an hour to synchronize the same number of objects.

We are planning to open source the core part of RedLine soon.