In today’s fast-paced, data-driven world, the ability to streamline workflows across various platforms is essential for enhancing productivity and efficiency. For organizations that depend on geographic data, like those leveraging CommCare for data collection and ArcGIS Online for spatial analysis, integrating these systems can unlock significant potential. However, without an automated workflow, the process of moving data between CommCare and ArcGIS Online can be time-consuming and error-prone.

This article explores how to automate the data pipeline from CommCare to ArcGIS Online using a combination of:

  • Pandas,
  • GeoPandas,
  • the ArcGIS API for Python,
  • and a bit of custom Python code,

all within a Jupyter Notebook hosted on ArcGIS Notebooks Server.

Through this process, you’ll discover how to create a smooth, error-free, and scalable workflow that seamlessly pushes data from CommCare into ArcGIS Online for analysis and visualization.

Getting Started with the code

First and foremost, you need to be working in a Python environment. If you are new to Python environments, check out this article which outlines the process.

Then we need to keep our secrets, such as usernames, passwords, and specific account details, out of the code so that nobody who sees or shares the notebook can misuse them. For that we will use python-dotenv.
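
Here is a minimal sketch of that setup, assuming python-dotenv is installed and a .env file sits next to the notebook. The four key names are the ones used in the CommCare section of this article; `load_settings` is a hypothetical helper of my own, not part of any library.

```python
import os

# Key names taken from the CommCare section of this article.
REQUIRED_KEYS = (
    "commcare_domain",
    "commcare_form_id",
    "commcare_username",
    "commcare_password",
)


def load_settings(keys=REQUIRED_KEYS):
    """Return the required settings as a dict, failing early if any is missing."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package

        load_dotenv()  # copies key=value pairs from .env into the environment
    except ImportError:
        pass  # fall back to variables that are already set in the environment

    missing = [key for key in keys if not os.getenv(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: os.environ[key] for key in keys}
```

Failing early on a missing key tends to be easier to debug than a rejected login further down the notebook.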

Lastly, if you are working on your local machine, make sure you have the ArcGIS API for Python installed as well.

Make sure you have these installed before you jump in with me for some Python experience.

Connect to CommCare

Let's make sure we can access CommCare first. You need a live OData feed created from one or more of your forms. Learn about CommCare OData Feeds here.
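
The connection step could look roughly like the sketch below. It uses only the standard library and HTTP basic authentication. The URL pattern in `build_feed_url` is my assumption about how CommCare HQ lays out form feeds, so copy the actual feed URL from CommCare HQ if yours differs. The function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_feed_url(domain, form_id):
    """Assumed URL pattern for a form OData feed; verify it against CommCare HQ."""
    return f"https://www.commcarehq.org/a/{domain}/api/odata/forms/v1/{form_id}/feed"


def basic_auth_header(username, password):
    """Build the HTTP basic authentication header for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def fetch_feed(url, username, password, timeout=30):
    """Download the feed and return the parsed JSON document."""
    request = urllib.request.Request(url, headers=basic_auth_header(username, password))
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    print("Successfully accessed OData feed")
    return payload


# Example call (needs network access and real credentials):
# payload = fetch_feed(build_feed_url(domain, form_id), username, password)
```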

Make sure to replace these variables with your own values and store them in the .env file:

  • commcare_domain
  • commcare_form_id
  • commcare_username
  • commcare_password

If the connection is successful and your data can be read, the output should read: Successfully accessed OData feed

Access the JSON Data for Exploration 

Let's see what our data columns look like.
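
A small sketch of this step, assuming the feed response follows the usual OData JSON layout where the rows sit under a "value" key. The sample payload and the `formid` column are made up for illustration, and `feed_to_dataframe` is a hypothetical helper.

```python
import pandas as pd


def feed_to_dataframe(payload):
    """Build a DataFrame from an OData response whose rows sit under "value"."""
    records = payload.get("value", [])
    return pd.DataFrame.from_records(records)


# Tiny made-up payload standing in for a real feed response.
sample = {
    "value": [
        {"formid": "a1", "GPS_Coordinates": "29.73923 -17.37292 1493.343 0.2"},
    ]
}
df = feed_to_dataframe(sample)
print(df.columns.tolist())  # list the column names for a quick look
```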

You might want to check your data first before you go any further.

Just add:

df.head()

Let's Prepare the Geospatial Data and Zip It

In this step, we are going to: 

  • use GeoPandas to create a GeoDataFrame
  • correct issues associated with dates (in case we have these in the data)
  • finally, create a zipped shapefile
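
For the date step, this article does not say which issue came up, so the following is only a guess at a common one: shapefile date fields hold a date without a time of day, and some writers refuse datetime columns altogether. A simple workaround is to turn datetime columns into text before export. `stringify_datetimes` and the `received_on` column are hypothetical names.

```python
import pandas as pd


def stringify_datetimes(df, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a copy of df with every datetime column rendered as text."""
    out = df.copy()
    datetime_columns = out.select_dtypes(include=["datetime", "datetimetz"]).columns
    for column in datetime_columns:
        out[column] = out[column].dt.strftime(fmt)
    return out


# Made-up example row to show the effect.
example = pd.DataFrame(
    {"received_on": pd.to_datetime(["2024-09-12 08:30:00"]), "visits": [3]}
)
cleaned = stringify_datetimes(example)
print(cleaned.dtypes)
```

If your dates arrive from the feed as plain strings, they are already stored as text and this step changes nothing.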

Before we deal with the critical GIS component, note that CommCare presents GPS_Coordinates in the format lon lat alt accuracy, with the values separated by a single space (" ").

An example coordinate looks like so:

29.73923 -17.37292 1493.343 0.2

So now let’s split this and get what we need (longitude and latitude).
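
One way to sketch the split, the GeoDataFrame, and the zip step together is shown below. The value order follows this article (longitude first). As far as I know, CommCare's own documentation lists GPS values as latitude, longitude, altitude, accuracy, so check a known point in your data before trusting the order. All function names are hypothetical, and WGS 84 (EPSG:4326) is my assumption for the coordinate system.

```python
import zipfile
from pathlib import Path


def split_gps(value):
    """Return (longitude, latitude) from a space-separated GPS string, or (None, None)."""
    parts = str(value).split()
    if len(parts) < 2:
        return (None, None)
    try:
        # Order follows the article: first value longitude, second value latitude.
        return (float(parts[0]), float(parts[1]))
    except ValueError:
        return (None, None)


def zip_shapefile(shp_path):
    """Zip a .shp file together with its sidecar files (.shx, .dbf, .prj, ...)."""
    shp_path = Path(shp_path)
    zip_path = shp_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in sorted(shp_path.parent.glob(shp_path.stem + ".*")):
            if part.suffix.lower() != ".zip":  # skip the archive being written
                archive.write(part, arcname=part.name)
    return zip_path


def dataframe_to_zipped_shapefile(df, gps_column, shp_path):
    """Write df as a point shapefile and return the path of the zipped copy."""
    import geopandas as gpd  # imported here so the helpers above work without GeoPandas

    coords = [split_gps(value) for value in df[gps_column]]
    df = df.assign(
        longitude=[lon for lon, _ in coords],
        latitude=[lat for _, lat in coords],
    ).dropna(subset=["longitude", "latitude"])  # drop rows without usable coordinates
    geometry = gpd.points_from_xy(df["longitude"], df["latitude"])
    gdf = gpd.GeoDataFrame(df, geometry=geometry, crs="EPSG:4326")
    gdf.to_file(shp_path)  # writes the .shp plus its sidecar files
    return zip_shapefile(shp_path)
```

Keep in mind that shapefile field names are limited to 10 characters, so a long column name such as GPS_Coordinates will be shortened on export.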

Publishing Shapefile 

Now we have a zip file, which is what we need to publish a feature layer. First, though, we need to add the shapefile to our ArcGIS Online account.

So here we are: 

  • connecting to AGOL 
  • publishing the shapefile 
  • publishing the feature layer 
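
For the first of these steps, a connection sketch could look like this. The `agol_username` and `agol_password` environment keys are hypothetical names of my own; store them in the .env file alongside the CommCare values. The optional `gis_class` argument exists only so the function can be tried without a live connection.

```python
import os


def connect_to_agol(gis_class=None, url="https://www.arcgis.com"):
    """Return a GIS connection built from credentials kept in the environment."""
    if gis_class is None:
        # Imported lazily so this sketch loads even where arcgis is not installed.
        from arcgis.gis import GIS

        gis_class = GIS
    return gis_class(url, os.environ["agol_username"], os.environ["agol_password"])


# Example call (needs the arcgis package and real credentials):
# gis = connect_to_agol()
```

Inside a hosted ArcGIS notebook you can usually skip the credentials and connect as the signed-in user with GIS("home").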

Publish a shapefile 
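
A hedged sketch of the upload, assuming `gis` is a connected GIS object and the zip file came from the earlier step. `add_zipped_shapefile` is a hypothetical wrapper and the default tag is made up.

```python
def add_zipped_shapefile(gis, zip_path, title, tags="CommCare"):
    """Upload a zipped shapefile as a new content item and return that item."""
    item_properties = {"title": title, "type": "Shapefile", "tags": tags}
    return gis.content.add(item_properties, data=str(zip_path))


# Example call (needs a live connection):
# shapefile_item = add_zipped_shapefile(gis, "commcare_data.zip", "My layer title")
```

I believe newer releases of the ArcGIS API for Python steer you toward adding items through a folder object instead, so check the documentation for the version you have installed.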

Then let's publish the feature layer.
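
Publishing is a single call on the uploaded item. The sketch assumes `shapefile_item` is the item returned when the zipped shapefile was added; `publish_feature_layer` is a hypothetical wrapper.

```python
def publish_feature_layer(shapefile_item):
    """Publish the uploaded shapefile item as a hosted feature layer item."""
    feature_layer_item = shapefile_item.publish()
    print(f"Published: {feature_layer_item.title}")
    return feature_layer_item


# Example call (needs a live connection):
# feature_layer_item = publish_feature_layer(shapefile_item)
```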

Additionally, the documentation describes how to update a feature layer that has already been published, which is what the update section further down relies on.

Updating the Data At Certain Intervals

Remember, we do not want to repeat this process by hand every time our data is updated on the CommCare side. We would rather have it run automatically in the cloud while we sleep or spend our time on other things.

While I know there is ArcGIS Data Pipelines for this kind of work, some folks want the hands-on programming experience, and remember that Data Pipelines also consumes credits once you start building with it.

For this section, we will use the FeatureLayerCollection class.

Most of the code we have written will still work, but when updating the Feature Layer, do not repeat the publish feature layer section. 

What we need to do for the update script is to: 

  • access the previously published feature layer
  • then overwrite the feature layer 
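
Finding the published layer could be sketched like this, assuming `gis` is a connected GIS object. `find_feature_layer` is a hypothetical helper. It filters for an exact title match because a content search can also return items whose titles only partly match.

```python
def find_feature_layer(gis, title):
    """Return the feature layer item whose title matches exactly."""
    results = gis.content.search(query=f'title:"{title}"', item_type="Feature Layer")
    matches = [item for item in results if item.title == title]
    if not matches:
        raise LookupError(f"No feature layer titled {title!r} was found")
    return matches[0]


# Example call (needs a live connection):
# item = find_feature_layer(gis, "My layer title")
```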

Make sure to change "CommCare Data – to ArcGIS" to the name you used for your layer.

Overwrite the Existing Layer 

Overwrite the existing layer with the new data 

⚠️ Disclaimer: This is not the best approach; an updated version will be coming soon.
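
A sketch of the overwrite with FeatureLayerCollection, assuming `item` is the published feature layer item and `zip_path` points to the freshly generated zip file. `overwrite_feature_layer` is a hypothetical wrapper, and its `collection_class` argument exists only so the function can be tried without the arcgis package.

```python
def overwrite_feature_layer(item, zip_path, collection_class=None):
    """Overwrite a hosted feature layer with the data in zip_path."""
    if collection_class is None:
        # Imported lazily so this sketch loads even where arcgis is not installed.
        from arcgis.features import FeatureLayerCollection

        collection_class = FeatureLayerCollection
    collection = collection_class.fromitem(item)
    result = collection.manager.overwrite(str(zip_path))
    print(result)
    return result


# Example call (needs a live connection):
# overwrite_feature_layer(item, "commcare_data.zip")
```

As I understand it, the overwrite expects the new zip file to carry the same file name and schema as the one originally published, so keep the output name stable between runs.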

If no errors occur during this last step, the console or output should display:

{'success': True}

The final step

Since we are using ArcGIS Notebooks, we can schedule a Notebook task to run this at periodic intervals. Read more about scheduling a Notebook task here.

Automating data workflows between CommCare and ArcGIS Online not only saves time but also ensures that your spatial data remain synchronized, enabling timely decision-making and more efficient operations.

By leveraging the power of Python, Pandas, GeoPandas, and the ArcGIS API, organizations can fully integrate their data ecosystems and make the most of both platforms.

At KM-Spatial, we specialize in delivering tailored GIS solutions that streamline workflows, integrate systems, and elevate projects to the next level. Whether you need help with GIS workflows for business, project management, or IT integration, KM-Spatial is your trusted partner for delivering innovative solutions.

Reach out to us to explore how we can help optimize your workflows and support your spatial data needs!

 

Published On: September 12th, 2024 / Categories: Automation
