We use Mixpanel at Cratejoy to track a lot of user interactions across the sites. However, we were storing (and paying for) a lot of profile data that we weren’t actually doing much with. So, I decided to see if it was possible to back up this data locally, create re-import files (in case we ever needed it again), and then delete it in bulk.
Here are some notes and tips about that process.
Command Line Tools
There were a few command line tools for working with the Mixpanel API and data that I found incredibly useful.
Mixpanel Engage Query is a Node.js command line tool for querying the Mixpanel API. To install:
npm install --global mixpanel-engage-query
The author (Stefan Pettersson) also has another tool for batch posting to the Mixpanel API, called Mixpanel Engage Post.
npm install --global mixpanel-engage-post
Since the Mixpanel API returns data as JSON, it’s useful to be able to manipulate it easily from the command line. For that I used a tool called jq. You can install jq with Homebrew:
brew install jq
Once the mixpanel-engage tools are installed, you probably want to create a .env file so you don’t have to enter the Mixpanel API keys every time you run a command.
Downloading Data for Specific Users
I had hoped mixpanel-engage-query would allow me to query for specific users, but I had no luck with a command like this:
engage -q 'properties["$email"] == "[email protected]"'
So, instead I used cURL with a JQL query. JQL is covered in Mixpanel’s documentation.
Here’s what that command looks like:
curl https://mixpanel.com/api/2.0/jql \
  -u SECRETKEY: \
  --data-urlencode script='function main() { return People().filter(function(user){ return user.properties.$email == "[email protected]" }) }'
If you want to save that output directly to a file, add " > user.json" to the end of the command.
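For readability, the script passed as script= above can be written out over several lines. This is only a sketch of the same query: "user@example.com" is a placeholder address, and People() exists only inside Mixpanel's JQL environment, so the function is meant to be sent to the API rather than called locally.

```javascript
// JQL script: Mixpanel calls main() on its servers, where People() is provided.
// "user@example.com" is a placeholder; substitute the address you are looking for.
function main() {
  return People().filter(function (user) {
    // Keep only profiles whose $email property matches exactly.
    return user.properties.$email == "user@example.com";
  });
}
```

The whole function still has to be passed as a single script= value to --data-urlencode, so wrap it in single quotes on the command line.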
Here’s a query that will find all people records that do not have an e-mail address set and save them into a .json file:
curl https://mixpanel.com/api/2.0/jql \
  -u SECRETKEY: \
  --data-urlencode script='function main() { return People().filter(function(user){ if ( ! user.properties.$email ) { return user } }) }' > nullemails.json
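The "! user.properties.$email" test relies on JavaScript truthiness, so it matches profiles where $email is missing, null, or an empty string. Here is a small local sketch of that condition, run against made-up sample records (the hasNoEmail name is hypothetical and not part of any Mixpanel tool):

```javascript
// Hypothetical helper mirroring the filter condition in the JQL query above.
function hasNoEmail(user) {
  return !user.properties.$email;
}

// Made-up sample profiles, for illustration only.
const sampleProfiles = [
  { distinct_id: "a", properties: { $email: "user@example.com" } },
  { distinct_id: "b", properties: {} },
  { distinct_id: "c", properties: { $email: "" } },
  { distinct_id: "d", properties: { $email: null } },
];

// Profiles b, c and d all count as "no e-mail set".
console.log(sampleProfiles.filter(hasNoEmail).map((u) => u.distinct_id));
```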
Here’s how you could export all your people data in bulk (assuming the e-mail [email protected] doesn’t exist in your data):
curl https://mixpanel.com/api/2.0/jql \
  -u SECRETKEY: \
  --data-urlencode script='function main() { return People().filter(function(user){ return user.properties.$email != "[email protected]" }) }' > export-jql.json
Deleting Profiles
Since we’ve downloaded the data as JSON files, we can use jq to reformat it into the records we need for our delete requests.
This converts that single “user.json” record we saved above:
jq '[.[] | { "$distinct_id" : .distinct_id , "$delete": "" }]' user.json > delete.json
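If the jq filter is hard to read, the same mapping can be sketched in plain JavaScript. The toDeleteRecords name and the sample record are made up for illustration; the record shape (distinct_id plus properties) is assumed from the fields the jq commands in this post read.

```javascript
// Hypothetical helper: the same mapping as the jq filter above.
function toDeleteRecords(profiles) {
  return profiles.map(function (profile) {
    // Each delete record needs only the profile id and an empty $delete value.
    return { $distinct_id: profile.distinct_id, $delete: "" };
  });
}

// Made-up export record, for illustration only.
const exportedForDelete = [
  { distinct_id: "abc123", properties: { $email: "user@example.com" } },
];

console.log(JSON.stringify(toDeleteRecords(exportedForDelete)));
```

Note that the profile properties are dropped entirely, which is why the export file needs to be kept as the backup.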
We can now delete the people record by running:
cat delete.json | engagepost
To verify the people record was deleted, you can count people records before and after the delete (assuming new ones aren’t being added):
engage -t
Or, log into Mixpanel and go to the profile record and verify it’s been deleted.
The same commands above would also work in bulk for the other json exports we pulled.
Creating an Import File and Importing
To reformat the export data as an import file, we’ll use jq again.
jq '[.[] | { "$distinct_id" : .distinct_id, "$ignore_time" : "true", "$set": .properties }]' user.json > import.json
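As with the delete file, the import mapping can be sketched in plain JavaScript to make the structure easier to see. The toImportRecords name and the sample record are hypothetical; the output fields are the ones the jq command above produces.

```javascript
// Hypothetical helper: the same mapping as the jq filter above.
function toImportRecords(profiles) {
  return profiles.map(function (profile) {
    return {
      $distinct_id: profile.distinct_id,
      // Passed as the string "true", exactly as in the jq command.
      $ignore_time: "true",
      // Re-apply every exported property in a single $set operation.
      $set: profile.properties,
    };
  });
}

// Made-up export record, for illustration only.
const exportedForImport = [
  { distinct_id: "abc123", properties: { $email: "user@example.com", plan: "basic" } },
];

console.log(JSON.stringify(toImportRecords(exportedForImport), null, 2));
```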
Then import it with engagepost:
cat import.json | engagepost
I’d highly recommend experimenting with exporting, deleting, and importing small subsets of people data, and verifying everything works correctly, before running any of this against a large dataset. Also, back up all the records before you do any work!
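One way to set up such a trial run is to cut a handful of records out of an export and check them before building delete or import files. This is a rough sketch; takeSubset and findMissingIds are hypothetical helper names, and the records are made up.

```javascript
// Hypothetical helper: take the first `count` records for a trial run.
function takeSubset(profiles, count) {
  return profiles.slice(0, count);
}

// Hypothetical helper: list records that have no usable distinct_id,
// since those cannot be turned into delete or import records.
function findMissingIds(profiles) {
  return profiles.filter(function (profile) {
    return !profile.distinct_id;
  });
}

// Made-up export data, for illustration only.
const fullExport = [
  { distinct_id: "a", properties: {} },
  { distinct_id: "b", properties: {} },
  { properties: {} },
  { distinct_id: "d", properties: {} },
];

const trial = takeSubset(fullExport, 3);
console.log("trial size:", trial.length);
console.log("records without distinct_id:", findMissingIds(trial).length);
```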
Troubleshooting Tips
Mixpanel API Docs: https://mixpanel.com/help/reference
Their support team generally responds within a day or two if you have specific questions.
If you want to edit the engage scripts to get more verbose logging, those should be installed somewhere around here:
/Users/YourName/.nvm/versions/node/v5.7.0/lib/node_modules/mixpanel-engage-post
I had cases where specific profiles weren’t deleting. It turned out these were aliased profiles. They can be deleted by adding "$ignore_alias": "true" to the delete record.
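A delete record for an aliased profile might then look like the output of the sketch below. The toAliasDeleteRecords name and the sample record are hypothetical, and the flag is passed as the string "true" as described above; whether other value types are accepted is not verified here.

```javascript
// Hypothetical helper: delete records that also carry the $ignore_alias flag.
function toAliasDeleteRecords(profiles) {
  return profiles.map(function (profile) {
    return {
      $distinct_id: profile.distinct_id,
      $delete: "",
      // String value, following the note above.
      $ignore_alias: "true",
    };
  });
}

// Made-up export record, for illustration only.
const aliasedProfiles = [{ distinct_id: "abc123", properties: {} }];

console.log(JSON.stringify(toAliasDeleteRecords(aliasedProfiles)));
```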