Clean up and secure WordPress data with WP Hammer
When making copies of a website for development and testing, populating a thorough content and data set is vital for Quality Assurance (QA). The most efficient path typically involves mirroring the entire production site’s database. But this can be problematic: a large site can have tens of thousands of posts (each with many revisions and healthy doses of metadata) and many user accounts.
Those user accounts (and sometimes the site’s content) can contain sensitive data that, if mishandled, can put clients at risk. On top of that, testing, development, and initial imports—often executed on lightweight virtual machines—can be painfully slow when working with very large datasets. Cleaning up imported production data is a must, but has been a tedious, inefficient task.
Enter 10up’s WP Hammer: an open source developer tool that quickly and efficiently reduces—or completely removes—production data and sensitive client information like email addresses and hashed account passwords from a WordPress installation.
Why cleanse this data?
Storing sensitive or private data on local or staging sites is a security risk. These environments rarely carry the same level of protective monitoring as a live production site. A staging site containing production credentials offers an easier (and often overlooked) target for a malicious attack.
Overtly private data—say, personal information within a medical or financial community—may even require scrubbing within an organization, by law, before being passed around for development or testing. While user tables are clear targets, some sites may also store sensitive personal data in content objects; e.g., a custom post type for “Medical Records” with custom fields.
Exposure of private information isn’t the only cause for concern. When working on a new feature, it’s far too easy for a test message or notification to accidentally be sent out to registered users brought over from the production site. It’s best to not keep contact information like email addresses in test environments.
Further, most test environments don’t require all client data—it isn’t a backup of production—just a workable subset. Why store and run complex, taxing queries on 100,000 posts when 100 posts is a sufficient sample set? In some cases, pruning posts to an exact number can even help developers test features like pagination.
How can I set up WP Hammer?
WP Hammer assumes you have basic command line familiarity, as well as a Linux / UNIX based environment (like VVV).
To install WP Hammer, begin by fetching the package and ensuring it's built by running the following commands:
cd $PROJECT_WORKING_DIR
git clone https://github.com/10up/wp-hammer.git
cd wp-hammer
composer install
Once available, there are several options for installation:
- Install it as a plugin
cd wp-content/plugins
mv $PROJECT_WORKING_DIR/wp-hammer .
wp plugin activate wp-hammer
wp hammer
- Call it from the command line
wp --require=$PROJECT_WORKING_DIR/wp-hammer/wp-hammer.php
- Add it to your WP-CLI config
- Add it as an alias in your .bashrc
alias hammer='wp --require=$PROJECT_WORKING_DIR/wp-hammer/wp-hammer.php'
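The WP-CLI config option above is listed without a command. As a rough sketch, and assuming WP-CLI's standard `require` config key is the intended mechanism, the entry could be generated like this. The scratch file name and the fallback value for PROJECT_WORKING_DIR are placeholders for illustration, not part of WP Hammer.

```shell
# Assumption: PROJECT_WORKING_DIR is the directory that holds the wp-hammer clone.
PROJECT_WORKING_DIR=${PROJECT_WORKING_DIR:-$HOME/projects}

# Write to a scratch file (hypothetical name) so an existing config is not overwritten.
config_file=./wp-cli.hammer-example.yml

# The variable is expanded now, so the YAML ends up with an absolute path.
cat > "$config_file" <<EOF
require:
  - $PROJECT_WORKING_DIR/wp-hammer/wp-hammer.php
EOF

# Show the generated entry for review.
cat "$config_file"
```

If the output looks right, the two lines can be merged by hand into the wp-cli.yml for the project (or the user-level WP-CLI config), after which a plain wp hammer call should pick up the file.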
How does WP Hammer work?
WP Hammer adds the wp hammer command (or, if you prefer to save keystrokes, wp ha) to the popular WP-CLI (WordPress Command Line Interface) toolkit. With it, you can easily make sitewide changes to your data, but be aware that all database modifications are final. Be sure to back up your database before running any commands.
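One possible way to take that backup is WP-CLI's own wp db export command. The sketch below is an assumption about workflow rather than something WP Hammer requires: the file naming scheme and the backup_name helper are made up for illustration, and the export only runs when wp is on the PATH.

```shell
# Hypothetical helper: build a timestamped file name for the SQL dump.
backup_name() {
  printf 'pre-hammer-%s.sql\n' "$(date +%Y%m%d-%H%M%S)"
}

backup_file=$(backup_name)

# Export only when WP-CLI is available; run this from the WordPress root.
if command -v wp >/dev/null 2>&1; then
  wp db export "$backup_file" && echo "Saved $backup_file"
else
  echo "WP-CLI not found; skipping export of $backup_file"
fi
```

Should a hammer run go wrong, the dump can be restored with wp db import followed by the same file name.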
With these basic WP Hammer commands, you can:
- Clean up user emails
wp ha -f users.user_email='yourname+__ID__@example.com'
- Clean up user passwords
wp ha -f users.user_pass=auto
- Remove extra users
wp ha -l users=10
- Remove extra posts
wp ha -l posts=100
- Replace post content with dummy content
wp ha -f posts.post_content=markov,posts.post_title=random
You can also chain tasks together:
wp ha -f posts.post_author=auto users.user_pass=__user_email__UMINtHeroJEreAGleC users.user_email='yourname+__ID__@example.com' posts.post_title=ipsum posts.post_content=markov
The above string results in the following changes:
- posts.post_author is set to a random user ID for all remaining users
- users.user_pass is set to the user email followed by UMINtHeroJEreAGleC
- users.user_email='yourname+__ID__@example.com': __ID__ is replaced by the user ID
- posts.post_title=ipsum replaces all Post Titles with auto-generated Lorem Ipsum
- posts.post_content=markov replaces all Post Content with randomly generated content, using Markov chains
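To make the placeholder behavior concrete, here is a small stand-alone sketch of the values the templates above would produce for one user. It is not WP Hammer's code: expand_placeholders is a hypothetical helper, the user ID and email are sample values, and the source does not say whether __user_email__ refers to the original or the rewritten address, so the email is passed in explicitly.

```shell
# Hypothetical helper: expand the __ID__ and __user_email__ placeholders
# in a template. Arguments: template, user ID, user email.
# Note: uses sed, so values containing "|" or "&" would need escaping.
expand_placeholders() {
  printf '%s\n' "$1" | sed -e "s|__ID__|$2|g" -e "s|__user_email__|$3|g"
}

# Sample user (made-up values).
user_id=7
user_email='jane@example.org'

# Email template from the chained command above.
expand_placeholders 'yourname+__ID__@example.com' "$user_id" "$user_email"
# prints: yourname+7@example.com

# Password template from the chained command above.
expand_placeholders '__user_email__UMINtHeroJEreAGleC' "$user_id" "$user_email"
# prints: jane@example.orgUMINtHeroJEreAGleC
```

Including the user ID in the address keeps every rewritten email unique while routing any accidental test notification to a single inbox you control.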
Contributions welcome!
10up is actively developing WP Hammer, but we’ve also released it on GitHub for the entire open source community to advance. We encourage you to experiment with the tool, submit issues, and make pull requests.