Scheduling MySQL tasks using Python and cron jobs on Linux / Amazon Linux / macOS - Deleting databases and users at fixed intervals

mysql python linux shell

October 21, 2020

When creating live versions for some of my projects I needed to create temporary MySQL users and databases and had to find a way to automate their deletion every 24 hours.
Let's pretend that you have some projects that require permanent storage, typically a SQL database, but only for a limited amount of time. In my case I wanted to show off some of my projects and be able to transparently generate a user / password / database combination for every new visitor and still being able to delete this data once not needed anymore.
In this article, I'll show you how to schedule users and databases deletion solely based on their names.
Note: When dealing with deletion of any data, make sure you know what you're doing to avoid tremendous issues.
Prerequisites:
  • Familiarities with the command line
  • Familiarities with Python and Unix-like environments
  • (Optional) Famiiarities with AWS and Elastic Beanstalk environments
Every newly created user / database combination begin with the following pattern:
  • User names follow the pattern tmp_usr_<some-unique-id>
  • Database names follow the pattern data_tmp_usr_<some-unique-id>
Knowing that we can easily create a script that will delete every database and user containing theses patterns, assuming no other sensitive data follow the same pattern.
While we could have used shell scripts, SQL syntax being quite specific can conflict with bash syntax and make escaping special characters a real pain. Using Python and MySQLdb library we can easily write scripts containing SQL commands without having to deal with escaping characters.
Make sure you have mysql installed on your machine.
In a Python virtual environment, install MySQLdb library using the following command:
(env) $ pip install mysqlclient
Retreiving databases to delete
Among all our databases we want to delete the ones containing the pattern "data_tmp_usr"
mysql> show databases; +----------------------------------------+ | Database | +----------------------------------------+ | blog | | data_tmp_usr_49eb3ca6980ab9c51ce2158f | | data_tmp_usr_518760786410f494b94e0fa8 | | data_tmp_usr_81d975bec9519ea84504ad30 | | information_schema | | mysql | | performance_schema | | photos | +----------------------------------------+ 8 rows in set (0.01 sec)
Using SQL 'LIKE' statement we can select only the ones containing a specific pattern.
mysql> SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'data_tmp\_%' ORDER BY schema_name; +----------------------------------------+ | SCHEMA_NAME | +----------------------------------------+ | data_tmp_usr_49eb3ca6980ab9c51ce2158f | | data_tmp_usr_518760786410f494b94e0fa8 | | data_tmp_usr_81d975bec9519ea84504ad30 | +----------------------------------------+ 3 rows in set (0.02 sec)
Retreiving users to delete
Using again SQL 'LIKE' statement we can select only the users containing the desired pattern.
mysql> SELECT user from mysql.user WHERE user LIKE 'tmp_usr\_%' ORDER BY user; +----------------------------------+ | user | +----------------------------------+ | tmp_usr_49eb3ca6980ab9c51ce2158f | | tmp_usr_518760786410f494b94e0fa8 | | tmp_usr_81d975bec9519ea84504ad30 | +----------------------------------+ 3 row in set (0.01 sec)
Since we want to drop the selected users and databases, we simply need to concatenate 'DROP' with the results to make a valid SQL statement. The following command acheives just that:
1 SELECT CONCAT('DROP DATABASE `',schema_name,'`;') AS `-- stmt` 2 FROM information_schema.schemata 3 WHERE schema_name LIKE 'data_tmp_usr\_%' 4 ORDER BY schema_name;
Same goes for users to delete:
1 SELECT CONCAT('DROP user `',usr,'`;') AS `-- stmt` 2 FROM mysql.user 3 WHERE usr LIKE 'tmp_usr\_%' 4 ORDER BY schema_name;
Creating a Python script containing SQL commands to execute
Create a new .py file and allow its execution using chmod:
$ chmod +x script.py
To make your Python script executable by itself, simply add a shebang followed by the Python executable path you want to use at the beginnning of your script:
#!/usr/local/bin/python3
Import MySQLdb and specify the host and your credentials like so:
4 import MySQLdb 5 6 db = MySQLdb.connect(host="host", 7 port=3306, 8 user="user", 9 passwd="passwd", 10 db="db", 11 autocommit=True, 12 use_unicode=True 13 ) 14 cursor = db.cursor()
Using MySQLdb library we can execute our SQL query using the following command:
17 query = """SELECT CONCAT('DROP DATABASE `',schema_name,'`;') AS `-- stmt` 18 FROM information_schema.schemata 19 WHERE schema_name LIKE 'data_tmp_usr\_%' 20 ORDER BY schema_name 21 """ 22 23 cursor.execute(query) 24 queries = cursor.fetchall()
MySQLdb returns query results as a tuple of tuples like the following:
# queries: (('DROP DATABASE `data_tmp_usr_49eb3ca6980ab9c51ce2158f`;',), ('DROP DATABASE `data_tmp_usr_518760786410f494b94e0fa8`;',), ('DROP DATABASE `data_tmp_usr_81d975bec9519ea84504ad30`;',))
Results return valid SQL commands and we simply need to iterate through the tuple and execute each command one afer the other.
Note: To ensure we are not deleting the wrong data we could check that final queries are correct and do not include data that we don't want to delete.
Now to execute each query one after the other we use a for loop:
27 # Execute drop statements one after the other 28 for q in queries: 29 print(q[0]) 30 cursor.execute(q[0])
We do exactly the same to delete users:
33 query = """SELECT CONCAT('DROP USER "',user,'";') AS `-- stmt` 34 FROM mysql.user 35 WHERE user LIKE 'tmp_usr\_%' ORDER BY user; 36 """ 37 38 cursor.execute(query) 39 queries = cursor.fetchall() 40 41 # Execute drop statements one after the other 42 for q in queries: 43 print(q[0]) # Store queries into logs 44 cursor.execute(q[0])
After deleting users we need to flush the privileges using the simple command:
query = "FLUSH PRIVILEGES;" cursor.execute(query)
The whole script looks like this:
1 #!/usr/local/bin/python3 2 3 # Script run every 24 data to delete temporary users / databases created by visitors 4 import MySQLdb 5 6 # Delete temporary databases 7 db = MySQLdb.connect(host="host", 8 port=3306, 9 user="user", 10 passwd="passwd", 11 db="db", 12 autocommit=True, 13 use_unicode=True 14 ) 15 cursor = db.cursor() 16 17 query = """SELECT CONCAT('DROP DATABASE `',schema_name,'`;') AS `-- stmt` 18 FROM information_schema.schemata 19 WHERE schema_name LIKE 'data_tmp_usr\_%' 20 ORDER BY schema_name 21 """ 22 23 cursor.execute(query) 24 queries = cursor.fetchall() 25 26 27 # Execute drop statements one after the other 28 for q in queries: 29 print(q[0]) # Store queries into logs 30 cursor.execute(q[0]) 31 32 # Delete temporary users (beginning by 'tmp_user') 33 query = """SELECT CONCAT('DROP USER "',usr,'";') AS `-- stmt` 34 FROM mysql.user 35 WHERE usr LIKE 'tmp_usr\_%' ORDER BY user; 36 """ 37 38 cursor.execute(query) 39 queries = cursor.fetchall() 40 41 # Store queries into logs 42 # Execute drop statements one after the other 43 for q in queries: 44 print(q[0]) 45 cursor.execute(q[0]) 46 47 48 # Flush privileges 49 query = "FLUSH PRIVILEGES;" 50 cursor.execute(query) 51 52 cursor.close() 53 db.close() 54
Creating a cron job executing the Python script at every given time
Create a new cron job using the following command:
$ crontab -e
This will open the current user crontab file in the default text editor (usually vim or nano)
Add the following line to execute our Python script every day at midnight:
0 0 * * * /usr/local/bin/sqlCleaner.py >> /var/log/sqlCleanerCron.log 2>&1
Note: to determine what combination to use for any given interval I recommend visiting this website if you are not already familiar with the cron daemon.
Now save your modifications and exit (using :x with vim).
To make sure our cron job executes properly, let's check the log file in which it is supposed to output any data dropped from the database:
$ cat /var/log/sqlCleaner.log DROP DATABASE `data_tmp_usr_49eb3ca6980ab9c51ce2158f`; DROP DATABASE `data_tmp_usr_518760786410f494b94e0fa8`; DROP DATABASE `data_tmp_usr_81d975bec9519ea84504ad30`; DROP USER "tmp_usr_49eb3ca6980ab9c51ce2158f"; DROP USER "tmp_usr_518760786410f494b94e0fa8"; DROP USER "tmp_usr_81d975bec9519ea84504ad30";
Seems like it is working fine!
(AWS specific) Setting instance termination policies
If you have automatic scaling enabled AWS will automatically create / delete instances as the traffic of your application varies. To avoid having multiple instances running the same cron job we need to ensure that only the leader instance is running it. We also need to make sure that the leader instance should never be terminated.
I personally chose to set the instance termination policy to "Newest Instance", this way the first and oldest instance should remain the leader and thus the only one executing the cron job.
  • Go to the EC2 management console and select "Auto Scaling groups" at the bottom of the page.
  • Select your instance and click on "Edit"
  • In "Advanced Configuration", enable instance scale-in protection and set the "Termination policies" to "Newest Instance"
(AWS specific) Locating AWS Elastic Beanstalk Python virtual environment
Elastic Beanstalk uses Python virtual environments by default to run an application. At the time of writing it is located at the following path:
/var/app/venv/staging-LQM1lest/bin
To make your Python script executable, simply add a shebang followed by the Python executable path from the virtual env you want to use at the beginnning of your script:
#!/var/app/venv/staging-LQM1lest/bin/python
(AWS specific) Creating a .config file containing our cron job
We start by creating a .txt file named 02_sql_cleaner_cron.txt in .ebextensions containing the command to execute.
1 0 0 * * * root /usr/local/bin/sqlCleaner.py >> /var/log/sqlCleanerCron.log 2 # New line needed for cron to execute the cron job
Then we create a .config file in .ebextensions as well that we call cron_job.config containing the following lines:
• A container_command ensuring that old cron jobs are removed from the instance on deployement:
1 container_commands: 2 01_remove_cron_jobs: 3 command: "rm /etc/cron.d/sql_cleaner_cron || exit 0"
• The actual cron job:
5 02_sql_cleaner_cron: 6 command: "cat .ebextensions/02_sql_cleaner_cron.txt > /etc/cron.d/sql_cleaner_cron && chmod 644 /etc/cron.d/sql_cleaner_cron" 7 leader_only: true # This will ensure that only the leader instance is executing the cron job
• A .log file storing the script output:
9 files: 10 "/var/log/sqlCleaner.log": 11 mode: "000644" 12 owner: root 13 group: root 14 content: | 15 # Temporary users / databases deleted
• The Python script and its content:
17 "/usr/local/bin/sqlCleaner.py": 18 mode: "000755" 19 owner: root 20 group: root 21 content: | 22 #!/var/app/venv/staging-LQM1lest/bin/python 23 24 # Script run every 24 hours to delete temporary users / databases created by visitors 25 import MySQLdb 26 27 # Delete temporary databases 28 db = MySQLdb.connect(host="host", 29 port=3306, 30 user="user", 31 passwd="passwd", 32 db="db", 33 autocommit=True, 34 use_unicode=True 35 ) 36 cursor = db.cursor() 37 38 query = """SELECT CONCAT('DROP DATABASE `',schema_name,'`;') AS `-- stmt` 39 FROM information_schema.schemata 40 WHERE schema_name LIKE 'data_tmp_usr\_%' 41 ORDER BY schema_name 42 """ 43 44 cursor.execute(query) 45 queries = cursor.fetchall() 46 47 # Store queries into logs 48 # Execute drop statements one after the other 49 for q in queries: 50 print(q[0]) 51 cursor.execute(q[0]) 52 53 # Delete temporary users (beginning by 'tmp_user') 54 query = """SELECT CONCAT('DROP USER "',user,'";') AS `-- stmt` 55 FROM mysql.user 56 WHERE user LIKE 'tmp_usr\_%' ORDER BY user; 57 """ 58 59 cursor.execute(query) 60 queries = cursor.fetchall() 61 62 # Store queries into logs 63 # Execute drop statements one after the other 64 for q in queries: 65 print(q[0]) 66 cursor.execute(q[0]) 67 68 69 # Flush privileges 70 query = "FLUSH PRIVILEGES;" 71 cursor.execute(query) 72 73 cursor.close() 74 db.close()
The whole file looks like this:
1 container_commands: 2 01_remove_cron_jobs: 3 command: "rm /etc/cron.d/cron_jobs || exit 0" 4 5 02_sql_cleaner_cron: 6 command: "cat .ebextensions/02_sql_cleaner_cron.txt > /etc/cron.d/sql_cleaner_cron && chmod 644 /etc/cron.d/sql_cleaner_cron" 7 leader_only: true 8 9 files: 10 "/var/log/sqlCleaner.log": 11 mode: "000644" 12 owner: root 13 group: root 14 content: | 15 # Temporary users / databases deleted 16 17 "/usr/local/bin/sqlCleaner.py": 18 mode: "000755" 19 owner: root 20 group: root 21 content: | 22 #!/var/app/venv/staging-LQM1lest/bin/python 23 24 # Script run every 24 hours to delete temporary users / databases created by visitors 25 import MySQLdb 26 27 # Delete temporary databases 28 db = MySQLdb.connect(host="host", 29 port=3306, 30 user="user", 31 passwd="passwd", 32 db="db", 33 autocommit=True, 34 use_unicode=True 35 ) 36 cursor = db.cursor() 37 38 query = """SELECT CONCAT('DROP DATABASE `',schema_name,'`;') AS `-- stmt` 39 FROM information_schema.schemata 40 WHERE schema_name LIKE 'data_tmp_usr\_%' 41 ORDER BY schema_name 42 """ 43 44 cursor.execute(query) 45 queries = cursor.fetchall() 46 47 # Store queries into logs 48 # Execute drop statements one after the other 49 for q in queries: 50 print(q[0]) 51 cursor.execute(q[0]) 52 53 # Delete temporary users (beginning by 'tmp_user') 54 query = """SELECT CONCAT('DROP USER "',user,'";') AS `-- stmt` 55 FROM mysql.user 56 WHERE user LIKE 'tmp_usr\_%' ORDER BY user; 57 """ 58 59 cursor.execute(query) 60 queries = cursor.fetchall() 61 62 # Store queries into logs 63 # Execute drop statements one after the other 64 for q in queries: 65 print(q[0]) 66 cursor.execute(q[0]) 67 68 69 # Flush privileges 70 query = "FLUSH PRIVILEGES;" 71 cursor.execute(query) 72 73 cursor.close() 74 db.close() 75 70
Note: Make sure that the .config file indentation is correct. It basically uses the same syntax as YAML files (2 spaces indentation).
We can check that our cron job executes properly by logging into the instance via SSH and checking its log file:
$ cat /var/log/sqlCleaner.log