“Efficiency meets automation: Keeping your MariaDB clean and your DBA informed with Python.”
MinervaDB Automation and Performance Engineering
To delete duplicate records in MariaDB, you can use SQL queries to identify and remove them. Below are the steps for doing this and a Python script for scheduling this task weekly and notifying the DBA via email. In this example, we’ll use the smtplib library to send email notifications.
Step 1: Delete Duplicate Records
You can identify and delete duplicate records using a SQL query that utilizes the DELETE statement with a self-join. Here’s a sample SQL query to delete duplicates from a table named your_table based on a specific criteria (e.g., duplicate email values):
1 2 3 4 5 |
DELETE t1 FROM your_table t1 JOIN your_table t2 ON t1.email = t2.email WHERE t1.id > t2.id; |
This query will delete all but one of the duplicate records, keeping the one with the lowest id value. Adjust the table name, columns, and criteria to match your database schema.
Step 2: Python Script for Weekly Scheduling and Email Notification
You can use Python’s schedule library to schedule the task and smtplib to send email notifications. Before running the script, make sure you have configured your SMTP email settings.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
import schedule import time import smtplib from email.mime.text import MIMEText # Database Connection import mysql.connector db_connection = mysql.connector.connect( host="your_db_host", user="your_db_user", password="your_db_password", database="your_db_name" ) db_cursor = db_connection.cursor() # Function to Delete Duplicate Records def delete_duplicates(): # Execute the SQL query to delete duplicates delete_query = """ DELETE t1 FROM your_table t1 JOIN your_table t2 ON t1.email = t2.email WHERE t1.id > t2.id; """ db_cursor.execute(delete_query) db_connection.commit() # Function to Send Email Notification def send_email_notification(): # Email Configuration sender_email = "your_sender_email@gmail.com" sender_password = "your_sender_password" recipient_email = "dba_email@example.com" # Create the email message msg = MIMEText("Duplicate records have been deleted in the MariaDB.") msg["Subject"] = "MariaDB Duplicate Records Cleanup" msg["From"] = sender_email msg["To"] = recipient_email # Send the email try: server = smtplib.SMTP("smtp.gmail.com", 587) server.starttls() server.login(sender_email, sender_password) server.sendmail(sender_email, recipient_email, msg.as_string()) server.quit() print("Email notification sent successfully.") except Exception as e: print(f"Email notification failed: {str(e)}") # Schedule the task to run weekly schedule.every().week.at("00:00:00").do(delete_duplicates) schedule.every().week.at("00:15:00").do(send_email_notification) # Run the scheduled tasks while True: schedule.run_pending() time.sleep(1) |
Make sure to replace the placeholders (e.g., your_db_host, your_db_user, your_db_password, your_db_name, your_table, and email details) with your actual database and email configuration. This script will delete duplicate records and send an email notification to the DBA every week on the specified schedule.
Note: Ensure that you have installed the required Python libraries (schedule, smtplib) using pip if they are not already installed. Additionally, consider running this script on a server or machine where it can be scheduled to run automatically.