Partitioning for ASTPP CDRs Table

Following worked for me so that sharing for others use. Please do remember to take a full backup of the db using root

mysqldump -u root -p --single-transaction --routines --triggers astpp > /opt/astpp_backup_$(date +%Y%m%d_%H%M%S).sql

-- Pure Partitioning for ASTPP CDRs Table
-- Uses only callstart column - ignores reseller_id and other constraints

-- ============================================================================
-- PHASE 1: VERIFY CURRENT STRUCTURE
-- ============================================================================

-- Check current index (we'll ignore reseller_id for partitioning)
SHOW INDEX FROM cdrs;

-- Verify callstart data distribution
SELECT 
    YEAR(callstart) as year,
    MONTH(callstart) as month,
    COUNT(*) as record_count
FROM cdrs
GROUP BY YEAR(callstart), MONTH(callstart)
ORDER BY year DESC, month DESC;

-- ============================================================================
-- PHASE 2: PURE PARTITIONING (ONLY CALLSTART)
-- ============================================================================

-- Partition using only callstart column
-- This ignores reseller_id and other constraints completely

ALTER TABLE cdrs PARTITION BY RANGE (YEAR(callstart) * 100 + MONTH(callstart)) (
    -- 2024 partitions (historical data)
    PARTITION p2024_07 VALUES LESS THAN (202407),
    PARTITION p2024_08 VALUES LESS THAN (202408),
    PARTITION p2024_09 VALUES LESS THAN (202409),
    PARTITION p2024_10 VALUES LESS THAN (202410),
    PARTITION p2024_11 VALUES LESS THAN (202411),
    PARTITION p2024_12 VALUES LESS THAN (202412),
    
    -- 2025 partitions (current data)
    PARTITION p2025_01 VALUES LESS THAN (202501),
    PARTITION p2025_02 VALUES LESS THAN (202502),
    PARTITION p2025_03 VALUES LESS THAN (202503),
    PARTITION p2025_04 VALUES LESS THAN (202504),
    PARTITION p2025_05 VALUES LESS THAN (202505),
    PARTITION p2025_06 VALUES LESS THAN (202506),
    PARTITION p2025_07 VALUES LESS THAN (202507),
    PARTITION p2025_08 VALUES LESS THAN (202508),
    PARTITION p2025_09 VALUES LESS THAN (202509),
    PARTITION p2025_10 VALUES LESS THAN (202510),
    PARTITION p2025_11 VALUES LESS THAN (202511),
    PARTITION p2025_12 VALUES LESS THAN (202512),
    
    -- Future partition for data beyond 2025
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- ============================================================================
-- PHASE 3: VERIFY PARTITIONING
-- ============================================================================

-- Check partition distribution
SELECT 
    PARTITION_NAME,
    TABLE_ROWS,
    ROUND(DATA_LENGTH/1024/1024, 2) as data_size_mb
FROM INFORMATION_SCHEMA.PARTITIONS 
WHERE TABLE_SCHEMA = 'astpp' 
AND TABLE_NAME = 'cdrs'
ORDER BY PARTITION_ORDINAL_POSITION;

-- Verify data distribution across partitions
SELECT 
    PARTITION_NAME,
    COUNT(*) as record_count
FROM cdrs
GROUP BY PARTITION_NAME
ORDER BY PARTITION_NAME;

-- ============================================================================
-- PHASE 4: TEST PARTITION PRUNING
-- ============================================================================

-- Test partition pruning for July 2025 (should only scan p2025_07)
EXPLAIN SELECT COUNT(*) FROM cdrs 
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01';

-- Test partition pruning for August 2025 (should only scan p2025_08)
EXPLAIN SELECT COUNT(*) FROM cdrs 
WHERE callstart >= '2025-08-01' AND callstart < '2025-09-01';

-- ============================================================================
-- PHASE 5: PERFORMANCE TESTING
-- ============================================================================

-- Test July 2025 query performance
SELECT 
    'July 2025' as period,
    COUNT(*) as total_calls,
    SUM(billseconds) as total_duration,
    SUM(debit) as total_revenue
FROM cdrs 
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01';

-- Test August 2025 query performance
SELECT 
    'August 2025' as period,
    COUNT(*) as total_calls,
    SUM(billseconds) as total_duration,
    SUM(debit) as total_revenue
FROM cdrs 
WHERE callstart >= '2025-08-01' AND callstart < '2025-09-01';

-- ============================================================================
-- PHASE 6: MONITORING
-- ============================================================================

-- Monitor partition sizes
SELECT 
    PARTITION_NAME,
    TABLE_ROWS,
    ROUND(DATA_LENGTH/1024/1024, 2) as data_size_mb,
    ROUND(INDEX_LENGTH/1024/1024, 2) as index_size_mb
FROM INFORMATION_SCHEMA.PARTITIONS 
WHERE TABLE_SCHEMA = 'astpp' 
AND TABLE_NAME = 'cdrs'
ORDER BY PARTITION_ORDINAL_POSITION;

-- Check partition pruning effectiveness
EXPLAIN SELECT COUNT(*) FROM cdrs 
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01';

-- ============================================================================
-- PHASE 7: ROLLBACK (IF NEEDED)
-- ============================================================================

-- If partitioning causes issues, remove it (keeps data intact)
-- ALTER TABLE cdrs REMOVE PARTITIONING;

-- ============================================================================
-- USAGE EXAMPLES
-- ============================================================================

/*
-- Query specific date range (automatic partition pruning)
SELECT COUNT(*) FROM cdrs 
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01';

-- Daily call counts for July 2025
SELECT 
    DATE(callstart) as call_date,
    COUNT(*) as daily_calls,
    SUM(billseconds) as total_bill_seconds,
    SUM(debit) as total_debit
FROM cdrs
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01'
GROUP BY DATE(callstart)
ORDER BY call_date;

-- Account analysis for July 2025
SELECT 
    accountid,
    COUNT(*) as call_count,
    SUM(billseconds) as total_bill_seconds,
    SUM(debit) as total_debit
FROM cdrs
WHERE callstart >= '2025-07-01' AND callstart < '2025-08-01'
GROUP BY accountid
ORDER BY call_count DESC
LIMIT 10;
*/ 
1 Like

@linux.kernel Thanks for sharing your approach! While I haven’t personally tested it yet, it looks quite promising. Really appreciate you taking the time to explore this and share your insights. It could be very helpful for others.

A small note of caution: this should only be attempted if you’re comfortable with MySQL, as it requires regular monitoring and maintenance. Otherwise, it could potentially affect a production system.

Replying again after testing for 6 months. This clearly helped me to increase performance.

BTW I did following mods to the code back in November 2025


-- Use a non-conflicting delimiter for the routine body
DELIMITER $$

-- Optional: increase concat buffer in case many partitions match
SET SESSION group_concat_max_len = 1048576 $$

DROP PROCEDURE IF EXISTS astpp.sp_cdrs_partition_maint $$
CREATE PROCEDURE astpp.sp_cdrs_partition_maint()
BEGIN
  DECLARE v_keep_from_yymm CHAR(6);
  DECLARE v_drop_stmt      TEXT;

  /* Keep last 4 full months (older than this yyyymm will be dropped) */
  SET v_keep_from_yymm = DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 4 MONTH), '%Y%m');

  /* Build DROP PARTITION statement for all older monthly partitions */
  SELECT CONCAT(
           'ALTER TABLE astpp.cdrs DROP PARTITION ',
           GROUP_CONCAT(p.PARTITION_NAME ORDER BY p.PARTITION_ORDINAL_POSITION SEPARATOR ', '),
           ';'
         )
    INTO v_drop_stmt
  FROM INFORMATION_SCHEMA.PARTITIONS p
  WHERE p.TABLE_SCHEMA   = 'astpp'
    AND p.TABLE_NAME     = 'cdrs'
    AND p.PARTITION_NAME REGEXP '^p[0-9]{4}_[0-9]{2}$'
    AND CAST(REPLACE(REPLACE(p.PARTITION_NAME, 'p', ''), '_', '') AS UNSIGNED)
        < CAST(v_keep_from_yymm AS UNSIGNED);

  /* Execute DROP if any target partitions exist */
  IF v_drop_stmt IS NOT NULL AND v_drop_stmt <> '' THEN
    SET @q := v_drop_stmt;
    PREPARE s FROM @q;
    EXECUTE s;
    DEALLOCATE PREPARE s;
  END IF;

  /* Ensure next two future months exist (m1 = +1 month, m2 = +2 months) */
  SET @m1_yymm = DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 MONTH), '%Y%m');
  SET @m1_name = DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 MONTH), 'p%Y_%m');
  SET @m2_yymm = DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 2 MONTH), '%Y%m');
  SET @m2_name = DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 2 MONTH), 'p%Y_%m');

  /* Add +1 month if missing */
  IF (SELECT COUNT(*)
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_SCHEMA='astpp' AND TABLE_NAME='cdrs' AND PARTITION_NAME=@m1_name) = 0 THEN
    SET @sql := CONCAT(
      'ALTER TABLE astpp.cdrs REORGANIZE PARTITION p_future INTO (',
      'PARTITION ', @m1_name, ' VALUES LESS THAN (', @m1_yymm, '), ',
      'PARTITION p_future VALUES LESS THAN MAXVALUE',
      ');'
    );
    PREPARE s1 FROM @sql; EXECUTE s1; DEALLOCATE PREPARE s1;
  END IF;

  /* Add +2 months if missing (p_future remains the last partition after the previous step) */
  IF (SELECT COUNT(*)
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_SCHEMA='astpp' AND TABLE_NAME='cdrs' AND PARTITION_NAME=@m2_name) = 0 THEN
    SET @sql := CONCAT(
      'ALTER TABLE astpp.cdrs REORGANIZE PARTITION p_future INTO (',
      'PARTITION ', @m2_name, ' VALUES LESS THAN (', @m2_yymm, '), ',
      'PARTITION p_future VALUES LESS THAN MAXVALUE',
      ');'
    );
    PREPARE s2 FROM @sql; EXECUTE s2; DEALLOCATE PREPARE s2;
  END IF;

  /* Refresh stats (cheap) */
  ANALYZE TABLE astpp.cdrs;
END $$
DELIMITER ;




…… BELOW IS THE JOB ….. 

SET GLOBAL event_scheduler = ON;

DROP EVENT IF EXISTS astpp.evt_cdrs_partition_maint;

DELIMITER $$
CREATE EVENT astpp.evt_cdrs_partition_maint
ON SCHEDULE EVERY 1 MONTH
    STARTS TIMESTAMP(DATE_FORMAT(CURRENT_DATE, '%Y-%m-01 02:00:00'))
DO
BEGIN
  CALL astpp.sp_cdrs_partition_maint();
END $$
DELIMITER ;



-- call job manually

CALL astpp.sp_cdrs_partition_maint();

-- monitoring 

SELECT PARTITION_NAME,
       TABLE_ROWS,
       ROUND((DATA_LENGTH+INDEX_LENGTH)/1024/1024,2) AS size_mb
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA='astpp' AND TABLE_NAME='cdrs'
ORDER BY PARTITION_ORDINAL_POSITION;

This is excellent work and exactly the kind of contribution that makes this community valuable.

The approach of partitioning by callstart using RANGE on YEAR * 100 + MONTH is clean and well thought out. Ignoring reseller_id for the partition key was the right call since adding composite partition keys in MySQL creates significant overhead and constraint complexity that is not worth it for a CDR table.

The November update with the automated stored procedure and Event Scheduler is particularly useful. Running partition maintenance as a scheduled MySQL event rather than a manual cron job is the right production pattern, and keeping 4 months of partitions before dropping older ones is a sensible default for most operators.

A few things worth noting for others reading this thread:

Before running the ALTER TABLE to add partitioning, ensure the MySQL Event Scheduler is enabled on your server. You can check and enable it with SET GLOBAL event_scheduler = ON. Some server setups have it disabled by default.

Also confirm that the MySQL user running ASTPP has EVENT and EXECUTE privileges if you want the automated maintenance to work without manual intervention.

Thank you for sharing this and for coming back with real-world results. Posts like this save other operators hours of trial and error. We will look at whether we can reference this in our documentation as a recommended optimisation for high-volume deployments.

Here is the documentation:

https://inextrix.atlassian.net/wiki/spaces/ASTPP/pages/846528515/CDR+Table+Partitioning+for+High-Volume+Deployments