Module 9: Performance Tuning and Optimization
Learn advanced techniques to optimize SLT replication performance, reduce latency, and maximize throughput.
1. Performance Baseline
Current Performance Assessment
-- Measure current performance
SELECT
MT_ID,
TABLE_NAME,
COUNT(*) / TIMESTAMPDIFF(HOUR, MIN(TIMESTAMP), MAX(TIMESTAMP)) as RECORDS_PER_HOUR,
AVG(TIMESTAMPDIFF(MILLISECOND, TIMESTAMP, PROCESSING_TIME)) as AVG_LATENCY_MS
FROM /DMIS/REPLICATION_STATS
WHERE TIMESTAMP >= ADD_DAYS(CURRENT_DATE, -7)
GROUP BY MT_ID, TABLE_NAME;
Performance Targets
| Metric | Current | Target | Excellent |
|---|---|---|---|
| Throughput | 500 rec/s | 2,000 rec/s | 5,000+ rec/s |
| Latency | 5 seconds | 2 seconds | < 1 second |
| Error Rate | 1% | 0.1% | < 0.01% |
| CPU Usage | 85% | 60-70% | 40-60% |
2. Bottleneck Identification
Performance Analysis Flow
graph TD
A[Slow Replication] --> B{Where is bottleneck?}
B -->|Source Read| C[Optimize source queries]
B -->|Network| D[Increase bandwidth/reduce latency]
B -->|Transformation| E[Simplify or offload transformations]
B -->|Target Write| F[Optimize target database]
B -->|SLT Processing| G[Increase jobs/memory]
Diagnostic Queries
-- Find slowest tables
SELECT
TABLE_NAME,
AVG(PROCESSING_TIME_MS) as AVG_TIME,
MAX(PROCESSING_TIME_MS) as MAX_TIME,
COUNT(*) as RECORD_COUNT
FROM /DMIS/PERFORMANCE_LOG
WHERE TIMESTAMP >= CURRENT_DATE
GROUP BY TABLE_NAME
HAVING AVG_TIME > 1000
ORDER BY AVG_TIME DESC
LIMIT 10;
-- Identify resource bottlenecks
SELECT
'CPU' as RESOURCE,
AVG(CPU_PERCENT) as USAGE
FROM M_HOST_RESOURCE_UTILIZATION
UNION ALL
SELECT 'MEMORY', AVG(USED_PHYSICAL_MEMORY * 100.0 / TOTAL_PHYSICAL_MEMORY)
FROM M_HOST_RESOURCE_UTILIZATION;
3. Source System Optimization
Optimize Source Reads
1. Reduce Package Size for Small Tables
Large packages: Overhead, memory issues
Small packages: Better parallelism
MARA (100K rows): Package = 10,000
VBAP (10M rows): Package = 100,000
2. Add Indexes on Source Tables (if missing)
CREATE INDEX IDX_MARA_ERDAT ON MARA(ERDAT);
3. Partitioning Large Tables
PARTITION TABLE BSEG BY RANGE(GJAHR);
4. Archive Old Data
Only replicate active data: WHERE ERDAT >= '20240101'
4. Network Optimization
Bandwidth and Latency
Test Network Performance:
├── Bandwidth: iperf -c target-system
├── Latency: ping -c 100 target-system
├── Packet Loss: mtr target-system
└── RFC Connection: SM59 → Connection Test
Optimization Techniques:
✅ Use dedicated network link for replication
✅ Enable compression (for low bandwidth)
✅ Increase RFC timeout for unstable networks
✅ Use QoS to prioritize replication traffic
Compression Configuration
LTRC → Advanced Settings → Network
Data Compression:
☑ Enable compression
Compression Level: ● Medium (balance)
○ High (CPU intensive)
○ Low (faster)
Result:
Before: 100 MB/s network usage
After: 30 MB/s (70% reduction)
Trade-off: +5% CPU usage
5. SLT Server Tuning
Parallel Job Optimization
Current: 8 jobs → 1,500 rec/s
Target: 24 jobs → 4,500 rec/s (3x improvement)
Configuration:
LTRC → Advanced Settings
Number of Parallel Jobs: [24]
Calculation:
Optimal Jobs = (CPU Cores × 2) - 4
Example: 16 cores × 2 - 4 = 28 jobs
Memory Configuration
Instance Profile: SLT_DVEBMGS00_hostname
# Increase memory allocations
em/initial_size_MB = 4096
em/max_size_MB = 8192
ztta/roll_extension = 4000000000
# Buffering
zcsa/table_buffer_area = 1000000
zcsa/db_max_buftab = 50000
Restart required: stopsap r3 / startsap r3
Job Scheduling Optimization
-- Distribute jobs evenly across work processes
Transaction: SM50 (Process Overview)
Before Tuning:
WP 0: ████████████████████ (100% busy)
WP 1: ████████████████████ (100% busy)
WP 2: ████░░░░░░░░░░░░░░░░ ( 20% busy)
WP 3: ██░░░░░░░░░░░░░░░░░░ ( 10% busy)
After Tuning:
WP 0: ████████████░░░░░░░░ ( 60% busy)
WP 1: ████████████░░░░░░░░ ( 60% busy)
WP 2: ███████████░░░░░░░░░ ( 55% busy)
WP 3: ███████████░░░░░░░░░ ( 55% busy)
6. Target System Optimization
HANA Database Tuning
-- Disable auto-merge during heavy replication
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
SET ('mergedog', 'active') = 'no' WITH RECONFIGURE;
-- Increase write buffer
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('persistence', 'log_buffer_size_kb') = '102400' WITH RECONFIGURE;
-- Optimize delta merge
ALTER TABLE SLTREPL.VBAP
AUTO MERGE OFF;
-- Manual merge during low-traffic period
MERGE DELTA OF SLTREPL.VBAP;
Index Management
-- Disable indexes during initial load
ALTER TABLE SLTREPL.MARA DISABLE INDEX ALL;
[Perform initial load]
-- Rebuild indexes after load
ALTER TABLE SLTREPL.MARA REBUILD INDEX ALL;
-- Result:
Initial Load Time:
With indexes: 4 hours
Without indexes: 1.5 hours (62% faster)
Partitioning Strategy
-- Partition large target tables by date
ALTER TABLE SLTREPL.BSEG PARTITION BY RANGE (GJAHR) (
PARTITION P2023 VALUES < '2024',
PARTITION P2024 VALUES < '2025',
PARTITION P2025 VALUES < '2026',
PARTITION P2026 VALUES < '2027'
);
-- Benefits:
- Faster inserts (partition-local)
- Better query performance
- Easier archival
7. Transformation Optimization
Push Transformations to Target
❌ Bad: Complex transformation in SLT
- Slows replication
- Increases latency
- Uses SLT resources
✅ Good: Simple replication + HANA view
- Fast replication
- Transformation in HANA (optimized)
- Leverages HANA parallelism
Example:
Instead of: SLT transformation (field mapping, calculations)
Use: Direct replication + Calculation View in HANA
Simplify Transformation Logic
" Before (Slow):
DATA: lv_result TYPE string.
LOOP AT lt_table INTO ls_row.
lv_result = complex_calculation( ls_row ).
ls_row-new_field = lv_result.
MODIFY lt_table FROM ls_row.
ENDLOOP.
" After (Fast):
SELECT * FROM @lt_table AS t
INNER JOIN calc_table AS c ON t~key = c~key
INTO TABLE @lt_result.
8. Monitoring and Tuning
Real-Time Performance Dashboard
Transaction: LTRC → Performance Tab
Current Performance:
┌──────────────────────────────────────┐
│ Throughput: 3,245 rec/s ⬆ +45% │
│ Latency: 0.8 seconds ⬇ -60% │
│ CPU Usage: 68% ⬇ -17% │
│ Memory: 52% → Stable │
└──────────────────────────────────────┘
Top Consumers:
Table Records/s Latency % Total
VBAP 1,234 0.5s 38%
BSEG 892 1.2s 27%
MSEG 654 0.7s 20%
MARA 465 0.3s 15%
Performance Trending
-- Week-over-week comparison
SELECT
TO_VARCHAR(DATE_TRUNC('WEEK', TIMESTAMP), 'YYYY-WW') as WEEK,
AVG(THROUGHPUT) as AVG_THROUGHPUT,
AVG(LATENCY_MS) as AVG_LATENCY
FROM /DMIS/PERFORMANCE_STATS
WHERE TIMESTAMP >= ADD_MONTHS(CURRENT_DATE, -3)
GROUP BY DATE_TRUNC('WEEK', TIMESTAMP)
ORDER BY WEEK;
-- Result:
WEEK AVG_THROUGHPUT AVG_LATENCY CHANGE
2025-50 1,500 5,000ms baseline
2025-51 1,850 4,200ms +23% ⬆
2026-01 2,400 2,800ms +30% ⬆
2026-02 3,100 1,200ms +29% ⬆
2026-03 3,850 800ms +24% ⬆
9. Advanced Techniques
Technique 1: Micro-Batching
Instead of: Process each record individually
Use: Batch small changes together
Configuration:
Commit Frequency: Every 1,000 records
Batch Window: 5 seconds
Result:
- 40% reduction in commit overhead
- 25% throughput improvement
Technique 2: Table-Specific Tuning
Different tables, different strategies:
High-frequency tables (VBAP):
- Small batches (10K)
- More parallel jobs (16)
- Aggressive cleanup (hourly)
Low-frequency tables (MARA):
- Large batches (100K)
- Fewer jobs (4)
- Standard cleanup (daily)
Technique 3: Load Balancing
Distribute tables across multiple MT_IDs:
MT_ID_SALES_01:
├── VBAK (4 jobs)
├── VBAP (8 jobs)
└── VBFA (4 jobs)
MT_ID_FINANCE_01:
├── BKPF (6 jobs)
├── BSEG (10 jobs)
└── BSID (4 jobs)
Benefit: Parallel processing, isolated tuning
10. Performance Tuning Checklist
Quick Wins (1-2 hours)
- Increase parallel jobs to 2× CPU cores
- Enable data compression for remote targets
- Clean up logging tables (delete processed)
- Add indexes on frequently filtered fields
- Disable unnecessary transformations
Medium Impact (1-2 days)
- Optimize target database (HANA tuning)
- Implement table partitioning
- Push transformations to target views
- Set up load balancing across MT_IDs
- Configure micro-batching
Long-Term (ongoing)
- Monitor and trend performance weekly
- Capacity planning (quarterly)
- Regular health checks
- Performance regression testing
- Document optimization changes
Summary
✅ Performance baseline and targets ✅ Bottleneck identification techniques ✅ Source, network, and target optimization ✅ SLT server tuning (jobs, memory) ✅ Transformation optimization strategies ✅ Real-time monitoring and trending ✅ Advanced tuning techniques ✅ Performance checklist
Next: Module 10 - Complex Scenarios and Advanced Use Cases