Module 9: Performance Tuning and Optimization

Learn advanced techniques to optimize SLT replication performance, reduce latency, and maximize throughput.

1. Performance Baseline

Current Performance Assessment

-- Measure current performance
SELECT 
  MT_ID,
  TABLE_NAME,
  COUNT(*) / TIMESTAMPDIFF(HOUR, MIN(TIMESTAMP), MAX(TIMESTAMP)) as RECORDS_PER_HOUR,
  AVG(TIMESTAMPDIFF(MILLISECOND, TIMESTAMP, PROCESSING_TIME)) as AVG_LATENCY_MS
FROM /DMIS/REPLICATION_STATS
WHERE TIMESTAMP >= ADD_DAYS(CURRENT_DATE, -7)
GROUP BY MT_ID, TABLE_NAME;

Performance Targets

Metric	Current	Target	Excellent
Throughput	500 rec/s	2,000 rec/s	5,000+ rec/s
Latency	5 seconds	2 seconds	< 1 second
Error Rate	1%	0.1%	< 0.01%
CPU Usage	85%	60-70%	40-60%

2. Bottleneck Identification

Performance Analysis Flow

graph TD
    A[Slow Replication] --> B{Where is bottleneck?}
    B -->|Source Read| C[Optimize source queries]
    B -->|Network| D[Increase bandwidth/reduce latency]
    B -->|Transformation| E[Simplify or offload transformations]
    B -->|Target Write| F[Optimize target database]
    B -->|SLT Processing| G[Increase jobs/memory]

Diagnostic Queries

-- Find slowest tables
SELECT 
  TABLE_NAME,
  AVG(PROCESSING_TIME_MS) as AVG_TIME,
  MAX(PROCESSING_TIME_MS) as MAX_TIME,
  COUNT(*) as RECORD_COUNT
FROM /DMIS/PERFORMANCE_LOG
WHERE TIMESTAMP >= CURRENT_DATE
GROUP BY TABLE_NAME
HAVING AVG_TIME > 1000
ORDER BY AVG_TIME DESC
LIMIT 10;

-- Identify resource bottlenecks
SELECT 
  'CPU' as RESOURCE,
  AVG(CPU_PERCENT) as USAGE
FROM M_HOST_RESOURCE_UTILIZATION
UNION ALL
SELECT 'MEMORY', AVG(USED_PHYSICAL_MEMORY * 100.0 / TOTAL_PHYSICAL_MEMORY)
FROM M_HOST_RESOURCE_UTILIZATION;

3. Source System Optimization

Optimize Source Reads

1. Reduce Package Size for Small Tables
   Large packages: Overhead, memory issues
   Small packages: Better parallelism
   
   MARA (100K rows): Package = 10,000
   VBAP (10M rows): Package = 100,000

2. Add Indexes on Source Tables (if missing)
   CREATE INDEX IDX_MARA_ERDAT ON MARA(ERDAT);
   
3. Partitioning Large Tables
   PARTITION TABLE BSEG BY RANGE(GJAHR);

4. Archive Old Data
   Only replicate active data: WHERE ERDAT >= '20240101'

4. Network Optimization

Bandwidth and Latency

Test Network Performance:
├── Bandwidth: iperf -c target-system
├── Latency: ping -c 100 target-system
├── Packet Loss: mtr target-system
└── RFC Connection: SM59 → Connection Test

Optimization Techniques:
✅ Use dedicated network link for replication
✅ Enable compression (for low bandwidth)
✅ Increase RFC timeout for unstable networks
✅ Use QoS to prioritize replication traffic

Compression Configuration

LTRC → Advanced Settings → Network

Data Compression:
☑ Enable compression
Compression Level: ● Medium (balance)
                   ○ High (CPU intensive)
                   ○ Low (faster)

Result:
Before: 100 MB/s network usage
After:  30 MB/s (70% reduction)
Trade-off: +5% CPU usage

5. SLT Server Tuning

Parallel Job Optimization

Current: 8 jobs → 1,500 rec/s
Target:  24 jobs → 4,500 rec/s (3x improvement)

Configuration:
LTRC → Advanced Settings
Number of Parallel Jobs: [24]

Calculation:
Optimal Jobs = (CPU Cores × 2) - 4
Example: 16 cores × 2 - 4 = 28 jobs

Memory Configuration

Instance Profile: SLT_DVEBMGS00_hostname

# Increase memory allocations
em/initial_size_MB = 4096
em/max_size_MB = 8192
ztta/roll_extension = 4000000000

# Buffering
zcsa/table_buffer_area = 1000000
zcsa/db_max_buftab = 50000

Restart required: stopsap r3 / startsap r3

Job Scheduling Optimization

-- Distribute jobs evenly across work processes
Transaction: SM50 (Process Overview)

Before Tuning:
WP 0: ████████████████████  (100% busy)
WP 1: ████████████████████  (100% busy)
WP 2: ████░░░░░░░░░░░░░░░░  ( 20% busy)
WP 3: ██░░░░░░░░░░░░░░░░░░  ( 10% busy)

After Tuning:
WP 0: ████████████░░░░░░░░  ( 60% busy)
WP 1: ████████████░░░░░░░░  ( 60% busy)
WP 2: ███████████░░░░░░░░░  ( 55% busy)
WP 3: ███████████░░░░░░░░░  ( 55% busy)

6. Target System Optimization

HANA Database Tuning

-- Disable auto-merge during heavy replication
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
SET ('mergedog', 'active') = 'no' WITH RECONFIGURE;

-- Increase write buffer
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('persistence', 'log_buffer_size_kb') = '102400' WITH RECONFIGURE;

-- Optimize delta merge
ALTER TABLE SLTREPL.VBAP
AUTO MERGE OFF;

-- Manual merge during low-traffic period
MERGE DELTA OF SLTREPL.VBAP;

Index Management

-- Disable indexes during initial load
ALTER TABLE SLTREPL.MARA DISABLE INDEX ALL;

[Perform initial load]

-- Rebuild indexes after load
ALTER TABLE SLTREPL.MARA REBUILD INDEX ALL;

-- Result:
Initial Load Time:
  With indexes: 4 hours
  Without indexes: 1.5 hours (62% faster)

Partitioning Strategy

-- Partition large target tables by date
ALTER TABLE SLTREPL.BSEG PARTITION BY RANGE (GJAHR) (
  PARTITION P2023 VALUES < '2024',
  PARTITION P2024 VALUES < '2025',
  PARTITION P2025 VALUES < '2026',
  PARTITION P2026 VALUES < '2027'
);

-- Benefits:
- Faster inserts (partition-local)
- Better query performance
- Easier archival

7. Transformation Optimization

Push Transformations to Target

❌ Bad: Complex transformation in SLT
   - Slows replication
   - Increases latency
   - Uses SLT resources

✅ Good: Simple replication + HANA view
   - Fast replication
   - Transformation in HANA (optimized)
   - Leverages HANA parallelism

Example:
Instead of: SLT transformation (field mapping, calculations)
Use: Direct replication + Calculation View in HANA

Simplify Transformation Logic

" Before (Slow):
DATA: lv_result TYPE string.
LOOP AT lt_table INTO ls_row.
  lv_result = complex_calculation( ls_row ).
  ls_row-new_field = lv_result.
  MODIFY lt_table FROM ls_row.
ENDLOOP.

" After (Fast):
SELECT * FROM @lt_table AS t
  INNER JOIN calc_table AS c ON t~key = c~key
  INTO TABLE @lt_result.

8. Monitoring and Tuning

Real-Time Performance Dashboard

Transaction: LTRC → Performance Tab

Current Performance:
┌──────────────────────────────────────┐
│ Throughput: 3,245 rec/s ⬆ +45%     │
│ Latency:    0.8 seconds ⬇ -60%     │
│ CPU Usage:  68%         ⬇ -17%     │
│ Memory:     52%         → Stable    │
└──────────────────────────────────────┘

Top Consumers:
Table    Records/s  Latency  % Total
VBAP     1,234      0.5s     38%
BSEG     892        1.2s     27%
MSEG     654        0.7s     20%
MARA     465        0.3s     15%

-- Week-over-week comparison
SELECT 
  TO_VARCHAR(DATE_TRUNC('WEEK', TIMESTAMP), 'YYYY-WW') as WEEK,
  AVG(THROUGHPUT) as AVG_THROUGHPUT,
  AVG(LATENCY_MS) as AVG_LATENCY
FROM /DMIS/PERFORMANCE_STATS
WHERE TIMESTAMP >= ADD_MONTHS(CURRENT_DATE, -3)
GROUP BY DATE_TRUNC('WEEK', TIMESTAMP)
ORDER BY WEEK;

-- Result:
WEEK     AVG_THROUGHPUT  AVG_LATENCY  CHANGE
2025-50  1,500          5,000ms      baseline
2025-51  1,850          4,200ms      +23% ⬆
2026-01  2,400          2,800ms      +30% ⬆
2026-02  3,100          1,200ms      +29% ⬆
2026-03  3,850          800ms        +24% ⬆

9. Advanced Techniques

Technique 1: Micro-Batching

Instead of: Process each record individually
Use: Batch small changes together

Configuration:
Commit Frequency: Every 1,000 records
Batch Window: 5 seconds

Result:
- 40% reduction in commit overhead
- 25% throughput improvement

Technique 2: Table-Specific Tuning

Different tables, different strategies:

High-frequency tables (VBAP):
- Small batches (10K)
- More parallel jobs (16)
- Aggressive cleanup (hourly)

Low-frequency tables (MARA):
- Large batches (100K)
- Fewer jobs (4)
- Standard cleanup (daily)

Technique 3: Load Balancing

Distribute tables across multiple MT_IDs:

MT_ID_SALES_01:
├── VBAK (4 jobs)
├── VBAP (8 jobs)
└── VBFA (4 jobs)

MT_ID_FINANCE_01:
├── BKPF (6 jobs)
├── BSEG (10 jobs)
└── BSID (4 jobs)

Benefit: Parallel processing, isolated tuning

10. Performance Tuning Checklist

Quick Wins (1-2 hours)

Increase parallel jobs to 2× CPU cores
Enable data compression for remote targets
Clean up logging tables (delete processed)
Add indexes on frequently filtered fields
Disable unnecessary transformations

Medium Impact (1-2 days)

Optimize target database (HANA tuning)
Implement table partitioning
Push transformations to target views
Set up load balancing across MT_IDs
Configure micro-batching

Long-Term (ongoing)

Monitor and trend performance weekly
Capacity planning (quarterly)
Regular health checks
Performance regression testing
Document optimization changes

Summary

✅ Performance baseline and targets ✅ Bottleneck identification techniques ✅ Source, network, and target optimization ✅ SLT server tuning (jobs, memory) ✅ Transformation optimization strategies ✅ Real-time monitoring and trending ✅ Advanced tuning techniques ✅ Performance checklist

Next: Module 10 - Complex Scenarios and Advanced Use Cases

1. Performance Baseline​

Current Performance Assessment​

Performance Targets​

2. Bottleneck Identification​

Performance Analysis Flow​

Diagnostic Queries​

3. Source System Optimization​

Optimize Source Reads​

4. Network Optimization​

Bandwidth and Latency​

Compression Configuration​

5. SLT Server Tuning​

Parallel Job Optimization​

Memory Configuration​

Job Scheduling Optimization​

6. Target System Optimization​

HANA Database Tuning​

Index Management​

Partitioning Strategy​

7. Transformation Optimization​

Push Transformations to Target​

Simplify Transformation Logic​

8. Monitoring and Tuning​

Real-Time Performance Dashboard​

Performance Trending​

9. Advanced Techniques​

Technique 1: Micro-Batching​

Technique 2: Table-Specific Tuning​

Technique 3: Load Balancing​

10. Performance Tuning Checklist​

Quick Wins (1-2 hours)​

Medium Impact (1-2 days)​

Long-Term (ongoing)​

Summary​