Module 32 - Performance Optimization
Learn techniques to profile and optimize Python code for better performance.
1. Profiling Code
Using timeit
```python
import timeit

# Time a single statement
elapsed = timeit.timeit('[x**2 for x in range(1000)]', number=1000)
print(f"Time: {elapsed:.4f} seconds")

# Compare approaches
list_comp = timeit.timeit('[x**2 for x in range(1000)]', number=10000)
map_func = timeit.timeit('list(map(lambda x: x**2, range(1000)))', number=10000)
print(f"List comprehension: {list_comp:.4f}s")
print(f"Map function: {map_func:.4f}s")
```
Using cProfile
```python
import cProfile

def slow_function():
    total = 0
    for i in range(1000000):
        total += i ** 2
    return total

cProfile.run('slow_function()')
```
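`cProfile.run` also accepts a `sort` argument, and the `pstats` module gives finer control over the report. A short sketch using only the standard library:

```python
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Sort by cumulative time and show the 10 most expensive entries
pstats.Stats(profiler).sort_stats('cumulative').print_stats(10)
```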
2. Common Optimization Techniques
Use Built-in Functions
```python
numbers = list(range(10000))

# ❌ Slow: an explicit Python-level loop
def sum_list(numbers):
    total = 0
    for num in numbers:
        total += num
    return total

# ✅ Fast: sum() runs the loop in C
total = sum(numbers)
```
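To see the gap on your machine, you can time both versions (a quick sketch reusing the definitions above; exact numbers vary):

```python
import timeit

print(timeit.timeit(lambda: sum_list(numbers), number=1000))  # manual loop
print(timeit.timeit(lambda: sum(numbers), number=1000))       # built-in
```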
List Comprehensions vs Loops
```python
import timeit

# ❌ Slower
def with_loop():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

# ✅ Faster
def with_comprehension():
    return [i ** 2 for i in range(1000)]

print(timeit.timeit(with_loop, number=10000))
print(timeit.timeit(with_comprehension, number=10000))
```
3. Generator Expressions
```python
# Memory efficient for large datasets
numbers = (x ** 2 for x in range(1000000))  # Generator
# vs
numbers = [x ** 2 for x in range(1000000)]  # List (uses more memory)

# Use generators for iteration
total = sum(x ** 2 for x in range(1000000))
```
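`sys.getsizeof` makes the difference concrete: a generator is a fixed-size object no matter how many items it will yield, while the list must hold a pointer to every element. The sizes in the comments are rough CPython figures and vary by version:

```python
import sys

gen = (x ** 2 for x in range(1000000))
lst = [x ** 2 for x in range(1000000)]

print(sys.getsizeof(gen))  # ~200 bytes, regardless of the range size
print(sys.getsizeof(lst))  # ~8 MB of pointers (the ints are stored separately)
```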
4. Use Local Variables
```python
# ❌ Slower (global lookup)
value = 100

def calculate():
    return value ** 2

# ✅ Faster (local variable)
def calculate(value=100):
    return value ** 2
```
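The difference comes from how name resolution works: locals are resolved by array index, globals by dictionary lookup on every access. A rough benchmark sketch; the gap is real but usually modest, so measure before restructuring code around it:

```python
import timeit

value = 100

def calc_global():
    total = 0
    for _ in range(10000):
        total += value ** 2  # global lookup on each iteration
    return total

def calc_local(value=100):
    total = 0
    for _ in range(10000):
        total += value ** 2  # local lookup on each iteration
    return total

print(timeit.timeit(calc_global, number=1000))
print(timeit.timeit(calc_local, number=1000))
```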
5. Avoid Repeated Calculations
```python
# ❌ Slow (index into the list on every iteration)
for i in range(len(my_list)):
    print(my_list[i])

# ✅ Fast (iterate directly)
for item in my_list:
    print(item)

# ❌ Slow (repeated len call)
for i in range(len(items)):
    if i < len(items) - 1:
        ...

# ✅ Fast (calculate once)
length = len(items)
for i in range(length):
    if i < length - 1:
        ...
```
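The same idea applies to attribute lookups inside hot loops: binding a bound method to a local name avoids repeating the lookup on every iteration. This is a micro-optimization, so profile before reaching for it:

```python
# ❌ result.append is looked up on every iteration
result = []
for i in range(10000):
    result.append(i)

# ✅ bind the method once
result = []
append = result.append
for i in range(10000):
    append(i)
```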
6. Use Sets for Membership Testing
```python
import timeit

data_list = list(range(10000))
data_set = set(range(10000))

# List lookup: O(n)
list_time = timeit.timeit('9999 in data_list', globals=globals(), number=10000)

# Set lookup: O(1) on average
set_time = timeit.timeit('9999 in data_set', globals=globals(), number=10000)

print(f"List: {list_time:.4f}s")
print(f"Set: {set_time:.4f}s")
print(f"Set is {list_time/set_time:.0f}x faster")
```
7. String Concatenation
```python
# ❌ Slow for many strings
result = ""
for i in range(10000):
    result += str(i)

# ✅ Fast
result = ''.join(str(i) for i in range(10000))
```
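A quick `timeit` comparison shows why this matters as the string grows (a sketch; the gap widens with size):

```python
import timeit

def concat_plus():
    result = ""
    for i in range(10000):
        result += str(i)
    return result

def concat_join():
    return ''.join(str(i) for i in range(10000))

print(timeit.timeit(concat_plus, number=100))
print(timeit.timeit(concat_join, number=100))
```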
8. Use numpy for Numerical Operations
```python
import numpy as np
import timeit

# Python list
python_list = list(range(1000000))
list_time = timeit.timeit(
    lambda: [x * 2 for x in python_list],
    number=10
)

# NumPy array (vectorized: the loop runs in C)
numpy_array = np.arange(1000000)
numpy_time = timeit.timeit(
    lambda: numpy_array * 2,
    number=10
)

print(f"List: {list_time:.4f}s")
print(f"NumPy: {numpy_time:.4f}s")
print(f"NumPy is {list_time/numpy_time:.0f}x faster")
```
9. Caching Results
```python
from functools import lru_cache

# Without caching: exponential number of recursive calls
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# With caching: each n is computed only once
@lru_cache(maxsize=None)
def fibonacci_cached(n):
    if n < 2:
        return n
    return fibonacci_cached(n-1) + fibonacci_cached(n-2)

# fibonacci(35) takes seconds
# fibonacci_cached(35) is instant
```
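`lru_cache` also exposes cache statistics and a reset method; on Python 3.9+, `functools.cache` is shorthand for `lru_cache(maxsize=None)`:

```python
fibonacci_cached(35)

# Hit/miss counts (exact numbers depend on prior calls)
print(fibonacci_cached.cache_info())
# e.g. CacheInfo(hits=33, misses=36, maxsize=None, currsize=36)

fibonacci_cached.cache_clear()  # empty the cache
```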
10. Memory Profiling
```bash
pip install memory_profiler
```
```python
from memory_profiler import profile

@profile
def memory_intensive():
    big_list = [i for i in range(1000000)]
    return sum(big_list)

memory_intensive()
```

Run: `python -m memory_profiler script.py`
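If you prefer to avoid a third-party dependency, the standard library's `tracemalloc` tracks allocations made by Python code. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()
big_list = [i for i in range(1000000)]
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
print(f"Current: {current / 1e6:.1f} MB, Peak: {peak / 1e6:.1f} MB")
tracemalloc.stop()
```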
Summary
✅ Profile before optimizing
✅ Use built-in functions and methods
✅ Prefer list comprehensions over loops
✅ Use generators for memory efficiency
✅ Cache expensive function results
✅ Use appropriate data structures
Next Steps
In Module 33, you'll learn:
- Design patterns in Python
- Singleton, Factory, Observer
- When to use each pattern
- Real-world examples
Practice Exercises
- Profile a slow function and identify bottlenecks
- Optimize a data processing pipeline
- Compare performance of different algorithms
- Implement caching for expensive operations
- Optimize memory usage in a large application