Module 8 - File I/O
This module covers file input/output in Python. You'll learn how to read and write files, use the with statement, work with different file formats, and use the modern pathlib module.
1. Opening and Closing Files
1.1 Basic File Operations
# Open file
file = open('example.txt', 'r')
# Read content
content = file.read()
print(content)
# Close file (important!)
file.close()
1.2 File Modes
| Mode | Description | Creates if Not Exists |
|---|---|---|
| 'r' | Read (default) | No (error) |
| 'w' | Write (overwrites) | Yes |
| 'a' | Append | Yes |
| 'x' | Exclusive create | Yes (error if exists) |
| 'r+' | Read and write | No (error) |
| 'w+' | Read and write (overwrites) | Yes |
| 'a+' | Read and append | Yes |
| 'rb' | Read binary | No (error) |
| 'wb' | Write binary (overwrites) | Yes |
# Read mode
file = open('data.txt', 'r')
# Write mode (overwrites)
file = open('output.txt', 'w')
# Append mode
file = open('log.txt', 'a')
# Binary mode
file = open('image.jpg', 'rb')
Failing to close files can lead to resource leaks and data loss. Use the with statement to ensure files are closed automatically.
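To make the mode table concrete, here is a minimal sketch that exercises 'w', 'a', and 'x' against a scratch file. The file name demo.txt and the use of a temporary directory are assumptions chosen for illustration, not part of the module's examples.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'demo.txt'  # hypothetical file name

    # 'w' creates the file, or truncates it if it already exists.
    with open(target, 'w') as f:
        f.write('first\n')
    with open(target, 'w') as f:
        f.write('second\n')
    after_write = target.read_text()

    # 'a' keeps the existing content and adds to the end.
    with open(target, 'a') as f:
        f.write('third\n')
    after_append = target.read_text()

    # 'x' refuses to open a file that already exists.
    try:
        with open(target, 'x') as f:
            f.write('never written\n')
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_write))
print(repr(after_append))
print(exclusive_failed)
```

Mode 'x' is a reasonable choice when overwriting an existing file by accident would be costly, since it fails loudly instead of truncating.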
2. The with Statement
2.1 Context Manager
# Automatically closes file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed here
# Multiple files
with open('input.txt', 'r') as infile, \
     open('output.txt', 'w') as outfile:
    content = infile.read()
    outfile.write(content.upper())
2.2 Why Use with?
# Without with (manual cleanup: easy to forget the try/finally)
file = open('data.txt', 'r')
try:
    data = file.read()
    process(data)  # process() is a placeholder for your own function
finally:
    file.close()  # Must remember to close
# With with (safe)
with open('data.txt', 'r') as file:
    data = file.read()
    process(data)
# Automatically closed, even if error occurs
Always use the with statement for file operations. It ensures proper resource management.
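The claim that the file is closed even when an error occurs can be checked directly. The sketch below is an illustration under assumptions: the file name data.txt and its contents are invented, and a ValueError is triggered on purpose inside the with block.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'data.txt'  # hypothetical file name
    target.write_text('42\nnot a number\n')

    handle = None
    try:
        with open(target, 'r') as file:
            handle = file  # keep a reference so it can be inspected afterwards
            for line in file:
                int(line)  # raises ValueError on the second line
    except ValueError:
        pass

    # The with block was left through an exception, yet the file is closed.
    closed_after_error = handle.closed

print(closed_after_error)
```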
3. Reading Files
3.1 Reading Methods
# read() - Read entire file
with open('file.txt', 'r') as file:
    content = file.read()
    print(content)
# readline() - Read one line
with open('file.txt', 'r') as file:
    line1 = file.readline()
    line2 = file.readline()
    print(line1, line2)
# readlines() - Read all lines into list
with open('file.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())
# Iterate over file (memory efficient)
with open('file.txt', 'r') as file:
    for line in file:
        print(line.strip())
3.2 Reading with Size Limit
# Read first 100 characters
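with open('large_file.txt', 'r') as file:
    content = file.read(100)
    print(content)
# Read in chunks
with open('large_file.txt', 'r') as file:
    while True:
        chunk = file.read(1024)  # Up to 1024 characters per chunk (text mode counts characters, not bytes)
        if not chunk:
            break
        process(chunk)  # process() is a placeholder for your own function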
3.3 Handling Encodings
# Specify encoding
with open('file.txt', 'r', encoding='utf-8') as file:
    content = file.read()
# Handle encoding errors (errors='ignore' silently drops undecodable bytes)
with open('file.txt', 'r', encoding='utf-8', errors='ignore') as file:
    content = file.read()
# Common encodings: utf-8, ascii, latin-1, cp1252
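The effect of a mismatched encoding is easier to see with a concrete file. The following sketch writes a word as Latin-1 and then reads it back as UTF-8 with different error handlers. The file name menu.txt and the sample text are assumptions for illustration.

```python
import tempfile
from pathlib import Path

text = 'caf\u00e9'  # "café"; the escape keeps this source file pure ASCII

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'menu.txt'  # hypothetical file name

    # Written as Latin-1, the accented letter becomes the single byte 0xE9.
    with open(target, 'w', encoding='latin-1') as f:
        f.write(text)

    # Reading with the matching encoding round-trips cleanly.
    with open(target, 'r', encoding='latin-1') as f:
        matching = f.read()

    # A lone 0xE9 byte is not valid UTF-8, so strict decoding fails.
    try:
        with open(target, 'r', encoding='utf-8') as f:
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # errors='ignore' drops the bad byte; errors='replace' marks it with U+FFFD.
    with open(target, 'r', encoding='utf-8', errors='ignore') as f:
        ignored = f.read()
    with open(target, 'r', encoding='utf-8', errors='replace') as f:
        replaced = f.read()

print(matching == text, strict_failed)
print(repr(ignored), repr(replaced))
```

Because 'ignore' loses data without any signal, 'replace' or an explicit fallback encoding is often the safer default when the source encoding is unknown.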
4. Writing Files
4.1 Writing Methods
# write() - Write string
with open('output.txt', 'w') as file:
    file.write('Hello, World!\n')
    file.write('Second line\n')
# writelines() - Write list of strings (no newlines are added for you)
lines = ['Line 1\n', 'Line 2\n', 'Line 3\n']
with open('output.txt', 'w') as file:
    file.writelines(lines)
# Using print()
with open('output.txt', 'w') as file:
    print('Hello, World!', file=file)
    print('Second line', file=file)
4.2 Append Mode
# Append to existing file
with open('log.txt', 'a') as file:
    file.write('New log entry\n')
# Append with timestamp
from datetime import datetime
with open('log.txt', 'a') as file:
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    file.write(f'[{timestamp}] Event occurred\n')
4.3 Writing Formatted Data
# Format strings
data = [
    ('Alice', 25, 'NYC'),
    ('Bob', 30, 'LA'),
    ('Charlie', 35, 'Chicago')
]
with open('people.txt', 'w') as file:
    for name, age, city in data:
        file.write(f'{name:15} {age:3} {city:15}\n')
# CSV-like format
with open('data.csv', 'w') as file:
    file.write('Name,Age,City\n')
    for name, age, city in data:
        file.write(f'{name},{age},{city}\n')
5. Working with Paths (pathlib)
5.1 Path Objects
from pathlib import Path
# Create path object
path = Path('data/file.txt')
path = Path.home() / 'documents' / 'file.txt'
# Current directory
current = Path.cwd()
print(current)
# Home directory
home = Path.home()
print(home)
# Absolute path
absolute = path.resolve()
print(absolute)
5.2 Path Properties
from pathlib import Path
path = Path('/home/user/documents/report.txt')
print(path.name) # 'report.txt'
print(path.stem) # 'report'
print(path.suffix) # '.txt'
print(path.parent) # '/home/user/documents'
print(path.parts) # ('/', 'home', 'user', 'documents', 'report.txt')
print(path.is_absolute()) # True
# Multiple extensions
path = Path('archive.tar.gz')
print(path.suffixes) # ['.tar', '.gz']
5.3 Path Operations
from pathlib import Path
# Check existence
path = Path('file.txt')
print(path.exists())
print(path.is_file())
print(path.is_dir())
# Create directory
directory = Path('new_folder')
directory.mkdir(exist_ok=True)  # Don't error if exists
directory.mkdir(parents=True, exist_ok=True)  # Also create missing parent directories
# List directory contents
directory = Path('.')
for item in directory.iterdir():
    print(item)
# Glob patterns
for txt_file in directory.glob('*.txt'):
    print(txt_file)
# Recursive glob
for py_file in directory.rglob('*.py'):
    print(py_file)
5.4 Reading/Writing with pathlib
from pathlib import Path
path = Path('data.txt')
# Read text
content = path.read_text()
print(content)
# Write text
path.write_text('Hello, World!')
# Read bytes
binary_data = path.read_bytes()
# Write bytes
path.write_bytes(b'Binary data')
# Read lines
lines = path.read_text().splitlines()
pathlib is the modern, object-oriented way to work with paths. Prefer it over os.path for new code.
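As a rough side-by-side of the two styles, the sketch below extracts the same facts from one path with os.path and with pathlib. The path string is hypothetical, and PurePosixPath is used only so the output is identical on every platform; everyday code would normally use Path.

```python
import os.path
from pathlib import PurePosixPath

raw = 'project/data/report.final.csv'  # hypothetical path

# os.path: free functions that take and return plain strings.
os_name = os.path.basename(raw)
os_stem, os_ext = os.path.splitext(os_name)
os_parent = os.path.dirname(raw)
os_renamed = os.path.splitext(raw)[0] + '.json'

# pathlib: the same facts as attributes and methods of one object.
p = PurePosixPath(raw)
pl_name = p.name
pl_stem, pl_ext = p.stem, p.suffix
pl_parent = str(p.parent)
pl_renamed = str(p.with_suffix('.json'))

print(os_name == pl_name, os_stem == pl_stem, os_ext == pl_ext)
print(os_parent == pl_parent, os_renamed == pl_renamed)
```

Both approaches give the same answers here. The pathlib version keeps everything on one object, which tends to read better once several operations are chained.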
6. Working with CSV Files
6.1 Reading CSV
import csv
# Read CSV
with open('data.csv', 'r', newline='') as file:  # newline='' is recommended by the csv docs
    reader = csv.reader(file)
    header = next(reader)  # Skip header
    for row in reader:
        print(row)
# Read as dictionary
with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['name'], row['age'])
# Example data.csv:
# name,age,city
# Alice,25,NYC
# Bob,30,LA
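One detail worth knowing when reading CSV: by default the csv module returns every field as a string, so numeric columns need an explicit conversion. The sketch below uses an in-memory io.StringIO as a stand-in for the example data.csv above, which is an assumption made so the snippet runs without a file on disk.

```python
import csv
import io

# In-memory stand-in for the example data.csv shown above.
sample = io.StringIO('name,age,city\nAlice,25,NYC\nBob,30,LA\n')

rows = list(csv.DictReader(sample))
raw_age = rows[0]['age']  # a str, not an int

# Convert explicitly before doing arithmetic.
ages = [int(row['age']) for row in rows]
average_age = sum(ages) / len(ages)

print(type(raw_age).__name__, ages, average_age)
```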
6.2 Writing CSV
import csv
# Write CSV
data = [
    ['Name', 'Age', 'City'],
    ['Alice', 25, 'NYC'],
    ['Bob', 30, 'LA']
]
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
# Write from dictionary
data = [
    {'name': 'Alice', 'age': 25, 'city': 'NYC'},
    {'name': 'Bob', 'age': 30, 'city': 'LA'}
]
with open('output.csv', 'w', newline='') as file:
    fieldnames = ['name', 'age', 'city']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)
6.3 CSV Options
import csv
# Custom delimiter
with open('data.tsv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')
# Custom quoting
with open('data.csv', 'w', newline='') as file:
    writer = csv.writer(file, quoting=csv.QUOTE_ALL)
    writer.writerow(['Value with, comma', 'Normal value'])
# Different dialect
with open('data.csv', 'w', newline='') as file:
    writer = csv.writer(file, dialect='excel')
7. Working with JSON
7.1 Reading JSON
import json
# Read JSON file
with open('data.json', 'r') as file:
    data = json.load(file)
    print(data)
# Parse JSON string
json_string = '{"name": "Alice", "age": 25}'
data = json.loads(json_string)
print(data['name'])
# Example data.json:
# {
#   "users": [
#     {"name": "Alice", "age": 25},
#     {"name": "Bob", "age": 30}
#   ]
# }
7.2 Writing JSON
import json
# Write to JSON file
data = {
    "name": "Alice",
    "age": 25,
    "city": "NYC",
    "hobbies": ["reading", "coding"]
}
with open('output.json', 'w') as file:
    json.dump(data, file, indent=4)
# Convert to JSON string
json_string = json.dumps(data, indent=2)
print(json_string)
# Pretty print
print(json.dumps(data, indent=4, sort_keys=True))
7.3 Custom JSON Encoding
import json
from datetime import datetime
# Custom encoder
class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects json cannot serialize on its own
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)
data = {
    "event": "Meeting",
    "timestamp": datetime.now()
}
json_string = json.dumps(data, cls=DateTimeEncoder)
print(json_string)
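Encoding a datetime as an ISO 8601 string is only half of a round trip: json.loads returns the timestamp as a plain string, so it has to be converted back by hand. A minimal sketch, assuming a fixed, made-up timestamp so the result is reproducible:

```python
import json
from datetime import datetime

original = datetime(2024, 1, 15, 9, 30, 0)  # fixed, hypothetical timestamp

# Encode: store the datetime as an ISO 8601 string.
encoded = json.dumps({'event': 'Meeting', 'timestamp': original.isoformat()})

# Decode: the value comes back as a str and must be parsed explicitly.
decoded = json.loads(encoded)
restored = datetime.fromisoformat(decoded['timestamp'])

print(encoded)
print(restored == original)
```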
8. File System Operations
8.1 Using os Module
import os
# Current directory
print(os.getcwd())
# Change directory
os.chdir('/path/to/directory')
# List directory
files = os.listdir('.')
print(files)
# Create directory
os.mkdir('new_folder')
os.makedirs('path/to/folder', exist_ok=True)
# Remove file
os.remove('file.txt')
# Remove directory
os.rmdir('empty_folder')
# Remove directory tree
import shutil
shutil.rmtree('folder_with_contents')
8.2 File Information
import os
from pathlib import Path
# File size
size = os.path.getsize('file.txt')
print(f'Size: {size} bytes')
# Using pathlib
path = Path('file.txt')
size = path.stat().st_size
print(f'Size: {size} bytes')
# File timestamps
mtime = path.stat().st_mtime # Modification time
ctime = path.stat().st_ctime # Metadata change time on Unix; creation time on Windows
from datetime import datetime
mod_time = datetime.fromtimestamp(mtime)
print(f'Modified: {mod_time}')
# Check permissions
print(f'Readable: {os.access("file.txt", os.R_OK)}')
print(f'Writable: {os.access("file.txt", os.W_OK)}')
print(f'Executable: {os.access("file.txt", os.X_OK)}')
8.3 Copy and Move Files
import shutil
# Copy file
shutil.copy('source.txt', 'destination.txt')
shutil.copy2('source.txt', 'dest.txt') # Preserve metadata
# Copy directory
shutil.copytree('source_dir', 'dest_dir')
# Move file/directory
shutil.move('source.txt', 'destination.txt')
# Using pathlib
from pathlib import Path
source = Path('source.txt')
destination = Path('destination.txt')
source.rename(destination) # Move/rename
9. Temporary Files
9.1 Working with Temporary Files
import tempfile
# Temporary file (auto-deleted)
with tempfile.TemporaryFile(mode='w+t') as temp:
    temp.write('Temporary data')
    temp.seek(0)
    print(temp.read())
# File deleted automatically
# Named temporary file (delete=False keeps it on disk; remove it yourself when done)
with tempfile.NamedTemporaryFile(mode='w+t', delete=False) as temp:
    temp.write('Temporary data')
    temp_name = temp.name
print(f'Temp file: {temp_name}')
# Temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
    print(f'Temp directory: {temp_dir}')
    # Use directory
# Directory deleted automatically
10. Error Handling
10.1 File Exceptions
# FileNotFoundError
try:
    with open('nonexistent.txt', 'r') as file:
        content = file.read()
except FileNotFoundError:
    print('File not found!')
# PermissionError
try:
    with open('/root/protected.txt', 'w') as file:
        file.write('data')
except PermissionError:
    print('Permission denied!')
# IsADirectoryError
try:
    with open('some_directory', 'r') as file:
        pass
except IsADirectoryError:
    print('This is a directory, not a file!')
# Multiple exceptions
try:
    with open('file.txt', 'r') as file:
        content = file.read()
except (FileNotFoundError, PermissionError) as e:
    print(f'Error: {e}')
10.2 Safe File Operations
import shutil
from pathlib import Path
def safe_read(filename):
    """Safely read file with error handling."""
    path = Path(filename)
    if not path.exists():
        return None
    if not path.is_file():
        raise ValueError(f'{filename} is not a file')
    try:
        return path.read_text()
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this fallback cannot fail to decode
        return path.read_text(encoding='latin-1')
    except Exception as e:
        print(f'Error reading file: {e}')
        return None
def safe_write(filename, content):
    """Safely write file with backup."""
    path = Path(filename)
    # Create backup if file exists (e.g. data.txt -> data.bak)
    if path.exists():
        backup = path.with_suffix('.bak')
        shutil.copy2(path, backup)
    try:
        path.write_text(content)
        return True
    except Exception as e:
        print(f'Error writing file: {e}')
        return False
11. Best Practices
11.1 File Handling Guidelines
# ✅ Good: Use with statement
with open('file.txt', 'r') as file:
    content = file.read()
# ❌ Bad: Manual file closing
file = open('file.txt', 'r')
content = file.read()
file.close()
# ✅ Good: Use pathlib
from pathlib import Path
path = Path('data') / 'file.txt'
# ❌ Bad: String concatenation
path = 'data' + '/' + 'file.txt'
# ✅ Good: Iterate over file
with open('large_file.txt', 'r') as file:
    for line in file:
        process(line)
# ❌ Bad: Read entire file
with open('large_file.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        process(line)
11.2 Performance Tips
# Set an explicit buffer size (files are already buffered by default)
with open('file.txt', 'r', buffering=8192) as file:
    content = file.read()
# Process large files in chunks
def process_large_file(filename, chunk_size=1024*1024):
    with open(filename, 'rb') as file:
        while True:
            chunk = file.read(chunk_size)  # Up to 1 MiB of bytes per read
            if not chunk:
                break
            process(chunk)  # process() is a placeholder for your own function
# Use generators for memory efficiency
def read_lines(filename):
    with open(filename, 'r') as file:
        for line in file:
            yield line.strip()
for line in read_lines('large_file.txt'):
    process(line)
12. Summary
| Concept | Description | Example |
|---|---|---|
| open() | Open file | open('file.txt', 'r') |
| with | Context manager | with open(...) as f: |
| read() | Read entire file | file.read() |
| readline() | Read one line | file.readline() |
| write() | Write to file | file.write('text') |
| Path | Path object | Path('file.txt') |
| csv | CSV operations | csv.reader(file) |
| json | JSON operations | json.load(file) |
- Always use the with statement for file operations
- Use pathlib for modern path handling
- Handle file exceptions appropriately
- Use appropriate encodings (UTF-8 is standard)
- Process large files in chunks
- Use CSV/JSON modules for structured data
13. What's Next?
In Module 9 - Exception Handling, you'll learn:
- Try-except-finally blocks
- Raising exceptions
- Creating custom exceptions
- Exception hierarchies
- Best practices for error handling
14. Practice Exercises
Exercise 1: File Statistics
Create a program that analyzes a text file and reports:
- Number of lines
- Number of words
- Number of characters
- Most common words
Exercise 2: Log File Parser
Parse a log file and extract error messages into a separate file.
Exercise 3: CSV to JSON Converter
Convert a CSV file to JSON format.
Exercise 4: File Organizer
Create a script that organizes files in a directory by extension.
Exercise 5: Configuration File Handler
Create a module that reads and writes configuration files in JSON format.
Try solving these exercises on your own first. Solutions will be provided in the practice section.