# File Operations Guide

Safe, Atomic File Operations in GehaSoftwareHub

## Table of Contents
- Introduction
- Why Use the File Operations API?
- Quick Start
- Available Functions
- Exception Handling
- Best Practices
- Migration from Direct File I/O
## Introduction

GehaSoftwareHub uses a centralized file operations system that provides:

- **Atomic writes** - files are never left in a corrupted state
- **Windows compatibility** - handles antivirus and file indexing locks
- **Thread safety** - safe concurrent access to files
- **PathDef integration** - type-safe paths only
- **Consistent error handling** - clean exception hierarchy

**Location:** `src/shared_services/file_operations/api.py`
## Why Use the File Operations API?

### Problems with Direct File I/O

```python
# WRONG - Direct file operations are unsafe
import json

with open("config.json", "w") as f:
    json.dump(config, f)

# Problems:
# - Not atomic: a crash during the write leaves a corrupted file
# - No fsync: data may be lost on power failure
# - Raw string path: no type safety
# - No retry logic: fails on Windows file locks
```
### The Safe Approach

```python
# CORRECT - Use the file operations API
from src.shared_services.file_operations.api import save_json
from src.shared_services.constants.paths import Config

save_json(Config.UserSettings, config)

# Benefits:
# - Atomic: temp file + rename pattern
# - fsync: data flushed to disk
# - PathDef: type-safe, validated paths
# - Windows retry: handles antivirus locks
```
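The "temp file + rename" pattern behind these guarantees can be sketched in plain Python. This is a simplified illustration only: `atomic_save_json` is a hypothetical helper, not part of the real API, which additionally handles PathDef objects and Windows retry logic.

```python
import json
import os
import tempfile

def atomic_save_json(path: str, data) -> None:
    """Illustrative temp-file + rename write (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the destination directory so the final
    # os.replace() stays on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # force data to disk before the rename
        os.replace(tmp_path, path)  # atomic on both POSIX and Windows
    except BaseException:
        os.unlink(tmp_path)  # never leave a half-written temp file behind
        raise
```

Because the rename either fully succeeds or fully fails, readers always see either the old file or the complete new one, never a partial write.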
## Quick Start

```python
from src.shared_services.file_operations.api import (
    save_json, load_json,
    save_msgpack, load_msgpack,
    save_text, load_text,
)
from src.shared_services.constants.paths import MyConfigPath  # PathDef
# (DataPath and LogPath below are analogous PathDef imports)

# JSON - for configuration, settings, human-readable data
config = {"key": "value", "items": [1, 2, 3]}
save_json(MyConfigPath, config)
loaded = load_json(MyConfigPath)

# MessagePack - for binary data, large datasets
data = {"items": [1, 2, 3], "nested": {"key": "value"}}
save_msgpack(DataPath, data)
loaded = load_msgpack(DataPath)

# Text - for logs, plain text files
save_text(LogPath, "Application started\n")
content = load_text(LogPath)
```
## Available Functions

### JSON Operations

```python
from src.shared_services.file_operations.api import save_json, load_json

# Save dictionary/list as JSON
save_json(ConfigPath, {"key": "value"})
save_json(ConfigPath, data, indent=4)  # Custom indentation

# Load JSON file
config = load_json(ConfigPath)
```
### MessagePack Operations

```python
from src.shared_services.file_operations.api import save_msgpack, load_msgpack

# Save object as MessagePack (efficient binary format)
save_msgpack(DataPath, {"items": [1, 2, 3]})

# Load MessagePack file
data = load_msgpack(DataPath)
```
### Text Operations

```python
from src.shared_services.file_operations.api import save_text, load_text

# Save text content
save_text(LogPath, "Log entry\n")
save_text(LogPath, text, encoding="latin-1")  # Custom encoding

# Load text file
content = load_text(LogPath)
```
### Raw Bytes Operations

```python
from src.shared_services.file_operations.api import atomic_write, atomic_read

# For custom serialization
atomic_write(BinaryPath, my_bytes)
data = atomic_read(BinaryPath)
```
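As one example of custom serialization, a compressed JSON format could be layered on top of `atomic_write`/`atomic_read`. This is a sketch: `encode` and `decode` are hypothetical helper names, and the commented lines show the intended pairing with the API above.

```python
import json
import zlib

def encode(data) -> bytes:
    """Hypothetical custom serializer: JSON compressed with zlib."""
    return zlib.compress(json.dumps(data).encode("utf-8"))

def decode(raw: bytes):
    """Inverse of encode(): decompress, then parse JSON."""
    return json.loads(zlib.decompress(raw).decode("utf-8"))

# Intended pairing with the raw bytes API above:
#   atomic_write(BinaryPath, encode(data))
#   data = decode(atomic_read(BinaryPath))
payload = encode({"items": [1, 2, 3]})
restored = decode(payload)
```

Keeping serialization separate from the write itself means the bytes handed to `atomic_write` still get the full atomicity and retry behavior.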
## Exception Handling

All exceptions inherit from `FileOperationError`:

```python
from src.shared_services.file_operations.api import (
    save_json,
    load_json,
    FileOperationError,
    AtomicWriteError,
    FileLockError,
    SerializationError,
)

# Catch all file operation errors
try:
    save_json(ConfigPath, config)
except FileOperationError as e:
    print(f"File operation failed: {e}")

# Specific error handling
try:
    data = load_json(DataPath)
except FileNotFoundError:
    data = {}  # Default if file doesn't exist
except SerializationError:
    print("Invalid JSON format")
```
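The hierarchy that makes base-class catches work might look like the following sketch (illustrative class bodies, not the library's real definitions):

```python
# Illustrative sketch of the exception hierarchy (not the real definitions)
class FileOperationError(Exception):
    """Base class for every file-operation failure."""

class AtomicWriteError(FileOperationError):
    """A write could not be completed atomically."""

class FileLockError(FileOperationError):
    """The file stayed locked after all retries."""

class SerializationError(FileOperationError):
    """Encoding or decoding the payload failed."""

# Catching the base class therefore covers every specific error:
try:
    raise FileLockError("config.json is locked")
except FileOperationError as exc:
    message = str(exc)
```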
| Exception | When Raised |
|---|---|
| `FileOperationError` | Base exception for all file operations |
| `AtomicWriteError` | Write operation failed |
| `FileLockError` | File locked by another process (after retries) |
| `SerializationError` | JSON/msgpack encode/decode failed |
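The "after retries" qualifier on `FileLockError` refers to a retry-with-backoff loop around transiently locked files (for example, a file briefly held by an antivirus scanner on Windows). The general shape is sketched below; the names, attempt count, and delays are illustrative, not the library's actual implementation.

```python
import time

def retry_on_lock(operation, attempts: int = 5, base_delay: float = 0.01):
    """Retry an operation that may hit a transient Windows file lock."""
    for attempt in range(attempts):
        try:
            return operation()
        except PermissionError:  # what a locked file typically raises on Windows
            if attempt == attempts - 1:
                raise  # the real API would surface this as FileLockError
            # Exponential backoff: wait longer after each failed attempt
            time.sleep(base_delay * (2 ** attempt))

# Example: an operation that is "locked" for the first two calls
calls = {"n": 0}

def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise PermissionError("file is locked")
    return "written"

result = retry_on_lock(flaky_write)
```

This is why `FileLockError` usually indicates a genuinely stuck file, not a momentary scan: the short-lived locks have already been waited out.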
## Best Practices

### Always Use PathDef

```python
# CORRECT
from src.shared_services.constants.paths import Config
save_json(Config.UserSettings, data)

# WRONG - Raw strings are rejected
save_json("config.json", data)  # TypeError!
```
### Choose the Right Format
| Format | Use Case |
|---|---|
| JSON | Configuration, settings, human-readable data |
| MessagePack | Large datasets, binary data, performance-critical |
| Text | Logs, plain text, line-based files |
### Handle Missing Files Gracefully

```python
from src.shared_services.file_operations.api import load_json
from src.shared_services.path_management.api import get_path

# Check existence first
path = get_path(ConfigPath)
if path.exists():
    config = load_json(ConfigPath)
else:
    config = {}

# Or use try/except
try:
    config = load_json(ConfigPath)
except FileNotFoundError:
    config = {}
```
### Let Errors Propagate for Critical Files

```python
# For critical files, let errors propagate to the caller
config = load_json(Config.Critical)  # Raises if missing/corrupt
```
## Migration from Direct File I/O

### Before (Unsafe)

```python
import json

# Reading
with open(filepath, "r") as f:
    data = json.load(f)

# Writing
with open(filepath, "w") as f:
    json.dump(data, f, indent=2)
```
### After (Safe)

```python
from src.shared_services.file_operations.api import load_json, save_json

# Reading
data = load_json(path_def)

# Writing
save_json(path_def, data)
```
### MessagePack Migration

```python
# Before
import msgpack
with open(filepath, "rb") as f:
    data = msgpack.unpackb(f.read(), raw=False)

# After
from src.shared_services.file_operations.api import load_msgpack
data = load_msgpack(path_def)
```
## Summary

| Do | Don't |
|---|---|
| Use `save_json()`, `load_json()` | Use `open()` with `json.dump()`/`json.load()` |
| Use `save_msgpack()`, `load_msgpack()` | Use `open()` with `msgpack.packb()`/`msgpack.unpackb()` |
| Use `save_text()`, `load_text()` | Use `open()` with `.write()`/`.read()` |
| Pass PathDef objects | Pass raw string paths |
| Handle `FileOperationError` | Use bare `except` clauses |
**Reference:** `src/shared_services/file_operations/README.md`