File Operations Guide

Safe, Atomic File Operations in GehaSoftwareHub


Table of Contents

  1. Introduction
  2. Why Use the File Operations API?
  3. Quick Start
  4. Available Functions
  5. Exception Handling
  6. Best Practices
  7. Migration from Direct File I/O
  8. Summary

Introduction

GehaSoftwareHub uses a centralized file operations system that provides:

  • Atomic writes - Files are never left in a corrupted state
  • Windows compatibility - Handles antivirus and file indexing locks
  • Thread safety - Safe concurrent access to files
  • PathDef integration - Type-safe paths only
  • Consistent error handling - Clean exception hierarchy

Location: src/shared_services/file_operations/api.py
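
To make the thread-safety point concrete, here is a minimal sketch of one common approach: one lock per resolved path, so read-modify-write sequences on the same file cannot interleave. The names lock_for and bump_counter are hypothetical and are not taken from api.py; the real module may coordinate access differently.

```python
import threading
from pathlib import Path
from typing import Dict

_registry_guard = threading.Lock()           # protects the lock registry itself
_path_locks: Dict[str, threading.Lock] = {}  # one lock per resolved path


def lock_for(path: Path) -> threading.Lock:
    """Return the lock shared by every caller that targets the same file."""
    key = str(path.resolve())
    with _registry_guard:
        return _path_locks.setdefault(key, threading.Lock())


def bump_counter(path: Path) -> int:
    """Read-modify-write under the path's lock so concurrent bumps are not lost."""
    with lock_for(path):
        current = int(path.read_text()) if path.exists() else 0
        path.write_text(str(current + 1))
        return current + 1
```

A lock like this only serializes threads inside one process; it does not protect against other processes touching the same file.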


Why Use the File Operations API?

Problems with Direct File I/O

# WRONG - Direct file operations are unsafe
import json

with open("config.json", "w") as f:
    json.dump(config, f)
# Problems:
# - Not atomic: crash during write = corrupted file
# - No fsync: data may be lost on power failure
# - Raw string path: no type safety
# - No retry logic: fails on Windows file locks

The Safe Approach

# CORRECT - Use the file operations API
from src.shared_services.file_operations.api import save_json
from src.shared_services.constants.paths import Config

save_json(Config.UserSettings, config)
# Benefits:
# - Atomic: temp file + rename pattern
# - fsync: data flushed to disk
# - PathDef: type-safe, validated paths
# - Windows retry: handles antivirus locks
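
The "temp file + rename" and fsync points above can be sketched with the standard library alone. This is an illustration of the general pattern, not the code in api.py; atomic_write_bytes and save_json_sketch are hypothetical names, and they take a plain pathlib.Path rather than a PathDef to stay self-contained.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, payload: bytes) -> None:
    """Write payload so readers see the old file or the new one, never a partial one."""
    # Create the temp file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())    # push the data to disk before renaming
        os.replace(tmp_name, target)  # swap in one step (atomic on POSIX)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def save_json_sketch(target: Path, obj: object, indent: int = 2) -> None:
    """Serialize first, then hand the finished bytes to the atomic writer."""
    atomic_write_bytes(target, json.dumps(obj, indent=indent).encode("utf-8"))
```

Serializing before opening any file means an encoding error cannot leave a half-written temp file behind.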

Quick Start

from src.shared_services.file_operations.api import (
    save_json, load_json,
    save_msgpack, load_msgpack,
    save_text, load_text,
)
# PathDef objects (placeholder names; substitute the PathDefs your module defines)
from src.shared_services.constants.paths import MyConfigPath, DataPath, LogPath

# JSON - for configuration, settings, human-readable data
config = {"key": "value", "items": [1, 2, 3]}
save_json(MyConfigPath, config)
loaded = load_json(MyConfigPath)

# MessagePack - for binary data, large datasets
data = {"items": [1, 2, 3], "nested": {"key": "value"}}
save_msgpack(DataPath, data)
loaded = load_msgpack(DataPath)

# Text - for logs, plain text files
save_text(LogPath, "Application started\n")
content = load_text(LogPath)

Available Functions

JSON Operations

from src.shared_services.file_operations.api import save_json, load_json

# Save dictionary/list as JSON
save_json(ConfigPath, {"key": "value"})
save_json(ConfigPath, data, indent=4)  # Custom indentation

# Load JSON file
config = load_json(ConfigPath)

MessagePack Operations

from src.shared_services.file_operations.api import save_msgpack, load_msgpack

# Save object as MessagePack (efficient binary format)
save_msgpack(DataPath, {"items": [1, 2, 3]})

# Load MessagePack file
data = load_msgpack(DataPath)

Text Operations

from src.shared_services.file_operations.api import save_text, load_text

# Save text content
save_text(LogPath, "Log entry\n")
save_text(LogPath, text, encoding="latin-1")  # Custom encoding

# Load text file
content = load_text(LogPath)

Raw Bytes Operations

from src.shared_services.file_operations.api import atomic_write, atomic_read

# For custom serialization
atomic_write(BinaryPath, my_bytes)
data = atomic_read(BinaryPath)
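
As an example of what "custom serialization" might look like, the sketch below packs a list of 2D points into a compact binary layout with the standard struct module. The format, encode_points, and decode_points are invented for illustration; only the bytes they produce would be handed to atomic_write.

```python
import struct
from typing import List, Tuple

Point = Tuple[float, float]
_HEADER = struct.Struct("<I")   # little-endian unsigned point count
_POINT = struct.Struct("<dd")   # two little-endian float64 values per point


def encode_points(points: List[Point]) -> bytes:
    """Pack a count header followed by one fixed-size record per point."""
    chunks = [_HEADER.pack(len(points))]
    chunks.extend(_POINT.pack(x, y) for x, y in points)
    return b"".join(chunks)


def decode_points(blob: bytes) -> List[Point]:
    """Inverse of encode_points; rejects blobs whose length does not match the header."""
    (count,) = _HEADER.unpack_from(blob, 0)
    expected = _HEADER.size + count * _POINT.size
    if len(blob) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(blob)}")
    return [_POINT.unpack_from(blob, _HEADER.size + i * _POINT.size) for i in range(count)]
```

With the API, the round trip would then be atomic_write(BinaryPath, encode_points(points)) followed by decode_points(atomic_read(BinaryPath)).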

Exception Handling

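The exceptions defined by the API inherit from FileOperationError. The examples below also catch the built-in FileNotFoundError for a missing file: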

from src.shared_services.file_operations.api import (
    save_json,
    load_json,
    FileOperationError,
    AtomicWriteError,
    FileLockError,
    SerializationError,
)

# Catch all file operation errors
try:
    save_json(ConfigPath, config)
except FileOperationError as e:
    print(f"File operation failed: {e}")

# Specific error handling
try:
    data = load_json(DataPath)
except FileNotFoundError:
    data = {}  # Default if file doesn't exist
except SerializationError:
    print("Invalid JSON format")

Exception            When Raised
FileOperationError   Base exception for all file operations
AtomicWriteError     Write operation failed
FileLockError        File locked by another process (after retries)
SerializationError   JSON/msgpack encode/decode failed
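
The "after retries" behavior of FileLockError can be pictured as a small retry loop with exponential backoff. This is a sketch under assumptions: the two exception classes are local stand-ins that mirror the names above, with_lock_retry is a hypothetical helper, and the attempt count and delays are illustrative rather than the values the API uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FileOperationError(Exception):
    """Local stand-in for the API's base exception."""


class FileLockError(FileOperationError):
    """Local stand-in: the file stayed locked after all retries."""


def with_lock_retry(operation: Callable[[], T], attempts: int = 5, base_delay: float = 0.05) -> T:
    """Run operation, retrying on PermissionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except PermissionError as exc:
            # On Windows, antivirus or indexer locks typically surface as PermissionError.
            if attempt == attempts - 1:
                raise FileLockError(f"still locked after {attempts} attempts") from exc
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Because FileLockError derives from FileOperationError here, a caller that catches only the base class still sees the failure.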

Best Practices

Always Use PathDef

# CORRECT
from src.shared_services.constants.paths import Config
save_json(Config.UserSettings, data)

# WRONG - Raw strings are rejected
save_json("config.json", data)  # TypeError!
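
To show why a raw string would fail with TypeError, here is a rough sketch of the kind of guard such an API could run before touching the disk. The PathDef class below is a simplified, hypothetical stand-in and require_path_def is an invented helper; the real PathDef in GehaSoftwareHub likely carries more fields and validation.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PathDef:
    """Hypothetical stand-in: a named, predefined file location."""
    name: str
    relative_path: str

    def resolve(self, root: Path) -> Path:
        """Build the concrete path under an application root directory."""
        return root / self.relative_path


def require_path_def(value: object) -> PathDef:
    """Reject anything that is not a PathDef, including raw strings."""
    if not isinstance(value, PathDef):
        raise TypeError(f"expected PathDef, got {type(value).__name__}")
    return value
```

Failing early with TypeError keeps ad hoc string paths out of the codebase instead of letting them reach the filesystem.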

Choose the Right Format

Format        Use Case
JSON          Configuration, settings, human-readable data
MessagePack   Large datasets, binary data, performance-critical
Text          Logs, plain text, line-based files

Handle Missing Files Gracefully

from src.shared_services.file_operations.api import load_json
from src.shared_services.path_management.api import get_path

# Check existence first (the file can still vanish between the check and the load)
path = get_path(ConfigPath)
if path.exists():
    config = load_json(ConfigPath)
else:
    config = {}

# Or use try/except, which avoids that check-then-load gap
try:
    config = load_json(ConfigPath)
except FileNotFoundError:
    config = {}
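
If the try/except pattern repeats across a codebase, it can be folded into a small helper. The sketch below is hypothetical (load_or_default is not part of the API as documented here) and takes the loader as a parameter so that it works with load_json, load_msgpack, or load_text alike.

```python
from typing import Callable, TypeVar

P = TypeVar("P")
T = TypeVar("T")


def load_or_default(loader: Callable[[P], T], path_def: P, default: T) -> T:
    """Return loader(path_def), or default when the file does not exist."""
    try:
        return loader(path_def)
    except FileNotFoundError:
        # Only a missing file falls back; corrupt data still raises.
        return default
```

Usage would look like config = load_or_default(load_json, ConfigPath, {}). Errors other than a missing file, such as SerializationError, still propagate to the caller.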

Let Errors Propagate for Critical Files

# For critical files, let errors propagate
config = load_json(Config.Critical)  # Raises if missing/corrupt

Migration from Direct File I/O

Before (Unsafe)

import json

# Reading
with open(filepath, "r") as f:
    data = json.load(f)

# Writing
with open(filepath, "w") as f:
    json.dump(data, f, indent=2)

After (Safe)

from src.shared_services.file_operations.api import load_json, save_json

# Reading
data = load_json(path_def)

# Writing
save_json(path_def, data)

MessagePack Migration

# Before
import msgpack
with open(filepath, "rb") as f:
    data = msgpack.unpackb(f.read(), raw=False)

# After
from src.shared_services.file_operations.api import load_msgpack
data = load_msgpack(path_def)

Summary

Do                                   Don't
Use save_json(), load_json()         Use open() with json.dump()/load()
Use save_msgpack(), load_msgpack()   Use open() with msgpack.packb()/unpackb()
Use save_text(), load_text()         Use open() with .write()/.read()
Pass PathDef objects                 Pass raw string paths
Handle FileOperationError            Use bare except clauses

Reference: src/shared_services/file_operations/README.md