Python Multiprocessing: A Finesse Approach
git commit -m "always keep learning"
A declaration
This content is not AI-generated whatsoever, so help me God.

Introduction
It’s 2025, and 18-year-old me would never have thought I’d be writing a blog post about Python. As a platform engineer, I find myself running database migrations. Over time, I found that Python scripts are something I can depend on to get the job done: simulating migrations without hard writes, and so on.
Today, we aren’t here to talk about migrations, but about how I’m leveraging Python multiprocessing to run them quickly. Note that multiprocessing is not concurrency but “true” parallelism, i.e. doing multiple things at the same time.
A back story
Say you are a painter and you have 100 rooms to paint in an apartment. Your options:
Single Processing (traditional approach)
You are one painter who has to paint room 1; once finished, you move to room 2, and so on until you reach room 100.
The only advantage is memory: one painter takes up only as much space as the room they are in. The downside is time: say you take 30 mins per room, that would be 30 × 100 = 3000 mins, roughly 50 hours.
Multi Processing
Another approach: say you hire 10 painters. In this case, they represent processes, and each painter takes up some memory space of their own during execution. Painter 1 takes rooms 1 to 10, painter 2 takes rooms 11 to 20, and so forth. With a bit of memory sacrificed, you save time: 50 hours is cut down to 10 rooms × 30 mins = 300 mins, roughly 5 hours. Which is way faster.
Python Interpretation
The above can be represented by the code snippet below
```python
from multiprocessing import Pool

# This is like having one painter
for room in rooms:
    paint_room(room)  # Takes 30 minutes each

# This is like having multiple painters
with Pool(processes=10) as pool:
    pool.map(paint_room, rooms)  # All painters work simultaneously
```
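To see the speed-up end to end, here is a self-contained sketch of the same idea, where each “room” is simulated with a short sleep instead of 30 minutes (the `paint_room` body, the 0.2-second delay, and the 20-room count are my stand-ins for the analogy, not real numbers):

```python
import time
from multiprocessing import Pool


def paint_room(room):
    time.sleep(0.2)  # Stand-in for 30 minutes of painting
    return room


if __name__ == '__main__':
    rooms = list(range(1, 21))  # 20 rooms to keep the demo short

    # One painter: rooms are painted one after another
    start = time.time()
    for room in rooms:
        paint_room(room)
    print(f"Single painter: {time.time() - start:.1f}s")

    # Ten painters: rooms are painted in parallel
    start = time.time()
    with Pool(processes=10) as pool:
        pool.map(paint_room, rooms)
    print(f"Ten painters:   {time.time() - start:.1f}s")
```

On my runs the parallel version finishes in roughly a tenth of the serial time, matching the painter arithmetic above.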
Limitations to Multiprocessing
Based on my experience, it can be a bad fit for production usage, depending on the use case. Here are the main limitations.
Number of Cores
There is no faking it till you make it here, and I learnt that the hard way after seeing very weird behaviour in database updates.
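The rule of thumb: cap your pool size at the machine’s core count. A minimal sketch using only the standard library (the helper name `safe_pool_size` and the toy `cpu_bound` task are mine):

```python
import os
from multiprocessing import Pool


def safe_pool_size(requested):
    # Never ask for more workers than there are CPU cores
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return min(requested, cores)


def cpu_bound(n):
    return sum(i * n for i in range(100000))


if __name__ == '__main__':
    workers = safe_pool_size(32)
    with Pool(processes=workers) as pool:
        totals = pool.map(cpu_bound, range(8))
    print(f"Ran 8 tasks on {workers} workers")
```

Conveniently, `Pool()` with no `processes` argument already defaults to `os.cpu_count()` workers.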
More workers never means faster execution. Here is a breakdown:

```python
import time
from multiprocessing import Pool


def cpu_intensive_task(n):
    total = 0
    for i in range(10000000):
        total += i * n
    return total


def benchmark_workers(max_workers=32, tasks=32):
    results = []
    for num_workers in range(1, max_workers + 1):
        start = time.time()
        with Pool(processes=num_workers) as pool:
            pool.map(cpu_intensive_task, range(tasks))
        elapsed = time.time() - start
        results.append({
            'workers': num_workers,
            'time': elapsed,
            'tasks_per_second': tasks / elapsed
        })
        print(f"Workers: {num_workers:3d} | Time: {elapsed:.2f}s | Tasks/sec: {tasks/elapsed:.2f}")
    return results


# Run benchmark
results = benchmark_workers(max_workers=32, tasks=32)
```

Typical results pattern (8-core machine):

```
Workers:  1 | Time: 32.00s | Tasks/sec: 1.00 ← Baseline
Workers:  2 | Time: 16.10s | Tasks/sec: 1.99 ← ~2x faster ✓
Workers:  4 | Time:  8.20s | Tasks/sec: 3.90 ← ~4x faster ✓
Workers:  6 | Time:  5.60s | Tasks/sec: 5.71 ← ~6x faster ✓
Workers:  8 | Time:  4.10s | Tasks/sec: 7.80 ← ~8x faster ✓
Workers: 10 | Time:  4.05s | Tasks/sec: 7.90 ← Marginal improvement
Workers: 12 | Time:  4.02s | Tasks/sec: 7.96 ← Barely faster
Workers: 16 | Time:  4.00s | Tasks/sec: 8.00 ← Plateauing
Workers: 20 | Time:  4.01s | Tasks/sec: 7.98 ← No improvement
Workers: 32 | Time:  4.15s | Tasks/sec: 7.71 ← SLOWER! ✗
```

Why does this happen? With 8 cores and 8 workers (optimal), each worker gets a dedicated core:

```
Time →
Core 1: [Worker 1 ████████████████]
Core 2: [Worker 2 ████████████████]
Core 3: [Worker 3 ████████████████]
Core 4: [Worker 4 ████████████████]
Core 5: [Worker 5 ████████████████]
Core 6: [Worker 6 ████████████████]
Core 7: [Worker 7 ████████████████]
Core 8: [Worker 8 ████████████████]

Each worker gets dedicated CPU time = FAST
```

With 8 cores and 16 workers (oversubscribed), workers fight for CPU time:

```
Time →
Core 1: [W1 ██][W9  ██][W1 ██][W9  ██] ← Context switching
Core 2: [W2 ██][W10 ██][W2 ██][W10 ██]
Core 3: [W3 ██][W11 ██][W3 ██][W11 ██]
Core 4: [W4 ██][W12 ██][W4 ██][W12 ██]
Core 5: [W5 ██][W13 ██][W5 ██][W13 ██]
Core 6: [W6 ██][W14 ██][W6 ██][W14 ██]
Core 7: [W7 ██][W15 ██][W7 ██][W15 ██]
Core 8: [W8 ██][W16 ██][W8 ██][W16 ██]

Workers fight for CPU time = OVERHEAD
```

Shared Resources
Think of this like all 10 painters trying to use the same brush at once. That would be chaotic. The same problem applies to database connections, file handles (reading, writing, etc.), HTTP requests, memory, and variables. For example:

```python
import os
from multiprocessing import Pool

# WRONG - a file handle can't be shared across processes
file = open('data.txt', 'w')

def worker(data):
    file.write(data)  # Breaks: writes can be lost, interleaved, or crash

with Pool(5) as pool:
    pool.map(worker, ['data1', 'data2'])

# CORRECT - each process opens its own file
def worker(data):
    with open(f'data_{os.getpid()}.txt', 'w') as file:
        file.write(data)
```

The multiprocessing library has a Manager that can manage shared variables
From the above, we have already established that shared resources are tricky to handle in parallel. So how do you handle them?
For non-mutating variables, it’s fine; each worker gets its own copy of the global variable:

```python
LARGE_LOOKUP_TABLE = {'painter_name': 'Patrick', ...}

def worker(item):
    value = LARGE_LOOKUP_TABLE.get(item)
```

Shared Variables
```python
from multiprocessing import Value, Array, Manager

# Simple: a lock-protected shared counter
counter = Value('i', 0)

def worker(n, counter):
    with counter.get_lock():
        counter.value += 1

# Complex: manager-backed shared collections
manager = Manager()
results_list = manager.list()
results_dict = manager.dict()
```
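Wiring a shared counter into a `Pool` has one extra wrinkle: a `Value` can’t be passed as a regular `pool.map` argument, so it is typically handed to each worker once via the pool’s `initializer`. A sketch under that assumption (the `init_worker` and `process_item` names are mine):

```python
from multiprocessing import Pool, Value

counter = None  # each worker process fills this global in via the initializer


def init_worker(shared_counter):
    # Runs once per worker process, storing the shared Value globally
    global counter
    counter = shared_counter


def process_item(item):
    with counter.get_lock():  # the lock prevents lost updates
        counter.value += 1
    return item * 2


if __name__ == '__main__':
    shared = Value('i', 0)
    with Pool(processes=4, initializer=init_worker, initargs=(shared,)) as pool:
        results = pool.map(process_item, range(10))
    print(shared.value)  # 10: every item was counted exactly once
```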
When to use Multiprocessing
As I sign out, and before you prompt that agent to rewrite your backend APIs to use parallelism, there are things you need to consider.
Good for:
I/O bound tasks (database queries, API calls)
CPU-intensive tasks with independent work units
When you have many similar tasks
Not good for:
Tasks that need to share state frequently
Simple, fast operations (overhead might make it slower)
When you're near memory limits
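The “overhead might make it slower” point is easy to demonstrate: for a trivial function, spawning processes and pickling arguments costs more than the work itself. A sketch (the `tiny_task` function is mine, and exact timings will vary by machine):

```python
import time
from multiprocessing import Pool


def tiny_task(n):
    return n + 1  # far too cheap to be worth shipping to another process


if __name__ == '__main__':
    data = range(10000)

    start = time.time()
    serial = [tiny_task(n) for n in data]
    print(f"Plain loop:      {time.time() - start:.4f}s")

    start = time.time()
    with Pool(processes=4) as pool:
        parallel = pool.map(tiny_task, data)
    print(f"Pool of workers: {time.time() - start:.4f}s  (often slower here)")
```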
This is just an introduction. Let me know if you are interested in advanced topics like inter-process communication, which is quite similar to Golang channels, but a bit more complicated.
Adios ✌️



