Java and Python Differences

Java and Python are both garbage-collected languages that run in a VM. Each has rich library support and a large user base. Both work well when multiple people on a team share the same codebase.

However, there are some hard-to-reconcile differences between the two.

Parallelism Challenges

The de facto way to parallelize code in Python is via multiprocessing. This can be done either by invoking os.fork() and managing the child processes directly, or by using concurrent.futures.ProcessPoolExecutor.
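As a rough illustration of the "manage it directly" route, here is a minimal sketch that forks one child, runs a function in it, and sends the result back over a pipe. The names run_in_child and count_zeros are hypothetical, and the sketch assumes a Unix-like system, since os.fork() is not available on Windows.

```python
import os
import pickle


def count_zeros(chunk: list[int]) -> int:
  """Hypothetical unit of work: count the zeros in a chunk."""
  return sum(1 for item in chunk if item == 0)


def run_in_child(func, arg):
  """Fork, run func(arg) in the child, and return its result to the parent."""
  read_fd, write_fd = os.pipe()
  pid = os.fork()
  if pid == 0:
    # Child process: compute, write the pickled result, then exit immediately.
    os.close(read_fd)
    status = 1
    try:
      with os.fdopen(write_fd, "wb") as writer:
        writer.write(pickle.dumps(func(arg)))
      status = 0
    finally:
      os._exit(status)  # skip the parent's cleanup handlers in the child
  # Parent process: read until the child closes its end of the pipe.
  os.close(write_fd)
  with os.fdopen(read_fd, "rb") as reader:
    payload = reader.read()
  _, wait_status = os.waitpid(pid, 0)
  if os.waitstatus_to_exitcode(wait_status) != 0:
    raise RuntimeError("child process failed")
  return pickle.loads(payload)


if __name__ == "__main__":
  print(run_in_child(count_zeros, [0, 1, 0, 3]))
```

Even in this small case the parent has to handle pipes, exit codes, and result decoding itself, which is the bookkeeping that ProcessPoolExecutor takes over.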

In order to pass work between processes, the data MUST be serialized, typically using the pickle module. This has important consequences, as some language-level constructs are not usable. When a task needs to be split up (or split off), the program gathers the function name and arguments, and spawns or forks other processes to handle the work.
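To make the serialization constraint concrete, the following sketch checks which objects pickle will accept. The helper can_pickle and the task dictionary are made up for illustration; only pickle.dumps and pickle.loads come from the standard library.

```python
import pickle


def can_pickle(obj: object) -> bool:
  """Return True if pickle can serialize obj (hypothetical helper)."""
  try:
    pickle.dumps(obj)
    return True
  except (pickle.PicklingError, AttributeError, TypeError):
    return False


# A task described as plain data: a function name plus its arguments.
task = {"name": "count_zeros", "args": [[0, 1, 0, 3]]}
restored = pickle.loads(pickle.dumps(task))

if __name__ == "__main__":
  print(restored == task)                 # plain data survives the round trip
  print(can_pickle(task))                 # True
  print(can_pickle(lambda x: x + 1))      # False: a lambda has no importable name
  print(can_pickle(i for i in range(3)))  # False: a generator holds live state
```

Anything that fails this check cannot be handed to another process as an argument or returned from it as a result.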

Functions can't be serialized

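Functions are a particular problem. pickle serializes a function by reference: it records the function's module and qualified name, not its code. A function defined at the top level of a module can be looked up again by that name in the worker process, but a nested function cannot. This means closures don't work with multiprocessing: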

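import concurrent.futures


def do_work(items: list[int]) -> int:

  # Nested function: pickle cannot look it up by its qualified name.
  def _worker(chunk: list[int]) -> int:
    count = 0
    for item in chunk:
      if item == 0:
        count += 1
    return count

  # Split the input into two chunks, one per task.
  half = len(items) // 2
  chunks = [items[:half], items[half:]]
  with concurrent.futures.ProcessPoolExecutor() as executor:
    # Fails here: _worker must be pickled to reach the worker processes.
    return sum(executor.map(_worker, chunks))


if __name__ == "__main__":
  print(do_work([0, 1, 2, 3]))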

Trying to run this, we get:

  File "/usr/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'do_work.<locals>._worker'
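One possible way around the error, sketched here rather than prescribed, is to move the worker to module level so that pickle can find it by name. The name count_zeros is hypothetical, and the sketch assumes the worker does not need any state captured from the enclosing function.

```python
import concurrent.futures


def count_zeros(chunk: list[int]) -> int:
  """Count the zeros in one chunk. Module-level, so pickle can reference it."""
  return sum(1 for item in chunk if item == 0)


def do_work(items: list[int]) -> int:
  # Split the input into two chunks, one per task.
  half = len(items) // 2
  chunks = [items[:half], items[half:]]
  with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
    # Only the function's name and each chunk are sent to the workers.
    return sum(executor.map(count_zeros, chunks))


if __name__ == "__main__":
  print(do_work([0, 1, 2, 3]))  # prints 1
```

The cost is that anything the closure would have captured now has to be passed explicitly as an argument, and that argument must itself be picklable.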

Threads v.s. Processes contains an overview of how threads and processes are treated differently. Python can use threads for IO-bound work, but this appears to be less common than using asyncio.
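For contrast, here is a small sketch of the thread-based approach for IO-bound work. Threads share the parent's memory, so nothing is pickled and a nested function that captures a local variable can be submitted as-is. The names fetch_all and _fake_fetch are hypothetical, and time.sleep stands in for a blocking network or disk call.

```python
import concurrent.futures
import time


def fetch_all(names: list[str], delay: float = 0.05) -> list[str]:
  # Nested function capturing `delay`: fine with threads, no pickling involved.
  def _fake_fetch(name: str) -> str:
    time.sleep(delay)  # placeholder for blocking IO
    return name.upper()

  workers = len(names) or 1  # max_workers must be at least 1
  with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
    # map() yields results in the same order as the inputs.
    return list(executor.map(_fake_fetch, names))


if __name__ == "__main__":
  print(fetch_all(["a", "b", "c"]))  # prints ['A', 'B', 'C']
```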

java/python_differences.1743716478.txt.gz · Last modified: by carl