Python multiprocessing

From ScientificComputing

Revision as of 14:56, 18 June 2021

In this example we show how to launch parallel tasks in Python by using ProcessPoolExecutor in the concurrent.futures module.

"The concurrent.futures module provides a high-level interface for asynchronously executing callables.

 The asynchronous execution can be performed with threads, using ThreadPoolExecutor, or separate processes, using ProcessPoolExecutor. Both implement the same interface, which is defined by the abstract Executor class."

Source: https://docs.python.org/3/library/concurrent.futures.html
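Because both executors implement the same interface, switching between threads and processes can be a one-word change. The following is a minimal sketch of that idea; the helper names square and run_with are hypothetical and only serve the illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(x):
    # Defined at module level so that worker processes can look it up by name.
    return x * x


def run_with(executor_class, values, max_workers=2):
    # Both executor classes accept max_workers and provide map(),
    # because they share the abstract Executor interface.
    with executor_class(max_workers=max_workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    values = [1, 2, 3, 4]
    print(run_with(ThreadPoolExecutor, values))
    print(run_with(ProcessPoolExecutor, values))
```

For CPU-bound pure-Python loops such as the one used in this example, processes are usually the better choice, since threads in standard CPython share one global interpreter lock.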

Load modules

Switch to the new software stack

$ env2lmod

or, set your default software stack to the new software stack

$ set_software_stack.sh new

Load a Python module

$ module load gcc/6.3.0 python/3.8.5

Code

Open a new file named process.py with a text editor and add the following code:

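from concurrent.futures import ProcessPoolExecutor

def accumulate_sum(n_part):
    # Sum the integers 0 .. n_part-1 in a plain loop (CPU-bound work).
    total = 0
    for i in range(n_part):
        total += i
    return total

def main():

    n = 50_000_000
    num_processes = 1
    # Each process receives the same workload size, n // num_processes.
    n_per_process = [n // num_processes for _ in range(num_processes)]

    # Run accumulate_sum once per list entry, distributed over the processes.
    with ProcessPoolExecutor(max_workers=num_processes) as executor:
        results = executor.map(accumulate_sum, n_per_process)

    print("The accumulated sum is {}".format(sum(results)))

if __name__ == '__main__':
    main()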
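In process.py every process sums the same range, starting at 0 and ending at n divided by the process count, so the printed total depends on how many processes are used. If the total should stay the same for any process count, one possible variant gives each process its own part of 0 … n-1. This is a sketch, not part of the original example; split_range and accumulate_range are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n, parts):
    # Split range(n) into `parts` contiguous (start, stop) chunks that
    # cover 0 .. n-1 exactly once, also when n is not divisible by parts.
    bounds = [n * k // parts for k in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def accumulate_range(chunk):
    # Sum the integers start .. stop-1 of one chunk.
    start, stop = chunk
    total = 0
    for i in range(start, stop):
        total += i
    return total


def main():
    n = 50_000_000
    num_processes = 4
    chunks = split_range(n, num_processes)
    with ProcessPoolExecutor(max_workers=num_processes) as executor:
        results = executor.map(accumulate_range, chunks)
    print("The accumulated sum is {}".format(sum(results)))


if __name__ == "__main__":
    main()
```

With this partitioning the partial sums add up to the sum of 0 … n-1 regardless of num_processes, while the amount of work per process stays roughly equal.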

Request an interactive session on a compute node

$ bsub -n 4 -Is bash
[jarunanp@eu-login-03 python_multiprocessing]$ bsub -n 4 -Is bash
Generic job.
Job <175831537> is submitted to queue <normal.4h>.
<<Waiting for dispatch ...>>
<<Starting on eu-ms-018-18>>
FILE: /sys/fs/cgroup/cpuset/lsf/euler/job.175831537.32301.1624026821/tasks
[jarunanp@eu-ms-018-18 python_multiprocessing]$
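Inside the session it can be useful to check how many cores the Python process may actually use before choosing the number of processes. The sketch below assumes that the batch system restricts the job through CPU affinity, which the cpuset path in the log suggests but which is not confirmed here; available_cores is a hypothetical helper name.

```python
import os


def available_cores():
    # On Linux, sched_getaffinity(0) returns the set of CPU cores the
    # current process is allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity: the number of
    # logical CPUs in the machine, which may exceed what the job may use.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Cores available to this process:", available_cores())
```

If the assumption holds, a job requested with -n 4 would report 4 here, and using more processes than that would not be expected to shorten the runtime further.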

Launch the Python script with

num_processes = 1
[jarunanp@eu-ms-009-45 python_multiprocessing]$ time python process.py
The accumulated sum is 1249999975000000

real	0m2.635s
user	0m2.602s
sys	0m0.019s

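The "time" command measures the runtime of the program and reports three values:

  • "real": the wall-clock time elapsed from the start to the end of the program
  • "user": the CPU time spent in user mode, summed over all processes of the program
  • "sys": the CPU time spent in system (kernel) mode, summed over all processes of the program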
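The same three quantities can also be measured from within Python. The following is a minimal sketch using time.perf_counter for the elapsed time and os.times for the CPU times; busy_sum and timed are hypothetical helper names, and the CPU times reported this way cover only the calling process, not worker processes it may start.

```python
import os
import time


def busy_sum(n):
    # CPU-bound loop, similar to the workload in process.py.
    total = 0
    for i in range(n):
        total += i
    return total


def timed(func, *args):
    # Return (result, real, user, sys) for one call of func.
    cpu_start = os.times()
    wall_start = time.perf_counter()
    result = func(*args)
    wall_end = time.perf_counter()
    cpu_end = os.times()
    real = wall_end - wall_start                 # elapsed wall-clock time
    user = cpu_end.user - cpu_start.user         # CPU time in user mode
    sys_time = cpu_end.system - cpu_start.system # CPU time in system mode
    return result, real, user, sys_time


if __name__ == "__main__":
    result, real, user, sys_time = timed(busy_sum, 5_000_000)
    print("real {:.3f}s  user {:.3f}s  sys {:.3f}s".format(real, user, sys_time))
```

For a single-process run, "real" and "user" are typically close to each other, as in the output above.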

We focus on the "real" time, which is 2.635 sec here. The time can vary from run to run and from computer to computer. Next, we increase num_processes to 2 and 4 and compare the runtimes.

num_processes = 2
[jarunanp@eu-ms-009-45 python_multiprocessing]$ time python process.py
The accumulated sum is 624999975000000

real	0m1.366s
user	0m2.603s
sys	0m0.024s

num_processes = 4
[jarunanp@eu-ms-009-45 python_multiprocessing]$ time python process.py
The accumulated sum is 312499975000000

real	0m0.812s
user	0m2.814s
sys	0m0.036s

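You can see that with num_processes = 2 the "real" time drops from 2.635 sec to 1.366 sec (about 1.9 times faster) and with num_processes = 4 it drops to 0.812 sec (about 3.2 times faster). The "user" time stays at roughly 2.6 to 2.8 sec in all three runs, because the total amount of CPU work is about the same and is only spread over more processes. Note that the printed sum changes with num_processes: every process sums the same range 0 … n/num_processes - 1 instead of a distinct part of 0 … n - 1, so this example is meant for comparing runtimes only.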
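The gain from additional processes is commonly expressed as speedup (runtime with one process divided by runtime with p processes) and parallel efficiency (speedup divided by p). A small sketch that computes both from the "real" times measured above; the names real_times and speedup_table are hypothetical.

```python
# Wall-clock ("real") times in seconds, copied from the three runs above.
real_times = {1: 2.635, 2: 1.366, 4: 0.812}


def speedup_table(times):
    # Speedup S(p) = T(1) / T(p); efficiency E(p) = S(p) / p.
    t1 = times[1]
    rows = []
    for p in sorted(times):
        speedup = t1 / times[p]
        rows.append((p, speedup, speedup / p))
    return rows


if __name__ == "__main__":
    for p, speedup, efficiency in speedup_table(real_times):
        print("{} processes: speedup {:.2f}, efficiency {:.2f}".format(
            p, speedup, efficiency))
```

An efficiency below 1 is expected, since starting the processes and collecting the results adds overhead that does not shrink with more processes.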