What is multiprocessing?
Basically, multiprocessing means running two or more tasks in parallel. In Python, we can use the built-in multiprocessing module to achieve that. Imagine you have a function that takes ten seconds to run, and you need to run it ten times. Run sequentially, that takes about a hundred seconds. That is where multiprocessing comes into action. By using multiprocessing, you can split those ten calls across ten sub-processes and finish all of them in roughly ten seconds, provided the machine can actually run them side by side.
Difference between multiprocessing and multithreading,
You might wonder why we use multiprocessing instead of multithreading. For the example above, where the function mostly waits, multithreading would work just as well. But if your function needs a lot of processing power, multiprocessing is usually the better fit. Each sub-process runs its own Python interpreter with its own memory space, and the operating system can schedule it on a separate CPU core. Threads, on the other hand, share one interpreter, and in standard CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads do not speed up CPU-heavy work. The trade-off is that processes cost more to start and cannot share variables directly.
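To make the "own memory space" point concrete, here is a minimal sketch. The names counter, increment, run_in_thread and run_in_process are made up for this illustration. A thread changes the caller's variable, while a sub-process only changes its own copy.

```python
import multiprocessing
import threading

counter = 0  # module-level state, used only for this illustration


def increment():
    """Add one to the module-level counter."""
    global counter
    counter += 1


def run_in_thread():
    """Run increment() in a thread; return the counter as the caller sees it."""
    global counter
    counter = 0
    worker = threading.Thread(target=increment)
    worker.start()
    worker.join()
    return counter


def run_in_process():
    """Run increment() in a sub-process; return the counter as the caller sees it."""
    global counter
    counter = 0
    worker = multiprocessing.Process(target=increment)
    worker.start()
    worker.join()
    return counter


if __name__ == "__main__":
    # The thread shares memory with the caller, so its change is visible here.
    print("after thread: ", run_in_thread())
    # The sub-process changed only its own copy, so the caller still sees 0.
    print("after process:", run_in_process())
```

This is also why getting results back from a sub-process needs some extra plumbing, such as a queue or a process pool.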
Let’s see multiprocessing in action,
Imagine this is your long-running function,
import time

def long_running_function():
    time.sleep(10)
    return
The above function sleeps for ten seconds and returns, so it mimics a long-running operation.
If you want to run this function ten times without using multiprocessing or multithreading it will look something like this,
for _ in range(0, 10):
    long_running_function()
Here I call long_running_function inside a for loop. The loop runs ten times, one call after another, so the whole thing takes about a hundred seconds to complete. Let’s see how to apply multiprocessing to this simple example.
First of all, you have to import Python’s multiprocessing module,
import multiprocessing
Then you have to create a Process object and pass it the target function, plus its arguments if it takes any.
_process = multiprocessing.Process(target=long_running_function, args=())
As you can see, we now have an object called _process, which is a multiprocessing Process object. We can call its start method to begin executing long_running_function in a new sub-process.
_process.start()
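Our function takes no arguments, so args stays empty. If your function does take some, a sketch could look like the one below. The sleep_and_report helper and its parameters are hypothetical and only there to show how args and kwargs are passed.

```python
import multiprocessing
import time


def sleep_and_report(seconds, label="task"):
    """Hypothetical helper: sleep for `seconds`, then print a message."""
    time.sleep(seconds)
    print(f"{label} finished after {seconds} s")


def run_with_arguments():
    """Start one sub-process with arguments and return its exit code."""
    worker = multiprocessing.Process(
        target=sleep_and_report,
        args=(0.5,),               # positional arguments; note the trailing comma
        kwargs={"label": "demo"},  # keyword arguments
    )
    worker.start()
    worker.join()
    return worker.exitcode


if __name__ == "__main__":
    print("exit code:", run_with_arguments())
```

The trailing comma in args=(0.5,) matters: without it, (0.5) is just a number and not a one-item tuple.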
Then our for loop will look like this,
for _ in range(0, 10):
    _process = multiprocessing.Process(target=long_running_function, args=())
    _process.start()
After you call _process.start(), Python starts executing our function in a new sub-process. To wait until all the sub-processes complete, we have to call the join() method of each Process object. For that, we have to keep track of all the Process objects created by the for loop. To do that, we append every _process object to a list called _processes, just like below,
_processes = []
for _ in range(0, 10):
    _process = multiprocessing.Process(target=long_running_function, args=())
    _process.start()
    _processes.append(_process)
After doing that, we can loop through all objects in the _processes list and call the join() method of all process objects like below,
for _process in _processes:
    _process.join()
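join() also accepts an optional timeout, which helps when you don't want to wait forever. Here is a small sketch of that, where nap and join_with_timeout are made-up names for this illustration. After join() returns, is_alive() tells you whether the process is still running, and exitcode stays None until it has finished.

```python
import multiprocessing
import time


def nap(seconds):
    """Hypothetical stand-in for real work: just sleep."""
    time.sleep(seconds)


def join_with_timeout(process, timeout):
    """Wait at most `timeout` seconds; return True if the process has finished."""
    process.join(timeout)
    return not process.is_alive()


if __name__ == "__main__":
    worker = multiprocessing.Process(target=nap, args=(1,))
    worker.start()

    # Timeout shorter than the work: join() returns while the process still runs.
    print("finished:", join_with_timeout(worker, 0.1), "exit code:", worker.exitcode)

    # Generous timeout: the process completes and gets an exit code.
    print("finished:", join_with_timeout(worker, 10), "exit code:", worker.exitcode)
```

An exit code of 0 means the function returned normally.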
So that is pretty much it. All ten long_running_function calls now run side by side and finish in roughly ten seconds, plus a little overhead for starting the processes. The final code will look like below,
import multiprocessing
import time

def long_running_function():
    time.sleep(10)
    return

if __name__ == "__main__":  # needed where sub-processes are spawned (Windows, macOS)
    _processes = []
    for _ in range(0, 10):
        _process = multiprocessing.Process(target=long_running_function, args=())
        _process.start()
        _processes.append(_process)
    for _process in _processes:
        _process.join()
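If you want to check the speed-up yourself without waiting a hundred seconds, a rough timing sketch could look like this. It uses a shortened sleep, and the names short_sleep, run_sequentially and run_in_processes are made up for the comparison. Exact numbers depend on your machine.

```python
import multiprocessing
import time


def short_sleep(seconds):
    """Stand-in for the long-running function, with a configurable duration."""
    time.sleep(seconds)


def run_sequentially(count, seconds):
    """Call short_sleep `count` times in a row; return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        short_sleep(seconds)
    return time.perf_counter() - start


def run_in_processes(count, seconds):
    """Run short_sleep in `count` sub-processes; return the elapsed seconds."""
    start = time.perf_counter()
    processes = [
        multiprocessing.Process(target=short_sleep, args=(seconds,))
        for _ in range(count)
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential:      {run_sequentially(4, 0.5):.2f} s")
    print(f"multiprocessing: {run_in_processes(4, 0.5):.2f} s")
```

The sequential run should take a bit over two seconds, and the multiprocessing run should land much closer to half a second.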
Creating and joining Process objects by hand like this is an older, lower-level way of doing multiprocessing in Python, but it is solid.
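For comparison, one newer option in the standard library is concurrent.futures.ProcessPoolExecutor. The sketch below is not part of the original walkthrough. It assumes a variant of long_running_function that takes the sleep time as a parameter and returns it, and run_with_pool is a made-up helper name.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def long_running_function(seconds):
    """Assumed variant: sleep for `seconds` and return it as the result."""
    time.sleep(seconds)
    return seconds


def run_with_pool(count, seconds):
    """Run `count` calls in a pool of worker processes and collect the results."""
    # The with-block waits for all submitted work before it exits.
    with ProcessPoolExecutor(max_workers=count) as executor:
        return list(executor.map(long_running_function, [seconds] * count))


if __name__ == "__main__":
    results = run_with_pool(10, 0.5)
    print(results)
```

Here the executor takes care of starting and joining the workers, and it hands the return values back to you, which plain Process objects don't do on their own.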
Originally published at https://medium.com on September 29, 2020.