Question
My issue is that if I reassign an item in a list inside a parallel process, then once the parallel processes have finished, the change reverts to the original value.
In the example below (greatly simplified for ease of understanding), I have one function that changes the list element NoZeros[0] to "chicken" and a second function that changes NoZeros[1] to "sandwich". I even put "global" in the second function just to demonstrate that this isn't a local vs. global issue; it looks like one, but it is not. As the print statements show when you run the program, the list elements do actually change. The issue is that when I inspect NoZeros after these processes, it is what it was to begin with instead of ["chicken", "sandwich"].
I know that Python's multiprocessing package has exactly the same issue, but there it was solved by wrapping whatever you did not want to revert in "manager.list()". For example, with Python's multiprocessing library you would just write NoZeros=manager.list(NoZeros) somewhere before NoZeros gets altered, and that would be the end of it. I can't for the life of me figure out what the equivalent is for Ray, or whether there even is one.
HOW DO I CHANGE LISTS IN PARALLEL USING RAY? Many thanks.
Also, note that this script may throw you for a loop, because you may end up printing NoZeros BEFORE the parallel processes finish. This is another bug I'm having trouble with and would appreciate attention on, but it is not the priority. The point is that you probably want to run print(NoZeros) in the next cell (Jupyter has this functionality, at least). In Python's multiprocessing library, one would just call "process.join()" and that would wait for the process(es) in question to end, which brings me to the bonus question:
Bonus question: How do I get ray.wait() to work? How do I tell my code to proceed to the next command only once the previous commands, even parallel ones, have finished?
'''
import ray
ray.shutdown()
ray.init()
NoZeros=[0,0]

@ray.remote
def Chicken():
    print("NoZeros[0] is",NoZeros[0],"but will change to chicken")
    NoZeros[0]="chicken"
    print("Now, NoZeros[0] is",NoZeros[0])

@ray.remote
def GlobalSandwich():
    global NoZeros #This is just to show that "global" doesn't solve anything
    print("NoZeros[1] is",NoZeros[1],"but will change to sandwich")
    NoZeros[1]="sandwich"
    print("Now, NoZeros[1] is",NoZeros[1])

Chicken.remote()
GlobalSandwich.remote()
#Uncomment these 3 lines of code if you'd like to try tackling another question: why does ray.wait() not work?
#How do I wait until parallel processes end, so I can continue my code?
#WaitList=[Chicken.remote(),GlobalSandwich.remote()]
#ray.wait(WaitList)
#print("If you see this first, ray.wait() isn't working")
#This line of code right here was executed in the next command line (Jupyter); this print command happens when the other processes are finished
print(NoZeros)
'''
Answer 1:
With Ray, mutable global state should live in Actors. For example, you could do something like:
import ray

@ray.remote
class ListActor:
    def __init__(self, l):
        self._list = l

    def get(self, i):
        return self._list[i]

    def set(self, i, val):
        self._list[i] = val

    def to_list(self):
        return self._list
Then in order to use it, you can pass it in as a parameter (again, you shouldn't rely on global variables).
NoZeros = ListActor.remote([0,0])

@ray.remote
def Chicken(NoZeros):
    print("NoZeros[0] is",ray.get(NoZeros.get.remote(0)),"but will change to chicken")
    # Actor methods have to be invoked through .remote(); calling them directly raises an error.
    NoZeros.set.remote(0, "chicken")
    print("Now, NoZeros[0] is",ray.get(NoZeros.get.remote(0)))

# We need to make sure this function finishes executing before we print.
ray.get(Chicken.remote(NoZeros))
print(ray.get(NoZeros.to_list.remote()))
Source: https://stackoverflow.com/questions/62417320/lists-wont-change-with-ray-parallel-python