Python, Asyncio and Footguns

Ryan Collingham
3 min read · Nov 23, 2023

If you’ve been using Python in the last few years you will probably have come into contact with `asyncio`. For the uninitiated, `asyncio` is the latest attempt at providing a sane concurrency framework in the language, dancing on the grave of `concurrent.futures`, `twisted`, `threading` and a myriad of other concurrency frameworks and primitives. The core idea of `asyncio` revolves around an event loop and some `async/await` syntactic sugar that allows futures to be passed back and scheduled asynchronously into said event loop. The end result is single-threaded concurrency that works great for IO-bound tasks.

Recently when diving into a new project that used `asyncio` extensively, I began noticing some unusual event-loop related exceptions. After extensive debugging and learning about what is going on under the hood, I came up with an entirely reasonable looking yet incorrect program which reproduces the issue:

import asyncio

GLOBAL_LOCK = asyncio.Lock()

async def main():
    await asyncio.gather(locker(), locker())

async def locker():
    await GLOBAL_LOCK.acquire()
    print('lock acquired')


if __name__ == '__main__':
    asyncio.run(main())

Can you spot the issue? If not then read on…

The root cause is the use of the module-level `asyncio.Lock()` object. Those familiar with the async-await style of concurrency may wonder why we would need to use a Lock at all. With multi-threading, locks are essential to avoid conflicts when threads attempt to read from and write to the same object concurrently. `asyncio` largely avoids these issues by being single-threaded, but there may still be situations in which you want to prevent multiple coroutines from entering the same critical section, typically one that contains an `await` partway through. For a fuller explanation see the top answer here: https://stackoverflow.com/questions/25799576/whats-python-asyncio-lock-for
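To make that concrete, here is a minimal sketch of a critical section that spans an `await`. The helper names (`unsafe_increment`, `safe_increment`, `run_demo`) and the shared `state` dict are made up for illustration. Note that the lock here is created inside a running coroutine, not at module level.

```python
import asyncio


async def unsafe_increment(state):
    # Read-modify-write with an await in the middle: other coroutines
    # can run between the read and the write.
    current = state["count"]
    await asyncio.sleep(0)  # hand control back to the event loop
    state["count"] = current + 1


async def safe_increment(state, lock):
    # The lock keeps other coroutines out of the section until we finish.
    async with lock:
        current = state["count"]
        await asyncio.sleep(0)
        state["count"] = current + 1


async def run_demo():
    unsafe = {"count": 0}
    await asyncio.gather(*(unsafe_increment(unsafe) for _ in range(10)))

    safe = {"count": 0}
    lock = asyncio.Lock()  # created while the event loop is running
    await asyncio.gather(*(safe_increment(safe, lock) for _ in range(10)))
    return unsafe["count"], safe["count"]


unsafe_total, safe_total = asyncio.run(run_demo())
print(unsafe_total, safe_total)
```

Without the lock, updates are lost because every coroutine reads the counter before any of them writes it back, so the first total comes out lower than 10. With the lock, all ten increments are applied.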

The problem occurs because when the Lock object is initialised, it stores a reference to the current event loop (which is tracked per thread). However, when `asyncio.run()` is called, it does not reuse that loop: it creates a new one and runs the coroutine on it. The Lock object still holds a stale reference to the old event loop, so when it internally creates a future on that loop and a task running on the new loop `await`s it, the exception occurs (a `RuntimeError` complaining that the future is attached to a different loop).

Actually, the exception doesn’t occur the very first time the lock is acquired. In that case the lock is in the unlocked state, so all that happens is that it gets marked as locked and `acquire()` returns without ever suspending. Only when another coroutine tries to acquire the lock while it is already held does an `await` occur with the wrong event loop referenced.
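The mechanism can be imitated on any Python version with a toy lock. This is only a sketch: `ToyLock` is a hypothetical, heavily simplified stand-in written for this post, not the real `asyncio.Lock` implementation, and the stale loop is created explicitly with `asyncio.new_event_loop()` so that it does not depend on the old implicit behaviour.

```python
import asyncio


class ToyLock:
    """Hypothetical, simplified model of a lock bound to a loop at creation."""

    def __init__(self, loop):
        self._loop = loop  # captured eagerly, like the pre-3.10 Lock did
        self._locked = False

    async def acquire(self):
        if not self._locked:
            # Fast path: uncontended, so no future and no await.
            self._locked = True
            return True
        # Slow path: wait on a future that belongs to the captured loop.
        fut = self._loop.create_future()
        await fut
        self._locked = True
        return True


async def locker(lock):
    await lock.acquire()
    return "acquired"


async def main(lock):
    # return_exceptions=True lets us inspect the failure instead of crashing.
    return await asyncio.gather(locker(lock), locker(lock), return_exceptions=True)


stale_loop = asyncio.new_event_loop()  # plays the role of the import-time loop
toy_lock = ToyLock(stale_loop)
results = asyncio.run(main(toy_lock))  # runs on a different, fresh loop
stale_loop.close()
print(results)
```

The first `locker` takes the fast path and succeeds. The second one awaits a future owned by `stale_loop` from a task running on the loop created by `asyncio.run()`, and gets a `RuntimeError` for its trouble.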

As a result, if you have this error in your application you likely will not spot any issues when doing basic feature testing. Only when your application is under load and lock contention occurs will you start to see sporadic exceptions, which can be some of the hardest issues to debug!

I’m actually quite amazed that the Python team shipped such a hard-to-spot footgun for developers to trigger on themselves. Sure, use of global variables is not considered a best practice, and proper scoping of the Lock instance would avoid the issue. However, it’s also far from unheard of. Fortunately, this issue appears to be fixed in Python 3.10, as a side-effect of removing the `loop` parameter from Locks and other objects. But for anyone still using earlier versions of Python, beware.
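For those earlier versions, one way to scope the lock properly is to create it inside the coroutine that `asyncio.run()` executes and pass it to whoever needs it. The following is a sketch, not the only possible fix: the `name` and `log` parameters are added purely for illustration, and it uses `async with` so that the lock is released again, which the reproduction above deliberately never does.

```python
import asyncio


async def locker(lock, name, log):
    async with lock:  # acquire, and release on exit
        log.append(f"{name} acquired")
        await asyncio.sleep(0)  # give the other coroutine a chance to contend
        log.append(f"{name} released")


async def main():
    # Created while the loop started by asyncio.run() is already running,
    # so the lock and the tasks that use it agree on the event loop.
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(locker(lock, "a", log), locker(lock, "b", log))
    return log


log = asyncio.run(main())
print(log)
```

The second coroutine now has to wait for the first one, which is exactly the contended path that failed before, and it completes without an exception.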
