Async Options
concurrent.futures
(thread pooling)threading
multiprocessing
asyncio
if io_bound:
if io_very_slow:
print("Use Asyncio") #Many connections
else:
print("Use Threads") #Limited connections
else:
print("Multi Processing") #CPU Bound
Asyncio
Why asyncio?
Well, Global Interpreter Lock aka GIL was introduced to make CPython’s memory handling easier and to allow better integrations with C (for example the extensions). The GIL is a locking mechanism that the Python interpreter runs only one thread at a time. Multiple process work, but are expensive.
The revolution is the event loop. An event loop basically waits for something to happen and then acts on the event. The event loop tracks different I/O events and switches to tasks which are ready and pauses the ones which are waiting on I/O. Thus we don’t waste time on tasks which are not ready to run right now. Single threaded, but gives apperance of parallelism through waiting on I/O.
Background
As of 9/8/18, async doesn't work very well in jupyter notebooks(probs displaying errors and underlying event loop already running)
Async python stand library, new in py version 3.4
Uses coroutines which allow a ft to return without losing its state i.e python yielding iterators. Difference with threads is these are collaborative and only one coroutine run at a time, while threads are parallel
ask is a wrapper for a coroutine and a subclass of Future.
Async/Await
Python has native support for async functions defining a coroutine. Calling these functions doesn't run the code, but returns a JS promise-like coroutine object
import asyncio
async def ping_server(ip):
pass
@asyncio.coroutine
def load_file(path):
pass
To actually call these fts, use await
inside another ft
async def ping_local():
return await ping_server('192.168.1.1')
Event Loop
asyncio.get_event_loop().run_until_complete(my_ft()) #run ft is blocking
To add something to event loop
asyncio.ensure_future(my_ft())
asyncio.run(main()) #open, run, and close event loop
loop.run_forever() #run forever
Important Fts
await asyncio.sleep(5) #to sleep
page.waitForSelector('h3 a', { timeout: 5000 }) # default timeout 30 seconds
Async For/With
Async for
async def main():
g = [i async for i in mygen()]
f = [j async for j in mygen() if not (j // 3 % 5)]
return g, f
g, f = asyncio.run(main())
Allows other coroutines to take turns inbetween loops
Tasks
These let you return values from a sync ft and ??
import asyncio
async def my_task(seconds):
print('This task is taking {} seconds to complete'.format(seconds))
await asyncio.sleep(seconds)
return 'task finished'
if __name__ == '__main__':
my_event_loop = asyncio.get_event_loop()
try:
print('task creation started')
task_obj = my_event_loop.create_task(my_task(seconds=2))
my_event_loop.run_until_complete(task_obj)
finally:
my_event_loop.close()
print("The task's result was: {}".format(task_obj.result()))
Example
We use aiohttp
for async requests, if we just used urllib or something then this whole async thing wouldn't be helpful.
Writing is still sync, but could use async file io lib
import aiohttp
import asyncio
import async_timeout
import os
async def download_coroutine(session, url):
with async_timeout.timeout(10):
async with session.get(url) as response:
filename = os.path.basename(url)
with open(filename, 'wb') as f_handle:
while True:
chunk = await response.content.read(1024)
if not chunk:
break
f_handle.write(chunk)
return await response.release()
async def main(loop):
urls = ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
"http://www.irs.gov/pub/irs-pdf/f1040a.pdf",
"http://www.irs.gov/pub/irs-pdf/f1040ez.pdf",
"http://www.irs.gov/pub/irs-pdf/f1040es.pdf",
"http://www.irs.gov/pub/irs-pdf/f1040sb.pdf"]
async with aiohttp.ClientSession(loop=loop) as session:
tasks = [download_coroutine(session, url) for url in urls]
await asyncio.gather(*tasks)
if __name__ == '__main__':
loop = asyncio.get_event_loop()
loop.run_until_complete(main(loop))
Last updated