multithreading - Python Multithread or Background Process?


I'm wondering how to optimize a link-checker I've implemented as a webservice in Python. I cache responses in a database that expires every 24 hours, and every night a cron job refreshes the cache, so the cache is effectively never out-of-date.
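(A minimal sketch of that cache lookup, assuming a hypothetical sqlite3 `link_cache` table with `url`, `status`, and `fetched_at` columns; the real schema isn't shown in the question:)

    import sqlite3
    import time

    CACHE_TTL = 24 * 60 * 60  # cached responses expire after 24 hours

    def get_cached_status(conn, url):
        """Return the cached status code for url, or None if missing or expired."""
        row = conn.execute(
            "SELECT status, fetched_at FROM link_cache WHERE url = ?", (url,)
        ).fetchone()
        if row is not None and time.time() - row[1] < CACHE_TTL:
            return row[0]
        return None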

Things are slow, of course, if I truncate the cache, or if the page being checked has a lot of links that aren't in the cache. I'm no computer scientist, so I'd like some advice, and something concrete, on how to optimize this using either threads or processes.

I thought of optimizing by requesting each URL in a background process (pseudocodish):

    # The part of the code that gets response codes not in the cache...
    from subprocess import Popen, PIPE

    responses = {}

    # To begin, create a dict of processes, one per URL, running in the
    # background. Popen returns immediately, so all the curls run concurrently;
    # the -w flag makes curl print just the HTTP status code.
    processes = {}
    for url in urls:
        processes[url] = Popen(["curl", "-s", "-o", "/dev/null",
                                "-w", "%{http_code}", url], stdout=PIPE)

    # Loop through again and collect the responses
    for url in processes:
        stdout, _ = processes[url].communicate()
        responses[url] = stdout.decode().strip()

    # Now we have a responses dict with the responses keyed by URL

This cuts the script's time down to at least 1/6 in most of my use-cases, as opposed to looping through each URL and waiting for each response, but I'm concerned about overloading the server the script is running on. I've considered using a queue and launching batches of maybe 25 at a time, as in the sketch below.
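A minimal sketch of that batching idea, reusing the curl-per-URL approach from above (the batch size of 25 and the `urls` list are assumptions carried over from the surrounding text):

    from subprocess import Popen, PIPE

    BATCH_SIZE = 25  # assumed cap; tune to what the server tolerates

    responses = {}
    for start in range(0, len(urls), BATCH_SIZE):
        batch = urls[start:start + BATCH_SIZE]
        # Launch one curl per URL in this batch; Popen returns immediately
        procs = {}
        for url in batch:
            procs[url] = Popen(["curl", "-s", "-o", "/dev/null",
                                "-w", "%{http_code}", url], stdout=PIPE)
        # Drain the whole batch before starting the next one
        for url, proc in procs.items():
            out, _ = proc.communicate()
            responses[url] = out.decode().strip()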

Would multithreading be a better overall solution? If so, how would I do this using the threading module?
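A minimal worker-pool sketch using the standard threading and queue modules (Python 3 here; urllib.request stands in for curl, and the 25-thread pool mirrors the batch size mentioned above rather than being a recommendation):

    import queue
    import threading
    import urllib.error
    import urllib.request

    NUM_WORKERS = 25  # assumed pool size, matching the batch figure above

    def check_links(urls):
        tasks = queue.Queue()
        responses = {}
        lock = threading.Lock()

        def worker():
            while True:
                url = tasks.get()
                try:
                    code = urllib.request.urlopen(url, timeout=10).getcode()
                except urllib.error.HTTPError as exc:
                    code = exc.code  # 4xx/5xx responses still carry a status code
                except Exception:
                    code = None      # DNS failure, timeout, connection refused, etc.
                with lock:
                    responses[url] = code
                tasks.task_done()

        # Daemon threads exit with the main program, so no explicit shutdown is needed
        for _ in range(NUM_WORKERS):
            threading.Thread(target=worker, daemon=True).start()

        for url in urls:
            tasks.put(url)
        tasks.join()  # block until every URL has been checked
        return responses

Link checking is I/O-bound, so threads help here despite the GIL, and the fixed pool size caps concurrency the same way batching does, without paying process-spawn overhead per URL.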

