We need a web crawler that visits 8 specific websites containing directories of American manufacturers and extracts information on the companies listed in those directories. The company listings should be stored in a database that can be easily exported to an Excel, CSV, or tab-delimited text file.
At a minimum, we need to store the following for each company: Product Category/Subcategory, Company Name, Hyperlink to Company Website, Street Address, City, State, Phone Number, and Fax Number. We do not need every piece of information on each website, only the details of the companies listed (specific instructions will be distributed once the project is assigned).
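To make the storage and export requirement concrete, here is a minimal sketch of how listings might be stored and exported. It assumes SQLite as the database and uses only the Python standard library; the table name, column names, and helper functions (`open_store`, `save_company`, `export_delimited`, `demo`) are illustrative and not part of the brief, and the sample company is made up.

```python
import csv
import io
import sqlite3

# Illustrative column names mirroring the minimum fields listed above.
FIELDS = [
    "category", "subcategory", "company_name", "website",
    "street_address", "city", "state", "phone", "fax",
]


def open_store(path=":memory:"):
    """Create the companies table if needed and return the connection."""
    conn = sqlite3.connect(path)
    columns = ", ".join(f"{name} TEXT" for name in FIELDS)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS companies (source_site TEXT, {columns})"
    )
    return conn


def save_company(conn, source_site, record):
    """Insert one listing; fields missing from the record are stored as NULL."""
    values = [source_site] + [record.get(name) for name in FIELDS]
    placeholders = ", ".join("?" for _ in values)
    conn.execute(f"INSERT INTO companies VALUES ({placeholders})", values)
    conn.commit()


def export_delimited(conn, out, delimiter=","):
    """Write a header row and all listings to a file-like object.

    Pass a tab character as the delimiter for tab-delimited output.
    """
    cursor = conn.execute("SELECT * FROM companies ORDER BY company_name")
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


def demo():
    """Store one made-up listing and return it as tab-delimited text."""
    conn = open_store()
    save_company(
        conn,
        "example-directory",
        {"company_name": "Acme Tools", "city": "Dayton", "state": "OH"},
    )
    buffer = io.StringIO()
    export_delimited(conn, buffer, delimiter="\t")
    return buffer.getvalue()
```

CSV and tab-delimited files both open directly in Excel, so a single export routine with a configurable delimiter would likely cover all three requested formats.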
The script should run efficiently without taxing the directory servers. The immediate need is to run the crawler on the 8 specific websites. The request may later be expanded to additional websites, so the script should be adaptable.
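One possible way to meet both requirements is sketched below: a per-host delay keeps the load on each directory server low, and a registry of per-site parsers lets new websites be added without changing the crawl loop. Everything here is an assumption for illustration: the 2-second default delay, the host `directory.example.com`, the names `Throttle`, `parser_for`, `PARSERS`, and `crawl`, and the toy parser that treats each line of a page as a company name. The `fetch` function is supplied by the caller, so the sketch makes no claim about how pages are downloaded.

```python
import time
from urllib.parse import urlparse


class Throttle:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_delay=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_hit = {}  # host -> time of the previous request

    def wait(self, url):
        """Pause if the host was hit too recently; return seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_hit.get(host)
        pause = 0.0
        if last is not None:
            pause = max(0.0, self.min_delay - (now - last))
        if pause > 0:
            self._sleep(pause)
        self._last_hit[host] = now + pause
        return pause


# One parser per directory site; supporting a new site means adding a parser.
PARSERS = {}


def parser_for(host):
    """Decorator that registers a parsing function for one host."""
    def register(func):
        PARSERS[host] = func
        return func
    return register


@parser_for("directory.example.com")  # hypothetical host
def parse_example(html):
    # Toy placeholder: real extraction rules depend on each site's layout.
    return [{"company_name": line.strip()}
            for line in html.splitlines() if line.strip()]


def crawl(urls, fetch, throttle):
    """Fetch each URL politely and collect records from the matching parser."""
    records = []
    for url in urls:
        parser = PARSERS.get(urlparse(url).netloc)
        if parser is None:
            continue  # no parser registered for this site yet
        throttle.wait(url)
        records.extend(parser(fetch(url)))
    return records
```

Because the clock and sleep functions are injectable, the throttling behavior can be checked without real waiting or network access. A production crawler would probably also want to honor each site's robots.txt.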