Hi,
We recently worked on a scraping and extraction project for another client, retrieving specific attributes from real-estate websites using both APIs and scraping methods, and would be interested in your project. We have worked on scraping, crawling, extraction, aggregation and synchronization of data from various unstructured sources and websites, assembling it into useful Excel and CSV formats, storing it in databases, and using cron jobs to keep the stored schema in sync with updates to the website. We have also extracted information from Weather, Groupon, Wikipedia and YouTube websites, primarily using PHP and Perl, with some use of the Scrapy framework.
Please find a short summary of our experience below.
* Several years of experience developing text mining, information extraction and analytics solutions for web crawling, scraping, extraction and aggregation from unstructured big data such as web pages and text corpora, and loading the results into databases, datastores and search indexes (Lucene, Solr) for analysis, search, reporting and dashboards.
* Extensive experience using Perl, PHP, Python, C, Java and .NET with MySQL, Oracle and MS SQL Server.
* Information extraction tools: Scrapy, Weka, R, Excel, Perl CPAN packages for extraction.
Estimated budget: ~240 USD (timeline: 15-20 days)
Price, milestones and timelines are flexible and negotiable based on the exact project specifications and details, or for any additional project work.