Scraping of Topsy, Google, Zacks (only today's EPS/SALES)
$250-750 USD
Paid on delivery
Dear Sir or Madam,
I would like to be able to scrape certain values from the web page [url removed, login to view] on demand via the Windows command-line prompt. These scraped values should then be stored in a CSV file.
The script should accept different input parameters so I can control the scraping.
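As a minimal sketch of the requested command-line interface: a parameter txt file is read in, and results are appended to a CSV file stamped with the current date. The flag names, file names, and column layout here are illustrative assumptions, not part of the specification.

```python
# Hypothetical CLI sketch; flag names and file layout are assumptions.
import argparse
import csv
import datetime

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Scrape values on demand")
    parser.add_argument("--params", default="params.txt",
                        help="input txt file with scraping parameters")
    parser.add_argument("--out", default="results.csv",
                        help="CSV file that results are appended to")
    return parser.parse_args(argv)

def append_rows(csv_path, rows):
    """Append result rows to the CSV, stamping each with the current date."""
    today = datetime.date.today().isoformat()
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow([today] + list(row))
```

A run would then look like `script.py --params params.txt --out results.csv`, with repeated runs appending rather than overwriting.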
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
A.) TOPSY SCRAPING
1. Scraping mode (probably best is to use an input txt file)
The different input parameters are (see [url removed, login to view]
for the attributes on the page):
- Past 1 Hour
- Past 1 Day
- Past 30 Day
- All Time
(Search)
- Everything
- Links
- Tweets
- Experts
(Network)
- Google Plus
(Language)
- All Languages
- Different languages
The attributes are used in the HTTP request to [url removed, login to view],
e.g. [url removed, login to view] corresponds to the attribute "Past 7 Day".
The attributes to be scraped are:
- Number of hits
- 10 result details
These attributes are written to a CSV file together with the current date.
If there are already entries, the results are appended.
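The time-filter attributes above would map onto query parameters in the search URL. Since the page's actual parameter names are not given in this brief, the `window` parameter and its values below are purely illustrative assumptions:

```python
# Sketch of mapping the time-filter attributes to URL query parameters.
# The real parameter names are not specified, so "window" and its
# single-letter values are assumptions for illustration only.
from urllib.parse import urlencode

TIME_WINDOWS = {
    "Past 1 Hour": "h",
    "Past 1 Day": "d",
    "Past 7 Day": "w",
    "Past 30 Day": "m",
    "All Time": "a",
}

def build_query(base_url, keyword, time_filter):
    """Build the search URL for one keyword and one time filter."""
    params = {"q": keyword, "window": TIME_WINDOWS[time_filter]}
    return base_url + "?" + urlencode(params)
```

The same mapping idea would extend to the Search, Network, and Language attribute groups.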
2. Scraping mode
@Input (probably best is to use an input txt file)
- Be able to use all the different attributes for scraping on this page
[url removed, login to view]
with these different ways:
scrapingPart
- 24 hour
- 12 hour
- 6 hour
- 2 hour
- 1 hour
Time Period: how far back in time the scraping should be done, and up to which date
@Result
Retrieve these output results
The attribute to be scraped is:
- Number of hits
For every scrapingPart (24 hours, 12 hours, ...) one scraping run is performed and the @Result is saved as a record in the CSV file.
For example:
selected Time Period: [url removed, login to view] (start) - [url removed, login to view]
If the number of days is 365, then 365 scrapings are performed with the specified keyword(s)
and the output is written to a CSV file.
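The Time Period logic above can be sketched as splitting the selected range into fixed-size windows, one scraping run per window (so a 365-day period at a 24-hour scrapingPart gives 365 runs). The function name and signature are my own; only the windowing idea comes from the brief:

```python
# Sketch: split a Time Period into per-window scraping runs.
# A 24-hour step over a 365-day period yields 365 windows; the
# 12/6/2/1-hour scrapingParts follow the same idea with a smaller step.
import datetime

def scraping_windows(start, end, step_hours=24):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = datetime.timedelta(hours=step_hours)
    current = start
    while current < end:
        yield current, min(current + step, end)
        current += step
```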
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
B.) Scraping of [url removed, login to view] / [url removed, login to view]
We would like to have the same scraping for [url removed, login to view]
or [url removed, login to view],
where the user can specify last hour or last 24 hours
and use keywords and scraping input like this:
test site:.de
for scraping on a specific domain.
The attributes to be scraped are:
- Number of hits
- 10 result details
These attributes are written to a CSV file together with the current date.
If there are already entries, the results are appended.
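The `test site:.de` input above is the standard search-engine syntax for restricting results to a domain. A small helper could compose the keyword part with an optional domain restriction; how the resulting query string is then submitted to the engine is not specified here:

```python
# Sketch: combine keywords with an optional site: domain restriction,
# matching the "test site:.de" example from the brief.
def make_search_query(keywords, domain=None):
    """Return the query string, optionally restricted to one domain."""
    query = " ".join(keywords)
    if domain:
        query += " site:" + domain
    return query
```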
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
C.) Scraping Zacks Earnings
The page to be scraped is [url removed, login to view]
The tables
TODAY'S EPS SURPRISES
- Positive Surprises
- Negative Surprises
should be scraped with their column values and merged into one CSV file with the current date as an additional column.
TODAY'S SALES SURPRISES
- Positive Surprises
- Negative Surprises
should be scraped with their column values and merged into one CSV file with the current date as an additional column.
This results in two different output CSV files: [url removed, login to view] and SALES_surprises.csv.
It must be possible to run this script from the command line. Every result will be appended. If the same record already exists (same Company & Time), no action should be performed.
I should be able to run this script to create new records every day.
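The append-with-deduplication rule described above (skip a record when a row with the same Company and Time already exists) can be sketched as follows; the assumption that the company is in column 0 and the time in column 1 is mine, not from the brief:

```python
# Sketch: append records to the CSV, skipping any (Company, Time) pair
# already present. Column positions are assumptions (company=0, time=1).
import csv
import os

def append_unique(csv_path, records):
    """Append records whose (company, time) key is not already in the file."""
    seen = set()
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.reader(f):
                if len(row) >= 2:
                    seen.add((row[0], row[1]))
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        for record in records:
            key = (record[0], record[1])
            if key not in seen:
                writer.writerow(record)
                seen.add(key)
```

Re-running the script on the same day would then leave the file unchanged, while a run on a new day adds fresh records.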
I will be available on Skype every day for project support.
The code will belong to the project requester and must not be distributed to third parties.
I am looking forward to fast, quality coding.
Regards,
Thomas
Project ID: #1635439