Our client has a feature that scrapes text and image data from external URLs.
However, in many cases a lot of header and footer content is extracted, and in many cases full paragraphs of text are extracted as well. We need a smart application that recognizes this, can highlight the key paragraphs, and reduces the scraped text without losing the main point.
For example, the application scrapes a Wikipedia page, including Wikipedia's header and footer and the thousands of words within the article.
So we need an application that can simplify text and cut articles down using a smart algorithm: condense the information on a page without losing touch with what the information is about.
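To make the request concrete, here is a rough sketch of one possible approach, not a required design. It assumes the scraped input is HTML, that page chrome sits inside tags such as header, footer and nav, and that a simple word-frequency score is an acceptable first cut at "condensing". The names SKIP_TAGS, STOPWORDS, MainTextExtractor, extract_paragraphs and summarize are made up for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: boilerplate lives inside these tags; real pages may need more rules.
SKIP_TAGS = {"header", "footer", "nav", "aside", "script", "style"}
# Assumption: a tiny stopword list, enough for a demonstration only.
STOPWORDS = {"a", "an", "and", "are", "is", "in", "of", "the", "to", "was", "it"}


class MainTextExtractor(HTMLParser):
    """Collects the text of <p> elements that are not inside a skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.in_paragraph = False
        self.paragraphs = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_paragraph:
            # Collapse runs of whitespace before storing the paragraph.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self._buf.append(data)


def extract_paragraphs(html):
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def _content_words(sentence):
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(paragraphs, max_sentences=3):
    """Keeps the highest-scoring sentences, returned in their original order."""
    sentences = [
        s.strip()
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    freq = Counter(w for s in sentences for w in _content_words(s))

    def score(sentence):
        words = _content_words(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

Calling summarize(extract_paragraphs(page_html), max_sentences=5) would return at most five sentences taken verbatim from the body paragraphs. A frequency score like this is only a baseline; it selects existing sentences rather than rewriting them.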
Please only bid if you can deliver.
If you have created something similar in the past, please send a PMB.
I look forward to your bids.
Deliver within 2 days.
Thanks.
Can you be more specific about what you mean by 'condense'? Are you talking about stripping out unnecessary formatting, HTML tags, etc.? That's what my bid is for. Or is the heart of this project about processing the body of articles and pages to produce summaries in a more Summly-esque manner?
Hi, I am interested in this project. I have experience with both Windows Forms and batch projects, as well as HTTP scripts. Please provide an example input; I'll stick with 150 until then.
Hi, Electronics Engineer here with R&D experience in my background. I was a software development lead at a company. I can create an algorithm and its flowchart ASAP.