Create a scraper for CFR regulations

$250-750 USD

Closed
Posted over 10 years ago

Paid on delivery
File scraper, downloader, and file processor. This project consists of two parts:

1. Spider through a website and download all files that result from the spidering.
2. Convert each downloaded file to a specific format.

Part one:

You will be given a batch of starting URLs that look like this: [login to view URL]

Follow each of these URLs to a second page with links that look like this: [login to view URL]

Follow each of those links to a third page with links that look like this: [login to view URL]

Follow each of those links to a page that links to specific documents. The links within these pages tend to look like this:

<table width="480"><tr> <td><table width="120"> <tr><td> <a class="tpl" href="/cgi/t/text/text-idx?c=ecfr&SID=f68f503ab8017206c54fb367aaaa7851&amp;rgn=div8&amp;view=text&amp;node=10:1.0.1.1.4.1.56.1&amp;idno=10"> &sect;5.100</a></td></tr> </table></td> <td><table width="354"> <tr><td>Purpose and effective date.</td></tr> </table></td> </tr></table>

Each of these links leads to a page that needs to be saved using a naming structure that looks like this: [login to view URL]

Other examples of naming structures: 6cfrAppendix A to Part [login to view URL]

Part two:

After you have downloaded each file, you will need to put it into a specific HTML page structure.

1. Strip everything before <!-- startDynamic --> and everything after <!-- endDynamic -->.
2. Create a header for each record that looks like the headers in the sample files.
3. Replace the graphics paths wherever they occur in the text. Replace the string <img src="/graphics/ with the string <img src="[login to view URL] and replace the string <a href="/graphics/pdfs/ with the string <a href="[login to view URL]
4. Create a footer at the bottom of each section, after the <p class="cita"> element, that looks like this example: <p class="cita">[54 FR 53314, Dec. 28, 1989]</p> <br><p><center>Copyright 2013 Compliance Publishing Corporation (877) 500-6737</center> </body> </html>
5. The program must accommodate both regular regulations and the Appendix sections.
6. Some of the titles have one less level. The program must let the user select how many levels deep the individual text is located.
7. All of the search-and-replace definitions must be kept outside the program, in text files that can be modified as needed.
8. We require the source code as well as the finished program at the end of the project.
9. Attached is a program that completed most of these tasks but no longer works correctly because of a minor change in the text formatting (the programmer is no longer available). You may wish to use this program as a guide.
10. Attached are raw data documents and finished documents to be used as a guide.

Please review the information carefully before you provide a bid, as there will be no changes to the contract price once we accept your bid.

Please view the attached file for a sample of what the file format will be when completed. It contains both regular and appendix text.

We provide all funds in an escrow account.
You must complete this project within 30 days (or less).
You must reply to all communications within 24 hours.
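For bidders who want to see the shape of the work, the link-following step and steps 1, 3 and 7 of part two could be sketched roughly as below. This is only an illustration under stated assumptions, not the attached program: the function names, the tab-separated rule-file format, and the example.invalid replacement URL are all hypothetical, and the real replacement targets are the ones hidden behind [login to view URL] above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

START_MARKER = "<!-- startDynamic -->"
END_MARKER = "<!-- endDynamic -->"


class TplLinkParser(HTMLParser):
    """Collect absolute URLs from <a class="tpl" href="..."> anchors."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if tag == "a" and attr_map.get("class") == "tpl" and href:
            # HTMLParser has already decoded &amp; in the attribute value.
            self.links.append(urljoin(self.base_url, href))


def extract_links(page_html, base_url):
    """Return the document links found on one index page."""
    parser = TplLinkParser(base_url)
    parser.feed(page_html)
    return parser.links


def strip_to_dynamic(page_html):
    """Keep only the text between the two markers (part two, step 1)."""
    start = page_html.index(START_MARKER) + len(START_MARKER)
    end = page_html.index(END_MARKER, start)
    return page_html[start:end]


def parse_rules(rule_text):
    """Parse search/replace pairs kept outside the program (step 7).

    Assumed format: one 'search<TAB>replace' pair per line; blank lines
    and lines starting with '#' are ignored.
    """
    rules = []
    for line in rule_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        search, replace = line.split("\t", 1)
        rules.append((search, replace))
    return rules


def apply_rules(text, rules):
    """Apply every search/replace pair in order (step 3)."""
    for search, replace in rules:
        text = text.replace(search, replace)
    return text


if __name__ == "__main__":
    # Placeholder rule text; in practice this would be read from a text file.
    rule_text = '<img src="/graphics/\t<img src="https://example.invalid/graphics/\n'
    page = 'x<!-- startDynamic --><img src="/graphics/a.gif"><!-- endDynamic -->y'
    print(apply_rules(strip_to_dynamic(page), parse_rules(rule_text)))
```

Keeping the rules in a plain text file means a future change in the source formatting can be handled by editing that file rather than the program. The header, footer, file naming, and selectable level depth are left out of this sketch because they depend on the attached samples.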
Project ID: 5335612

About the project

20 proposals
Remote project
Active 10 yrs ago

20 freelancers are bidding on average $605 USD for this job
Hi, our team is interested in your project. We specialize in web crawlers and have already implemented a number of web scrapers. We propose to use C# for development. Best regards, Kate ProTeam SPb
$752 USD in 15 days
4.9 (32 reviews)
6.5
Hi, I have reviewed the requirements and checked your attachment. Everything is clear and I am ready to start the project. Of course, I am ready to provide a demo on request. Thanks
$1,263 USD in 10 days
4.9 (74 reviews)
6.0
Hi, I didn't see finished documents in the attachment. Let me know which one of the files is final. Do you have the source code of the program that is attached? My development estimates are around 50 hours. Please check my rating and feedback. Regards, Artur
$750 USD in 15 days
5.0 (25 reviews)
5.3
Hi Alan, We can complete this project easily like previous projects. Please award us and discuss more. Thanks and best regards, Mai.
$366 USD in 5 days
5.0 (4 reviews)
3.3
I've done several successful scraping jobs with excellent feedback (see profile). I can do this task with Python in a week or so. I can only accept funds either through freelancer.com or by PayPal transfer. Let me know if that works for you.
$400 USD in 10 days
5.0 (1 review)
2.3
Dear Sir, I have six years of experience in .NET and C# and can complete your work to your satisfaction. I have done projects such as a University Automation System, Site Scraping, an OCR System, a Counseling System, a Stock and Inventory System, etc. I am new to Freelancer and need reputation more than money. I will provide quality output if you award this project to me. Regards, Munish Kumar Matolia
$555 USD in 30 days
5.0 (1 review)
1.5
Hello Sir, I am from vSol CORP. We are a 19-person team and have worked on more than 500 completed projects. We provide round-the-clock service. Sir, we have worked together before. Can this be done manually? We would really like to discuss this project with you. We are open to any type of negotiation. Let's talk. Thanks, vSol CORP
$257 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I have read your description. Yes, I can do it. I will create a Chrome add-on that does what you want: replace content and download files as required. Do you have a mockup for this? In any case, let's discuss it and work out the details. Regards
$722 USD in 28 days
0.0 (0 reviews)
0.0
Dear Employer, Your project sounds interesting and I can do it. I have been a programmer for 20+ years. Your specification is pretty good, but I will have questions if you choose me. Sincerely, Laszlo Nyakas
$555 USD in 15 days
0.0 (0 reviews)
0.0
Hi, I can create this kind of application using C#, but is there any special reason why you chose to scrape the HTML rather than using the XML data they provide and formatting it as you want? Regards,
$833 USD in 30 days
0.0 (0 reviews)
0.0
I have 7+ years of experience as a programmer in various languages. If you want a demo based on the requirements, I'll prepare one for you. Hoping for a positive response.
$250 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I am a skilled application developer with 5+ years of experience in enterprise development. I am new to this website, as I recently began the transition to self-employment through my LLC, Luntspark Systems. Upon reviewing your requirements, I feel that I am highly qualified for your project, as I have years of experience with data-intensive systems, including .NET applications, HTML5/JavaScript/CSS front-ends, and scripting. Parsing out the required information from HTML documents would not be difficult. From there, it's just a matter of composing an HTML template that is to your liking and formatting the data appropriately. If you're interested, we could make this a customizable template that you could modify down the line. Also, I am not a fan of embedding "magic strings" within an application. I will be sure to make all of the options configurable, e.g. the target and replacement URLs for graphics. You might find my bid to be on the low side. My main goal right now is not to make as much money as possible but rather to provide the best service that I possibly can. I take great pride in my work and I would love to give you a product that goes over and above your expectations and inspires you not only to recommend me, but to come back with any future needs. Please do not hesitate to contact me. I appreciate your time and look forward to hearing from you! Best Regards, Tony Lunt, Luntspark Systems, LLC
$416 USD in 15 days
0.0 (0 reviews)
0.0

About the client

Flag of UNITED STATES
Edina, United States
4.9
76
Payment method verified
Member since Aug 13, 2008
