Create a script for data extraction from images

€250-750 EUR

Closed
Posted about 5 years ago

Paid on delivery
Expected behavior:
1) User opens a web page with a file upload form
2) User uploads a scanned document
3) Data is extracted in the backend via a script
4) Data is saved in a MySQL database via a script

Please see the link below for scanned document examples: [login to view URL]

Data from example document #2 that should be extracted:
Invoice Number - Pavadzīme Nr. ABT 2930
Date
Supplier - Piegādātājs AB Transystems SIA
Supplier address - [login to view URL] Maskavas iela 227, Rīga, LV-1019
Supplier Registration Number - Reģ. Nr. 40003741261
VAT Nr - PVN Nr - LV40003741261
Bank account Nr. - Konts - LV91HABA0551009805564
Sum without VAT - Summa bez PVN: 6.66
Sum with VAT - Summa ar PVN (EUR) 8.06
Product Fields - Preču nosaukums, Mērv, Daudz, Cena, Summa (JTH 48B M10x30 regulējošās kājiņas gabals 4 0,590 2,39)

There will be multiple scanned document templates with different designs/looks.

If you’re up for this job, please provide us with the necessary information:
a) Which programming language/libraries will you be using?
b) For multiple templates, will the same data extraction logic/pattern be applied, or will it need to be customized for each template?
c) What would be the minimum requirements for the scanned document in terms of quality and dimensions (px) for the script to work?
d) When can you start work on this project?

For this project it would be best to use an already available solution. I would suggest using Apache Tika ([login to view URL])
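As a rough illustration of step 3, the header fields listed for example document #2 could be pulled out of OCR text with label-based regular expressions. This is only a sketch under stated assumptions: the OCR step itself is not shown, `SAMPLE_OCR_TEXT` is a hypothetical rendering of what an OCR engine might return for that document (built from the values quoted above), and the names `FIELD_PATTERNS`, `parse_amount` and `extract_fields` are invented for this example. It also assumes amounts never contain thousands separators.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for example document #2; real OCR text will be noisier.
SAMPLE_OCR_TEXT = """\
Pavadzīme Nr. ABT 2930
Piegādātājs AB Transystems SIA
Maskavas iela 227,Rīga,LV-1019
Reģ. Nr. 40003741261
PVN Nr. LV40003741261
Konts LV91HABA0551009805564
Summa bez PVN: 6.66
Summa ar PVN (EUR) 8.06
"""

# One pattern per field, keyed on the Latvian label printed on the document.
# These patterns are assumptions that fit this one template only.
FIELD_PATTERNS = {
    "invoice_number": r"Pavadzīme\s+Nr\.?\s*(.+)",
    "supplier": r"Piegādātājs\s+(.+)",
    "registration_number": r"Reģ\.?\s*Nr\.?\s*(\d+)",
    "vat_number": r"PVN\s+Nr\.?\s*(LV\d+)",
    "bank_account": r"Konts\s+([A-Z]{2}\d{2}[A-Z0-9]+)",
    "sum_without_vat": r"Summa\s+bez\s+PVN:?\s*([\d.,]+)",
    "sum_with_vat": r"Summa\s+ar\s+PVN(?:\s*\(EUR\))?:?\s*([\d.,]+)",
}


def parse_amount(raw: str) -> Decimal:
    """Convert '8.06' or '0,590' to Decimal (assumes no thousands separators)."""
    return Decimal(raw.replace(",", "."))


def extract_fields(ocr_text: str) -> dict:
    """Return a dict of extracted fields; a field is None when its label is not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, ocr_text)
        result[name] = match.group(1).strip() if match else None
    # Normalise the two totals so they can be stored as numeric columns.
    for key in ("sum_without_vat", "sum_with_vat"):
        if result[key] is not None:
            result[key] = parse_amount(result[key])
    return result


if __name__ == "__main__":
    for field, value in extract_fields(SAMPLE_OCR_TEXT).items():
        print(f"{field}: {value}")
```

Fields that come back as `None` would be natural candidates for manual review before the row is written to the database.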
Project ID: 19019769

About the project

7 proposals
Remote project
Active 5 yrs ago

7 freelancers are bidding on average €552 EUR for this job
Hello, I have gone through your job posting and am very interested in working with you. I am an expert in this field and have already completed several projects like this. For evidence, you can see my profile. Please visit: https://www.freelancer.com/u/schoudhary1553 I have an excellent command of English. I am hard-working, productive and worthy of your attention, and I hope I would be the right candidate for this job. Awaiting an affirmative response from you. Kind regards, Sandeep
€500 EUR in 6 days
4.9 (271 reviews)
7.8
Hi there, I am a data extraction automation expert from Bosnia & Herzegovina, Europe. I have carefully gone through your requirements and I would like to help you with this project! I can start immediately and finish it within the agreed deadline. Check out my profile, portfolio and former clients' feedback; that will let you know everything about me. Please feel free to contact me so that we can discuss further details. Thank you for taking the time to read my proposal. I am looking forward to hearing from you. Best regards, Miljan
€338 EUR in 5 days
4.9 (134 reviews)
7.5
Hiya. How is your day? I just checked your project “Create a script for data extraction from images” and I am interested in it. I am an experienced developer in PHP, Angular, React and Node, and I can handle your design well. I am ready to provide full service, from design to maintenance. I would like to discuss more details via chat, and I hope we will build a good working relationship on this project.
€500 EUR in 10 days
4.6 (82 reviews)
7.3
Hi there,

The requirements look quite clear and straightforward to implement. However, Tika is certainly not the tool you're looking for: it targets metadata and structured text. If I'm not missing anything here, what you need is optical character recognition (OCR) to parse the data from scanned images, and those two are very different things. Building a solution from scratch is way out of scope for this project and overkill in the first place, so I suggest using tesseract-ocr, a renowned open-source engine for this type of work. I have used it several times with pretty good results. I should note, though, that as with any OCR implementation the success rate won't be 100%, meaning there will be files it won't be able to parse, e.g. a badly scanned image.

About your questions:
1. I'm planning to use Python with Tesseract.
2. It won't work on different templates. The effort needed to make it work for another template is directly related to the difference between the templates. For instance, if it's a completely different template with a new layout, font, color, etc., a brand new parser has to be created for it from scratch.
3. It's impossible to give any decent figure for that. Layout, font, coloring and clarity all affect the result. For instance, the last page of the example document is very hard, if not impossible, to parse.
4. I can start next Wednesday, the 27th of March, and expect this to take 15 days. I'll need lots of these files to train the engine.
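The point in answer 2, that each layout needs its own parser, can be sketched as a small registry that picks a parser by a marker found in the OCR text. This is a minimal illustration and not the bidder's actual design: the OCR step is assumed to have already produced plain text, and both templates, their markers and all function names here are hypothetical.

```python
import re
from typing import Callable, Dict, List, Optional, Pattern, Tuple

Parser = Callable[[str], Dict[str, str]]


def parse_template_a(text: str) -> Dict[str, str]:
    """Hypothetical layout that labels the invoice number 'Pavadzīme Nr.'."""
    match = re.search(r"Pavadzīme\s+Nr\.?\s*(.+)", text)
    return {"invoice_number": match.group(1).strip()} if match else {}


def parse_template_b(text: str) -> Dict[str, str]:
    """Second hypothetical layout that labels the number 'Invoice No:'."""
    match = re.search(r"Invoice\s+No[.:]?\s*(\S+)", text)
    return {"invoice_number": match.group(1)} if match else {}


# Each entry pairs a marker that identifies a template with its parser.
# Supporting a new layout means adding a new entry and a new parser.
TEMPLATES: List[Tuple[Pattern[str], Parser]] = [
    (re.compile(r"Pavadzīme\s+Nr"), parse_template_a),
    (re.compile(r"Invoice\s+No"), parse_template_b),
]


def parse_document(text: str) -> Optional[Dict[str, str]]:
    """Run the first parser whose marker matches; None means unknown template."""
    for marker, parser in TEMPLATES:
        if marker.search(text):
            return parser(text)
    return None


if __name__ == "__main__":
    print(parse_document("Pavadzīme Nr. ABT 2930"))
    print(parse_document("unreadable scan"))
```

Returning `None` for an unrecognised layout lets the upload page report that the document needs manual handling instead of saving partial data.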
€1,000 EUR in 15 days
5.0 (46 reviews)
6.0
Hello, how are you? I read your posting carefully. I am an OCR expert with 10 years of experience. C/C++ and OpenCV are my top skills, and I can build your project in full. I can provide top quality at high speed. If you want this project to succeed, please contact me and I will deliver good results. Hire me.
€500 EUR in 10 days
5.0 (10 reviews)
5.9
Hello, I have read the details provided and I am positive I can provide quality work. Please contact me to discuss the project deadline and a few other things.
€250 EUR in 10 days
4.9 (8 reviews)
4.7

About the client

Flag of LATVIA
Riga, Latvia
0.0
0
Member since Feb 9, 2019

Client Verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)