In Progress

Write a script to convert formatted PDF exams to XML/HTML files.

I have hundreds of PDF exam files, all in the same format. I want to load all the exams into a database, and for that I need them in a format I can parse easily, such as XML or HTML.

The info I need from each exam is:

Exam name

For each question:

1. Question number (and, if the exam is divided into topics, which topic the question belongs to).

2. Question text (the actual question).

3. If the question is multiple choice, the text of each choice (the question title specifies whether it is a multiple-choice question).

4. Question answer.

5. Question answer explanation.
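The fields listed above could map onto a record structure along these lines. This is only a sketch: the class names, the `exam_to_xml` helper, and the XML tag and attribute names are assumptions made for illustration, not a schema given in the brief.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import xml.etree.ElementTree as ET


@dataclass
class Question:
    number: int
    text: str
    answer: str
    explanation: str
    topic: Optional[str] = None  # set only when the exam is divided into topics
    choices: List[str] = field(default_factory=list)  # empty unless multiple choice


@dataclass
class Exam:
    name: str
    questions: List[Question] = field(default_factory=list)


def exam_to_xml(exam: Exam) -> str:
    """Serialize one exam to an XML string (hypothetical layout)."""
    root = ET.Element("exam", name=exam.name)
    for q in exam.questions:
        attrs = {"number": str(q.number)}
        if q.topic is not None:
            attrs["topic"] = q.topic
        q_el = ET.SubElement(root, "question", attrs)
        ET.SubElement(q_el, "text").text = q.text
        if q.choices:
            choices_el = ET.SubElement(q_el, "choices")
            for choice in q.choices:
                ET.SubElement(choices_el, "choice").text = choice
        ET.SubElement(q_el, "answer").text = q.answer
        ET.SubElement(q_el, "explanation").text = q.explanation
    return ET.tostring(root, encoding="unicode")
```

Keeping the topic as an optional attribute and the choices as an optional child element means exams without topics, and questions without choices, produce the same layout with those parts simply absent.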

The hard part is that fields 2-5 might contain images. If there is an image, it should be extracted to a file and referenced from the correct place in the output.
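One way to keep an image reference "in the correct place" is to treat each field as an ordered list of text runs and image blobs, write each blob to a file, and emit an element pointing at it between the surrounding text. The sketch below assumes a PDF library has already produced those fragments in reading order; the `build_field` name, the `img` tag, and the `.png` extension are assumptions, not requirements from the brief.

```python
from pathlib import Path
from typing import List, Union
import xml.etree.ElementTree as ET

# str = a run of text, bytes = raw image data (assumed already extracted from the PDF)
Fragment = Union[str, bytes]


def build_field(tag: str, fragments: List[Fragment], image_dir: Path, prefix: str) -> ET.Element:
    """Build <tag> holding text and <img src="..."/> children in reading order."""
    image_dir.mkdir(parents=True, exist_ok=True)
    el = ET.Element(tag)
    last_img = None
    count = 0
    for frag in fragments:
        if isinstance(frag, bytes):
            count += 1
            # Assumption: the data is PNG; a real script would detect the type.
            path = image_dir / f"{prefix}_{count}.png"
            path.write_bytes(frag)
            last_img = ET.SubElement(el, "img", src=path.as_posix())
        elif last_img is None:
            # Text before the first image belongs to the element itself.
            el.text = (el.text or "") + frag
        else:
            # Text after an image is stored as that image's tail.
            last_img.tail = (last_img.tail or "") + frag
    return el
```

For example, `build_field("text", ["See the diagram ", image_bytes, " and choose."], Path("images"), "q4")` would write `images/q4_1.png` and place the `img` element between the two text runs.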

I don't mind if the script or program you supply handles one exam at a time; I'll write a script that runs it on all the files.
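The wrapper that runs a one-exam converter over every file could be as small as the sketch below. `convert_all` and the `convert_one(pdf_path, xml_path)` callback are hypothetical names; the callback stands in for whatever single-exam converter is delivered.

```python
from pathlib import Path
from typing import Callable, List


def convert_all(pdf_dir: Path, out_dir: Path,
                convert_one: Callable[[Path, Path], None]) -> List[Path]:
    """Run convert_one on every *.pdf in pdf_dir; return the PDFs that failed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        target = out_dir / (pdf.stem + ".xml")
        try:
            convert_one(pdf, target)
        except Exception as exc:
            # Keep going so one malformed exam does not stop the whole batch.
            print(f"FAILED {pdf.name}: {exc}")
            failed.append(pdf)
    return failed
```

Returning the list of failures makes it easy to re-run or inspect only the exams that did not fit the expected format.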

I'm attaching a sample exam file; there is an image in question #4. Later I'll supply 2 more exams that should cover basically all the possible cases of how an exam can look.

Skills: PDF, Python, Software Architecture


About the Employer:
(5 reviews) Tel-Aviv, Israel

Project ID: #8182201

Awarded to:

try67

Hi, my name is Gilad (originally from Israel, by the way), and I specialize in creating custom-made tools for PDF files. I had a look at your file and read the instructions and I believe I can develop for you either...

$350 USD in 5 days
(107 Reviews)
6.6
mwarrenschultz

Hello. I have read your project description, and I would enjoy creating the program for you. Converting from one format to another is generally pretty easy, but as you noted, these may contain images. The way I would s...

$277 USD in 3 days
(58 Reviews)
6.4

18 freelancers are bidding on average $246 for this job

nitelfreelance

Hi, here is a previous project of mine that is like yours: [login to view URL] It first converts the PDF file to XML and then extracts some useful data from the XML. I will write a Python script that extracts the in...

$299 USD in 5 days
(42 Reviews)
6.3
pnvasko

Hello, I'm a novice freelancer with great experience in development, and I want to do this as quickly and efficiently as possible. Please send more details about this job. Any questions are welcome! Best regards, Vasiliy

$150 USD in 5 days
(35 Reviews)
6.3
NomiHD

I have experience writing Python scripts to make my clients' lives easier. I have written many scripts to extract information from different resources. I can provide you with the output of all the files according to...

$150 USD in 5 days
(43 Reviews)
5.5
sergpooh

Hello, I have experience with PDF processing. It's an interesting task for me. Do all the files have the same structure?

$222 USD in 10 days
(12 Reviews)
5.3
AnilTejwani73

Hi, I am confident I can deliver more than you expect, if given a chance. Kindly have a look at my profile, and if it interests you, let's discuss the project further. I plan to do it preferably in PHP, el...

$249 USD in 15 days
(8 Reviews)
4.3
zkutch

Hello. More than 20 years of programming experience. I need more details to set a realistic time and price. Regards.

$250 USD in 3 days
(19 Reviews)
4.7
binarycodersvw

Hi. I am a Visual Basic .NET programmer with 10 years' experience (I know you hear things like this a lot, but I can show proof). This bid was placed by an automated robot, but I have read the description of your project and I...

$456 USD in 7 days
(15 Reviews)
3.9
lismanb

Hi, I have 4 years of Python coding experience. I saw that pdfminer does a good job of extracting the text and images for your project. I can have this script ready and tested locally in 2-3 days, and for the rest...

$555 USD in 7 days
(6 Reviews)
3.8
ibaydan

A proposal has not yet been provided

$155 USD in 1 day
(5 Reviews)
3.2
brindusealex

Hi, I have a lot of experience with Python and I'm sure I can finish your project by Monday. We can discuss the details in private!

$277 USD in 3 days
(1 Review)
1.5
kiranreddy85

A proposal has not yet been provided

$155 USD in 3 days
(0 Reviews)
0.0
joshusre

I have extensive experience in processing large amounts of data (databases with hundreds of millions of rows) and changing it into a different format (data on a website into a searchable database, a searchable database into...

$155 USD in 7 days
(0 Reviews)
0.0
Amoruka

Hello! I'm sorry to waste your time. I'm not competing with other freelancers here and I'm not applying for payment. I want to upgrade my skills on real tasks and gain experience. I want to take your project f...

$244 USD in 10 days
(0 Reviews)
0.0
qilei2011

A valid proposal has not yet been provided

$155 USD in 3 days
(0 Reviews)
0.0
ruhshan

A proposal has not yet been provided

$111 USD in 5 days
(0 Reviews)
0.0
rlcrane22

Hello, I have a good idea for a script that reads a folder and gets all the exam files in it. It will then create a new folder with the new XML files, along with a picture folder to reference which picture came from which e...

$222 USD in 15 days
(0 Reviews)
0.0