Parse and compare Excel data in different files and show differences -- Use Perl, Python or any scripting language -- Needs to run on Linux or Windows
$10-30 AUD
Closed
Posted almost 10 years ago
Paid on delivery
RULES:
Do NOT ask us what our budget is. Bid exactly what you will charge us.
Your feedback MUST be greater than 5.
Your completion rate MUST be better than 65%.
DO NOT waste our time if you do not meet the above criteria. If you do, we will flame you as a result. Respect our rules and we will respect you.
BRIEF:
I have numerous Excel files that contain about 65,000 rows of customer data each. Here are the column names:
State
txtRentalOrderId
txtCreationDate
txtCustomerName
txtContactName
txtTerm
txtFromDate
txtSalesperson
txtToDate
txtStatus
txtAssetList
txtOrderTotal
Each file:
a) is saved in the format of: "Orders - June 7 [login to view URL]" (where the date part changes with each save and depends on the date the file is saved; right now I am trying to generate a new file 3 times a week).
b) contains details of client orders. The unique ID for each order can be generated by merging the columns "State" and "txtRentalOrderId" using the concatenate function.
c) is a representation/snapshot of all orders at a given point in time.
Each older file (e.g. "Orders - June 1 [login to view URL]") contains data that is a subset of the file after it (e.g. "Orders - June 7 [login to view URL]"). Each newer file contains all data from the previous file PLUS all data that has been generated since.
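As a rough illustration of points (b) and (c), the sketch below builds the order ID and checks the subset assumption between two snapshots. It assumes each file has already been read into a list of dicts keyed by the column names above (for example with a spreadsheet library). The function names `order_key` and `missing_orders` and the sample rows are made up for this example.

```python
def order_key(row):
    """Build the unique order ID by concatenating State and txtRentalOrderId."""
    # str() and strip() guard against numeric cells and stray spaces. This is an
    # assumption about how the cell values arrive, not something the files guarantee.
    return f"{str(row['State']).strip()}{str(row['txtRentalOrderId']).strip()}"


def missing_orders(older_rows, newer_rows):
    """Return order IDs found in the older snapshot but absent from the newer one.

    If every newer file really is a superset of the one before it, this is empty.
    """
    newer_keys = {order_key(r) for r in newer_rows}
    older_keys = {order_key(r) for r in older_rows}
    return sorted(older_keys - newer_keys)


# Hypothetical sample data standing in for two parsed files.
june_1 = [{"State": "NSW", "txtRentalOrderId": 1001}]
june_7 = [
    {"State": "NSW", "txtRentalOrderId": 1001},
    {"State": "VIC", "txtRentalOrderId": 1002},
]
print(missing_orders(june_1, june_7))  # []
```

A non-empty result would mean an order disappeared between snapshots, which is worth flagging before any column-level comparison.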
SOLUTIONS I AM SEEKING:
1) Determine which order ("txtRentalOrderId") has changed the sales exec ("txtSalesperson") associated with it over time.
For example, a particular client has had an open order in the system with Mike. Mike quits the company and assigns all his orders to Jack. I now need to track which orders changed hands and who they were assigned to. This is particularly important as Mike and Jack were good friends at work. So by assigning all his likely bookings to Jack, Mike is giving Jack an unfair advantage over the rest of the team.
2) Determine which order ("txtRentalOrderId") had its client ID changed ("txtCustomerName").
For example, "txtCustomerName" was "Lenovo" to start with. Then, the sales exec who was dealing with Lenovo was told by its rep that they will not proceed with their booking due to a change in requirements. Now, if the sales exec closes the quote as LOST, it will affect their conversion (one of the KPIs). So instead of marking the booking as LOST, the sales exec just changes the "txtCustomerName" to a different client (i.e. HP) when a new quote request comes through from HP (instead of creating a totally new quote in the name of HP). That way, the sales exec can keep on changing the "txtCustomerName" until some client agrees to the quote and the quote is closed as WON. Such behaviour manipulates the conversion rate of the sales exec in question and makes it seem that they are winning a lot more quotes.
3) Determine which order ("txtRentalOrderId") had its start date changed ("txtFromDate").
Sales execs should only be allowed to change the start date of a quote once. For example, a client initially indicates that they want to start their event on July 20. So the entry in the "txtFromDate" column becomes "July 20". However, closer to the date, the client decides to change the start date to August 20. The sales exec then changes the "txtFromDate" to August 20. However, closer to the date, the client once again is undecided and changes the date to October 20. At this point, the sales exec should close the quote as LOST and move on. However, what I am seeing is that the sales exec keeps on pushing the date forward endlessly. This protects their conversion rate as the quote is not marked as LOST by August 20.
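The three checks above come down to the same comparison: walk the snapshots from oldest to newest and record every time a tracked column changes for a given order. The following is only a minimal sketch of that idea, not a required design. It assumes the files are already parsed into lists of dicts and sorted by date; `find_changes`, `changed_more_than`, and the sample rows are hypothetical. Values are compared exactly as they appear, so the same date written in two formats would be reported as a change.

```python
from collections import Counter

TRACKED = ("txtSalesperson", "txtCustomerName", "txtFromDate")


def find_changes(snapshots, tracked=TRACKED):
    """Compare snapshots (oldest first) and list every change in a tracked column.

    snapshots: list of (label, rows) pairs, rows being dicts keyed by column name.
    """
    last_seen = {}  # (order ID, column) -> most recent value
    changes = []
    for label, rows in snapshots:
        for row in rows:
            key = f"{row['State']}{row['txtRentalOrderId']}"
            for column in tracked:
                value = row.get(column)
                slot = (key, column)
                if slot in last_seen and last_seen[slot] != value:
                    changes.append(
                        {"order": key, "column": column, "old": last_seen[slot],
                         "new": value, "snapshot": label}
                    )
                last_seen[slot] = value
    return changes


def changed_more_than(changes, column, limit):
    """Return order IDs whose `column` changed more than `limit` times."""
    counts = Counter(c["order"] for c in changes if c["column"] == column)
    return sorted(order for order, n in counts.items() if n > limit)


def row(salesperson, customer, from_date):
    # Hypothetical sample order; only the tracked columns are filled in.
    return {"State": "NSW", "txtRentalOrderId": "1001",
            "txtSalesperson": salesperson, "txtCustomerName": customer,
            "txtFromDate": from_date}


snapshots = [
    ("June 1", [row("Mike", "Lenovo", "July 20")]),
    ("June 7", [row("Jack", "Lenovo", "August 20")]),
    ("June 9", [row("Jack", "HP", "October 20")]),
]
changes = find_changes(snapshots)
for change in changes:
    print(change)
print(changed_more_than(changes, "txtFromDate", 1))  # ['NSW1001']
```

The last call covers the "start date may change only once" rule: any order listed there had its start date moved at least twice.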
INPUT
Any number of Excel files
OUTPUT
Show the results of the comparisons between the files compared. So if we are parsing 20 files, we need to see what changes have occurred over time. (For example, if a particular client had changed hands between 4 different salespeople, the output file would show such a change 3 times.)
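One possible shape for the output is a flat change log with one row per change, so an order that passed through 4 salespeople produces 3 rows. The sketch below writes such a log as CSV. The column layout, the `write_report` name, and the sample names are assumptions for illustration, not a format the brief prescribes.

```python
import csv
import io

FIELDS = ["order", "column", "old", "new", "snapshot"]


def write_report(changes, stream):
    """Write one CSV row per detected change, preceded by a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(changes)


# Hypothetical change records: one order passing through 4 salespeople.
changes = [
    {"order": "NSW1001", "column": "txtSalesperson",
     "old": "Mike", "new": "Jack", "snapshot": "June 7"},
    {"order": "NSW1001", "column": "txtSalesperson",
     "old": "Jack", "new": "Anna", "snapshot": "June 9"},
    {"order": "NSW1001", "column": "txtSalesperson",
     "old": "Anna", "new": "Raj", "snapshot": "June 11"},
]

buffer = io.StringIO()
write_report(changes, buffer)
print(buffer.getvalue())
```

To write to a real file instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")`; the resulting CSV opens directly in Excel.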
START DATE
ASAP
DEADLINE
1 week
MILESTONE RELEASES
Released within 24 hours of successful UAT completion.
Good luck!
Hello, I can write a system that will import your Excel files into a database and perform all 3 outlined actions. You will also be able either to clear the database before uploading more files or to select the timeframe (determined by the filenames) you wish to work on.
I can lower my bid by $200 if you can provide the files in CSV format instead of XLS (or XLSX).
MS Excel has an option to save a file as CSV.
Thanks!
Hello,
Greetings from Shweta in Bangalore India.
I have expertise in Perl and can provide you with a viable solution to handle this.
I can also provide you with a GUI based application.
Kindly revert back.
Regards,
Shweta
Hi there,
I've been working with Excel since before '98 and I've been using all possible versions of it.
I have years of experience working with macros and writing code / debugging VBA code.
This is what I intend to use for this application: VBA(Excel) or Visual Basic 6.
Do not hesitate to contact me for further details.
Kind Regards,
Jon
Hello,
I can write this script in PHP or Python.
Let me know which language you prefer; the quality of the work is assured.
You can open my profile and see I have 70+ feedback and more than 65% completion rate.
Thanks.
Hi, I am a Master's student in Embedded Systems, and during my coursework I have processed Excel files using Perl. I used the Win32::OLE::Const 'Microsoft Excel' Perl module to process the files on Windows. I need to look up how it will work on Linux. Since Excel cannot be used on Linux, will you be using OpenOffice? We can discuss further if you are interested in working with me.
I am a public accountant and management expert, proficient in Excel, Word, and PowerPoint. My goal is to achieve customer satisfaction in a timely manner, so that I can continue working for you with the assurance that the service you require will always be of high quality in all aspects.
Hello,
MY SOLUTION:
I can do the project for you using JavaScript (Nashorn), which is based on Java 8, so it can run on any system.
I am going to provide a file that can be run from the command line, with basic command-line options.
ABOUT ME:
I am an experienced JavaScript / Java programmer. I write proper unit tests when coding, so the code quality is guaranteed. I put my clients' satisfaction as my top priority. My past clients are very satisfied with my work.
MILESTONE:
Should you award the project to me, please pay 70% when I deliver a runnable file that can do the work. Please pay the rest when you are totally satisfied.
ALTERNATIVE SOLUTIONS:
I wish to offer a couple of alternative solutions for your reference.
Because the XLS files carry a large amount of repetitive data, the job is best suited to a database. Each time a snapshot is taken, only the new data is added to the database. You can then query the database about whatever you want at any time, and such queries are very fast. With the file-based solution described above, any query is very slow because it has to read through all the XLS files.
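A minimal sketch of the database idea, assuming SQLite through Python's built-in `sqlite3` module (a different language from the Nashorn solution offered above, used here only for illustration). The `history` table, the function names, and the sample rows are hypothetical. A value is stored only when it differs from the latest stored value for that order, so repeated snapshots add very little data.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the history table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS history (
               seq       INTEGER PRIMARY KEY AUTOINCREMENT,
               order_key TEXT NOT NULL,
               field     TEXT NOT NULL,
               value     TEXT,
               snapshot  TEXT NOT NULL
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_history ON history (order_key, field, seq)"
    )
    return conn


def load_snapshot(conn, label, rows, fields=("txtSalesperson",)):
    """Store a value only if it differs from the latest one stored for that order."""
    for row in rows:
        key = f"{row['State']}{row['txtRentalOrderId']}"
        for field in fields:
            value = None if row.get(field) is None else str(row[field])
            latest = conn.execute(
                "SELECT value FROM history WHERE order_key = ? AND field = ? "
                "ORDER BY seq DESC LIMIT 1",
                (key, field),
            ).fetchone()
            if latest is None or latest[0] != value:
                conn.execute(
                    "INSERT INTO history (order_key, field, value, snapshot) "
                    "VALUES (?, ?, ?, ?)",
                    (key, field, value, label),
                )
    conn.commit()


def change_counts(conn, field):
    """Return (order_key, number_of_changes) for orders whose field ever changed."""
    return conn.execute(
        "SELECT order_key, COUNT(*) - 1 FROM history WHERE field = ? "
        "GROUP BY order_key HAVING COUNT(*) > 1 ORDER BY order_key",
        (field,),
    ).fetchall()


# Hypothetical sample snapshots, loaded oldest first.
conn = open_db()
order = {"State": "NSW", "txtRentalOrderId": "1001"}
load_snapshot(conn, "June 1", [{**order, "txtSalesperson": "Mike"}])
load_snapshot(conn, "June 7", [{**order, "txtSalesperson": "Mike"}])
load_snapshot(conn, "June 9", [{**order, "txtSalesperson": "Jack"}])
print(change_counts(conn, "txtSalesperson"))  # [('NSW1001', 1)]
```

Passing a file path to `open_db` instead of the default in-memory database keeps the history between runs, so each new snapshot only needs to be loaded once.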
I wonder how many people need to access the data. If more than 1, it is best to host the database online.
Or, a cheaper solution would be to put the data in an online spreadsheet on Google Docs. Multiple people can access it, and it saves the cost of hosting a database online.