My organization owns and operates a website, a legal portal that provides assistance to individuals and entities who are on the receiving end of mass John Doe copyright infringement and hacking cases. We maintain a large library of complaints (more than 1,000), each of which represents a single case filed somewhere in the USA.
We need someone to go through the library of complaints (which is organized by both Plaintiff and Jurisdiction), identify the work or works at issue in each complaint (usually identified on pages 2-4 of the complaint, or in an exhibit at the end of the complaint), and make a list of all of the works in all of the complaints.
The "work product" at the end will be an Excel spreadsheet:
- Column A being the alphabetical list of all works;
- Column B being the number of complaints in which the work is at issue; and
- Columns C-..., identifying the individual cases in which the work is at issue.
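To make the column layout concrete, here is a minimal sketch of how the rows could be assembled once the works have been identified. It assumes the findings are recorded as a mapping from a case identifier to the works named in that complaint. The function names, the sample case labels, and the use of CSV (which Excel can open) in place of a native Excel file are all illustrative assumptions, not requirements of the project.

```python
import csv
import io
from collections import defaultdict


def build_rows(case_works):
    """Turn {case identifier: [works]} into spreadsheet rows.

    Each row is [work, number of complaints, case, case, ...].
    """
    cases_by_work = defaultdict(set)
    for case_id, works in case_works.items():
        for work in works:
            cases_by_work[work.strip()].add(case_id)

    rows = []
    # Column A: works in alphabetical order (case-insensitive).
    for work in sorted(cases_by_work, key=str.casefold):
        cases = sorted(cases_by_work[work])
        # Column B: number of complaints; columns C onward: one case per cell.
        rows.append([work, len(cases), *cases])
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data, for illustration only.
sample = {
    "Case 001": ["Work B"],
    "Case 002": ["Work A", "Work B"],
    "Case 003": ["Work A"],
}
rows = build_rows(sample)
print(rows_to_csv(rows))
```

Using a set per work keeps a case from being counted twice if the same work is entered more than once for one complaint.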
Most cases involve just one work; others involve multiple works. The task is pretty straightforward, and while a script could get someone pretty far, some of the complaint PDFs are poor scans, so you will need to open those files and scroll through to find the work at issue.
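One way to split the scripted part from the manual part is sketched below. It assumes the text of each PDF has already been extracted by some PDF tool and simply flags files whose extracted text is too sparse to trust. The function names, the file names, and the 200-letter threshold are hypothetical choices for illustration and would need tuning against the actual library.

```python
def needs_manual_review(extracted_text, min_letters=200):
    """Return True when the extracted text has too few letters to be usable."""
    letters = sum(1 for ch in extracted_text if ch.isalpha())
    return letters < min_letters


def triage(texts_by_file, min_letters=200):
    """Split files into those a script can try to parse and those a person should open."""
    automatic, manual = [], []
    for name in sorted(texts_by_file):
        if needs_manual_review(texts_by_file[name], min_letters):
            manual.append(name)
        else:
            automatic.append(name)
    return automatic, manual


# Hypothetical file names and text, for illustration only.
sample_texts = {
    "complaint_a.pdf": "The work at issue is identified in Exhibit A. " * 10,
    "complaint_b.pdf": "",  # a poor scan typically yields little or no text
}
automatic, manual = triage(sample_texts)
print("Script candidates:", automatic)
print("Open by hand:", manual)
```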