I've worked with both Tropo and Twilio many times, including building applications on both that place outbound voice calls, send SMS messages, and record inbound calls, so I'm already very familiar with that part. I also have over 8 years of experience with PHP and more than a dozen years of writing code, and can handle anything you need. I've worked with ffmpeg before and can combine audio files without a problem.
The only thing I have questions about is the voice recognition software that you say isn't working very well. I'm curious to know what the issue is. For example, is the quality of the audio provided to it too poor for it to produce a good result? In any event, whatever the issue is, if it lies with PHP or with the server software, I am the best person for the job and will be able to fix it for you. Check my profile for more details, and let's chat to discuss more.