Project Description

While the FRDCSA has a ridiculous number of projects ongoing at any time, one recently developing project has taken my fancy because it promises to pay good dividends. The project, called "Raiders of the FTP Sites", more or less speaks for itself: it is a system that searches FTP sites for interesting artifacts and retrieves them. It works by correlating subject matter of interest with FTP sites, extracting a recursive directory listing of each site, and then performing several analyses of the contents. While still a very immature system, it has already yielded a labelled resume corpus, which will help the job-search program.
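
As a rough illustration of the listing-and-analysis steps described above, here is a minimal Python sketch. It is not the project's actual code: the function names `walk_ftp` and `match_interests` are hypothetical, the keyword match on file paths stands in for whatever analyses the real system performs, and it assumes the server supports the MLSD command (many older servers do not).

```python
from ftplib import FTP  # only needed when connecting to a live server


def walk_ftp(ftp, root="/"):
    """Yield the full path of every file under root (hypothetical helper).

    `ftp` is any object with an mlsd(path) method that yields
    (name, facts) pairs, as ftplib.FTP does.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        for name, facts in ftp.mlsd(directory):
            path = directory.rstrip("/") + "/" + name
            kind = facts.get("type")
            if kind == "dir":
                stack.append(path)   # descend into this directory later
            elif kind == "file":
                yield path
            # other types (cdir, pdir, links) are skipped


def match_interests(paths, keywords):
    """Return the paths that contain any keyword, ignoring case.

    A stand-in assumption for the "analyses of the contents" step.
    """
    lowered = [k.lower() for k in keywords]
    return [p for p in paths if any(k in p.lower() for k in lowered)]


# Example against a live server (host name is a placeholder):
#   ftp = FTP("ftp.example.org")
#   ftp.login()                      # anonymous login
#   hits = match_interests(walk_ftp(ftp), ["resume", "cv"])
#   ftp.quit()
```

Keeping the walk separate from the matching means the listing can be saved once and re-analysed for different subjects without contacting the site again.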

