mdr

Code: GitHub

Project Description

mdr is a library that detects and extracts listing data from HTML pages. It is based on "Finding and Extracting Data Records from Web Pages", but replaces the similarity measure with tree alignment, as proposed in "Web Data Extraction Based on Partial Tree Alignment" and "Automatic Wrapper Adaptation by Tree Edit Distance Matching".
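
To make the idea of a tree-based similarity concrete, here is a minimal sketch of Simple Tree Matching, the matching step that partial tree alignment builds on. It is not taken from mdr's code: the names `simple_tree_match`, `tree_size`, and `similarity`, the use of `xml.etree.ElementTree` for the toy trees, and the normalization by average tree size are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def simple_tree_match(a, b):
    """Return the number of matched nodes in a top-down matching of a and b."""
    # Trees whose roots differ do not match at all.
    if a.tag != b.tag:
        return 0
    kids_a, kids_b = list(a), list(b)
    m, n = len(kids_a), len(kids_b)
    # table[i][j] = best match count between the first i children of a
    # and the first j children of b, keeping sibling order.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i][j - 1],
                table[i - 1][j],
                table[i - 1][j - 1]
                + simple_tree_match(kids_a[i - 1], kids_b[j - 1]),
            )
    # Add 1 for the matched roots.
    return table[m][n] + 1


def tree_size(node):
    """Count the nodes in the subtree rooted at node."""
    return 1 + sum(tree_size(child) for child in node)


def similarity(a, b):
    """Match count normalized by the average size of the two trees (assumed)."""
    return simple_tree_match(a, b) / ((tree_size(a) + tree_size(b)) / 2)


# Hypothetical toy records: two list items that share an <a> child.
first = ET.fromstring("<li><a/><span/></li>")
second = ET.fromstring("<li><a/><b/></li>")
print(simple_tree_match(first, second))  # 2
print(similarity(first, first))          # 1.0
```

Under these assumptions, sibling subtrees whose similarity exceeds some chosen threshold would be treated as candidate data records; the threshold and the way records are grouped are left open here.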

Capabilities

  • Fix the mdr algorithm sometime.
  • Finish that part of mdr
  • Finish mdr finally.
  • Work with mdr to make lists of all the tabled information.
  • Process metasites with mdr
  • Complete implementation of mdr
  • Here are some of the things we need to do next: fix the way PSE craps out all the time; get Clairvoyance working with some basic document management systems, including, for instance, authorized reports that confirm back to me that various people are learning various things (based on testing); find some way to get those recipes normalized; get shops up and running and taking inventory of everything I have; add this information to Verber; get mdr (minimum detection route) planning operational; get new tagsets for AWB/predator working; fix the problem with the script that determines perl dependencies; make packages of my systems and upload them; fix the mini-dinstall problem; write a tutorial on agentification; agentify or otherwise get command-line bugzilla working; fix up manager with sleep learning capabilities; create a sample course for Clairvoyance; and run backup. You're such a lazy guy, you know that?


This page is part of the FWeb package.
Last updated Sat Oct 26 17:00:32 EDT 2019.