FRDCSA | internal codebases | Classify

[Project image]

Architecture Diagram: GIF

Project Description

Before our project became committed to 100% FOI, we planned to classify various parts of the system. Now classify is mainly used to help release software to the public. It performs de-identification, that is, it removes personal data from material to be released. It also checks the licensing of data sets and software we wish to release.
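The de-identification pass described above is not shown here; the following is a minimal sketch of the idea, scrubbing common personal identifiers with regular expressions before release. The patterns, the `[REDACTED:...]` token, and the `deidentify` function name are illustrative assumptions, not classify's actual implementation (which would also lean on named entity recognition, as the entries below suggest):

```python
import re

# Illustrative patterns for common personal identifiers; a real
# de-identification pass would also use named entity recognition.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def deidentify(text):
    """Replace each match of each pattern with a [REDACTED:<kind>] token."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub("[REDACTED:%s]" % kind, text)
    return text

print(deidentify("Contact jane@example.org or 555-123-4567."))
# → Contact [REDACTED:EMAIL] or [REDACTED:PHONE].
```

A real pipeline would log what was redacted and queue ambiguous hits for manual review rather than silently rewriting them.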


  • Make kbfs2 very robust and add many features. Then use it to classify different files for release. Have it inspect files using a decision agent, and auto-redact parts as needed. Make it part of a codebase separate from myfrdcsa. Give it multiple modes for reasoning about files. Then start asserting a lot of facts, rewrite academician and NLU to use it, and so on.
  • classify should use named entity recognition to recognize names in documents and see whether those names relate to me, such as from previous jobs, and if so, suggest that the material be relocated.
  • Use kbfs to tag which files are known to be software archives, which are known to be data sets, which are known to be neither, and which we know nothing about at all. This database could be bootstrapped by iterating over /var/lib/myfrdcsa/codebases/external versus /var/lib/myfrdcsa/codebases/datasets, and then tagging several items that are neither. Then extract features from those data points, using Sayer, etc., and apply machine learning to automatically classify items as software collections or not.
  • Add a rule to the system to classify messages that are psychotic references into schizophrenai
  • Redo classify.el to be able to mark files using kbfs
  • classify incoming messages according to whether they represent desires or not.
  • Write a political speech filter and add it to classify
  • Look at the entries marked not released and released, and derive rules to implement in classify for declassifying material.
    ("depends" "100277" "100275")
  • Run critic over many entries in order to determine their releasability through classify.
    ("depends" "100275" "100272")
    ("depends" "100277" "100275")
  • This will be useful in creating the models. Note, we should also classify interest in topics, not just documents, since document granularity is not as meaningful. Also, I suppose we can tag authors as being good or not.
  • Need to use corpus to classify email/aim logs for what to do with them.
  • corpus can determine when there is not enough information available for a given classifier to classify an Item, in which case it does various checks, defaulting to asking the user.
  • The functionality classifier helps to classify items that are marked generally as requirements.
  • Could classify corpus entries along emotional lines.
  • I mean classify
  • We will use a classifier to classify writings into topics automatically using the code I just wrote for critic.
  • classify emacs logs as legitimate versus illegitimate, clean them, and use a text classifier.
  • Should read my early writings, but classify out angry parts.
  • Write code to classify the entries in my todo.kif file into projects.
  • Use question classifier (QC.tar) to classify searches as well
  • Develop a decision tree classifier so that I can ask straightforward questions to classify functionality.
  • Should have a bard -d record,index,classify
  • Make an effort to classify existing unilang entries.
    ("pse-has-property" "81599" "habitual")
  • KBS, MySQL:freekbs:default query ("unilang-message-classify" nil "political-action-item")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "44537" "political-action-item")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "45505" "deleted")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "28838" "observation")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "60762" "observation")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "50420" "political-action-item")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "57737" "political-action-item")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "60841" "observation")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59191" "delete")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "60303" "icodebase-capability-request")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "55600" "icodebase-capability-request")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59603" "icodebase-capability-request")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59525" "event")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59642" "complex-statement")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "55512" "icodebase-resource")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59642" "rant")
  • KBS, MySQL:freekbs:default assert ("unilang-message-classify" "59611" "icodebase-resource")
  • Force myself to classify 1000 entries as a start, finish the critic classifier subsystem.
    ("due-date-for-entry" "60341" "Important")
  • Write classify user manual
  • Develop a system to classify files on your desktop and put them in certain places.
  • classify should also check the filenames.
  • Can use a spam filter type assignment for classify.
  • classify should maintain a database of files (and their content) which has been approved for release. Maybe do this as part of subversion or something.
  • Obviously classify should not get rid of its data files.
  • classify performs de-identification of text to remove references to personal information.
  • Determine what the good free named entity recognizers were, now that I've forgotten, and use them with classify.
  • Apply classify to unilang log entries.
  • Setup various key combinations to classify email messages.
    ("comment" "25407" "This is brilliant, store info about email messages in freekbs.")
  • For classify, get in touch with people who make network information detection software and ask whether they have tools.
  • classify is really a system to detect private information.
  • using kissinger corpus, formalize domains they are discussing, and represent communication actions formally, classify them, etc.
  • Use text classification tools to classify resources into appropriate categories.
  • Use Tabari in audience to classify communications.
  • There are various methods of doing matching between agents and capabilities. Perhaps subsumption reasoning is overkill. For instance, maybe we can just take various texts that the user has written and use Bayesian text classification to match problem descriptions to those texts, to determine who is likely to be interested in them.
  • If it's not obvious, unilang will be using corpus to classify the user's entries.
  • Numerous ideas on the subject: the classes should form a type hierarchy. Maybe build a graphical tool to interactively classify these. Convert these systems over to kbfs::Cache, keeping a copy of the current method. For each icodebase and common (agentified), add a class. For instance, rather than classifying into a vague category, you could classify directly to "pse, goal", etc.
  • RSR can use the critic::Classifier system to classify its events into habits.
  • corpus must first chunk, then classify.
  • We should have it so we have to review and classify all thoughts at the end of the day.
  • audience should use ConceptNet's ability to judge the gist and the emotion of a letter to classify angry letters. We should see how that works after setting up the XMLRPC server.
  • You can use typing speed to (help) classify related thoughts in corpus.
  • Justin suggests that we use an LVCSR but have it record the words it doesn't understand, and then manually classify these later, kind of like the way OCR handles text and pictures.
  • Use language features and words to classify political factions.
  • You could use a classifier to classify text files into various soft clusters for projects.
  • Also, note that, presently Memcons are difficult as classify is not complete.
  • Obviously we need to classify this file.
  • Need to figure out a way to classify my unilang.log adequately.
  • Function request: classify software into the correct category by searching it.
  • Watch all pages I go to and index them as part of the system; require online classification, for instance, when I arrive at a page, launch a popup which asks me to classify it, or perhaps do so when I hit a globally bound key.
  • Need to consolidate software that recognizes file types, perhaps using MIME, and that even learns, like my software for files; apply this to both files and URL prediction, i.e. learn all possible information from the name and context. That's a big thing, the context. In fact this is part of an even bigger problem, which can be factored: deciding how data is to be organized. We have to look into the primary classification rules, for instance that one can classify with a formula, by file type, etc., and come up with a scheme for classifying this data. Ultimately this is indeed the kbfs, so should this predictive capability lie within its domain? Or shouldn't it? These questions are interesting, yet my previous background indicates to me they are trivial, just not from our vantage.
  • Machievelli ought to parse unilang-client input and automatically classify it to an appropriate level.
  • Should use machine learning software to classify and suggest tags for a given sentence, like whether it is accusatorial, whether it refers to my parents, or whether it is a feature request, in order to bootstrap the process of reviewing my thoughts.
  • I find the tools for browsing CPAN somewhat inadequate. Hey, I just realized, that one program is very similar: dselect, allows you to brute force browse things, although I don't think it allows you to classify things etc.
  • I feel bad I haven't gotten anything useful done. I need to get the ability to classify perl modules tonight.
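Several entries above (tagging directories as software archives versus data sets, Bayesian matching of problem descriptions, spam-filter-style assignment) reduce to the same pattern: extract simple features and train a text classifier. Below is a minimal multinomial naive Bayes sketch over file-listing tokens; the training data, labels, and function names are illustrative assumptions, not the actual kbfs/Sayer pipeline:

```python
import math
from collections import Counter, defaultdict

# Toy training data: directory listings labeled as software archives or
# data sets. Real features would come from kbfs/Sayer feature extraction.
TRAIN = [
    (["Makefile", "configure", "main.c", "README"], "software"),
    (["setup.py", "src", "tests", "LICENSE"],       "software"),
    (["train.csv", "test.csv", "labels.txt"],       "dataset"),
    (["corpus.xml", "vocab.txt", "README"],         "dataset"),
]

def train(examples):
    """Count token frequencies per class for multinomial naive Bayes."""
    counts = defaultdict(Counter)
    priors = Counter()
    for tokens, label in examples:
        priors[label] += 1
        counts[label].update(tokens)
    return counts, priors

def classify_tokens(tokens, counts, priors):
    """Pick the class maximizing log P(class) + sum of log P(token|class)."""
    vocab = {t for c in counts.values() for t in c}
    best, best_score = None, float("-inf")
    for label in priors:
        total = sum(counts[label].values())
        score = math.log(priors[label] / sum(priors.values()))
        for t in tokens:
            # Laplace smoothing so unseen tokens don't zero out the score
            score += math.log((counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

counts, priors = train(TRAIN)
print(classify_tokens(["main.c", "Makefile"], counts, priors))  # → software
```

The same skeleton works for the unilang-message-classify assertions above: replace file tokens with message words and the labels with categories like "observation" or "political-action-item", and seed the counts from the existing freekbs assertions.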

This page is part of the FWeb package.
Last updated Sat Oct 26 16:50:49 EDT 2019.