Package edu.msu.cme.rdp.classifier
Class Classifier
java.lang.Object
edu.msu.cme.rdp.classifier.Classifier
Performs the classification of query sequences.
-
Field Summary
Fields
Modifier and Type    Field                  Description
static final int     MAX_SEQ_LEN
static final int     MIN_BOOTSTRSP_WORDS
static final int     MIN_GOOD_WORDS
static final int     MIN_SEQ_LEN            The minimum number of bases per sequence.
Method Summary
Modifier and Type / Method / Description
void addConfidence(HierarchyTree node, HashMap map)
    Increases the count of the RankAssignment in the map if it matches that node or any ancestor of that node.
classify(ClassifierSequence seq, int min_bootstrap_words)
    Takes a query sequence and returns the classification result.
classify(edu.msu.cme.rdp.readseq.readers.Sequence seq)
    Takes a query sequence and returns the classification result.
-
Field Details
-
MIN_SEQ_LEN
public static final int MIN_SEQ_LEN
The minimum number of bases per sequence. Initially set to 200.
-
MAX_SEQ_LEN
public static final int MAX_SEQ_LEN
-
MIN_GOOD_WORDS
public static final int MIN_GOOD_WORDS
-
MIN_BOOTSTRSP_WORDS
public static final int MIN_BOOTSTRSP_WORDS
-
-
Method Details
-
getTrainRank
-
classify
public ClassificationResult classify(edu.msu.cme.rdp.readseq.readers.Sequence seq) throws IOException
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words for the calculation. The classifier then randomly chooses one-eighth of all the overlapping words in the query to calculate the joint probability. The number of times a genus is selected out of the number of bootstrap trials is used as an estimate of confidence in the assignment to that genus.
Throws:
ShortSequenceException - if the sequence length is less than the minimum sequence length.
IOException
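The bootstrap procedure described above can be illustrated with a minimal, self-contained sketch. It is not the Classifier's own code: the class `BootstrapSketch`, the per-genus word log-probability map, the `UNSEEN_LOG_PROB` value, the trial count, and sampling with replacement are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of bootstrap confidence; not part of the RDP API.
class BootstrapSketch {
    // Assumed log-probability for a word that a genus has never seen.
    static final double UNSEEN_LOG_PROB = -10.0;

    /** Returns the genus whose summed word log-probabilities (joint probability) is highest. */
    static String bestGenus(List<String> words, Map<String, Map<String, Double>> model) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> genus : model.entrySet()) {
            double score = 0.0;
            for (String word : words) {
                score += genus.getValue().getOrDefault(word, UNSEEN_LOG_PROB);
            }
            if (score > bestScore) {
                bestScore = score;
                best = genus.getKey();
            }
        }
        return best;
    }

    /** Fraction of bootstrap trials that select the same genus as the full-word assignment. */
    static double confidence(List<String> words, Map<String, Map<String, Double>> model,
                             int trials, Random rng) {
        if (words.isEmpty() || model.isEmpty() || trials <= 0) {
            throw new IllegalArgumentException("need words, a model, and at least one trial");
        }
        // Step 1: assign the genus using all the words.
        String assigned = bestGenus(words, model);
        // Step 2: each trial draws one-eighth of the words (assumed: with replacement).
        int subsetSize = Math.max(1, words.size() / 8);
        int hits = 0;
        for (int t = 0; t < trials; t++) {
            List<String> sample = new ArrayList<>(subsetSize);
            for (int i = 0; i < subsetSize; i++) {
                sample.add(words.get(rng.nextInt(words.size())));
            }
            if (assigned.equals(bestGenus(sample, model))) {
                hits++;
            }
        }
        return (double) hits / trials;
    }

    public static void main(String[] args) {
        Map<String, Double> genusA = new HashMap<>();
        Map<String, Double> genusB = new HashMap<>();
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            String word = "w" + i;
            words.add(word);
            genusA.put(word, -1.0);
            genusB.put(word, -5.0);
        }
        Map<String, Map<String, Double>> model = new HashMap<>();
        model.put("GenusA", genusA);
        model.put("GenusB", genusB);
        System.out.println(bestGenus(words, model) + " "
                + confidence(words, model, 100, new Random(42)));
    }
}
```

Summing log-probabilities stands in for the joint probability calculation; a real model would derive those values from training data.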
-
classify
-
classify
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words for the calculation. The classifier then randomly chooses one-eighth of all the overlapping words in the query to calculate the joint probability. The number of times a genus is selected out of the number of bootstrap trials is used as an estimate of confidence in the assignment to that genus.
Throws:
ShortSequenceException - if the sequence length is less than the minimum sequence length.
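The description refers to "all overlapping words in the query". As a rough sketch of what that can mean, the following enumerates every fixed-length substring of a sequence, each shifted by one base from the previous one. The class `OverlappingWordsSketch`, the method name, and the word size passed in are hypothetical; this page does not state the word size the Classifier uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of overlapping-word extraction; not part of the RDP API.
class OverlappingWordsSketch {
    /** Returns every substring of length wordSize, in order of starting position. */
    static List<String> overlappingWords(String bases, int wordSize) {
        if (wordSize <= 0) {
            throw new IllegalArgumentException("wordSize must be positive");
        }
        List<String> words = new ArrayList<>();
        // A sequence of length n yields n - wordSize + 1 words (none if it is too short).
        for (int start = 0; start + wordSize <= bases.length(); start++) {
            words.add(bases.substring(start, start + wordSize));
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [ACGT, CGTA, GTAC]
        System.out.println(overlappingWords("ACGTAC", 4));
    }
}
```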
-
addConfidence
Increases the count of the RankAssignment in the map if it matches that node or any ancestor of that node.
Parameters:
node -
map -
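One way to picture this behaviour is a walk from the node up to the root, incrementing the count of each entry already present in the map. The sketch below uses a hypothetical `Node` type with a parent pointer and a `Map<Node, Integer>` in place of `HierarchyTree` and `RankAssignment`, whose actual fields are not documented on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of counting a node and its ancestors; not part of the RDP API.
class AddConfidenceSketch {
    /** Minimal stand-in for a taxonomy node with a link to its parent. */
    static final class Node {
        final String name;
        final Node parent; // null for the root

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Increments the count of every map entry that matches the node or one of its ancestors. */
    static void addConfidence(Node node, Map<Node, Integer> counts) {
        for (Node current = node; current != null; current = current.parent) {
            Integer count = counts.get(current);
            if (count != null) {
                counts.put(current, count + 1);
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node family = new Node("FamilyX", root);
        Node genusA = new Node("GenusA", family);
        Node genusB = new Node("GenusB", family);

        // The map holds the lineage of the original assignment (GenusA).
        Map<Node, Integer> counts = new HashMap<>();
        counts.put(root, 0);
        counts.put(family, 0);
        counts.put(genusA, 0);

        addConfidence(genusA, counts); // trial picked GenusA: all three ranks match
        addConfidence(genusB, counts); // trial picked GenusB: only Root and FamilyX match
        System.out.println(counts.get(root) + " " + counts.get(family) + " " + counts.get(genusA));
    }
}
```

In this sketch a bootstrap trial that picks a sibling genus still raises the counts of the shared higher ranks, so higher ranks can end up with more support than the genus itself.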
-