C# - How do I get a list of found words using Lucene.Net?


I have indexed documents with the following content:

Document 1:

Green table stood in the room. The room was small.

Document 2:

Green tables stood in the room. The room was large.

I'm searching for "green table". The search finds Document 1 and Document 2. I want to show which phrases were found: in the first document it was "green table", and in the second document it was "green tables". How do I get a list of the found words ("green table" and "green tables")? I'm using Lucene.Net version 3.0.3.

You can use the Highlighter to mark the "found words". If you want to find them for some other reason, you can still use the Highlighter and then use a regex (or a simple substring loop) to extract the words.

For example:

    Query objQuery = new TermQuery(new Term("content", strQuery));
    QueryScorer scorer = new QueryScorer(objQuery, "content");
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<b>", "</b>");
    Highlighter highlighter = new Highlighter(formatter, scorer);
    highlighter.TextFragmenter = new SimpleFragmenter(9999);

    for (int i = 0; i < topRealtedDocs.ScoreDocs.Length; i++)
    {
        int docId = topRealtedDocs.ScoreDocs[i].Doc;
        Document doc = searcher.Doc(docId);
        TokenStream stream = TokenSources.GetAnyTokenStream(searcher.IndexReader, docId, "content", analyzer);
        string strSnippet = highlighter.GetBestFragment(stream, doc.Get("content"));
        if (strSnippet == null) continue; // no query term matched in this document

        // Here you can do whatever you want with the snippet: add it to the result or,
        // for example, extract the words (not with a regex here, it's only an example; use whatever you need):
        List<string> foundPhrases = new List<string>();
        while (strSnippet.IndexOf("<b>") > -1)
        {
            // Start after the opening tag so the tag itself is not part of the result.
            int indexStart = strSnippet.IndexOf("<b>") + "<b>".Length;
            int indexEnd = strSnippet.IndexOf("</b>");
            foundPhrases.Add(strSnippet.Substring(indexStart, indexEnd - indexStart));
            // Continue after the closing tag, otherwise the next pass finds the same "</b>" again.
            strSnippet = strSnippet.Substring(indexEnd + "</b>".Length);
        }
    }
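The regex alternative mentioned above could look roughly like this. It is only a minimal sketch, assuming the snippet was produced with the "<b>" / "</b>" tags from the example; the class and method names (HighlightExtractor, ExtractHighlighted) are made up for illustration and are not part of Lucene.Net.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HighlightExtractor
{
    // Hypothetical helper (not a Lucene.Net API): collects the text between
    // each <b>...</b> pair that the highlighter inserted into the snippet.
    public static List<string> ExtractHighlighted(string snippet)
    {
        var found = new List<string>();

        // GetBestFragment returns null when no term matched, so guard against it.
        if (string.IsNullOrEmpty(snippet))
            return found;

        // Non-greedy match keeps neighbouring highlighted terms as separate entries.
        foreach (Match m in Regex.Matches(snippet, "<b>(.*?)</b>"))
            found.Add(m.Groups[1].Value);

        return found;
    }
}
```

Calling it as HighlightExtractor.ExtractHighlighted(strSnippet) inside the loop would replace the substring loop. Since the highlighter typically wraps each matched token on its own, a two-word match would likely come back as two entries that you join yourself if you need the phrase.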

omri

