regex - Search through list of strings and determine if there is an exact match in separate list of strings. python. sentiment analysis -


Suppose I have a list of keywords and a list of sentences:

    keywords = ['foo', 'bar', 'joe', 'mauer']
    listofstrings = ['i frustrated', 'this task foobar', 'mauer awesome']

How can I loop through listofstrings and determine whether each string contains any of the keywords? It must be an exact match, such that:

    >>> for i in listofstrings:
    ...     for p in keywords:
    ...         if p in i:
    ...             print(i)
    ...
    'mauer awesome'

(Only the last sentence should be printed, because 'foobar' is not an exact match for 'foo' or 'bar'. The function should still catch 'foobar' if it is itself a keyword.)

I suspect re.search may be the way to go, but I can't figure out how to loop through the list using variables rather than verbatim expressions with the re module.
Thanks.

Instead of checking whether each keyword is contained anywhere in the string, you can break the sentences down into words and check whether each of them is a keyword. Then you won't have problems with partial matches.

Here, re_word is defined as a regular expression that matches a word boundary, at least one letter, and another word boundary. You can use re.findall() to find all the words in a string. re.compile() pre-compiles the regular expression so that it doesn't have to be parsed from scratch for every line.
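To see what the pattern extracts, here is a minimal sketch run against one of the sentences from the question. The name `words` is introduced only for illustration, and the character class `[a-zA-Z]` assumes the words are plain ASCII letters.

```python
import re

# Word boundary, one or more ASCII letters, word boundary.
re_word = re.compile(r'\b[a-zA-Z]+\b')

# findall() returns every non-overlapping match as a list of strings.
words = re_word.findall('this task foobar')
print(words)  # ['this', 'task', 'foobar']
```

Because 'foobar' comes back as a single word, neither 'foo' nor 'bar' appears in the result on its own.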

frozenset() is an efficient data structure that can answer the question “is the given word in the frozen set?” faster than is possible by scanning through a long list of keywords and trying every one of them.

    #!/usr/bin/env python2.7

    import re

    # Word boundary, one or more letters, word boundary.
    re_word = re.compile(r'\b[a-zA-Z]+\b')

    keywords = frozenset(['foo', 'bar', 'joe', 'mauer'])
    listofstrings = ['i frustrated', 'this task foobar', 'mauer awesome']

    for i in listofstrings:
        for word in re_word.findall(i):
            if word in keywords:
                print(i)
                # Stop after the first hit so each sentence prints once.
                break
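If you would rather stay with re.search, as the question suggests, one possible approach is to build a single pattern from the keyword list. The sketch below is an alternative to the word-splitting code above, not a replacement for it; the names `pattern`, `sentences_with_keyword` and `matches` are made up for this example. It assumes every keyword starts and ends with a letter, digit or underscore, since `\b` behaves differently next to other characters.

```python
import re

keywords = ['foo', 'bar', 'joe', 'mauer']
listofstrings = ['i frustrated', 'this task foobar', 'mauer awesome']

# Builds the pattern \b(?:foo|bar|joe|mauer)\b from the list.
# re.escape() guards against keywords that contain regex metacharacters.
pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'
)

def sentences_with_keyword(sentences):
    """Return the sentences that contain at least one whole-word keyword."""
    return [s for s in sentences if pattern.search(s)]

matches = sentences_with_keyword(listofstrings)
print(matches)  # ['mauer awesome']
```

With a very long keyword list the frozenset lookup is likely to scale better, so treat this variant as a convenience for short lists.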
