import jieba.posseg as pseg

# Default mode
seg_list = pseg.cut("我今天吃早饭了")
for word, flag in seg_list:
    print(word + " " + flag)
"""
Output in jieba's default mode:
我 r
Prefix dict has been built successfully.
今天 t
吃 v
早饭 n
了 ul
"""
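# paddle mode
import jieba
jieba.enable_paddle()  # paddle mode must be enabled first; it needs paddlepaddle-tiny installed
words = pseg.cut("我今天吃早饭了", use_paddle=True)
for word, flag in words:
    print(word + " " + flag)
"""
Output in paddle mode:
我 r
今天 TIME
吃 v
早饭 n
了 xc
"""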
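To make the difference between the two modes easier to see, the tagged pairs from the two outputs above can be compared directly. The snippet below is only a sketch: it hard-codes the pairs copied from those outputs instead of calling jieba, and the variable names are made up for illustration.

```python
# (word, tag) pairs copied from the two outputs above; jieba is not called here.
default_tags = [("我", "r"), ("今天", "t"), ("吃", "v"), ("早饭", "n"), ("了", "ul")]
paddle_tags = [("我", "r"), ("今天", "TIME"), ("吃", "v"), ("早饭", "n"), ("了", "xc")]

# Collect the words whose tag differs between the two modes.
differences = [
    (word, default_tag, paddle_tag)
    for (word, default_tag), (_, paddle_tag) in zip(default_tags, paddle_tags)
    if default_tag != paddle_tag
]

for word, default_tag, paddle_tag in differences:
    print(f"{word}: default={default_tag}, paddle={paddle_tag}")
```

For this sentence the segmentation is identical in both modes; only the tags for 今天 and 了 differ.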
writer.add_document(
    title=u"document1",
    path="/tmp1",
    content=u"Tracy McGrady is a famous basketball player, the elegant basketball style of him attract me")
writer.add_document(
    title=u"document2",
    path="/tmp2",
    content=u"Kobe Bryant is a famous basketball player too , the tenacious spirit of him also attract me")
writer.add_document(
    title=u"document3",
    path="/tmp3",
    content=u"LeBron James is the player i do not like")
for keyword in ("basketball", "elegant"):
    print("searched keyword ", keyword)
    query = parser.parse(keyword)
    results = searcher.search(query)
    for hit in results:
        print(hit.highlights("content"))
    print("=" * 50)
The code above uses add_document() to add each document to the index, then searches those documents for the ones that contain "basketball" and "elegant".
The output is as follows:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\wyzane\AppData\Local\Temp\jieba.cache
Loading model cost 0.754 seconds.
Prefix dict has been built successfully.
searched keyword basketball
McGrady is a famous <b class="match term0">basketball</b> player, the elegant...<b class="match term0">basketball</b> style of him attract me
Bryant is a famous <b class="match term0">basketball</b> player too , the tenacious
==================================================
searched keyword elegant
basketball player, the <b class="match term0">elegant</b> basketball style
==================================================
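Conceptually, adding documents and then looking them up by keyword is what an inverted index does. The following is a minimal sketch of that idea using only the standard library. It is not how Whoosh is implemented; `TinyIndex` and its methods are hypothetical names chosen to mirror the calls used above.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical, not Whoosh): token -> paths of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of document paths
        self.docs = {}                    # path -> stored fields

    def add_document(self, title, path, content):
        self.docs[path] = {"title": title, "content": content}
        # Naive tokenization: lowercase, split on whitespace, trim punctuation.
        for raw in content.lower().split():
            token = raw.strip(",.")
            if token:
                self.postings[token].add(path)

    def search(self, keyword):
        paths = sorted(self.postings.get(keyword.lower(), set()))
        return [self.docs[p] for p in paths]


index = TinyIndex()
index.add_document("document1", "/tmp1",
                   "Tracy McGrady is a famous basketball player, the elegant basketball style of him attract me")
index.add_document("document2", "/tmp2",
                   "Kobe Bryant is a famous basketball player too , the tenacious spirit of him also attract me")
index.add_document("document3", "/tmp3", "LeBron James is the player i do not like")

for keyword in ("basketball", "elegant"):
    print(keyword, [doc["title"] for doc in index.search(keyword)])
```

As in the real output above, "basketball" matches the first two documents and "elegant" matches only the first.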
With different search terms:
for keyword in ("LeBron", "Kobe"):
    print("searched keyword ", keyword)
    query = parser.parse(keyword)
    results = searcher.search(query)
    for hit in results:
        print(hit.highlights("content"))
    print("=" * 50)
The search results are as follows:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\wyzane\AppData\Local\Temp\jieba.cache
Loading model cost 0.801 seconds.
Prefix dict has been built successfully.
searched keyword LeBron
<b class="match term0">LeBron</b> James is the player i do not like
==================================================
searched keyword Kobe
<b class="match term0">Kobe</b> Bryant is a famous basketball player too , the tenacious
==================================================
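The `<b class="match term0">` markup in these results comes from hit.highlights("content"). As a rough illustration of what that wrapping amounts to, here is a small standard-library sketch. It is an assumption-laden simplification: the `highlight` helper is hypothetical, and unlike Whoosh it wraps matches in the full text and does not cut the text into excerpts.

```python
import re


def highlight(text, term, css_class="match term0"):
    """Wrap each whole-word, case-insensitive occurrence of term in a <b> tag (hypothetical helper)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: f'<b class="{css_class}">{m.group(0)}</b>', text)


print(highlight("LeBron James is the player i do not like", "LeBron"))
```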
for keyword in ("篮球", "麦迪"):
    print("searched keyword ", keyword)
    query = parser.parse(keyword)
    results = searcher.search(query)
    for hit in results:
        print(hit.highlights("content"))
    print("=" * 50)
The results are as follows:
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\wyzane\AppData\Local\Temp\jieba.cache
Loading model cost 0.780 seconds.
Prefix dict has been built successfully.
searched keyword 篮球
麦迪是一位著名的<b class="match term0">篮球</b>运动员,他飘逸的打法深深吸引着我
科比是一位著名的<b class="match term0">篮球</b>运动员,他坚韧的精神深深的感染着我
==================================================
searched keyword 麦迪
<b class="match term0">麦迪</b>是一位著名的篮球运动员,他飘逸的打法深深吸引着我
==================================================
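Matching 篮球 inside a sentence only works if the text is segmented into words before indexing, since Chinese has no spaces between words. The sketch below illustrates that point with a toy forward-maximum-matching segmenter. This is not jieba's actual algorithm, and the tiny `VOCAB` set is a hypothetical stand-in for a real dictionary.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Toy segmenter: at each position take the longest vocabulary word, else a single character."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


VOCAB = {"麦迪", "一位", "著名", "篮球", "运动员"}  # hypothetical mini dictionary
sentence = "麦迪是一位著名的篮球运动员"

# Whitespace splitting leaves the whole sentence as one token, so the keyword is not a term.
print("篮球" in sentence.split())
# After segmentation the keyword becomes a term that an index can match.
print(forward_max_match(sentence, VOCAB))
```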