A few days ago I watched the live broadcast of the Big Data Forum Summit held by Da Kuai, and I was pleasantly surprised to see that HanLP 2.0 was released. HanLP 2.0 will support any number of languages, which sounds promising. More details about HanLP 2.0 may take a while to appear, so for now I can only wait. The following is an article by an expert about calling HanLP from PyCharm under Ubuntu.
The following is the full text:
First, in PyCharm, click File, select Settings, click Project Interpreter under Project, and then click the plus sign on the right.
Search for JPype and install the JPype version that matches your Python version.
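Before moving on, it can help to confirm that the interpreter selected in PyCharm actually sees the package. The following is only a small sketch: the helper name is_module_available is made up for illustration, and it assumes that JPype is imported under the module name jpype, as the demo in this article does.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if the current interpreter can locate the given top-level module.

    Hypothetical helper for illustration; it only looks the module up and
    does not import it or start a JVM.
    """
    return importlib.util.find_spec(module_name) is not None


# Example: check for JPype (assumed to be importable as "jpype").
# print(is_module_available("jpype"))
```

If the check returns False, the package was probably installed into a different interpreter than the one the project uses.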
After that, go to https://github.com/hankcs/HanLP/releases
and download the hanlp.jar package, the model data package, and the configuration file hanlp.properties. Create a new folder named Hanlp
and put hanlp.jar and hanlp.properties in it; then create a new folder named hanlp and put the data folder in it.
Finally, modify the root path in hanlp.properties (under Hanlp) so that it points to the directory that contains data. Because I put data under /home/javawork/hanlp, the setting is: root=/home/javawork/hanlp/
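Editing the root entry by hand is easy to get wrong (a missing trailing slash, for example), so the step above can also be scripted. This is a minimal sketch, not part of HanLP: the function set_hanlp_root is a hypothetical helper, and it assumes the configuration file contains a plain root=... line, as in the example above.

```python
import re
from pathlib import Path


def set_hanlp_root(properties_path, data_parent):
    """Point the root= entry of a hanlp.properties file at data_parent.

    Hypothetical helper for illustration. data_parent is the directory
    that contains the data folder. Returns the value that was written.
    """
    path = Path(properties_path)
    # Normalize to exactly one trailing slash, as in root=/home/javawork/hanlp/
    root = str(data_parent).rstrip("/") + "/"
    lines = path.read_text(encoding="utf-8").splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match only an active root entry; commented lines start with '#'.
        if re.match(r"\s*root\s*=", line):
            lines[i] = "root=" + root
            replaced = True
    if not replaced:
        # No root entry found: append one instead of failing silently.
        lines.append("root=" + root)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return root


# Example with the paths from this article (adjust to your own layout):
# set_hanlp_root("/home/qinghua/javawork/Hanlp/hanlp.properties", "/home/javawork/hanlp")
```

All other lines of the file, including comments, are left unchanged.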
Next, create a new file demo_hanlp.py with the following code:
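#! /usr/bin/env python2.7
from jpype import *
# Start the JVM with the HanLP jar and the folder holding hanlp.properties on the classpath.
startJVM(getDefaultJVMPath(), "-Djava.class.path=/home/qinghua/javawork/Hanlp/hanlp-1.2.7.jar:/home/qinghua/javawork/Hanlp")
HanLP = JClass('com.hankcs.hanlp.HanLP')
# Double quotes are needed here because the sentence contains an apostrophe.
print(HanLP.segment("Hello, welcome to call HanLP's API in Python"))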
testCases = [
    "Goods and Services",
    "Married and unmarried are indeed interfering with the participle",
    "Buy fruits and come to the Expo site and finally die at the Expo"]
for sentence in testCases:
    print(HanLP.segment(sentence))
# Segmentation with the NLP tokenizer
NLPTokenizer = JClass('com.hankcs.hanlp.tokenizer.NLPTokenizer')
print(NLPTokenizer.segment('Professor Zong Chengqing from the Institute of Computing Technology of the Chinese Academy of Sciences is teaching natural language processing courses'))
# Adjacent string literals are concatenated, so each part ends with a space.
document = "Chen Mingzhong, Director of the Department of Water Resources of the Ministry of Water Resources, revealed at a press conference held by the Information Office of the State Council on September 29, " \
    "According to the assessment of the water resources management system just completed, some provinces are close to the red line indicator, " \
    "Some provinces exceed the red line indicators. For some places that exceed the red line, Chen Mingzhong said that regional approvals for some water extraction projects will be restricted. " \
    "Strictly carry out water resources demonstration and approval of water extraction permits."
# Keyword extraction, automatic summary, and dependency parsing
print(HanLP.extractKeyword(document, 2))
print(HanLP.extractSummary(document, 3))
print(HanLP.parseDependency("Mr. Xu also specifically helped him determine the painting of eagles, squirrels and sparrows as the main targets."))
shutdownJVM()
Note that the classpath separator is ":" on Ubuntu and ";" on Windows.
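One way to avoid hard-coding the separator is to let Python choose it. The sketch below uses os.pathsep from the standard library, which is ":" on Linux and ";" on Windows; the function name jvm_classpath_option is made up for illustration, and the paths in the usage comment are the ones from the demo above.

```python
import os


def jvm_classpath_option(*entries):
    """Build a -Djava.class.path option using the platform's path separator.

    Hypothetical helper for illustration; entries are jar files or folders.
    """
    return "-Djava.class.path=" + os.pathsep.join(entries)


# Example usage with the demo's paths (adjust to your own layout):
# option = jvm_classpath_option(
#     "/home/qinghua/javawork/Hanlp/hanlp-1.2.7.jar",
#     "/home/qinghua/javawork/Hanlp",
# )
# startJVM(getDefaultJVMPath(), option)
```

With this approach the same demo script can be moved between Ubuntu and Windows after changing only the paths themselves.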
Attached is a collection of frequently asked questions about calling HanLP:
https://github.com/hankcs/HanLP/issues?page=3&q=is%3Aissue+is%3Aopen
Author: imperfect00