Calling HanLP from PyCharm on Ubuntu: practice notes

A few days ago I watched the live stream of the Big Data Forum Summit hosted by Da Kuai and was pleasantly surprised to learn that HanLP 2.0 had been released. Version 2.0 is said to support any number of languages, which sounds promising. Detailed information about HanLP 2.0 will probably take a while to appear, so for now I can only wait. In the meantime, here is an article by an expert about an experiment in calling HanLP from PyCharm on Ubuntu.

The following is the full text:

First, in PyCharm, click File, select Settings, open Project Interpreter under Project, and click the plus sign on the right.

Search for JPype and install the JPype version that matches your Python version.

Next, download the hanlp.jar package, the model data package, and the configuration file hanlp.properties from https://github.com/hankcs/HanLP/releases. Create a new folder named Hanlp and put hanlp.jar and hanlp.properties in it; then create another folder named hanlp and put data in it.
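Before starting the JVM, it can help to confirm that the files ended up where the steps above put them. The following is only a minimal sketch: the helper name missing_hanlp_files is hypothetical, the jar file name is taken from the demo script in this article, and the two directories are the article's example locations, so adjust all three to your own setup.

```python
import os


def missing_hanlp_files(jar_dir, data_parent, jar_name="hanlp-1.2.7.jar"):
    """Return the expected HanLP paths that do not exist yet.

    jar_dir     -- folder holding the jar and hanlp.properties
    data_parent -- folder that contains the unpacked "data" directory
    jar_name    -- assumed jar file name; change it to the release you downloaded
    """
    expected = [
        os.path.join(jar_dir, jar_name),
        os.path.join(jar_dir, "hanlp.properties"),
        os.path.join(data_parent, "data"),
    ]
    return [path for path in expected if not os.path.exists(path)]


# Example locations from this article; they will be reported as missing
# on a machine where HanLP has not been set up there.
for path in missing_hanlp_files("/home/javawork/Hanlp", "/home/javawork/hanlp"):
    print("missing: " + path)
```

An empty result means the jar, the properties file, and the data folder are all in place.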

Edit hanlp.properties in the Hanlp folder so that root points to the directory that contains data. Because I put data under /home/javawork/hanlp, the setting is: root=/home/javawork/hanlp/
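The root setting can of course be changed by hand in any editor. For repeated setups, a small script is one option. The sketch below is an assumption-laden illustration, not part of HanLP: set_hanlp_root is a hypothetical helper, and the sample properties text is made up for the demonstration rather than copied from a real hanlp.properties file.

```python
import re


def set_hanlp_root(properties_text, data_parent):
    """Return properties_text with its active root= entry set to data_parent."""
    # Use exactly one trailing slash, as in root=/home/javawork/hanlp/
    root_line = "root=" + data_parent.rstrip("/") + "/"
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only touch an active entry; lines starting with '#' are comments.
        if re.match(r"\s*root\s*=", line):
            lines[i] = root_line
            replaced = True
    if not replaced:
        lines.append(root_line)
    return "\n".join(lines) + "\n"


# Illustrative input only (hypothetical content).
sample = "#root=D:/example/\nroot=D:/old/path/\nOtherKey=value\n"
print(set_hanlp_root(sample, "/home/javawork/hanlp"))
```

To apply it to a real file, read the file's text, pass it through the function, and write the result back.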

Next, create a new file demo_hanlp.py with the following code:

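#!/usr/bin/env python2.7
# coding=utf-8

from jpype import *

# The JVM can be started only once per process, so only one startJVM call may be active.
# Variant with explicit heap sizes (note the ";" separator and the missing leading "/",
# neither of which works on Ubuntu):
# startJVM(getDefaultJVMPath(), "-Djava.class.path=home/javawork/Hanlp/hanlp-1.2.7.jar;home/javawork/Hanlp/", "-Xms1g", "-Xmx1g")

# On Ubuntu, ":" separates the jar from the folder that holds hanlp.properties.
startJVM(getDefaultJVMPath(), "-Djava.class.path=/home/qinghua/javawork/Hanlp/hanlp-1.2.7.jar:/home/qinghua/javawork/Hanlp")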

HanLP = JClass('com.hankcs.hanlp.HanLP')

# Chinese word segmentation
# (HanLP segments Chinese text; the sample sentences appear here in English translation.)
print(HanLP.segment("Hello, welcome to call HanLP's API in Python"))

testCases = [
    "Goods and Services",
    "Married and unmarried are indeed interfering with the participle",
    "Buy fruits and come to the Expo site and finally die at the Expo"]

for sentence in testCases:
    print(HanLP.segment(sentence))

# Named entity recognition and part-of-speech tagging

NLPTokenizer = JClass('com.hankcs.hanlp.tokenizer.NLPTokenizer')

print(NLPTokenizer.segment('Professor Zong Chengqing from the Institute of Computing Technology of the Chinese Academy of Sciences is teaching natural language processing courses'))

# Keyword extraction
# The continuation lines must follow each other directly: a blank line after "\" ends the statement.
document = "Chen Mingzhong, Director of the Department of Water Resources of the Ministry of Water Resources, revealed at a press conference held by the Information Office of the State Council on September 29, " \
           "According to the assessment of the water resources management system just completed, some provinces are close to the red line indicator, " \
           "Some provinces exceed the red line indicators. For some places that exceed the red line, Chen Mingzhong said that regional approvals for some water extraction projects will be restricted. " \
           "Strictly carry out water resources demonstration and approval of water extraction permits."

print(HanLP.extractKeyword(document, 2))

# Automatic summary

print(HanLP.extractSummary(document, 3))

# Dependency parsing

print(HanLP.parseDependency("Mr. Xu also specifically helped him determine the painting of eagles, squirrels and sparrows as the main targets."))

shutdownJVM()
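The demo script starts the JVM at the top and shuts it down at the very end, so an exception in between would skip shutdownJVM(). One possible way to tidy this up is sketched below. It is not from the original article: run_with_jvm is a hypothetical helper, and it takes the jpype module as a parameter so the function itself can be defined and exercised without a JVM. It relies only on JPype's isJVMStarted, startJVM, getDefaultJVMPath, and shutdownJVM functions.

```python
def run_with_jvm(jvm, class_path_option, work):
    """Start the JVM if needed, run work(), then shut the JVM down again.

    jvm is expected to be the jpype module, or any object that exposes the same
    isJVMStarted / startJVM / getDefaultJVMPath / shutdownJVM functions.
    """
    started_here = False
    if not jvm.isJVMStarted():
        jvm.startJVM(jvm.getDefaultJVMPath(), class_path_option)
        started_here = True
    try:
        return work()
    finally:
        # Shut down only a JVM that this call started; JPype cannot
        # restart the JVM in the same process afterwards.
        if started_here:
            jvm.shutdownJVM()


# Usage with JPype installed (paths are this article's examples):
# import jpype
# print(run_with_jvm(
#     jpype,
#     "-Djava.class.path=/home/qinghua/javawork/Hanlp/hanlp-1.2.7.jar:/home/qinghua/javawork/Hanlp",
#     lambda: str(jpype.JClass('com.hankcs.hanlp.HanLP').segment("Goods and Services"))))
```

Because the JVM cannot be restarted once it has been shut down, all HanLP calls for one run belong inside a single work() function.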

Note that the class path separator on Ubuntu is ":", whereas on Windows it is ";".
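To avoid hard-coding the separator, the class path option can be assembled with Python's os.pathsep, which is ":" on Ubuntu and ";" on Windows. This is a minimal sketch: build_class_path_option is a hypothetical helper name, and the paths are the example paths from the demo script.

```python
import os


def build_class_path_option(jar_path, config_dir, sep=os.pathsep):
    """Build the -Djava.class.path option from the jar and the folder with hanlp.properties."""
    return "-Djava.class.path=" + sep.join([jar_path, config_dir])


# sep is passed explicitly here only to show the Ubuntu form on any platform;
# leave it out to use the separator of the system the script runs on.
print(build_class_path_option(
    "/home/qinghua/javawork/Hanlp/hanlp-1.2.7.jar",
    "/home/qinghua/javawork/Hanlp",
    sep=":"))
```

The returned string can be passed to startJVM in place of the hand-written option.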

A collection of frequently asked questions about calling HanLP is available at:

github.com/hankcs/HanLP/issues?page=3&q=is%3Aissue+is%3Aopen

Author: imperfect00
