Nutch: setting up Tomcat


After building the war file for Nutch with:

ant war

put nutch-*.war and nutch.xml into

/opt/tomcat/apache-tomcat-7.0.12/webapps
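As a rough sketch, the copy step could be wrapped in a small shell function. The function name deploy_nutch and its two arguments are my own invention, and it assumes the war file and nutch.xml sit in the same source directory, which may not match where your build puts them.

```shell
# Hypothetical helper: copy the Nutch war file and nutch.xml into
# Tomcat's webapps directory.
#   $1 = directory holding nutch-*.war and nutch.xml (assumed to be together)
#   $2 = Tomcat webapps directory
deploy_nutch() {
  src_dir="$1"
  webapps_dir="$2"
  cp "$src_dir"/nutch-*.war "$webapps_dir"/ &&
  cp "$src_dir"/nutch.xml "$webapps_dir"/
}

# Example call (the first path is a placeholder, adjust it to your build):
# deploy_nutch /path/to/build /opt/tomcat/apache-tomcat-7.0.12/webapps
```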

Restart Tomcat and it will expand the war file automatically.
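Since restarting comes up several times in these notes, here is a minimal sketch of a restart helper. It relies on the shutdown.sh and startup.sh scripts that ship in Tomcat's bin directory; the function name restart_tomcat and the default 5 second pause are my own assumptions, not something Tomcat requires.

```shell
# Hypothetical helper: stop and start Tomcat using its bundled scripts.
#   $1 = Tomcat home directory
#   $2 = seconds to wait between stop and start (optional, defaults to 5)
restart_tomcat() {
  tomcat_home="$1"
  wait_secs="${2:-5}"
  # Ignore a failed shutdown (e.g. Tomcat was not running) and start anyway.
  "$tomcat_home/bin/shutdown.sh" || true
  sleep "$wait_secs"
  "$tomcat_home/bin/startup.sh"
}

# Example call:
# restart_tomcat /opt/tomcat/apache-tomcat-7.0.12
```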

I also created a symlink at

/opt/tomcat/apache-tomcat-7.0.12/conf/Catalina/localhost/nutch.xml pointing to /opt/tomcat/apache-tomcat-7.0.12/webapps/nutch.xml. I don't know if it's necessary. Once you create it, you'll have to restart Tomcat.
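Creating that link might look something like the following. The function name link_context is made up for illustration, and the mkdir is only there in case conf/Catalina/localhost does not exist yet.

```shell
# Hypothetical helper: link the context file from webapps into
# conf/Catalina/localhost.
#   $1 = Tomcat home directory
link_context() {
  tomcat_home="$1"
  # Create the target directory if it is missing.
  mkdir -p "$tomcat_home/conf/Catalina/localhost"
  # -s makes a symbolic link, -f replaces an existing link of the same name.
  ln -sf "$tomcat_home/webapps/nutch.xml" \
         "$tomcat_home/conf/Catalina/localhost/nutch.xml"
}

# Example call:
# link_context /opt/tomcat/apache-tomcat-7.0.12
```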

Edit nutch.xml to tell it where the index is located:

<Parameter override="false" name="searcher.dir" value="/opt/nutch/apache-nutch-1.2/wks"/>

In theory, it's now set up and you can access it at http://<hostname>:8080/nutch

Also, to make sure Tomcat can handle searches in Chinese, edit

/opt/tomcat/apache-tomcat-7.0.12/conf/server.xml

to say:
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" useBodyEncodingForURI="true" />

Obviously, restart Tomcat.
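Tomcat 7 decodes request URIs as ISO-8859-1 unless told otherwise, which is why the two extra attributes matter for Chinese queries. A quick way to confirm the edit took is to grep server.xml for them. This is only a sketch: the function name has_utf8_connector is hypothetical, and it checks that both attributes appear somewhere in the file, not that they sit on the port 8080 Connector.

```shell
# Hypothetical check: succeed only if server.xml mentions both attributes.
#   $1 = path to server.xml
has_utf8_connector() {
  grep -q 'URIEncoding="UTF-8"' "$1" &&
  grep -q 'useBodyEncodingForURI="true"' "$1"
}

# Example call:
# has_utf8_connector /opt/tomcat/apache-tomcat-7.0.12/conf/server.xml && echo "UTF-8 settings found"
```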
