Create a multi-core Solr 4 service running on Tomcat 6 on Ubuntu 12.04 LTS

Install Tomcat 6 and some of its accompanying packages

Let's begin with a clean Ubuntu 12.04 LTS server and install Tomcat 6:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install tomcat6 tomcat6-admin tomcat6-common tomcat6-user

The following will be installed:

tomcat6: Servlet and JSP engine
tomcat6-admin: Admin web applications
tomcat6-common: Common files
tomcat6-user: Tools to create user instances

If the install went well you should be able to connect to your server on port 8080.

Give it a try at http://localhost:8080 or http://ServerNameOrIP:8080. You should see the default ‘It works’ page.

Please note that Solr comes with no default security! If your server accepts connections on port 8080 without restrictions, anyone can post data to your system or delete an entire index. So restrict access to port 8080 by IP address. Don’t forget it!
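
One possible way to add such a restriction is Ubuntu's ufw firewall front end. The sketch below only prints the commands so you can review them before running anything. ufw is not part of the original setup, the helper name print_ufw_rules is made up for this example, and 192.0.2.10 is a placeholder for the address of your own web server.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) ufw commands that allow the given
# source addresses to reach a port and deny everyone else.
# Usage: print_ufw_rules PORT ADDRESS [ADDRESS...]
print_ufw_rules() {
    port="$1"
    shift
    # Allow rules are printed first: ufw uses the first rule that matches.
    for addr in "$@"; do
        echo "sudo ufw allow from $addr to any port $port proto tcp"
    done
    echo "sudo ufw deny $port/tcp"
}

# 192.0.2.10 is a documentation address; replace it with your web server's IP.
print_ufw_rules 8080 192.0.2.10
```

The rules only take effect once ufw is enabled (sudo ufw enable). If you manage the server over SSH, allow SSH first so you don't lock yourself out.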

Download and install Solr 4

Download the latest stable Solr 4 build to your home directory.

Get it from one of the mirrors at http://www.apache.org/dyn/closer.cgi/lucene/solr.

cd ~/
wget apache.cu.be/lucene/solr/4.0.0/apache-solr-4.0.0.tgz
tar -zxvf apache-solr-4.0.0.tgz
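
If the extraction fails or you suspect an incomplete download, you can check whether the archive actually contains the WAR file that is needed in the next step. This is a minimal sketch: the helper name archive_has_war is made up, and it assumes the archive keeps the apache-solr-VERSION/dist/ layout used elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given .tgz archive lists the Solr WAR file.
# Usage: archive_has_war ARCHIVE VERSION
archive_has_war() {
    tar -tzf "$1" | grep -q "apache-solr-$2/dist/apache-solr-$2\.war$"
}

# Only run the check when the archive from the step above is present.
if [ -f "$HOME/apache-solr-4.0.0.tgz" ]; then
    if archive_has_war "$HOME/apache-solr-4.0.0.tgz" 4.0.0; then
        echo "WAR file found in archive"
    else
        echo "WAR file missing: the download may be incomplete" >&2
    fi
fi
```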

Include the Solr application as a Tomcat webapp

Now we are going to tell Tomcat to load the Solr application as one of its webapps.

Your Tomcat 6 installation is probably located in the following directories:

/etc/tomcat6
/usr/share/tomcat6

Tip: You can use the following command to find out where Tomcat 6 was installed:

whereis tomcat6

Copy the Solr .war file from your extracted Solr build to the Tomcat webapps directory, renaming it to solr4.war:

sudo mkdir -p /usr/share/tomcat6/webapps
sudo cp ~/apache-solr-4.0.0/dist/apache-solr-4.0.0.war /usr/share/tomcat6/webapps/solr4.war

You may of course place it elsewhere, but make sure the tomcat6 group has sufficient rights to access it.

Now, copy the entire extracted solr application to a new directory:

sudo cp -R ~/apache-solr-4.0.0/ /usr/share/solr4/

We will adapt the included ‘example’ Solr application for Drupal 7 later on.

For now, rename the ‘example’ folder to something that identifies your company or client:

(Replace {mycompanyname} with a name of your choice throughout the rest of the guide.)

sudo mv /usr/share/solr4/example /usr/share/solr4/{mycompanyname}

Along with that you may remove some unnecessary example subfolders:

sudo rm -r /usr/share/solr4/{mycompanyname}/example-DIH
sudo rm -r /usr/share/solr4/{mycompanyname}/exampledocs
sudo rm -r /usr/share/solr4/{mycompanyname}/solr-webapp
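
After the cleanup, the multicore folder must still be in place, because the rest of this guide uses it as the Solr home. A quick sanity check could look like the sketch below. The helper name check_solr_home is made up, and the sketch assumes that the multicore folder contains a solr.xml file next to the core folders, as the Solr 4.0 example does to my knowledge.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory looks like a multi-core Solr home.
# Usage: check_solr_home DIR CORE [CORE...]
check_solr_home() {
    home="$1"
    shift
    status=0
    # solr.xml is assumed to list the cores (Solr 4.0 example layout).
    [ -f "$home/solr.xml" ] || { echo "missing: $home/solr.xml"; status=1; }
    # Every core needs its own conf directory.
    for core in "$@"; do
        [ -d "$home/$core/conf" ] || { echo "missing: $home/$core/conf"; status=1; }
    done
    return "$status"
}
```

For example, check_solr_home /usr/share/solr4/{mycompanyname}/multicore core0 prints nothing and returns 0 when everything is where the following steps expect it.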

Create a context file so that Tomcat knows where to find the Solr home folder:

sudo nano /etc/tomcat6/Catalina/localhost/solr4.xml

And add the following:

<Context docBase="/usr/share/tomcat6/webapps/solr4.war" debug="0" privileged="true" 
         allowLinking="true" crossContext="true">
    <Environment name="solr/home" type="java.lang.String"
                 value="/usr/share/solr4/{mycompanyname}/multicore" override="true" />
</Context>
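
If you prefer not to edit the file by hand (for instance when you set up several servers), the same context file can be generated from a small script. This is only a sketch: the helper name print_solr_context and the company name acme are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the Tomcat context descriptor for a given Solr home.
# Usage: print_solr_context WAR_PATH SOLR_HOME
print_solr_context() {
    cat <<EOF
<Context docBase="$1" debug="0" privileged="true"
         allowLinking="true" crossContext="true">
    <Environment name="solr/home" type="java.lang.String"
                 value="$2" override="true" />
</Context>
EOF
}

# Print to the terminal for review; "acme" is a placeholder company name.
print_solr_context /usr/share/tomcat6/webapps/solr4.war /usr/share/solr4/acme/multicore
```

Once the output looks right, pipe it into sudo tee /etc/tomcat6/Catalina/localhost/solr4.xml to write the file.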

Manage the Tomcat 6 application

We installed the tomcat6-admin and tomcat6-user packages to be able to use the Tomcat 6 manager web application.

To get access to this application, we need to add a user with the required roles:

sudo nano /etc/tomcat6/tomcat-users.xml

Change it so that it looks more or less like this:

<tomcat-users>
    <role rolename="admin"/>
    <role rolename="manager"/>
    <user username="{mycompanyuser}" password="{mycompanypassword}" roles="admin,manager"/>
</tomcat-users>

We'll restart the Tomcat service soon, but first we’re going to configure Solr’s multi-core capability.

Drupal-specific Solr & multi-core configuration

At the time of writing you need the latest version of the apachesolr Drupal module to be able to communicate with Solr 4. You can, for example, download the latest development release at http://ftp.drupal.org/files/projects/apachesolr-7.x-1.x-dev.tar.gz.

cd ~/
wget ftp.drupal.org/files/projects/apachesolr-7.x-1.x-dev.tar.gz
tar -zxvf apachesolr-7.x-1.x-dev.tar.gz

The apachesolr module comes with custom Drupal Solr config files. The module's ‘solr-conf’ subdirectory contains the files needed for each supported Solr version.

In order for Solr to understand Drupal requests you need to copy these config files to the ‘conf’ directory of your Solr core. So let's copy them to Solr's example 'core0' sub-directory:

sudo cp ~/apachesolr/solr-conf/solr-4.x/schema.xml /usr/share/solr4/{mycompanyname}/multicore/core0/conf/schema.xml
sudo cp ~/apachesolr/solr-conf/solr-4.x/solrconfig.xml /usr/share/solr4/{mycompanyname}/multicore/core0/conf/solrconfig.xml
sudo cp ~/apachesolr/solr-conf/solr-4.x/protwords.txt /usr/share/solr4/{mycompanyname}/multicore/core0/conf/protwords.txt
sudo cp ~/apachesolr/solr-conf/solr-4.x/solrcore.properties /usr/share/solr4/{mycompanyname}/multicore/core0/conf/solrcore.properties
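
If you add more cores later on, you'll repeat these four copies for every core, so a small loop may save some typing. The sketch below copies the same four files as the commands above; the helper name copy_drupal_conf is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: copy the Drupal-specific Solr config files into a core.
# Usage: copy_drupal_conf SOURCE_DIR CORE_CONF_DIR
copy_drupal_conf() {
    for f in schema.xml solrconfig.xml protwords.txt solrcore.properties; do
        # Stop at the first failure so a half-copied config doesn't go unnoticed.
        cp "$1/$f" "$2/$f" || return 1
    done
}
```

Put the function in a script, add a call such as copy_drupal_conf /path/to/apachesolr/solr-conf/solr-4.x /usr/share/solr4/{mycompanyname}/multicore/core0/conf, and run the script with sudo so it is allowed to write to /usr/share.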

Personally I like to use the latest Solr configuration files as a base and make adjustments to those files. This way it’s easier to spot Solr configuration differences after an upgrade.

For example, the attached configuration package conf-en-nl.zip is optimized for a Dutch / English bilingual website. Feel free to use it for reference if you’d like to set up a multilingual Drupal site. You can download the file and overwrite the entire /usr/share/solr4/{mycompanyname}/multicore/core0/conf/ directory with its content.

After copying those files, restart the Tomcat service and you'll be able to connect your Drupal 7 site to the Solr service.

Use the following command to restart the Tomcat service:

sudo /etc/init.d/tomcat6 restart

Permissions

If you can no longer connect to http://localhost:8080 or http://ServerNameOrIP:8080 you probably have a permissions issue!

It might be useful to check the Tomcat logs at the following location:

/var/log/tomcat6/
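
The logs can get long, so it helps to filter for the entries Tomcat marks as SEVERE. A minimal sketch, assuming that catalina.out is the main log file in that directory (the helper name recent_severe is made up):

```shell
#!/bin/sh
# Hypothetical helper: print the most recent SEVERE entries from a Tomcat log.
# Usage: recent_severe LOGFILE [MAX_LINES]
recent_severe() {
    grep 'SEVERE' "$1" | tail -n "${2:-20}"
}

# catalina.out is assumed to be the main log file in /var/log/tomcat6/.
if [ -r /var/log/tomcat6/catalina.out ]; then
    recent_severe /var/log/tomcat6/catalina.out
fi
```

If your user isn't allowed to read the log directory, run the script with sudo.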

How to fix it?

Set the solr directory to be part of the tomcat6 group:

sudo chgrp -R tomcat6 /usr/share/solr4

Change the access permissions in the solr directory:

sudo chmod -R 2750 /usr/share/solr4

Make the core folders writable: by default, the index files and spellcheck files are placed underneath each core folder. When you add a new core, make sure its permissions are correct as well:

sudo chmod -R 2770 /usr/share/solr4/{mycompanyname}/multicore/
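
A word on those numbers: the leading 2 sets the setgid bit, so files created in these directories inherit the tomcat6 group. 750 gives the owner full access and the group read access, while 770 also lets the group write. To verify this after adding a core, you can list the directories that the group can't write to. This is a sketch; the helper name dirs_not_group_writable is made up.

```shell
#!/bin/sh
# Hypothetical helper: list directories below a Solr home that the owning
# group cannot write to; no output means every directory is group-writable.
# Usage: dirs_not_group_writable DIR
dirs_not_group_writable() {
    # 020 is the group write bit; "! -perm -020" matches directories without it.
    find "$1" -type d ! -perm -020
}
```

For example, sudo find /usr/share/solr4/{mycompanyname}/multicore -type d ! -perm -020 runs the same test by hand and should print nothing.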

That should do the trick. Now restart the Tomcat service:

sudo /etc/init.d/tomcat6 restart

If all goes well, you should be able to connect to http://localhost:8080 or http://ServerNameOrIP:8080 and you should also see a newly created data folder at /usr/share/solr4/{mycompanyname}/multicore/core0/.

Connect your Drupal 7 site to Solr 4

Finally the easy part: telling Drupal where to find the Solr service.

The most important module is the 'Apache Solr search' module (http://drupal.org/project/apachesolr).

As stated before, make sure you use the latest dev version: Solr 4 is quite new, and support for it hasn't been incorporated in the stable release yet.

After installing and enabling the module you simply need to fill in the Solr server and core details in the module's settings.
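
For the setup in this guide, the Solr server URL is made up of the Tomcat host and port, the name of the context file (solr4.xml, so solr4) and the core name. The sketch below just puts those parts together; the helper name solr_core_url is made up, and your host name will differ if Drupal runs on another machine.

```shell
#!/bin/sh
# Hypothetical helper: build the Solr server URL to enter in the Drupal module settings.
# Usage: solr_core_url HOST PORT CONTEXT CORE
solr_core_url() {
    echo "http://$1:$2/$3/$4"
}

# "solr4" matches the solr4.xml context file and "core0" the core configured above.
solr_core_url localhost 8080 solr4 core0   # prints http://localhost:8080/solr4/core0
```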

And you're good to go!

What's next?

  • If you want to make use of the multilingual capability, check out the 'Apache Solr Multilingual' module (http://drupal.org/project/apachesolr_multilingual).
  • If you're using the attached bilingual configuration example: Tika is also included! So go ahead and install the 'Apache Solr Attachments' module (http://drupal.org/project/apachesolr_attachments) as well and you'll be able to index all sorts of documents!
  • If you want to use the 'more like this' feature from the core 'Apache Solr search' module: it's broken at the moment, because Solr 4 treats the forward slash as a special character (it delimits regular expressions in queries). The Drupal module doesn't take this into account yet and sends an unescaped forward slash to the Solr server when it requests a 'more like this' query. This will probably be fixed soon, but for now you can use the attached patch apachesolr-solr4-escape-characters-mlt.zip to add forward slash escaping.
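
To illustrate what escaping the forward slash means: every / in a query term gets a backslash in front of it before the term is sent to Solr. The sketch below shows the idea in shell; it is not the attached patch, and the helper name escape_slashes is made up.

```shell
#!/bin/sh
# Hypothetical helper: escape forward slashes in a query term with a backslash.
# Usage: escape_slashes TERM
escape_slashes() {
    printf '%s\n' "$1" | sed 's#/#\\/#g'
}

escape_slashes 'node/12'   # prints node\/12
```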

This guide doesn't go into too much detail on how to create cores in Solr and how to configure different languages and so on. The attached configuration should be a good base for some self-learning investigation but if you have further questions regarding Solr setup and configuration, feel free to ask!

Kudos

There are several Drupal 6 / 7 + Solr 1.4 / 3.6 guides out there which served as a good base to write this up-to-date Solr 4 guide.

Thanks go out to Nick Veenhof for being one of the core contributors of the apachesolr module and for writing several nice tutorials and guides on his website at http://www.nickveenhof.be.

The blog post at http://florezgroup.com/blog/steve/configuring-apache-solr-multi-core-drupal-and-tomcat-ubuntu-910 by Steve Edwards was also a useful addition to Nick's blog entries, specifically regarding permissions issues.