Liferay Portal is an open source project, so you won’t be surprised to learn that its default search engine is also open source. Elasticsearch is a highly scalable, full-text search and analytics engine that ships with Liferay Portal.
By default, Liferay Portal runs Elasticsearch as an embedded search engine, but in production Elasticsearch is supported only as a separate server or cluster, running either locally or remotely. This guide walks you through that process.
If you’d rather use Solr, it’s also supported. See here for information on installing and configuring Solr.
If you just want to get up and running quickly with Elasticsearch, refer to the Configuring Search article. It assumes that you only want to know what’s necessary for the installation and configuration of Elasticsearch in a single server environment, and it doesn’t include all the clustering and tuning instructions found here.
In this article you’ll learn how to configure Elasticsearch for use in Liferay Portal production environments.
If you’ve come here looking for information on search engines in general, or the low level search infrastructure of Liferay Portal, refer to the developer tutorial Introduction to Liferay Search.
These terms will be useful to understand as you read this guide:
Elasticsearch Home refers to the root folder of your unzipped Elasticsearch installation (for example,
Liferay Home refers to the root folder of your Liferay Portal installation. It contains the
license folders, among others.
Embedded vs. Remote Operation Mode
When you install Liferay Portal, there’s an embedded Elasticsearch already installed. In embedded mode, Elasticsearch runs with Liferay Portal as a library in the same JVM. This is done by default to make it easy to test-drive Liferay Portal with minimal configuration. Running Elasticsearch and Liferay Portal in the same process has drawbacks:
- Your Elasticsearch configuration uses the same JVM options as Liferay Portal.
- Liferay Portal and Elasticsearch compete for resources.
You wouldn’t run an embedded database like HSQL in production, and you shouldn’t run Elasticsearch in embedded mode in production either. Instead, run Elasticsearch in remote operation mode, as a standalone server or cluster of server nodes separate from the Liferay Portal process. The first step is to install Elasticsearch.
Install Elasticsearch, and then you can begin configuring it for use with Liferay Portal.
Follow the instructions here to find the version of Elasticsearch that matches your installation and download it.
Extract the contents of the compressed file you downloaded to your Liferay Home folder.
Before continuing, make sure you have set the
If you have multiple JDKs installed, make sure Elasticsearch and Liferay Portal are using the same version. You can specify this in
Install the following required Elasticsearch plugins:
To install these plugins, navigate to Elasticsearch Home and enter
./bin/plugin install [plugin-name]
Replace [plugin-name] with the Elasticsearch plugin’s name.
Make the [Elasticsearch_Home]/bin/elasticsearch.sh file executable (if you’re on Linux).
For more details refer to the Elasticsearch installation guide.
Once you have Elasticsearch installed, you need to configure it for Liferay Portal.
For detailed Elasticsearch configuration information, refer to the Elasticsearch documentation.
The name of your Elasticsearch cluster is important. When you’re running Elasticsearch in remote mode, the cluster name is used by Liferay Portal to recognize the Elasticsearch cluster. To learn about setting the Elasticsearch cluster name on the Liferay Portal side, refer below to the section called Configuring the Liferay Elasticsearch adapter.
Elasticsearch’s configuration files are written in YAML and kept in the
[Elasticsearch Home]/config folder:
elasticsearch.yml is for configuring Elasticsearch modules
logging.yml is for configuring Elasticsearch logging
To set the name of the Elasticsearch cluster, open
[Elasticsearch Home]/config/elasticsearch.yml and specify
Since LiferayElasticsearchCluster is the default name given to the cluster in Liferay Portal, this would work just fine. Of course, you can name your cluster whatever you’d like (we humbly submit the recommendation
clustery_mcclusterface). You can configure your node name using the same syntax (setting the
If you’d rather work from the command line than in the configuration file, navigate to Elasticsearch Home and enter
./bin/elasticsearch --cluster.name clustery_mcclusterface --node.name nody_mcnodeface
Feel free to change the node name or the cluster name. Once you configure Elasticsearch to your liking, start it up.
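If you take the configuration-file route instead, the two settings can be appended to elasticsearch.yml from a shell. This is a minimal sketch, not an official procedure: it assumes a shell variable named ES_HOME that points at Elasticsearch Home (falling back to the current directory), and it reuses the cluster and node names from the command-line example.

```shell
#!/bin/sh
# Sketch only: ES_HOME is assumed to point at Elasticsearch Home;
# it falls back to the current directory if unset.
ES_HOME="${ES_HOME:-$PWD}"
CONFIG_FILE="$ES_HOME/config/elasticsearch.yml"

# Create the config folder in case this is run outside a real installation.
mkdir -p "$ES_HOME/config"

# Append the cluster and node names. If the file already sets these keys,
# edit the existing lines instead of appending duplicates.
cat >> "$CONFIG_FILE" <<'EOF'
cluster.name: clustery_mcclusterface
node.name: nody_mcnodeface
EOF

# Show the resulting name settings.
grep -E '^(cluster|node)\.name:' "$CONFIG_FILE"
```

Whatever cluster name you choose here must also be given to the Liferay Elasticsearch adapter, as described later in this guide.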
Start Elasticsearch by navigating to Elasticsearch Home and typing
if you run Linux, or
if you run Windows.
To run as a daemon in the background, add the
-d switch to either command:
Now that you have Elasticsearch itself installed and running, and Liferay Portal installed and running (do that if you haven’t already) you need to introduce Liferay Portal and Elasticsearch to each other. Fortunately, Liferay provides an adapter that helps it find and integrate your Elasticsearch cluster.
Configuring the Liferay Elasticsearch Adapter
Liferay Portal ships with an Elasticsearch adapter. It’s a module from the Liferay Foundation Suite that’s deployed to the OSGi runtime, titled Liferay Portal Search Elasticsearch. This adapter provides integration between Elasticsearch and Liferay Portal. Before you configure the adapter, make sure Elasticsearch is running.
There are two ways to configure the adapter:
Use the System Settings application in the Control Panel.
Manually create an OSGi configuration file.
It’s convenient to configure the Elasticsearch adapter from System Settings, but this is often only possible during development and testing. If you’re not familiar with System Settings, you can read about it here. Even if you need a configuration file so you can use the same configuration on another Liferay Portal system, you can still use System Settings. Just make the configuration edits you need, then export the
.config file with your configuration.
Here are the steps to configure the Elasticsearch adapter from the System Settings application:
- Start Liferay Portal.
- Navigate to Control Panel → Configuration → System Settings → Foundation.
Find the Elasticsearch entry (scroll down and browse to it or use the search box) and click the Actions icon, then Edit.
Change Operation Mode to Remote, and then click Save.
After you switch operation modes (
REMOTE), you must trigger a re-index. Navigate to Control Panel → Server Administration, find the Index Actions section, and click Execute next to Reindex all search indexes.
When preparing a system for production deployment, you want to set up a repeatable deployment process. Therefore, it’s best to use the OSGi configuration file, where your configuration is maintained in a controlled source.
Follow these steps to configure the Elasticsearch adapter using an OSGi configuration file:
Create the following file:
Add this to the configuration file you just created:
operationMode="REMOTE"
# If running Elasticsearch from a different computer:
#transportAddresses="ip.of.elasticsearch.node:9300"
# Highly recommended for all non-production usage (e.g., practice, tests, diagnostics):
#logExceptionsOnly="false"
Start Liferay Portal or re-index if Liferay Portal is already running.
As you can see from the System Settings entry for Elasticsearch, there are a lot more configuration options available that help you tune your system for optimal performance. For a detailed accounting of these, refer to the reference article on Elasticsearch Settings.
What follows here are some known-good configurations for clustering Elasticsearch. These, however, can’t replace the manual process of tuning, testing under load, and tuning again, so we encourage you to examine the settings as well as the Elasticsearch documentation and go through that process once you have a working configuration.
Configuring a Remote Elasticsearch Host
In production systems Elasticsearch and Liferay Portal are installed on different servers. To make Liferay Portal aware of the Elasticsearch cluster, set
transportAddresses=[IP address of Elasticsearch Node]:9300
in the Elasticsearch adapter’s OSGi configuration file. List as many or as few Elasticsearch nodes in this property as you’d like. This tells Liferay Portal the IP address or host name where search requests are to be sent. If using System Settings, set the value in the Transport Addresses property.
On the Elasticsearch side, set the
network.host property in your
elasticsearch.yml file. This property simultaneously sets both the bind host (the host Elasticsearch listens on for requests) and the publish host (the host name or IP address Elasticsearch uses to communicate with other nodes). See here for more information.
Clustering Elasticsearch in Remote Operation Mode
Clustering Elasticsearch is easy. Each time you run the Elasticsearch start script, a new node is added to the cluster. If you want four nodes, for example, just run
./bin/elasticsearch four times. If you only run the start script once, you have a cluster with just one node.
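As a rough sketch of that idea, the loop below starts several nodes from one installation. The function name start_nodes, the node_N naming scheme, and the optional second argument (the path to the start script) are illustrative assumptions; the -d and --node.name options are the ones used elsewhere in this guide.

```shell
#!/bin/sh
# Start COUNT Elasticsearch nodes in the background.
# Usage: start_nodes COUNT [START_SCRIPT]
# START_SCRIPT defaults to ./bin/elasticsearch, so run this from Elasticsearch Home.
start_nodes() {
  count=$1
  es_bin=${2:-./bin/elasticsearch}
  i=1
  while [ "$i" -le "$count" ]; do
    # -d runs the node as a daemon; each node gets a distinct (made-up) name.
    "$es_bin" -d --node.name "node_$i" || return 1
    i=$((i + 1))
  done
}

# Example: start_nodes 4
```

Naming each node is optional, but it makes the nodes easier to tell apart in logs.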
Elasticsearch’s default configuration works for a cluster of up to ten nodes, since the default number of shards is 5, while the default number of replica shards is 1:
index.number_of_shards: 5
index.number_of_replicas: 1
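The ten-node figure follows from simple arithmetic: five primary shards, each with one replica, make ten shard copies in total, so at most ten nodes can each hold one of them. A small sketch of that calculation (the variable names are illustrative):

```shell
#!/bin/sh
# Values from the default configuration described above.
shards=5      # index.number_of_shards
replicas=1    # index.number_of_replicas

# Each primary shard has `replicas` additional copies, so the total number
# of shard copies for an index is primaries * (1 + replicas).
total=$((shards * (1 + replicas)))
echo "Total shard copies per index: $total"
```

With more than ten nodes, some nodes would hold no shard of a given index unless you raise one of the two settings.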
For more information on configuring an Elasticsearch cluster, see the documentation on Elasticsearch Index Settings.
Advanced Configuration of the Liferay Elasticsearch Adapter
The default configurations for Liferay’s Elasticsearch adapter module are set in a Java class:
While the Elasticsearch adapter has a lot of configuration options out of the box, you might find an Elasticsearch configuration you need that isn’t provided by default. In this case, you can add the configuration options you need. If you can configure something for Elasticsearch, you can configure it using the Elasticsearch adapter in Liferay Portal.
Adding Settings and Mappings to the Liferay Elasticsearch Adapter
Liferay Portal has divided the available configuration options into two groups: the ones you’ll use most often by default, and a catch-all for everything else. If you need to configure the local Elasticsearch client when running in remote mode, but the necessary setting isn’t available by default, you can still configure it with the Liferay Elasticsearch adapter. Just specify the settings you need by using one or more of the
additionalConfigurations is used to define extra settings (defined in YAML) for the embedded Elasticsearch or the local Elasticsearch client when running in remote mode. In production, only one additional configuration can be added here:
The rest of the settings for the client are available as default configuration options in the Liferay Elasticsearch adapter. See the Elasticsearch Settings reference article for more information. See the Elasticsearch documentation for a description of all the client settings and for an example.
additionalIndexConfigurations is used to define extra settings (in JSON or YAML format) that are applied to the Liferay Portal index when it’s created. For example, you can create custom analyzers and filters using this setting. For a complete list of available settings, see the Elasticsearch reference.
additionalTypeMappings is used to define extra field mappings for the
LiferayDocumentType type definition, which are applied when the index is created. Add these field mappings using JSON syntax. For more information see here and here.
Multi-line YAML Configurations
If you configure the settings from the last section using an OSGi configuration file, you might find yourself needing to write YAML snippets that span multiple lines. The syntax for that is straightforward and just requires ending each line except the last with
\n\, like this:
additionalConfigurations=\
    cluster.routing.allocation.disk.threshold_enabled: false\n\
    cluster.service.slow_task_logging_threshold: 600s\n\
    index.indexing.slowlog.threshold.index.warn: 600s\n\
    index.search.slowlog.threshold.fetch.warn: 600s\n\
    index.search.slowlog.threshold.query.warn: 600s\n\
    monitor.jvm.gc.old.warn: 600s\n\
    monitor.jvm.gc.young.warn: 600s
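Writing those continuations by hand is error-prone, so a small helper can generate them. The sketch below is not part of Liferay: to_osgi_multiline is a hypothetical function name, and it assumes flat, dotted YAML keys like the ones in the example. It reads YAML from standard input and prints the property with \n\ appended to every line except the last.

```shell
#!/bin/sh
# Print PROPERTY=\ followed by the YAML read from stdin, indented, with
# the \n\ continuation appended to every line except the last.
# Usage: to_osgi_multiline PROPERTY < snippet.yml
to_osgi_multiline() {
  printf '%s=\\\n' "$1"
  sed -e 's/^/    /' -e '$!s/$/\\n\\/'
}

# Example with two of the settings shown above.
printf '%s\n' \
  'cluster.routing.allocation.disk.threshold_enabled: false' \
  'monitor.jvm.gc.young.warn: 600s' | to_osgi_multiline additionalConfigurations
```

Paste the output into the adapter's OSGi configuration file in place of a hand-written value.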
Sometimes things don’t go as planned. If you’ve set up Liferay Portal with Elasticsearch in remote mode, but Liferay Portal can’t connect to Elasticsearch, check these things:
Cluster name: The value of the
cluster.name property in Elasticsearch must match the
clusterName property you configured for Liferay’s Elasticsearch adapter.
Transport address: The value of the
transportAddresses property in the Elasticsearch adapter must match the port where Elasticsearch is running. If Liferay Portal is running in embedded mode, and you start a standalone Elasticsearch node or cluster, it detects that port
9300 is taken and switches to port
9301. If you then set Liferay’s Elasticsearch adapter to remote mode, it continues to look for Elasticsearch at the default port (
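To check the cluster name without guessing, you can ask Elasticsearch what it reports and compare that with the adapter's setting. The sketch below assumes Elasticsearch's HTTP interface is reachable at localhost:9200 (the default HTTP port, which is separate from the transport port discussed above) and that curl is installed. The function names are made up for illustration.

```shell
#!/bin/sh
# Extract the cluster_name value from an Elasticsearch JSON response on stdin.
cluster_name_from_json() {
  sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if the JSON on stdin reports the expected cluster name.
# Usage: curl -s http://localhost:9200/_cluster/health | check_cluster_name NAME
check_cluster_name() {
  expected=$1
  actual=$(cluster_name_from_json)
  if [ "$actual" = "$expected" ]; then
    echo "OK: cluster name is $actual"
  else
    echo "Mismatch: expected $expected but found ${actual:-nothing}" >&2
    return 1
  fi
}
```

Pass the same value you set for clusterName in the adapter. A mismatch message means one of the two sides needs to change.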
Now that you have Elasticsearch configured for use with Liferay Portal, if you’re a Liferay Portal customer, you can read here to learn about configuring Shield to secure your Elasticsearch data.