Nouveau: CouchDB's New Full-Text Search Engine
Apache CouchDB has introduced an exciting new addition to its ecosystem: Nouveau, a powerful query server designed for full-text search indexing using Apache Lucene. This new feature brings flexible, scalable, and efficient text search capabilities to CouchDB, making it easier to find and retrieve documents based on their content.
In this guide, we will explore what Nouveau is, how to install and configure it, and how to create and query full-text search indexes using Apache Lucene. Whether you are a seasoned CouchDB user or just getting started, this article will break down the concepts and provide step-by-step instructions to get you up and running with Nouveau.
Full-Text Search in CouchDB Before Nouveau: Using Elasticsearch
Before the introduction of Nouveau, CouchDB developers typically relied on external services for full-text search. Elasticsearch, a distributed search engine based on Apache Lucene, was commonly run alongside CouchDB and fed through external indexing pipelines. The separate couchdb-lucene project offered Lucene-based search without Elasticsearch.
How Elasticsearch Was Used with CouchDB
- Replication to Elasticsearch: Documents were replicated from CouchDB into an external Elasticsearch cluster for indexing.
- couchdb-lucene: A standalone Java service that indexed CouchDB documents with Lucene directly, used by some as an alternative to running Elasticsearch.
- Direct Indexing from CouchDB: Some users built custom pipelines to push document updates from CouchDB to Elasticsearch in real-time.
Challenges with Elasticsearch Integration
- Complexity: Managing an external Elasticsearch cluster added operational overhead.
- Consistency Issues: Data replication between CouchDB and Elasticsearch was asynchronous, leading to potential inconsistencies.
- Performance Overhead: Running an additional search engine required additional resources.
- Dependency Management: Maintaining separate services and ensuring compatibility between CouchDB and Elasticsearch versions could be challenging.
The Shift to Nouveau
With Nouveau, full-text search is now natively integrated into CouchDB, eliminating the need for external dependencies like Elasticsearch. This new approach simplifies deployment, improves consistency, and reduces maintenance efforts while still leveraging Lucene’s powerful search capabilities.
By adopting Nouveau, CouchDB users gain an efficient, built-in full-text search solution without the complexities of external search services like Elasticsearch.
What is Nouveau and Why Was It Introduced?
Nouveau is a new query server for CouchDB that provides full-text search capabilities using Apache Lucene. Unlike traditional views or Mango queries, Nouveau allows developers to perform complex text-based searches with features like tokenization, analyzers, stemming, faceting, and sorting.
Key Benefits of Nouveau:
- Lucene-powered indexing: Utilizes Apache Lucene for high-performance text search.
- Flexible search queries: Supports phrase searches, wildcards, fuzzy matching, and more.
- Efficient indexing: Designed to handle large datasets with optimized indexing performance.
- Improved search relevance: Supports custom analyzers for better search accuracy.
- Integration with CouchDB: Works seamlessly within CouchDB’s existing architecture.
Nouveau is an experimental feature in CouchDB: it is still under development, and its APIs may change in future versions. Even so, it provides a modern alternative to CouchDB's older search integration, which is built on Clouseau and Dreyfus.
Installing and Configuring Nouveau in CouchDB
To get started with Nouveau, you need to install and enable it in CouchDB. The following sections outline the prerequisites, installation steps, and configuration settings required to set up Nouveau.
Prerequisites
Before installing Nouveau, ensure your system meets the following requirements:
- CouchDB 3.4.0 or later
- Java runtime (Required for running the Nouveau server)
Step 1: Install Java if needed
- Check to see which Java version you are running:
→ java -version
openjdk version "17.0.14" 2025-01-21
OpenJDK Runtime Environment (build 17.0.14+7-Debian-1deb12u1)
OpenJDK 64-Bit Server VM (build 17.0.14+7-Debian-1deb12u1, mixed mode, sharing)
- If Java is missing or outdated, install a current JDK:
# OSX
brew install openjdk
# Ubuntu/Debian
sudo apt install default-jdk
See how to install Homebrew if the brew command is not working.
Step 2: Configure CouchDB for Nouveau
Modify the CouchDB configuration file (local.ini) to enable Nouveau:
[nouveau]
enable = true
; The URL below is the default, so this line can be omitted
url = http://127.0.0.1:5987
Save and restart CouchDB:
sudo systemctl restart couchdb
Step 3: Start the Nouveau Server
Launch Nouveau manually or set it up as a system service:
# OSX
java -jar "/Applications/Apache CouchDB.app/Contents/Resources/couchdbx-core/nouveau/lib/nouveau-1.0-SNAPSHOT.jar" server "/Applications/Apache CouchDB.app/Contents/Resources/couchdbx-core/etc/nouveau.yaml"
# Ubuntu/Debian
java -jar /opt/couchdb/nouveau/lib/nouveau-1.0-SNAPSHOT.jar server /opt/couchdb/etc/nouveau.yaml
This starts Nouveau with its default configuration file (nouveau.yaml).
To run the CouchDB Nouveau server as a daemon on system startup in Debian/Ubuntu, follow these steps:
Step 1: Create a Systemd Service File
- Open a terminal and create a new service file:
sudo nano /etc/systemd/system/couchdb-nouveau.service
- Add the following content to define the service:
[Unit]
Description=CouchDB Nouveau Server
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/java -jar /opt/couchdb/nouveau/lib/nouveau-1.0-SNAPSHOT.jar server /opt/couchdb/etc/nouveau.yaml
WorkingDirectory=/opt/couchdb/nouveau
User=couchdb
Group=couchdb
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=couchdb-nouveau
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Step 2: Create the CouchDB User (If Needed)
If a couchdb user does not already exist, create it:
sudo adduser --system --no-create-home --group couchdb
Ensure the user has permissions to access the necessary files:
sudo chown -R couchdb:couchdb /opt/couchdb
sudo chmod -R 750 /opt/couchdb
Step 3: Reload Systemd and Enable the Service
- Reload systemd to apply the changes:
sudo systemctl daemon-reload
- Enable the service to start on boot:
sudo systemctl enable couchdb-nouveau
- Start the service immediately:
sudo systemctl start couchdb-nouveau
Creating Full-Text Search Indexes with Nouveau
Nouveau allows you to define search indexes inside CouchDB design documents using JavaScript functions.
Example: Defining a Search Index
Create a design document with a search index in your database (mydb):
{
  "_id": "_design/mysearch",
  "nouveau": {
    "test1": {
      "index": "function (doc) { if (doc.name) { index('text', 'name', doc.name); } if (doc.author) { index('text', 'author', doc.author); } }"
    }
  }
}
This index tells Nouveau to index the name and author fields of each document in the database.
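Because the index function is stored as a JSON string, it is easy to break the design document with a stray quote or newline. One way around this is to write the function as ordinary JavaScript and let JSON.stringify do the escaping. The following is a minimal sketch: buildDesignDoc is a hypothetical helper rather than part of CouchDB, and it assumes CouchDB accepts the function source exactly as Function.prototype.toString() returns it.

```javascript
// Index function written as real JavaScript so an editor can lint it.
// index() is supplied by Nouveau at indexing time; it is not defined here
// and this function is never called locally.
const indexNameAndAuthor = function (doc) {
  if (doc.name) { index('text', 'name', doc.name); }
  if (doc.author) { index('text', 'author', doc.author); }
};

// Hypothetical helper: wrap index functions into a design document.
function buildDesignDoc(ddocName, indexes) {
  const nouveau = {};
  for (const [indexName, fn] of Object.entries(indexes)) {
    // toString() returns the source text of the function.
    nouveau[indexName] = { index: fn.toString() };
  }
  return { _id: `_design/${ddocName}`, nouveau };
}

const ddoc = buildDesignDoc('mysearch', { test1: indexNameAndAuthor });

// JSON.stringify escapes quotes and newlines inside the function source.
console.log(JSON.stringify(ddoc, null, 2));
```

The printed JSON can then be saved as the design document in mydb, for example with a PUT request.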
Indexing Field Types
Nouveau supports different field types for indexing:
- text – Tokenized full-text search (e.g., "content": "This is a test document.")
- string – Exact match, non-tokenized (e.g., "category": "news")
- double – Numeric values (e.g., "price": 9.99)
- stored – Fields stored in the index but not analyzed (useful for metadata)
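To see how the field types map onto a document, an index function can be tried out locally against a stand-in for index(). This is only a sketch: the document shape (content, category, price, sku) is hypothetical, and the stub records calls instead of writing to Lucene.

```javascript
// Stand-in for the index() function that Nouveau provides at indexing time.
// It only records calls so the emitted fields can be inspected locally.
const emitted = [];
function index(type, name, value) {
  emitted.push({ type, name, value });
}

// Index function using one field of each type, with a type check before
// each call so malformed documents do not emit unusable values.
function indexArticle(doc) {
  if (typeof doc.content === 'string') { index('text', 'content', doc.content); }
  if (typeof doc.category === 'string') { index('string', 'category', doc.category); }
  if (typeof doc.price === 'number') { index('double', 'price', doc.price); }
  if (doc.sku) { index('stored', 'sku', doc.sku); }
}

// This sample document has no sku, so no stored field is emitted for it.
indexArticle({ content: 'This is a test document.', category: 'news', price: 9.99 });
console.log(emitted);
```

Checking the value type before each index() call keeps a single malformed document from producing a field of the wrong type.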
Querying a Nouveau Index
Once your index is created, you can query it using Lucene Query Parser Syntax.
Example Query
Search for documents where name contains "couchdb":
curl -X GET "http://localhost:5984/mydb/_design/mysearch/_nouveau/test1?q=name:couchdb&include_docs=true"
To combine search conditions, use logical operators inside the query (spaces must be URL-encoded as %20):
curl -X GET "http://localhost:5984/mydb/_design/mysearch/_nouveau/test1?q=name:couchdb%20AND%20author:john&include_docs=true"
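Hand-encoding query strings gets error-prone once queries contain spaces, quotes, or brackets. The sketch below builds the same kind of URL programmatically. nouveauUrl is a hypothetical helper, and the host, database, design document, and index names are the example values used in this article.

```javascript
// Build a Nouveau search URL with every parameter percent-encoded.
function nouveauUrl(baseUrl, db, ddoc, indexName, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  return `${baseUrl}/${db}/_design/${ddoc}/_nouveau/${indexName}?${query}`;
}

const url = nouveauUrl('http://localhost:5984', 'mydb', 'mysearch', 'test1', {
  q: 'name:couchdb AND author:john',
  include_docs: true,
});
console.log(url);
```

encodeURIComponent also encodes the colon as %3A, which CouchDB decodes back before the query reaches the parser.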
Like with MapReduce views, you can also pass the parameters into a POST request body:
{
"q": "name:my query",
"sort": "rating",
"limit": 5
}
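From application code, such a body could be prepared as follows. This sketch only builds the arguments for fetch() and sends nothing. nouveauPostRequest is a hypothetical helper, and authentication headers are left out on the assumption that they are added separately.

```javascript
// Build the arguments for fetch() without performing any network request.
function nouveauPostRequest(url, search) {
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // The search parameters travel as a JSON body, so no URL encoding is needed.
      body: JSON.stringify(search),
    },
  };
}

const request = nouveauPostRequest(
  'http://localhost:5984/mydb/_design/mysearch/_nouveau/test1',
  { q: 'name:couchdb', limit: 5 }
);
console.log(request.init.body);

// To send it from Node 18+ or a browser:
//   const response = await fetch(request.url, request.init);
```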
Building a more advanced index
When searching across multiple fields, it can be more efficient to build the index differently. We can concatenate all the text fields we're interested in into a single field and then search all of them with one query.
{
  "_id": "_design/mysearch",
  "nouveau": {
    "test2": {
      "index": "function (doc) { let text = ''; if (doc.name) { text += doc.name + ' '; } if (doc.author) { text += doc.author + ' '; } /* any further fields can be added here */ if (text) { index('text', 'default', text); } }"
    }
  }
}
Indexing the combined text under the field name default allows a query to be made without specifying a field name.
curl -X GET "http://localhost:5984/mydb/_design/mysearch/_nouveau/test2?q=couchdb&include_docs=true"
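The combined-field approach can be checked locally before deploying it. The sketch below is a variant of the index function that collects the fields into an array and joins them once, run against a stand-in for index(). The stub and the sample documents are illustrative only.

```javascript
// Stand-in for Nouveau's index(); it records calls for inspection.
const calls = [];
function index(type, name, value) {
  calls.push([type, name, value]);
}

function indexCombined(doc) {
  // Keep only the string fields that are present, then join them once.
  const parts = [doc.name, doc.author].filter(
    (value) => typeof value === 'string' && value !== ''
  );
  if (parts.length > 0) {
    index('text', 'default', parts.join(' '));
  }
}

indexCombined({ name: 'CouchDB Guide', author: 'John' });
indexCombined({ title: 'no matching fields' }); // emits nothing
console.log(calls);
```

Documents with none of the listed fields emit nothing, so they take up no space in the index.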
Advanced Querying
- Wildcard Search: title:couch*
- Phrase Search: "couchdb is great"
- Fuzzy Search: title:couchdb~
- Range Query: date:[2023-01-01 TO 2023-12-31]
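Since characters such as *, ~, : and [ have special meaning in the query syntax, user-supplied search terms usually need escaping before they are placed into a query. The sketch below backslash-escapes the characters treated as special by Lucene's classic query parser. escapeLucene is a hypothetical helper, and it assumes Nouveau parses queries with that classic syntax.

```javascript
// Characters with special meaning in Lucene's classic query syntax.
const LUCENE_SPECIAL = /[+\-!(){}\[\]^"~*?:\\\/&|]/g;

// Prefix every special character with a backslash so it is matched literally.
function escapeLucene(input) {
  return input.replace(LUCENE_SPECIAL, '\\$&');
}

console.log(escapeLucene('C++'));    // special characters are escaped
console.log(escapeLucene('couch*')); // the wildcard becomes a literal asterisk
console.log(escapeLucene('plain'));  // ordinary text is unchanged
```

Escape only the user-provided part of a query; operators you add yourself, such as a trailing * for prefix search, must stay unescaped.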
Advanced Features
Faceting
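Faceted search groups results by field value and returns a count for each value. The faceted field (category in this example) must be indexed as a string field.
curl -G "http://localhost:5984/mydb/_design/mysearch/_nouveau/test1" --data-urlencode 'q=title:couchdb' --data-urlencode 'counts=["category"]'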
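Conceptually, a facet is a count of matching documents per distinct field value. Nouveau computes this on the server; the sketch below only illustrates the idea on the client side with made-up documents, and facetCounts is a hypothetical helper.

```javascript
// Count how many matching documents fall under each value of a field.
function facetCounts(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[field];
    if (typeof value !== 'string') { continue; } // skip docs without the field
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return Object.fromEntries(counts);
}

// Illustrative search results; the last one has no category.
const matches = [
  { title: 'CouchDB 3.4 released', category: 'news' },
  { title: 'CouchDB replication tips', category: 'howto' },
  { title: 'CouchDB meetup', category: 'news' },
  { title: 'CouchDB untagged' },
];
console.log(facetCounts(matches, 'category'));
```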
Sorting
You can sort search results based on fields.
curl -X GET "http://localhost:5984/mydb/_design/mysearch/_nouveau/test1?q=title:couchdb&sort=price<double>"
Best Practices for Efficient Indexing
- Use the correct field types: Avoid indexing everything as text when string or double would be better.
- Limit the number of fields: Indexing too many fields can increase storage and slow down queries.
- Optimize queries: Use specific field queries instead of generic full-text searches.
- Regularly update indexes: Keep your indexes refreshed to ensure accurate search results.
Experimental Status and Future Considerations
Since Nouveau is still experimental, be aware of:
- API Changes: Features may change in future releases.
- Performance Optimization: It’s still being tuned for large-scale deployments.
- Community Contributions: Since it's open-source, contributions and feedback are encouraged.
Conclusion
Nouveau is a powerful, Lucene-based search engine that enhances full-text search capabilities in CouchDB. It provides rich querying, indexing flexibility, and advanced search features, making it an excellent tool for developers working with CouchDB.
Alongside MapReduce views and Mango queries, there are now multiple ways to access your data. You can even run SQL queries, if that's what you're looking for, by using the Structured Query Server from the great people at Neighbourhoodie Software.
To get up and running, start Nouveau, configure indexes, and experiment with queries to see how it improves your document search experience. As CouchDB continues to evolve, Nouveau will likely become an essential part of its ecosystem.
Are you looking to integrate Nouveau into your projects? Get in touch with us to see how we can help!