Rebuilding a Collection
When running SolrCloud, you may occasionally need to rebuild a Collection from scratch. Keeping a "primary" Collection and a "reindex" Collection makes it possible to rebuild one of them without an outage. The steps below walk through rebuilding a Collection.
Identify the "Reindex" Collection
The "reindex" Collection is only used for the full reindex process. Therefore, this Collection is technically "offline" and can be deleted/recreated without impacting the "primary" Collection the clients are using.
The first step is to identify which Collection is being referenced by the "reindex" alias. That can be accomplished by hitting the following endpoint on one of your Solr servers.
http://your-solr-server:8983/solr/admin/collections?action=CLUSTERSTATUS
Toward the bottom of the output you will see the mapping of the aliases to the collections:
"aliases": {
  "catalog": "catalogCollection0",
  "catalog-reindex": "catalogCollection1"
}
The "reindex" alias (catalog-reindex in this example) points to the catalogCollection1 Collection. That Collection is the one we can target for the rebuild.
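The lookup above can also be scripted. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the host is the placeholder used in this document, and it assumes the aliases map is nested under a "cluster" key in the JSON response (falling back to the top level if it is not).

```python
import json
from urllib.request import urlopen

# Placeholder host taken from the examples in this document.
SOLR_BASE = "http://your-solr-server:8983/solr"


def fetch_cluster_status(base_url=SOLR_BASE):
    """Call CLUSTERSTATUS and return the parsed JSON response."""
    url = f"{base_url}/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url) as response:
        return json.load(response)


def collection_for_alias(status, alias):
    """Return the Collection an alias points to, or None if the alias is absent."""
    # Assumption: aliases live under "cluster"; otherwise look at the top level.
    aliases = status.get("cluster", status).get("aliases", {})
    return aliases.get(alias)


# Offline example using the alias mapping shown above.
sample = {
    "cluster": {
        "aliases": {
            "catalog": "catalogCollection0",
            "catalog-reindex": "catalogCollection1",
        }
    }
}
print(collection_for_alias(sample, "catalog-reindex"))  # prints catalogCollection1
```

Against a live cluster, the same lookup would be `collection_for_alias(fetch_cluster_status(), "catalog-reindex")`.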
Delete the "reindex" Collection & Alias
With the "reindex" Collection identified, we can delete that Collection and then recreate it. As part of the process we will delete/recreate the Alias just to make sure the Alias is referencing the newly created Collection.
http://your-solr-server:8983/solr/admin/collections?action=DELETE&name=catalogCollection1
http://your-solr-server:8983/solr/admin/collections?action=DELETEALIAS&name=catalog-reindex
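If these calls are scripted, it may help to build the two URLs from the Collection and Alias names rather than typing them by hand. The sketch below only constructs the URLs and sends nothing. The helper names and the guard against targeting the "primary" Collection are illustrative assumptions, not part of Solr.

```python
from urllib.parse import urlencode

# Placeholder host taken from the examples in this document.
SOLR_BASE = "http://your-solr-server:8983/solr"


def collections_api_url(action, base_url=SOLR_BASE, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


def delete_urls(collection, alias, primary_collection=None):
    """Return the DELETE and DELETEALIAS URLs, in the order used above."""
    # Safety check (an assumption of this sketch): never target the
    # Collection that the clients are currently using.
    if primary_collection is not None and collection == primary_collection:
        raise ValueError(f"{collection} is the primary Collection; refusing to delete it")
    return [
        collections_api_url("DELETE", name=collection),
        collections_api_url("DELETEALIAS", name=alias),
    ]


for url in delete_urls("catalogCollection1", "catalog-reindex",
                       primary_collection="catalogCollection0"):
    print(url)
```

Passing the name behind the "catalog" alias as `primary_collection` gives a cheap check that the script is pointed at the offline Collection.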
Recreate the "reindex" Collection and Associated Alias
Now that the Collection and Alias have been deleted, they can be recreated. One important configuration item needed when creating a Collection is "collection.configName". This is the path in ZooKeeper to the Solr schema configuration. ZooKeeper is a centralized configuration service, and for SolrCloud implementations the schema data will already have been uploaded to the ZooKeeper server. The configuration path can be found in the Solr admin at this endpoint:
http://your-solr-server:8983/solr/#/~cloud?view=tree
From that endpoint, the path under "/configs" is the path needed (the path to the Solr schema.xml file).
With the config path available, the Collection can be recreated with the following endpoint. The path discovered above is passed in the "collection.configName" parameter:
http://your-solr-server:8983/solr/admin/collections?action=CREATE&name=catalogCollection1&collection.configName=path-to/conf&numShards=1&replicationFactor=3
Note the following parameters:
- action - we are calling the CREATE action
- name - the new Collection name (we are recreating the Collection that was just deleted)
- collection.configName - the config path discovered above
- numShards - should be 1
- replicationFactor - the number of nodes in your cluster
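The parameters listed above can be assembled into the CREATE URL programmatically. This is a sketch under stated assumptions: the function name is hypothetical, "path-to/conf" is the placeholder from this document rather than a real config path, and the replication factor of 3 simply mirrors the example URL.

```python
from urllib.parse import urlencode

# Placeholder host taken from the examples in this document.
SOLR_BASE = "http://your-solr-server:8983/solr"


def create_collection_url(name, config_name, replication_factor,
                          num_shards=1, base_url=SOLR_BASE):
    """Build the CREATE URL from the parameters described in the list above."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "numShards": num_shards,                  # 1, as recommended above
        "replicationFactor": replication_factor,  # number of nodes in the cluster
    }
    # safe="/" leaves any slash in the config value unescaped, so the URL
    # reads the same as the hand-written example.
    return f"{base_url}/admin/collections?{urlencode(params, safe='/')}"


print(create_collection_url("catalogCollection1", "path-to/conf", 3))
```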
The last task in this step is to recreate the Alias. The following call can be made to accomplish that - again, make sure to reference the correct collection (catalogCollection1 in our example):
http://your-solr-server:8983/solr/admin/collections?action=CREATEALIAS&name=catalog-reindex&collections=catalogCollection1
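Because the order of the four calls matters, the whole delete and recreate sequence can be expressed as one ordered list. The sketch below is one possible arrangement, not a definitive implementation: the function names are hypothetical, the `wt=json` parameter is added on the assumption that a JSON response is wanted, and each step is treated as successful only when `responseHeader.status` is 0. The `fetch` argument is injectable so the sequence can be dry-run without touching a cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder host taken from the examples in this document.
SOLR_BASE = "http://your-solr-server:8983/solr"


def http_get_json(url):
    """Perform a GET request and parse the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def rebuild_steps(collection, alias, config_name, replication_factor):
    """Return the Collections API calls in the order this document applies them."""
    return [
        {"action": "DELETE", "name": collection},
        {"action": "DELETEALIAS", "name": alias},
        {"action": "CREATE", "name": collection,
         "collection.configName": config_name,
         "numShards": 1, "replicationFactor": replication_factor},
        {"action": "CREATEALIAS", "name": alias, "collections": collection},
    ]


def run_steps(steps, fetch=http_get_json, base_url=SOLR_BASE):
    """Issue each call in turn and stop at the first one that does not succeed."""
    for params in steps:
        query = urlencode({**params, "wt": "json"}, safe="/")
        url = f"{base_url}/admin/collections?{query}"
        status = fetch(url).get("responseHeader", {}).get("status")
        if status != 0:
            raise RuntimeError(f"{params['action']} failed with status {status}: {url}")


def dry_run_fetch(url):
    """Stand-in for http_get_json that prints the URL instead of calling Solr."""
    print(url)
    return {"responseHeader": {"status": 0}}


steps = rebuild_steps("catalogCollection1", "catalog-reindex", "path-to/conf", 3)
run_steps(steps, fetch=dry_run_fetch)
```

Dropping the `fetch=dry_run_fetch` argument would send the requests to the server named in `SOLR_BASE`, so the dry run is worth reviewing first.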
Reindex
Now that you have a newly created, empty Collection, a full reindex can be performed. This can be done in the Admin (Setting / Solr Indexer).