---
stage: Enablement
group: Global Search
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Troubleshooting Elasticsearch

To install and configure Elasticsearch, and for common and known issues,
visit the [administrator documentation](../../integration/elasticsearch.md).

Troubleshooting Elasticsearch requires:

- Knowledge of common terms.
- Establishing within which category the problem fits.

## Common terminology

- **Lucene**: A full-text search library written in Java.
- **Near real time (NRT)**: Refers to the slight latency between the time a
  document is indexed and the time it becomes searchable.
- **Cluster**: A collection of one or more nodes that work together to hold all
  the data, providing indexing and search capabilities.
- **Node**: A single server that works as part of a cluster.
- **Index**: A collection of documents that have somewhat similar characteristics.
- **Document**: A basic unit of information that can be indexed.
- **Shards**: Fully-functional and independent subdivisions of indices. Each shard is actually
  a Lucene index.
- **Replicas**: Failover mechanisms that duplicate indices.

## Troubleshooting workflows

The type of problem determines which steps to take. The possible troubleshooting workflows are for:

- Search results.
- Indexing.
- Integration.
- Performance.
- Advanced Search Migrations.

### Search Results workflow

The following workflow is for Elasticsearch search results issues:

```mermaid
graph TD;
B --> |No| B1
B --> |Yes| B4
B1 --> B2
B2 --> B3
B4 --> B5
B5 --> |Yes| B6
B5 --> |No| B7
B7 --> B8
B{Is GitLab using<br>Elasticsearch for<br>searching?}
B1[Check Admin Area > Integrations<br>to ensure the settings are correct]
B2[Perform a search via<br>the rails console]
B3[If all settings are correct<br>and it still doesn't show Elasticsearch<br>doing the searches, escalate<br>to GitLab support.]
B4[Perform<br>the same search via the<br>Elasticsearch API]
B5{Are the results<br>the same?}
B6[This means it is working as intended.<br>Speak with GitLab support<br>to confirm if the issue lies with<br>the filters.]
B7[Check the index status of the project<br>containing the missing search<br>results.]
B8(Indexing Troubleshooting)
```

### Indexing workflow

The following workflow is for Elasticsearch indexing issues:

```mermaid
graph TD;
C --> |Yes| C1
C1 --> |Yes| C2
C1 --> |No| C3
C3 --> |Yes| C4
C3 --> |No| C5
C --> |No| C6
C6 --> |No| C10
C7 --> |GitLab| C8
C7 --> |Elasticsearch| C9
C6 --> |Yes| C7
C10 --> |No| C12
C10 --> |Yes| C11
C12 --> |Yes| C13
C12 --> |No| C14
C14 --> |Yes| C15
C14 --> |No| C16
C{Is the problem with<br>creating an empty<br>index?}
C1{Does the gitlab-production<br>index exist on the<br>Elasticsearch instance?}
C2(Try to manually<br>delete the index on the<br>Elasticsearch instance and<br>retry creating an empty index.)
C3{Can indices be made<br>manually on the Elasticsearch<br>instance?}
C4(Retry the creation of an empty index)
C5(It is best to speak with an<br>Elasticsearch admin concerning the<br>instance's inability to create indices.)
C6{Is the indexer presenting<br>errors during indexing?}
C7{Is the error a GitLab<br>error or an Elasticsearch<br>error?}
C8[Escalate to<br>GitLab support]
C9[You will want<br>to speak with an<br>Elasticsearch admin.]
C10{Does the index status<br>show 100%?}
C11[Escalate to<br>GitLab support]
C12{Does re-indexing the project<br> present any GitLab errors?}
C13[Rectify the GitLab errors and<br>restart troubleshooting, or<br>escalate to GitLab support.]
C14{Does re-indexing the project<br>present errors on the <br>Elasticsearch instance?}
C15[It would be best<br>to speak with an<br>Elasticsearch admin.]
C16[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
```

### Integration workflow

The following workflow is for Elasticsearch integration issues:

```mermaid
graph TD;
D --> |No| D1
D --> |Yes| D2
D2 --> |No| D3
D2 --> |Yes| D4
D4 --> |No| D5
D4 --> |Yes| D6
D{Is the error concerning<br>the Go indexer?}
D1[It would be best<br>to speak with an<br>Elasticsearch admin.]
D2{Is the ICU development<br>package installed?}
D3>This package is required.<br>Install the package<br>and retry.]
D4{Is the error stemming<br>from the indexer?}
D5[This would indicate an OS level<br> issue. It would be best to<br>contact your sysadmin.]
D6[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
```

### Performance workflow

The following workflow is for Elasticsearch performance issues:

```mermaid
graph TD;
F --> |Yes| F1
F --> |No| F2
F2 --> |No| F3
F2 --> |Yes| F4
F4 --> F5
F5 --> |No| F6
F5 --> |Yes| F7
F{Is the Elasticsearch instance<br>running on the same server<br>as the GitLab instance?}
F1(This is not advised and will cause issues.<br>We recommend moving the Elasticsearch<br>instance to a different server.)
F2{Does the Elasticsearch<br>server have at least 8<br>GB of RAM and 2 CPU<br>cores?}
F3(According to Elasticsearch, a non-prod<br>server needs these as a base requirement.<br>Production often requires more. We recommend<br>you increase the server specifications.)
F4(Obtain the <br>cluster health information)
F5{Does it show the<br>status as green?}
F6(We recommend you speak with<br>an Elasticsearch admin<br>about implementing sharding.)
F7(Escalate to<br>GitLab support.)
```

### Advanced Search Migrations workflow

The following workflow is for Advanced Search migration issues:

```mermaid
graph TD;
D --> |No| D1
D --> |Yes| D2
D2 --> |No| D3
D2 --> |Yes| D4
D4 --> |No| D5
D4 --> |Yes| D6
D6 --> |No| D8
D6 --> |Yes| D7
D{Is there a halted migration?}
D1[Migrations run in the<br>background and will<br>stop when completed.]
D2{Does the elasticsearch.log<br>file contain errors?}
D3[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
D4{Have the errors<br>been addressed?}
D5[Have an Elasticsearch admin<br>review and address<br>the errors.]
D6{Has the migration<br>been retried?}
D7[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
D8[Retry the migration from<br>the Admin > Settings ><br>Advanced Search UI.]
```

## Troubleshooting walkthrough

Most Elasticsearch troubleshooting can be broken down into five categories:

- [Troubleshooting search results](#troubleshooting-search-results)
- [Troubleshooting indexing](#troubleshooting-indexing)
- [Troubleshooting integration](#troubleshooting-integration)
- [Troubleshooting performance](#troubleshooting-performance)
- [Troubleshooting Advanced Search migrations](#troubleshooting-advanced-search-migrations)

Generally speaking, if an issue does not fall into one of those five categories, it is either:

- Something GitLab support needs to look into.
- Not a true Elasticsearch issue.

Exercise caution. Issues that appear to be Elasticsearch problems can be OS-level issues.

### Troubleshooting search results

Troubleshooting search result issues is rather straightforward on Elasticsearch.

The first step is to confirm GitLab is using Elasticsearch for the search function.
To do this:

1. Confirm the integration is enabled in **Admin Area > Settings > General**.
1. Confirm searches use Elasticsearch by accessing the rails console
   (`sudo gitlab-rails console`) and running the following commands:

   ```ruby
   # Look up the user who ran the search
   u = User.find_by_email('email_of_user_doing_search')
   # Build the same search that user would run
   s = SearchService.new(u, { search: 'search_term' })
   # Print the class of the result set
   pp s.search_objects.class
   ```

The output from the last command is the key here. If it shows:

- `ActiveRecord::Relation`, **it is not** using Elasticsearch.
- `Kaminari::PaginatableArray`, **it is** using Elasticsearch.

| Not using Elasticsearch  | Using Elasticsearch          |
|--------------------------|------------------------------|
| `ActiveRecord::Relation` | `Kaminari::PaginatableArray` |

If all the settings look correct and it is still not using Elasticsearch for the search function, it is best to escalate to GitLab support. This could be a bug/issue.

Moving past that, it is best to attempt the same [search via the Rails console](../../integration/elasticsearch.md#i-indexed-all-the-repositories-but-i-cant-get-any-hits-for-my-search-term-in-the-ui)
or the [Elasticsearch Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html),
and compare the results with what you see in GitLab.

If the results:

- Sync up, then there is not a technical "issue." Instead, it might be a problem
  with the Elasticsearch filters we are using. This can be complicated, so it is best to
  escalate to GitLab support to check these and advise you on whether
  a feature request is needed.
- Do not match up, this indicates a problem with the documents generated from the
  project. It is best to re-index that project and proceed with
  [Troubleshooting indexing](#troubleshooting-indexing).
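
As a rough sketch of the Search API comparison, the following shell helper builds a URI search request for a single term. The `ES_URL` and `ES_INDEX` defaults and the `search_url` helper name are assumptions for illustration. `gitlab-production` is the index name used elsewhere on this page, and your instance may use a different one.

```shell
# Assumed values: adjust the URL and index name for your instance.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_INDEX="${ES_INDEX:-gitlab-production}"

# Build a URI search request (the `q` parameter) for a term against the index.
search_url() {
  printf '%s/%s/_search?q=%s&pretty\n' "$ES_URL" "$ES_INDEX" "$1"
}

# Usage (requires a reachable Elasticsearch instance):
#   curl --silent "$(search_url 'search_term')"
```

Compare the hits returned with the results GitLab shows for the same term. Terms that contain spaces or special characters need to be URL-encoded first.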

### Troubleshooting indexing

Troubleshooting indexing issues can be tricky. It can pretty quickly go to either GitLab
support or your Elasticsearch administrator.

The best place to start is to determine if the issue is with creating an empty index.
If it is, check on the Elasticsearch side to determine if the `gitlab-production` index (the
name of the GitLab index) exists. If it exists, manually delete it on the Elasticsearch
side and attempt to recreate it from the
[`recreate_index`](../../integration/elasticsearch.md#gitlab-advanced-search-rake-tasks)
Rake task.

If you still encounter issues, try creating an index manually on the Elasticsearch
instance. The details of the index aren't important here, as we want to test if indices
can be made. If the indices:

- Cannot be made, speak with your Elasticsearch administrator.
- Can be made, escalate this to GitLab support.
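
The manual index test can be sketched with the Elasticsearch create and delete index APIs. This is a minimal example under stated assumptions: the `ES_URL` default, the throwaway index name, and the helper function names are all hypothetical and not part of GitLab.

```shell
# Assumed value: adjust the URL for your Elasticsearch instance.
ES_URL="${ES_URL:-http://localhost:9200}"
# Hypothetical throwaway index name used only for this test.
TEST_INDEX="gitlab-index-creation-test"

# Ask Elasticsearch to create the throwaway index.
create_test_index() {
  curl --silent --request PUT "$ES_URL/$TEST_INDEX"
}

# Remove the throwaway index once the test is done.
delete_test_index() {
  curl --silent --request DELETE "$ES_URL/$TEST_INDEX"
}

# Read a JSON response on stdin and succeed only if it was acknowledged.
acknowledged() {
  grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Usage:
#   create_test_index | acknowledged && echo "indices can be made"
#   delete_test_index
```

If the create request is not acknowledged, the error in the response body is what to bring to your Elasticsearch administrator.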

If the issue is not with creating an empty index, the next step is to check for errors
during the indexing of projects. If errors do occur, they stem from either the indexing:

- On the GitLab side. You need to rectify those. If they are not
  something you are familiar with, contact GitLab support for guidance.
- Within the Elasticsearch instance itself. See if the error is [documented and has a fix](../../integration/elasticsearch.md#troubleshooting). If not, speak with your Elasticsearch administrator.

If the indexing process does not present errors, check the status of the indexed projects. You can do this via the following Rake tasks:

- [`sudo gitlab-rake gitlab:elastic:index_projects_status`](../../integration/elasticsearch.md#gitlab-advanced-search-rake-tasks) (shows the overall status)
- [`sudo gitlab-rake gitlab:elastic:projects_not_indexed`](../../integration/elasticsearch.md#gitlab-advanced-search-rake-tasks) (shows specific projects that are not indexed)

If:

- Everything is showing at 100%, escalate to GitLab support. This could be a potential
  bug/issue.
- You do see something not at 100%, attempt to reindex that project. To do this,
  run `sudo gitlab-rake gitlab:elastic:index_projects ID_FROM=<project ID> ID_TO=<project ID>`.

If reindexing the project shows:

- Errors on the GitLab side, escalate those to GitLab support.
- Elasticsearch errors or doesn't present any errors at all, reach out to your
  Elasticsearch administrator to check the instance.
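
When several projects are not fully indexed, the reindex command above can be generated once per project ID. A minimal sketch: the `reindex_commands` helper name and the example IDs are hypothetical, and the helper only prints the commands so they can be reviewed before anything is run.

```shell
# Print one single-project reindex command for each project ID given.
reindex_commands() {
  for id in "$@"; do
    printf 'sudo gitlab-rake gitlab:elastic:index_projects ID_FROM=%s ID_TO=%s\n' "$id" "$id"
  done
}

# Usage (example IDs only):
#   reindex_commands 12 47
```

Running the printed commands one at a time makes it easier to tell which project produces an error.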

### Troubleshooting integration

Troubleshooting integration tends to be pretty straightforward, as there really isn't
much to "integrate" here.

If the issue is:

- With the Go indexer, check if the ICU development package is installed.
  This is a required package, so make sure you install it.
  The Go indexer was a beta indexer that could optionally be turned on or off, but in 12.3 it reached stable status and is now the default.
- Not concerning the Go indexer, it is almost always an
  Elasticsearch-side issue. This means you should reach out to your Elasticsearch administrator
  regarding the error(s) you are seeing. If you are unsure here, it never hurts to reach
  out to GitLab support.

Beyond that, review the error. If it is:

- Specifically from the indexer, this could be a bug/issue and should be escalated to
  GitLab support.
- An OS issue, you should reach out to your systems administrator.
- A `Faraday::TimeoutError (execution expired)` error **and** you're using a proxy,
  [set a custom `gitlab_rails['env']` environment variable, called `no_proxy`](https://docs.gitlab.com/omnibus/settings/environment-variables.html)
  with the IP address of your Elasticsearch host.

### Troubleshooting performance

Troubleshooting performance can be difficult on Elasticsearch. There is a ton of tuning
that *can* be done, but the majority of this falls on the shoulders of a skilled
Elasticsearch administrator.

Generally speaking, ensure:

- The Elasticsearch server **is not** running on the same node as GitLab.
- The Elasticsearch server has enough RAM and CPU cores.
- That sharding **is** being used.

Going into some more detail here, if Elasticsearch is running on the same server as GitLab, resource contention is **very** likely to occur. Ideally, Elasticsearch, which requires ample resources, should be running on its own server (maybe coupled with Logstash and Kibana).

When it comes to Elasticsearch, RAM is the key resource. Elasticsearch recommends:

- **At least** 8 GB of RAM for a non-production instance.
- **At least** 16 GB of RAM for a production instance.
- Ideally, 64 GB of RAM.

For CPU, Elasticsearch recommends at least 2 CPU cores, and states that common
setups use up to 8 cores. For more details on server specs, check out
[Elasticsearch's hardware guide](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html).
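
The baseline figures above can be checked on a Linux Elasticsearch server with a few lines of shell. This is only a sketch: the `meets_baseline` helper and the variable names are hypothetical, and the thresholds are the non-production minimums quoted on this page.

```shell
# Non-production baseline quoted above.
MIN_RAM_GB=8
MIN_CPU_CORES=2

# Succeed when the given RAM (in whole GB) and core count meet the baseline.
meets_baseline() {
  [ "$1" -ge "$MIN_RAM_GB" ] && [ "$2" -ge "$MIN_CPU_CORES" ]
}

# Usage on the Elasticsearch server (Linux only):
#   # MemTotal is reported in kB; round to the nearest GB because the
#   # kernel reports slightly less than the nominal amount.
#   ram_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024 + 0.5}' /proc/meminfo)
#   meets_baseline "$ram_gb" "$(nproc)" && echo "baseline met" || echo "below baseline"
```

Meeting the baseline does not mean the server is sized for production, which often requires considerably more.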

Beyond the obvious, sharding comes into play. Sharding is a core part of Elasticsearch.
It allows for horizontal scaling of indices, which is helpful when you are dealing with
a large amount of data.

With the way GitLab does indexing, a **huge** number of documents are
indexed. By utilizing sharding, you can speed up Elasticsearch's ability to locate
data, since each shard is a Lucene index.

If you are not using sharding, you are likely to hit issues when you start using
Elasticsearch in a production environment.

Keep in mind that an index with only one shard has **no scale factor** and will
likely encounter issues when called upon with some frequency.

If you need to know how many shards to use, read
[Elasticsearch's documentation on capacity planning](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/capacity-planning.html),
as the answer is not straightforward.

The easiest way to check the state of sharding and replication is to look at the status
and shard counts in the output of the
[Elasticsearch Health API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html):

- Red means at least one primary shard is unassigned, so part of the data is unavailable.
- Yellow means all primary shards are assigned, but one or more replica shards are not.
- Green means it is healthy (all primary and replica shards are assigned).

For production use, it should always be green.
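
As a minimal sketch of that check, the helper below pulls the `status` field out of a cluster health response. The `ES_URL` default and the `health_status` helper name are assumptions for illustration; `_cluster/health` and `_cat/shards` are standard Elasticsearch endpoints.

```shell
# Assumed value: adjust the URL for your Elasticsearch instance.
ES_URL="${ES_URL:-http://localhost:9200}"

# Read cluster health JSON on stdin and print the value of "status".
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Usage (requires a reachable Elasticsearch instance):
#   curl --silent "$ES_URL/_cluster/health" | health_status
#   # List each shard of the GitLab index with its state:
#   curl --silent "$ES_URL/_cat/shards/gitlab-production?v"
```

The same health response also includes `active_primary_shards` and `active_shards`, which show how many shards the cluster is actually serving.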

Beyond these steps, you get into some of the more complicated things to check,
such as merges and caching. These can get complicated and it takes some time to
learn them, so it is best to escalate/pair with an Elasticsearch expert if you need to
dig further into these.

Feel free to reach out to GitLab support, but this is likely to be something a skilled
Elasticsearch administrator has more experience with.

### Troubleshooting Advanced Search migrations

Troubleshooting Advanced Search migration failures can be difficult and may
require contacting an Elasticsearch administrator or GitLab Support.

The best place to start while debugging issues with an Advanced Search
migration is the [`elasticsearch.log` file](../logs.md#elasticsearchlog).
Migrations log information while they are in progress, including any
errors encountered. Apply fixes for any errors found in the log and retry
the migration.

If you still encounter issues after retrying the migration, reach out to GitLab support.
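
One way to pull the relevant entries out of the log is a simple case-insensitive filter. This is a sketch, not a GitLab tool: the `error_lines` helper is hypothetical, and the `ES_LOG` default assumes an Omnibus installation, so the path may differ on your system.

```shell
# Assumed Omnibus log location; installations from source use a different path.
ES_LOG="${ES_LOG:-/var/log/gitlab/gitlab-rails/elasticsearch.log}"

# Read log lines on stdin and print only those that mention an error.
error_lines() {
  grep -i 'error'
}

# Usage: show the 20 most recent matching entries.
#   sudo cat "$ES_LOG" | error_lines | tail -n 20
```

Entries logged around the time the migration halted are usually the most useful ones to share with an Elasticsearch administrator or GitLab Support.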

## Common issues

All common issues [should be documented](../../integration/elasticsearch.md#troubleshooting). If not,
feel free to update that page with issues you encounter and solutions.

## Replication

Setting up Elasticsearch isn't too bad, but it can be a bit finicky and time-consuming.

The easiest method is to spin up a Docker container with the required version and
bind ports 9200/9300 so it can be used.

The following is an example of running a Docker container of Elasticsearch v7.2.0:

```shell
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.2.0
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.2.0
```

From here, you can:

- Grab the IP of the Docker container (use `docker inspect <container_id>`).
- Use `<IP.add.re.ss:9200>` to communicate with it.
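
The two steps above can be sketched as small shell helpers. The helper names are hypothetical; the `--format` template is standard `docker inspect` syntax for printing a container's network address.

```shell
# Print the IP address of a running container, given its ID or name.
container_ip() {
  docker inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Build the Elasticsearch base URL for a given IP address.
es_base_url() {
  printf 'http://%s:9200\n' "$1"
}

# Usage (requires Docker and the running container):
#   curl "$(es_base_url "$(container_ip <container_id>)")"
```

Because the example `docker run` command publishes port 9200 on the host, `http://localhost:9200` should also reach the container from the Docker host itself.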

This is a quick method to test out Elasticsearch, but by no means is this a
production solution.