HA Deployment Problem in EKS Cluster
TLDR: Dylan was having issues setting up an HA deployment in their EKS cluster. Prabhat and Hengfei advised them to update to a newer version (0.7.0), which fixed the problem.
Nov 14, 2023 (2 weeks ago)
Dylan
08:59 PM It looks like it's trying to retrieve the schema from localhost, so there might be some sort of communication failure between the router and the querier.
Any ideas? Thanks!
helmfile.yaml
- name: openobserve-{{ .Environment.Name }}
  chart: openobserve/openobserve
  version: 0.6.4
  labels:
    service-name: openobserve
    is-elk: true
  namespace: {{ .Environment.Name }}
  values:
    - ./openobserve/values/{{ .Environment.Name }}.yaml
dev.yaml
config:
  ZO_S3_BUCKET_NAME: ""
serviceAccount:
  annotations:
    : arn:aws:iam:::
```
kubectl --context development -n openobserve logs openobserve-development-router-7689f4c577-c2v7q | grep 503
[2023-11-14T20:48:11Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?stream=default&period=15m&refresh=0&org_identifier=default&sql_mode=false" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000134
[2023-11-14T20:48:13Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/summary HTTP/1.1" 503 43 "-" "http://localhost:5080/web/?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000230
[2023-11-14T20:48:14Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000150
[2023-11-14T20:48:19Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000212
[2023-11-14T20:48:38Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000303
[2023-11-14T20:49:06Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000210
[2023-11-14T20:49:12Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs HTTP/1.1" 503 43 "-" "http://localhost:5080/web/metrics?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000197
[2023-11-14T20:49:12Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=metrics&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/metrics?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000182
[2023-11-14T20:49:13Z INFO actix_web::middleware::logger] 127.0.0.1 "GET /api/default/streams?type=logs&fetchSchema=true HTTP/1.1" 503 43 "-" "http://localhost:5080/web/logs?org_identifier=default" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 0.000191
kubectl --context development -n openobserve logs openobserve-development-querier-5f87588bcc-bd8bz
[2023-11-14T20:47:26Z INFO openobserve] Starting OpenObserve v0.6.4
[2023-11-14T20:47:26Z INFO openobserve] System info: CPU cores 64, MEM total 511451 MB, Disk total 511 GB, free 84 GB
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] Start watching node_list
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] [CLUSTER] Register to cluster ok
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 3, uuid: "b9b83ece-2f42-4b06-8355-444f5861ab43", name: "openobserve-development-router-7689f4c577-c2v7q", http_addr: "http://10.19.226.124:5080", grpc_addr: "http://10.19.226.124:5081", role: [Router], cpu_num: 64, status: Prepare, broadcasted: false }
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 4, uuid: "53f45f30-46e9-4302-a603-426f3cbc44ee", name: "openobserve-development-alertmanager-687dfc4cd5-77mfc", http_addr: "http://10.19.226.240:5080", grpc_addr: "http://10.19.226.240:5081", role: [AlertManager], cpu_num: 64, status: Prepare, broadcasted: false }
[2023-11-14T20:47:28Z INFO openobserve::service::db::user] Start watching user
[2023-11-14T20:47:28Z INFO openobserve::service::db::user] Users Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::functions] Start watching function
[2023-11-14T20:47:28Z INFO openobserve::service::db::metrics] Start watching prometheus cluster leader
[2023-11-14T20:47:28Z INFO openobserve::service::db::schema] Start watching stream schema
[2023-11-14T20:47:28Z INFO openobserve::service::db::compact::retention] Start watching stream deleting
[2023-11-14T20:47:28Z INFO openobserve::service::db::triggers] Start watching Triggers
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts::destinations] Start watching alert destinations
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts] Start watching alerts
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts::templates] Start watching alert templates
[2023-11-14T20:47:28Z INFO openobserve::service::db::schema] Stream schemas Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::functions] Functions Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::metrics] Prometheus cluster leaders Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts::templates] Alert templates Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts::destinations] Alert destinations Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::alerts] Alerts Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::triggers] Triggers Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::syslog] SyslogRoutes Cached
[2023-11-14T20:47:28Z INFO openobserve::service::db::syslog] SyslogServer settings Cached
[2023-11-14T20:47:28Z INFO object_store::aws] Using WebIdentity credential provider
[2023-11-14T20:47:28Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 3, uuid: "b9b83ece-2f42-4b06-8355-444f5861ab43", name: "openobserve-development-router-7689f4c577-c2v7q", http_addr: "http://10.19.226.124:5080", grpc_addr: "http://10.19.226.124:5081", role: [Router], cpu_num: 64, status: Online, broadcasted: false }
[2023-11-14T20:47:28Z INFO openobserve::service::db::file_list::remote] Load file_list [file_list/] gets 0 files
[2023-11-14T20:47:28Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 4, uuid: "53f45f30-46e9-4302-a603-426f3cbc44ee", name: "openobserve-development-alertmanager-687dfc4cd5-77mfc", http_addr: "http://10.19.226.240:5080", grpc_addr: "http://10.19.226.240:5081", role: [AlertManager], cpu_num: 64, status: Online, broadcasted: false }
[2023-11-14T20:47:41Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 5, uuid: "be53aeed-6ff7-440d-a358-f525f316baf8", name: "openobserve-development-ingester-0", http_addr: "http://10.19.226.138:5080", grpc_addr: "http://10.19.226.138:5081", role: [Ingester], cpu_num: 64, status: Prepare, broadcasted: false }
[2023-11-14T20:47:42Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 5, uuid: "be53aeed-6ff7-440d-a358-f525f316baf8", name: "openobserve-development-ingester-0", http_addr: "http://10.19.226.138:5080", grpc_addr: "http://10.19.226.138:5081", role: [Ingester], cpu_num: 64, status: Online, broadcasted: false }
kubectl --context development -n openobserve logs openobserve-development-router-7689f4c577-c2v7q | grep CLUSTER
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] [CLUSTER] Register to cluster ok
[2023-11-14T20:47:27Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 4, uuid: "53f45f30-46e9-4302-a603-426f3cbc44ee", name: "openobserve-development-alertmanager-687dfc4cd5-77mfc", http_addr: "http://10.19.226.240:5080", grpc_addr: "http://10.19.226.240:5081", role: [AlertManager], cpu_num: 64, status: Prepare, broadcasted: false }
[2023-11-14T20:47:28Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 3, uuid: "b9b83ece-2f42-4b06-8355-444f5861ab43", name: "openobserve-development-router-7689f4c577-c2v7q", http_addr: "http://10.19.226.124:5080", grpc_addr: "http://10.19.226.124:5081", role: [Router], cpu_num: 64, status: Online, broadcasted: false }
[2023-11-14T20:47:28Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 4, uuid: "53f45f30-46e9-4302-a603-426f3cbc44ee", name: "openobserve-development-alertmanager-687dfc4cd5-77mfc", http_addr: "http://10.19.226.240:5080", grpc_addr: "http://10.19.226.240:5081", role: [AlertManager], cpu_num: 64, status: Online, broadcasted: false }
[2023-11-14T20:47:41Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 5, uuid: "be53aeed-6ff7-440d-a358-f525f316baf8", name: "openobserve-development-ingester-0", http_addr: "http://10.19.226.138:5080", grpc_addr: "http://10.19.226.138:5081", role: [Ingester], cpu_num: 64, status: Prepare, broadcasted: false }
[2023-11-14T20:47:42Z INFO openobserve::common::infra::cluster] [CLUSTER] join Node { id: 5, uuid: "be53aeed-6ff7-440d-a358-f525f316baf8", name: "openobserve-development-ingester-0", http_addr: "http://10.19.226.138:5080", grpc_addr: "http://10.19.226.138:5081", role: [Ingester], cpu_num: 64, status: Online, broadcasted: false }
```
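Output like the above can be tedious to read line by line. The following is only a rough sketch of how it might be summarised, assuming the access-log and join-line formats are exactly as shown in the pasted logs. The helper names `count_503_by_path` and `roles_seen` are made up for illustration and are not part of OpenObserve or kubectl. In the router excerpt above, the join lines mention the AlertManager, Router and Ingester roles; the thread does not establish whether the absence of a Querier line there is significant.

```shell
# Both helpers read log text on stdin.

# Count 503 responses per request path in actix access-log lines.
# Splitting on double quotes puts `GET /path HTTP/1.1` in field 2
# and ` 503 43 ` (status code, response size) in field 3.
count_503_by_path() {
  awk -F'"' '{
    split($2, req, " ")   # req[1] = method, req[2] = path
    split($3, st, " ")    # st[1] = status code
    if (st[1] == "503") counts[req[2]]++
  } END { for (p in counts) print counts[p], p }' | sort -rn
}

# List the distinct node roles that appear in "[CLUSTER] join Node" lines.
roles_seen() {
  grep -o 'role: \[[A-Za-z, ]*\]' | sort -u
}

# Possible usage (pod name is a placeholder, not run here):
# kubectl --context development -n openobserve logs <router-pod> | count_503_by_path
# kubectl --context development -n openobserve logs <router-pod> | roles_seen
```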
Prabhat
09:01 PM
Prabhat
09:02 PM
Prabhat
09:03 PM
Prabhat
09:03 PM
Dylan
09:03 PM
Dylan
09:04 PM
Prabhat
09:04 PM
Dylan
09:04 PM
Prabhat
09:04 PM
Dylan
09:06 PM
Prabhat
09:06 PM
Prabhat
09:06 PM
Prabhat
09:06 PM
Dylan
09:06 PM
Prabhat
09:14 PM
Prabhat
09:14 PM
image:
  repository: public.ecr.aws/zinclabs/openobserve
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: "0.7.0"
Dylan
09:14 PM
Prabhat
09:14 PM
helm repo update
helm -n openobserve -f values.yaml upgrade --install zo1 openobserve/openobserve
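After an upgrade like the one above, it can help to confirm that every pod actually picked up the new image tag. The following is a minimal sketch under that assumption; `pods_not_on_tag` is a hypothetical helper name, and the commented `kubectl` invocation assumes the namespace used elsewhere in this thread and that the first container in each pod is the OpenObserve one.

```shell
# Print the names of pods whose image tag differs from the expected one.
# stdin: one "<pod-name><TAB><image>" pair per line.
# $1: expected tag, e.g. 0.7.0
pods_not_on_tag() {
  awk -v tag="$1" 'NF >= 2 {
    n = split($2, parts, ":")   # the last ":"-separated piece is the tag
    if (parts[n] != tag) print $1
  }'
}

# Possible usage against the cluster (not run here):
# kubectl -n openobserve get pods \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}' \
#   | pods_not_on_tag 0.7.0
```

An empty result would suggest that all listed pods run the expected tag; any names printed are pods that may still be on an older image.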
Dylan
09:18 PM
Dylan
09:18 PM
Nov 15, 2023 (2 weeks ago)
Hengfei
01:18 AM