Issues with Partial Kubernetes Label Ingestion into OpenObserve

TLDR Hakan is having trouble ingesting Kubernetes labels into OpenObserve: logs from only some services arrive. Hengfei suggested that `gzip` compression might not be supported and confirmed that OpenObserve does not filter incoming data. The issue remains unresolved.

Hakan
Tue, 12 Sep 2023 11:21:30 UTC

Hi, I don't understand why only some of the service logs (or their labels) from different clusters are making it into OpenObserve. Mainly, I am interested in filtering by `kubernetes_labels_app_kubernetes_io_name`, which translates to the Kubernetes label ``, but somehow I only see a handful of the resources with that label showing up. I have this configured in fluent-bit:
```
[OUTPUT]
    Name http
    Match *
    URI /api/test/test-a1/_json
    Host
    Port 443
    tls On
    Format json
    Json_date_key _timestamp
    Json_date_format iso8601
    HTTP_User
    HTTP_Passwd Complexpass#123
    compress gzip
```
Any ideas? Thanks for your input.

Hengfei
Tue, 12 Sep 2023 11:23:42 UTC

I am not sure why only some labels are ingested, but I think we don't support `gzip` compression. Can you change it to `none`?

Hakan
Tue, 12 Sep 2023 11:28:54 UTC

thanks for the suggestion, I changed that, although it looks like it did not block logs from getting ingested before and didn't have any impact on the described behaviour

Hengfei
Tue, 12 Sep 2023 11:45:46 UTC

thanks, i will test with `compress:gzip`

Hengfei
Tue, 12 Sep 2023 11:46:25 UTC

but there is no reason for only some labels to be ingested: we accept data by row, so we either accept a row or drop a row, and never drop a single label.

Hengfei
Tue, 12 Sep 2023 11:46:50 UTC

did you add any functions to the stream?

Hakan
Tue, 12 Sep 2023 11:47:14 UTC

nope, none so far

Hakan
Tue, 12 Sep 2023 11:48:23 UTC

although I was just thinking about how to eliminate all the keys with `null` values

Ashish
Tue, 12 Sep 2023 11:55:00 UTC

can you please share a sample log record

Ashish
Tue, 12 Sep 2023 11:55:22 UTC

and also mention which labels are different

Hakan
Tue, 12 Sep 2023 11:58:56 UTC

```
{
  "_p": "F",
  "_timestamp": 1694519849343055,
  "kubernetes_annotations_timestamp": "20230905074108",
  "kubernetes_container_hash": "",
  "kubernetes_container_image": "",
  "kubernetes_container_name": "microservice-example-core",
  "kubernetes_docker_id": "d8bddd5097b3e43bc5851d6bf5db544bdd8aa60541a42c8b39d3a68654262635",
  "kubernetes_host": "ip-10-67-54-204.eu-central-1.compute.internal",
  "kubernetes_labels_app_kubernetes_io_instance": "web-service-example-backend-1",
  "kubernetes_labels_app_kubernetes_io_name": "microservice-example-core",
  "kubernetes_labels_pod_template_hash": "77b5867969",
  "kubernetes_namespace_name": "web",
  "kubernetes_pod_id": "3f6ec1f0-c5e5-46e6-91ae-0a273b619654",
  "kubernetes_pod_name": "microservice-example-core-77b5867969-w8mk4",
  "log": "removeProcessedAiFiles",
  "stream": "stdout",
  "time": "2023-09-12T11:57:29.343055801Z"
}
```

Hakan
Tue, 12 Sep 2023 12:02:17 UTC

on this cluster, the logs of the first service get ingested, but not those of the second service:
```
service-example/component: microservice-example-core : web-service-example-backend-1 : Helm : microservice-example-core : 1.0 : microservice-example-core-0.0.0
service-example/component: microservice-attachments : web-service-example-backend-4 : Helm : microservice-attachments : 1.0 : microservice-attachments-0.0.0
```

Hakan
Tue, 12 Sep 2023 12:04:19 UTC

however, on another cluster, the second service does get ingested, with exactly the same label set

Hengfei
Tue, 12 Sep 2023 12:05:18 UTC

OpenObserve has no filter for logs. I am wondering whether fluent-bit is deployed to each node?

Hakan
Tue, 12 Sep 2023 12:08:03 UTC

yes, made sure every node has a running fluent-bit deployed, especially the node with the not-logged service

Hakan
Tue, 12 Sep 2023 12:16:38 UTC

just wanted to add that on the second cluster, where the second application's logs from the above example do get ingested, there are other applications that again do not get ingested. seems like there isn't an obvious pattern here

Hengfei
Tue, 12 Sep 2023 12:18:36 UTC

I am thinking maybe we can debug fluent-bit, e.g. print the logs instead of ingesting them into OpenObserve, to confirm whether it collects the logs of that service.
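
For reference, a minimal sketch of that debugging step, assuming the same classic fluent-bit configuration format as the `[OUTPUT]` block earlier in the thread. The `stdout` output plugin prints every matched record to fluent-bit's own output, which would show whether records from the missing service are collected at all before anything is sent to OpenObserve. `Match *` mirrors the original output; `Format json_lines` is only a readability choice and not something taken from the thread.

```
# Debug sketch (assumption: temporarily added next to, or in place of,
# the existing http output while investigating).
[OUTPUT]
    # Print matched records instead of sending them over HTTP
    Name stdout
    # Same match pattern as the http output in the thread
    Match *
    # One JSON object per line, easier to search for a service name
    Format json_lines
```

If fluent-bit runs as a pod on each node, the printed records should appear in the logs of the fluent-bit pod on the node that hosts the missing service (for example via `kubectl logs`). If the service's records show up there with their labels, the collection side is working and the problem lies further downstream; if they do not, the input or Kubernetes filter configuration is the more likely place to look.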

Hakan
Tue, 12 Sep 2023 12:21:47 UTC

will try that