Synchronizing MySQL table with OpenObserve

TLDR: dlonghi asked how to synchronize a MySQL table with OpenObserve. Prabhat advised ingesting the same `log_id` field and addressed a potential duplication concern, though deletion is not yet available. A feature request was created for this issue.

Photo of dlonghi
dlonghi
Mon, 03 Jul 2023 11:57:30 UTC

How could we sync a MySQL table with OpenObserve? It is a log table that an old PHP app writes to. It has 3 fields: `log_id`, `timestamp`, `message`. In Elasticsearch we have been doing this kind of sync using an UPSERT operation, with the ES doc `_id` set to the table's `log_id`.
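The Elasticsearch-side approach described above can be sketched as follows. This is a minimal illustration, assuming the official `elasticsearch` Python client; the index name `app_logs` and the sample rows are hypothetical.

```python
# Sketch of the ES upsert-by-log_id approach described above.
# Assumptions: official `elasticsearch` Python client; hypothetical
# index name "app_logs" and sample rows.

def to_upsert_action(row, index="app_logs"):
    """Map a MySQL log row to a bulk action keyed by log_id.

    Because _id is derived from log_id, re-ingesting the same row
    overwrites the same document instead of creating a duplicate.
    """
    return {
        "_op_type": "index",   # "index" with a fixed _id behaves as an upsert
        "_index": index,
        "_id": row["log_id"],  # ES doc _id == table's log_id
        "_source": {"timestamp": row["timestamp"], "message": row["message"]},
    }

rows = [
    {"log_id": 1, "timestamp": "2023-07-03T11:57:30Z", "message": "started"},
    {"log_id": 2, "timestamp": "2023-07-03T11:58:01Z", "message": "done"},
]
actions = [to_upsert_action(r) for r in rows]
# Against a live cluster these would be sent with:
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch("http://localhost:9200"), actions)
print(actions[0]["_id"])  # → 1
```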

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:02:40 UTC

OpenObserve does not have an equivalent special `_id` field. You can, however, insert the same `log_id` or `_id` field in OpenObserve and it will gladly accept it, and you can continue using it. The only difference compared to ES is that if you send 2 records with the same value for the `_id` field, OpenObserve will not update the first record but will create 2 records.
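Carrying `log_id` through as an ordinary field might look like the sketch below. It assumes a local OpenObserve instance, a stream named `app_logs`, and the JSON ingestion endpoint (`POST /api/{org}/{stream}/_json`); the credentials and field names are placeholders.

```python
# Sketch of ingesting MySQL rows (log_id included) into OpenObserve.
# Assumptions: local instance at localhost:5080, stream "app_logs",
# placeholder credentials; adjust all of these for a real setup.
import base64
import json
import urllib.request

def build_ingest_request(rows, org="default", stream="app_logs",
                         base_url="http://localhost:5080",
                         user="root@example.com", password="password"):
    """Build the JSON ingestion request; log_id is just another field."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/api/{org}/{stream}/_json",
        data=json.dumps(rows).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )

rows = [{"log_id": 1, "timestamp": "2023-07-03T11:57:30Z", "message": "started"}]
req = build_ingest_request(rows)
# Against a running instance: urllib.request.urlopen(req)
# Note: sending the same log_id twice creates two records -- there is no upsert.
print(req.full_url)
```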

Photo of dlonghi
dlonghi
Mon, 03 Jul 2023 14:06:22 UTC

Thanks. The goal for this use case is to avoid duplicates. Can documents be deleted from OO? I was looking at the docs but couldn't find info about that.

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:07:28 UTC

Deletion is not available right now. We have a feature request for enabling deduplication, but that is not yet implemented.

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:07:59 UTC

Don't you already have deduplicated data in MySQL?

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:08:36 UTC

If `log_id` is the primary key or has a UNIQUE index, then it should be deduplicated already.

Photo of dlonghi
dlonghi
Mon, 03 Jul 2023 14:18:29 UTC

Yes, the data is unique/deduplicated. The ingesting script is poor though :/ no control over what data was already indexed, no error handling... If it fails, we just run it again, and the Elasticsearch upsert with `_id` handles the dupes. Certainly this ingesting script should be worked on :)

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:48:59 UTC

Ah, got it. Valid use case. Created this issue to track it -

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:49:53 UTC

You can still query in SQL mode and use DISTINCT, and you will get deduplicated values today.
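The DISTINCT workaround might look like this. The snippet only builds the query body; the stream name `app_logs`, the field names, and the search endpoint (`POST /api/{org}/_search`) are assumptions to adapt to a real deployment.

```python
# Sketch of the SQL-mode DISTINCT workaround for duplicate records.
# Assumptions: stream "app_logs" with the three MySQL fields; the body
# shape targets OpenObserve's search endpoint (POST /api/{org}/_search).
import json

def build_search_body(stream="app_logs"):
    # DISTINCT collapses rows only when ALL selected fields match, so
    # duplicate ingests of the same unchanged row fold into one result.
    sql = f"SELECT DISTINCT log_id, timestamp, message FROM {stream}"
    return {"query": {"sql": sql, "from": 0, "size": 100}}

body = build_search_body()
print(json.dumps(body, indent=2))
```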

Photo of Prabhat
Prabhat
Mon, 03 Jul 2023 14:49:57 UTC

if that helps

Photo of dlonghi
dlonghi
Mon, 03 Jul 2023 18:30:56 UTC

Thank you Mr Prabhat.