The use of Parquet files and S3 as the main data storage seems interesting, but can you elaborate on how telemetry is optimized for frequent data inserts, given that S3 only allows replacing a whole object and data cannot be appended to it directly?
Does this mean you use some sort of partitioning with many files under the hood, or a local buffer (memory/file/db) that batches updates to S3 every few seconds for better performance?
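For context, the local-buffer approach I have in mind looks roughly like the sketch below: rows accumulate in memory and get flushed as a new immutable object once a row-count or age threshold is hit (all names here are made up; real code would serialize the batch with something like pyarrow and upload it via boto3 `put_object` rather than a plain callback).

```python
import time

class BufferedWriter:
    """Hypothetical write buffer: accumulates rows in memory and flushes
    them as new immutable batch objects (e.g. Parquet files under a
    partition prefix), since appending to an existing S3 object is not
    possible."""

    def __init__(self, put_object, max_rows=1000, max_age_s=5.0):
        self.put_object = put_object  # in real code: wraps an S3 upload
        self.max_rows = max_rows      # flush when this many rows buffered
        self.max_age_s = max_age_s    # ...or when the batch is this old
        self.rows = []
        self.first_ts = None
        self.seq = 0

    def append(self, row):
        if self.first_ts is None:
            self.first_ts = time.monotonic()
        self.rows.append(row)
        if (len(self.rows) >= self.max_rows
                or time.monotonic() - self.first_ts >= self.max_age_s):
            self.flush()

    def flush(self):
        if not self.rows:
            return
        # Each flush writes a *new* object; nothing is appended in place.
        key = f"telemetry/batch-{self.seq:06d}.parquet"
        self.put_object(key, list(self.rows))  # real code: serialize to Parquet first
        self.seq += 1
        self.rows.clear()
        self.first_ts = None

# Toy usage with a dict standing in for the object store:
store = {}
w = BufferedWriter(store.__setitem__, max_rows=2)
w.append({"t": 1})
w.append({"t": 2})   # hits max_rows, triggers a flush
w.append({"t": 3})
w.flush()            # manual flush of the remainder
print(sorted(store))
# ['telemetry/batch-000000.parquet', 'telemetry/batch-000001.parquet']
```

Is this close to what you do, or do you rely purely on partition layout with many small files?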
Would be happy to give it a try!