Blog

Release 2.1.0

The Apache HoraeDB (incubating) team is pleased to announce the release of v2.1.0, which closes over 60 issues and includes two major features:

1. New WAL implementation based on local disk.

In previous versions, the WAL was based on RocksDB. Although it works well in most cases, it has the following issues:

  • Compiling from source can be a challenging task, especially since RocksDB is primarily written in C++.
  • For WAL, RocksDB can be somewhat overkill. If you are not familiar with RocksDB, tuning it can be very challenging.

The new WAL solves both problems. In performance tests it also slightly outperforms the previous implementation, giving a solid foundation for future optimizations.

[Figure: Comparison of write throughput]
[Figure: Comparison of replay time]

Interested readers can refer to the design documentation here for more details on this feature.

How to enable

[analytic.wal]
type = "Local"
data_dir = "/path/to/local/wal"
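
To make the idea of a local-disk WAL more concrete, here is a minimal sketch of the general technique: records are appended to a file and replayed in order on startup. It is not HoraeDB's actual implementation. The function names `append_record` and `replay`, and the framing (a 4-byte little-endian length followed by the payload), are assumptions made for illustration only.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, Read, Write};
use std::path::Path;

/// Append one record to the log as `[u32 little-endian length][payload]`.
/// Hypothetical helper: this framing is an assumption, not HoraeDB's on-disk format.
pub fn append_record(path: &Path, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
    }
    let len = payload.len() as u32;
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(&len.to_le_bytes())?;
    file.write_all(payload)?;
    // Ask the OS to persist the data so the record survives a crash.
    file.sync_data()
}

/// Read back every complete record in write order.
/// A truncated tail (for example, a crash in the middle of an append) is ignored.
pub fn replay(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut records = Vec::new();
    loop {
        let mut len_buf = [0u8; 4];
        match reader.read_exact(&mut len_buf) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
        match reader.read_exact(&mut payload) {
            Ok(()) => records.push(payload),
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
    }
    Ok(records)
}
```

A production WAL would typically also add checksums, segment rotation, and batching of syncs. The sketch leaves these out to keep the append and replay paths easy to follow.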

2. Access object store with Apache OpenDAL

OpenDAL (Open Data Access Layer) is a project that provides a unified API for accessing various data storage backends. It offers several advantages for developers and organizations. Here are some key benefits:

  • Unified API: OpenDAL provides a consistent and unified API for accessing different storage backends, such as AWS S3, Azure Blob Storage, and local file systems.
  • Optimized for Efficiency: OpenDAL is built with performance in mind. It includes optimizations to ensure efficient data access and manipulation, making it suitable for high-performance applications.
  • Comprehensive Documentation: The project provides detailed documentation, making it easier for developers to get started and understand how to use the library effectively.

Recent versions of OpenDAL provide an object_store integration, which simplifies the HoraeDB code migration: the APIs used by the upper layers remain virtually unchanged, and only the object store layer needs to be wrapped in a unified OpenDAL operator:

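use std::sync::Arc;
use object_store_opendal::OpendalStore;
use opendal::{services::S3, Operator};

// Create a new operator (a real S3 builder also needs bucket, region, and credentials)
let operator = Operator::new(S3::default())?.finish();

// Create a new object store
let object_store = Arc::new(OpendalStore::new(operator));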

Additionally, the Apache OpenDAL implementation of object_store is based on the latest version of object_store, which has breaking changes relative to the version HoraeDB was using. We chose to make HoraeDB compatible with the new API while keeping the scope of this upgrade as manageable as possible.

In adapting to the new API, the put_multipart interface changed the most, so most of the adaptation logic lives there. HoraeDB's approach is to encapsulate the underlying put_multipart interface so that the upper-layer code does not need to be modified. The details can be found here:

https://github.com/apache/horaedb/blob/v2.1.0/src/components/object_store/src/multi_part.rs

Note: The adaptation logic is only practical when parquet version < 52.0.0.
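
As a rough illustration of this kind of encapsulation, the sketch below keeps a byte-stream write interface for upper layers while forwarding fixed-size parts to a part-oriented uploader underneath. It is not the code in multi_part.rs. The `PartUpload` trait, `BufferedMultipartWriter`, `RecordingUpload`, and the part size are all hypothetical, and the real object_store API is asynchronous, whereas this sketch is synchronous for brevity.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a part-oriented upload handle.
pub trait PartUpload {
    fn put_part(&mut self, part: Vec<u8>) -> io::Result<()>;
    fn complete(&mut self) -> io::Result<()>;
}

/// Adapter exposing `std::io::Write` on top of a part-oriented uploader,
/// so callers that write arbitrary-sized chunks do not need to change.
pub struct BufferedMultipartWriter<U: PartUpload> {
    upload: U,
    part_size: usize,
    buffer: Vec<u8>,
}

impl<U: PartUpload> BufferedMultipartWriter<U> {
    pub fn new(upload: U, part_size: usize) -> Self {
        assert!(part_size > 0, "part_size must be non-zero");
        Self { upload, part_size, buffer: Vec::with_capacity(part_size) }
    }

    /// Send any remaining bytes as a final (possibly short) part, then complete.
    pub fn finish(mut self) -> io::Result<U> {
        if !self.buffer.is_empty() {
            let last = std::mem::take(&mut self.buffer);
            self.upload.put_part(last)?;
        }
        self.upload.complete()?;
        Ok(self.upload)
    }
}

impl<U: PartUpload> Write for BufferedMultipartWriter<U> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buffer.extend_from_slice(data);
        // Forward every full part and keep the remainder buffered.
        while self.buffer.len() >= self.part_size {
            let rest = self.buffer.split_off(self.part_size);
            let part = std::mem::replace(&mut self.buffer, rest);
            self.upload.put_part(part)?;
        }
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Parts are only sent when full or on `finish`, so there is nothing to do here.
        Ok(())
    }
}

/// In-memory uploader used only to exercise the adapter.
#[derive(Default)]
pub struct RecordingUpload {
    pub parts: Vec<Vec<u8>>,
    pub completed: bool,
}

impl PartUpload for RecordingUpload {
    fn put_part(&mut self, part: Vec<u8>) -> io::Result<()> {
        self.parts.push(part);
        Ok(())
    }

    fn complete(&mut self) -> io::Result<()> {
        self.completed = true;
        Ok(())
    }
}
```

With a part size of 4, writing "abcdef" and then "ghij" through the adapter and calling `finish` hands the uploader three parts: "abcd", "efgh", and a final short part "ij".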

Download

Go to the download page.

Conclusion

Other bug fixes and improvements can be seen here:

As always, we warmly welcome you to join our community and share your insights.

Release 2.0.0

This is the first release since entering the ASF incubator. Thanks to everyone for making it happen!

Upgrade from 1.x.x to 2.0.0

The transition from CeresDB to Apache HoraeDB introduces several breaking changes. Upgrading from older versions to v2.0.0 requires a few specific changes.

Upgrade Steps

  1. Set up the required environment variables
export HORAEDB_DEFAULT_CATALOG=ceresdb
  2. Update config

The etcd root path must be configured in both horaedb and horaemeta.

For horaedb

[cluster_deployment.etcd_client]
server_addrs = ['127.0.0.1:2379']
root_path = "/rootPath"

For horaemeta

storage-root-path = "/rootPath"
  3. Upgrade horaemeta

horaedb will log the following error, which is expected:

2024-01-23 14:37:57.726 ERRO [src/cluster/src/cluster_impl.rs:136] Send heartbeat to meta failed, err:Failed to send heartbeat, cluster:defaultCluster, err:status: Unimplemented, message: "unknown service meta_service.MetaRpcService", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
  4. Upgrade horaedb

After all servers are upgraded, the cluster should be ready for reads and writes, and old data can be queried as before.

What’s Changed

Breaking Changes

Features

Refactor

Fixed

Docs

Chore

New Contributors

Full Changelog: https://github.com/apache/horaedb/compare/v1.2.7...v2.0.0