# OpenMLDB

Summary

This week, 11 pull requests were merged and 9 were opened; 14 issues were closed and 12 were opened. In total, 57 files were changed, with 3002 lines of code added and 539 lines deleted.

Merged Pull Requests

  • feat: refine the description of benchmark in readme #405
  • feat: update the benchmark images for readme #404
  • feat: remove junit dependencies and convert java cases to scala #396
  • feat: bump hadoop-common to 2.8.5 to avoid vulnerability #388
  • fix: fix a bug in union with instance not in window #381
  • feat: add batchjob module #359
  • feat: add task manager module #361
  • feat: enhance plan optimization for group and filter #350
  • feat: support column query with the same name in window skew optimization #349
  • feat: add java common lib #347
  • test: modify test case #368

Open Pull Requests

  • feat: bump junit from 4.11 to 4.13.1 in /java/openmldb-batchjob #382
  • WIP: create message table #383
  • feat: add batch, batchjob and taskmanager as java submodules #386
  • fix: fix some code implementation in window skew optimization #392
  • feat: support aggregation over the whole table #393
  • feat: add integration test #395
  • feat: support insert multiple rows into a table using a single SQL insert statement #399
  • feat: add kubernetes java dependencies for taskmanager #400
  • fix: python test and cicd #401

Closed Issues

  • Remove junit from OpenMLDB Batch because of its EPL license #390
  • Avoid vulnerability by upgrading hadoop-common dependency #387
  • Rtidb disk usage #389
  • feat: engine plan optimization for where and group with the same partition #317
  • Whether batch insert is supported #177
  • Parameter Meaning #378
  • The table uses size #373
  • Fix the issue to enable window skew optimization for window union case #374
  • Enable WindowSkewOpt by default and resolve the running issues #335
  • Add module for OpenMLDB Batch to run custom SQL and submit by TaskManager #351
  • Add TaskManager service to submit OpenMLDB Batch jobs #360
  • scripts: package java sdk set cmake type to release #372
  • Multiple columns with the same name can’t be executed with last join and over window #356
  • Add java common lib #342

Open Issues

  • feat: improve error message system #406
  • feat: support insert multiple rows into a table using a single SQL insert statement #398
  • feat: api server support parameterized query #397
  • Add engine test on performance insensitive mode #394
  • Bug: SQL INSERT Statement with multi rows does not work as expected #391
  • Make openmldb-batchjob and openmldb-taskmanager as submodules of openmldb-parent #385
  • Support AWS S3 for offline data lake storage #384
  • Support creating database API for NearlineTablet #380
  • Support hive metastore for NearlineTablet #379
  • Support submit and manage Flink jobs for TaskManager #376
  • Support submit and manage Kubernetes jobs for TaskManager #375
  • scripts: package java sdk set cmake type to release #371

Highlights

Three new Java modules were added this week: OpenMLDB-Common, OpenMLDB-BatchJob, and OpenMLDB-TaskManager. Reusable Java module abstractions allow new submodules to be implemented quickly. The BatchJob and TaskManager modules implement a first version of the minimum feature set, providing basic batch task management capabilities.

This week, by extending the SQL syntax parser and the physical plan optimizer, the online execution engine gained support for whole-table group aggregation and filtering. The difficulty of this feature is that UDF and UDAF calls in the SQL must be distinguished at the parsing stage, so that the corresponding logical plan and optimized physical plan can be generated.
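The UDF/UDAF distinction can be illustrated with a small sketch. It is not OpenMLDB's implementation: the expression classes, the `AGGREGATE_FUNCTIONS` registry, and the plan names returned by `plan_kind` are all hypothetical. They only show how a planner might decide, from the parsed select list, whether a query needs an aggregation over the whole table.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Hypothetical registry: function names treated as aggregates (UDAFs) here.
# Any other function name is treated as a row-wise scalar function (UDF).
AGGREGATE_FUNCTIONS = {"sum", "count", "avg", "min", "max"}


@dataclass
class Column:
    """A plain column reference in a parsed expression tree."""
    name: str


@dataclass
class Call:
    """A function call node; args may contain nested calls."""
    func: str
    args: List["Expr"] = field(default_factory=list)


Expr = Union[Column, Call]


def contains_aggregate(expr: Expr) -> bool:
    """Return True if the expression tree contains any aggregate call."""
    if isinstance(expr, Call):
        if expr.func.lower() in AGGREGATE_FUNCTIONS:
            return True
        return any(contains_aggregate(arg) for arg in expr.args)
    return False


def plan_kind(select_list: List[Expr], has_group_by: bool) -> str:
    """Pick an illustrative plan type for a select list (names are made up)."""
    if has_group_by:
        return "group_aggregate"
    if any(contains_aggregate(expr) for expr in select_list):
        # An aggregate without GROUP BY folds the whole table into one row.
        return "whole_table_aggregate"
    return "row_project"


# SELECT abs(sum(x)) FROM t: a scalar function wrapping an aggregate.
nested = [Call("abs", [Call("sum", [Column("x")])])]
print(plan_kind(nested, has_group_by=False))
```

Walking the full expression tree matters because an aggregate can be nested inside a scalar call, as in the `abs(sum(x))` case, and checking only the outermost function name would misclassify it as a row-wise projection.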

More developers are welcome to follow and contribute to the OpenMLDB open source project.