As noted in the previous DDD article, I knew in theory that Repository was the right thing to use, but I had not explored best practices and in effect used it the way I would use a DAO. The only difference I saw was that Repository sits at the domain level, and I had not thought about it any more deeply than that.
While rereading DDD Play 2 recently, I came across this comment:
Domain service should not call repository directly. This overturned my understanding of repository and forced me to rethink it. In my earlier study I had never come across this rule. Repository and domain service both live in the domain layer, so why can’t they call each other?
To retrace my understanding of Repository back to its source, I reviewed Eric Evans’s Domain-Driven Design and Vaughn Vernon’s Implementing Domain-Driven Design.
Repository
Repository is described in Chapter 6 of Domain-Driven Design, “The Life Cycle of a Domain Object”.
A factory is used to create domain objects, while a repository serves the middle and end of the lifecycle: it provides the means to find and retrieve persistent objects and encapsulates the large infrastructure involved.
This clarifies repository’s responsibilities:
- Provides for finding and retrieving objects
- Coordinates the domain and data mapping layers
Given that DAOs were already established practice, why introduce Repository at all?
Although Repositories and Factories themselves do not originate from the domain, they play an important role in domain design. These structures provide an easy-to-grasp approach to model objects that completes model-driven Design
The goal of domain-driven design is to create better software by focusing on the domain model rather than the technology. Suppose a developer constructs an SQL query, passes it to a query service in the infrastructure layer, extracts the required information from the rows of the result set, and passes it to a constructor or factory. By the time the developer has gone through this sequence, the model is no longer the focus. It becomes natural to think of objects as containers for the queried data, and the whole design shifts to a data-processing style. The exact technical details vary, but the problem remains: the client is dealing with the technology, not the model concepts.
In DDD thinking, the domain model matters most. Everything is done to keep the team focused on the model and to shield it from non-model technical details, so that the ubiquitous language is entirely about the model.
Repository vs. DAO
Some people summarize DDD as “divide and unite”: dividing is the means, uniting is the end. At the strategic level, DDD divides to form context boundaries and then unites within each context, much like a merge algorithm.
The aggregate is the smallest such unit. Compared with a DAO, a repository manages aggregates, that is, the lifecycle of domain objects:
- It provides clients with a simple model for obtaining persistent objects and managing their lifecycle
- It decouples application and domain design from persistence technology (multiple database strategies or even multiple data sources)
- It embodies design decisions about object access
- It can easily be replaced with a “dummy implementation” for use in tests (usually backed by an in-memory collection)
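The last point in the list can be illustrated with a minimal sketch. The names `Order`, `OrderRepository`, and `InMemoryOrderRepository` are hypothetical and do not come from either book; the only idea shown is that the domain sees a collection-like interface, and a test can swap in a map-backed implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical aggregate root, kept tiny for illustration.
class Order {
    private final long id;
    private String status;

    Order(long id, String status) {
        this.id = id;
        this.status = status;
    }

    long getId() { return id; }
    String getStatus() { return status; }
    void markPaid() { this.status = "PAID"; }
}

// The domain layer only sees this interface; it says nothing about SQL or tables.
interface OrderRepository {
    Optional<Order> find(long id);
    void save(Order order);
    void remove(Order order);
}

// "Dummy implementation" backed by an in-memory collection, for use in tests.
class InMemoryOrderRepository implements OrderRepository {
    private final Map<Long, Order> store = new HashMap<>();

    @Override
    public Optional<Order> find(long id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public void save(Order order) {
        store.put(order.getId(), order);
    }

    @Override
    public void remove(Order order) {
        store.remove(order.getId());
    }
}
```

A production implementation of the same interface could sit on top of a DAO or an ORM without any caller changing.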
The core value of a DAO is to encapsulate tedious low-level logic such as assembling SQL and maintaining database connections and transactions, so that business developers can focus on writing code. In essence, though, a DAO operation is still a database operation: a DAO method works directly on the database and the data model, just with less code to write, and it can operate on any table object. Uncle Bob’s book Clean Architecture has a vivid description:
- Hardware: something that cannot be changed (or is hard to change) once it has been created. For developers, the database belongs to “hardware”: once chosen, it basically never changes. For example, after adopting MySQL it is hard to move to MongoDB because the migration cost is too high.
- Software: something that can be modified at any time after it is created. Business code should aspire to be “software”, because business processes and rules change constantly and our code should be able to change with them.
- Firmware: software that depends heavily on hardware. Common examples are router firmware and Android firmware. Firmware abstracts the hardware, but only applies to one specific kind of hardware. That is why there is no universal Android firmware today; every phone needs its own.
From this description we can see that the database is essentially “hardware”, the DAO is essentially “firmware”, and our own code wants to be “software”. Firmware, however, has a very bad property: it spreads. When software depends strongly on firmware, the firmware’s limitations make the software hard to change, and eventually the software becomes as hard to change as the firmware.
Here’s an example of software that can easily become “solidified”:
```java
private OrderDAO orderDAO;

public Long addOrder(RequestDTO request) {
    OrderDO orderDO = new OrderDO();
    orderDAO.insertOrder(orderDO);
    return orderDO.getId();
}

public void updateOrder(OrderDO orderDO, RequestDTO updateRequest) {
    orderDO.setXXX(XXX); // many more setters omitted
    orderDAO.updateOrder(orderDO);
}

public void doSomeBusiness(Long id) {
    OrderDO orderDO = orderDAO.getOrderById(id);
    // a lot of business logic omitted here
}
```
In the simple code above, the object relies on the DAO, and therefore on the DB. This doesn’t look bad at first glance, but if you want to add cache logic in the future, your code needs to change to:
```java
private OrderDAO orderDAO;
private Cache cache;

public Long addOrder(RequestDTO request) {
    OrderDO orderDO = new OrderDO();
    orderDAO.insertOrder(orderDO);
    cache.put(orderDO.getId(), orderDO);
    return orderDO.getId();
}

public void updateOrder(OrderDO orderDO, RequestDTO updateRequest) {
    orderDO.setXXX(XXX); // many more setters omitted
    orderDAO.updateOrder(orderDO);
    cache.put(orderDO.getId(), orderDO);
}

public void doSomeBusiness(Long id) {
    OrderDO orderDO = cache.get(id);
    if (orderDO == null) {
        orderDO = orderDAO.getOrderById(id);
    }
    // a lot of business logic omitted here
}
```
At this point the persistence logic has changed, and everywhere the data is used, one line of code becomes at least three. As the codebase grows, forgetting to check the cache somewhere costs at least an unnecessary database query, and forgetting to update the cache somewhere leaves the cache inconsistent with the database, which is a bug. The more code there is, the more places call the DAO directly and the more places touch the cache, so each underlying change becomes harder and more error-prone. This is what happens when software is “solidified.”
Therefore, we need a pattern that isolates our software (business logic) from the firmware and hardware (DAO, DB) and so makes the software more robust. This is the core value of Repository.
It is also the second responsibility listed earlier: coordinating the domain and data mapping layers.
If a DAO is a low-level abstraction, a Repository is a high-level one. That highlights the essence of Repository: it manages the lifecycle of domain objects and does not care where the data comes from, as long as the aggregate root it returns is fully built.
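One way to picture this is to move the cache logic from the earlier example behind a repository, so that callers never see it. The following is a minimal sketch under stated assumptions: `OrderDAO`, `OrderDO`, and `Order` are simplified hypothetical types, a plain `Map` stands in for both the database table and the cache, and the `reads` counter exists only to make cache hits observable.

```java
import java.util.HashMap;
import java.util.Map;

// Data object: mirrors a table row (hypothetical fields).
class OrderDO {
    Long id;
    String status;
}

// Domain object: what business code works with.
class Order {
    private final Long id;
    private String status;

    Order(Long id, String status) { this.id = id; this.status = status; }
    Long getId() { return id; }
    String getStatus() { return status; }
    void cancel() { this.status = "CANCELLED"; }
}

// Stand-in for a real DAO; a map plays the role of the database table.
class OrderDAO {
    private final Map<Long, OrderDO> table = new HashMap<>();
    int reads = 0; // counts database reads so cache hits are observable

    void upsert(OrderDO row) { table.put(row.id, row); }
    OrderDO getOrderById(Long id) { reads++; return table.get(id); }
}

// The repository owns the DAO, the cache, and the DO <-> domain mapping.
class OrderRepository {
    private final OrderDAO dao;
    private final Map<Long, OrderDO> cache = new HashMap<>();

    OrderRepository(OrderDAO dao) { this.dao = dao; }

    // Returns null when the order does not exist.
    Order find(Long id) {
        OrderDO row = cache.get(id);
        if (row == null) {
            row = dao.getOrderById(id);
            if (row == null) return null;
            cache.put(id, row);
        }
        return new Order(row.id, row.status);
    }

    void save(Order order) {
        OrderDO row = new OrderDO();
        row.id = order.getId();
        row.status = order.getStatus();
        dao.upsert(row);
        cache.put(row.id, row); // the cache is kept consistent in exactly one place
    }
}
```

With this shape, adding or removing the cache changes only `OrderRepository`; business code keeps calling `find` and `save`.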
Data Model and Domain Model
According to Robert C. Martin in Clean Architecture, the domain model is the core and the data model is a technical detail. In reality, both are important.
The data model is responsible for data storage; what matters most for it is scalability, flexibility, and performance.
The domain model is responsible for implementing the business logic; what matters most for it is expressing business semantics explicitly and making full use of OO features to make the code more expressive of the business.
Call relationship
On the rule that domain services should not call repository, the author replied:
Domain Services are collections of business rules, not business processes, so there should be no place where a Domain Service needs to call a Repo. If it needs data from somewhere else, it is best to pass that data in as an input rather than fetch it internally. A DomainService needs to be stateless; adding a Repo makes it stateful.
The way I usually think about it is that the DomainService is the rules engine and the AppService is the process engine. A Repo has nothing to do with rules.
What is the difference between business rules and business processes?
There is an easy way to tell them apart: business rules contain if/else, business processes do not.
In DDD lecture 4, the author has an example in which a domain service calls Repository directly, so I turned his own example against him and asked again:
The domain service here uses the repo directly. If every piece of data in the domain service had to come in through input parameters, the structure would be a little strange.
That example did have a small problem (that detail was not the focus at the time). A more reasonable approach would be to look up the Weapon in the AppService and then call performAttack(Player, Monster, Weapon). If passing several parameters is too cumbersome, you can wrap them in an AttackContext object.
Why do it this way? The most immediate reason is that the DomainService becomes “side-effect-free”. If you know FP, you can think of it as something like a pure function (only like one, of course: it is not truly pure because it changes the Entity, but at least it makes no calls outside memory). This is more of a choice; I prefer to keep DomainService free of side effects (here, persistent data changes).
Suppose Weapon does more than supply stats, and, say, wears down with each attack. If you use a repo inside performAttack, should you also call repo.save(weapon)? Then why not go on and call userRepo.save(player) and monsterRepo.save(monster) right there once the attack is done? And by extension, if all of that is done in the domain service, what is the AppService for? Is this Service a “business rule” or a “business process”?
On the other hand, there is no need to be dogmatic. It is not that a DomainService can never use a Repo; sometimes complex rules need to fetch data from “somewhere”, especially “read-only” data. But when I say DomainService should not call a repo, the core reason is that I don’t want you to have “side effects” in DomainService.
Given this constraint, I now treat domain services as purely in-memory operations that do not depend on Repository, which also improves testability.
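A minimal sketch of the arrangement the author describes: the application service loads everything, passes it into a domain service that makes no repository calls, and then saves. `Player`, `Monster`, `Weapon`, and `performAttack` follow the names in the discussion; the fields, the damage rule, the class names `CombatService` and `AttackAppService`, and the map-based stand-ins for repositories are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class Weapon {
    final int damage;
    Weapon(int damage) { this.damage = damage; }
}

class Player {
    final long id;
    final long weaponId;
    Player(long id, long weaponId) { this.id = id; this.weaponId = weaponId; }
}

class Monster {
    final long id;
    int health;
    Monster(long id, int health) { this.id = id; this.health = health; }
}

// Domain service: business rules only, no repository, no I/O.
class CombatService {
    void performAttack(Player player, Monster monster, Weapon weapon) {
        // Rule invented for this sketch: damage cannot push health below zero.
        monster.health = Math.max(0, monster.health - weapon.damage);
    }
}

// Application service: the process. Load, apply rules, persist.
class AttackAppService {
    // Maps stand in for real repositories.
    final Map<Long, Player> playerRepo = new HashMap<>();
    final Map<Long, Monster> monsterRepo = new HashMap<>();
    final Map<Long, Weapon> weaponRepo = new HashMap<>();
    private final CombatService combat = new CombatService();

    void attack(long playerId, long monsterId) {
        Player player = playerRepo.get(playerId);
        Monster monster = monsterRepo.get(monsterId);
        Weapon weapon = weaponRepo.get(player.weaponId);
        combat.performAttack(player, monster, weapon);
        monsterRepo.put(monster.id, monster); // persistence happens only here
    }
}
```

Because `CombatService` touches nothing outside memory, its rules can be unit-tested by constructing three objects and calling one method, with no repository or mock involved.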
Performance and security
These are the concerns many people raise when putting this into practice.
Performance
How do you balance aggregate queries against performance? Take an Order aggregate root: sometimes you only want the main order information and not the details, yet the repository loads everything in order to build the Order. What should you do? Implementing Domain-Driven Design also advises against loading only part of an aggregate and suggests lazy loading instead. Many people feel that this is a design problem and that lazy loading should not be relied on to cover it.
I asked the author about this:
In business systems, the core goal is to ensure data consistency, and performance (including the cost of two database queries, serialization) is usually not a big issue. If you sacrifice consistency for performance, you’re paying too much for it, and you’ll almost certainly trigger bugs in the future.
If performance really is a bottleneck, you have a design problem: your query target (the main order information) and your write target (the main order together with its sub-orders) are not the same thing. A common recommendation is CQRS, where the read side reads from a separate store (search, cache, etc.), the write side makes changes through the complete Aggregate, and the read and write data are kept in sync through messages or binlog synchronization.
This also depends on the type of business. In e-commerce, an order usually has only a few detail lines. In invoicing and tax systems, a large business order can have a great many detail lines, and building the complete aggregate root really does consume a lot of memory.
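The CQRS split mentioned in the reply can be sketched roughly as follows. This is an illustration under stated assumptions, not a production design: `Order`, `OrderLine`, `OrderSummary`, `OrderRepository`, and `OrderSummaryView` are hypothetical, and a synchronous callback stands in for the message or binlog synchronization that a real system would use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class OrderLine {
    final String sku;
    final int price;
    OrderLine(String sku, int price) { this.sku = sku; this.price = price; }
}

// Write model: the complete aggregate, including all detail lines.
class Order {
    final long id;
    final List<OrderLine> lines = new ArrayList<>();
    Order(long id) { this.id = id; }

    void addLine(String sku, int price) { lines.add(new OrderLine(sku, price)); }

    int total() {
        int sum = 0;
        for (OrderLine line : lines) sum += line.price;
        return sum;
    }
}

// Read model: only what list pages need, no detail lines.
class OrderSummary {
    final long id;
    final int total;
    final int lineCount;
    OrderSummary(long id, int total, int lineCount) {
        this.id = id; this.total = total; this.lineCount = lineCount;
    }
}

// Write side: always saves the whole aggregate, then notifies the read side.
class OrderRepository {
    private final Map<Long, Order> store = new HashMap<>();
    private final Consumer<Order> onSaved; // stand-in for a message or binlog feed

    OrderRepository(Consumer<Order> onSaved) { this.onSaved = onSaved; }

    void save(Order order) {
        store.put(order.id, order);
        onSaved.accept(order);
    }

    Order find(long id) { return store.get(id); }
}

// Read side: a separate store holding lightweight projections.
class OrderSummaryView {
    private final Map<Long, OrderSummary> summaries = new HashMap<>();

    void project(Order order) {
        summaries.put(order.id, new OrderSummary(order.id, order.total(), order.lines.size()));
    }

    OrderSummary get(long id) { return summaries.get(id); }
}
```

Queries that only need the main order information go to `OrderSummaryView` and never build the aggregate; commands still go through `OrderRepository` and the complete `Order`.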
Object tracking
A repository operates on aggregate roots, and most saves change only part of the data, so the objects that changed need to be tracked.
There are two approaches mentioned in Implementing Domain Driven Design:
- Implicit copy-on-read [Keith & Stafford]: when an object is read from the data store, the persistence mechanism implicitly copies it, and at commit time compares the copy with the client’s object. In detail: when the client asks the persistence mechanism to read an object from the data store, the persistence mechanism returns the object to the client and at the same time creates a backup copy of it (excluding lazily loaded parts, which are copied when they are actually loaded). When the client commits the transaction, the persistence mechanism compares the copy with the client’s object, and all changes are written to the data store.
- Implicit copy-on-write [Keith & Stafford]: the persistence mechanism manages all loaded persistent objects through proxies. When an object is loaded, the persistence mechanism creates a thin proxy for it and hands the proxy to the client. The client does not know that it is calling behavior methods on the proxy, which in turn calls the methods on the real object. When the proxy first receives a method call, it creates a backup of the real object. The proxy tracks changes to the real object and marks it as “dirty.” When the transaction commits, it checks all the “dirty” objects and writes their changes to the data store.
The advantages of and differences between these two approaches vary from case to case. If both have pros and cons for your system, weigh them carefully. You can of course pick your favorite, but that is not necessarily the safest choice. Both approaches share one advantage: they implicitly track changes to persistent objects without the client having to handle it. The bottom line is that a persistence mechanism such as Hibernate lets us create a traditional, collection-oriented repository. Even so, some scenarios are a poor fit for a collection-oriented repository built on such a mechanism. If your domain is very performance-sensitive and keeps a large number of objects in memory at any one time, the persistence mechanism puts an additional burden on your system, and you need to decide whether it is right for you. Hibernate works fine in many cases, so while I am alerting you to the problems these persistence mechanisms can cause, that does not mean you should not use them. Using any tool involves trade-offs.
DDD # 2 also mentions two major change-tracking solutions in the industry. These are simply different names for the two approaches above and mean the same thing:
- Snapshot-based solution: when data is loaded from the DB, a Snapshot is kept in memory, and when the data is written back it is compared with the Snapshot. A common implementation is Hibernate
- Proxy-based solution: when data is loaded from the DB, an aspect is woven into every setter to detect whether the setter was called and whether the value changed. If it changed, the object is marked as Dirty, and on save the Dirty flag decides whether to update. A common implementation is Entity Framework
The Snapshot solution has the advantage of simplicity; its costs are a full Diff (usually via Reflection) on every save and the memory needed to hold the Snapshot. The Proxy solution has the advantage of high performance with almost no added cost; its disadvantages are that it is difficult to implement and that, with nested relationships, changes to nested objects (such as items added to or removed from a child list) are hard to detect, which may lead to bugs.
Due to the complexity of the Proxy solution, the mainstream of the industry (including EF Core) is using the Snapshot solution. Another benefit of this is that Diff allows you to find out which fields have changed and then UPDATE only the changed fields, again reducing the cost of UPDATE.
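To make the Snapshot idea concrete, here is a minimal sketch. It is not how Hibernate or EF Core are implemented; `SnapshotTracker` and `Order` are hypothetical, the snapshot is a shallow copy of field values (so it only suits fields holding immutable values), and the diff uses plain Java reflection.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

class Order {
    Long id;
    String status;
    int amount;

    Order(Long id, String status, int amount) {
        this.id = id;
        this.status = status;
        this.amount = amount;
    }
}

class SnapshotTracker {
    // Tracked entity (by identity) -> field values captured at load time.
    private final Map<Object, Map<String, Object>> snapshots = new IdentityHashMap<>();

    // Call when an entity is loaded from the store.
    void attach(Object entity) {
        snapshots.put(entity, capture(entity));
    }

    // Returns field name -> current value for every field that differs from the snapshot.
    Map<String, Object> diff(Object entity) {
        Map<String, Object> before = snapshots.get(entity);
        if (before == null) {
            throw new IllegalStateException("entity was never attached");
        }
        Map<String, Object> changed = new TreeMap<>();
        for (Map.Entry<String, Object> e : capture(entity).entrySet()) {
            if (!Objects.equals(before.get(e.getKey()), e.getValue())) {
                changed.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }

    // Shallow copy of the instance fields declared on the entity's class.
    private static Map<String, Object> capture(Object entity) {
        Map<String, Object> values = new HashMap<>();
        try {
            for (Field f : entity.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                f.setAccessible(true);
                values.put(f.getName(), f.get(entity));
            }
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
        return values;
    }
}
```

A repository's `save` could call `diff` and issue an UPDATE covering only the returned fields, or skip the write entirely when the map is empty.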
Security
When designing aggregates, keep them small, for both transactional and safety reasons. When concurrency is high, operations on an aggregate root need optimistic locking.
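Optimistic locking on an aggregate root is commonly done with a version number that must still match at save time. The following is a minimal sketch, assuming a hypothetical `Account` aggregate and an in-memory repository; with a real database the check would typically be a conditional UPDATE on a version column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical aggregate root carrying the version it was loaded with.
class Account {
    final long id;
    final int balance;
    final long version;

    Account(long id, int balance, long version) {
        this.id = id;
        this.balance = balance;
        this.version = version;
    }

    // Returns a modified copy that keeps the loaded version.
    Account withBalance(int newBalance) {
        return new Account(id, newBalance, version);
    }
}

class StaleAggregateException extends RuntimeException {
    StaleAggregateException(String message) { super(message); }
}

class AccountRepository {
    private final Map<Long, Account> store = new HashMap<>();

    synchronized Account find(long id) {
        return store.get(id);
    }

    // Succeeds only if nobody else has saved since this copy was loaded.
    synchronized void save(Account account) {
        Account current = store.get(account.id);
        if (current != null && current.version != account.version) {
            throw new StaleAggregateException(
                "version " + account.version + " is stale, current is " + current.version);
        }
        store.put(account.id, new Account(account.id, account.balance, account.version + 1));
    }
}
```

When two callers load the same account and both try to save, the second save is rejected and the caller can reload and retry, so no update is silently lost.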
Reference
This article teaches you about domain models and data models
Lecture 3 – Repository mode