For a system with requirements for high scalability (thousands of transactions per second), multi-tenancy, and modularity, a Microservices Architecture is the most suitable choice.
Scalability: Each component (e.g., Transaction Ingestion, Balance Calculation, Reporting) has different performance characteristics. We can scale the transaction ingestion service independently to handle high write loads, while the reporting service can be scaled based on query load, without impacting core transaction processing.
Modularity & Decoupling: The system is naturally broken down into distinct business capabilities. This aligns with the "modular" requirement, allowing teams to develop, deploy, and maintain services independently. An update to the reporting engine will not require a redeployment of the critical transaction service.
Fault Isolation: A failure in a non-critical service, like report generation, will not bring down the entire accounting system. The core ledger remains operational, ensuring high availability for transaction recording.
Technology Flexibility: It allows us to use the best tool for the job. We can use a high-throughput message queue like Kafka for the journal, a relational database for transactional integrity, and a document or columnar database for reporting analytics.
A monolith would become a single point of failure and a significant bottleneck at the required scale. A modular monolith might be a starting point, but designing for microservices from the outset will prevent costly re-architecting later.
Upon successful validation, a TransactionValidated event is published to the event bus.
The Journal Service consumes TransactionValidated events, verifies that SUM(debits) == SUM(credits), writes the journal and entry records to the primary database, and publishes a JournalPosted event upon successful commit, signaling a permanent change to the ledger.
The Balance Service consumes JournalPosted events from the event bus and updates the balances summary table. It also listens for account events (e.g., AccountCreated) for changes to account data.
Tenant lifecycle events (e.g., TenantSuspended, TenantOnboarded) are likewise published to the event bus so that downstream services can react.
We will use a hybrid approach to balance financial integrity and performance:
In summary: financial integrity is guaranteed by strong consistency at the core (the journal write path), while performance and scalability are achieved by accepting eventual consistency on the reporting path.
Purpose: Represents a single, isolated client or company using the platform.
Analogy: A tenant is like a separate, locked filing cabinet. Everything for one company (their accounts, their transactions) is inside their cabinet, and no other company can see into it.
Key Attributes:
tenant_id: The unique key that identifies this "filing cabinet." It is the most important column for ensuring data isolation.
name: The legal or display name of the company.
status: Tracks whether the tenant is ACTIVE, SUSPENDED, etc. This allows the platform to control access.
config: A flexible JSONB field that stores tenant-specific settings such as the default currency, display time zone, or feature flags.
Role in the System: The tenants table is the root of the entire multi-tenancy model. Every other piece of data in the system must be linked back to a specific tenant.
Purpose: Defines the complete list of financial "buckets" where a tenant can record value. It's the structured list of everything the company owns (Assets), owes (Liabilities), its net worth (Equity), what it earns (Revenue), and what it spends (Expenses).
Analogy: If the tenant is a filing cabinet, the Chart of Accounts is the set of labeled folders inside that cabinet (e.g., "Cash," "Office Supplies," "Sales").
Key Attributes:
account_id: The unique identifier for a single account.
tenant_id: Links the account to its owner.
name: The human-readable name of the account (e.g., "Bank of America Checking").
type: A critical field that categorizes the account (ASSET, LIABILITY, EQUITY, REVENUE, EXPENSE). This is essential for generating financial statements like the Balance Sheet and Income Statement.
currency: The currency this account operates in (e.g., USD, EUR).
Role in the System: It provides the foundational structure for all accounting. No financial entry can be made without specifying which account it belongs to.
Purpose: Represents a high-level business event that causes one or more financial changes. It acts as a container or a grouping mechanism for related journal entries.
Analogy: A transaction is a single business activity, like a "Weekly Sales Summary" or a "Payroll Run for November." This single activity might have multiple distinct accounting impacts.
Key Attributes:
transaction_id: The unique identifier for this business event.
event_type: Describes what kind of business event this was (e.g., PAYMENT_RECEIVED, VENDOR_BILL_PAID).
description: A high-level description of the event.
source_system: Identifies where the event originated (e.g., invoicing_platform, payroll_system).
Role in the System: It provides context and traceability, linking multiple, granular journal entries back to a single, understandable business cause.
Purpose: Represents a single, complete, and balanced journal entry. This is the core, atomic unit of accounting in the double-entry system. It is the formal record of a financial event.
Analogy: A journal is one specific accounting record within the "Payroll Run for November" transaction. For example, the journal entry for "Recording Salary Expenses" or the separate journal entry for "Recording Payroll Tax Liabilities."
Key Attributes:
journal_id: The unique identifier for this specific journal entry.
transaction_id: Links this journal back to the parent business event (transaction).
effective_at: The accounting date. This is the date the journal applies to for financial reporting purposes (e.g., a transaction recorded on Dec 2nd might have an effective_at of Nov 30th). Stored as TIMESTAMPTZ.
idempotency_key: A crucial technical field that prevents accidental duplicate submissions of the same journal.
Role in the System: This is the primary auditable entity. Every record in this table must correspond to a set of entries whose debits and credits are perfectly balanced.
Purpose: This is the immutable general ledger. It represents the most granular level of detail: the individual debit and credit lines that make up a journal. All monetary amounts are stored as BIGINT integers in the smallest currency unit (e.g., cents for USD, pence for GBP) to prevent floating-point rounding errors.
Analogy: An entry is a single line item within a journal entry. For example, "Debit Cash account by $500" or "Credit Sales Revenue account by $500."
Key Attributes:
journal_id: Links the entry back to its parent journal.
account_id: Specifies which account "bucket" this entry affects.
amount: The value of the movement, stored as an integer in the smallest currency unit (e.g., cents).
direction: The type of movement, either DEBIT or CREDIT.
Role in the System: This is the ultimate source of truth for all financial calculations. The balance of any account, at any point in time, can be calculated by summing all the entries for that account. This table is append-only; entries are never updated or deleted.
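As an illustration of how this table might map to a domain class in the Journal Service, here is a sketch of a JPA entity under the assumption of Spring Boot 3 (jakarta.persistence). Column names follow the attributes above; the updatable = false flags mirror the append-only rule, and the actual domain class may differ.
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.EnumType;
import jakarta.persistence.Enumerated;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

import java.util.UUID;

@Entity
@Table(name = "entries")
public class Entry {

    public enum Direction { DEBIT, CREDIT }

    @Id
    private UUID entryId;                        // assigned by the service before persisting

    @Column(nullable = false, updatable = false)
    private UUID tenantId;                       // mandatory for tenant isolation

    @Column(nullable = false, updatable = false)
    private UUID journalId;                      // parent journal

    @Column(nullable = false, updatable = false)
    private UUID accountId;                      // which account "bucket" is affected

    @Column(nullable = false, updatable = false)
    private long amount;                         // minor units (e.g., cents), maps to BIGINT

    @Enumerated(EnumType.STRING)
    @Column(nullable = false, updatable = false)
    private Direction direction;

    protected Entry() { }                        // required by JPA; entries are never updated

    public Entry(UUID entryId, UUID tenantId, UUID journalId,
                 UUID accountId, long amount, Direction direction) {
        this.entryId = entryId;
        this.tenantId = tenantId;
        this.journalId = journalId;
        this.accountId = accountId;
        this.amount = amount;
        this.direction = direction;
    }
}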
Double-entry integrity—the rule that for any given journal entry, the total debits must equal the total credits—is enforced at the Application Service Layer before any data is committed to the database.
The responsibility lies with the Journal Service.
The Process:
Receive Request: The Journal Service consumes an event or receives a request to create a new journal. This request includes the journal's metadata and a list of its proposed entries (the debit/credit lines).
In-Memory Validation: Before touching the database, the service performs an in-memory calculation. It initializes total_debits = 0 and total_credits = 0, then iterates over the proposed entries, adding each DEBIT amount to total_debits and each CREDIT amount to total_credits.
Integrity Check: The service then asserts two conditions: total_debits == total_credits, and total_debits > 0 (to prevent empty or zero-value journals).
Commit or Reject: If both assertions hold, the journal and all of its entries are written to the database in a single atomic transaction; otherwise the request is rejected (e.g., with an UnbalancedJournalException) and nothing is persisted.
This application-level enforcement is the standard and most robust pattern, as this kind of multi-row, cross-table business rule cannot be effectively modeled with simple database constraints.
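To make this concrete, here is a minimal sketch of the in-memory check, assuming amounts arrive as integer minor units. The ProposedEntry and Direction types and the exceptions are illustrative stand-ins for the service's real domain classes (e.g., UnbalancedJournalException).
import java.util.List;

record ProposedEntry(long amount, Direction direction) {}
enum Direction { DEBIT, CREDIT }

final class DoubleEntryValidator {

    static void validate(List<ProposedEntry> entries) {
        long totalDebits = 0;
        long totalCredits = 0;
        for (ProposedEntry e : entries) {
            if (e.direction() == Direction.DEBIT) {
                totalDebits += e.amount();   // amounts are minor units (e.g., cents)
            } else {
                totalCredits += e.amount();
            }
        }
        if (totalDebits != totalCredits) {
            // In the real service this would map to UnbalancedJournalException.
            throw new IllegalStateException("Unbalanced journal: debits=" + totalDebits
                    + " credits=" + totalCredits);
        }
        if (totalDebits <= 0) {
            throw new IllegalStateException("Journal must move a non-zero amount");
        }
    }
}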
We use a pre-computed strategy to ensure high performance for reporting. Balances are not calculated on-read for standard reports.
The Process:
Write and Notify: After the Journal Service successfully commits a new journal and its entries to the Primary DB, it publishes a JournalPosted event to the Event Bus (Kafka).
Pre-computation: The dedicated Balance Service consumes this event. For each entry in the event, it performs a transactional read-modify-write operation on the balances summary table (also in the Primary DB). It places a lock on the specific account's balance row (SELECT FOR UPDATE), calculates the new balance, and saves it. This is the "pre-computation" step.
Replication: The database's streaming replication automatically copies the newly updated balance from the Primary DB to the Read Replica DB.
Fast Reads: The Reporting Service executes its queries against the Read Replica. To get an account's balance, it performs a simple, lightning-fast query:
SELECT balance
FROM balances
WHERE account_id = ?;
This CQRS approach ensures that the heavy calculation work is done once on the write path, making the read path (reporting) extremely fast and scalable.
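As a hedged sketch of the pre-computation step, the Balance Service's read-modify-write could look like the following, assuming a balances row already exists for the account (e.g., created on AccountCreated) and that the delta is a signed amount in minor units. SQL, class, and method names are illustrative.
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import java.util.UUID;

@Service
public class BalanceService {

    private final JdbcTemplate jdbc;

    public BalanceService(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    @Transactional
    public void applyEntry(UUID accountId, long signedDeltaMinorUnits) {
        // Lock the single balance row so concurrent JournalPosted consumers serialize safely.
        Long current = jdbc.queryForObject(
                "SELECT balance FROM balances WHERE account_id = ? FOR UPDATE",
                Long.class, accountId);
        long updated = current + signedDeltaMinorUnits;  // assumes the row exists
        jdbc.update("UPDATE balances SET balance = ? WHERE account_id = ?", updated, accountId);
    }
}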
Multi-tenancy is enforced using a robust, multi-layered strategy centered around a shared-database, shared-schema model with Row-Level Security.
Schema Foundation (Logical Separation):
Every table that contains tenant-specific data (accounts,
transactions, journals, entries, balances) has a mandatory,
non-nullable tenant_id
column. This foreign key
links every piece of data to its owner.
Application Layer Enforcement: The data
access layer (DAL) in every microservice is programmed to
always include a
WHERE tenant_id = ?
clause in every SQL query.
The current tenant_id
is derived from the
user's authenticated session (e.g., from a JWT). This
prevents accidental data leakage in the application code.
Database Layer Enforcement (The Ultimate Safeguard): We use PostgreSQL's Row-Level Security (RLS). A policy is attached to every tenant-scoped table so that a row is only visible (or writable) when the tenant_id column in the row matches the app.tenant_id session variable set for that database session.
We do not use a schema-per-tenant or database-per-tenant model, as those approaches are much harder to manage and scale to thousands of clients.
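As an illustration of how the session variable could be bound per transaction (not the project's actual data access layer), here is a sketch. It assumes an RLS policy of the form tenant_id = current_setting('app.tenant_id')::uuid; class and query names are hypothetical.
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

import java.util.List;
import java.util.Map;
import java.util.UUID;

@Repository
public class TenantScopedAccountDao {

    private final JdbcTemplate jdbc;

    public TenantScopedAccountDao(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    @Transactional
    public List<Map<String, Object>> findAccounts(UUID tenantId) {
        // set_config(..., true) scopes app.tenant_id to this transaction only,
        // so the RLS policy filters every statement issued inside it.
        jdbc.queryForObject("SELECT set_config('app.tenant_id', ?, true)",
                String.class, tenantId.toString());
        // The application still adds WHERE tenant_id = ?; RLS is the safety net.
        return jdbc.queryForList(
                "SELECT account_id, name, type FROM accounts WHERE tenant_id = ?", tenantId);
    }
}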
Our design handles these critical aspects with established best practices to ensure accuracy and prevent ambiguity.
Currency:
Integer Math Only: All monetary values (amount in entries, balance in balances) are stored as BIGINT integers. We store the value in the smallest denomination of the currency (e.g., cents for USD, pence for GBP). This completely eliminates floating-point rounding errors, which are unacceptable in a financial system.
Currency Code on Account: The currency for any given amount is not stored on the entries table itself. It is implicitly defined by the currency (e.g., 'USD', 'EUR') stored on the associated accounts record. This is efficient and ensures that an account's balance is always in its designated currency.
Multi-Currency Transactions: The system supports multi-currency journals by allowing entries to point to accounts with different currency codes. The Currency Service is responsible for providing exchange rates if a transaction needs to be balanced across different currencies, often by creating currency gain/loss entries.
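For illustration, here is a small sketch of working with integer minor units; the helper names are made up. BigDecimal appears only at the parsing/formatting boundary, never in stored arithmetic.
import java.math.BigDecimal;

public final class MinorUnits {
    private MinorUnits() {}

    // Convert a user-entered decimal string to minor units without float math.
    public static long fromDecimalString(String amount, int minorDigits) {
        return new BigDecimal(amount)
                .movePointRight(minorDigits)   // e.g., "12.34" -> 1234 for USD (2 minor digits)
                .longValueExact();             // throws if there are sub-minor-unit digits
    }

    // Format a stored BIGINT value for display, e.g., 1234 -> "12.34".
    public static String toDisplay(long minorUnits, int minorDigits) {
        return BigDecimal.valueOf(minorUnits, minorDigits).toPlainString();
    }
}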
Time Zone:
Store Everything in UTC: All timestamp
columns (posted_at
, event_time
,
etc.) are defined using the TIMESTAMPTZ (timestamp with time
zone) data type. PostgreSQL automatically stores this data
in Coordinated Universal Time (UTC), regardless of the
server's or client's time zone. This creates a single,
unambiguous point of reference for all events.
Convert at the Edge: All conversions to a
user's local time zone are handled at the very last moment,
exclusively in the frontend (UI) layer. The
backend services work only in UTC. The user's preferred time
zone can be stored in the tenants.config
JSONB
field and passed to the UI, which then uses it to format the
UTC timestamps correctly for display.
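A small sketch of this convention follows, assuming the tenant's zone ID comes from tenants.config; the class and method names are illustrative.
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public final class TimeZoneAtTheEdge {
    private TimeZoneAtTheEdge() {}

    // Backend: always record and persist UTC instants (maps to TIMESTAMPTZ).
    public static Instant recordPostedAt() {
        return Instant.now();
    }

    // UI layer: convert at the last moment using the zone stored in tenants.config.
    public static ZonedDateTime forDisplay(Instant utc, String tenantZone) {
        return utc.atZone(ZoneId.of(tenantZone)); // e.g., "America/New_York"
    }
}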
Consistency is paramount. We will adopt a standardized structure for all Spring Boot microservices.
Mandatory URL Path Versioning: All REST
controllers will be mapped under a versioned path, e.g.,
@RequestMapping("/api/v1/journals")
. This is
explicit and clear for all consumers.
Deprecation Policy: A formal policy will be
enforced: when v2
is released,
v1
enters a 6-month maintenance period with bug
fixes only, followed by decommissioning. Communication must
be proactive.
RESTful Principles: Endpoints will strictly
follow RESTful standards (/accounts
,
/accounts/{id}
)
JSON Naming: All DTOs (Data Transfer
Objects) will use snake_case
for JSON
properties, enforced globally with a Jackson
PropertyNamingStrategy
. This ensures
consistency across all services without manual annotations:
# In application.yml
spring:
  jackson:
    property-naming-strategy: SNAKE_CASE
Java Naming: Standard Java conventions (camelCase for methods/variables, PascalCase for classes) will be used.
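For example, with the global SNAKE_CASE strategy above, a hypothetical response DTO's camelCase components serialize as snake_case JSON without any per-field annotations:
// Serialized by Jackson as: {"journal_id":"...","effective_at":"..."}
public record JournalResponse(java.util.UUID journalId, java.time.Instant effectiveAt) {}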
We will use the standard Maven/Gradle layout
(src/main/java
, src/main/resources
, etc.)
combined with a "package-by-feature" approach for clarity. The root
package will be com.company.serviceName
.
com.company.journalService
|-- Application.java
|
|-- api
| |-- V1
| | |-- JournalController.java
| | |-- dto
| | | |-- CreateJournalRequest.java
| | | +-- JournalResponse.java
|
|-- domain
| |-- Journal.java
| |-- Entry.java
| +-- JournalRepository.java
|
|-- service
| |-- JournalService.java
| |-- impl
| | +-- JournalServiceImpl.java
|
|-- event
| |-- KafkaProducer.java
| |-- KafkaConsumer.java
| |-- dto
| | +-- JournalPostedEvent.java
|
|-- config
| |-- KafkaConfig.java
| |-- AppConfig.java
|
+-- exception
|-- GlobalExceptionHandler.java
+-- UnbalancedJournalException.java
We will adhere strictly to Spring Boot's externalized configuration principles.
application.yml
: Default configuration values will be stored here, but
they are meant to be overridden.
No environment-specific values or secrets will be in
the repository.
Environment Variables & Spring Profiles: All configuration that changes between environments (dev,
staging, prod) will be managed via environment variables and
Spring Profiles (application-prod.yml
). The
active profile will be set via an environment variable
(SPRING_PROFILES_ACTIVE=prod
).
Centralized Secret Management: Secrets (DB passwords, API keys) will be managed by a dedicated system (HashiCorp Vault, AWS/Azure secret managers) and integrated with Spring Boot via the Spring Cloud Vault or equivalent provider. They will be loaded into the environment at runtime, never checked into Git.
Standardized Error Responses
(@ControllerAdvice
): A global GlobalExceptionHandler
using
@ControllerAdvice
will be implemented in each
service. It will catch specific exceptions (e.g.,
MethodArgumentNotValidException
, custom
exceptions like ResourceNotFoundException
) and
map them to a standard RFC 7807-compliant JSON error
response.
Idempotency: For POST
and
PUT
endpoints, we will implement an idempotency
filter. This filter will check for an
Idempotency-Key
header, store the request's
status and response in a shared cache (like
Redis) with a short TTL, and immediately
return the cached response on subsequent retries with the
same key.
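A hedged sketch of such a filter follows, assuming Spring Boot 3 (jakarta.servlet) and Spring Data Redis. It is deliberately simplified: it replays only the response body, and a production version would also store the status code and use an atomic setIfAbsent to close the race between concurrent retries.
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import org.springframework.web.util.ContentCachingResponseWrapper;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

@Component
public class IdempotencyFilter extends OncePerRequestFilter {

    private static final Duration TTL = Duration.ofHours(24); // assumed retention window

    private final StringRedisTemplate redis;

    public IdempotencyFilter(StringRedisTemplate redis) {
        this.redis = redis;
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        String key = request.getHeader("Idempotency-Key");
        boolean mutating = "POST".equals(request.getMethod()) || "PUT".equals(request.getMethod());
        if (!mutating || key == null) {
            chain.doFilter(request, response);
            return;
        }

        String cacheKey = "idem:" + request.getRequestURI() + ":" + key;
        String cachedBody = redis.opsForValue().get(cacheKey);
        if (cachedBody != null) {
            // Retry with the same key: return the previously stored response body.
            response.getWriter().write(cachedBody);
            return;
        }

        // First time we see this key: execute the request and capture the body.
        ContentCachingResponseWrapper wrapped = new ContentCachingResponseWrapper(response);
        chain.doFilter(request, wrapped);
        String body = new String(wrapped.getContentAsByteArray(), StandardCharsets.UTF_8);
        redis.opsForValue().set(cacheKey, body, TTL);
        wrapped.copyBodyToResponse(); // flush the captured body to the real response
    }
}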
Timeouts & Retries:
RestTemplate
or
WebClient
) and database operations. We will
configure aggressive connection and read timeouts.
Retry
module will be used to automatically
retry failed operations with configurable exponential
backoff and jitter. This will be applied to transient
network or database errors. A
Circuit Breaker
pattern will also be used
to prevent a service from repeatedly calling a
downstream service that is known to be failing.
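One possible implementation sketch uses Resilience4j's Spring annotations. The source only mandates a retry module and a circuit breaker, so treat the library choice, instance names, and endpoint as assumptions; backoff and jitter would live in the resilience4j sections of application.yml.
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class FxRateClient { // hypothetical downstream client

    private final RestTemplate restTemplate;

    public FxRateClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Retry(name = "currencyService")          // backoff + jitter configured in application.yml
    @CircuitBreaker(name = "currencyService", fallbackMethod = "lastKnownRate")
    public String fetchRate(String base, String quote) {
        return restTemplate.getForObject(
                "https://fx.example.internal/rates/{b}/{q}", String.class, base, quote);
    }

    // Fallback takes the same arguments plus the triggering exception.
    private String lastKnownRate(String base, String quote, Exception ex) {
        return "STALE_RATE"; // e.g., serve a cached rate or fail gracefully
    }
}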
In Transit: All services will be configured
to run on HTTPS only (server.ssl.enabled=true
).
A service mesh (like Istio/Linkerd) is preferred for
enforcing mutual TLS (mTLS) for all internal
service-to-service traffic.
At Rest: This is primarily a database and cloud provider configuration (e.g., enabling Transparent Data Encryption on PostgreSQL, enabling server-side encryption on S3 buckets).
Audit Logging: We will use Spring AOP (@Aspect) to create an @Auditable annotation. This annotation can be placed on service methods. The aspect will automatically publish an audit event to Kafka after the annotated method successfully executes, without cluttering the business logic.
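A sketch of how that aspect could look is shown below; the annotation attributes, topic name, and payload format are assumptions, and a real event would carry the user_id, tenant_id, and source_ip context described later.
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Auditable {
    String action(); // e.g., "JOURNAL_POSTED"
}

@Aspect
@Component
class AuditAspect {

    private final KafkaTemplate<String, String> kafka;

    AuditAspect(KafkaTemplate<String, String> kafka) {
        this.kafka = kafka;
    }

    // Runs only after the annotated method completes successfully.
    @AfterReturning("@annotation(auditable)")
    public void publishAuditEvent(JoinPoint joinPoint, Auditable auditable) {
        String payload = auditable.action() + ":" + joinPoint.getSignature().toShortString();
        kafka.send("audit-events", payload); // topic name is an assumption
    }
}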
Spring Security: We will leverage Spring Security as the core of our access control model.
JWT Integration: A custom filter will be
configured to parse incoming JWTs from the
Authorization
header, validate the signature
against a JWKS endpoint, and populate the
SecurityContextHolder
with an authenticated
principal.
Method-Level Security: We will use
@PreAuthorize
annotations on controller and
service methods to enforce fine-grained, role-based access
control (e.g.,
@PreAuthorize("hasRole('ADMIN') and
@tenantSecurity.check(#tenantId)")
). This allows us to write expressive and declarative
security rules directly on the methods they protect.
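A sketch of the @tenantSecurity bean referenced in that expression could look like this, assuming the JWT carries a tenant_id claim and the resource server populates a Jwt principal; names are illustrative.
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.oauth2.jwt.Jwt;
import org.springframework.stereotype.Component;

import java.util.UUID;

@Component("tenantSecurity")
public class TenantSecurity {

    // True only if the requested tenantId matches the tenant_id claim in the caller's JWT.
    public boolean check(UUID tenantId) {
        Object principal = SecurityContextHolder.getContext().getAuthentication().getPrincipal();
        if (principal instanceof Jwt jwt) {
            return tenantId != null
                    && tenantId.toString().equals(jwt.getClaimAsString("tenant_id"));
        }
        return false;
    }
}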
Immutability Principle: Is the code
attempting to UPDATE
or
DELETE
records from core financial tables like
entries
? Any modification to posted ledger
entries must be a major red flag and requires strong
justification (e.g., a GDPR data erasure process).
Double-Entry Validation: For any code path
creating a journal
, does it rigorously and
unfailingly validate that
SUM(debits) == SUM(credits)
before committing the database transaction?
Transactional Integrity: Are all database
writes that must succeed or fail together (e.g., creating a
journal
and all its entries
)
wrapped in a single, atomic
@Transactional
block with appropriate
propagation settings?
Tenant Isolation: Is the
tenant_id
being correctly and universally
applied to
every single database query (reads and
writes)? Is there any possibility of a data leak or
cross-tenant action?
Idempotency: Does every
POST
or PUT
endpoint correctly
handle the Idempotency-Key
to prevent duplicate
operations from network retries?
Authorization Checks: Is method-level
security (@PreAuthorize
or similar) being
applied correctly? Does it check not only the user's role
but also their ownership of the resource (e.g., "Is this
user part of the tenant that owns this account?")?
Input Sanitization: Is all user/API input being validated to prevent vulnerabilities? For example, are string lengths checked, and are numeric inputs validated to prevent overflow? Is there any path for a SQL injection (should be prevented by using an ORM/JPA, but always good to check)?
Database Query Efficiency: Are database
queries performant? Are they using indexed columns in
WHERE
clauses? Is there any potential for an
N+1 query problem, especially when fetching collections?
Read vs. Write Path: Is the code respecting the CQRS pattern? Are read-heavy operations (reporting) correctly querying a read replica or a denormalized store? Are write operations correctly directed to the primary database?
Asynchronous Operations: Are long-running, non-critical tasks (like sending an email notification or calling a slow third-party system) being handled asynchronously (e.g., by publishing an event to Kafka) to avoid blocking the main request thread?
Clear Logging: Does the logging provide
sufficient context? At a minimum, logs should include a
correlation ID and the tenant_id
for every
request. Are sensitive PII (Personally Identifiable
Information) values being masked in logs?
Metrics: Is the code instrumented with key metrics? (e.g., using Micrometer to track latency of a specific endpoint, number of journals posted, or error rates).
Configuration Management: Are there any hardcoded secrets, passwords, or environment-specific URLs? All such values must be externalized and managed by a proper configuration/secret management system.
Error Handling: Are exceptions handled gracefully? Does the code differentiate between transient errors (which can be retried) and permanent errors (which should fail fast or be sent to a Dead-Letter Queue)?
Testing: Is there adequate test coverage? This must include not only happy-path unit tests but also integration tests for critical flows and tests for failure scenarios and boundary conditions (e.g., zero-value amounts, very large numbers, empty lists).
This is a critical accounting requirement that clashes with the goal of immutable, pre-computed balances. The key is to never change the past, but to correctly represent it.
A "backdated" transaction is not an edit of history. It is a
new transaction created today (posted_at = now
) with an effective date in the past (effective_at = a past date
). This maintains a perfect, immutable audit trail.
The challenge is correcting balances for reporting. Simply adding the backdated amount to the current balance is incorrect, as it misrepresents historical reports.
Data Model: The transactions table must
have both posted_at
(when it was recorded, as
TIMESTAMPTZ) and effective_at
(the business
date it applies to, as TIMESTAMPTZ).
Balance Correction: The live balances are updated as usual when the backdated journal is posted, but any previously computed or reported balance for a date on or after the journal's effective_at is now incorrect and must be reconciled.
Handling Historical Reporting: We cannot afford to recalculate reports from the beginning of time for every request. We use a snapshot and replay mechanism:
Periodic Snapshots: The system will
generate and store an
end_of_day_balances
snapshot for every account,
every day.
CREATE TABLE end_of_day_balances (
account_id UUID,
balance_date DATE,
end_of_day_balance BIGINT,
PRIMARY KEY (account_id, balance_date)
);
Generating a Historical Report (e.g., balance for Nov 15th):
Look up the most recent snapshot in end_of_day_balances for that account on or before the requested date.
Replay any entries whose effective_at is between the snapshot date and the requested date and apply them to the snapshot balance.
(Optional) Handling Backdated Entries: When
a transaction with a past effective_at
is
posted, we must invalidate affected snapshots. We can run an
asynchronous job that finds the first affected snapshot (the
one on the effective_at
date) and recalculates
all subsequent daily snapshots up to the present day. This
is a heavy operation but runs in the background and ensures
future reports are fast and correct.
This approach honors immutability and auditability while providing a mechanism for accurate historical reporting, trading off some background processing cost for fast reads.
Financial systems operate under strict regulatory scrutiny (e.g., SOX, SEC rules), demanding a complete, verifiable, and tamper-resistant trail of every significant action. A naive approach of logging everything to a primary database would cause massive storage bloat, degrade performance, and fail to meet compliance standards for data immutability.
Our design must produce a detailed audit trail that is complete, immutable, verifiable, and cost-effective to store and query over long retention periods.
Our solution is to treat auditing not as a feature of other services, but as a dedicated, first-class system component with its own data pipeline.
The foundation of our audit strategy is the Event Bus (Apache Kafka). We establish a convention that any service performing a significant action must publish a corresponding event.
Examples include JournalPosted, AccountCreated, AccountUpdated, and TenantSuspended.
These events are structured, versioned (e.g., using Avro/Protobuf
schemas), and contain rich context: what happened,
who did it (user_id
), when it
happened (timestamp as TIMESTAMPTZ), and from
where (source_ip
). This event stream is our
raw source of audit data.
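To illustrate the shape of such an event, here is a hypothetical Java record; the field names are assumptions drawn from the context listed above, and in practice the schema would be defined in Avro or Protobuf.
import java.time.Instant;
import java.util.UUID;

public record AuditEvent(
        String eventType,     // what happened, e.g., "JournalPosted"
        UUID tenantId,        // which tenant's data was touched
        UUID userId,          // who did it
        Instant occurredAt,   // when (UTC, maps to TIMESTAMPTZ)
        String sourceIp,      // from where
        String payloadJson) { // versioned event body
}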
We have a dedicated, standalone Audit Service. This service has one, and only one, purpose: to consume events from Kafka and persist them to a secure, long-term store.
The Audit Service does not write to our primary PostgreSQL database. It writes to a dedicated Object Storage System (like AWS S3) configured for maximum security and compliance.
How it Works without Bloating Storage: Audit records never touch the primary PostgreSQL database; the Audit Service writes them as compressed files to the object store, which keeps the operational schema lean and makes long-term retention cheap.
How it Ensures Immutability and Verifiability: Files in the archive are written once and never modified in place, and each event retains the full what/who/when/where context, so any action can be independently inspected and verified.
For compliance checks or investigations, we use a "schema-on-read" query engine like AWS Athena. Athena can run standard SQL queries directly on the compressed files in S3 without needing to load them into a database. This provides powerful query capabilities on our long-term, low-cost audit archive.
This design makes a deliberate trade-off: audit data is queried through a schema-on-read engine over the archive rather than being instantly available in the operational database, in exchange for low storage cost, immutability, and zero impact on transaction processing.
This is the gold standard for modern audit and compliance design.
Our reporting strategy is built on the
CQRS (Command Query Responsibility Segregation)
pattern. We have made a fundamental architectural decision to
physically separate the data structures used for writing
transactions (the normalized entries
table) from the
data structures used for reading and reporting (the pre-computed
balances
and end_of_day_balances
tables).
This separation is the key to achieving high performance.
Pre-computation: We do the heavy lifting
upfront. The Balance Service
consumes events
and continuously updates summary tables
(balances
, end_of_day_balances
).
Instead of calculating a sum over millions of rows at report
time, we read a single, pre-calculated value.
Dedicated Read Replicas: All reporting queries are directed to a Read Replica of our PostgreSQL database. This ensures that even a complex, ad-hoc report will never block or slow down the critical write path on the Primary DB where new transactions are being recorded.
Dedicated Analytics Database: For the most complex, non-standard, or resource-intensive queries, we offload the work entirely to a dedicated Analytics DB (ClickHouse). This columnar database is specifically designed to scan billions of rows in seconds, providing powerful ad-hoc capabilities without touching the operational database cluster at all.
Strategic Caching: For extremely common reports (e.g., last month's income statement), the final JSON payload can be cached in an in-memory store like Redis. This allows us to serve the report in milliseconds without hitting any database.
What it is: A list of every account in the Chart of Accounts and its final debit or credit balance at a specific point in time. The purpose is to verify that total debits equal total credits across the entire system.
How we support it: This is the most direct beneficiary of our pre-computed balances table.
The Reporting Service receives a request, typically for the current moment.
It runs a single JOIN query against the Read Replica:
SELECT a.account_code, a.name, a.type, b.balance
FROM accounts a
JOIN balances b ON a.account_id = b.account_id
WHERE a.tenant_id = ? AND a.is_active = TRUE;
Performance: This report is generated almost instantly, regardless of the number of transactions, because it only reads the final summary data.
What they are: These are not new queries; they are specific presentations of the Trial Balance data.
How we support them:
The Reporting Service starts by fetching the Trial Balance data using the exact same fast query as above.
For the Balance Sheet, it keeps and groups the accounts whose type is ASSET, LIABILITY, or EQUITY.
For the Income Statement, it uses the accounts whose type is REVENUE or EXPENSE.
Performance: Like the Trial Balance, these reports
are extremely fast because the underlying data retrieval is from the
pre-computed balances
table. The formatting is a quick
in-memory operation.
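A minimal sketch of that in-memory grouping step follows; the row type and names are illustrative.
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

record TrialBalanceRow(String accountCode, String name, String type, long balance) {}

final class StatementFormatter {

    // Balance Sheet: ASSET / LIABILITY / EQUITY sections; Income Statement: REVENUE / EXPENSE.
    static Map<String, List<TrialBalanceRow>> groupByType(List<TrialBalanceRow> trialBalance,
                                                          List<String> wantedTypes) {
        return trialBalance.stream()
                .filter(row -> wantedTypes.contains(row.type()))
                .collect(Collectors.groupingBy(TrialBalanceRow::type));
    }
}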
What it is: A detailed list of all individual transactions that occurred for a specific account (or set of accounts) within a given date range (e.g., "Show me all entries for the 'Office Supplies' account from Oct 1 to Oct 31").
How we support it: This is the one type of report that cannot be served from our summary tables. It requires querying the raw ledger data. This is where our offloading strategy is critical.
The Reporting Service identifies this as a transactional (entry-level) query and routes it to the dedicated Analytics DB (ClickHouse) rather than the operational cluster, querying the entries data:
SELECT effective_at, description, direction, amount
FROM entries
WHERE account_id = ?
AND effective_at BETWEEN '2023-10-01' AND '2023-10-31'
ORDER BY effective_at;
Performance: While more intensive than a balance
lookup, this query is still very fast. Columnar databases like
ClickHouse are optimized for this exact type of query—filtering on a
few columns (account_id
, effective_at
) and
retrieving data from others over a large number of rows. This keeps
the operational PostgreSQL cluster completely free from this heavy
analytical workload.
What it is: The ability to see a Trial Balance, Balance Sheet, or Income Statement as it stood at a specific point in the past (e.g., "What was our Balance Sheet at the close of business last quarter?").
How we support it: We use a dedicated snapshot table, populated by a periodic batch job.
end_of_day_balances
Table: As
described in the backdating problem, a nightly job
calculates and stores the definitive end-of-day balance for
every account.
When a user requests a historical report for a date
YYYY-MM-DD
, the
Reporting Service
queries this snapshot table
instead of the live balances
table:
SELECT a.account_code, a.name, a.type, dbs.end_of_day_balance
FROM accounts a
JOIN end_of_day_balances dbs ON a.account_id = dbs.account_id
WHERE a.tenant_id = ? AND dbs.balance_date = 'YYYY-MM-DD';
This provides a fast, accurate baseline of the historical state
To account for any backdated transactions posted
after the snapshot was taken, the service runs a
small, secondary query on the entries
table to
find and apply any adjustments (as detailed in the
backdating solution)
Performance: This is extremely fast. The query hits a simple, indexed snapshot table. The adjustment query, if needed, is over a very small and recent subset of data, ensuring that generating historical reports does not require a full ledger scan.