Troubleshooting & FAQ
Where do I find the DBOS tables?
DBOS checkpoints information about your workflows in an isolated system database in your Postgres database server.
You connect to this database through the system_database_url parameter in DBOS configuration.
You can connect to and explore your system database with popular database clients like psql and DBeaver.
Note that the tables are in the dbos schema in that database, so the tables are accessible at dbos.workflow_status, dbos.operation_outputs, etc.
The system database schema is documented here.
What size of database should I use for DBOS?
For a typical workload, processing 1000 actions (steps or workflows) per second (about 2.6 billion actions per 30-day month) with DBOS requires 4 Postgres vCPUs. This is a conservative estimate that leaves headroom for unexpected bursts or spikes. When sizing your database for DBOS, we recommend using that number (scaled to your actual workload size) as a starting point, then measuring usage in practice.
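As a rough illustration of scaling that baseline, the sketch below assumes vCPU demand grows linearly with the action rate. Linear scaling is an assumption made here for illustration, and the function names are hypothetical; treat the result as a starting point to validate against measured usage.

```python
import math

# Baseline from the guidance above: about 4 Postgres vCPUs per 1000 actions/second.
BASELINE_ACTIONS_PER_SECOND = 1000
BASELINE_VCPUS = 4


def estimate_postgres_vcpus(actions_per_second: float) -> int:
    """Scale the baseline linearly (an assumption) and round up to a whole vCPU."""
    if actions_per_second < 0:
        raise ValueError("actions_per_second must be non-negative")
    scaled = actions_per_second * BASELINE_VCPUS / BASELINE_ACTIONS_PER_SECOND
    return max(1, math.ceil(scaled))


def actions_per_month(actions_per_second: float, days: int = 30) -> int:
    """Convert a sustained per-second rate into a monthly action count."""
    return int(actions_per_second * 60 * 60 * 24 * days)
```

For example, `estimate_postgres_vcpus(2500)` scales the baseline to 10 vCPUs, and `actions_per_month(1000)` gives 2,592,000,000 actions for a 30-day month.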
Why is my queue stuck?
If a DBOS queue is stuck (workflows are not moving from ENQUEUED to PENDING), the most likely cause is that the number of PENDING workflows has reached the queue's global "concurrency" limit, or that the number of PENDING workflows on each worker has reached the queue's "worker concurrency" limit. In either case, new tasks cannot be dequeued until some currently executing tasks complete or are cancelled. You can view all tasks executing on a queue from the "Queues" tab of the DBOS Console.
If you need to, you can cancel tasks to remove them from the queue.
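The two limits can be reasoned about with a small toy model. This is only a sketch of the logic described above, not DBOS's dequeue implementation; `QueueLimits` and `blocked_reason` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueLimits:
    # None means "no limit configured" in this toy model.
    concurrency: Optional[int] = None         # global limit across all workers
    worker_concurrency: Optional[int] = None  # limit per individual worker


def blocked_reason(
    limits: QueueLimits, pending_total: int, pending_on_worker: int
) -> Optional[str]:
    """Return why a worker cannot dequeue more tasks, or None if it can."""
    if limits.concurrency is not None and pending_total >= limits.concurrency:
        return "global concurrency limit reached"
    if (
        limits.worker_concurrency is not None
        and pending_on_worker >= limits.worker_concurrency
    ):
        return "worker concurrency limit reached"
    return None
```

Comparing the PENDING counts shown in the console against your configured limits in this way usually indicates which limit is holding the queue back.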
Why is my workflow not finishing?
The most common cause of "stuck" workflows is logic issues: infinite loops, indefinitely waiting for an event, or improper use of async in Python or TypeScript. The last is always worth checking when using those languages: any synchronous call anywhere in your program can block your event loop, preventing async operations (such as workflows) from making progress.
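The event-loop problem can be reproduced with the standard library alone. In this minimal sketch (no DBOS involved, all function names hypothetical), a background "heartbeat" coroutine stands in for any async operation that needs the loop to make progress. It stalls while a synchronous `time.sleep` runs on the loop, and keeps ticking when the same call is moved to a thread with `asyncio.to_thread`.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Record a timestamp on every tick; stalls whenever the loop is blocked."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_work(seconds: float) -> None:
    # Synchronous call inside a coroutine: blocks the entire event loop.
    time.sleep(seconds)


async def cooperative_work(seconds: float) -> None:
    # Same blocking call, but run in a worker thread so the loop stays free.
    await asyncio.to_thread(time.sleep, seconds)


async def count_ticks(work, seconds: float = 0.2) -> int:
    """Run `work` next to the heartbeat and report how often the heartbeat ran."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await work(seconds)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(ticks)


if __name__ == "__main__":
    print("ticks with blocking call:   ", asyncio.run(count_ticks(blocking_work)))
    print("ticks with cooperative call:", asyncio.run(count_ticks(cooperative_work)))
```

The blocking variant records almost no ticks, while the cooperative variant records many more over the same interval. Exact counts depend on the machine's timer resolution.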
If a worker crash or outage occurred, it may briefly delay workflow completion. In certain rare cases, you may need to allow up to 15 minutes for Conductor to begin workflow recovery.
If workflows do not recover after a code upgrade, the cause is often a version mismatch. If you are using versioning, check that your application version matches the version recorded for the workflow.
How can I cancel or fork a large number of workflows in a batch?
On the DBOS Console, filter for all workflows that meet your criteria, then select them all and apply a batch operation. Alternatively, write a script using the DBOS Client (Python, TypeScript, Go, Java) to list all the workflows that fit your criteria, then process them.
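A script for the second approach might look like the sketch below. It assumes the client object exposes `list_workflows(...)`, returning items with a `workflow_id` attribute, and `cancel_workflow(workflow_id)`. That is how I understand the DBOS Python client, but verify the method names and filter arguments against the client reference for your version. The `WorkflowClient` protocol and `cancel_matching` helper are hypothetical names introduced here.

```python
from typing import Any, Dict, Iterable, List, Protocol, Tuple


class WorkflowClient(Protocol):
    """Assumed client surface; check it against the DBOS Client reference."""

    def list_workflows(self, **filters: Any) -> Iterable[Any]: ...

    def cancel_workflow(self, workflow_id: str) -> None: ...


def cancel_matching(
    client: WorkflowClient, **filters: Any
) -> Tuple[List[str], Dict[str, Exception]]:
    """Cancel every workflow matching `filters`; collect failures instead of aborting."""
    cancelled: List[str] = []
    failed: Dict[str, Exception] = {}
    for workflow in client.list_workflows(**filters):
        workflow_id = workflow.workflow_id
        try:
            client.cancel_workflow(workflow_id)
            cancelled.append(workflow_id)
        except Exception as exc:  # keep going so one failure does not stop the batch
            failed[workflow_id] = exc
    return cancelled, failed
```

Collecting failures rather than raising on the first one lets a large batch run to completion, after which the failed IDs can be retried or inspected.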
Why am I seeing errors that objects cannot be deserialized?
DBOS requires that the inputs and outputs of workflows, as well as the outputs of steps, are serializable.
This is because DBOS checkpoints these inputs and outputs to the database to recover workflows from failures.
DBOS serializes objects to JSON in TypeScript and Go, with pickle in Python, and with Jackson in Java.
If your workflow needs to access an unserializable object like a database connection or API client, do not pass it into the workflow as an argument. Instead, either construct the object inside the workflow from parameters passed into the workflow, or construct it globally.
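The sketch below illustrates the idea with the standard library only, using a SQLite connection as a stand-in for any unserializable resource. `is_picklable` and `table_count` are hypothetical helpers, and `table_count` is a plain function here rather than a DBOS workflow.

```python
import pickle
import sqlite3


def is_picklable(obj) -> bool:
    """Return True if `obj` can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def table_count(db_path: str) -> int:
    """Take a serializable parameter (a path) and build the connection inside."""
    conn = sqlite3.connect(db_path)
    try:
        (count,) = conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return count
    finally:
        conn.close()


if __name__ == "__main__":
    print(is_picklable(":memory:"))                   # a plain string serializes
    print(is_picklable(sqlite3.connect(":memory:")))  # a live connection does not
```

Passing the path (or other connection parameters) keeps the function's inputs serializable, while the connection itself never needs to be checkpointed.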
How large can serialized step and workflow outputs be?
DBOS stores two serialized fields (inputs and output) for each workflow and one output field for each step. Each of these is stored as a Postgres TEXT value, which is limited by the Postgres maximum field size, currently 1 GB. See Postgres documentation.
Why am I seeing an error that function X was recorded when Y was expected?
This error arises when DBOS is recovering a workflow and attempts to execute step Y, but finds a checkpoint in the database for step X instead. Typically, this occurs because the workflow function is not deterministic. A workflow function is deterministic if, when called multiple times with the same inputs, it invokes the same steps with the same inputs in the same order (given the same return values from those steps). If a workflow is non-deterministic, it may execute different steps during recovery than it did during its original execution.
To make a workflow deterministic, make sure all non-deterministic operations (such as calling a third-party API, generating a random number, or getting the local time) are performed in steps instead of in the workflow function.
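The following toy model shows why this matters. It is a deliberately simplified stand-in for checkpoint-and-replay, not how DBOS is implemented; `ToyRecorder` and both workflow functions are hypothetical. The `flip` argument represents any non-deterministic operation, such as a random number or a clock read.

```python
from typing import Any, Callable, List, Optional, Tuple


class ToyRecorder:
    """Records (step name, output) pairs and replays them on a later run."""

    def __init__(self, log: Optional[List[Tuple[str, Any]]] = None) -> None:
        self.log = list(log or [])  # checkpoints from a previous execution
        self.position = 0

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        if self.position < len(self.log):
            recorded_name, output = self.log[self.position]
            if recorded_name != name:
                raise RuntimeError(
                    f"function {recorded_name} was recorded when {name} was expected"
                )
            self.position += 1
            return output  # replay the checkpoint instead of re-running the step
        output = fn()
        self.log.append((name, output))
        self.position += 1
        return output


def nondeterministic_workflow(rec: ToyRecorder, flip: Callable[[], bool]) -> str:
    # The non-deterministic call happens in the workflow body, so a replay
    # may take a different branch than the original execution did.
    if flip():
        return rec.step("send_email", lambda: "email sent")
    return rec.step("send_sms", lambda: "sms sent")


def deterministic_workflow(rec: ToyRecorder, flip: Callable[[], bool]) -> str:
    # The non-deterministic call is a step, so its result is checkpointed
    # and the replay follows the same branch as the original execution.
    if rec.step("flip_coin", flip):
        return rec.step("send_email", lambda: "email sent")
    return rec.step("send_sms", lambda: "sms sent")
```

Replaying `nondeterministic_workflow` against a log recorded with a different `flip` result raises the mismatch error, while `deterministic_workflow` returns the originally recorded result regardless of what `flip` would return on the second run.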
Can I call a workflow from a workflow?
Yes, you can call (or start, or enqueue) a workflow from inside another workflow. That workflow becomes a child of its caller and is by default assigned a workflow ID derived from its parent's. If you view a workflow's trace in the DBOS Console, it will include the workflow's children.
Can I call a step from a step?
Yes, you can call a step from another step. However, the called step becomes part of the calling step's execution rather than functioning as a separate step.
Can I start, monitor, or cancel DBOS workflows from a non-DBOS application?
Yes, your non-DBOS application can create a DBOS Client (Python, TypeScript, Go, Java) and use it to enqueue a workflow in your DBOS application and interact with it or check its status.