Maintenance
Truxton requires occasional maintenance. Maintenance tasks are generally exposed to the scripting side of Truxton rather than the GUI side. This allows maintenance to be automated and customized to each customer's needs.
Delete Media
When you want to delete a piece of media from Truxton, you have a few choices:
- Use the Media Management page on the Analyst Desktop
- Use TruxtonCLI
- Write your own program or script
Each of these deletes the metadata pertaining to a piece of media from the database. None of them deletes the file contents associated with the media. To do that, you must perform a depot consolidation. Selecting "Clean Data" from the Truxton Tray application enqueues tasks to clean the database of orphaned records, consolidate the depot files, delete unused depot files, and optimize the database. This process may take a while, so starting it on a Friday afternoon and letting it run over the weekend might be a good idea.
It is far more efficient to delete multiple media before executing a "Clean Data" because it increases the chances that depot files can be deleted rather than have their contents consolidated multiple times.
Why
The complete deletion of media is a multi-stage process because of Truxton's scalable design. Writing to a depot file is isolated to a single process. There can be hundreds of ETL processes running on many machines throughout the network. By letting each ETL process have its own depot file to write to, no coordination with other processes is needed and performance is improved. However, the TANSTAAFL principle states that you have to pay for it somewhere. The simplicity of one depot per ETL comes at the cost of contents from different media existing in the same depot file. Complexity is introduced only when you want to remove contents from Truxton.
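The trade-off can be illustrated with a small simulation. This is only a sketch under stated assumptions: the depot names, the media identifiers, and the `clean` function are hypothetical and are not part of Truxton. It also simplifies consolidation to "the depot still holds live contents and must be rewritten", and ignores any sharing of contents between media.

```python
# Hypothetical layout: which media have contents in each depot file.
# Because each ETL process writes to its own depot, media end up mixed.
DEPOTS = {
    "depot_1": {"media_a", "media_b"},
    "depot_2": {"media_b", "media_c"},
    "depot_3": {"media_c"},
}


def clean(depots, deleted_media):
    """Simulate one clean pass after the given media have been deleted.

    Returns (remaining, removed, consolidated):
      remaining    - depot name -> media that still have contents there
      removed      - depots with no live contents left (file can be deleted)
      consolidated - depots that lost some contents but still hold live ones
    """
    remaining, removed, consolidated = {}, [], []
    for name, media in depots.items():
        live = media - deleted_media
        if not live:
            removed.append(name)
        else:
            remaining[name] = live
            if live != media:
                consolidated.append(name)
    return remaining, removed, consolidated


# Deleting one media at a time, with a clean pass after each deletion.
after_a, removed_1, consolidated_1 = clean(DEPOTS, {"media_a"})
_, removed_2, consolidated_2 = clean(after_a, {"media_b"})
separate_consolidations = len(consolidated_1) + len(consolidated_2)

# Deleting both media first, then running a single clean pass.
_, removed_batch, consolidated_batch = clean(DEPOTS, {"media_a", "media_b"})
batched_consolidations = len(consolidated_batch)

print("consolidations, one pass per media:", separate_consolidations)
print("consolidations, single batched pass:", batched_consolidations)
```

In this toy layout, the one-at-a-time approach consolidates `depot_1` only to delete it in the next pass, while the batched approach deletes it outright.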
Deleting both the metadata for a piece of media and its contents results in a workflow of:
- Delete the records from the database for the given media identifier
- Remove records from the [Content] table that no longer have a hash that exists in the [File] table
- Remove records from the [Depot] table that are no longer referenced in the [Content] table and rename the depot file to include .ToBeDeleted
- Consolidate depots to remove the no longer referenced contents and rename donor depots to .ToBeDeleted
- Delete the .ToBeDeleted depot files
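The database side of this workflow (the first three steps) can be sketched against an in-memory SQLite database. This is an illustration only, not Truxton's actual schema or code: the table names come from the list above, but every column name, the sample rows, and the `delete_media` function are assumptions made for the example. The sketch computes the new depot file names rather than renaming real files, and leaves out consolidation and final file deletion.

```python
import sqlite3


def build_demo_db():
    """Create a tiny database with hypothetical columns and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE File    (id INTEGER PRIMARY KEY, media_id TEXT, hash TEXT);
        CREATE TABLE Content (hash TEXT PRIMARY KEY, depot_id INTEGER);
        CREATE TABLE Depot   (id INTEGER PRIMARY KEY, filename TEXT);
    """)
    db.executemany("INSERT INTO Depot VALUES (?, ?)",
                   [(1, "d1.depot"), (2, "d2.depot"), (3, "d3.depot")])
    db.executemany("INSERT INTO Content VALUES (?, ?)",
                   [("h1", 1), ("h2", 1), ("h4", 2), ("h5", 3)])
    # Media m1 and m2 both contain a file with hash h2.
    db.executemany("INSERT INTO File (media_id, hash) VALUES (?, ?)",
                   [("m1", "h1"), ("m1", "h2"), ("m1", "h5"),
                    ("m2", "h2"), ("m2", "h4")])
    return db


def delete_media(db, media_id):
    """Run the database steps and report what is left for the file steps."""
    # Step 1: delete the records for the given media identifier.
    db.execute("DELETE FROM File WHERE media_id = ?", (media_id,))

    # Step 2: remove Content records whose hash no longer exists in File.
    orphans = db.execute(
        "SELECT hash, depot_id FROM Content "
        "WHERE hash NOT IN (SELECT hash FROM File)").fetchall()
    db.execute(
        "DELETE FROM Content WHERE hash NOT IN (SELECT hash FROM File)")

    # Step 3: remove Depot records no longer referenced by Content.
    unused = db.execute(
        "SELECT id, filename FROM Depot "
        "WHERE id NOT IN (SELECT depot_id FROM Content) "
        "ORDER BY id").fetchall()
    db.execute(
        "DELETE FROM Depot WHERE id NOT IN (SELECT depot_id FROM Content)")
    db.commit()

    # Names the unused depot files would be renamed to.
    to_be_deleted = [filename + ".ToBeDeleted" for _, filename in unused]
    # Depots that lost some contents but are still referenced.
    unused_ids = {depot_id for depot_id, _ in unused}
    needs_consolidation = sorted(
        {depot_id for _, depot_id in orphans} - unused_ids)
    return to_be_deleted, needs_consolidation


db = build_demo_db()
to_be_deleted, needs_consolidation = delete_media(db, "m1")
print("rename for deletion:", to_be_deleted)
print("depots needing consolidation:", needs_consolidation)
```

In the sample data, hash `h2` survives the deletion of `m1` because `m2` still references it, which is why depot 1 must be consolidated rather than deleted.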
The good news is users do not have to wait for this process to finish. Access to file contents is never impeded. The bad news is this process cannot be parallelized. It requires a single process to handle the movement of the contents between depot files. When the Maintenance ETL is busy consolidating depots, it cannot process any other messages. Its message queue will fill with work to be done. You can stop the consolidation process by shutting down the Maintenance ETL. When it restarts, it will pick up the next message in its queue.
Slowing Down
As Truxton grows, the database has more work to do. Often, the query planner will have outdated statistics to work with.