Managing registry backup schedule and retention


1. General description

The current version of the Registry Platform has no process for managing the settings of the registry backup subsystem. To provide this functionality, it is proposed to extend the admin console interface so that the registry administrator can configure the backup schedule and the retention period for registry backups.

2. User roles

  • Registry administrator

3. Functional scenarios

  • Entering the backup schedule through the admin console

  • Setting the storage time of backup copies through the admin console

4. General provisions

  • The administrator enters the backup schedule in unix-cron format

  • The administrator specifies the backup storage time in days; it is passed to the backup system in hours

  • Setting the backup schedule and backup storage time is optional when creating the registry

  • When the storage period expires, the backup system deletes outdated backup copies

  • The entered schedule must correspond to the unix-cron format and be validated by the admin console

  • Backup storage time must be greater than or equal to one, be an integer, and contain no special characters

  • Automatic backups can be turned off or on with a switch

  • When entering a backup schedule, the admin console should show when the next three backup runs will take place

  • When updating the registry to a new version, the backup schedule and backup storage time settings should remain unchanged

  • By default, the automatic backups setting is disabled for new registries

  • The admin console should show the date of the last successful backup and the date of its deletion

  • By default, the Europe/Kiev time zone is set in values.yaml and, as an environment variable, at the level of the Jenkins pod

  • When backup is disabled, existing backups are not deleted

  • When the backup schedule is configured, the admin console should display the time zone specified in values.yaml
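The storage time rules above (an integer greater than or equal to one, no special characters, entered in days and passed on in hours) can be illustrated with a minimal sketch. The function names `parse_retention_days` and `retention_hours` and the four-digit upper bound are assumptions made for this example; they are not part of the admin console code.

```python
import re

# Assumption: at most four digits, so absurdly large values are rejected early.
_DAYS_PATTERN = re.compile(r"[1-9][0-9]{0,3}")


def parse_retention_days(raw: str) -> int:
    """Return the storage time in days, or raise ValueError for invalid input."""
    # fullmatch rejects signs, decimals, whitespace and any special characters.
    if not _DAYS_PATTERN.fullmatch(raw):
        raise ValueError(f"storage time must be a positive integer, got {raw!r}")
    return int(raw)


def retention_hours(days: int) -> int:
    """Convert the storage time from days to hours for the backup system."""
    return days * 24
```

With this sketch, an input of "3" would be passed to the backup system as 72 hours, while inputs such as "0", "1.5" or "3; ls" are rejected before they reach any script.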

5. Current technical design

In the current version of the platform, a registry backup is created only when the corresponding job is launched manually. There is no way to set the storage time of backup copies.

6. Technical solution design

This diagram shows the services involved in fulfilling the requirements and the interaction between them. It also highlights important aspects that must be taken into account during implementation.

[Diagram: remote-file-transfer]

The backup schedule and backup retention time are passed in values.yaml

global:
  timeZone: Europe/Kiev
  .....
  registryBackup:
    enabled: true
    schedule: "30 19 * * *"
    expiresInDays: 3

and in the registry-parameters/values codebase annotation of the registry resource. The operator must react to changes in the codebase CR and trigger the job provisioner, which recreates the Create-registry-backup job with the new parameters. An example of a schedule configured in Jenkins:

30 19 * * *
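To illustrate how the admin console could validate a schedule such as the one above and preview its next three runs, here is a minimal standard-library sketch. It assumes plain five-field unix-cron syntax with numbers, `*`, ranges, lists and steps only (no month or weekday names, no Jenkins `H` tokens). The names `parse_field`, `parse_cron` and `next_runs` are hypothetical, and the minute-by-minute search is chosen for clarity rather than speed.

```python
import re
from datetime import datetime, timedelta

# Inclusive bounds for: minute, hour, day of month, month, day of week.
FIELD_BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]
_PART = re.compile(r"(\*|[0-9]+(?:-[0-9]+)?)(?:/([0-9]+))?")


def parse_field(text, low, high):
    """Expand one cron field into the set of integers it matches."""
    values = set()
    for part in text.split(","):
        match = _PART.fullmatch(part)
        if match is None:
            raise ValueError(f"invalid cron field part: {part!r}")
        base, step = match.group(1), int(match.group(2) or 1)
        if base == "*":
            start, end = low, high
        else:
            start, _, end = base.partition("-")
            start, end = int(start), int(end or start)
        if step < 1 or not low <= start <= end <= high:
            raise ValueError(f"value out of range in {part!r}")
        values.update(range(start, end + 1, step))
    return values


def parse_cron(expr):
    """Validate a five-field expression; return the raw fields and value sets."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    sets = [parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_BOUNDS)]
    sets[4] = {d % 7 for d in sets[4]}  # both 0 and 7 mean Sunday
    return fields, sets


def next_runs(expr, start, count=3, horizon_days=366):
    """Return up to `count` run times strictly after `start`."""
    fields, (minutes, hours, days, months, weekdays) = parse_cron(expr)
    day_restricted, weekday_restricted = fields[2] != "*", fields[4] != "*"
    moment = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    deadline = moment + timedelta(days=horizon_days)
    runs = []
    while len(runs) < count and moment < deadline:
        day_ok = moment.day in days
        # Python counts Monday as 0; cron counts Sunday as 0.
        weekday_ok = (moment.weekday() + 1) % 7 in weekdays
        if day_restricted and weekday_restricted:
            date_ok = day_ok or weekday_ok  # classic cron: either may match
        else:
            date_ok = day_ok and weekday_ok
        if (date_ok and moment.month in months
                and moment.hour in hours and moment.minute in minutes):
            runs.append(moment)
        moment += timedelta(minutes=1)
    return runs
```

Called with "30 19 * * *" and a start of 12:00 on a given day, the sketch returns 19:30 on that day and on the two following days. The start time is expected to be expressed in the configured time zone already; how the admin console obtains it is outside the scope of this example.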

7. Management interface

[Mockup: schedule 2]

More mockups:

  • Initial state. Backup is disabled:

    [Mockup: schedule 1]

  • Previous backups exist in the system. The creation date of each copy and the number of days until its deletion are displayed:

    [Mockup: schedule 3]

8. High-level development plan

8.2. Development plan

  • Extend the codebase operator so that it triggers the Jenkins job provisioner after the codebase CR is updated

  • Extend the admin console UI to support entering and saving the backup schedule and the backup storage time

  • Develop Groovy functions in the Jenkins job provisioner for updating the parameters of the Create-registry-backup job.

9. Data migration when updating the registry

  • When the registry is updated to a new version, the current backup schedule settings should remain unchanged.

  • It must be possible to disable automatic registry backups.

10. Security

10.1. Business data

Data category | Description | Confidentiality | Integrity | Availability

Technical data containing open information | System settings, configs, and parameters whose values are not confidential but whose modification can negatively affect system attributes | None | High | High

10.3. Security risk mitigation and compliance

Risk: Remote command execution (RCE). The expiresInDays value is committed to Gerrit from the admin console interface without sanitization. When the backup procedure starts, the value is passed to the backup-registry.sh script as an argument, again without sanitization, which allows an attacker to execute arbitrary system commands on the provisioner.

Security controls:

  • Implement a positive validation mechanism for the "Schedule" form on the frontend

  • Implement a positive validation mechanism for data from the "Schedule" form on the backend

  • Implement a strict typing and validation mechanism for expiresInDays data on the frontend

  • Implement a strict typing and validation mechanism for data from the expiresInDays form on the backend

  • Implement an argument sanitization mechanism in the backup-registry.sh script

Implementation: Partially considered in the initial design. The argument sanitization mechanism in the backup-registry.sh script remains to be implemented.

Priority: Critical

Risk: Denial of service (DoS) caused by scheduling backups to run every minute.

Security controls:

  • Limit the backup schedule so that backups run no more than once an hour.

Implementation: Not considered in the initial design

Priority: High

Risk: Data loss if the storage period for backup copies is too short.

Security controls:

  • Set a minimum storage period of 7 days for backup copies.

Implementation: Not considered in the initial design

Priority: High

Risk: Data loss if backup is not enabled by default (secure by default).

Security controls:

  • Develop a default backup schedule and retention period and use them for new registries.

Implementation: Not considered in the initial design

Priority: High

Risk: Repudiation. There is no audit log and no information about who changed the backup configuration.

Security controls:

  • The target service should log all requests and send them to a centralized logging and monitoring system.

  • Make sure that all unsuccessful requests and errors during operations are logged.

  • The logging system must use a unified time and time zone.

  • Logs must be in a unified format and contain all the information needed to investigate security incidents.

  • Logs should not contain sensitive information, or it should be obfuscated accordingly

Implementation: Not considered in the initial design

Priority: Medium

Risk: Data leakage when an external domain name space is used.

Security controls:

  • Move all internal interservice communication to private domain names.

Implementation: Partially considered in the initial design. Some services may use external addresses. All services must be moved to communication within a private network.

Priority: Medium

Risk: Security requirements: configuring network security policies.

Security controls:

  • Configure network policies to conform to the principle of least privilege.

Implementation: Considered in the initial design

Priority: Medium
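Two of the controls listed above, limiting the schedule to no more than one run per hour and enforcing a 7-day minimum storage period, could be checked on the backend roughly as follows. This is a sketch under stated assumptions: `policy_violations` is a hypothetical name, and the hourly check is deliberately conservative in that it accepts only a single literal number in the minute field, so some schedules that are in fact safe are also rejected.

```python
import re

MIN_RETENTION_DAYS = 7  # minimum storage period proposed in this section
_SINGLE_MINUTE = re.compile(r"[0-5]?[0-9]")  # exactly one minute value, 0-59


def policy_violations(schedule: str, expires_in_days: int) -> list[str]:
    """Return human-readable policy violations; an empty list means accepted."""
    problems = []
    fields = schedule.split()
    if len(fields) != 5:
        problems.append("schedule must have five space-separated fields")
    elif not _SINGLE_MINUTE.fullmatch(fields[0]):
        # '*', lists, ranges and steps in the minute field can all produce
        # more than one run per hour, so only a single number is allowed.
        problems.append("schedule may run more than once an hour")
    if expires_in_days < MIN_RETENTION_DAYS:
        problems.append(f"storage period must be at least {MIN_RETENTION_DAYS} days")
    return problems
```

A schedule of "30 19 * * *" with a 7-day storage period passes both checks, whereas "* * * * *" with a 3-day period is reported for both.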

11. Glossary and acronyms

Term | Description

CR | Custom Resource