Account DB and Container DB¶
DB¶
Database code for Swift
- swift.common.db.BROKER_TIMEOUT = 25¶
Timeout for trying to connect to a DB
- swift.common.db.DB_PREALLOCATION = False¶
Whether calls will be made to preallocate disk space for database files.
- exception swift.common.db.DatabaseAlreadyExists(path)¶
Bases:
DatabaseError
More friendly error messages for DB Errors.
- class swift.common.db.DatabaseBroker(db_file, timeout=25, logger=None, account=None, container=None, pending_timeout=None, stale_reads_ok=False, skip_commits=False)¶
Bases:
object
Encapsulates working with a database.
- property db_file¶
- delete_db(timestamp)¶
Mark the DB as deleted
- Parameters:
timestamp – internalized delete timestamp
- delete_meta_whitelist = []¶
- empty()¶
Check if the broker abstraction contains any undeleted records.
- get()¶
Use with the “with” statement; returns a database connection.
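A minimal usage sketch, assuming broker is an existing AccountBroker or ContainerBroker instance; the query is only illustrative:

```python
# Minimal sketch, assuming `broker` is an existing AccountBroker or
# ContainerBroker; the query itself is only an example.
with broker.get() as conn:
    row = conn.execute('SELECT count(*) FROM sqlite_master').fetchone()
    print('tables and indexes:', row[0])
```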
- get_device_path()¶
- get_info()¶
- get_items_since(start, count)¶
Get a list of objects in the database, starting after the given start ROWID and limited to count rows.
- Parameters:
start – start ROWID
count – number to get
- Returns:
list of objects with ROWID greater than start, at most count rows
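A hedged paging sketch built on this method; it assumes broker is an existing broker and that each returned row dict carries its ROWID:

```python
# Hedged sketch: walk rows in ROWID order, 1000 at a time.
# Assumes `broker` exists and each row dict includes its ROWID.
point = -1
while True:
    items = broker.get_items_since(point, 1000)
    if not items:
        break
    for item in items:
        pass  # process the row dict here
    point = items[-1]['ROWID']
```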
- get_max_row(table=None)¶
- get_raw_metadata()¶
- get_replication_info()¶
Get information about the DB required for replication.
- Returns:
dict containing keys from get_info plus max_row and metadata
- Note: get_info’s <db_contains_type>_count is translated to just “count” and metadata is the raw string.
- get_sync(id, incoming=True)¶
Gets the most recent sync point for a server from the sync table.
- Parameters:
id – remote ID to get the sync_point for
incoming – if True, get the last incoming sync, otherwise get the last outgoing sync
- Returns:
the sync point, or -1 if the id doesn’t exist.
- get_syncs(incoming=True, include_timestamp=False)¶
Get a serialized copy of the sync table.
- Parameters:
incoming – if True, get the last incoming sync, otherwise get the last outgoing sync
include_timestamp – If True include the updated_at timestamp
- Returns:
list of {‘remote_id’, ‘sync_point’} or {‘remote_id’, ‘sync_point’, ‘updated_at’} if include_timestamp is True.
- initialize(put_timestamp=None, storage_policy_index=None)¶
Create the DB
The storage_policy_index is passed through to the subclass’s _initialize method. It is ignored by AccountBroker.
- Parameters:
put_timestamp – internalized timestamp of initial PUT request
storage_policy_index – only required for containers
- is_deleted()¶
Check if the DB is considered to be deleted.
- Returns:
True if the DB is considered to be deleted, False otherwise
- is_reclaimable(now, reclaim_age)¶
Check if the broker abstraction is empty, and has been marked deleted for at least a reclaim age.
- lock()¶
Use with the “with” statement; locks a database.
- make_tuple_for_pickle(record)¶
Turn this db record dict into the format this service uses for pending pickles.
- maybe_get(conn)¶
- merge_items(item_list, source=None)¶
Save item_list to the database.
- merge_syncs(sync_points, incoming=True)¶
Merge a list of sync points with the incoming sync table.
- Parameters:
sync_points – list of sync points where a sync point is a dict of {‘sync_point’, ‘remote_id’}
incoming – if True, get the last incoming sync, otherwise get the last outgoing sync
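A hedged sketch of how the sync table is typically used together with get_sync() during replication; remote_id and max_row are hypothetical values taken from a peer’s replication info:

```python
# Hedged sketch; `remote_id` and `max_row` are hypothetical peer values.
point = broker.get_sync(remote_id, incoming=True)  # -1 if never synced
# ... send rows with ROWID > point to the peer, then record the new position:
broker.merge_syncs([{'remote_id': remote_id, 'sync_point': max_row}],
                   incoming=True)
```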
- merge_timestamps(created_at, put_timestamp, delete_timestamp)¶
Used in replication to handle updating timestamps.
- Parameters:
created_at – create timestamp
put_timestamp – put timestamp
delete_timestamp – delete timestamp
- property metadata¶
Returns the metadata dict for the database. The metadata dict values are tuples of (value, timestamp) where the timestamp indicates when that key was set to that value.
- newid(remote_id)¶
Re-id the database. This should be called after an rsync.
- Parameters:
remote_id – the ID of the remote database being rsynced in
- possibly_quarantine(err)¶
Checks the exception info to see if it indicates a quarantine situation (malformed or corrupted database). If not, the original exception will be reraised. If so, the database will be quarantined and a new sqlite3.DatabaseError will be raised indicating the action taken.
- put_record(record)¶
Put a record into the DB. If the DB has an associated pending file with space then the record is appended to that file and a commit to the DB is deferred. If its pending file is full then the record will be committed immediately.
- Parameters:
record – a record to be added to the DB.
- Raises:
DatabaseConnectionError – if the DB file does not exist or if skip_commits is True.
LockTimeout – if a timeout occurs while waiting to take a lock to write to the pending file.
- quarantine(reason)¶
The database will be quarantined and a sqlite3.DatabaseError will be raised indicating the action taken.
- reclaim(age_timestamp, sync_timestamp)¶
Delete reclaimable rows and metadata from the db.
By default this method will delete rows from the db_contains_type table that are marked deleted and whose created_at timestamp is < age_timestamp, and delete rows from incoming_sync and outgoing_sync where the updated_at timestamp is < sync_timestamp. In addition, this calls the _reclaim_metadata() method.
Subclasses may reclaim other items by overriding _reclaim().
- Parameters:
age_timestamp – max created_at timestamp of object rows to delete
sync_timestamp – max updated_at timestamp of sync rows to delete
- update_metadata(metadata_updates, validate_metadata=False)¶
Updates the metadata dict for the database. The metadata dict values are tuples of (value, timestamp) where the timestamp indicates when that key was set to that value. Key/values will only be overwritten if the timestamp is newer. To delete a key, set its value to (‘’, timestamp). These empty keys will eventually be removed by reclaim().
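A hedged sketch of setting and then deleting a key; it assumes swift.common.utils.Timestamp for internalized timestamps and a hypothetical container metadata key:

```python
from swift.common.utils import Timestamp

# Hedged sketch; the metadata key is hypothetical and `broker` already exists.
broker.update_metadata(
    {'X-Container-Meta-Color': ('blue', Timestamp.now().internal)})
# Delete the key by writing an empty value with a newer timestamp;
# reclaim() will eventually drop the empty entry.
broker.update_metadata(
    {'X-Container-Meta-Color': ('', Timestamp.now().internal)})
```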
- update_put_timestamp(timestamp)¶
Update the put_timestamp. Only modifies it if it is greater than the current timestamp.
- Parameters:
timestamp – internalized put timestamp
- update_status_changed_at(timestamp)¶
Update the status_changed_at field in the stat table. Only modifies status_changed_at if the timestamp is greater than the current status_changed_at timestamp.
- Parameters:
timestamp – internalized timestamp
- updated_timeout(new_timeout)¶
Use with “with” statement; updates timeout within the block.
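A brief sketch, assuming an existing broker, of temporarily raising the timeout for a slow operation:

```python
# Sketch: raise the lock/connection timeout to 60 seconds within this block.
with broker.updated_timeout(60):
    with broker.get() as conn:
        conn.execute('SELECT count(*) FROM sqlite_master').fetchone()
```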
- static validate_metadata(metadata)¶
Validates that metadata falls within acceptable limits.
- Parameters:
metadata – to be validated
- Raises:
HTTPBadRequest – if MAX_META_COUNT or MAX_META_OVERALL_SIZE is exceeded, or if metadata contains non-UTF-8 data
- exception swift.common.db.DatabaseConnectionError(path, msg, timeout=0)¶
Bases:
DatabaseError
More friendly error messages for DB Errors.
- class swift.common.db.GreenDBConnection(database, timeout=None, *args, **kwargs)¶
Bases:
Connection
SQLite DB Connection handler that plays well with eventlet.
- commit()¶
Commit any pending transaction to the database.
If there is no open transaction, this method is a no-op.
- cursor(cls=None)¶
Return a cursor for the connection.
- db_file¶
- execute(*args, **kwargs)¶
Executes an SQL statement.
- timeout¶
- class swift.common.db.GreenDBCursor(*args, **kwargs)¶
Bases:
Cursor
SQLite Cursor handler that plays well with eventlet.
- db_file¶
- execute(*args, **kwargs)¶
Executes an SQL statement.
- timeout¶
- swift.common.db.PICKLE_PROTOCOL = 2¶
Pickle protocol to use
- swift.common.db.QUERY_LOGGING = False¶
Whether calls will be made to log queries (py3 only)
- class swift.common.db.TombstoneReclaimer(broker, age_timestamp)¶
Bases:
object
Encapsulates reclamation of deleted rows in a database.
- get_tombstone_count()¶
Return the number of remaining tombstones newer than age_timestamp. Executes the reclaim method if it has not already been called on this instance.
- Returns:
The number of tombstones in the broker that are newer than age_timestamp.
- reclaim()¶
Perform reclaim of deleted rows older than age_timestamp.
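A hedged usage sketch of the class above; the broker and the seven-day reclaim age are assumptions:

```python
import time

from swift.common.db import TombstoneReclaimer
from swift.common.utils import Timestamp

# Hedged sketch; `broker` and the 7-day reclaim age are assumptions.
age_timestamp = Timestamp(time.time() - 7 * 86400).internal
reclaimer = TombstoneReclaimer(broker, age_timestamp)
reclaimer.reclaim()                          # delete rows older than the age
remaining = reclaimer.get_tombstone_count()  # tombstones newer than the age
```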
- swift.common.db.chexor(old, name, timestamp)¶
Each entry in the account and container databases is XORed by the 128-bit hash on insert or delete. This serves as a rolling, order-independent hash of the contents. (check + XOR)
- Parameters:
old – hex representation of the current DB hash
name – name of the object or container being inserted
timestamp – internalized timestamp of the new record
- Returns:
a hex representation of the new hash value
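A hedged sketch of the rolling XOR idea (not the exact Swift implementation): each record’s name and timestamp are hashed to 128 bits and XORed into the running value, so the result is independent of insertion order:

```python
import hashlib

def chexor_sketch(old_hex, name, timestamp):
    # Hash the record identity to 128 bits and XOR it into the running hash.
    new = hashlib.md5(('%s-%s' % (name, timestamp)).encode('utf-8')).hexdigest()
    return '%032x' % (int(old_hex, 16) ^ int(new, 16))

h = '0' * 32
h = chexor_sketch(h, 'obj1', '0000001234.00000')
h = chexor_sketch(h, 'obj2', '0000005678.00000')
# XOR is commutative, so applying the records in the other order yields the
# same final hash, and re-applying a record cancels it out.
```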
- swift.common.db.dict_factory(crs, row)¶
This should only be used when you need a real dict, i.e. when you’re going to serialize the results.
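For example, it can be installed as a sqlite3 row factory so query results come back as plain dicts ready for JSON serialization (a usage sketch, not taken from the Swift source):

```python
import json
import sqlite3

from swift.common.db import dict_factory

conn = sqlite3.connect(':memory:')
conn.row_factory = dict_factory
conn.execute('CREATE TABLE object (name TEXT, size INTEGER)')
conn.execute("INSERT INTO object VALUES ('photo.jpg', 42)")
rows = conn.execute('SELECT * FROM object').fetchall()
print(json.dumps(rows))  # [{"name": "photo.jpg", "size": 42}]
```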
- swift.common.db.get_db_connection(path, timeout=30, logger=None, okay_to_create=False)¶
Returns a properly configured SQLite database connection.
- Parameters:
path – path to DB
timeout – timeout for connection
okay_to_create – if True, create the DB if it doesn’t exist
- Returns:
DB connection object
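A hedged usage sketch; the database path below is hypothetical:

```python
from swift.common.db import get_db_connection

# Hedged sketch; the path is hypothetical.
conn = get_db_connection(
    '/srv/node/sda1/accounts/1/abc/d41d8cd98f00b204e9800998ecf8427e/'
    'd41d8cd98f00b204e9800998ecf8427e.db', timeout=30)
print(conn.execute('SELECT name FROM sqlite_master').fetchall())
```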
- swift.common.db.native_str_keys_and_values(metadata)¶
- swift.common.db.zero_like(count)¶
We’ve cargo culted our consumers to be tolerant of various expressions of zero in our databases for backwards compatibility with less disciplined producers.
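A hedged illustration of the tolerance this implies; the exact set of accepted “zero” values is an assumption:

```python
from swift.common.db import zero_like

# Assumed behaviour: the usual expressions of zero found in older DBs.
assert zero_like(0)
assert zero_like('0')
assert zero_like('')
assert zero_like(None)
assert not zero_like(1)
```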
DB replicator¶
- class swift.common.db_replicator.BrokerAnnotatedLogger(logger)¶
Bases:
object
Formats log messages with broker details.
This class augments messages with the broker’s container path and DB file path so that logs are easier to correlate during replication and sharding workflows.
- debug(broker, msg, *args, **kwargs)¶
- error(broker, msg, *args, **kwargs)¶
- exception(broker, msg, *args, **kwargs)¶
- info(broker, msg, *args, **kwargs)¶
- warning(broker, msg, *args, **kwargs)¶
- class swift.common.db_replicator.ReplConnection(node, partition, hash_, logger)¶
Bases:
BufferedHTTPConnection
Helper to simplify REPLICATEing to a remote server.
- replicate(*args)¶
Make an HTTP REPLICATE request
- Parameters:
args – list of json-encodable objects
- Returns:
bufferedhttp response object
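A hedged usage sketch; node, partition, hash_ and the payload are hypothetical, with the op name following the RPC ops handled by ReplicatorRpc.dispatch below:

```python
# Hedged sketch; node/partition/hash_ are hypothetical ring values and the
# payload shape follows the merge_syncs RPC handled by ReplicatorRpc.
conn = ReplConnection(node, partition, hash_, logger)
sync_points = [{'remote_id': local_id, 'sync_point': max_row}]
resp = conn.replicate('merge_syncs', sync_points)
if resp and 200 <= resp.status < 300:
    pass  # the peer accepted the sync points
```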
- class swift.common.db_replicator.Replicator(conf, logger=None)¶
Bases:
Daemon
Implements the logic for directing db replication.
- cleanup_post_replicate(broker, orig_info, responses)¶
Clean up a non-primary database from disk if needed.
- Parameters:
broker – the broker for the database we’re replicating
orig_info – snapshot of the broker replication info dict taken before replication
responses – a list of boolean success values for each replication request to other nodes
- Return success:
returns False if deletion of the database was attempted but unsuccessful, otherwise returns True.
- delete_db(broker)¶
- extract_device(object_file)¶
Extract the device name from an object path. Returns “UNKNOWN” if the device name cannot be extracted for some reason.
- Parameters:
object_file – the path to a database file.
- report_up_to_date(full_info)¶
- roundrobin_datadirs(dirs)¶
- run_forever(*args, **kwargs)¶
Replicate dbs under the given root in an infinite loop.
- run_once(*args, **kwargs)¶
Run a replication pass once.
- class swift.common.db_replicator.ReplicatorRpc(root, datadir, broker_class, mount_check=True, logger=None)¶
Bases:
object
Handle Replication RPC calls. TODO(redbo): document please :)
- complete_rsync(drive, db_file, args)¶
- debug_timing(name)¶
- dispatch(replicate_args, args)¶
- merge_items(broker, args)¶
- merge_syncs(broker, args)¶
- rsync_then_merge(drive, db_file, args)¶
- sync(broker, args)¶
- swift.common.db_replicator.looks_like_partition(dir_name)¶
True if the directory name is a valid partition number, False otherwise.
- swift.common.db_replicator.quarantine_db(object_file, server_type)¶
In the case that a corrupt file is found, move it to a quarantined area to allow replication to fix it.
- Parameters:
object_file – path to corrupt file
server_type – type of file that is corrupt (‘container’ or ‘account’)
- swift.common.db_replicator.roundrobin_datadirs(datadirs)¶
Generator to walk the data dirs in a round robin manner, evenly hitting each device on the system, and yielding any .db files found (in their proper places). The partitions within each data dir are walked randomly, however.
- Parameters:
datadirs – a list of tuples of (path, context, partition_filter) to walk. The context may be any object; the context is not used by this function but is included with each yielded tuple.
- Returns:
A generator of (partition, path_to_db_file, context)
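A hedged usage sketch; the paths are hypothetical, the context values are arbitrary, and the partition filter is assumed to be a callable that accepts a partition name:

```python
from swift.common.db_replicator import roundrobin_datadirs

# Hedged sketch; paths are hypothetical and the filter accepts everything.
datadirs = [
    ('/srv/node/sda1/accounts', 'sda1', lambda part: True),
    ('/srv/node/sdb1/accounts', 'sdb1', lambda part: True),
]
for partition, db_path, context in roundrobin_datadirs(datadirs):
    print(context, partition, db_path)
```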