RUFAS.input_manager module#

RUFAS.input_manager.FIXABLE_INPUT_DATA_TYPES: set[str] = {'bool', 'number', 'string'}#: Set enumerating the input data formats the Input Manager can accept.

class RUFAS.input_manager.InputManager(metadata_depth_limit: int | None = None)#

Bases: object

Input Manager class responsible for loading, validating, and providing access to input data.

__instance = None#

__init__(metadata_depth_limit: int | None = None) → None#

property meta_data: Dict[str, Any]#: The getter method for __metadata

property pool: Dict[str, Any]#: The getter method for __pool

start_data_processing(metadata_path: Path, eager_termination: bool = True) → bool#

Starts the pipeline for organizing metadata and input data processing.

Parameters#

metadata_pathPath: File path to the metadata.
eager_terminationbool, default=True: If True, processing stops as soon as invalid data is found that cannot be fixed. If False, processing continues until all of the data has been validated.

Returns#

bool: True if data is valid, otherwise False.

_load_metadata(metadata_path: Path) → None#

Loads metadata from a JSON file into the Input Manager's metadata dictionary.

Parameters#

metadata_pathPath: The path to the metadata file.

Raises#

Exception: If an error occurs while opening or reading the metadata_path file.

_load_properties() → None#

Loads properties data from a specified JSON file and updates the metadata.

This method reads the properties file path from the metadata, checks if the file exists, and then loads the properties into the metadata. The original properties data in the metadata is first copied to a separate attribute for future reference and then removed from the metadata files section.

Raises#

FileNotFoundError: If the properties file does not exist at the specified path.
json.JSONDecodeError: If there is an error in decoding the JSON file.
Exception: For any other unexpected errors during properties loading.

_load_data_from_json(file_path: Path) → Dict[str, Any]#

Loads data from an input JSON file.

Parameters#

file_pathPath: Path to the input file to load.

Returns#

Dict[str, Any]: The data dictionary loaded from the json file.

Raises#

Exception: For any other unexpected errors during JSON file loading.
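As a rough illustration of the loading step described above, the sketch below reads a JSON file into a dictionary. The function name `load_json` and the check that the top-level value is an object are assumptions made for this example; they are not taken from the RUFAS source.

```python
import json
from pathlib import Path
from typing import Any


def load_json(file_path: Path) -> dict[str, Any]:
    """Read a JSON file and return its top-level object as a dictionary."""
    with file_path.open("r", encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: an input file holds a JSON object at the top level.
    if not isinstance(data, dict):
        raise TypeError(
            f"Expected a JSON object in {file_path}, got {type(data).__name__}"
        )
    return data
```

A missing file surfaces as `FileNotFoundError` and malformed content as `json.JSONDecodeError`, both raised by the standard library calls used here.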

_load_data_from_csv(file_path: Path) → Dict[str, Any]#

Loads data from an input CSV file.

Parameters#

file_pathPath: Path to the input file to load.

Returns#

Dict[str, Any]: The data dictionary loaded from the CSV file.

Raises#

FileNotFoundError: If the CSV file does not exist at the specified path.
Exception: For any other unexpected errors during CSV file loading.
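One plausible way to turn a CSV file into a dictionary is to key each column by its header. The sketch below does that with the standard `csv` module. The column-oriented layout and the name `load_csv_as_columns` are assumptions for illustration; the documentation above does not say how the method structures its result.

```python
import csv
from pathlib import Path
from typing import Any


def load_csv_as_columns(file_path: Path) -> dict[str, list[Any]]:
    """Return a mapping of column header -> list of cell values (as strings)."""
    if not file_path.is_file():
        raise FileNotFoundError(f"CSV file not found: {file_path}")
    columns: dict[str, list[Any]] = {}
    with file_path.open("r", newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # The header row defines the dictionary keys.
        for name in reader.fieldnames or []:
            columns[name] = []
        for row in reader:
            for name in columns:
                columns[name].append(row[name])
    return columns
```

Values are left as strings in this sketch; any type conversion would be a separate step.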

_populate_pool(eager_termination: bool) → bool#

Loads input files, runs validations on the data from the input files, attempts to fix invalid data, then adds data to the pool.

Parameters#

eager_terminationbool: If True, processing stops as soon as invalid data is found that cannot be fixed. If False, all of the data is validated first, and processing stops afterwards if any invalid data was found.

Returns#

bool: True if data is valid, otherwise False.

Raises#

KeyError: If faulty data type found in data blob key.

_get_variable_modifiability(variable_name: str, variable_properties: Dict[str, Any]) → Modifiability#

Determines the modifiability status of a variable based on its properties and returns the corresponding enum value.

Notes#

This function looks for a ‘modifiability’ key within variable_properties. If present and its value is not empty, the function attempts to map this value to an enum member in Modifiability. If the value does not correspond to any enum members, a KeyError is raised after logging the error. If ‘modifiability’ is absent or its value is empty, the function defaults to Modifiability.NOT_REQUIRED_AND_UNLOCKED.

Parameters#

variable_namestr: The name of the variable for which the modifiability status is being determined. Used for error logging.
variable_propertiesDict[str, Any]: A dictionary containing the properties of the variable, containing the desired ‘modifiability’ property.

Returns#

Modifiability: An enum member representing the variable’s modifiability status.

Raises#

KeyError: If ‘modifiability’ in variable_properties does not match any enum member in Modifiability. The error message includes the invalid modifiability value and suggests valid values.

_is_input_required_upon_initialization(variable_name: str, variable_properties: Dict[str, Any]) → bool#

Determines whether a variable requires an input value upon initialization based on its modifiability status.

This function utilizes the ‘_get_variable_modifiability’ method to ascertain the modifiability status of the variable identified by ‘variable_name’ and described by ‘variable_properties’. It then checks if the modifiability status is either ‘REQUIRED_AND_LOCKED’ or ‘REQUIRED_AND_UNLOCKED’, indicating that the variable must be initialized with a value.

Parameters#

variable_namestr: The name of the variable being evaluated for its initialization requirements.
variable_propertiesDict[str, Any]: A dictionary containing the properties of the variable, which should include its modifiability status among others.

Returns#

bool: True if the variable’s modifiability status necessitates an input value upon initialization, False otherwise.

_is_modifiable_during_runtime(variable_name: str, variable_properties: Dict[str, Any]) → bool#

Checks if a variable can be modified during runtime based on its modifiability status.

This function determines the modifiability status of a variable using the ‘_get_variable_modifiability’ method. It assesses whether the variable, identified by ‘variable_name’ and described by ‘variable_properties’, is allowed to be modified after initialization. A variable is considered modifiable during runtime if its modifiability status is either ‘REQUIRED_AND_UNLOCKED’ or ‘NOT_REQUIRED_AND_UNLOCKED’.

Parameters#

variable_namestr: The name of the variable to check for runtime modifiability.
variable_propertiesDict[str, Any]: A dictionary containing the properties of the variable, including details that determine its modifiability.

Returns#

bool: True if the variable is allowed to be modified during runtime, False otherwise.
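The three modifiability helpers described above can be sketched together as follows. This is a minimal illustration, not the RUFAS implementation. The enum values, the fourth member `NOT_REQUIRED_AND_LOCKED`, and the lookup by member name are assumptions; only the other three member names and the default appear in the text.

```python
from enum import Enum
from typing import Any


class Modifiability(Enum):
    REQUIRED_AND_LOCKED = "required_and_locked"
    REQUIRED_AND_UNLOCKED = "required_and_unlocked"
    NOT_REQUIRED_AND_LOCKED = "not_required_and_locked"  # assumed member
    NOT_REQUIRED_AND_UNLOCKED = "not_required_and_unlocked"


def get_variable_modifiability(
    variable_name: str, variable_properties: dict[str, Any]
) -> Modifiability:
    raw_value = variable_properties.get("modifiability")
    if not raw_value:
        # Absent or empty: fall back to the documented default.
        return Modifiability.NOT_REQUIRED_AND_UNLOCKED
    try:
        # Assumption: the metadata stores the enum member name.
        return Modifiability[raw_value]
    except KeyError:
        valid = ", ".join(member.name for member in Modifiability)
        raise KeyError(
            f"Invalid modifiability {raw_value!r} for {variable_name!r}; "
            f"valid values: {valid}"
        ) from None


def is_input_required_upon_initialization(
    variable_name: str, variable_properties: dict[str, Any]
) -> bool:
    status = get_variable_modifiability(variable_name, variable_properties)
    return status in {
        Modifiability.REQUIRED_AND_LOCKED,
        Modifiability.REQUIRED_AND_UNLOCKED,
    }


def is_modifiable_during_runtime(
    variable_name: str, variable_properties: dict[str, Any]
) -> bool:
    status = get_variable_modifiability(variable_name, variable_properties)
    return status in {
        Modifiability.REQUIRED_AND_UNLOCKED,
        Modifiability.NOT_REQUIRED_AND_UNLOCKED,
    }
```

Looking a member up by name with `Modifiability[...]` raises `KeyError` for unknown names, which lines up with the exception type documented above.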

_log_missing_data(variable_properties: Dict[str, Any], var_name: str, called_during_initialization: bool) → None#

Handles logging for missing data for a variable, logging errors or warnings based on the context of initialization or runtime updates.

Parameters#

variable_propertiesDict[str, Any]: Properties of the variable, potentially including its modifiability status.
var_namestr: The name of the variable with missing data.
called_during_initialization: bool: Whether the function is being called during initialization.

Raises#

KeyError: Raised if the missing data is deemed necessary, either during initialization or for a runtime update.

Notes#

This function determines if it’s being called during the initialization phase and checks if the missing variable data is required at this stage using ‘_is_input_required_upon_initialization’. If required, it logs an error and raises a KeyError. If not, it logs a warning.

get_data(data_address: str) → Any#

Get the requested data from the pool if it exists. If not, None is returned.

Parameters#

data_addressstr: The address of the requested data.

Returns#

Any: The requested data if found. None otherwise.

Examples#

The user can request as broad or narrow a selection of the input data pool as is needed.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return the value of calf_num of the herd_information section in the animal blob (in this example, the value for calf_num is 8):

>>> input_manager.get_data('animal.herd_information.calf_num')
8

If a broader range of data is needed, the user can expand the query to get_data by shortening the data_address. This will return the full herd_information object:

>>> input_manager.get_data('animal.herd_information')
{ calf_num: 8, heiferI_num: 44, heiferII_num: 38, heiferIII_num_springers: 5, cow_num: 100, herd_num: 187, herd_init: False, breed: HO }

If the requested data does not exist, the method will return None:

>>> input_manager.get_data('animal.herd_information.nonexistent_property')
None

check_property_exists_in_pool(data_address: str) → bool#

Check if the requested property exists in the pool.

Parameters#

data_addressstr: The address of the requested property.

Returns#

bool: True if the property exists in the pool, False otherwise.

Examples#

The user can check if a property exists in the pool.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return True if the property calf_num exists in the herd_information section of the animal blob:

>>> input_manager.check_property_exists_in_pool('animal.herd_information.calf_num')
True

If the property does not exist, the method will return False:

>>> input_manager.check_property_exists_in_pool('animal.herd_information.nonexistent_property')
False
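The dotted addresses used by get_data and check_property_exists_in_pool suggest a walk through nested dictionaries. The sketch below shows one way to do that walk. The helper names (`lookup_or_none`, `address_exists`) and the sentinel object are hypothetical; the pool is passed in explicitly so the example runs without RUFAS.

```python
from typing import Any

# Sentinel that distinguishes "address not found" from a stored None value.
_MISSING = object()


def _walk(pool: dict[str, Any], data_address: str) -> Any:
    current: Any = pool
    for key in data_address.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def lookup_or_none(pool: dict[str, Any], data_address: str) -> Any:
    """Return the value at the address, or None if the address does not exist."""
    found = _walk(pool, data_address)
    return None if found is _MISSING else found


def address_exists(pool: dict[str, Any], data_address: str) -> bool:
    """Return True if every key along the address is present."""
    return _walk(pool, data_address) is not _MISSING


pool = {"animal": {"herd_information": {"calf_num": 8, "breed": "HO"}}}
print(lookup_or_none(pool, "animal.herd_information.calf_num"))
print(address_exists(pool, "animal.herd_information.nonexistent_property"))
```

Shortening the address returns a larger slice of the pool, since the walk simply stops one level higher.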

get_metadata(metadata_address: str) → Any#

Get the requested metadata from the IM metadata dictionary.

Parameters#

metadata_addressstr: The address of the requested metadata.

Returns#

Any: The requested metadata if found.

Raises#

KeyError: If the requested metadata is not found.

Examples#

The user can request as broad or narrow a selection of the metadata as is needed.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return the 'type' for albedo in the soil_profile_properties section of the metadata's properties (the type for albedo is number):

>>> input_manager.get_metadata('properties.soil_profile_properties.albedo.type')
"number"

If a broader range of the metadata is needed, the user can expand the query to get_metadata by shortening the metadata_address. This will return the full 'albedo' object containing its type, description, minimum, maximum, and default:

>>> input_manager.get_metadata('properties.soil_profile_properties.albedo')
{ "type": "number", "description": "Ratio of solar radiation reflected by soil to amount of incident upon it. Unitless. Reference: SWAT Input .SOL - SOL_ALB", "minimum": 0.0, "maximum": 1.0, "default": 0.16 }
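Unlike get_data, get_metadata is documented to raise a KeyError when the address cannot be resolved. A strict variant of the dotted-address walk might look like the sketch below. The function name `strict_lookup` is hypothetical, and the metadata dictionary is passed in so the example is self-contained.

```python
from typing import Any


def strict_lookup(metadata: dict[str, Any], metadata_address: str) -> Any:
    """Resolve a dotted address, raising KeyError at the first missing key."""
    current: Any = metadata
    for key in metadata_address.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(
                f"Metadata not found at {metadata_address!r} (stopped at {key!r})"
            )
        current = current[key]
    return current


metadata = {
    "properties": {
        "soil_profile_properties": {
            "albedo": {"type": "number", "minimum": 0.0, "maximum": 1.0, "default": 0.16}
        }
    }
}
print(strict_lookup(metadata, "properties.soil_profile_properties.albedo.type"))
```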

get_data_keys_by_properties(target_properties: str) → list[str]#

Retrieves the list of metadata keys that point to data which have the target_properties.

Parameters#

target_propertiesstr: The name of the metadata properties group that is being searched for.

Returns#

list[str]: List of keys which point to data within the Input Manager’s data pool that adhere to the target metadata properties.

Examples#

If the metadata looked like the following:

```
{
    "files": {
        "field_1": {
            "properties": "field_properties", …
        },
        "soil_1": {
            "properties": "soil_profile_properties", …
        },
        "field_2": {
            "properties": "field_properties", …
        }
    },
    "properties": {…}, …
}
```

Then the call get_data_keys_by_properties("field_properties") would be expected to return the list ["field_1", "field_2"].

Notes#

If no keys have the specified property, the method returns an empty list.
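The lookup described in this entry amounts to filtering the entries of the metadata's files section by their "properties" value. Below is a minimal sketch, assuming the metadata layout from the example; the function name `keys_with_properties` is hypothetical, and the metadata is passed in as an argument.

```python
from typing import Any


def keys_with_properties(metadata: dict[str, Any], target_properties: str) -> list[str]:
    """Return the keys under "files" whose "properties" value matches the target."""
    files = metadata.get("files", {})
    return [
        key
        for key, entry in files.items()
        if isinstance(entry, dict) and entry.get("properties") == target_properties
    ]


example_metadata = {
    "files": {
        "field_1": {"properties": "field_properties"},
        "soil_1": {"properties": "soil_profile_properties"},
        "field_2": {"properties": "field_properties"},
    },
    "properties": {},
}
print(keys_with_properties(example_metadata, "field_properties"))
```

Because dictionaries preserve insertion order, the keys come back in the order they appear in the metadata.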

flush_pool() → None#: Clear the variable pool.

_metadata_properties_exist(variable_name: str, properties_blob_key: str) → bool#

Checks if specific properties exist in the metadata for a given variable.

Notes#

This function verifies that the specified properties exist within the metadata for a particular variable. It returns a boolean indicating whether the properties exist, raises a ValueError if no metadata has been loaded, and raises a KeyError if the properties cannot be found.

Parameters#

variable_namestr: The name of the variable for which the metadata is to be checked.
properties_blob_keystr: The key representing the specific properties blob in the metadata to check.

Returns#

bool: True if the properties exist, False otherwise.

Raises#

ValueError: If no metadata is loaded in InputManager.__metadata.
KeyError: If no metadata properties can be found with the given properties_blob_key.

_add_variable_to_pool(variable_name: str, input_data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) → bool#

Adds a variable to the pool after validating its data against specified metadata properties.

Notes#

This function processes and validates the input data for a variable based on its metadata properties, attempting to fix any invalid elements. If all elements are valid or successfully fixed, the data is added to the pool. The function supports eager termination, which can halt the process early if invalid data is encountered or if an attempt is made to modify a non-modifiable variable during runtime.

Parameters#

variable_namestr: The name of the variable to be added to the pool.
input_dataDict[str, Any]: The data associated with the variable that needs validation and addition to the pool.
properties_blob_keystr: The key in the metadata properties against which the data is validated.
eager_terminationbool: Flag indicating whether the function should return early in case of invalid data.

Returns#

bool: True if the variable is successfully added, False otherwise.

Raises#

ValueError: If eager_termination is True and the variable failed validation.
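The effect of the eager_termination flag can be illustrated with a small stand-in: with the flag set, invalid data raises a ValueError; without it, the function reports failure through its return value. The function `add_if_valid` and its `is_valid` callback are hypothetical simplifications, not the RUFAS validation logic.

```python
from typing import Any, Callable


def add_if_valid(
    pool: dict[str, Any],
    variable_name: str,
    input_data: dict[str, Any],
    is_valid: Callable[[dict[str, Any]], bool],
    eager_termination: bool,
) -> bool:
    """Add input_data to the pool if it passes validation."""
    if is_valid(input_data):
        pool[variable_name] = input_data
        return True
    if eager_termination:
        # Eager mode: stop immediately on the first invalid variable.
        raise ValueError(f"Variable {variable_name!r} failed validation")
    # Lazy mode: report the failure and let the caller continue.
    return False


def has_calf_num(data: dict[str, Any]) -> bool:
    # Stand-in validator used only for this example.
    return "calf_num" in data
```

A caller that wants a full report of all problems would pass `eager_termination=False` and collect the False results.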

_prepare_data(variable_name: str, input_data: dict[str, Any], properties_blob_key: str) → Tuple[Dict[str, Any], Dict[str, Any]]#

Prepare data and metadata properties for validation.

Parameters#

variable_namestr: The name of the variable to be added to the pool.
input_dataDict[str, Any]: The data associated with the variable that needs validation and addition to the pool.
properties_blob_keystr: The key in the metadata properties against which the data is validated.

Returns#

Tuple[Dict[str, Any], Dict[str, Any]]: Prepared data and metadata properties.

_check_modifiability(variable_name: str, metadata_properties: dict[str, Any], eager_termination: bool) → bool#

Checks whether a variable is allowed to be modified at runtime.

Parameters#

variable_namestr: The name of the variable to be added to the pool.
metadata_propertiesdict[str, Any]: Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.
eager_terminationbool: Indicator for the need of eager termination.

Returns#

bool: Indicator for whether the data is modifiable.

Raises#

PermissionError: If eager_termination is True and the variable is not modifiable during runtime.

_validate_data(data: dict[str, Any], metadata_properties: dict[str, Any], eager_termination: bool, properties_blob_key: str, elements_counter: ElementsCounter) → dict[str, Any]#

Validate input data based on metadata properties.

Parameters#

datadict[str, Any]: Data to be validated.
metadata_propertiesdict[str, Any]: Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.
eager_terminationbool: Indicator for the need of eager termination.
properties_blob_keystr: The key in the metadata properties against which the data is validated.
elements_counterElementsCounter: An ElementsCounter object to keep track of status of variables.

Returns#

dict[str, Any]: A dictionary of validated data.
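To make the validate-and-fix idea concrete, the sketch below checks a flat dictionary against per-property metadata using the "type", "minimum", "maximum", and "default" keys. Replacing an invalid value with the metadata default is an assumed fixing strategy chosen for illustration; the actual fix logic, the ElementsCounter bookkeeping, and nested data handling are not shown. The name `validate_flat` is hypothetical.

```python
from typing import Any

FIXABLE_INPUT_DATA_TYPES = {"bool", "number", "string"}


def _matches_type(value: Any, expected_type: Any) -> bool:
    if expected_type == "bool":
        return isinstance(value, bool)
    if expected_type == "number":
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected_type == "string":
        return isinstance(value, str)
    return False


def _is_valid(value: Any, props: dict[str, Any]) -> bool:
    if not _matches_type(value, props.get("type")):
        return False
    if props.get("type") == "number":
        if "minimum" in props and value < props["minimum"]:
            return False
        if "maximum" in props and value > props["maximum"]:
            return False
    return True


def validate_flat(
    data: dict[str, Any],
    metadata_properties: dict[str, dict[str, Any]],
    eager_termination: bool,
) -> dict[str, Any]:
    validated: dict[str, Any] = {}
    for name, props in metadata_properties.items():
        value = data.get(name)
        if _is_valid(value, props):
            validated[name] = value
        elif props.get("type") in FIXABLE_INPUT_DATA_TYPES and "default" in props:
            # Assumed fix: fall back to the default declared in the metadata.
            validated[name] = props["default"]
        elif eager_termination:
            raise ValueError(f"Invalid value for {name!r}: {value!r}")
        # Otherwise the invalid entry is left out of the validated result.
    return validated
```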

_add_to_pool(variable_name: str, validated_data: dict[str, Any]) → None#

Add validated data to the pool.

Parameters#

variable_namestr: The name of the variable to be added to the pool.
validated_datadict[str, Any]: A dictionary of validated data.

add_runtime_variable_to_pool(variable_name: str, data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) → bool#

Adds a variable to the InputManager’s pool after validating it against metadata.

Notes#

This function takes in a variable along with its name and a key to access its validation metadata. It validates the data against the provided metadata and adds the data to the InputManager pool if it is valid.

Parameters#

variable_name: str: The name of the dictionary variable to be added.
dataDict[str, Any]: The data of the variable, structured as a dictionary.
properties_blob_keystr: A key used to locate the metadata for validation of the variable.
eager_terminationbool: If True, a ValueError will be raised from _add_variable_to_pool() when the variable is invalid. If False, the function returns False.

Returns#

bool: True if the variable is successfully validated and added to the pool. False if the variable is invalid and not added to the pool.

Raises#

TypeError: If data is not the expected type of Dict[str, Any].

dump_get_data_logs(path: Path) → None#

Dumps the stored get data logs to a JSON file at the specified path.

Parameters#

pathPath: The directory path where the JSON file will be saved.

save_metadata_properties(output_dir: Path) → None#

Saves metadata properties in CSV format.

Parameters#

output_dirPath: The path to the output directory where the metadata properties CSV will be saved.

Raises#

FileNotFoundError: If the file cannot be saved at the specified path.
PermissionError: If the user does not have permission to save the file at the specified path.
OSError: For any other unexpected error that occurs while trying to save the CSV.

_parse_metadata_properties(data: Dict[str, Any], prefix: str = '', sep: str = '_') → List[Dict[str, Any]]#

Recursively traverse through the metadata properties dictionary to flatten it by creating a record for each entry.

Parameters#

dataDict[str, Any]: The metadata properties data to be parsed.
prefixstr, optional: The data record prefix, by default ‘’.
sepstr, optional: The separator used between parts of the data entry names, by default ‘_’.

Returns#

List[Dict[str, Any]]: A list of flattened data entries from the json file.
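The recursive flattening described above can be sketched as follows. This version treats any dictionary that contains a "type" key as a leaf entry and joins nested keys with the separator. That leaf rule, the "name" field in each record, and the function name `flatten_properties` are assumptions for illustration; the real method may decide differently which entries become records.

```python
from typing import Any


def flatten_properties(
    data: dict[str, Any], prefix: str = "", sep: str = "_"
) -> list[dict[str, Any]]:
    """Flatten nested metadata properties into one record per typed entry."""
    records: list[dict[str, Any]] = []
    for key, value in data.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if not isinstance(value, dict):
            continue  # Scalars at this level carry no property definition.
        if "type" in value:
            # Assumed leaf rule: a dict with a "type" key describes one property.
            records.append({"name": name, **value})
        else:
            records.extend(flatten_properties(value, prefix=name, sep=sep))
    return records


nested = {
    "soil_profile_properties": {
        "albedo": {"type": "number", "minimum": 0.0, "maximum": 1.0},
    }
}
print(flatten_properties(nested))
```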

_check_property_type_primitive(property: Dict[str, Any]) → bool#: Checks whether the property’s “type” is primitive or an array of primitive types.

_create_record(data_entry: Dict[str, Any], name: str) → Dict[str, Any]#

Assembles a record to a specific format to match the columns of the CSV to which it will eventually be added.

Parameters#

data_entryDict[str, Any]: The data entry from the json file to be converted into the record format.
namestr: The name to be used for the record.

Returns#

Dict[str, Any]: A dictionary of the data entry converted to the record format.

compare_metadata_properties(properties_file_path: Path, comparison_properties_file_path: Path, output_directory: Path) → None#: Compares two metadata properties json files using the DeepDiff package and saves the results in a text file.

export_pool_to_csv(output_prefix: str, output_path: Path) → None#

Flattens the input data of interest and exports the variables with their values to a CSV file.

Parameters#

output_prefix: str: The output prefix for the current task.
output_path: Path: The folder to save the output CSV.
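A flatten-then-export step of this kind might look like the sketch below, which writes one row per leaf value with its dotted address. The output file name pattern, the two-column layout, and the function names are hypothetical; the documentation above does not specify them.

```python
import csv
from pathlib import Path
from typing import Any


def flatten_pool(pool: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Map each leaf value to its dotted address."""
    flat: dict[str, Any] = {}
    for key, value in pool.items():
        address = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_pool(value, address))
        else:
            flat[address] = value
    return flat


def export_flat_csv(pool: dict[str, Any], output_prefix: str, output_path: Path) -> Path:
    """Write the flattened pool to <output_path>/<output_prefix>_input_data.csv."""
    output_path.mkdir(parents=True, exist_ok=True)
    # Hypothetical file name; chosen only for this example.
    target = output_path / f"{output_prefix}_input_data.csv"
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["variable", "value"])
        for address, value in flatten_pool(pool).items():
            writer.writerow([address, value])
    return target
```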