RUFAS.input_manager module#

RUFAS.input_manager.FIXABLE_INPUT_DATA_TYPES: set[str] = {'bool', 'number', 'string'}#

Set enumerating the input data formats the Input Manager can accept.

class RUFAS.input_manager.InputManager(metadata_depth_limit: int | None = None)#

Bases: object

Input Manager class responsible for loading, validating, and providing access to input data.

__instance = None#
__init__(metadata_depth_limit: int | None = None) None#
property meta_data: Dict[str, Any]#

The getter method for __metadata

property pool: Dict[str, Any]#

The getter method for __pool

start_data_processing(metadata_path: Path, eager_termination: bool = True) bool#

Starts the pipeline for organizing metadata and input data processing.

Parameters#

metadata_pathPath

File path to the metadata.

eager_terminationbool, default=True

If True, processing stops as soon as invalid data is found that cannot be fixed. If False, processing continues until all of the data has been validated.

Returns#

bool

True if data is valid, otherwise False.
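
The difference between the two eager_termination modes can be illustrated with a minimal sketch. The validate_all helper and the is_valid callback below are hypothetical names introduced for illustration only; they are not part of RUFAS, and the real Input Manager also attempts to fix invalid values before giving up.

```python
from typing import Any, Callable


def validate_all(
    items: dict[str, Any],
    is_valid: Callable[[Any], bool],
    eager_termination: bool = True,
) -> tuple[bool, list[str]]:
    """Return overall validity and the names that were checked (hypothetical helper)."""
    checked: list[str] = []
    all_valid = True
    for name, value in items.items():
        checked.append(name)
        if not is_valid(value):
            all_valid = False
            if eager_termination:
                # Eager mode: stop at the first invalid value.
                break
    return all_valid, checked


data = {"a": 1, "b": -5, "c": 3}
print(validate_all(data, lambda v: v >= 0, eager_termination=True))
print(validate_all(data, lambda v: v >= 0, eager_termination=False))
```

In eager mode the loop stops after "b"; in non-eager mode every entry is visited, so all problems can be reported in a single pass.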

_load_metadata(metadata_path: Path) None#

Loads metadata from a JSON file into the Input Manager's metadata dictionary.

Parameters#

metadata_pathPath

The path to the metadata file.

Raises#

Exception

If an error occurs while opening or reading the metadata_path file.

_load_properties() None#

Loads properties data from a specified JSON file and updates the metadata.

This method reads the properties file path from the metadata, checks if the file exists, and then loads the properties into the metadata. The original properties data in the metadata is first copied to a separate attribute for future reference and then removed from the metadata files section.

Raises#

FileNotFoundError

If the properties file does not exist at the specified path.

json.JSONDecodeError

If there is an error in decoding the JSON file.

Exception

For any other unexpected errors during properties loading.

_load_data_from_json(file_path: Path) Dict[str, Any]#

Loads data from an input JSON file.

Parameters#

file_pathPath

Path to the input file to load.

Returns#

Dict[str, Any]

The data dictionary loaded from the json file.

Raises#

Exception

For any unexpected errors during JSON file loading.

_load_data_from_csv(file_path: Path) Dict[str, Any]#

Loads data from an input CSV file.

Parameters#

file_pathPath

Path to the input file to load.

Returns#

Dict[str, Any]

The data dictionary loaded from the CSV file.

Raises#

FileNotFoundError

If the CSV file does not exist at the specified path.

Exception

For any other unexpected errors during CSV file loading.
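
As a rough illustration of loading a CSV file into a dictionary, the sketch below uses only the standard library. The function name load_csv_as_columns and the column-oriented layout (column name mapped to a list of values) are assumptions; the documentation above does not state how the real method shapes its result or which library it uses.

```python
import csv
from pathlib import Path
from typing import Any


def load_csv_as_columns(file_path: Path) -> dict[str, Any]:
    """Read a CSV file into a dict of column name -> list of string values (assumed layout)."""
    if not file_path.is_file():
        raise FileNotFoundError(f"CSV file not found: {file_path}")
    with file_path.open(newline="") as csv_file:
        reader = csv.DictReader(csv_file)
        # One empty list per header column; an empty file yields an empty dict.
        columns: dict[str, list[Any]] = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name in columns:
                columns[name].append(row[name])
    return columns
```

Values are left as strings in this sketch; any type conversion would be a separate step.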

_populate_pool(eager_termination: bool) bool#

Loads input files, runs validations on the data from the input files, attempts to fix invalid data, then adds data to the pool.

Parameters#

eager_terminationbool

If True, processing stops as soon as invalid data is found that cannot be fixed. If False, processing continues until all of the data has been validated, even if invalid data is found.

Returns#

bool

True if data is valid, otherwise False.

Raises#

KeyError

If a faulty data type is found in a data blob key.

_get_variable_modifiability(variable_name: str, variable_properties: Dict[str, Any]) Modifiability#

Determines the modifiability status of a variable based on its properties and returns the corresponding enum value.

Notes#

This function looks for a ‘modifiability’ key within variable_properties. If present and its value is not empty, the function attempts to map this value to an enum member in Modifiability. If the value does not correspond to any enum members, a KeyError is raised after logging the error. If ‘modifiability’ is absent or its value is empty, the function defaults to Modifiability.NOT_REQUIRED_AND_UNLOCKED.

Parameters#

variable_namestr

The name of the variable for which the modifiability status is being determined. Used for error logging.

variable_propertiesDict[str, Any]

A dictionary containing the properties of the variable, including the ‘modifiability’ property.

Returns#

Modifiability

An enum member representing the variable’s modifiability status.

Raises#

KeyError

If ‘modifiability’ in variable_properties does not match any enum member in Modifiability. The error message includes the invalid modifiability value and suggests valid values.
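
The lookup-with-default behavior described in the Notes can be sketched as follows. This is not the RUFAS implementation: the enum values, the NOT_REQUIRED_AND_LOCKED member, and the case-insensitive lookup by member name are all assumptions made for this example, and the error logging step is omitted.

```python
from enum import Enum
from typing import Any


class Modifiability(Enum):
    # Three member names come from the documentation above; the values and
    # the NOT_REQUIRED_AND_LOCKED member are assumptions for this sketch.
    REQUIRED_AND_LOCKED = "required_and_locked"
    REQUIRED_AND_UNLOCKED = "required_and_unlocked"
    NOT_REQUIRED_AND_LOCKED = "not_required_and_locked"
    NOT_REQUIRED_AND_UNLOCKED = "not_required_and_unlocked"


def get_variable_modifiability(variable_name: str, variable_properties: dict[str, Any]) -> Modifiability:
    raw_value = variable_properties.get("modifiability")
    if not raw_value:
        # Absent or empty: fall back to the documented default.
        return Modifiability.NOT_REQUIRED_AND_UNLOCKED
    try:
        # Assumed convention: match the member name, ignoring case.
        return Modifiability[str(raw_value).upper()]
    except KeyError:
        valid = ", ".join(member.name for member in Modifiability)
        raise KeyError(
            f"Invalid modifiability '{raw_value}' for '{variable_name}'. Valid values: {valid}."
        ) from None
```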

_is_input_required_upon_initialization(variable_name: str, variable_properties: Dict[str, Any]) bool#

Determines whether a variable requires an input value upon initialization based on its modifiability status.

This function utilizes the ‘_get_variable_modifiability’ method to ascertain the modifiability status of the variable identified by ‘variable_name’ and described by ‘variable_properties’. It then checks if the modifiability status is either ‘REQUIRED_AND_LOCKED’ or ‘REQUIRED_AND_UNLOCKED’, indicating that the variable must be initialized with a value.

Parameters#

variable_namestr

The name of the variable being evaluated for its initialization requirements.

variable_propertiesDict[str, Any]

A dictionary containing the properties of the variable, which should include its modifiability status among others.

Returns#

bool

True if the variable’s modifiability status necessitates an input value upon initialization, False otherwise.

_is_modifiable_during_runtime(variable_name: str, variable_properties: Dict[str, Any]) bool#

Checks if a variable can be modified during runtime based on its modifiability status.

This function determines the modifiability status of a variable using the ‘_get_variable_modifiability’ method. It assesses whether the variable, identified by ‘variable_name’ and described by ‘variable_properties’, is allowed to be modified after initialization. A variable is considered modifiable during runtime if its modifiability status is either ‘REQUIRED_AND_UNLOCKED’ or ‘NOT_REQUIRED_AND_UNLOCKED’.

Parameters#

variable_namestr

The name of the variable to check for runtime modifiability.

variable_propertiesDict[str, Any]

A dictionary containing the properties of the variable, including details that determine its modifiability.

Returns#

bool

True if the variable is allowed to be modified during runtime, False otherwise.
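
The two status checks described above (required at initialization, modifiable at runtime) reduce to set membership tests. The sketch below assumes a four-member enum; NOT_REQUIRED_AND_LOCKED is not named in this documentation and is included as an assumption, and the function signatures are simplified to take the status directly.

```python
from enum import Enum, auto


class Modifiability(Enum):
    REQUIRED_AND_LOCKED = auto()
    REQUIRED_AND_UNLOCKED = auto()
    NOT_REQUIRED_AND_LOCKED = auto()  # assumed fourth member
    NOT_REQUIRED_AND_UNLOCKED = auto()


def is_input_required_upon_initialization(status: Modifiability) -> bool:
    # Either REQUIRED_* status means a value must be supplied at start-up.
    return status in {Modifiability.REQUIRED_AND_LOCKED, Modifiability.REQUIRED_AND_UNLOCKED}


def is_modifiable_during_runtime(status: Modifiability) -> bool:
    # Either *_UNLOCKED status means the value may change after start-up.
    return status in {Modifiability.REQUIRED_AND_UNLOCKED, Modifiability.NOT_REQUIRED_AND_UNLOCKED}


for status in Modifiability:
    print(status.name, is_input_required_upon_initialization(status), is_modifiable_during_runtime(status))
```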

_log_missing_data(variable_properties: Dict[str, Any], var_name: str, called_during_initialization: bool) None#

Handles logging for missing data for a variable, logging errors or warnings based on the context of initialization or runtime updates.

Parameters#

variable_propertiesDict[str, Any]

Properties of the variable, potentially including its modifiability status.

var_namestr

The name of the variable with missing data.

called_during_initialization: bool

Boolean variable indicating whether the function is being called during initialization.

Raises#

KeyError

Raised if the missing data is deemed necessary, either during initialization or for a runtime update.

Notes#

This function determines if it’s being called during the initialization phase and checks if the missing variable data is required at this stage using ‘_is_input_required_upon_initialization’. If required, it logs an error and raises a KeyError. If not, it logs a warning.
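
A minimal sketch of the error-or-warning decision is shown below. It uses the standard logging module as a stand-in for whatever logging facility RUFAS uses, and it takes a precomputed is_required flag; how that flag is derived for initialization versus runtime updates is not modeled here. The function and logger names are hypothetical.

```python
import logging

logger = logging.getLogger("input_manager_sketch")  # hypothetical logger name


def log_missing_data(var_name: str, is_required: bool) -> None:
    """Log an error and raise if the missing variable is required; otherwise warn."""
    if is_required:
        logger.error("Required variable '%s' is missing.", var_name)
        raise KeyError(f"Required variable '{var_name}' is missing.")
    logger.warning("Variable '%s' is missing but not required.", var_name)
```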

get_data(data_address: str) Any#

Get the requested data from the pool if it exists. If not, None is returned.

Parameters#

data_addressstr

The address of the requested data.

Returns#

Any

The requested data if found. None otherwise.

Examples#

The user can request as broad or narrow a selection of the input data pool as is needed.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return the value of calf_num of the herd_information section in the animal blob (in this example, the value for calf_num is 8):

>>> input_manager.get_data('animal.herd_information.calf_num')
8

If a broader range of data is needed, the user can expand the query to get_data by shortening the data_address. This will return the full herd_information object:

>>> input_manager.get_data('animal.herd_information')
{ calf_num: 8, heiferI_num: 44, heiferII_num: 38, heiferIII_num_springers: 5, cow_num: 100, herd_num: 187, herd_init: False, breed: HO }

If the requested data does not exist, the method will return None:

>>> input_manager.get_data('animal.herd_information.nonexistent_property')
None
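
The dot-separated addressing scheme can be illustrated with a small stand-alone sketch that walks a nested dictionary. The get_by_address helper and the sample pool are hypothetical; the actual method may differ in details such as logging of lookups.

```python
from typing import Any


def get_by_address(pool: dict[str, Any], data_address: str) -> Any:
    """Walk a nested dict along a dot-separated address; return None if any part is missing."""
    current: Any = pool
    for key in data_address.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


pool = {"animal": {"herd_information": {"calf_num": 8, "cow_num": 100}}}
print(get_by_address(pool, "animal.herd_information.calf_num"))
print(get_by_address(pool, "animal.herd_information"))
print(get_by_address(pool, "animal.herd_information.nonexistent_property"))
```

Because None is returned for a missing address, a stored value of None cannot be told apart from an absent one with this approach; an existence check is the way to distinguish the two.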

check_property_exists_in_pool(data_address: str) bool#

Check if the requested property exists in the pool.

Parameters#

data_addressstr

The address of the requested property.

Returns#

bool

True if the property exists in the pool, False otherwise.

Examples#

The user can check if a property exists in the pool.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return True if the property calf_num exists in the herd_information section of the animal blob:

>>> input_manager.check_property_exists_in_pool('animal.herd_information.calf_num')
True

If the property does not exist, the method will return False:

>>> input_manager.check_property_exists_in_pool('animal.herd_information.nonexistent_property')
False

get_metadata(metadata_address: str) Any#

Get the requested metadata from the IM metadata dictionary.

Parameters#

metadata_addressstr

The address of the requested metadata.

Returns#

Any

The requested metadata if found.

Raises#

KeyError

If the requested metadata is not found.

Examples#

The user can request as broad or narrow a selection of the metadata as is needed.

Input Manager must first be instantiated:

>>> input_manager = InputManager()

This will return the ‘type’ for albedo in the soil_profile_properties section of the metadata’s properties (the type for albedo is number):

>>> input_manager.get_metadata('properties.soil_profile_properties.albedo.type')
"number"

If a broader range of the metadata is needed, the user can expand the query to get_metadata by shortening the metadata_address. This will return the full ‘albedo’ object containing its type, description, minimum, maximum, and default:

>>> input_manager.get_metadata('properties.soil_profile_properties.albedo')
{ "type": "number", "description": "Ratio of solar radiation reflected by soil to amount of incident upon it. Unitless. Reference: SWAT Input .SOL - SOL_ALB", "minimum": 0.0, "maximum": 1.0, "default": 0.16 }

get_data_keys_by_properties(target_properties: str) list[str]#

Retrieves the list of metadata keys that point to data which have the target_properties.

Parameters#

target_propertiesstr

The name of the metadata properties group that is being searched for.

Returns#

list[str]

List of keys which point to data within the Input Manager’s data pool that adhere to the target metadata properties.

Examples#

If the metadata looked like the following:

```
{
    "files": {
        "field_1": {
            "properties": "field_properties", …
        },
        "soil_1": {
            "properties": "soil_profile_properties", …
        },
        "field_2": {
            "properties": "field_properties", …
        },
        "properties": {…}, …
    }
}
```

Then the call get_data_keys_by_properties("field_properties") would be expected to return the list ["field_1", "field_2"].

Notes#

If no keys have the specified property, the method returns an empty list.
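
The selection logic can be approximated with a list comprehension over the "files" section of the metadata. This is a sketch under the assumption that each file entry is a dictionary with a "properties" string, as in the example above; get_keys_by_properties is a hypothetical stand-alone function, not the method itself.

```python
from typing import Any


def get_keys_by_properties(metadata: dict[str, Any], target_properties: str) -> list[str]:
    """Return the keys under "files" whose "properties" value equals target_properties."""
    files = metadata.get("files", {})
    return [
        key
        for key, entry in files.items()
        if isinstance(entry, dict) and entry.get("properties") == target_properties
    ]


metadata = {
    "files": {
        "field_1": {"properties": "field_properties"},
        "soil_1": {"properties": "soil_profile_properties"},
        "field_2": {"properties": "field_properties"},
    }
}
print(get_keys_by_properties(metadata, "field_properties"))
print(get_keys_by_properties(metadata, "unknown_properties"))
```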

flush_pool() None#

Clear the variable pool.

_metadata_properties_exist(variable_name: str, properties_blob_key: str) bool#

Checks if specific properties exist in the metadata for a given variable.

Notes#

This function verifies that the specified properties exist within the metadata for a particular variable. It returns a boolean indicating whether the properties exist, and raises a ValueError if no metadata is loaded or a KeyError if the properties cannot be found.

Parameters#

variable_namestr

The name of the variable for which the metadata is to be checked.

properties_blob_keystr

The key representing the specific properties blob in the metadata to check.

Returns#

bool

True if the properties exist, False otherwise.

Raises#

ValueError

If no metadata is loaded in InputManager.__metadata.

KeyError

If no metadata properties can be found with the given properties_blob_key.

_add_variable_to_pool(variable_name: str, input_data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) bool#

Adds a variable to the pool after validating its data against specified metadata properties.

Notes#

This function processes and validates the input data for a variable based on its metadata properties, attempting to fix any invalid elements. If all elements are valid or successfully fixed, the data is added to the pool. The function supports eager termination, which halts the process early if invalid data is encountered or if an attempt is made to modify a non-modifiable variable during runtime.

Parameters#

variable_namestr

The name of the variable to be added to the pool.

input_dataDict[str, Any]

The data associated with the variable that needs validation and addition to the pool.

properties_blob_keystr

The key in the metadata properties against which the data is validated.

eager_terminationbool

Flag indicating whether the function should return early in case of invalid data.

Returns#

bool

True if the variable is successfully added, False otherwise.

Raises#

ValueError

If eager_termination is True and the variable failed validation.

_prepare_data(variable_name: str, input_data: dict[str, Any], properties_blob_key: str) Tuple[Dict[str, Any], Dict[str, Any]]#

Prepare data and metadata properties for validation.

Parameters#

variable_namestr

The name of the variable to be added to the pool.

input_dataDict[str, Any]

The data associated with the variable that needs validation and addition to the pool.

properties_blob_keystr

The key in the metadata properties against which the data is validated.

Returns#

Tuple[Dict[str, Any], Dict[str, Any]]

Prepared data and metadata properties.

_check_modifiability(variable_name: str, metadata_properties: dict[str, Any], eager_termination: bool) bool#

Checks whether a variable is allowed to be modified at runtime.

Parameters#

variable_namestr

The name of the variable to be added to the pool.

metadata_propertiesdict[str, Any]

Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.

eager_terminationbool

Indicator for the need of eager termination.

Returns#

bool

Indicator for whether the data is modifiable.

Raises#

PermissionError

If eager_termination is True and the variable is not modifiable during runtime.
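
The raise-or-return pattern used by eager termination can be sketched as below. The function takes a precomputed modifiable flag instead of metadata properties, which is a simplification for illustration; the name and signature are hypothetical.

```python
def check_modifiability(variable_name: str, modifiable: bool, eager_termination: bool) -> bool:
    """Return True if modifiable; otherwise raise (eager) or return False (non-eager)."""
    if modifiable:
        return True
    if eager_termination:
        raise PermissionError(f"Variable '{variable_name}' cannot be modified during runtime.")
    return False
```

With eager termination disabled, the caller is expected to inspect the returned boolean and skip the update itself.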

_validate_data(data: dict[str, Any], metadata_properties: dict[str, Any], eager_termination: bool, properties_blob_key: str, elements_counter: ElementsCounter) dict[str, Any]#

Validate input data based on metadata properties.

Parameters#

datadict[str, Any]

Data to be validated.

metadata_propertiesdict[str, Any]

Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.

eager_terminationbool

Indicator for the need of eager termination.

properties_blob_keystr

The key in the metadata properties against which the data is validated.

elements_counterElementsCounter

An ElementsCounter object to keep track of status of variables.

Returns#

dict[str, Any]

A dictionary of validated data.

_add_to_pool(variable_name: str, validated_data: dict[str, Any]) None#

Add validated data to the pool.

Parameters#

variable_namestr

The name of the variable to be added to the pool.

validated_datadict[str, Any]

A dictionary of validated data.

add_runtime_variable_to_pool(variable_name: str, data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) bool#

Adds a variable to the InputManager’s pool after validating it against metadata.

Notes#

This function takes in a variable along with its name and a key to access its validation metadata. It validates the data against the provided metadata and adds the data to the InputManager pool if it is valid.

Parameters#

variable_name: str

The name of the dictionary variable to be added.

dataDict[str, Any]

The data of the variable, structured as a dictionary.

properties_blob_keystr

A key used to locate the metadata for validation of the variable.

eager_terminationbool

If True, a ValueError will be raised from _add_variable_to_pool() when the variable is invalid. If False, the function returns False.

Returns#

bool

True if the variable is successfully validated and added to the pool. False if the variable is invalid and not added to the pool.

Raises#

TypeError

If data is not the expected type of Dict[str, Any].

dump_get_data_logs(path: Path) None#

Dumps the stored get data logs to a JSON file at the specified path.

Parameters#

pathPath

The directory path where the JSON file will be saved.

save_metadata_properties(output_dir: Path) None#

Saves metadata properties in CSV format.

Parameters#

output_dirPath

The path to the output directory where the metadata properties CSV will be saved.

Raises#

FileNotFoundError

If the file cannot be saved at the specified path.

PermissionError

If the user does not have permission to save the file at the specified path.

OSError

For any other unexpected error that occurs while trying to save the CSV.

_parse_metadata_properties(data: Dict[str, Any], prefix: str = '', sep: str = '_') List[Dict[str, Any]]#

Recursively traverse through the metadata properties dictionary to flatten it by creating a record for each entry.

Parameters#

dataDict[str, Any]

The metadata properties data to be parsed.

prefixstr, optional

The data record prefix, by default ‘’.

sepstr, optional

The separator used between parts of the data entry names, by default ‘_’.

Returns#

List[Dict[str, Any]]

A list of flattened data entries from the json file.
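
A recursive flattening of nested property definitions might look like the following sketch. The rule used to detect a leaf entry (a dictionary whose "type" value is a string) and the record layout (a "name" key plus the entry's own fields) are assumptions for illustration; the actual record format is produced by _create_record and is not reproduced here.

```python
from typing import Any


def flatten_properties(data: dict[str, Any], prefix: str = "", sep: str = "_") -> list[dict[str, Any]]:
    """Flatten nested property definitions into one record per leaf entry.

    Assumption: an entry is a leaf when it is a dict whose "type" value is a string.
    """
    records: list[dict[str, Any]] = []
    for key, value in data.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if not isinstance(value, dict):
            continue  # scalar attributes belong to the parent entry
        if isinstance(value.get("type"), str):
            records.append({"name": name, **value})
        else:
            # Not a leaf: descend, carrying the accumulated name as the prefix.
            records.extend(flatten_properties(value, prefix=name, sep=sep))
    return records


properties = {
    "soil_profile_properties": {
        "albedo": {"type": "number", "minimum": 0.0, "maximum": 1.0, "default": 0.16},
    }
}
print(flatten_properties(properties))
```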

_check_property_type_primitive(property: Dict[str, Any]) bool#

Checks whether the property’s “type” is primitive or an array of primitive types.

_create_record(data_entry: Dict[str, Any], name: str) Dict[str, Any]#

Assembles a record to a specific format to match the columns of the CSV to which it will eventually be added.

Parameters#

data_entryDict[str, Any]

The data entry from the json file to be converted into the record format.

namestr

The name to be used for the record.

Returns#

Dict[str, Any]

A dictionary of the data entry converted to the record format.

compare_metadata_properties(properties_file_path: Path, comparison_properties_file_path: Path, output_directory: Path) None#

Compares two metadata properties json files using the DeepDiff package and saves the results in a text file.

export_pool_to_csv(output_prefix: str, output_path: Path) None#

Flattens the input data of interest and exports the variables with their values to a CSV file.

Parameters#

output_prefix: str

The output prefix for the current task.

output_path: Path

The folder to save the output CSV.