RUFAS.input_manager module#
- RUFAS.input_manager.FIXABLE_INPUT_DATA_TYPES: set[str] = {'bool', 'number', 'string'}#
Set enumerating the input data formats the Input Manager can accept.
- class RUFAS.input_manager.InputManager(metadata_depth_limit: int | None = None)#
Bases:
object
Input Manager class responsible for loading, validating, and providing access to input data.
- __instance = None#
- __init__(metadata_depth_limit: int | None = None) None #
- property meta_data: Dict[str, Any]#
The getter method for __metadata.
- property pool: Dict[str, Any]#
The getter method for __pool.
- start_data_processing(metadata_path: Path, eager_termination: bool = True) bool #
Starts the pipeline for organizing metadata and input data processing.
Parameters#
- metadata_pathPath
File path to the metadata.
- eager_terminationbool, default=True
If True, processing terminates as soon as invalid data is found that cannot be fixed. If False, processing terminates only after all of the data has been checked and validated.
Returns#
- bool
True if data is valid, otherwise False.
- _load_metadata(metadata_path: Path) None #
Loads metadata from a JSON file into the Input Manager’s metadata dictionary.
Parameters#
- metadata_pathPath
The path to the metadata file.
Raises#
- Exception
If an error occurs while opening or reading the metadata_path file.
- _load_properties() None #
Loads properties data from a specified JSON file and updates the metadata.
This method reads the properties file path from the metadata, checks if the file exists, and then loads the properties into the metadata. The original properties data in the metadata is first copied to a separate attribute for future reference and then removed from the metadata files section.
Raises#
- FileNotFoundError
If the properties file does not exist at the specified path.
- json.JSONDecodeError
If there is an error in decoding the JSON file.
- Exception
For any other unexpected errors during properties loading.
- _load_data_from_json(file_path: Path) Dict[str, Any] #
Loads data from an input JSON file.
Parameters#
- file_pathPath
Path to the input file to load.
Returns#
- Dict[str, Any]
The data dictionary loaded from the json file.
Raises#
- Exception
For any unexpected errors during JSON file loading.
- _load_data_from_csv(file_path: Path) Dict[str, Any] #
Loads data from an input CSV file.
Parameters#
- file_pathPath
Path to the input file to load.
Returns#
- Dict[str, Any]
The data dictionary loaded from the CSV file.
Raises#
- FileNotFoundError
If the CSV file does not exist at the specified path.
- Exception
For any other unexpected errors during CSV file loading.
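As a rough illustration of a CSV loader with this signature and error behavior, the sketch below uses only the standard library. The column-to-list shape of the returned dictionary, the function name without the leading underscore, and the sample file are assumptions made for the example; this page does not state how RUFAS structures the loaded CSV data.

```python
import csv
import tempfile
from pathlib import Path
from typing import Any


def load_data_from_csv(file_path: Path) -> dict[str, Any]:
    """Hypothetical stand-in: read a CSV file into {column name: list of cell strings}."""
    if not file_path.is_file():
        raise FileNotFoundError(f"CSV file not found: {file_path}")
    with file_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # One list per header column (assumed layout, not confirmed by the docs).
        columns: dict[str, list[str]] = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name in columns:
                columns[name].append(row[name])
    return columns


# Demo with a throwaway file; the column names are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.csv"
    sample_path.write_text("day,rain\n1,0.0\n2,3.5\n", encoding="utf-8")
    sample = load_data_from_csv(sample_path)
```

Values stay as strings in this sketch; any type conversion would be a separate validation step.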
- _populate_pool(eager_termination: bool) bool #
Loads input files, runs validations on the data from the input files, attempts to fix invalid data, then adds data to the pool.
Parameters#
- eager_terminationbool
If True, processing terminates as soon as invalid data is found that cannot be fixed. If False and invalid data is found, processing terminates only after all of the data has been checked and validated.
Returns#
- bool
True if data is valid, otherwise False.
Raises#
- KeyError
If a faulty data type is found in a data blob key.
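The difference between the two eager_termination modes can be sketched with a small standalone loop. The helper name `validate_all`, the `is_valid` callback, and the sample records are hypothetical and only illustrate the control flow described above, not the actual RUFAS validation logic.

```python
from typing import Any, Callable


def validate_all(
    records: dict[str, Any],
    is_valid: Callable[[Any], bool],
    eager_termination: bool = True,
) -> tuple[bool, list[str]]:
    """Return (all_valid, names_checked) for a set of named values."""
    all_valid = True
    checked: list[str] = []
    for name, value in records.items():
        checked.append(name)
        if not is_valid(value):
            all_valid = False
            if eager_termination:
                break  # stop at the first value that fails validation
    return all_valid, checked


def non_negative(value: Any) -> bool:
    return value >= 0


records = {"a": 1, "b": -5, "c": 3}
eager_result = validate_all(records, non_negative, eager_termination=True)
full_result = validate_all(records, non_negative, eager_termination=False)
```

Both modes report the data as invalid; the non-eager mode simply keeps going so that every problem can be reported in one pass.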
- _get_variable_modifiability(variable_name: str, variable_properties: Dict[str, Any]) Modifiability #
Determines the modifiability status of a variable based on its properties and returns the corresponding enum value.
Notes#
This function looks for a ‘modifiability’ key within variable_properties. If present and its value is not empty, the function attempts to map this value to an enum member in Modifiability. If the value does not correspond to any enum members, a KeyError is raised after logging the error. If ‘modifiability’ is absent or its value is empty, the function defaults to Modifiability.NOT_REQUIRED_AND_UNLOCKED.
Parameters#
- variable_namestr
The name of the variable for which the modifiability status is being determined. Used for error logging.
- variable_propertiesDict[str, Any]
A dictionary containing the properties of the variable, including the ‘modifiability’ property if one is set.
Returns#
- Modifiability
An enum member representing the variable’s modifiability status.
Raises#
- KeyError
If ‘modifiability’ in variable_properties does not match any enum member in Modifiability. The error message includes the invalid modifiability value and suggests valid values.
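The lookup-with-default behavior described in the Notes can be sketched as follows. The enum here lists only the members named on this page (the real Modifiability enum may define more), and the member values and the case-insensitive name lookup are assumptions made for the example.

```python
from enum import Enum
from typing import Any


class Modifiability(Enum):
    # Only the members mentioned in this page; values are placeholders.
    REQUIRED_AND_LOCKED = "required_and_locked"
    REQUIRED_AND_UNLOCKED = "required_and_unlocked"
    NOT_REQUIRED_AND_UNLOCKED = "not_required_and_unlocked"


def get_variable_modifiability(
    variable_name: str, variable_properties: dict[str, Any]
) -> Modifiability:
    raw = variable_properties.get("modifiability")
    if not raw:
        # Absent or empty value falls back to the documented default.
        return Modifiability.NOT_REQUIRED_AND_UNLOCKED
    try:
        # Assumed mapping: look the value up by enum member name.
        return Modifiability[str(raw).upper()]
    except KeyError:
        valid = ", ".join(member.name for member in Modifiability)
        raise KeyError(
            f"Invalid modifiability '{raw}' for '{variable_name}'. Valid values: {valid}"
        ) from None
```

The real method also logs the error before raising; logging is left out here to keep the sketch short.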
- _is_input_required_upon_initialization(variable_name: str, variable_properties: Dict[str, Any]) bool #
Determines whether a variable requires an input value upon initialization based on its modifiability status.
This function utilizes the ‘_get_variable_modifiability’ method to ascertain the modifiability status of the variable identified by ‘variable_name’ and described by ‘variable_properties’. It then checks if the modifiability status is either ‘REQUIRED_AND_LOCKED’ or ‘REQUIRED_AND_UNLOCKED’, indicating that the variable must be initialized with a value.
Parameters#
- variable_namestr
The name of the variable being evaluated for its initialization requirements.
- variable_propertiesDict[str, Any]
A dictionary containing the properties of the variable, which should include its modifiability status among others.
Returns#
- bool
True if the variable’s modifiability status necessitates an input value upon initialization, False otherwise.
- _is_modifiable_during_runtime(variable_name: str, variable_properties: Dict[str, Any]) bool #
Checks if a variable can be modified during runtime based on its modifiability status.
This function determines the modifiability status of a variable using the ‘_get_variable_modifiability’ method. It assesses whether the variable, identified by ‘variable_name’ and described by ‘variable_properties’, is allowed to be modified after initialization. A variable is considered modifiable during runtime if its modifiability status is either ‘REQUIRED_AND_UNLOCKED’ or ‘NOT_REQUIRED_AND_UNLOCKED’.
Parameters#
- variable_namestr
The name of the variable to check for runtime modifiability.
- variable_propertiesDict[str, Any]
A dictionary containing the properties of the variable, including details that determine its modifiability.
Returns#
- bool
True if the variable is allowed to be modified during runtime, False otherwise.
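The two status checks described for _is_input_required_upon_initialization and _is_modifiable_during_runtime reduce to set membership tests. The sketch below takes an already-resolved status rather than a name and properties dictionary, and its enum contains only the members named on this page; both simplifications are assumptions for illustration.

```python
from enum import Enum, auto


class Modifiability(Enum):
    # Only the members mentioned in this page; the real enum may define more.
    REQUIRED_AND_LOCKED = auto()
    REQUIRED_AND_UNLOCKED = auto()
    NOT_REQUIRED_AND_UNLOCKED = auto()


REQUIRED_STATUSES = {
    Modifiability.REQUIRED_AND_LOCKED,
    Modifiability.REQUIRED_AND_UNLOCKED,
}
UNLOCKED_STATUSES = {
    Modifiability.REQUIRED_AND_UNLOCKED,
    Modifiability.NOT_REQUIRED_AND_UNLOCKED,
}


def is_input_required_upon_initialization(status: Modifiability) -> bool:
    """True when a value must be supplied at initialization."""
    return status in REQUIRED_STATUSES


def is_modifiable_during_runtime(status: Modifiability) -> bool:
    """True when the value may be changed after initialization."""
    return status in UNLOCKED_STATUSES
```

Note that REQUIRED_AND_UNLOCKED satisfies both checks: it must be provided up front and may still change later.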
- _log_missing_data(variable_properties: Dict[str, Any], var_name: str, called_during_initialization: bool) None #
Handles logging for missing data for a variable, logging errors or warnings based on the context of initialization or runtime updates.
Parameters#
- variable_propertiesDict[str, Any]
Properties of the variable, potentially including its modifiability status.
- var_namestr
The name of the variable with missing data.
- called_during_initialization: bool
Boolean indicating whether the function is being called during initialization.
Raises#
- KeyError
Raised if the missing data is deemed necessary, either during initialization or for a runtime update.
Notes#
This function determines if it’s being called during the initialization phase and checks if the missing variable data is required at this stage using ‘_is_input_required_upon_initialization’. If required, it logs an error and raises a KeyError. If not, it logs a warning.
- get_data(data_address: str) Any #
Get the requested data from the pool if it exists. If not, None is returned.
Parameters#
- data_addressstr
The address of the requested data.
Returns#
- Any
The requested data if found. None otherwise.
Examples#
The user can request as broad or narrow a selection of the input data pool as is needed.
Input Manager must first be instantiated:
>>> input_manager = InputManager()
This will return the value of calf_num of the herd_information section in the animal blob (in this example, the value for calf_num is 8):
>>> input_manager.get_data('animal.herd_information.calf_num')
8
If a broader range of data is needed, the user can expand the query to get_data by shortening the data_address. This will return the full herd_information object:
>>> input_manager.get_data('animal.herd_information')
{
    calf_num: 8,
    heiferI_num: 44,
    heiferII_num: 38,
    heiferIII_num_springers: 5,
    cow_num: 100,
    herd_num: 187,
    herd_init: False,
    breed: HO
}
If the requested data does not exist, the method will return None:
>>> input_manager.get_data('animal.herd_information.nonexistent_property')
None
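The dotted-address behavior shown in these examples can be approximated by walking a nested dictionary one key at a time. This is a minimal sketch that takes the pool as an explicit argument; the real method reads the Input Manager's own pool, and its internals are not shown on this page.

```python
from typing import Any


def get_data(pool: dict[str, Any], data_address: str) -> Any:
    """Follow a dot-separated address through nested dicts; None if any part is missing."""
    current: Any = pool
    for key in data_address.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Sample pool built from the values used in the examples above.
pool = {"animal": {"herd_information": {"calf_num": 8, "cow_num": 100}}}
narrow = get_data(pool, "animal.herd_information.calf_num")
broad = get_data(pool, "animal.herd_information")
missing = get_data(pool, "animal.herd_information.nonexistent_property")
```

Shortening the address returns the enclosing dictionary, which is what makes the "broad or narrow" selection possible.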
- check_property_exists_in_pool(data_address: str) bool #
Check if the requested property exists in the pool.
Parameters#
- data_addressstr
The address of the requested property.
Returns#
- bool
True if the property exists in the pool, False otherwise.
Examples#
The user can check if a property exists in the pool.
Input Manager must first be instantiated:
>>> input_manager = InputManager()
This will return True if the property calf_num exists in the herd_information section of the animal blob:
>>> input_manager.check_property_exists_in_pool('animal.herd_information.calf_num')
True
If the property does not exist, the method will return False:
>>> input_manager.check_property_exists_in_pool('animal.herd_information.nonexistent_property')
False
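One reason a dedicated existence check can be useful is that a None result from a lookup is ambiguous when a stored value may itself be None. The sketch below shows one way to tell the two cases apart with a sentinel object; the sentinel, the explicit pool argument, and the sample data are assumptions, and this page does not say how RUFAS implements the check.

```python
from typing import Any

_MISSING = object()  # sentinel that can never be a stored value


def _resolve(pool: dict[str, Any], data_address: str) -> Any:
    """Walk the dotted address; return _MISSING if any key is absent."""
    current: Any = pool
    for key in data_address.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def check_property_exists_in_pool(pool: dict[str, Any], data_address: str) -> bool:
    return _resolve(pool, data_address) is not _MISSING


# Hypothetical pool in which one property is present but set to None.
pool = {"animal": {"herd_information": {"calf_num": 8, "breed": None}}}
```

With this pool, the address `animal.herd_information.breed` exists even though its value is None.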
- get_metadata(metadata_address: str) Any #
Get the requested metadata from the IM metadata dictionary.
Parameters#
- metadata_addressstr
The address of the requested metadata.
Returns#
- Any
The requested metadata if found.
Raises#
- KeyError
If the requested metadata is not found.
Examples#
The user can request as broad or narrow a selection of the metadata as is needed.
Input Manager must first be instantiated:
>>> input_manager = InputManager()
This will return the ‘type’ for albedo in the soil_profile_properties section of the metadata’s properties (the type for albedo is number):
>>> input_manager.get_metadata('properties.soil_profile_properties.albedo.type')
"number"
If a broader range of the metadata is needed, the user can expand the query to get_metadata by shortening the metadata_address. This will return the full ‘albedo’ object containing its type, description, minimum, maximum, and default:
>>> input_manager.get_metadata('properties.soil_profile_properties.albedo')
{
    "type": "number",
    "description": "Ratio of solar radiation reflected by soil to amount of incident upon it. Unitless. Reference: SWAT Input .SOL - SOL_ALB",
    "minimum": 0.0,
    "maximum": 1.0,
    "default": 0.16
}
- get_data_keys_by_properties(target_properties: str) list[str] #
Retrieves the list of metadata keys that point to data which have the target_properties.
Parameters#
- target_propertiesstr
The name of the metadata properties group that is being searched for.
Returns#
- list[str]
List of keys which point to data within the Input Manager’s data pool that adhere to the target metadata properties.
Examples#
If the metadata looked like the following:
```
{
    "files": {
        "field_1": {
            "properties": "field_properties", …
        },
        "soil_1": {
            "properties": "soil_profile_properties", …
        },
        "field_2": {
            "properties": "field_properties", …
        }
    },
    "properties": {…}, …
}
```
Then the call get_data_keys_by_properties("field_properties") would be expected to return the list ["field_1", "field_2"].
Notes#
If no keys have the specified property, the method returns an empty list.
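The filtering described above can be sketched as a list comprehension over the "files" section of the metadata. The function below takes the metadata as an explicit argument and the sample dictionary mirrors the example; both are illustrative assumptions rather than the actual implementation.

```python
from typing import Any


def get_data_keys_by_properties(metadata: dict[str, Any], target_properties: str) -> list[str]:
    """Return the keys under "files" whose "properties" entry equals target_properties."""
    return [
        key
        for key, entry in metadata.get("files", {}).items()
        if isinstance(entry, dict) and entry.get("properties") == target_properties
    ]


# Sample metadata shaped like the example above (other fields omitted).
metadata = {
    "files": {
        "field_1": {"properties": "field_properties"},
        "soil_1": {"properties": "soil_profile_properties"},
        "field_2": {"properties": "field_properties"},
    },
    "properties": {},
}
field_keys = get_data_keys_by_properties(metadata, "field_properties")
```

Because dictionaries preserve insertion order, the keys come back in the order they appear in the metadata.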
- flush_pool() None #
Clear the variable pool.
- _metadata_properties_exist(variable_name: str, properties_blob_key: str) bool #
Checks if specific properties exist in the metadata for a given variable.
Notes#
This function verifies that the specified properties exist within the metadata for a particular variable. It returns a boolean indicating whether the properties exist, raises a ValueError if no metadata has been loaded, and raises a KeyError if the properties cannot be found.
Parameters#
- variable_namestr
The name of the variable for which the metadata is to be checked.
- properties_blob_keystr
The key representing the specific properties blob in the metadata to check.
Returns#
- bool
True if the properties exist, False otherwise.
Raises#
- ValueError
If no metadata is loaded in InputManager.__metadata.
- KeyError
If no metadata properties can be found with the given properties_blob_key.
- _add_variable_to_pool(variable_name: str, input_data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) bool #
Adds a variable to the pool after validating its data against specified metadata properties.
Notes#
This function processes and validates the input data for a variable based on its metadata properties, attempting to fix any invalid elements. If all elements are valid or successfully fixed, the data is added to the pool. The function supports eager termination, which halts the process early if invalid data is encountered or if an attempt is made to modify a non-modifiable variable during runtime.
Parameters#
- variable_namestr
The name of the variable to be added to the pool.
- input_dataDict[str, Any]
The data associated with the variable that needs validation and addition to the pool.
- properties_blob_keystr
The key in the metadata properties against which the data is validated.
- eager_terminationbool
Flag indicating whether the function should return early in case of invalid data.
Returns#
- bool
True if the variable is successfully added, False otherwise.
Raises#
- ValueError
If eager_termination is True and the variable failed validation.
- _prepare_data(variable_name: str, input_data: dict[str, Any], properties_blob_key: str) Tuple[Dict[str, Any], Dict[str, Any]] #
Prepare data and metadata properties for validation.
Parameters#
- variable_namestr
The name of the variable to be added to the pool.
- input_dataDict[str, Any]
The data associated with the variable that needs validation and addition to the pool.
- properties_blob_keystr
The key in the metadata properties against which the data is validated.
Returns#
- Tuple[Dict[str, Any], Dict[str, Any]]
Prepared data and metadata properties.
- _check_modifiability(variable_name: str, metadata_properties: dict[str, Any], eager_termination: bool) bool #
Checks whether a variable is allowed to be modified at runtime.
Parameters#
- variable_namestr
The name of the variable to be added to the pool.
- metadata_propertiesdict[str, Any]
Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.
- eager_terminationbool
Indicator for the need of eager termination.
Returns#
- bool
Indicator for whether the data is modifiable.
Raises#
- PermissionError
If eager_termination is True and the variable is not modifiable during runtime.
- _validate_data(data: dict[str, Any], metadata_properties: dict[str, Any], eager_termination: bool, properties_blob_key: str, elements_counter: ElementsCounter) dict[str, Any] #
Validate input data based on metadata properties.
Parameters#
- datadict[str, Any]
Data to be validated.
- metadata_propertiesdict[str, Any]
Metadata for each property of a variable, including details like type, description, modifiability, and validation constraints.
- eager_terminationbool
Indicator for the need of eager termination.
- properties_blob_keystr
The key in the metadata properties against which the data is validated.
- elements_counterElementsCounter
An ElementsCounter object to keep track of status of variables.
Returns#
- dict[str, Any]
A dictionary of validated data.
- _add_to_pool(variable_name: str, validated_data: dict[str, Any]) None #
Add validated data to the pool.
Parameters#
- variable_namestr
The name of the variable to be added to the pool.
- validated_datadict[str, Any]
A dictionary of validated data.
- add_runtime_variable_to_pool(variable_name: str, data: Dict[str, Any], properties_blob_key: str, eager_termination: bool) bool #
Adds a variable to the InputManager’s pool after validating it against metadata.
Notes#
This function takes in a variable along with its name and a key to access its validation metadata. It validates the data against the provided metadata and adds the data to the InputManager pool if it is valid.
Parameters#
- variable_name: str
The name of the dictionary variable to be added.
- dataDict[str, Any]
The data of the variable, structured as a dictionary.
- properties_blob_keystr
A key used to locate the metadata for validation of the variable.
- eager_terminationbool
If True, a ValueError will be raised from _add_variable_to_pool() when the variable is invalid. If False, the function returns False.
Returns#
- bool
True if the variable is successfully validated and added to the pool. False if the variable is invalid and not added to the pool.
Raises#
- TypeError
If data is not the expected type of Dict[str, Any].
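The documented outcomes of this method (TypeError for non-dictionary data, ValueError under eager termination, False otherwise, True on success) can be laid out in a compact sketch. The `is_valid` callback stands in for the metadata-driven validation and the pool is passed explicitly; both are hypothetical simplifications.

```python
from typing import Any, Callable


def add_runtime_variable_to_pool(
    pool: dict[str, Any],
    variable_name: str,
    data: Any,
    is_valid: Callable[[dict[str, Any]], bool],
    eager_termination: bool,
) -> bool:
    if not isinstance(data, dict):
        raise TypeError(f"'{variable_name}' must be a dict, got {type(data).__name__}")
    if not is_valid(data):
        if eager_termination:
            raise ValueError(f"'{variable_name}' failed validation")
        return False  # invalid data is never added to the pool
    pool[variable_name] = data
    return True


def has_integer_count(data: dict[str, Any]) -> bool:
    # Toy validation rule used only for this example.
    return isinstance(data.get("count"), int)


pool: dict[str, Any] = {}
added = add_runtime_variable_to_pool(pool, "herd", {"count": 3}, has_integer_count, True)
rejected = add_runtime_variable_to_pool(pool, "bad", {"count": "x"}, has_integer_count, False)
```

Callers that prefer to handle failures themselves can pass eager_termination=False and check the return value.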
- dump_get_data_logs(path: Path) None #
Dumps the stored get data logs to a JSON file at the specified path.
Parameters#
- pathPath
The directory path where the JSON file will be saved.
- save_metadata_properties(output_dir: Path) None #
Saves metadata properties in CSV format.
Parameters#
- output_dirPath
The path to the output directory where the metadata properties CSV will be saved.
Raises#
- FileNotFoundError
If the file cannot be saved at the specified path.
- PermissionError
If the user does not have permission to save the file at the specified path.
- OSError
For any other unexpected error that occurs while trying to save the CSV.
- _parse_metadata_properties(data: Dict[str, Any], prefix: str = '', sep: str = '_') List[Dict[str, Any]] #
Recursively traverse through the metadata properties dictionary to flatten it by creating a record for each entry.
Parameters#
- dataDict[str, Any]
The metadata properties data to be parsed.
- prefixstr, optional
The data record prefix, by default ‘’.
- sepstr, optional
The separator used between parts of the data entry names, by default ‘_’.
Returns#
- List[Dict[str, Any]]
A list of flattened data entries from the json file.
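The recursive flattening with prefix and sep can be sketched as below. Treating an entry as a leaf when its "type" is one of the primitive names in FIXABLE_INPUT_DATA_TYPES, and building each record as the entry plus a "name" field, are assumptions made for the example; arrays of primitives and the real record layout are not covered.

```python
from typing import Any

PRIMITIVE_TYPES = {"bool", "number", "string"}


def parse_metadata_properties(
    data: dict[str, Any], prefix: str = "", sep: str = "_"
) -> list[dict[str, Any]]:
    """Flatten nested property definitions into one record per primitive entry."""
    records: list[dict[str, Any]] = []
    for key, value in data.items():
        if not isinstance(value, dict):
            continue  # skip scalar attributes such as a group's description
        name = f"{prefix}{sep}{key}" if prefix else key
        entry_type = value.get("type")
        if isinstance(entry_type, str) and entry_type in PRIMITIVE_TYPES:
            records.append({"name": name, **value})  # leaf: emit a record
        else:
            records.extend(parse_metadata_properties(value, name, sep))  # group: recurse
    return records


# Hypothetical nested properties; only the albedo values come from this page.
properties = {
    "soil_profile_properties": {
        "albedo": {"type": "number", "minimum": 0.0, "maximum": 1.0, "default": 0.16},
        "layers": {"depth": {"type": "number"}},
    }
}
flat = parse_metadata_properties(properties)
```

Each record name is the path to the entry joined with sep, so nested groups stay distinguishable after flattening.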
- _check_property_type_primitive(property: Dict[str, Any]) bool #
Checks whether the property’s “type” is primitive or an array of primitive types.
- _create_record(data_entry: Dict[str, Any], name: str) Dict[str, Any] #
Assembles a record to a specific format to match the columns of the CSV to which it will eventually be added.
Parameters#
- data_entryDict[str, Any]
The data entry from the json file to be converted into the record format.
- namestr
The name to be used for the record.
Returns#
- Dict[str, Any]
A dictionary of the data entry converted to the record format.
- compare_metadata_properties(properties_file_path: Path, comparison_properties_file_path: Path, output_directory: Path) None #
Compares two metadata properties json files using the DeepDiff package and saves the results in a text file.