conmo.algorithms.algorithm.AnomalyDetectionClassBasedAlgorithm
- class conmo.algorithms.algorithm.AnomalyDetectionClassBasedAlgorithm
- __init__()
- execute(idx: int, in_dir: str, out_dir: str) -> str
Performs a complete execution of the algorithm: loads the input data, runs through the folds, and saves the results.
- Parameters
idx (int) – Index of the algorithm in the Experiment. Useful when you want to experiment with several algorithms.
in_dir (str) – Intermediate directory where the input data for the algorithm is stored.
out_dir (str) – Intermediate directory where the output data (the predictions of the algorithm) will be stored.
- Returns
Name of the output directory.
- Return type
str
- abstract fit_predict(data_train: DataFrame, data_test: DataFrame, labels_train: DataFrame, labels_test: DataFrame) -> DataFrame
Trains the model on the training data and then uses the trained model to make predictions on the test data.
- Parameters
data_train (pandas DataFrame) – Train data.
data_test (pandas DataFrame) – Test data.
labels_train (pandas DataFrame) – Train labels.
labels_test (pandas DataFrame) – Test labels.
- Returns
Results of the predictions made on the test set.
- Return type
pandas DataFrame
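The signature above can be illustrated with a minimal, standalone sketch. It is not Conmo's implementation: the z-score rule, the threshold of 3, and the `anomaly` output column are assumptions chosen for illustration, and the function is written at module level so that it runs with pandas alone. In practice it would be a method overridden in a subclass of this class.

```python
import pandas as pd


def fit_predict(data_train: pd.DataFrame, data_test: pd.DataFrame,
                labels_train: pd.DataFrame, labels_test: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical z-score detector that follows the documented signature."""
    # "Training": estimate the per-column mean and standard deviation on the train data.
    mean = data_train.mean()
    std = data_train.std(ddof=0).replace(0, 1.0)  # avoid division by zero
    # "Prediction": flag a test row when any column lies more than 3 std from the mean.
    z_scores = (data_test - mean).abs() / std
    is_anomaly = (z_scores > 3.0).any(axis=1)
    # One prediction per test row, keeping the test index. The column name is an assumption.
    return pd.DataFrame({"anomaly": is_anomaly}, index=data_test.index)


train = pd.DataFrame({"x": [1.0, 1.1, 0.9, 1.0]})
test = pd.DataFrame({"x": [1.05, 5.0]})
empty = pd.DataFrame()  # labels are not used by this unsupervised sketch
predictions = fit_predict(train, test, empty, empty)
```

Here the second test row is far from the training distribution and is flagged, while the first is not. A supervised algorithm would also use `labels_train` during the training step.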
- labels_per_sequence(labels: DataFrame) -> bool
Use only with time series datasets. Checks whether the labels file of the chosen dataset is indexed by sequence only or by sequence and time. In future updates this method will be moved to a class specific to time series.
- Parameters
labels (pandas DataFrame) – Labels file of the dataset.
- Returns
True if the labels file has 1 index level (sequence), or False if it has 2 index levels (sequence and time).
- Return type
bool
- Raises
RuntimeError – If the number of index levels is invalid.
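The documented behaviour can be sketched with the `nlevels` attribute of a pandas index. This is an assumption about how such a check might look, not Conmo's source. The index names `sequence` and `time` and the `anomaly` column are illustrative.

```python
import pandas as pd


def labels_per_sequence(labels: pd.DataFrame) -> bool:
    """Return True for a 1-level index, False for a 2-level index."""
    n_levels = labels.index.nlevels
    if n_levels == 1:
        return True   # one label per sequence
    if n_levels == 2:
        return False  # one label per (sequence, time) pair
    raise RuntimeError(f"Unexpected number of index levels: {n_levels}")


# Labels indexed by sequence only.
per_sequence = pd.DataFrame(
    {"anomaly": [0, 1]},
    index=pd.Index([1, 2], name="sequence"),
)

# Labels indexed by sequence and time.
per_time = pd.DataFrame(
    {"anomaly": [0, 0, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1, 0), (1, 1), (2, 0)], names=["sequence", "time"]
    ),
)
```

Calling the function on `per_sequence` returns True and on `per_time` returns False. An index with three or more levels raises `RuntimeError`.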
- load_input(in_dir: str) -> (DataFrame, DataFrame)
Reads the Parquet data and labels files of the chosen dataset.
- Parameters
in_dir (str) – Input directory where the files are located.
- Returns
data (pandas DataFrame) – Loaded data file.
labels (pandas DataFrame) – Loaded labels file.
- save_output(results: DataFrame, out_dir: str, idx: int) -> str
Saves the algorithm's output in Parquet format.
- Parameters
results (pandas DataFrame) – DataFrame with the results of the execution.
out_dir (str) – Output directory where the results will be saved.
idx (int) – Index of the algorithm in the Experiment. Useful when you want to experiment with several algorithms.
- show_start_message()
Prints the name of the algorithm about to be executed to the terminal.
Methods
- __init__()
- execute(idx, in_dir, out_dir) – Performs a complete execution of the algorithm: loads the input data, runs through the folds, and saves the results.
- fit_predict(data_train, data_test, ...) – Trains the model on the training data and then uses the trained model to make predictions on the test data.
- labels_per_sequence(labels) – Use only with time series datasets.
- load_input(in_dir) – Reads the Parquet data and labels files of the chosen dataset.
- save_output(results, out_dir, idx) – Saves the algorithm's output in Parquet format.
- show_start_message() – Prints the name of the algorithm about to be executed to the terminal.