Deploying to production
=======================

When deploying a model using MLOps Codex, an API is created to facilitate the integration of your model with other services. Additionally, MLOps Codex allows you to execute your model remotely within a Python application.

Preparing for production
------------------------

The first requirement is the scoring script. As in the training process, this script needs an entry point function. The parameters and return value of this function depend on the operation type of the model.

**Sync model:** This is the "real time" model. It expects a JSON payload and returns a JSON response within a few seconds. The entry point function should look like this:

.. code-block:: python

    import json
    import os
    from pickle import load  # or joblib.load, depending on how the model was saved

    import pandas as pd


    # The name of this function (score) is what must be passed in the 'model_reference' field
    def score(data: str, base_path: str):
        ## Environment variables loaded from a user-supplied file in the 'env' field
        # my_var = os.getenv('MY_VAR')
        # if my_var is None:
        #     raise Exception("Could not find `env` variable")

        ## MLOps environment variable with the model file name
        # with open(base_path + os.getenv('modelFileName'), 'rb') as f:

        # Load the already trained model from the file passed as a parameter
        with open(base_path + "/model.pkl", 'rb') as f:
            model = load(f)

        # Build a DataFrame with the input data. The data arrives as a JSON string,
        # so we have to parse it into a dictionary first
        df = pd.DataFrame(data=json.loads(data), index=[0])

        # Return the results of the model run as a dictionary.
        # Since this will be converted to JSON, we can't use numpy types,
        # so we convert them to plain int and float.
        return {"pred": int(model.predict(df)), "proba": float(model.predict_proba(df)[0, 1])}

The first parameter is the data sent to the model, provided as a JSON string, which should be parsed as needed. The second parameter is the path, which can be used to access the model files and any other files you upload, just like in the training process.

The function should return a dictionary that can be converted to JSON, or a valid JSON string. Note that certain data types, such as numpy `int64` and `float64`, cannot typically be serialized to JSON, so your code should handle this before returning the response to MLOps.
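To see the problem and the fix in isolation, here is a minimal standalone sketch (not MLOps-specific):

.. code-block:: python

    import json

    import numpy as np

    pred = np.int64(1)
    proba = np.float64(0.87)

    # json.dumps raises "Object of type int64 is not JSON serializable"
    # for numpy integers, so cast numpy scalars to built-in types first
    payload = {"pred": int(pred), "proba": float(proba)}
    print(json.dumps(payload))  # {"pred": 1, "proba": 0.87}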
**Async model:** This is for batch scoring. Files are sent with, usually, many records at once. Since this might take a while depending on the file size, it runs asynchronously. The entry point function should look like this:

.. code-block:: python

    import os
    from pickle import load  # or joblib.load, depending on how the model was saved

    import pandas as pd


    def score(data_path: str, model_path: str):
        ## Environment variables loaded from a user-supplied file in the 'env' field
        # my_var = os.getenv('MY_VAR')
        # if my_var is None:
        #     raise Exception("Could not find `env` variable")

        ## MLOps environment variable with the model file name
        # with open(model_path + os.getenv('modelFileName'), 'rb') as f:

        # Load the already trained model from the file passed as a parameter
        with open(model_path + "/model.pkl", 'rb') as f:
            model = load(f)

        ## MLOps environment variable with the input file name
        # X = pd.read_csv(data_path + '/' + os.getenv('inputFileName'))

        # Load the input data from the file into a DataFrame
        X = pd.read_csv(data_path + "/dados.csv")

        # Create a copy of the DataFrame with the input data
        df = X.copy()
        # Calculate the probability of each entry in the data table
        df['proba'] = model.predict_proba(X)[:, 1]
        # Calculate the prediction for each entry in the data table
        df['pred'] = model.predict(X)

        # Build the path for the output file, in this case 'output.csv'.
        # It is important that this file is saved in the same path as the input data.
        output = data_path + '/output.csv'

        # Save the DataFrame, with the prediction and probability columns, as a CSV at the output path
        df.to_csv(output, index=False)

        # Return the path to the file with the results of the model run
        return output

The first parameter now serves as a data path. There are distinct path parameters because each asynchronous model execution is stored in a different location, while the files uploaded during model deployment stay the same across executions. To keep your code dynamic and avoid enforcing a specific file name pattern, you can use the `inputFileName` environment variable, which holds the name of the file uploaded for that execution. You must save the result in the same path where the input file was located, and the function should return this full path.
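Pulling the commented variants together, here is a sketch of a filename-agnostic version of the same function, assuming the `modelFileName` and `inputFileName` environment variables described above are set by MLOps at run time:

.. code-block:: python

    import os
    from pickle import load

    import pandas as pd


    def score(data_path: str, model_path: str):
        # Resolve file names from the MLOps-provided environment variables
        # instead of hard-coding 'model.pkl' and 'dados.csv'
        with open(os.path.join(model_path, os.getenv('modelFileName')), 'rb') as f:
            model = load(f)

        X = pd.read_csv(os.path.join(data_path, os.getenv('inputFileName')))

        df = X.copy()
        df['proba'] = model.predict_proba(X)[:, 1]
        df['pred'] = model.predict(X)

        # The output must live in the same path as the input data
        output = os.path.join(data_path, 'output.csv')
        df.to_csv(output, index=False)
        return output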
Deploying your model
--------------------

With all files ready, we can deploy the model in two ways.

- Using :py:meth:`mlops_codex.training.MLOpsTrainingExecution.promote_model` to promote a successful training execution.

.. code-block:: python

    # Promoting a custom training execution
    model = custom_run.promote_model(
        model_name='Teste notebook promoted custom', # model_name
        model_reference='score', # name of the scoring function
        source_file=PATH+'app.py', # Path of the source file
        schema=PATH+'schema.json', # Path of the schema file, but it could be a dict (only required for Sync models)
        # env=PATH+'.env', # File for env variables (this will be encrypted in the server)
        # extra_files=[PATH+'utils.py'], # List with extra file paths that should be uploaded along (they will all be in the same folder)
        operation="Sync" # Can be Sync or Async
    )

    # Promoting an AutoML training execution
    model = automl_run.promote_model(
        model_name='Teste notebook promoted autoML', # model_name
        operation="Async" # Can be Sync or Async
    )

- Using :py:meth:`mlops_codex.model.MLOpsModelClient.create_model` to deploy a model trained outside MLOps.

.. code-block:: python

    # Deploying a new model
    model = client.create_model(
        model_name='Teste notebook Sync', # model_name
        model_reference='score', # name of the scoring function
        source_file=PATH+'app.py', # Path of the source file
        model_file=PATH+'model.pkl', # Path of the model pkl file
        requirements_file=PATH+'requirements.txt', # Path of the requirements file
        schema=PATH+'schema.json', # Path of the schema file, but it could be a dict (only required for Sync models)
        # env=PATH+'.env', # File for env variables (this will be encrypted in the server)
        # extra_files=[PATH+'utils.py'], # List with extra file paths that should be uploaded along (they will all be in the same folder)
        python_version='3.9', # Can be 3.8 to 3.10
        operation="Sync", # Can be Sync or Async
        group='datarisk' # Model group (create one using the client)
    )

Deploying a pre-trained model in MLOps requires minimal information; promoting an AutoML training execution, for instance, needs only two parameters. These methods return an instance of :py:class:`mlops_codex.model.MLOpsModel`. You can use the `wait_for_ready` parameter during deployment, or invoke the :py:meth:`mlops_codex.model.MLOpsModel.wait_ready` method, to ensure the :py:class:`mlops_codex.model.MLOpsModel` instance is ready for use. We will install the necessary model dependencies (if you are promoting a training, we will use the same dependencies as the training execution) and run some tests. For synchronous models, a sample JSON of the expected API schema is required. If the deployment is successful, you can begin using your model.
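Since the sync scoring function above expects a flat JSON of features, that sample schema is simply one example payload the API should accept. A minimal sketch, using placeholder feature names rather than any required format:

.. code-block:: python

    # A sample of the JSON the sync API should accept (feature names are
    # placeholders for your own model's inputs). It can be passed to the
    # 'schema' parameter directly as a dict instead of a schema.json path.
    schema_sample = {
        "mean_radius": 17.99,
        "mean_texture": 10.38
    }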
Using your model
----------------

We can use the same :py:class:`mlops_codex.model.MLOpsModel` instance to call the model.

.. code-block:: python

    # For sync models
    sync_model.predict(data={'key': 'value'})

    # For async models
    execution = async_model.predict(data=PATH+'input.csv')

Synchronous models return a dictionary, while asynchronous models return an instance of :py:class:`mlops_codex.base.MLOpsExecution`. This instance allows you to monitor the status and download the results, just as with training executions.

To use the models, you will need a `group token`, which is generated when creating the group (see :ref:`connecting_to_mlops:creating a group`). You can set this token by adding it to the `MLOPS_GROUP_TOKEN` environment variable, by using the :py:meth:`mlops_codex.model.MLOpsModel.set_token` method, or by passing it directly in each :py:meth:`mlops_codex.model.MLOpsModel.predict` call.

In many cases, you may need to use your model outside of a Python environment, often by sharing it through a REST API. To facilitate this, you can access the :py:attr:`mlops_codex.model.MLOpsModel.docs` attribute to share an OpenAPI Swagger page, or use the :py:meth:`mlops_codex.model.MLOpsModel.generate_predict_code` method to generate sample request code for your model.
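A short sketch of the three token options; note that the keyword name for the per-call token is an assumption here, so verify it against the :py:meth:`mlops_codex.model.MLOpsModel.predict` signature in your version:

.. code-block:: python

    import os

    # Option 1: environment variable, read by the client
    os.environ['MLOPS_GROUP_TOKEN'] = '29d9d82e09bb4c11b9cd4ce4e36e6c58'  # token example

    # Option 2: set the token once on the model instance
    sync_model.set_token('29d9d82e09bb4c11b9cd4ce4e36e6c58')

    # Option 3: pass it on each call (keyword name assumed; check the
    # predict signature of your mlops_codex version)
    sync_model.predict(data={'key': 'value'}, group_token='29d9d82e09bb4c11b9cd4ce4e36e6c58')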
Disabling your model
--------------------

Disabling a model means you will no longer be able to perform certain operations on it. Before proceeding, please ensure you have confirmation from your team regarding the permission to perform this operation.

To disable a model, you can use the :py:meth:`mlops_codex.model.MLOpsModel.disable` method.

.. code-block:: python

    model.disable()

Deleting your model
-------------------

Deleting a model will make it unavailable. Before proceeding, please confirm with your team that you have permission to carry out this operation.

To delete a model, you can use the :py:meth:`mlops_codex.model.MLOpsModel.delete` method.

.. code-block:: python

    model.delete()

Monitoring your model
---------------------

Model monitoring involves tracking the model's performance in production, so it can be updated if it begins to make inaccurate predictions. Currently, MLOps employs indirect monitoring: it observes the model's input in production and verifies its similarity to the training data.

When setting up the monitoring system, it is essential to identify which training process produced the model and which features are pertinent for monitoring. We provide metrics such as the "Population Stability Index" (PSI and PSI average) and "SHapley Additive exPlanations" (SHAP and SHAP average).

Additionally, it is crucial to understand how to manage the features and the model effectively. Production data is stored in its raw form, while training data is not (see the training guide: :ref:`training_guide:Running a training execution`). Therefore, it is important to know the steps for processing raw production data to derive model features similar to those saved during training: :ref:`monitoring_parameters:Monitoring configuration`.

The first step is to register the monitoring configuration in the database, using :py:meth:`mlops_codex.model.MLOpsModel.register_monitoring`.

.. code-block:: python

    # We can also add a monitoring configuration for the model
    PATH = './samples/monitoring/'

    model.register_monitoring(
        preprocess_reference='parse', # name of the preprocess function
        shap_reference='get_shap', # name of the shap function
        configuration_file=PATH+'configuration.json', # Path of the configuration file, but it could be a dict
        preprocess_file=PATH+'preprocess_sync.py', # Path of the preprocess script
        requirements_file=PATH+'requirements.txt' # Path of the requirements file
    )

Next, you can run the monitoring process manually, calling :py:meth:`mlops_codex.pipeline.MLOpsPipeline.run_monitoring`. Monitoring can also be configured through a pipeline definition file, registering the configuration with :py:meth:`mlops_codex.pipeline.MLOpsPipeline.register_monitoring_config` and starting the pipeline.

.. code-block:: python

    pipeline = MLOpsPipeline.from_config_file('./samples/pipeline-just-model.yml')

    pipeline.register_monitoring_config(
        directory="./samples/monitoring",
        preprocess="preprocess_async.py",
        preprocess_function="score",
        shap_function="score",
        config="configuration.json",
        packages="requirements.txt"
    )

    pipeline.start()

Using with preprocess script
----------------------------

Sometimes, you might want to run a preprocessing script to adjust the model input data before executing it. With MLOps, you can easily do this. You must first instantiate the :py:class:`mlops_codex.model.MLOpsModelClient`:

.. code-block:: python

    from mlops_codex.model import MLOpsModelClient

    model_client = MLOpsModelClient()

Now you just need to run the model using the preprocess script (check :ref:`preprocessing:Preprocessing module`, which is where the `sync_preprocessing` and `async_preprocessing` objects below come from). For the **sync model**:

.. code-block:: python

    sync_model = model_client.get_model(
        group='groupname',
        model_id='M7abe6af98484948ad63f3ad03f25b6496a93f06e23c4ffbaa43eba0f6a1bb91'
    )

    sync_model.set_token('29d9d82e09bb4c11b9cd4ce4e36e6c58') # token example

    data = {
        "mean_radius": 17.99,
        "mean_texture": 10.38,
        "mean_perimeter": 122.8,
        "mean_area": 1001.0,
        "mean_smoothness": 0.1184,
        "mean_compactness": 0.2776,
        "mean_concavity": 0.3001,
        "mean_concave_points": 0.1471,
        "mean_symmetry": 0.2419,
        "mean_fractal_dimension": 0.07871,
        "radius_error": 1.095,
        "texture_error": 0.9053,
        "perimeter_error": 8.589,
        "area_error": 153.4,
        "smoothness_error": 0.006399,
        "compactness_error": 0.04904,
        "concavity_error": 0.05373,
        "concave_points_error": 0.01587,
        "symmetry_error": 0.03003,
        "fractal_dimension_error": 0.006193,
        "worst_radius": 25.38,
        "worst_texture": 17.33,
        "worst_perimeter": 184.6,
        "worst_area": 2019.0,
        "worst_smoothness": 0.1622,
        "worst_compactness": 0.6656,
        "worst_concavity": 0.7119,
        "worst_concave_points": 0.2654,
        "worst_symmetry": 0.4601,
        "worst_fractal_dimension": 0.1189
    }

    sync_model.predict(data=data, preprocessing=sync_preprocessing)

And for the **async model**:

.. code-block:: python

    async_model = model_client.get_model(
        group='datarisk',
        model_id='Maa3449c7f474567b6556614a12039d8bfdad0117fec47b2a4e03fcca90b7e7c'
    )

    PATH = './samples/asyncModel/'

    execution = async_model.predict(data=PATH+'input.csv', preprocessing=async_preprocessing)
    execution.wait_ready()
    execution.download_result()
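Once the result has been downloaded, it is a regular CSV produced by the async scoring function. A minimal sketch of inspecting it, assuming the file landed in the working directory as `output.csv` (the actual download location and file name may differ in your setup):

.. code-block:: python

    import pandas as pd

    # Inspect the scored file produced by the async run; the path below is
    # an assumption for this sketch and may differ in your setup
    results = pd.read_csv('./output.csv')
    print(results[['pred', 'proba']].head())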