
[Azure] DP-100 자격증 Build AI solutions with Azure Machine Learning - Part 2

하냐NYA 2021. 3. 25. 00:50
  • Pipeline - In Azure Machine Learning, a workflow of machine learning tasks in which each task is implemented as a step.
    • PythonScriptStep: Runs a specified Python script
    • DataTransferStep: Uses Azure Data Factory to copy data between data stores
    • DatabricksStep: Runs a notebook, script, or compiled JAR on a Databricks cluster
    • AdlaStep: Runs a U-SQL job in Azure Data Lake Analytics
    • ParallelRunStep: Runs a Python script as a distributed task on multiple compute nodes
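The step-based structure above can be sketched in plain Python. This is a conceptual illustration only, not the Azure ML SDK: the `Step` and `Pipeline` classes here are hypothetical stand-ins showing how a workflow is an ordered list of tasks, each implemented as a step.

```python
# Conceptual sketch (not the Azure ML SDK): a pipeline is an ordered
# list of named steps, each wrapping a callable task.
class Step:
    def __init__(self, name, task):
        self.name = name
        self.task = task  # callable that takes and returns shared state

class Pipeline:
    def __init__(self, steps):
        self.steps = steps

    def run(self, state=None):
        state = {} if state is None else state
        for step in self.steps:
            state = step.task(state)  # each step transforms the shared state
        return state

# Hypothetical two-step workflow: prepare data, then "train" on it.
prep = Step("prep_data", lambda s: {**s, "data": [1, 2, 3]})
train = Step("train_model", lambda s: {**s, "model_sum": sum(s["data"])})
result = Pipeline([prep, train]).run()
```

In the real SDK the analogous move is passing a list of step objects (PythonScriptStep, DatabricksStep, etc.) to a Pipeline and submitting it as an experiment.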
  • Deploying a model as a real-time service - Azure ML uses containers as a deployment mechanism, packaging the model and the code that uses it into an image.
    1. Register a trained model: Model.register or run.register_model
    2. Define an inference configuration
      • Creating an entry script: init() and run(raw_data)
      • Creating an environment: CondaDependencies
      • Combining the script and environment in an InferenceConfig: InferenceConfig
    3. Define a deployment configuration
    4. Deploy the model: Model.deploy
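The entry script in step 2 follows a fixed contract: `init()` runs once when the service starts (typically loading the registered model), and `run(raw_data)` runs per request. A minimal sketch, assuming JSON input of the form `{"data": [...]}` and using a stand-in coefficient instead of a real registered model so the shape is runnable on its own:

```python
import json

def init():
    # Runs once at service startup. In a real entry script this would be
    # something like: model = joblib.load(Model.get_model_path("my-model"))
    global model
    model = 2.0  # hypothetical stand-in "model": a fixed coefficient

def run(raw_data):
    # Runs per scoring request; expects {"data": [[...], [...], ...]}
    data = json.loads(raw_data)["data"]
    predictions = [model * sum(row) for row in data]
    return json.dumps(predictions)

init()
output = run(json.dumps({"data": [[1, 2], [3, 4]]}))  # -> "[6.0, 14.0]"
```

The real service wires these two functions up for you; the deployment configuration in step 3 then decides where the container runs (e.g. ACI or AKS).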
  • Troubleshooting service deployment
    • Check the service state
    • Review service logs
    • Deploy to a local container
  • Creating a batch inference pipeline
    1. Register a model: Model.register or run.register_model
    2. Create a scoring script: init() and run(mini_batch) to load the model and use it to predict new values
    3. Create a pipeline with a ParallelRunStep: ParallelRunStep
    4. Run the pipeline and retrieve the step output
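The batch scoring contract in step 2 differs from the real-time one: `run(mini_batch)` receives a list of input file paths and must return one result per file, which ParallelRunStep aggregates into the step output. A runnable sketch with a hypothetical stand-in "model" (a character count) instead of a real loaded model:

```python
import os
import tempfile

def init():
    # In a real batch scoring script this loads the registered model.
    global model
    model = len  # hypothetical stand-in: "predict" the text length

def run(mini_batch):
    # mini_batch is a list of file paths; return one result per file.
    results = []
    for path in mini_batch:
        with open(path) as f:
            results.append(f"{os.path.basename(path)}: {model(f.read())}")
    return results

# Demo with two temporary input files standing in for the input dataset.
init()
tmp = tempfile.mkdtemp()
paths = []
for i, text in enumerate(["abc", "hello"]):
    p = os.path.join(tmp, f"in{i}.txt")
    with open(p, "w") as f:
        f.write(text)
    paths.append(p)
results = run(paths)  # -> ["in0.txt: 3", "in1.txt: 5"]
```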

Source: Microsoft Learn

  • Tuning Hyperparameters - Accomplished by training multiple models using the same algorithm and training data but different hyperparameter values. The resulting model from each training run is evaluated to determine its performance metric.
    • Discrete hyperparameters 
    • Continuous hyperparameters
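The distinction between the two kinds can be shown in one short sketch (the parameter names and value sets here are illustrative, not from any specific search space): a discrete hyperparameter is drawn from a fixed set of values, a continuous one from a numeric range.

```python
import random

random.seed(0)

# Discrete: one value chosen from a fixed set (a "choice" expression).
batch_size = random.choice([16, 32, 64])

# Continuous: any value within a range (a "uniform" expression).
learning_rate = random.uniform(0.001, 0.1)
```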
  • Search Space - The set of hyperparameter values tried during hyperparameter tuning
  • Sampling - The method used to select the specific hyperparameter values tried from the search space during tuning
    • Grid sampling: All hyperparameters are discrete
    • Random sampling
    • Bayesian sampling: supports only choice, uniform, and quniform parameter expressions
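The difference between grid and random sampling can be sketched in plain Python (parameter names and values are illustrative): grid sampling enumerates every combination of discrete values, so continuous parameters must be discretized first, while random sampling draws each parameter independently per trial.

```python
import itertools
import random

# Grid sampling: every combination of discrete values. A continuous
# parameter like learning rate must be discretized to take part.
grid = list(itertools.product([16, 32, 64], [0.01, 0.05, 0.1]))
# 3 batch sizes x 3 learning rates = 9 candidate combinations

# Random sampling: each trial draws every parameter independently,
# and continuous ranges can be sampled directly.
random.seed(1)
trial = {
    "batch_size": random.choice([16, 32, 64]),
    "learning_rate": random.uniform(0.001, 0.1),
}
```

Bayesian sampling goes a step further: instead of drawing independently, it picks each new trial based on how previous trials performed, which is why it restricts the kinds of parameter expressions it supports.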
  • Early Termination
    • Bandit policy: Stop a run if the target performance metric underperforms the best run so far by a specified margin
    • Median stopping policy: Stop a run when the target performance metric is worse than the median of the running averages for all runs
    • Truncation selection policy: Cancels the lowest performing X% of runs at each evaluation interval
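The bandit and median stopping rules above can be sketched as simple predicates. This assumes the metric is being maximized, and the slack-factor form shown (stop when the metric falls below best / (1 + slack_factor)) is one common way the margin is expressed; treat the exact formula as an assumption of this sketch.

```python
import statistics

def bandit_should_stop(run_metric, best_metric, slack_factor=0.2):
    """Stop a run whose metric falls outside the slack of the best so far.

    e.g. best 0.90 with slack 0.2 -> runs below 0.90 / 1.2 = 0.75 stop.
    """
    return run_metric < best_metric / (1 + slack_factor)

def median_should_stop(run_metric, running_averages):
    """Stop a run whose metric is worse than the median running average."""
    return run_metric < statistics.median(running_averages)

assert bandit_should_stop(0.70, 0.90)        # 0.70 < 0.75 -> stop
assert not bandit_should_stop(0.80, 0.90)    # within slack -> keep running
assert median_should_stop(0.60, [0.5, 0.7, 0.9])  # below median 0.7 -> stop
```

Truncation selection is different in kind: rather than testing each run against a threshold, it ranks all active runs at each evaluation interval and cancels the bottom X%.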