DP-100 Updated Questions For Clearing Microsoft DP-100 Exam Smoothly

Regina, 2022-03-28T08:40:20+00:00

The DP-100 exam questions have been updated by professionals with experience in the Designing and Implementing a Data Science Solution on Azure exam. The updated Microsoft DP-100 questions are intended to help candidates clear the Microsoft DP-100 exam smoothly. At ITExamShop, you can download the DP-100 updated questions and verified answers as a PDF to read on your PC, phone, or Mac, whenever and wherever you like.

Check DP-100 Free Exam Questions First Below


1. You are developing a data science workspace that uses an Azure Machine Learning service.

You need to select a compute target to deploy the workspace.

What should you use?

Azure Data Lake Analytics

Azure Databricks

Apache Spark for HDInsight

Azure Container Service

2. You develop and train a machine learning model to predict fraudulent transactions for a hotel booking website.

Traffic to the site varies considerably. The site experiences heavy traffic on Monday and Friday and much lower traffic on other days. Holidays are also high web traffic days. You need to deploy the model as an Azure Machine Learning real-time web service endpoint on compute that can dynamically scale up and down to support demand.

Which deployment compute option should you use?

attached Azure Databricks cluster

Azure Container Instance (ACI)

Azure Kubernetes Service (AKS) inference cluster

Azure Machine Learning Compute Instance

attached virtual machine in a different region

3. You create a datastore named training_data that references a blob container in an Azure Storage account. The blob container contains a folder named csv_files in which multiple comma-separated values (CSV) files are stored.

You have a script named train.py in a local folder named ./script that you plan to run as an experiment using an estimator.

The script includes the following code to read data from the csv_files folder:

You have the following script.

You need to configure the estimator for the experiment so that the script can read the data from a data reference named data_ref that references the csv_files folder in the training_data datastore.

Which code should you use to configure the estimator?

A)

Option A

Option B

Option C

Option D

Option E

4. You create and register a model in an Azure Machine Learning workspace.

You must use the Azure Machine Learning SDK to implement a batch inference pipeline that uses a ParallelRunStep to score input data using the model. You must specify a value for the ParallelRunConfig compute_target setting of the pipeline step.

You need to create the compute target.

Which class should you use?

BatchCompute

AdlaCompute

AmlCompute

AksCompute

5. HOTSPOT

You create an Azure Machine Learning compute target named ComputeOne by using the STANDARD_D1 virtual machine image.

You define a Python variable named ws that references the Azure Machine Learning workspace.

You run the following Python code:

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

6. DRAG DROP

You are producing a multiple linear regression model in Azure Machine Learning Studio.

Several independent variables are highly correlated.

You need to select appropriate methods for conducting effective feature engineering on all the data.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
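As a rough illustration of what "highly correlated independent variables" means in practice, the sketch below flags feature pairs whose absolute Pearson correlation exceeds a threshold. It is plain Python rather than an Azure Machine Learning Studio module, and the feature names, values, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return column-name pairs whose absolute correlation exceeds threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(columns), 2)
        if abs(pearson(columns[a], columns[b])) > threshold
    ]


# Hypothetical toy data: size_m2 and rooms move together, age does not.
features = {
    "size_m2": [50, 60, 70, 80, 90],
    "rooms": [2, 2, 3, 3, 4],
    "age": [30, 5, 22, 11, 18],
}
pairs = correlated_pairs(features)
print(pairs)
```

Pairs flagged this way are candidates for being combined or reduced before a linear regression model is fitted.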

7. You need to select a feature extraction method.

Which method should you use?

Spearman correlation

Mutual information

Mann-Whitney test

Pearson’s correlation

8. You are a data scientist working for a bank and have used Azure ML to train and register a machine learning model that predicts whether a customer is likely to repay a loan.

You want to understand how your model is making selections and must be sure that the model does not violate government regulations such as denying loans based on where an applicant lives.

You need to determine the extent to which each feature in the customer data is influencing predictions.

What should you do?

Enable data drift monitoring for the model and its training dataset.

Score the model against some test data with known label values and use the results to calculate a confusion matrix.

Use the Hyperdrive library to test the model with multiple hyperparameter values.

Use the interpretability package to generate an explainer for the model.

Add tags to the model registration indicating the names of the features in the training dataset.
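To make "the extent to which each feature influences predictions" concrete, here is a minimal sketch of permutation importance: shuffle one feature across rows and measure how much accuracy drops. This is a swapped-in, library-free technique for illustration only, not the API of the Azure ML interpretability package; the toy model, the feature names, and the data are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the known label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, seed=0):
    """Accuracy lost when one feature's values are shuffled across rows."""
    rng = random.Random(seed)
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted = [dict(row, **{feature: value}) for row, value in zip(rows, shuffled)]
    return accuracy(model, rows, labels) - accuracy(model, permuted, labels)


# Hypothetical stand-in for a trained loan model: it only looks at income.
def toy_model(row):
    return 1 if row["income"] >= 50 else 0


rows = [
    {"income": income, "postcode": postcode}
    for income, postcode in [(20, 1), (35, 2), (48, 1), (52, 2), (70, 1), (90, 2)]
]
labels = [toy_model(row) for row in rows]

importance = {
    name: permutation_importance(toy_model, rows, labels, name)
    for name in ("income", "postcode")
}
print(importance)
```

A feature the model ignores, such as the hypothetical postcode here, shows an importance of zero, which is the kind of evidence a regulator-facing review would look for.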

9. You use Azure Machine Learning Studio to build a machine learning experiment.

You need to divide data into two distinct datasets.

Which module should you use?

Partition and Sample

Assign Data to Clusters

Group Data into Bins

Test Hypothesis Using t-Test
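For readers who want to see what dividing data into two distinct datasets amounts to, the following is a small plain-Python sketch of a randomized split. It does not reproduce any Studio module; the 70/30 fraction and the seed are assumptions chosen for the example.

```python
import random


def split_rows(rows, fraction=0.7, seed=42):
    """Randomly split rows into two disjoint lists; the first gets `fraction`."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))
first, second = split_rows(rows)
print(len(first), len(second))
```

Every input row ends up in exactly one of the two outputs, so the resulting datasets do not overlap.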

10. You use the Azure Machine Learning service to create a tabular dataset named training_data. You plan to use this dataset in a training script.

You create a variable that references the dataset using the following code:

training_ds = workspace.datasets.get("training_data")

You define an estimator to run the script.

You need to set the correct property of the estimator to ensure that your script can access the training_data dataset.

Which property should you set?

inputs = [training_ds.as_named_input('training_ds')]

script_params = {"--training_ds":training_ds}

environment_definition = {"training_data":training_ds}

source_directory = training_ds


11. You use the Azure Machine Learning Python SDK to define a pipeline to train a model.

The data used to train the model is read from a folder in a datastore.

You need to ensure the pipeline runs automatically whenever the data in the folder changes.

What should you do?

Set the regenerate_outputs property of the pipeline to True

Create a ScheduleRecurrence object with a Frequency of auto. Use the object to create a Schedule for the pipeline

Create a PipelineParameter with a default value that references the location where the training data is stored

Create a Schedule for the pipeline. Specify the datastore in the datastore property, and the folder containing the training data in the path_on_datastore property

12. HOTSPOT

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.
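The arithmetic behind the scenario can be sketched as follows, assuming the module's percentage setting expresses how many synthetic minority rows to add relative to the existing minority rows (so 100 would double them). Reading "5 data rows" as the number of nearest neighbors is also an assumption; the function name is hypothetical.

```python
def smote_percentage(current_rows, target_rows):
    """Percent of extra minority rows needed to grow current_rows to target_rows."""
    if target_rows < current_rows:
        raise ValueError("target must be at least the current row count")
    return (target_rows - current_rows) * 100 // current_rows


minority_rows = 1_000   # class 1 rows in the scenario
target_rows = 4_000     # desired class 1 rows after oversampling
nearest_neighbors = 5   # assumed meaning of "5 data rows"

percentage = smote_percentage(minority_rows, target_rows)
print(percentage, nearest_neighbors)
```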

13. DRAG DROP

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

14. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model.

You configure a HyperDriveConfig for the experiment by running the following code:

The true label values are stored in a variable named y_test, and the predicted probabilities from the model are stored in a variable named y_predicted. You need to add logging to the script to allow Hyperdrive to optimize hyperparameters for the AUC metric.

Solution: Run the following code:

Does the solution meet the goal?

Yes
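For reference, the AUC metric named in this question can be computed from true labels and predicted probabilities as in the sketch below. It is a library-free illustration, not the solution code from the question, and the sample values are hypothetical. In the Azure ML SDK, a training script would typically report such a value with the run's `log(name, value)` method, using the same name as the HyperDriveConfig primary metric.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical values standing in for the script's variables.
y_test = [0, 0, 1, 1]
y_predicted = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_test, y_predicted)
print(auc)
```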

15. You have a Python script that executes a pipeline. The script includes the following code:

from azureml.core import Experiment

pipeline_run = Experiment(ws, 'pipeline_test').submit(pipeline)

You want to test the pipeline before deploying the script.

You need to display the pipeline run details written to the STDOUT output when the pipeline completes.

Which code segment should you add to the test script?

pipeline_run.get_metrics()

pipeline_run.wait_for_completion(show_output=True)

pipeline_param = PipelineParameter(name="stdout", default_value="console")

pipeline_run.get_status()

16. HOTSPOT

You collect data from a nearby weather station.

You have a pandas dataframe named weather_df that includes the following data:

The data is collected every 12 hours: noon and midnight.

You plan to use automated machine learning to create a time-series model that predicts temperature over the next seven days. For the initial round of training, you want to train a maximum of 50 different models.

You must use the Azure Machine Learning SDK to run an automated machine learning experiment to train these models.

You need to configure the automated machine learning run.

How should you complete the AutoMLConfig definition? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
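The numbers in the scenario can be worked through as below, assuming the forecast horizon is counted in units of the series frequency rather than in days. The settings dictionary is only a sketch: its key names (`task`, `iterations`, `forecast_horizon`) are assumptions about how such values are usually passed and should be checked against the SDK version in use.

```python
HOURS_PER_DAY = 24

sampling_interval_hours = 12  # readings at noon and midnight
forecast_days = 7             # predict the next seven days
max_models = 50               # upper bound on models in the first round

# Number of 12-hour periods covered by seven days.
forecast_horizon = forecast_days * HOURS_PER_DAY // sampling_interval_hours

# Hypothetical settings; key names are assumptions, not confirmed by the source.
automl_settings = {
    "task": "forecasting",
    "iterations": max_models,
    "forecast_horizon": forecast_horizon,
}
print(automl_settings)
```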

17. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a new experiment in Azure Machine Learning Studio.

One class has a much smaller number of observations than the other classes in the training set.

You need to select an appropriate data sampling strategy to compensate for the class imbalance.

Solution: You use the Synthetic Minority Oversampling Technique (SMOTE) sampling mode.

Does the solution meet the goal?

Yes

18. Topic 1, Case Study 1

Overview

You are a data scientist in a company that provides data science for professional sporting events.

Models will use global and local market data to meet the following business goals:

• Understand sentiment of mobile device users at sporting events based on audio from crowd reactions.

• Assess a user's tendency to respond to an advertisement.

• Customize styles of ads served on mobile devices.

• Use video to detect penalty events.

Current environment

Requirements

• Media used for penalty event detection will be provided by consumer devices. Media may include images and videos captured during the sporting event and shared using social media. The images and videos will have varying sizes and formats.

• The data available for model building comprises seven years of sporting event media. The sporting event media includes recorded videos, transcripts of radio commentary, and logs from related social media feeds captured during the sporting events.

• Crowd sentiment will include audio recordings submitted by event attendees in both mono and stereo formats.

Advertisements

• Ad response models must be trained at the beginning of each event and applied during the sporting event.

• Market segmentation models must optimize for similar ad response history.

• Sampling must guarantee mutual and collective exclusivity between local and global segmentation models that share the same features.

• Local market segmentation models will be applied before determining a user’s propensity to respond to an advertisement.

• Data scientists must be able to detect model degradation and decay.

• Ad response models must support non-linear boundaries of features.

• The ad propensity model uses a cut threshold of 0.45, and retraining occurs if weighted Kappa deviates from 0.1 +/- 5%.

• The ad propensity model uses cost factors shown in the following diagram:

• The ad propensity model uses proposed cost factors shown in the following diagram:

Performance curves of current and proposed cost factor scenarios are shown in the following diagram:

Penalty detection and sentiment

Findings

• Data scientists must build an intelligent solution by using multiple machine learning models for penalty event detection.

• Data scientists must build notebooks in a local environment using automatic feature engineering and model building in machine learning pipelines.

• Notebooks must be deployed to retrain by using Spark instances with dynamic worker allocation.

• Notebooks must execute with the same code on new Spark instances to recode only the source of the data.

• Global penalty detection models must be trained by using dynamic runtime graph computation during training.

• Local penalty detection models must be written by using BrainScript.

• Experiments for local crowd sentiment models must combine local penalty detection data.

• Crowd sentiment models must identify known sounds such as cheers and known catch phrases. Individual crowd sentiment models will detect similar sounds.

• All shared features for local models are continuous variables.

• Shared features must use double precision. Subsequent layers must have aggregate running mean and standard deviation metrics available.

segments

During the initial weeks in production, the following was observed:

• Ad response rates declined.

• Drops were not consistent across ad styles.

• The distribution of features across training and production data are not consistent.

Analysis shows that of the 100 numeric features on user location and behavior, the 47 features that come from location sources are being used as raw features. A suggested experiment to remedy the bias and variance issue is to engineer 10 linearly uncorrelated features.

Penalty detection and sentiment

• Initial data discovery shows a wide range of densities of target states in training data used for crowd sentiment models.

• All penalty detection models show that inference phases using Stochastic Gradient Descent (SGD) are running too slow.

• Audio samples show that the length of a catch phrase varies between 25%-47%, depending on region.

• The performance of the global penalty detection models shows lower variance but higher bias when comparing training and validation sets. Before implementing any feature changes, you must confirm the bias and variance using all training and validation cases.

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Use a Relative Expression Split module to partition the data based on centroid distance.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on centroid distance.

19. HOTSPOT

You need to configure the Filter Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.

20. You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Azure HDInsight with Spark MLlib

Azure Cognitive Services

Azure Machine Learning Studio

Microsoft Machine Learning Server


21. You plan to use the Hyperdrive feature of Azure Machine Learning to determine the optimal hyperparameter values when training a model.

You must use Hyperdrive to try combinations of the following hyperparameter values:

• learning_rate: any value between 0.001 and 0.1

• batch_size: 16, 32, or 64

You need to configure the search space for the Hyperdrive experiment.

Which two parameter expressions should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

a choice expression for learning_rate

a uniform expression for learning_rate

a normal expression for batch_size

a choice expression for batch_size

a uniform expression for batch_size
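The difference between a continuous range and a discrete set of options can be sketched in plain Python as follows. The `uniform` and `choice` helpers here are stand-ins defined in this example for illustration; they only mirror the names of the Hyperdrive parameter expressions and are not the SDK functions themselves. The `sample` helper and the seed are hypothetical.

```python
import random


def uniform(low, high):
    """Continuous range: any value between low and high."""
    return lambda rng: rng.uniform(low, high)


def choice(*options):
    """Discrete set: exactly one of the listed options."""
    return lambda rng: rng.choice(options)


# Search space built from the two requirements in the question.
search_space = {
    "learning_rate": uniform(0.001, 0.1),
    "batch_size": choice(16, 32, 64),
}


def sample(space, seed=None):
    """Draw one hyperparameter combination from the search space."""
    rng = random.Random(seed)
    return {name: draw(rng) for name, draw in space.items()}


trial = sample(search_space, seed=0)
print(trial)
```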

22. HOTSPOT

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

23. HOTSPOT

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

24. You create an Azure Machine Learning workspace.

You must create a custom role named DataScientist that meets the following requirements:

✑ Role members must not be able to delete the workspace.

✑ Role members must not be able to create, update, or delete compute resources in the workspace.

✑ Role members must not be able to add new users to the workspace.

You need to create a JSON file for the DataScientist role in the Azure Machine Learning workspace.

The custom role must enforce the restrictions specified by the IT Operations team.

Which JSON code segment should you use?

A)

Option A

Option B

Option C

Option D
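Since the option images are not reproduced here, the sketch below shows the general shape of an Azure custom role definition that allows everything except a list of excluded actions. It is not one of the answer options. The action strings and the placeholder scope are assumptions that should be verified against current Azure RBAC documentation before use.

```python
import json

# Sketch of a custom role definition. The NotActions strings and the
# placeholder scope are assumptions, not taken from the question.
data_scientist_role = {
    "Name": "DataScientist",
    "IsCustom": True,
    "Description": "Can run experiments but cannot manage compute, users, or the workspace.",
    "Actions": ["*"],
    "NotActions": [
        # Block deleting the workspace.
        "Microsoft.MachineLearningServices/workspaces/*/delete",
        # Block creating, updating, or deleting compute resources.
        "Microsoft.MachineLearningServices/workspaces/computes/*/write",
        "Microsoft.MachineLearningServices/workspaces/computes/*/delete",
        # Block role assignments, which is how new users are added.
        "Microsoft.Authorization/*/write",
    ],
    "AssignableScopes": [
        "/subscriptions/<subscription_id>/resourceGroups/<resource_group>"
        "/providers/Microsoft.MachineLearningServices/workspaces/<workspace_name>"
    ],
}

role_json = json.dumps(data_scientist_role, indent=2)
print(role_json)
```

Writing the definition as a dictionary and serializing it keeps the JSON well-formed; the angle-bracket placeholders must be replaced with real identifiers.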

25. DRAG DROP

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

26. HOTSPOT

You need to configure the Edit Metadata module so that the structure of the datasets matches.

Which configuration options should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

27. You plan to build a team data science environment. Data for training models in machine learning pipelines will be over 20 GB in size.

You have the following requirements:

✑ Models must be built using Caffe2 or Chainer frameworks.

✑ Data scientists must be able to use a data science environment to build the machine learning pipelines and train models on their personal devices in both connected and disconnected network environments.

✑ Personal devices must support updating machine learning pipelines when connected to a network.

You need to select a data science environment.

Which environment should you use?

Azure Machine Learning Service

Azure Machine Learning Studio

Azure Databricks

Azure Kubernetes Service (AKS)

28. You have a Jupyter Notebook that contains Python code that is used to train a model.

You must create a Python script for the production deployment. The solution must minimize code maintenance.

Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Refactor the Jupyter Notebook code into functions

Save each function to a separate Python file

Define a main() function in the Python script

Remove all comments and functions from the Python script
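As an illustration of what a notebook refactored into a script can look like, here is a minimal sketch in which each step is a function and a `main()` function wires them together. The data, the function names, and the trivial one-parameter model are hypothetical placeholders for real notebook cells.

```python
def load_data():
    """Return hypothetical (x, y) training pairs; a real script would read a file."""
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def train_model(data):
    """Fit y = slope * x by least squares through the origin."""
    numerator = sum(x * y for x, y in data)
    denominator = sum(x * x for x, _ in data)
    return numerator / denominator


def main():
    """Single entry point that runs the steps in order."""
    data = load_data()
    slope = train_model(data)
    print(f"Trained slope: {slope:.2f}")
    return slope


if __name__ == "__main__":
    main()
```

Keeping each step in its own function makes the steps testable in isolation, and the `__main__` guard lets the file be imported without running the training.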
