Valid 70-775 Dumps shared by PassLeader for Helping Passing 70-775 Exam! PassLeader now offer the newest 70-775 VCE dumps and 70-775 PDF dumps, the PassLeader 70-775 exam questions have been updated and ANSWERS have been corrected, get the newest PassLeader 70-775 dumps with VCE and PDF here: https://www.passleader.com/70-775.html (64 Q&As Dumps)
BTW, DOWNLOAD part of PassLeader 70-775 dumps from Cloud Storage: https://drive.google.com/open?id=16Ko6acR3bnZwYnl–najNi9hP740P3Xt
NEW QUESTION 41
You have an Apache Hive table named Sales in Apache HCatalog. You need to make the data in the table accessible from Apache Pig.
Solution: You use the following script:
A = STORE 'Sales' USING org.apache.hive.hcatalog.pig.HCatLoader();
Does this meet the goal?
A. Yes
B. No
Answer: B
Explanation:
STORE writes a relation out to storage (its syntax is STORE alias INTO 'table' USING ...); it cannot read a table into Pig, so this script fails. Reading requires LOAD with HCatLoader.
https://hortonworks.com/hadoop-tutorial/how-to-use-hcatalog-basic-pig-hive-commands/
NEW QUESTION 42
You have an Apache Hive table named Sales in Apache HCatalog. You need to make the data in the table accessible from Apache Pig.
Solution: You use the following script:
A = LOAD 'Sales' USING org.apache.hive.hcatalog.pig.HCatLoader();
Does this meet the goal?
A. Yes
B. No
Answer: A
Explanation:
LOAD 'Sales' USING org.apache.hive.hcatalog.pig.HCatLoader() reads the HCatalog-managed table into the Pig relation A, which is the documented way to access Hive tables from Pig.
https://hortonworks.com/hadoop-tutorial/how-to-use-hcatalog-basic-pig-hive-commands/
NEW QUESTION 43
You are implementing a batch processing solution by using Azure HDInsight. You have a workflow that retrieves data by using a U-SQL query. You need to provide the ability to query and combine data from multiple data sources. What should you do?
A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.
Answer: G
Explanation:
https://www.sqlchick.com/entries/2017/10/29/two-ways-to-approach-federated-queries-with-u-sql-and-azure-data-lake-analytics
NEW QUESTION 44
You are implementing a batch processing solution by using Azure HDInsight. You have two tables. Each table is larger than 250 TB. Both tables have approximately the same number of rows and columns. You need to match the tables based on a key column. You must minimize the size of the data table that is produced. What should you do?
A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.
Answer: A
Explanation:
http://www.openkb.info/2014/11/understanding-hive-joins-in-explain.html
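With two 250-TB tables, neither side fits in memory, so a broadcast (map) join is impossible and Hive must use a shuffle (common) join: both tables are hash-partitioned on the join key and each partition is joined independently. The following is a minimal pure-Python sketch of that idea (illustration only, not Hive internals; the table names and values are made up):

```python
from collections import defaultdict

def shuffle_join(left, right, num_partitions=4):
    """Illustrative shuffle join: hash-partition both inputs on the
    join key, then build-and-probe within each partition. This mirrors
    what Hive does when neither table fits in memory."""
    def partition(rows):
        parts = defaultdict(list)
        for key, value in rows:
            parts[hash(key) % num_partitions].append((key, value))
        return parts

    left_parts, right_parts = partition(left), partition(right)
    joined = []
    for p in range(num_partitions):
        # Build a small hash table for this partition, then probe it.
        build = defaultdict(list)
        for key, value in left_parts.get(p, []):
            build[key].append(value)
        for key, value in right_parts.get(p, []):
            for left_value in build.get(key, []):
                joined.append((key, left_value, value))
    return joined

sales = [(1, "laptop"), (2, "phone"), (3, "tablet")]
regions = [(1, "US"), (2, "EU"), (4, "APAC")]
print(sorted(shuffle_join(sales, regions)))  # only matching keys survive
```

Because matching keys always hash to the same partition on both sides, the per-partition joins together produce the full inner-join result, and no single worker ever needs the whole table.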
NEW QUESTION 45
You deploy Apache Kafka to an Azure HDInsight cluster. You plan to load data into a topic that has a specific schema. You need to load the data while maintaining the existing schema. Which file format should you use to receive the data?
A. JSON
B. Kudu
C. Apache Sequence
D. CSV
Answer: A
Explanation:
https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-auto-create-topics
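JSON suits this scenario because every record carries its field names, so a consumer can check each message against the expected schema on arrival (unlike CSV, which carries no field names). A minimal sketch using only the standard library; the schema and message contents here are hypothetical:

```python
import json

# Hypothetical schema for the Kafka topic: field name -> expected type.
SCHEMA = {"id": int, "product": str, "quantity": int}

def validate(record_json):
    """Decode a JSON record and verify it matches SCHEMA, raising
    ValueError on unexpected fields or wrong types."""
    record = json.loads(record_json)
    if set(record) != set(SCHEMA):
        raise ValueError(f"unexpected fields: {set(record) ^ set(SCHEMA)}")
    for field, expected_type in SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} is not {expected_type.__name__}")
    return record

msg = '{"id": 1, "product": "laptop", "quantity": 2}'
print(validate(msg))
```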
NEW QUESTION 46
You have an Apache Interactive Hive cluster in Azure HDInsight. The cluster has 12 processors and 96 GB of RAM. The YARN container size is set to 2 GB and the Tez container size is 3 GB. You configure one Tez container per processor. You are performing map joins between a 2-GB dimension table and a 96-GB fact table. You experience slow performance due to inadequate utilization of the available resources. You need to ensure that the map joins are used. Which two settings should you configure? (Each correct answer presents part of the solution. Choose two.)
A. SET hive.tez.container.size=98304MB
B. SET hive.auto.convert.join.noconditionaltask.size=2048MB
C. SET yarn.scheduler.minimum-allocation-mb=6144MB
D. SET hive.auto.convert.join.noconditionaltask.size=3072MB
E. SET hive.tez.container.size=6144MB
Answer: AC
Explanation:
https://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
https://www.justanalytics.com/blog/apache-hive-memory-management-tuning
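The arithmetic behind this question can be sanity-checked directly. Hive converts a common join to a map join only when the small table fits under hive.auto.convert.join.noconditionaltask.size, which tuning guides commonly recommend setting to roughly one third of hive.tez.container.size (a rule of thumb, not an exact Hive default). A minimal Python sketch of that check:

```python
def mapjoin_fits(dimension_table_mb, container_mb, divisor=3):
    """Return True when the dimension table fits under the map-join
    threshold, taken here as container size / divisor (the ~1/3
    rule of thumb). Illustrative arithmetic only."""
    noconditionaltask_mb = container_mb // divisor
    return dimension_table_mb <= noconditionaltask_mb

# With the 3 GB Tez container from the question, the 2 GB dimension
# table does not fit under a ~1 GB threshold, so Hive falls back to
# a shuffle join:
print(mapjoin_fits(2048, 3072))  # False
# Raising the container size raises the threshold accordingly:
print(mapjoin_fits(2048, 6144))  # True
```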
NEW QUESTION 47
You have an array of integers in Apache Spark. You need to save the data to an Apache Parquet file. Which methods should you use?
A. take and toDF
B. makeRDD and sqlContext createDataSet
C. sqlContext load and makeRDD
D. makeRDD and sqlContext createDataFrame
Answer: D
Explanation:
https://spark.apache.org/docs/1.5.2/sql-programming-guide.html#data-types
NEW QUESTION 48
You have an Apache Spark cluster in Azure HDInsight. Users report that Spark jobs take longer than expected to complete. You need to reduce the amount of time it takes for the Spark jobs to complete. What should you do?
A. From HDFS, modify the maximum thread setting.
B. From Spark, modify the spark_thrift_cmd_opts parameter.
C. From YARN, modify the container size setting.
D. From Spark, modify the spark.executor.cores parameter.
Answer: D
Explanation:
https://rea.tech/how-we-optimize-apache-spark-apps/
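As background for why spark.executor.cores matters: each executor runs one task per allocated core, so the setting directly bounds cluster-wide task parallelism. A back-of-the-envelope sizing sketch in plain Python (the one-core-per-node reservation for the OS/daemons is a common rule of thumb, not a Spark default; the cluster numbers are made up):

```python
def executor_layout(nodes, cores_per_node, cores_per_executor):
    """Estimate how many executors fit on the cluster and how many
    tasks can run concurrently, reserving one core per node for
    OS/daemon overhead."""
    usable_cores = cores_per_node - 1
    executors_per_node = usable_cores // cores_per_executor
    return {
        "executors_total": executors_per_node * nodes,
        "parallel_tasks": executors_per_node * nodes * cores_per_executor,
    }

print(executor_layout(nodes=4, cores_per_node=8, cores_per_executor=5))
```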
NEW QUESTION 49
You are configuring an Apache Phoenix operation on top of an Apache HBase server. The operation executes a statement that joins an Apache Hive table and a Phoenix table. You need to ensure that when the table is dropped, the table files are retained, but the table metadata is removed from the Apache HCatalog. Which type of table should you use?
A. internal
B. external
C. temp
D. Azure Table Storage
Answer: B
Explanation:
https://phoenix.apache.org/hive_storage_handler.html
NEW QUESTION 50
You have an Apache Hive cluster in Azure HDInsight. You plan to ingest on-premises data into Azure Storage. You need to automate the copying of the data to Azure Storage. Which tool should you use?
A. Microsoft Azure Storage Explorer
B. Azure Import/Export Service
C. Azure Backup
D. AzCopy
Answer: D
Explanation:
https://docs.microsoft.com/en-us/azure/data-factory/tutorial-hybrid-copy-data-tool
NEW QUESTION 51
You have an Apache HBase cluster in Azure HDInsight. You plan to use Apache Pig, Apache Hive, and HBase to access the cluster simultaneously and to process data stored in a single platform. You need to deliver consistent operations, security, and data governance. What should you use?
A. Apache Ambari
B. MapReduce
C. Apache Oozie
D. YARN
Answer: D
Explanation:
https://hortonworks.com/blog/hbase-hive-better-together/
NEW QUESTION 52
You have several Linux-based and Windows-based Azure HDInsight clusters. The clusters are in different Active Directory domains. You need to consolidate system logging for all of the clusters into a single location. The solution must provide near real-time analytics of the log data. What should you use?
A. Apache Ambari
B. YARN
C. Microsoft System Center Operations Manager
D. Microsoft Operations Management Suite (OMS)
Answer: D
Explanation:
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-log-management
NEW QUESTION 53
You have an Apache Spark job. The performance of the job deteriorates over time. You plan to debug the job. You need to gather information that you can use to debug the job. Which tool should you use?
A. YARN
B. Spark History Server
C. HDInsight Cluster Dashboard
D. Jupyter Notebook
Answer: A
Explanation:
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-job-debugging
NEW QUESTION 54
……